linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2023-12-23	sched/fair: Remove SCHED_FEAT(UTIL_EST_FASTUP, true)	Vincent Guittot
	sched_feat(UTIL_EST_FASTUP) has been added to easily disable the feature in order to check for possibly related regressions. After 3 years, it has never been used and no regression has been reported. Let's remove it and make fast increase a permanent behavior. Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Reviewed-by: Hongyan Xia <hongyan.xia2@arm.com> Reviewed-by: Tang Yizhou <yizhou.tang@shopee.com> Reviewed-by: Yanteng Si <siyanteng@loongson.cn> [for the Chinese translation] Reviewed-by: Alex Shi <alexs@kernel.org> Link: https://lore.kernel.org/r/20231201161652.1241695-2-vincent.guittot@linaro.org
2023-12-23	arm64/amu: Use capacity_ref_freq() to set AMU ratio	Vincent Guittot
	Use the new capacity_ref_freq() method to set the ratio that is used by AMU for computing the arch_scale_freq_capacity(). This helps to keep everything aligned using the same reference for computing CPUs capacity. The default value of the ratio (stored in per_cpu(arch_max_freq_scale)) ensures that arch_scale_freq_capacity() returns max capacity until it is set to its correct value with the cpu capacity and capacity_ref_freq(). Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20231211104855.558096-8-vincent.guittot@linaro.org
2023-12-23	cpufreq/cppc: Set the frequency used for computing the capacity	Vincent Guittot
	Save the frequency associated to the performance that has been used when initializing the capacity of CPUs. Also, cppc cpufreq driver can register an artificial energy model. In such case, it needs the frequency for this compute capacity. Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Pierre Gondois <pierre.gondois@arm.com> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lore.kernel.org/r/20231211104855.558096-7-vincent.guittot@linaro.org
2023-12-23	cpufreq/cppc: Move and rename cppc_cpufreq_{perf_to_khz\|khz_to_perf}()	Vincent Guittot
	Move and rename cppc_cpufreq_perf_to_khz() and cppc_cpufreq_khz_to_perf() to use them outside cppc_cpufreq in topology_init_cpu_capacity_cppc(). Modify the interface to use struct cppc_perf_caps caps instead of struct cppc_cpudata cpu_data as we only use the fields of cppc_perf_caps. cppc_cpufreq was converting the lowest and nominal freq from MHz to kHz before using them. We move this conversion inside cppc_perf_to_khz and cppc_khz_to_perf to make them generic and usable outside cppc_cpufreq. No functional change Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Pierre Gondois <pierre.gondois@arm.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lore.kernel.org/r/20231211104855.558096-6-vincent.guittot@linaro.org
2023-12-23	energy_model: Use a fixed reference frequency	Vincent Guittot
	The last item of a performance domain is not always the performance point that has been used to compute CPU's capacity. This can lead to different target frequency compared with other part of the system like schedutil and would result in wrong energy estimation. A new arch_scale_freq_ref() is available to return a fixed and coherent frequency reference that can be used when computing the CPU's frequency for an level of utilization. Use this function to get this reference frequency. Energy model is never used without defining arch_scale_freq_ref() but can be compiled. Define a default arch_scale_freq_ref() returning 0 in such case. Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://lore.kernel.org/r/20231211104855.558096-5-vincent.guittot@linaro.org
2023-12-23	cpufreq/schedutil: Use a fixed reference frequency	Vincent Guittot
	cpuinfo.max_freq can change at runtime because of boost as an example. This implies that the value could be different than the one that has been used when computing the capacity of a CPU. The new arch_scale_freq_ref() returns a fixed and coherent reference frequency that can be used when computing a frequency based on utilization. Use this arch_scale_freq_ref() when available and fallback to policy otherwise. Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lore.kernel.org/r/20231211104855.558096-4-vincent.guittot@linaro.org
2023-12-23	cpufreq: Use the fixed and coherent frequency for scaling capacity	Vincent Guittot
	cpuinfo.max_freq can change at runtime because of boost as an example. This implies that the value could be different from the frequency that has been used to compute the capacity of a CPU. The new arch_scale_freq_ref() returns a fixed and coherent frequency that can be used to compute the capacity for a given frequency. [ Also fix a arch_set_freq_scale() newline style wart in <linux/cpufreq.h>. ] Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Link: https://lore.kernel.org/r/20231211104855.558096-3-vincent.guittot@linaro.org
2023-12-23	sched/topology: Add a new arch_scale_freq_ref() method	Vincent Guittot
	Create a new method to get a unique and fixed max frequency. Currently cpuinfo.max_freq or the highest (or last) state of performance domain are used as the max frequency when computing the frequency for a level of utilization, but: - cpuinfo_max_freq can change at runtime. boost is one example of such change. - cpuinfo.max_freq and last item of the PD can be different leading to different results between cpufreq and energy model. We need to save the reference frequency that has been used when computing the CPUs capacity and use this fixed and coherent value to convert between frequency and CPU's capacity. In fact, we already save the frequency that has been used when computing the capacity of each CPU. We extend the precision to save kHz instead of MHz currently and we modify the type to be aligned with other variables used when converting frequency to capacity and the other way. [ mingo: Minor edits. ] Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Link: https://lore.kernel.org/r/20231211104855.558096-2-vincent.guittot@linaro.org
2023-12-23	Merge tag 'v6.7-rc6' into sched/core, to pick up fixes	Ingo Molnar
	Signed-off-by: Ingo Molnar <mingo@kernel.org>
2023-12-23	fs: factor out backing_file_mmap() helper	Amir Goldstein
	Assert that the file object is allocated in a backing_file container so that file_user_path() could be used to display the user path and not the backing file's path in /proc/<pid>/maps. Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-12-23	fs: factor out backing_file_splice_{read,write}() helpers	Amir Goldstein
	There is not much in those helpers, but it makes sense to have them logically next to the backing_file_{read,write}_iter() helpers as they may grow more common logic in the future. Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-12-23	fs: factor out backing_file_{read,write}_iter() helpers	Amir Goldstein
	Overlayfs submits files io to backing files on other filesystems. Factor out some common helpers to perform io to backing files, into fs/backing-file.c. Suggested-by: Miklos Szeredi <miklos@szeredi.hu> Link: https://lore.kernel.org/r/CAJfpeguhmZbjP3JLqtUy0AdWaHOkAPWeP827BBWwRFEAUgnUcQ@mail.gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-12-23	fs: prepare for stackable filesystems backing file helpers	Amir Goldstein
	In preparation for factoring out some backing file io helpers from overlayfs, move backing_file_open() into a new file fs/backing-file.c and header. Add a MAINTAINERS entry for stackable filesystems and add a Kconfig FS_STACK which stackable filesystems need to select. For now, the backing_file struct, the backing_file alloc/free functions and the backing_file_real_path() accessor remain internal to file_table.c. We may change that in the future. Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-12-23	kbuild: fix build ID symlinks to installed debug VDSO files	Masahiro Yamada
	Commit 56769ba4b297 ("kbuild: unify vdso_install rules") accidentally dropped the '.debug' suffix from the build ID symlinks. Fixes: 56769ba4b297 ("kbuild: unify vdso_install rules") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-12-23	gen_compile_commands.py: fix path resolve with symlinks in it	Jialu Xu
	When a path contains relative symbolic links, os.path.abspath() might not follow the symlinks and instead return the absolute path with just the relative paths resolved, resulting in an incorrect path. 1. Say "drivers/hdf/" has some symlinks: # ls -l drivers/hdf/ total 364 drwxrwxr-x 2 ... 4096 ... evdev lrwxrwxrwx 1 ... 44 ... framework -> ../../../../../../drivers/hdf_core/framework -rw-rw-r-- 1 ... 359010 ... hdf_macro_test.h lrwxrwxrwx 1 ... 55 ... inner_api -> ../../../../../../drivers/hdf_core/interfaces/inner_api lrwxrwxrwx 1 ... 53 ... khdf -> ../../../../../../drivers/hdf_core/adapter/khdf/linux -rw-r--r-- 1 ... 74 ... Makefile drwxrwxr-x 3 ... 4096 ... wifi 2. One .cmd file records that: # head -1 ./framework/core/manager/src/.devmgr_service.o.cmd cmd_drivers/hdf/khdf/manager/../../../../framework/core/manager/src/devmgr_service.o := ... \ /path/to/src/drivers/hdf/khdf/manager/../../../../framework/core/manager/src/devmgr_service.c 3. os.path.abspath returns "/path/to/src/framework/core/manager/src/devmgr_service.c", not correct: # ./scripts/clang-tools/gen_compile_commands.py INFO: Could not add line from ./framework/core/manager/src/.devmgr_service.o.cmd: File \ /path/to/src/framework/core/manager/src/devmgr_service.c not found Use os.path.realpath(), which resolves the symlinks and normalizes the paths correctly. # cat compile_commands.json ... { "command": ... "directory": ... "file": "/path/to/bla/drivers/hdf_core/framework/core/manager/src/devmgr_service.c" }, ... Also fix it in parse_arguments(). Signed-off-by: Jialu Xu <xujialu@vimux.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-12-23	MAINTAINERS: Add scripts/clang-tools to Kbuild section	Nathan Chancellor
	Masahiro has always applied scripts/clang-tools patches but it is not included in the Kbuild section, so neither he nor linux-kbuild get cc'd on patches that touch those files. Signed-off-by: Nathan Chancellor <nathan@kernel.org> Acked-by: Nicolas Schier <nicolas@fjasle.eu> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-12-23	linux/export: Fix alignment for 64-bit ksymtab entries	Helge Deller
	An alignment of 4 bytes is wrong for 64-bit platforms which don't define CONFIG_HAVE_ARCH_PREL32_RELOCATIONS (which then store 64-bit pointers). Fix their alignment to 8 bytes. Fixes: ddb5cdbafaaa ("kbuild: generate KSYMTAB entries by modpost") Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-12-22	Input: soc_button_array - add mapping for airplane mode button	Christoffer Sandberg
	This add a mapping for the airplane mode button on the TUXEDO Pulse Gen3. While it is physically a key it behaves more like a switch, sending a key down on first press and a key up on 2nd press. Therefor the switch event is used here. Besides this behaviour it uses the HID usage-id 0xc6 (Wireless Radio Button) and not 0xc8 (Wireless Radio Slider Switch), but since neither 0xc6 nor 0xc8 are currently implemented at all in soc_button_array this not to standard behaviour is not put behind a quirk for the moment. Signed-off-by: Christoffer Sandberg <cs@tuxedo.de> Signed-off-by: Werner Sembach <wse@tuxedocomputers.com> Link: https://lore.kernel.org/r/20231215171718.80229-1-wse@tuxedocomputers.com Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2023-12-23	drm: Warn when freeing a framebuffer that's still on a list	Ville Syrjälä
	Sprinkle some extra WARNs around so that we might catch premature framebuffer destruction more readily. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231211081625.25704-2-ville.syrjala@linux.intel.com Acked-by: Javier Martinez Canillas <javierm@redhat.com>
2023-12-23	drm: Don't unref the same fb many times by mistake due to deadlock handling	Ville Syrjälä
	If we get a deadlock after the fb lookup in drm_mode_page_flip_ioctl() we proceed to unref the fb and then retry the whole thing from the top. But we forget to reset the fb pointer back to NULL, and so if we then get another error during the retry, before the fb lookup, we proceed the unref the same fb again without having gotten another reference. The end result is that the fb will (eventually) end up being freed while it's still in use. Reset fb to NULL once we've unreffed it to avoid doing it again until we've done another fb lookup. This turned out to be pretty easy to hit on a DG2 when doing async flips (and CONFIG_DEBUG_WW_MUTEX_SLOWPATH=y). The first symptom I saw that drm_closefb() simply got stuck in a busy loop while walking the framebuffer list. Fortunately I was able to convince it to oops instead, and from there it was easier to track down the culprit. Cc: stable@vger.kernel.org Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231211081625.25704-1-ville.syrjala@linux.intel.com Acked-by: Javier Martinez Canillas <javierm@redhat.com>
2023-12-22	Merge tag 'block-6.7-2023-12-22' of git://git.kernel.dk/linux	Linus Torvalds
	Pull block fixes from Jens Axboe: "Just an NVMe pull request this time, with a fix for bad sleeping context, and a revert of a patch that caused some trouble" * tag 'block-6.7-2023-12-22' of git://git.kernel.dk/linux: nvme-pci: fix sleeping function called from interrupt context Revert "nvme-fc: fix race between error recovery and creating association"
2023-12-22	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm	Linus Torvalds
	Pull kvm fixes from Paolo Bonzini: "RISC-V: - Fix a race condition in updating external interrupt for trap-n-emulated IMSIC swfile - Fix print_reg defaults in get-reg-list selftest ARM: - Ensure a vCPU's redistributor is unregistered from the MMIO bus if vCPU creation fails - Fix building KVM selftests for arm64 from the top-level Makefile x86: - Fix breakage for SEV-ES guests that use XSAVES Selftests: - Fix bad use of strcat(), by not using strcat() at all" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: SEV: Do not intercept accesses to MSR_IA32_XSS for SEV-ES guests KVM: selftests: Fix dynamic generation of configuration names RISCV: KVM: update external interrupt atomically for IMSIC swfile KVM: riscv: selftests: Fix get-reg-list print_reg defaults KVM: selftests: Ensure sysreg-defs.h is generated at the expected path KVM: Convert comment into an assertion in kvm_io_bus_register_dev() KVM: arm64: vgic: Ensure that slots_lock is held in vgic_register_all_redist_iodevs() KVM: arm64: vgic: Force vcpu vgic teardown on vcpu destroy KVM: arm64: vgic: Add a non-locking primitive for kvm_vgic_vcpu_destroy() KVM: arm64: vgic: Simplify kvm_vgic_destroy()
2023-12-22	arm64: defconfig: Enable Qualcomm SC8280XP camera clock controller	Bjorn Andersson
	With the camera clock controller added to the DeviceTree of SC8280XP the interconnect providers no longer reaches sync_state, resulting in a noticeable reduction in battery life. Enable the camera clock controller (as a module) to avoid this, and hopefully soon provide some level of camera support. Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com> Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Link: https://lore.kernel.org/r/20231221-enable-sc8280xp-camcc-v1-1-2249581dd538@quicinc.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2023-12-23	Merge branch 'dpaa2-switch-small-improvements'	David S. Miller
	Ioana Ciornei says: ==================== dpaa2-switch: small improvements This patch set consists of a series of small improvements on the dpaa2-switch driver ranging from adding some more verbosity when encountering errors to reorganizing code to be easily extensible. Changes in v3: - 4/8: removed the fixes tag and moved it to the commit message - 5/8: specified that there is no user-visible effect - 6/8: removed the initialization of the err variable Changes in v2: - No changes to the actual diff, only rephrased some commit messages and added more information. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23	dpaa2-switch: cleanup the egress flood of an unused FDB	Ioana Ciornei
	In case a DPAA2 switch interface joins a bridge, the FDB used on the port will be changed to the one associated with the bridge. What this means exactly is that any VLAN installed on the port will need to be removed and then installed back so that it points to the new FDB. Once this is done, the previous FDB will become unused (no VLAN to point to it). Even though no traffic will reach this FDB, it's best to just cleanup the state of the FDB by zeroing its egress flood domain. Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23	dpaa2-switch: move a check to the prechangeupper stage	Ioana Ciornei
	Two different DPAA2 switch ports from two different DPSW instances cannot be under the same bridge. Instead of checking for this unsupported configuration in the CHANGEUPPER event, check it as early as possible in the PRECHANGEUPPER one. Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23	dpaa2-switch: reorganize the [pre]changeupper events	Ioana Ciornei
	Create separate functions, dpaa2_switch_port_prechangeupper and dpaa2_switch_port_changeupper, to be called directly when a DPSW port changes its upper device. This way we are not open-coding everything in the main event callback and we can easily extent, for example, with bond offload. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23	dpaa2-switch: do not clear any interrupts automatically	Ioana Ciornei
	The DPSW object has multiple event sources multiplexed over the same IRQ. The driver has the capability to configure only some of these events to trigger the IRQ. The dpsw_get_irq_status() can clear events automatically based on the value stored in the 'status' variable passed to it. We don't want that to happen because we could get into a situation when we are clearing more events than we actually handled. Just resort to manually clearing the events that we handled. Also, since status is not used on the out path we remove its initialization to zero. This change does not have a user-visible effect because the dpaa2-switch driver enables and handles all the DPSW events which exist at the moment. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23	dpaa2-switch: add ENDPOINT_CHANGED to the irq_mask	Ioana Ciornei
	Commit 84cba72956fd ("dpaa2-switch: integrate the MAC endpoint support") added support for MAC endpoints in the dpaa2-switch driver but omitted to add the ENDPOINT_CHANGED irq to the list of interrupt sources. Fix this by extending the list of events which can raise an interrupt by extending the mask passed to the dpsw_set_irq_mask() firmware API. There is no user visible impact even without this patch since whenever a switch interface is connected/disconnected from an endpoint both events are set (LINK_CHANGED and ENDPOINT_CHANGED) and, luckily, the LINK_CHANGED event could actually raise the interrupt and thus get the MAC/PHY SW configuration started. Even with this, it's better to just not rely on undocumented firmware behavior which can change. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23	dpaa2-switch: print an error when the vlan is already configured	Ioana Ciornei
	Print a netdev error when we hit a case in which a specific VLAN is already configured on the port. While at it, change the already existing netdev_warn into an _err for consistency purposes. Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23	dpaa2-switch: declare the netdev as IFF_LIVE_ADDR_CHANGE capable	Ioana Ciornei
	There is no restriction around the change of the MAC address on the switch ports, thus declare the interface netdevs IFF_LIVE_ADDR_CHANGE capable. Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23	dpaa2-switch: set interface MAC address only on endpoint change	Ioana Ciornei
	There is no point in updating the MAC address of a switch interface each time the link state changes, this only needs to happen in case the endpoint changes (the switch interface is [dis]connected from/to a MAC). Just move the call to dpaa2_switch_port_set_mac_addr() under DPSW_IRQ_EVENT_ENDPOINT_CHANGED. Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23	Merge branch 'am65-cpsw-preemption-coalescing'	David S. Miller
	Roger Quadros says: ==================== net: ethernet: am65-cpsw: Add mqprio, frame preemption & coalescing This series adds mqprio qdisc offload in channel mode, Frame Preemption MAC merge support and RX/TX coalesing for AM65 CPSW driver. In v11 following changes were made - Fix patch "net: ethernet: ti: am65-cpsw: add mqprio qdisc offload in channel mode" by including units.h Changelog information in each patch file. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23	net: ethernet: ti: am65-cpsw: add sw tx/rx irq coalescing based on hrtimers	Grygorii Strashko
	Add SW IRQ coalescing based on hrtimers for TX and RX data path which can be enabled by ethtool commands: - RX coalescing ethtool -C eth1 rx-usecs 50 - TX coalescing can be enabled per TX queue - by default enables coalesing for TX0 ethtool -C eth1 tx-usecs 50 - configure TX0 ethtool -Q eth0 queue_mask 1 --coalesce tx-usecs 100 - configure TX1 ethtool -Q eth0 queue_mask 2 --coalesce tx-usecs 100 - configure TX0 and TX1 ethtool -Q eth0 queue_mask 3 --coalesce tx-usecs 100 --coalesce tx-usecs 100 show configuration for TX0 and TX1: ethtool -Q eth0 queue_mask 3 --show-coalesce Comparing to gro_flush_timeout and napi_defer_hard_irqs, this patch allows to enable IRQ coalesing for RX path separately. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Roger Quadros <rogerq@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23	net: ethernet: ti: am65-cpsw-qos: Add Frame Preemption MAC Merge support	Roger Quadros
	Add driver support for viewing / changing the MAC Merge sublayer parameters and seeing the verification state machine's current state via ethtool. As hardware does not support interrupt notification for verification events we resort to polling on link up. On link up we try a couple of times for verification success and if unsuccessful then give up. The Frame Preemption feature is described in the Technical Reference Manual [1] in section: 12.3.1.4.6.7 Intersperced Express Traffic (IET – P802.3br/D2.0) Due to Silicon Errata i2208 [2] we set limit min IET fragment size to 124 (excluding 4 bytes mCRC). [1] AM62x TRM - https://www.ti.com/lit/ug/spruiv7a/spruiv7a.pdf [2] AM62x Silicon Errata - https://www.ti.com/lit/er/sprz487c/sprz487c.pdf Signed-off-by: Roger Quadros <rogerq@kernel.org> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23	net: ethernet: ti: am65-cpsw: add mqprio qdisc offload in channel mode	Grygorii Strashko
	This patch adds MQPRIO Qdisc offload in full 'channel' mode which allows not only setting up pri:tc mapping, but also configuring TX shapers (rate-limiting) on external port FIFOs. The MQPRIO Qdisc offload is expected to work with or without VLAN/priority tagged packets. The CPSW external Port FIFO has 8 Priority queues. The rate-limit can be set for each of these priority queues. Which Priority queue a packet is assigned to depends on PN_REG_TX_PRI_MAP register which maps header priority to switch priority. The header priority of a packet is assigned via the RX_PRI_MAP_REG which maps packet priority to header priority. The packet priority is either the VLAN priority (for VLAN tagged packets) or the thread/channel offset. For simplicity, we assign the same priority queue to all queues of a Traffic Class so it can be rate-limited correctly. Configuration example: ethtool -L eth1 tx 5 ethtool --set-priv-flags eth1 p0-rx-ptype-rrobin off tc qdisc add dev eth1 parent root handle 100: mqprio num_tc 3 \ map 0 0 1 2 0 0 0 0 0 0 0 0 0 0 0 0 \ queues 1@0 1@1 1@2 hw 1 mode channel \ shaper bw_rlimit min_rate 0 100mbit 200mbit max_rate 0 101mbit 202mbit tc qdisc replace dev eth2 handle 100: parent root mqprio num_tc 1 \ map 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 queues 1@0 hw 1 ip link add link eth1 name eth1.100 type vlan id 100 ip link set eth1.100 type vlan egress 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7 In the above example two ports share the same TX CPPI queue 0 for low priority traffic. 3 traffic classes are defined for eth1 and mapped to: TC0 - low priority, TX CPPI queue 0 -> ext Port 1 fifo0, no rate limit TC1 - prio 2, TX CPPI queue 1 -> ext Port 1 fifo1, CIR=100Mbit/s, EIR=1Mbit/s TC2 - prio 3, TX CPPI queue 2 -> ext Port 1 fifo2, CIR=200Mbit/s, EIR=2Mbit/s Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Roger Quadros <rogerq@kernel.org> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23	net: ethernet: am65-cpsw: Move register definitions to header file	Roger Quadros
	Move register definitions to header file. No functional change. Signed-off-by: Roger Quadros <rogerq@kernel.org> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23	net: ethernet: ti: am65-cpsw: Move code to avoid forward declaration	Roger Quadros
	Move this code around to avoid forward declaration. No functional change. Signed-off-by: Roger Quadros <rogerq@kernel.org> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23	net: ethernet: am65-cpsw: cleanup TAPRIO handling	Roger Quadros
	Handle offloading commands using switch-case in am65_cpsw_setup_taprio(). Move checks to am65_cpsw_taprio_replace(). Use NL_SET_ERR_MSG_MOD for error messages. Change error message from "Failed to set cycle time extension" to "cycle time extension not supported" Signed-off-by: Roger Quadros <rogerq@kernel.org> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23	net: ethernet: am65-cpsw: Rename TI_AM65_CPSW_TAS to TI_AM65_CPSW_QOS	Roger Quadros
	We will use this Kconfig option to not only enable TAS/EST offload but also other QoS features like Multiqueue priority descriptors and MAC-Merge/Frame Preemption. TI_AM65_CPSW_QOS seems a more appropriate Kconfig option name than TI_AM65_CPSW_TAS. Signed-off-by: Roger Quadros <rogerq@kernel.org> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23	net: ethernet: am65-cpsw: Build am65-cpsw-qos only if required	Roger Quadros
	Build am65-cpsw-qos only if CONFIG_TI_AM65_CPSW_TAS is enabled. Signed-off-by: Roger Quadros <rogerq@kernel.org> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23	selftests: forwarding: ethtool_mm: fall back to aggregate if device does not ↵	Vladimir Oltean
	report pMAC stats Some devices do not support individual 'pmac' and 'emac' stats. For such devices, resort to 'aggregate' stats. Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Roger Quadros <rogerq@kernel.org> Signed-off-by: Roger Quadros <rogerq@kernel.org> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23	selftests: forwarding: ethtool_mm: support devices with higher rx-min-frag-size	Vladimir Oltean
	Some devices have errata due to which they cannot report ETH_ZLEN (60) in the rx-min-frag-size. This was foreseen of course, and lldpad has logic that when we request it to advertise addFragSize 0, it will round it up to the lowest value that is _actually_ supported by the hardware. The problem is that the selftest expects lldpad to report back to us the same value as we requested. Make the selftest smarter by figuring out on its own what is a reasonable value to expect. Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Roger Quadros <rogerq@kernel.org> Signed-off-by: Roger Quadros <rogerq@kernel.org> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23	Merge branch 'net-selftests-unique-namespace-last-part'	David S. Miller
	Hangbin Liu says: ==================== Convert net selftests to run in unique namespace (last part) Here is the last part of converting net selftests to run in unique namespace. This part converts all left tests. After the conversion, we can run the net sleftests in parallel. e.g. # ./run_kselftest.sh -n -t net:reuseport_bpf TAP version 13 1..1 # selftests: net: reuseport_bpf ok 1 selftests: net: reuseport_bpf mod 10... # Socket 0: 0 # Socket 1: 1 ... # Socket 4: 19 # Testing filter add without bind... # SUCCESS # ./run_kselftest.sh -p -n -t net:cmsg_so_mark.sh -t net:cmsg_time.sh -t net:cmsg_ipv6.sh TAP version 13 1..3 # selftests: net: cmsg_so_mark.sh ok 1 selftests: net: cmsg_so_mark.sh # selftests: net: cmsg_time.sh ok 2 selftests: net: cmsg_time.sh # selftests: net: cmsg_ipv6.sh ok 3 selftests: net: cmsg_ipv6.sh # ./run_kselftest.sh -p -n -c net TAP version 13 1..95 # selftests: net: reuseport_bpf_numa ok 3 selftests: net: reuseport_bpf_numa # selftests: net: reuseport_bpf_cpu ok 2 selftests: net: reuseport_bpf_cpu # selftests: net: sk_bind_sendto_listen ok 9 selftests: net: sk_bind_sendto_listen # selftests: net: reuseaddr_conflict ok 5 selftests: net: reuseaddr_conflict ... Here is the part 1 link: https://lore.kernel.org/netdev/20231202020110.362433-1-liuhangbin@gmail.com part 2 link: https://lore.kernel.org/netdev/20231206070801.1691247-1-liuhangbin@gmail.com part 3 link: https://lore.kernel.org/netdev/20231213060856.4030084-1-liuhangbin@gmail.com ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23	kselftest/runner.sh: add netns support	Hangbin Liu
	Add a variable RUN_IN_NETNS if the user wants to run all the selected tests in namespace in parallel. With this, we can save a lot of testing time. Note that some tests may not fit to run in namespace, e.g. net/drop_monitor_tests.sh, as the dwdump needs to be run in init ns. I also added another parameter -p to make all the logs reported separately instead of mixing them in the stdout or output.log. Nit: the NUM in run_one is not used, rename it to test_num. Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23	selftests/net: convert pmtu.sh to run it in unique namespace	Hangbin Liu
	pmtu test use /bin/sh, so we need to source ./lib.sh instead of lib.sh Here is the test result after conversion. # ./pmtu.sh TEST: ipv4: PMTU exceptions [ OK ] TEST: ipv4: PMTU exceptions - nexthop objects [ OK ] TEST: ipv6: PMTU exceptions [ OK ] TEST: ipv6: PMTU exceptions - nexthop objects [ OK ] ... TEST: ipv4: list and flush cached exceptions - nexthop objects [ OK ] TEST: ipv6: list and flush cached exceptions [ OK ] TEST: ipv6: list and flush cached exceptions - nexthop objects [ OK ] TEST: ipv4: PMTU exception w/route replace [ OK ] TEST: ipv4: PMTU exception w/route replace - nexthop objects [ OK ] TEST: ipv6: PMTU exception w/route replace [ OK ] TEST: ipv6: PMTU exception w/route replace - nexthop objects [ OK ] Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23	selftests/net: use unique netns name for setup_loopback.sh setup_veth.sh	Hangbin Liu
	The setup_loopback and setup_veth use their own way to create namespace. So let's just re-define server_ns/client_ns to unique name. At the same time update the namespace name in gro.sh and toeplitz.sh. As I don't have env to run toeplitz.sh. Here is only the gro test result. # ./gro.sh running test ipv4 data Expected {200 }, Total 1 packets Received {200 }, Total 1 packets. ... Gro::large test passed. All Tests Succeeded! Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23	selftests/net: convert xfrm_policy.sh to run it in unique namespace	Hangbin Liu
	Here is the test result after conversion. # ./xfrm_policy.sh PASS: policy before exception matches PASS: ping to .254 bypassed ipsec tunnel (exceptions) PASS: direct policy matches (exceptions) PASS: policy matches (exceptions) PASS: ping to .254 bypassed ipsec tunnel (exceptions and block policies) PASS: direct policy matches (exceptions and block policies) PASS: policy matches (exceptions and block policies) PASS: ping to .254 bypassed ipsec tunnel (exceptions and block policies after hresh changes) PASS: direct policy matches (exceptions and block policies after hresh changes) PASS: policy matches (exceptions and block policies after hresh changes) PASS: ping to .254 bypassed ipsec tunnel (exceptions and block policies after hthresh change in ns3) PASS: direct policy matches (exceptions and block policies after hthresh change in ns3) PASS: policy matches (exceptions and block policies after hthresh change in ns3) PASS: ping to .254 bypassed ipsec tunnel (exceptions and block policies after htresh change to normal) PASS: direct policy matches (exceptions and block policies after htresh change to normal) PASS: policy matches (exceptions and block policies after htresh change to normal) PASS: policies with repeated htresh change Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23	selftests/net: convert stress_reuseport_listen.sh to run it in unique namespace	Hangbin Liu
	Here is the test result after conversion. # ./stress_reuseport_listen.sh listen 24000 socks took 0.47714 Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23	selftests/net: convert rtnetlink.sh to run it in unique namespace	Hangbin Liu
	When running the test in namespace, the debugfs may not load automatically. So add a checking to make sure debugfs loaded. Here is the test result after conversion. # ./rtnetlink.sh PASS: policy routing PASS: route get ... PASS: address proto IPv4 PASS: address proto IPv6 Acked-by: David Ahern <dsahern@kernel.org> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>