linux.git - Linus' kernel tree

Age	Commit message	Author
2024-11-16	selftests/pcie_bwctrl: Create selftests	Ilpo Järvinen
	Create selftests for PCIe BW control through the PCIe cooling device sysfs interface. First, the BW control selftest finds the PCIe Port to test with. By default, the PCIe Port with the highest Link Speed is selected but another PCIe Port can be provided with -d parameter. The actual test steps the cur_state of the cooling device one-by-one from max_state to what the cur_state was initially. The speed change is confirmed by observing the current_link_speed for the corresponding PCIe Port. Link: https://lore.kernel.org/r/20241018144755.7875-10-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-11-16	thermal: Add PCIe cooling driver	Ilpo Järvinen
	Add a thermal cooling driver to provide a path to access the PCIe bandwidth controller using the usual thermal interfaces. A cooling device is instantiated for controllable PCIe Ports from the bwctrl service driver. If registering the cooling device fails, allow bwctrl's probe to succeed regardless. As cdev in that case contains an IS_ERR() pseudo "pointer", clean that up inside the probe function so the remove side doesn't need to suddenly make an odd looking IS_ERR() check. The thermal side state 0 means no throttling, i.e., maximum supported PCIe Link Speed. Link: https://lore.kernel.org/r/20241018144755.7875-9-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> [bhelgaas: dropped data->cdev test per https://lore.kernel.org/r/ZzRm1SJTwEMRsAr8@wunner.de] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org> # From the cooling device interface perspective
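The commit above only states that thermal state 0 means no throttling, i.e. the maximum supported Link Speed. A minimal sketch of one possible state-to-speed mapping follows. The linear mapping, the speed codes (1 = slowest) and the function name cooling_state_to_speed() are assumptions made for illustration, not the driver's actual code.

```c
/*
 * Hypothetical mapping from a cooling state to a link speed code.
 * Speed codes are assumed to run from 1 (slowest) to max_speed (fastest).
 * State 0 selects max_speed; each further state drops one speed step.
 * Returns -1 when the state would go below the slowest speed.
 */
static int cooling_state_to_speed(unsigned int state, unsigned int max_speed)
{
	if (max_speed == 0 || state >= max_speed)
		return -1;

	return (int)(max_speed - state);
}
```

Under this assumed mapping, a Port whose fastest speed code is 4 would expose states 0 through 3.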
2024-11-16	PCI/bwctrl: Add pcie_set_target_speed() to set PCIe Link Speed	Ilpo Järvinen
	Currently, PCIe Link Speeds are adjusted by custom code rather than in a common function provided in PCI core. The PCIe bandwidth controller (bwctrl) introduces an in-kernel API, pcie_set_target_speed(), to set PCIe Link Speed. Convert Target Speed quirk to use the new API. The Target Speed quirk runs very early when bwctrl is not yet probed for a Port and can also run later when bwctrl is already setup for the Port, which requires the per port mutex (set_speed_mutex) to be only taken if the bwctrl setup is already complete. The new API is also intended to be used in an upcoming commit that adds a thermal cooling device to throttle PCIe bandwidth when thermal thresholds are reached. The PCIe bandwidth control procedure is as follows. The highest speed supported by the Port and the PCIe device which is not higher than the requested speed is selected and written into the Target Link Speed in the Link Control 2 Register. Then the bandwidth controller retrains the PCIe Link. Bandwidth Notifications enable the cur_bus_speed in the struct pci_bus to keep track of PCIe Link Speed changes. While Bandwidth Notifications should also be generated when the bandwidth controller alters the PCIe Link Speed, a few platforms do not deliver the LBMS interrupt after Link Training as expected. Thus, after changing the Link Speed, the bandwidth controller makes an additional read of the Link Status Register to ensure cur_bus_speed is consistent with the new PCIe Link Speed. Link: https://lore.kernel.org/r/20241018144755.7875-8-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> [bhelgaas: squash devm_mutex_init() error checking from https://lore.kernel.org/r/20241030163139.2111689-1-andriy.shevchenko@linux.intel.com, drop export of pcie_set_target_speed()] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
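The selection step described above (highest speed supported by both the Port and the device that does not exceed the request) can be sketched in user-space C. This is an illustration under stated assumptions: speeds are small integer codes where a larger code means a faster speed, support is a bitmask with bit n set when code n is supported, and pick_target_speed() is a hypothetical name. Returning 0 when no common speed fits is a choice made here, not necessarily what bwctrl does.

```c
#include <stdint.h>

/*
 * Pick the highest speed code supported by both link partners that is
 * not higher than the requested code. Bit n of each mask marks speed
 * code n as supported. Returns 0 if no common speed satisfies the
 * request (an assumption of this sketch).
 */
static unsigned int pick_target_speed(uint32_t port_speeds,
				      uint32_t dev_speeds,
				      unsigned int requested)
{
	uint32_t common = port_speeds & dev_speeds;
	unsigned int s;

	if (requested > 31)
		requested = 31;

	for (s = requested; s >= 1; s--) {
		if (common & (UINT32_C(1) << s))
			return s;
	}

	return 0;
}
```

For example, with a Port supporting codes 1 to 4 and a device supporting codes 1 to 3, a request for code 4 resolves to code 3.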
2024-11-16	PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller	Ilpo Järvinen
	This mostly reverts the commit b4c7d2076b4e ("PCI/LINK: Remove bandwidth notification"). An upcoming commit extends this driver building PCIe bandwidth controller on top of it. PCIe bandwidth notifications were first added in the commit e8303bb7a75c ("PCI/LINK: Report degraded links via link bandwidth notification") but later had to be removed. The significant changes compared with the old bandwidth notification driver include: 1) Don't print the notifications into kernel log, just keep the Link Speed cached in struct pci_bus updated. While somewhat unfortunate, the log spam was the source of complaints that eventually led to the removal of the bandwidth notifications driver (see the links below for further information). 2) Besides the Link Bandwidth Management Interrupt, also enable Link Autonomous Bandwidth Interrupt to cover the other source of bandwidth changes. 3) Handle Link Speed updates robustly. Refresh the cached Link Speed when enabling Bandwidth Notification Interrupts, and solve the race between Link Speed read and LBMS/LABS update in pcie_bwnotif_irq_thread(). 4) Use concurrency safe LNKCTL RMW operations. 5) The driver is now called PCIe bwctrl (bandwidth controller) instead of just bandwidth notifications because of increased scope and functionality within the driver. 6) Coexist with the Target Link Speed quirk in pcie_failed_link_retrain(). Provide LBMS counting API for it. 7) Tweaks to variable/functions names for consistency and length reasons. Bandwidth Notifications enable the cur_bus_speed in the struct pci_bus to keep track of PCIe Link Speed changes.
[bhelgaas: This is based on previous work by Alexandru Gagniuc <mr.nuke.me@gmail.com>; see e8303bb7a75c ("PCI/LINK: Report degraded links via link bandwidth notification")] Link: https://lore.kernel.org/r/20241018144755.7875-7-ilpo.jarvinen@linux.intel.com Link: https://lore.kernel.org/all/20190429185611.121751-1-helgaas@kernel.org/ Link: https://lore.kernel.org/linux-pci/20190501142942.26972-1-keith.busch@intel.com/ Link: https://lore.kernel.org/linux-pci/20200115221008.GA191037@google.com/ Suggested-by: Lukas Wunner <lukas@wunner.de> # Building bwctrl on top of bwnotif Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> [bhelgaas: squash fix to drop IRQF_ONESHOT and convert to hardirq handler: https://lore.kernel.org/r/20241115165717.15233-1-ilpo.jarvinen@linux.intel.com] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Stefan Wahren <wahrenst@gmx.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-11-16	Documentation: alienware-wmi: Describe THERMAL_INFORMATION operation 0x02	Kurt Borja
	This operation is used by alienware-wmi driver to avoid brute-forcing operation 0x03. Signed-off-by: Kurt Borja <kuurtb@gmail.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20241111183639.14726-1-kuurtb@gmail.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-11-16	alienware-wmi: create_thermal_profile() no longer brute-forces IDs	Kurt Borja
	WMAX_METHOD_THERMAL_INFORMATION has a system description operation that outputs a buffer with the following structure: out[0] -> Number of fans out[1] -> Number of sensors out[2] -> 0x00 out[3] -> Number of thermal modes This is now used by create_thermal_profile() to retrieve available thermal codes instead of brute-forcing every ID. Tested on an Alienware x15 R1. Verified by checking ACPI tables of supported models. Signed-off-by: Kurt Borja <kuurtb@gmail.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20241111183623.14691-1-kuurtb@gmail.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
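The four-byte layout described above can be decoded as in the following sketch. The struct and function names (thermal_desc, parse_thermal_desc) are hypothetical and the code is a user-space illustration, not the driver's implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Decoded view of the system description buffer (names are illustrative). */
struct thermal_desc {
	uint8_t fans;     /* out[0]: number of fans */
	uint8_t sensors;  /* out[1]: number of sensors */
	uint8_t modes;    /* out[3]: number of thermal modes */
};

/*
 * Parse the buffer layout given in the commit message:
 * out[0] fans, out[1] sensors, out[2] 0x00, out[3] thermal modes.
 * Returns 0 on success, -1 if the buffer is missing or too short.
 */
static int parse_thermal_desc(const uint8_t *out, size_t len,
			      struct thermal_desc *desc)
{
	if (!out || !desc || len < 4)
		return -1;

	desc->fans = out[0];
	desc->sensors = out[1];
	desc->modes = out[3];
	return 0;
}
```

Knowing the mode count up front is what lets the caller query only the valid thermal codes instead of probing every ID.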
2024-11-16	alienware-wmi: Adds support to Alienware x17 R2	Kurt Borja
	Adds support to Alienware x17 R2 Tested-by: Samith Castro <SamithNarayam@hotmail.com> Signed-off-by: Kurt Borja <kuurtb@gmail.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20241111183609.14653-1-kuurtb@gmail.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-11-16	alienware-wmi: extends the list of supported models	Kurt Borja
	Adds thermal + gmode quirk to: - Dell G15 5510 - Dell G15 5511 - Dell G15 5515 - Dell G3 3500 - Dell G3 3590 - Dell G5 5500 Adds thermal quirk to: - Alienware m18 R2 - Alienware m17 R5 AMD Support for these models was manually verified by reading their respective ACPI tables. Signed-off-by: Kurt Borja <kuurtb@gmail.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20241111183546.14617-1-kuurtb@gmail.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-11-16	alienware-wmi: order alienware_quirks[] alphabetically	Kurt Borja
	alienware_quirks[] entries are now ordered alphabetically Signed-off-by: Kurt Borja <kuurtb@gmail.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20241111183520.14573-1-kuurtb@gmail.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-11-16	Revert "drm/amd/pm: correct the workload setting"	Alex Deucher
	This reverts commit 74e1006430a5377228e49310f6d915628609929e. This causes a regression in the workload selection. A more extensive fix is being worked on. For now, revert. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3618 Fixes: 74e1006430a5 ("drm/amd/pm: correct the workload setting") Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-15	remoteproc: qcom: wcss: Remove double assignment in q6v5_wcss_probe()	Yuesong Li
	cocci reports a double assignment warning. 'wcss->version' was assigned twice in 'q6v5_wcss_probe()'. Signed-off-by: Yuesong Li <liyuesong@vivo.com> Link: https://lore.kernel.org/r/20240823065546.3371378-1-liyuesong@vivo.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2024-11-15	remoteproc: qcom_q6v5_mss: Re-order writes to the IMEM region	Sibi Sankar
	Any write access to the IMEM region when the Q6 is setting up XPU protection on it will result in a XPU violation. Fix this by ensuring IMEM writes related to the MBA post-mortem logs happen before the Q6 is brought out of reset. Fixes: 318130cc9362 ("remoteproc: qcom_q6v5_mss: Add MBA log extraction support") Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com> Reviewed-by: Douglas Anderson <dianders@chromium.org> Tested-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20240819073020.3291287-1-quic_sibis@quicinc.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2024-11-15	rpmsg: glink: use only lower 16-bits of param2 for CMD_OPEN name length	Jonathan Marek
	The name len field of the CMD_OPEN packet is only 16-bits and the upper 16-bits of "param2" are a different "prio" field, which can be nonzero in certain situations, and CMD_OPEN packets can be unexpectedly dropped because of this. Fix this by masking out the upper 16 bits of param2. Fixes: b4f8e52b89f6 ("rpmsg: Introduce Qualcomm RPM glink driver") Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20241007235935.6216-1-jonathan@marek.ca Signed-off-by: Bjorn Andersson <andersson@kernel.org>
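A minimal sketch of the masking described above, assuming param2 is a 32-bit value with the name length in the low half and the "prio" field in the high half. The helper names are hypothetical.

```c
#include <stdint.h>

/* Name length lives in the lower 16 bits of param2. */
static uint16_t open_name_len(uint32_t param2)
{
	return (uint16_t)(param2 & 0xffff);
}

/* The upper 16 bits carry the separate "prio" field. */
static uint16_t open_prio(uint32_t param2)
{
	return (uint16_t)(param2 >> 16);
}
```

Without the mask, a nonzero prio makes the apparent name length far larger than the real one, which is consistent with the dropped packets the commit describes.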
2024-11-15	remoteproc: qcom_wcnss_iris: Simplify with dev_err_probe()	Krzysztof Kozlowski
	Use dev_err_probe() to make error and defer code handling simpler. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20241011-remote-proc-dev-err-probe-v1-10-5abb4fc61eca@linaro.org Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2024-11-15	remoteproc: qcom_q6v5_wcss: Simplify with dev_err_probe()	Krzysztof Kozlowski
	Use dev_err_probe() to make error and defer code handling simpler. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20241011-remote-proc-dev-err-probe-v1-9-5abb4fc61eca@linaro.org Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2024-11-15	remoteproc: qcom_q6v5_pas: Simplify with dev_err_probe()	Krzysztof Kozlowski
	Use dev_err_probe() to make error and defer code handling simpler. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20241011-remote-proc-dev-err-probe-v1-8-5abb4fc61eca@linaro.org Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2024-11-15	remoteproc: qcom_q6v5_mss: Drop redundant error printks in probe	Krzysztof Kozlowski
	Do not print errors of getting clocks and regulators in probe twice: once in q6v5_init_clocks() or q6v5_regulator_init() and then again in the probe function. This also avoids dmesg flood on deferred probe. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20241011-remote-proc-dev-err-probe-v1-7-5abb4fc61eca@linaro.org Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2024-11-15	remoteproc: qcom_q6v5_mss: Simplify with dev_err_probe()	Krzysztof Kozlowski
	Use dev_err_probe() to make error and defer code handling simpler. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20241011-remote-proc-dev-err-probe-v1-6-5abb4fc61eca@linaro.org Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2024-11-15	remoteproc: qcom_q6v5_adsp: Simplify with dev_err_probe()	Krzysztof Kozlowski
	Use dev_err_probe() to make error and defer code handling simpler. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20241011-remote-proc-dev-err-probe-v1-5-5abb4fc61eca@linaro.org Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2024-11-15	remoteproc: qcom_q6v5_pas: disable auto boot for wpss	Balaji Pothunoori
	Currently, the rproc "atomic_t power" variable is incremented during: a. WPSS rproc auto boot. b. AHB power on for ath11k. During AHB power off (rmmod ath11k_ahb.ko), rproc_shutdown fails to unload the WPSS firmware because the rproc->power value is '2', causing the atomic_dec_and_test(&rproc->power) condition to fail. Consequently, during AHB power on (insmod ath11k_ahb.ko), QMI_WLANFW_HOST_CAP_REQ_V01 fails due to the host and firmware QMI states being out of sync. Fixes: 300ed425dfa9 ("remoteproc: qcom_q6v5_pas: Add SC7280 ADSP, CDSP & WPSS") Cc: stable@vger.kernel.org Signed-off-by: Balaji Pothunoori <quic_bpothuno@quicinc.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20241018105911.165415-1-quic_bpothuno@quicinc.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
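The reference-count problem above can be reproduced in a small user-space model. The struct and helpers below (fake_rproc, fake_boot, fake_shutdown) are hypothetical stand-ins using C11 atomics; fake_shutdown() mimics the "decrement and test for zero" check that the commit attributes to atomic_dec_and_test().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for the remoteproc power refcount (illustrative only). */
struct fake_rproc {
	atomic_int power;
};

/* Each boot request takes one reference. */
static void fake_boot(struct fake_rproc *r)
{
	atomic_fetch_add(&r->power, 1);
}

/*
 * Drop one reference. Returns true only when the count reaches zero,
 * i.e. when the firmware would actually be unloaded.
 */
static bool fake_shutdown(struct fake_rproc *r)
{
	return atomic_fetch_sub(&r->power, 1) == 1;
}
```

With auto boot plus the ath11k power-on, the count is 2, so a single shutdown leaves it at 1 and the firmware stays loaded. With auto boot disabled, the count is 1 and the same shutdown unloads it.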
2024-11-15	remoteproc: qcom: pas: Make remoteproc name human friendly	Bjorn Andersson
	The remoteproc "name" property is supposed to present the "human readable" name of the remoteproc. While using the device name is readable, it's not "friendly". Instead, use the "sysmon_name" as the identifier for the remoteproc instance. It matches the typical names used when we speak about each instance, while still being unique. Signed-off-by: Bjorn Andersson <bjorn.andersson@oss.qualcomm.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Chris Lew <quic_clew@quicinc.com> Link: https://lore.kernel.org/r/20241022-rproc-friendly-name-v1-1-350c82b075cb@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2024-11-15	remoteproc: qcom: pas: enable SAR2130P audio DSP support	Dmitry Baryshkov
	Enable support for the Audio DSP on the Qualcomm SAR2130P platform, reusing the SM8350 resources. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20241027-sar2130p-adsp-v1-3-bd204e39d24e@linaro.org Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2024-11-15	remoteproc: qcom: pas: add minidump_id to SM8350 resources	Dmitry Baryshkov
	Specify minidump_id for the SM8350 DSPs. It was omitted in the original commit e8b4e9a21af7 ("remoteproc: qcom: pas: Add SM8350 PAS remoteprocs"). Fixes: e8b4e9a21af7 ("remoteproc: qcom: pas: Add SM8350 PAS remoteprocs") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20241027-sar2130p-adsp-v1-2-bd204e39d24e@linaro.org Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2024-11-15	dt-bindings: remoteproc: qcom,sm8350-pas: add SAR2130P aDSP compatible	Dmitry Baryshkov
	Document compatible for audio DSP on Qualcomm SAR2130P platform. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20241027-sar2130p-adsp-v1-1-bd204e39d24e@linaro.org Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2024-11-15	dt-bindings: remoteproc: qcom,sm8550-pas: Add SM8750 ADSP	Krzysztof Kozlowski
	Document compatible for Qualcomm SM8750 SoC ADSP PAS which looks fully compatible with SM8550 variant. The only difference from bindings point of view is one more interrupt ("shutdown-ack"). Marking devices as compatible, using SM8550 ADSP PAS fallback, requires changing some of the conditionals in "if:then:" to "contains". Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Melody Olvera <quic_molvera@quicinc.com> Acked-by: Rob Herring (Arm) <robh@kernel.org> Link: https://lore.kernel.org/r/20241101170309.382782-1-krzysztof.kozlowski@linaro.org Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2024-11-15	remoteproc: qcom: wcss: Remove subdevs on the error path of q6v5_wcss_probe()	Joe Hattori
	Current implementation of q6v5_wcss_probe() in qcom_q6v5_wcss.c does not remove the subdevs on the error path. Fix this bug by calling qcom_remove_{ssr,sysmon,pdm,glink}_subdev(), and qcom_q6v5_deinit() appropriately. Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp> Link: https://lore.kernel.org/r/c4437393bfaeda69351157849b5e0a904586b1c2.1731038950.git.joe@pf.is.s.u-tokyo.ac.jp Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2024-11-15	remoteproc: qcom: adsp: Remove subdevs on the error path of adsp_probe()	Joe Hattori
	Current implementation of adsp_probe() in qcom_q6v5_adsp.c does not remove the subdevs of adsp on the error path. Fix this bug by calling qcom_remove_{ssr,sysmon,pdm,smd,glink}_subdev(), and qcom_q6v5_deinit() appropriately. Fixes: dc160e449122 ("remoteproc: qcom: Introduce Non-PAS ADSP PIL driver") Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp> Link: https://lore.kernel.org/r/fed3df4219543d46b88bacf87990d947f3fac8d7.1731038950.git.joe@pf.is.s.u-tokyo.ac.jp Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2024-11-15	remoteproc: qcom: pas: Remove subdevs on the error path of adsp_probe()	Joe Hattori
	Current implementation of adsp_probe() in qcom_q6v5_pas.c does not remove the subdevs of adsp on the error path. Fix this bug by calling qcom_remove_{ssr,sysmon,pdm,smd,glink}_subdev(), qcom_q6v5_deinit(), and adsp_unassign_memory_region() appropriately. Fixes: 4b48921a8f74 ("remoteproc: qcom: Use common SMD edge handler") Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp> Link: https://lore.kernel.org/r/a1cabc64240022a7f1d5237aa2aa6f72d8fb7052.1731038950.git.joe@pf.is.s.u-tokyo.ac.jp Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2024-11-15	Merge branch 'virtio-net-support-af_xdp-zero-copy-tx'	Jakub Kicinski
	Xuan Zhuo says: ==================== virtio-net: support AF_XDP zero copy (tx) XDP socket (AF_XDP) is an excellent kernel-bypass network framework. The zero copy feature of xsk (XDP socket) needs to be supported by the driver. The performance of zero copy is very good. mlx5 and intel ixgbe already support this feature. This patch set allows virtio-net to support xsk's zerocopy xmit feature. At present, we have completed some preparation: 1. vq-reset (virtio spec and kernel code) 2. virtio-core premapped dma 3. virtio-net xdp refactor So it is time for Virtio-Net to complete the support for the XDP Socket Zerocopy. Virtio-net can not increase the queue num at will, so xsk shares the queue with kernel. This patch set includes some refactoring of virtio-net to let it support AF_XDP. The current configuration sets the virtqueue (vq) to premapped mode, implying that all buffers submitted to this queue must be mapped ahead of time. This presents a challenge for the virtnet send queue (sq): the virtnet driver would be required to keep track of dma information for vq size * 17, which can be substantial. However, if the premapped mode were applied on a per-buffer basis, the complexity would be greatly reduced. With AF_XDP enabled, AF_XDP buffers would become premapped, while kernel skb buffers could remain unmapped. We can distinguish them by sg_page(sg). When sg_page(sg) is NULL, this indicates that the driver has performed DMA mapping in advance, allowing the Virtio core to directly utilize sg_dma_address(sg) without conducting any internal DMA mapping. Additionally, DMA unmap operations for this buffer will be bypassed. ENV: Qemu with vhost-user(polling mode).
Host CPU: Intel(R) Xeon(R) Platinum 8163 CPU @ 2.50GHz testpmd> show port stats all ######################## NIC statistics for port 0 ######################## RX-packets: 19531092064 RX-missed: 0 RX-bytes: 1093741155584 RX-errors: 0 RX-nombuf: 0 TX-packets: 5959955552 TX-errors: 0 TX-bytes: 371030645664 Throughput (since last show) Rx-pps: 8861574 Rx-bps: 3969985208 Tx-pps: 8861493 Tx-bps: 3969962736 ############################################################################ testpmd> show port stats all ######################## NIC statistics for port 0 ######################## RX-packets: 68152727 RX-missed: 0 RX-bytes: 3816552712 RX-errors: 0 RX-nombuf: 0 TX-packets: 68114967 TX-errors: 33216 TX-bytes: 3814438152 Throughput (since last show) Rx-pps: 6333196 Rx-bps: 2837272088 Tx-pps: 6333227 Tx-bps: 2837285936 ############################################################################ But AF_XDP consumes more CPU for tx and rx napi(100% and 86%). ==================== Link: https://patch.msgid.link/20241112012928.102478-1-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	virtio_net: xdp_features add NETDEV_XDP_ACT_XSK_ZEROCOPY	Xuan Zhuo
	Now, we support AF_XDP(xsk). Add NETDEV_XDP_ACT_XSK_ZEROCOPY to xdp_features. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-14-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	virtio_net: update tx timeout record	Xuan Zhuo
	If send queue sent some packets, we update the tx timeout record to prevent the tx timeout. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-13-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	virtio_net: xsk: tx: support xmit xsk buffer	Xuan Zhuo
	The driver's tx napi is very important for XSK. It is responsible for obtaining data from the XSK queue and sending it out. At the beginning, we need to trigger tx napi. virtnet_free_old_xmit distinguishes three types of pointer (skb, xdp frame, xsk buffer) by the last bits of the pointer. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-12-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	virtio_net: xsk: prevent disable tx napi	Xuan Zhuo
	Since xsk's TX queue is consumed by TX NAPI, if sq is bound to xsk, then we must stop tx napi from being disabled. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-11-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	virtio_net: xsk: bind/unbind xsk for tx	Xuan Zhuo
	This patch implements the logic of bind/unbind xsk pool to sq and rq. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-10-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	virtio_net: refactor the xmit type	Xuan Zhuo
	Because AF_XDP will introduce a new xmit type, refactor the xmit type mechanism first. We know both xdp_frame and sk_buff are at least 4 bytes aligned. For the xdp tx, we do not pass any pointer to virtio core as data, we just need to pass the len of the packet. So we will push len to the void pointer. We can make sure the pointer is 4 bytes aligned. And the data structure of AF_XDP also is at least 4 bytes aligned. So the last two bits of the pointers are free, and we can use these to distinguish them. 00 for skb 01 for SKB_ORPHAN 10 for XDP 11 for AF-XDP tx Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-9-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
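Because the pointers involved are at least 4-byte aligned, their two lowest bits are always zero and can carry a type tag. The sketch below illustrates that pointer-tagging technique with the 00/01/10/11 encoding listed in the commit message. The enum and helper names are hypothetical and do not reproduce the driver's identifiers.

```c
#include <stdint.h>

/* Tag values follow the encoding given in the commit message. */
enum xmit_type {
	XMIT_SKB        = 0, /* 00 */
	XMIT_SKB_ORPHAN = 1, /* 01 */
	XMIT_XDP        = 2, /* 10 */
	XMIT_XSK        = 3, /* 11 */
};

#define XMIT_TYPE_MASK ((uintptr_t)3)

/* Store the type in the two low bits; ptr must be 4-byte aligned. */
static void *xmit_pack(void *ptr, enum xmit_type type)
{
	return (void *)((uintptr_t)ptr | (uintptr_t)type);
}

/* Recover the type tag from a packed pointer. */
static enum xmit_type xmit_type_of(const void *tagged)
{
	return (enum xmit_type)((uintptr_t)tagged & XMIT_TYPE_MASK);
}

/* Recover the original pointer by clearing the tag bits. */
static void *xmit_ptr_of(void *tagged)
{
	return (void *)((uintptr_t)tagged & ~XMIT_TYPE_MASK);
}
```

The completion path can then branch on xmit_type_of() before freeing or recycling the buffer, with no separate per-buffer type field.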
2024-11-15	virtio_ring: remove API virtqueue_set_dma_premapped	Xuan Zhuo
	Now, this API is useless. remove it. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-8-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	virtio-net: rq submits premapped per-buffer	Xuan Zhuo
	virtio-net rq submits premapped per-buffer by setting sg page to NULL; Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-7-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	virtio_ring: introduce add api for premapped	Xuan Zhuo
	Two APIs are introduced to submit premapped per-buffers. int virtqueue_add_inbuf_premapped(struct virtqueue *vq, struct scatterlist *sg, unsigned int num, void *data, void *ctx, gfp_t gfp); int virtqueue_add_outbuf_premapped(struct virtqueue *vq, struct scatterlist *sg, unsigned int num, void *data, gfp_t gfp); Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-6-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	virtio_ring: perform premapped operations based on per-buffer	Xuan Zhuo
	The current configuration sets the virtqueue (vq) to premapped mode, implying that all buffers submitted to this queue must be mapped ahead of time. This presents a challenge for the virtnet send queue (sq): the virtnet driver would be required to keep track of dma information for vq size * 17, which can be substantial. However, if the premapped mode were applied on a per-buffer basis, the complexity would be greatly reduced. With AF_XDP enabled, AF_XDP buffers would become premapped, while kernel skb buffers could remain unmapped. And consider that some sgs are not generated by the virtio driver, that may be passed from the block stack. So we can not change the sgs, new APIs are the better way. So we pass the new argument 'premapped' to indicate the buffers submitted to virtio are premapped in advance. Additionally, DMA unmap operations for these buffers will be bypassed. Suggested-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-5-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	virtio_ring: packed: record extras for indirect buffers	Xuan Zhuo
	The subsequent commit needs to know whether every indirect buffer is premapped or not. So we need to introduce an extra struct for every indirect buffer to record this info. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-4-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	virtio_ring: split: record extras for indirect buffers	Xuan Zhuo
	The subsequent commit needs to know whether every indirect buffer is premapped or not. So we need to introduce an extra struct for every indirect buffer to record this info. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-3-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	virtio_ring: introduce vring_need_unmap_buffer	Xuan Zhuo
	To make the code readable, introduce vring_need_unmap_buffer() to replace do_unmap. use_dma_api premapped -> vring_need_unmap_buffer() 1. false false false 2. true false true 3. true true false Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241112012928.102478-2-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
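The three-row truth table above reduces to a single boolean expression. The following is a user-space sketch with a hypothetical function name. The real helper takes kernel structures, and the case use_dma_api = false with premapped = true is not listed in the commit message; this sketch returns false for it.

```c
#include <stdbool.h>

/*
 * Unmapping is needed only when the DMA API is in use and the buffer
 * was not premapped by the driver:
 *
 *   use_dma_api  premapped  -> need unmap
 *   false        false         false
 *   true         false         true
 *   true         true          false
 */
static bool need_unmap_buffer(bool use_dma_api, bool premapped)
{
	return use_dma_api && !premapped;
}
```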
2024-11-15	Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue	Jakub Kicinski
	Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2024-11-05 (ice, ixgbe, igc, igb, igbvf, e1000) For ice: Mateusz refactors and adds additional SerDes configuration values to be output. Przemek refactors processing of DDP and adds support for a flag field in the DDP's signature segment header. Joe Damato adds support for persistent NAPI config. Brett adjusts setting of Tx promiscuous based on unicast/multicast setting. Jake moves setting of pf->supported_rxdids to occur directly after DDP load and changes a small struct to use stack memory. Frederic Weisbecker adds WQ_UNBOUND flag to the workqueue. For ixgbe: Diomidis Spinellis removes a circular dependency. For igc: Vitaly removes an unneeded autoneg parameter. For igb: Johnny Park fixes a couple of typos. For igbvf: Wander Lairson Costa removes an unused spinlock. For e1000: Joe Damato adds RTNL lock to some calls where it is expected to be held. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: e1000: Hold RTNL when e1000_down can be called igbvf: remove unused spinlock igb: Fix 2 typos in comments in igb_main.c igc: remove autoneg parameter from igc_mac_info ixgbe: Break include dependency cycle ice: Unbind the workqueue ice: use stack variable for virtchnl_supported_rxdids ice: initialize pf->supported_rxdids immediately after loading DDP ice: only allow Tx promiscuous for multicast ice: Add support for persistent NAPI config ice: support optional flags in signature segment header ice: refactor "last" segment of DDP pkg ice: extend dump serdes equalizer values feature ice: rework of dump serdes equalizer values feature ==================== Link: https://patch.msgid.link/20241113185431.1289708-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	Merge branch 'net-ndo_fdb_add-del-have-drivers-report-whether-they-notified'	Jakub Kicinski
	Petr Machata says: ==================== net: ndo_fdb_add/del: Have drivers report whether they notified Currently when FDB entries are added to or deleted from a VXLAN netdevice, the VXLAN driver emits one notification, including the VXLAN-specific attributes. The core however always sends a notification as well, a generic one. Thus two notifications are unnecessarily sent for these operations. A similar situation comes up with bridge driver, which also emits notifications on its own. # ip link add name vx type vxlan id 1000 dstport 4789 # bridge monitor fdb & [1] 1981693 # bridge fdb add de:ad:be:ef:13:37 dev vx self dst 192.0.2.1 de:ad:be:ef:13:37 dev vx dst 192.0.2.1 self permanent de:ad:be:ef:13:37 dev vx self permanent In order to prevent this duplicity, add a parameter, bool *notified, to ndo_fdb_add and ndo_fdb_del. The flag is primed to false, and if the callee sends a notification on its own, it sets the flag to true, thus informing the core that it should not generate another notification. Patches #1 to #2 are concerned with the above. In the remaining patches, #3 to #7, add a selftest. This takes place across several patches. Many of the helpers we would like to use for the test are in forwarding/lib.sh, whereas net/ is a more suitable place for the test, so the libraries need to be massaged a bit first. ==================== Link: https://patch.msgid.link/cover.1731589511.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
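The "bool *notified" handshake described above can be modeled in a few lines. In this sketch the core primes the flag to false, calls the driver callback, and sends its own generic notification only if the driver did not set the flag. All names here (fdb_add_fn, core_fdb_add, the two example drivers, the counter) are hypothetical. The real ndo_fdb_add and ndo_fdb_del callbacks take many more parameters.

```c
#include <stdbool.h>

/* Simplified callback type: the driver may set *notified to true. */
typedef int (*fdb_add_fn)(bool *notified);

/* Counts notifications emitted by either the driver or the core. */
static int notifications_sent;

/* A driver that emits its own notification, as VXLAN is described to do. */
static int drv_self_notifying(bool *notified)
{
	notifications_sent++;
	*notified = true;
	return 0;
}

/* A driver that relies on the core to notify. */
static int drv_silent(bool *notified)
{
	(void)notified;
	return 0;
}

/* Core path: notify only when the callee has not already done so. */
static int core_fdb_add(fdb_add_fn add)
{
	bool notified = false;
	int err = add(&notified);

	if (err)
		return err;
	if (!notified)
		notifications_sent++;
	return 0;
}
```

Either way exactly one notification results per operation, which is the property the accompanying selftest checks.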
2024-11-15	selftests: net: fdb_notify: Add a test for FDB notifications	Petr Machata
	Check that only one notification is produced for various FDB edit operations. Regarding the ip_link_add() and ip_link_master() helpers: this pattern of action plus corresponding defer is bound to come up often, and a dedicated vocabulary to capture it will be handy. tunnel_create() and vlan_create() from forwarding/lib.sh are somewhat opaque and perhaps too kitchen-sinky, so I tried to go in the opposite direction with these ones, and wrapped only the bare minimum to schedule a corresponding cleanup. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://patch.msgid.link/910c5880ae6d3b558d6889cbdba2be690c2615c6.1731589511.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
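The "action plus corresponding defer" pattern described above can be sketched roughly as follows. This is a simplified illustration, not the selftest library's code: `defer` and `defer_run` here are minimal stand-ins for a cleanup stack, the body of `ip_link_add` is an assumption about what such a wrapper might contain, and `tmpdir_add` is a hypothetical helper used only so that the example runs without root.

```bash
#!/bin/bash
# Simplified stand-in for a defer mechanism: a stack of cleanup commands.
DEFERS=()

defer()
{
	# Store the command as one shell-quoted string.
	DEFERS+=("$(printf '%q ' "$@")")
}

# Run the scheduled cleanups in reverse order of registration.
defer_run()
{
	local i

	for ((i = ${#DEFERS[@]} - 1; i >= 0; i--)); do
		eval "${DEFERS[i]}"
	done
	DEFERS=()
}

# Action plus matching cleanup, in the spirit of the helper named above.
# It needs root and iproute2, so it is only defined here, not called.
ip_link_add()
{
	local name=$1; shift

	ip link add name "$name" "$@"
	defer ip link del dev "$name"
}

# Hypothetical unprivileged analogue: create a directory, schedule its removal.
tmpdir_add()
{
	mkdir "$1"
	defer rmdir "$1"
}

work=$(mktemp -d)
defer rmdir "$work"
tmpdir_add "$work/a"
tmpdir_add "$work/a/b"

# Cleanups run innermost first: a/b, then a, then the work directory itself.
defer_run
```

Running the cleanups in reverse order matters here: the nested directory has to go before its parent, just as an enslaved link would have to be released before its master is deleted.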
2024-11-15	selftests: net: lib: Add kill_process	Petr Machata
	A number of selftests run processes in the background and need to kill them afterwards. Instead of having everyone open-code the kill / wait / redirect mantra, add a helper in net/lib.sh. Convert existing open-coded sites. Signed-off-by: Petr Machata <petrm@nvidia.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Link: https://patch.msgid.link/a9db102067d741c118f0bd93b10c75e2a34665ea.1731589511.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
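The kill / wait / redirect mantra mentioned above might look roughly like this when wrapped in a helper. The function body is an assumption for illustration and not necessarily what was merged into net/lib.sh; the background `sleep` is just a stand-in for a real test process.

```bash
#!/bin/bash
# Sketch of a kill/wait/redirect helper; the body is an assumption, not a
# copy of net/lib.sh.
kill_process()
{
	local pid=$1; shift

	# Kill the process, reap it, and suppress the shell's "Terminated"
	# noise as well as errors for PIDs that are already gone.
	{ kill "$pid" && wait "$pid"; } 2>/dev/null
}

# Usage: start a stand-in background process and clean it up.
sleep 60 &
bg_pid=$!
kill_process "$bg_pid"

# The process has been reaped, so signal 0 can no longer be delivered.
if ! kill -0 "$bg_pid" 2>/dev/null; then
	echo "process $bg_pid is gone"
fi
```

Waiting after the kill is what keeps the shell from reporting the terminated job later, in the middle of unrelated test output.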
2024-11-15	selftests: net: lib: Move checks from forwarding/lib.sh here	Petr Machata
	For logging to be useful, something has to set RET and retmsg by calling ret_set_ksft_status(). There is a suite of functions to that end in forwarding/lib.sh: check_err, check_fail et al. Move them to net/lib.sh so that every net test can use them. Existing lib.sh users might be using these same names for their functions. However lib.sh is always sourced near the top of the file (checked), and any such definitions will simply override the ones provided by lib.sh. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://patch.msgid.link/f488a00dc85b8e0c1f3c71476b32b21b5189a847.1731589511.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
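How check_err and check_fail feed RET and retmsg can be sketched as below. This is a simplified model: the `ret_set_ksft_status` here is a stand-in that merely keeps the first failure, and the hard-coded status value 1 is an assumption made for the example; the real helpers in the selftest library are richer.

```bash
#!/bin/bash
# Simplified stand-ins; the real helpers in net/lib.sh are richer.
RET=0
retmsg=""

# Record a status and message, keeping the first failure seen.
ret_set_ksft_status()
{
	local status=$1; shift
	local msg=$1; shift

	if ((RET == 0 && status != 0)); then
		RET=$status
		retmsg=$msg
	fi
}

# Flag a failure if the given exit code is non-zero.
check_err()
{
	local err=$1
	local msg=$2

	if ((err)); then
		ret_set_ksft_status 1 "$msg"
	fi
}

# Flag a failure if the given exit code is zero, i.e. the command
# succeeded where it was expected to fail.
check_fail()
{
	local err=$1
	local msg=$2

	if ((err == 0)); then
		ret_set_ksft_status 1 "$msg"
	fi
}

true
check_err $? "true should succeed"
false
check_fail $? "false should fail"
false
check_err $? "expected command failed"

echo "RET=$RET retmsg=$retmsg"
# prints "RET=1 retmsg=expected command failed"
```

Only the third check trips, so the logged message points at the first step that actually went wrong.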
2024-11-15	selftests: net: lib: Move tests_run from forwarding/lib.sh here	Petr Machata
	It would be good to use the same mechanism for scheduling and dispatching general net tests as the many forwarding tests already use. To that end, move tests_run() to net/lib.sh so that every net test can use it. Existing lib.sh users might be using the name themselves. However lib.sh is always sourced near the top of the file (checked), and any such definition will simply override the one provided by lib.sh. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://patch.msgid.link/a6fc083486493425b2c61185c327845b6ce3233a.1731589511.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
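A dispatcher of this kind can be sketched in a few lines. The version below is an illustration under assumptions: the `TESTS` and `ALL_TESTS` variable names and the two test functions are hypothetical, and the loop body omits whatever per-test setup the real helper performs.

```bash
#!/bin/bash
# Hypothetical test cases for illustration.
test_one() { echo "running test_one"; }
test_two() { echo "running test_two"; }

ALL_TESTS="test_one test_two"

# Run each test named in TESTS, defaulting to everything in ALL_TESTS.
tests_run()
{
	local current_test

	for current_test in ${TESTS:-$ALL_TESTS}; do
		$current_test
	done
}

# Make sure a TESTS value inherited from the environment does not interfere.
unset TESTS

# Default: run the whole list.
output=$(tests_run)
echo "$output"

# Restrict the run to a single test for this one invocation.
subset=$(TESTS="test_two" tests_run)
echo "$subset"
```

Keeping the list in a variable lets a caller narrow a run to selected tests without editing the script.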
2024-11-15	selftests: net: lib: Move logging from forwarding/lib.sh here	Petr Machata
	Many net selftests invent their own logging helpers. These really should be in a library sourced by these tests. Currently forwarding/lib.sh has a suite of perfectly fine logging helpers, but sourcing a forwarding/ library from a higher-level directory smells of a layering violation. In this patch, move the logging helpers to net/lib.sh so that every net test can use them. Together with the logging helpers, it's also necessary to move pause_on_fail(), and EXIT_STATUS and RET. Existing lib.sh users might be using these same names for their functions or variables. However lib.sh is always sourced near the top of the file (checked), and any such definitions will simply override the ones provided by lib.sh. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://patch.msgid.link/edd3785a3bd72ffbe1409300989e993ee50ae98b.1731589511.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15	ndo_fdb_del: Add a parameter to report whether notification was sent	Petr Machata
	In a similar fashion to ndo_fdb_add, which was covered in the previous patch, add the bool *notified argument to ndo_fdb_del. Callees that send a notification on their own set the flag to true. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/06b1acf4953ef0a5ed153ef1f32d7292044f2be6.1731589511.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>