summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-01-08misc: microchip: pci1xxxx: Resolve kernel panic during GPIO IRQ handlingRengarajan S
Resolve kernel panic caused by improper handling of IRQs while accessing GPIO values. This is done by replacing generic_handle_irq with handle_nested_irq. Fixes: 1f4d8ae231f4 ("misc: microchip: pci1xxxx: Add gpio irq handler and irq helper functions irq_ack, irq_mask, irq_unmask and irq_set_type of irq_chip.") Cc: stable <stable@kernel.org> Signed-off-by: Rengarajan S <rengarajan.s@microchip.com> Link: https://lore.kernel.org/r/20241205133626.1483499-2-rengarajan.s@microchip.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08dm thin: make get_first_thin use rcu-safe list first functionKrister Johansen
The documentation in rculist.h explains the absence of list_empty_rcu() and cautions programmers against relying on a list_empty() -> list_first() sequence in RCU safe code. This is because each of these functions performs its own READ_ONCE() of the list head. This can lead to a situation where the list_empty() sees a valid list entry, but the subsequent list_first() sees a different view of list head state after a modification. In the case of dm-thin, this author had a production box crash from a GP fault in the process_deferred_bios path. This function saw a valid list head in get_first_thin() but when it subsequently dereferenced that and turned it into a thin_c, it got the inside of the struct pool, since the list was now empty and referring to itself. The kernel on which this occurred printed both a warning about a refcount_t being saturated, and a UBSAN error for an out-of-bounds cpuid access in the queued spinlock, prior to the fault itself. When the resulting kdump was examined, it was possible to see another thread patiently waiting in thin_dtr's synchronize_rcu. The thin_dtr call managed to pull the thin_c out of the active thins list (and have it be the last entry in the active_thins list) at just the wrong moment which lead to this crash. Fortunately, the fix here is straight forward. Switch get_first_thin() function to use list_first_or_null_rcu() which performs just a single READ_ONCE() and returns NULL if the list is already empty. This was run against the devicemapper test suite's thin-provisioning suites for delete and suspend and no regressions were observed. Signed-off-by: Krister Johansen <kjlx@templeofstupid.com> Fixes: b10ebd34ccca ("dm thin: fix rcu_read_lock being held in code that can sleep") Cc: stable@vger.kernel.org Acked-by: Ming-Hung Tsai <mtsai@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2025-01-08dm-ebs: don't set the flag DM_TARGET_PASSES_INTEGRITYMikulas Patocka
dm-ebs uses dm-bufio to process requests that are not aligned on logical sector size. dm-bufio doesn't support passing integrity data (and it is unclear how should it do it), so we shouldn't set the DM_TARGET_PASSES_INTEGRITY flag. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Fixes: d3c7b35c20d6 ("dm: add emulated block size target")
2025-01-08xfs: don't return an error from xfs_update_last_rtgroup_size for !XFS_RTChristoph Hellwig
Non-rtg file systems have a fake RT group even if they do not have a RT device, and thus an rgcount of 1. Ensure xfs_update_last_rtgroup_size doesn't fail when called for !XFS_RT to handle this case. Fixes: 87fe4c34a383 ("xfs: create incore realtime group structures") Reported-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-01-08MAINTAINERS: add missing maintainers for Simple Audio CardKuninori Morimoto
Mark Brown will take the patch. Add his name. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Link: https://patch.msgid.link/87v7uqkpye.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-01-08arm64: dts: rockchip: Fix the SD card detection on NanoPi R6C/R6SAnton Kirilov
Fix the SD card detection on FriendlyElec NanoPi R6C/R6S boards. Signed-off-by: Anton Kirilov <anton.kirilov@arm.com> Link: https://lore.kernel.org/r/20241219113145.483205-1-anton.kirilov@arm.com Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2025-01-08staging: gpib: mite: remove unused global functionsGreg Kroah-Hartman
The mite.c file was originally copied from the COMEDI code, and now that it is in the kernel tree, along with the comedi code, on some build configurations there are errors due to duplicate symbols (specifically mite_dma_disarm). Remove all of the unused functions in the gpib mite.c and .h files as they aren't needed and cause the compiler to be confused. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/r/202501081239.BAPhfAHJ-lkp@intel.com/ Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Link: https://lore.kernel.org/r/2025010809-padding-survive-91b3@gregkh Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08USB: serial: option: add Neoway N723-EA supportMichal Hrusecky
Update the USB serial option driver to support Neoway N723-EA. ID 2949:8700 Marvell Mobile Composite Device Bus T: Bus=02 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=2949 ProdID=8700 Rev= 1.00 S: Manufacturer=Marvell S: Product=Mobile Composite Device Bus S: SerialNumber=200806006809080000 C:* #Ifs= 5 Cfg#= 1 Atr=c0 MxPwr=500mA A: FirstIf#= 0 IfCount= 2 Cls=e0(wlcon) Sub=01 Prot=03 I:* If#= 0 Alt= 0 #EPs= 1 Cls=e0(wlcon) Sub=01 Prot=03 Driver=rndis_host E: Ad=87(I) Atr=03(Int.) MxPS= 64 Ivl=4096ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=rndis_host E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0c(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=89(I) Atr=03(Int.) MxPS= 64 Ivl=4096ms E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0b(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=86(I) Atr=03(Int.) MxPS= 64 Ivl=4096ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0e(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 6 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=88(I) Atr=03(Int.) MxPS= 64 Ivl=4096ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0a(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms Tested successfully connecting to the Internet via rndis interface after dialing via AT commands on If#=4 or If#=6. Not sure of the purpose of the other serial interface. Signed-off-by: Michal Hrusecky <michal.hrusecky@turris.com> Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan@kernel.org>
2025-01-08USB: serial: option: add MeiG Smart SRM815Chukun Pan
It looks like SRM815 shares ID with SRM825L. T: Bus=03 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=480 MxCh= 0 D: Ver= 2.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=2dee ProdID=4d22 Rev= 4.14 S: Manufacturer=MEIG S: Product=LTE-A Module S: SerialNumber=123456 C:* #Ifs= 5 Cfg#= 1 Atr=80 MxPwr=500mA I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=83(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=85(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=87(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none) E: Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=88(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms Signed-off-by: Chukun Pan <amadeus@jmu.edu.cn> Link: https://lore.kernel.org/lkml/20241215100027.1970930-1-amadeus@jmu.edu.cn/ Link: https://lore.kernel.org/all/4333b4d0-281f-439d-9944-5570cbc4971d@gmail.com/ Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan@kernel.org>
2025-01-08USB: serial: cp210x: add Phoenix Contact UPS DeviceJohan Hovold
Phoenix Contact sells UPS Quint devices [1] with a custom datacable [2] that embeds a Silicon Labs converter: Bus 001 Device 003: ID 1b93:1013 Silicon Labs Phoenix Contact UPS Device Device Descriptor: bLength 18 bDescriptorType 1 bcdUSB 2.00 bDeviceClass 0 bDeviceSubClass 0 bDeviceProtocol 0 bMaxPacketSize0 64 idVendor 0x1b93 idProduct 0x1013 bcdDevice 1.00 iManufacturer 1 Silicon Labs iProduct 2 Phoenix Contact UPS Device iSerial 3 <redacted> bNumConfigurations 1 Configuration Descriptor: bLength 9 bDescriptorType 2 wTotalLength 0x0020 bNumInterfaces 1 bConfigurationValue 1 iConfiguration 0 bmAttributes 0x80 (Bus Powered) MaxPower 100mA Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 0 bAlternateSetting 0 bNumEndpoints 2 bInterfaceClass 255 Vendor Specific Class bInterfaceSubClass 0 bInterfaceProtocol 0 iInterface 2 Phoenix Contact UPS Device Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x01 EP 1 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0040 1x 64 bytes bInterval 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x82 EP 2 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0040 1x 64 bytes bInterval 0 [1] https://www.phoenixcontact.com/en-pc/products/power-supply-unit-quint-ps-1ac-24dc-10-2866763 [2] https://www.phoenixcontact.com/en-il/products/data-cable-preassembled-ifs-usb-datacable-2320500 Reported-by: Giuseppe Corbelli <giuseppe.corbelli@antaresvision.com> Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan@kernel.org>
2025-01-08gpio: loongson: Fix Loongson-2K2000 ACPI GPIO register offsetBinbin Zhou
Since commit 3feb70a61740 ("gpio: loongson: add more gpio chip support"), the Loongson-2K2000 GPIO is supported. However, according to the firmware development specification, the Loongson-2K2000 ACPI GPIO register offsets in the driver do not match the register base addresses in the firmware, resulting in the registers not being accessed properly. Now, we fix it to ensure the GPIO function works properly. Cc: stable@vger.kernel.org Cc: Yinbo Zhu <zhuyinbo@loongson.cn> Fixes: 3feb70a61740 ("gpio: loongson: add more gpio chip support") Co-developed-by: Hongliang Wang <wanghongliang@loongson.cn> Signed-off-by: Hongliang Wang <wanghongliang@loongson.cn> Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> Link: https://lore.kernel.org/r/20250107103856.1037222-1-zhoubinbin@loongson.cn Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-01-08Revert "drm/i915/hdcp: Don't enable HDCP1.4 directly from check_link"Suraj Kandpal
This reverts commit 483f7d94a0453564ad9295288c0242136c5f36a0. This needs to be reverted since HDCP even after updating the connector state HDCP property we don't reenable HDCP until the next commit in which the CP Property is set causing compliance to fail. --v2 -Fix build issue [Dnyaneshwar] Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com> Reviewed-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250103084517.239998-1-suraj.kandpal@intel.com (cherry picked from commit fcf73e20cd1fe60c3ba5f9626f1e8f9cd4511edf) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
2025-01-07Merge branch ↵Jakub Kicinski
'intel-wired-lan-driver-updates-2025-01-06-igb-igc-ixgbe-ixgbevf-i40e-fm10k' Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2025-01-06 (igb, igc, ixgbe, ixgbevf, i40e, fm10k) For igb: Sriram Yagnaraman and Kurt Kanzenbach add support for AF_XDP zero-copy. Original cover letter: The first couple of patches adds helper functions to prepare for AF_XDP zero-copy support which comes in the last couple of patches, one each for Rx and TX paths. As mentioned in v1 patchset [0], I don't have access to an actual IGB device to provide correct performance numbers. I have used Intel 82576EB emulator in QEMU [1] to test the changes to IGB driver. The tests use one isolated vCPU for RX/TX and one isolated vCPU for the xdp-sock application [2]. Hope these measurements provide at the least some indication on the increase in performance when using ZC, especially in the TX path. It would be awesome if someone with a real IGB NIC can test the patch. AF_XDP performance using 64 byte packets in Kpps. Benchmark: XDP-SKB XDP-DRV XDP-DRV(ZC) rxdrop 220 235 350 txpush 1.000 1.000 410 l2fwd 1.000 1.000 200 AF_XDP performance using 1500 byte packets in Kpps. Benchmark: XDP-SKB XDP-DRV XDP-DRV(ZC) rxdrop 200 210 310 txpush 1.000 1.000 410 l2fwd 0.900 1.000 160 [0]: https://lore.kernel.org/intel-wired-lan/20230704095915.9750-1-sriram.yagnaraman@est.tech/ [1]: https://www.qemu.org/docs/master/system/devices/igb.html [2]: https://github.com/xdp-project/bpf-examples/tree/master/AF_XDP-example Subsequent changes and information can be found here: https://lore.kernel.org/intel-wired-lan/20241018-b4-igb_zero_copy-v9-0-da139d78d796@linutronix.de/ Yue Haibing converts use of ERR_PTR return to traditional error code which resolves a smatch warning. For igc: Song Yoong Siang allows for the XDP program to be hot-swapped. Yue Haibing converts use of ERR_PTR return to traditional error code which resolves a smatch warning. Joe Damato adds sets IRQ and queues to NAPI instances to allow for reporting via netdev-genl API. For ixgbe: Yue Haibing converts use of ERR_PTR return to traditional error code which resolves a smatch warning. For ixgbevf: Yue Haibing converts use of ERR_PTR return to traditional error code which resolves a smatch warning. For i40e: Alex implements "mdd-auto-reset-vf" private flag to automatically reset VFs when encountering an MDD event. For fm10k: Dr. David Alan Gilbert removes an unused function. ==================== Link: https://patch.msgid.link/20250106221929.956999-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07intel/fm10k: Remove unused fm10k_iov_msg_mac_vlan_pfDr. David Alan Gilbert
fm10k_iov_msg_mac_vlan_pf() has been unused since 2017's commit 1f5c27e52857 ("fm10k: use the MAC/VLAN queue for VF<->PF MAC/VLAN requests") Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250106221929.956999-16-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07igc: Link queues to NAPI instancesJoe Damato
Link queues to NAPI instances via netdev-genl API so that users can query this information with netlink. Handle a few cases in the driver: 1. Link/unlink the NAPIs when XDP is enabled/disabled 2. Handle IGC_FLAG_QUEUE_PAIRS enabled and disabled Example output when IGC_FLAG_QUEUE_PAIRS is enabled: $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \ --dump queue-get --json='{"ifindex": 2}' [{'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'rx'}, {'id': 1, 'ifindex': 2, 'napi-id': 8194, 'type': 'rx'}, {'id': 2, 'ifindex': 2, 'napi-id': 8195, 'type': 'rx'}, {'id': 3, 'ifindex': 2, 'napi-id': 8196, 'type': 'rx'}, {'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'tx'}, {'id': 1, 'ifindex': 2, 'napi-id': 8194, 'type': 'tx'}, {'id': 2, 'ifindex': 2, 'napi-id': 8195, 'type': 'tx'}, {'id': 3, 'ifindex': 2, 'napi-id': 8196, 'type': 'tx'}] Since IGC_FLAG_QUEUE_PAIRS is enabled, you'll note that the same NAPI ID is present for both rx and tx queues at the same index, for example index 0: {'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'rx'}, {'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'tx'}, To test IGC_FLAG_QUEUE_PAIRS disabled, a test system was booted using the grub command line option "maxcpus=2" to force igc_set_interrupt_capability to disable IGC_FLAG_QUEUE_PAIRS. Example output when IGC_FLAG_QUEUE_PAIRS is disabled: $ lscpu | grep "On-line CPU" On-line CPU(s) list: 0,2 $ ethtool -l enp86s0 | tail -5 Current hardware settings: RX: n/a TX: n/a Other: 1 Combined: 2 $ cat /proc/interrupts | grep enp 144: [...] enp86s0 145: [...] enp86s0-rx-0 146: [...] enp86s0-rx-1 147: [...] enp86s0-tx-0 148: [...] enp86s0-tx-1 1 "other" IRQ, and 2 IRQs for each of RX and Tx, so we expect netlink to report 4 IRQs with unique NAPI IDs: $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \ --dump napi-get --json='{"ifindex": 2}' [{'id': 8196, 'ifindex': 2, 'irq': 148}, {'id': 8195, 'ifindex': 2, 'irq': 147}, {'id': 8194, 'ifindex': 2, 'irq': 146}, {'id': 8193, 'ifindex': 2, 'irq': 145}] Now we examine which queues these NAPIs are associated with, expecting that since IGC_FLAG_QUEUE_PAIRS is disabled each RX and TX queue will have its own NAPI instance: $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \ --dump queue-get --json='{"ifindex": 2}' [{'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'rx'}, {'id': 1, 'ifindex': 2, 'napi-id': 8194, 'type': 'rx'}, {'id': 0, 'ifindex': 2, 'napi-id': 8195, 'type': 'tx'}, {'id': 1, 'ifindex': 2, 'napi-id': 8196, 'type': 'tx'}] Signed-off-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Tested-by: Avigail Dahan <avigailx.dahan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250106221929.956999-15-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07igc: Link IRQs to NAPI instancesJoe Damato
Link IRQs to NAPI instances via netdev-genl API so that users can query this information with netlink. Compare the output of /proc/interrupts (noting that IRQ 128 is the "other" IRQ which does not appear to have a NAPI instance): $ cat /proc/interrupts | grep enp86s0 | cut --delimiter=":" -f1 128 129 130 131 132 The output from netlink shows the mapping of NAPI IDs to IRQs (again noting that 128 is absent as it is the "other" IRQ): $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \ --dump napi-get --json='{"ifindex": 2}' [{'defer-hard-irqs': 0, 'gro-flush-timeout': 0, 'id': 8196, 'ifindex': 2, 'irq': 132}, {'defer-hard-irqs': 0, 'gro-flush-timeout': 0, 'id': 8195, 'ifindex': 2, 'irq': 131}, {'defer-hard-irqs': 0, 'gro-flush-timeout': 0, 'id': 8194, 'ifindex': 2, 'irq': 130}, {'defer-hard-irqs': 0, 'gro-flush-timeout': 0, 'id': 8193, 'ifindex': 2, 'irq': 129}] Signed-off-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Tested-by: Avigail Dahan <avigailx.dahan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250106221929.956999-14-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07i40e: add ability to reset VF for Tx and Rx MDD eventsAleksandr Loktionov
Implement "mdd-auto-reset-vf" priv-flag to handle Tx and Rx MDD events for VFs. This flag is also used in other network adapters like ICE. Usage: - "on" - The problematic VF will be automatically reset if a malformed descriptor is detected. - "off" - The problematic VF will be disabled. In cases where a VF sends malformed packets classified as malicious, it can cause the Tx queue to freeze, rendering it unusable for several minutes. When an MDD event occurs, this new implementation allows for a graceful VF reset to quickly restore operational state. Currently, VF queues are disabled if an MDD event occurs. This patch adds the ability to reset the VF if a Tx or Rx MDD event occurs. It also includes MDD event logging throttling to avoid dmesg pollution and unifies the format of Tx and Rx MDD messages. Note: Standard message rate limiting functions like dev_info_ratelimited() do not meet our requirements. Custom rate limiting is implemented, please see the code for details. Co-developed-by: Jan Sokolowski <jan.sokolowski@intel.com> Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Co-developed-by: Padraig J Connolly <padraig.j.connolly@intel.com> Signed-off-by: Padraig J Connolly <padraig.j.connolly@intel.com> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Michal Schmidt <mschmidt@redhat.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250106221929.956999-13-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07ixgbevf: Fix passing 0 to ERR_PTR in ixgbevf_run_xdp()Yue Haibing
ixgbevf_run_xdp() converts customed xdp action to a negative error code with the sk_buff pointer type which be checked with IS_ERR in ixgbevf_clean_rx_irq(). Remove this error pointer handing instead use plain int return value. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250106221929.956999-12-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07ixgbe: Fix passing 0 to ERR_PTR in ixgbe_run_xdp()Yue Haibing
ixgbe_run_xdp() converts customed xdp action to a negative error code with the sk_buff pointer type which be checked with IS_ERR in ixgbe_clean_rx_irq(). Remove this error pointer handing instead use plain int return value. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250106221929.956999-11-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07igb: Fix passing 0 to ERR_PTR in igb_run_xdp()Yue Haibing
igb_run_xdp() converts customed xdp action to a negative error code with the sk_buff pointer type which be checked with IS_ERR in igb_clean_rx_irq(). Remove this error pointer handing instead use plain int return value. Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250106221929.956999-10-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07igc: Fix passing 0 to ERR_PTR in igc_xdp_run_prog()Yue Haibing
igc_xdp_run_prog() converts customed xdp action to a negative error code with the sk_buff pointer type which be checked with IS_ERR in igc_clean_rx_irq(). Remove this error pointer handing instead use plain int return value to fix this smatch warnings: drivers/net/ethernet/intel/igc/igc_main.c:2533 igc_xdp_run_prog() warn: passing zero to 'ERR_PTR' Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Avigail Dahan <avigailx.dahan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250106221929.956999-9-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07igc: Allow hot-swapping XDP programSong Yoong Siang
Currently, the driver would always close and reopen the network interface when setting/removing the XDP program, regardless of the presence of XDP resources. This could cause unnecessary disruptions. To avoid this, introduces a check to determine if there is a need to close and reopen the interface, allowing for seamless hot-swapping of XDP programs. Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Avigail Dahan <avigailx.dahan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250106221929.956999-8-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07igb: Add AF_XDP zero-copy Tx supportSriram Yagnaraman
Add support for AF_XDP zero-copy transmit path. A new TX buffer type IGB_TYPE_XSK is introduced to indicate that the Tx frame was allocated from the xsk buff pool, so igb_clean_tx_ring() and igb_clean_tx_irq() can clean the buffers correctly based on type. igb_xmit_zc() performs the actual packet transmit when AF_XDP zero-copy is enabled. We share the TX ring between slow path, XDP and AF_XDP zero-copy, so we use the netdev queue lock to ensure mutual exclusion. Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> [Kurt: Set olinfo_status in igb_xmit_zc() so that frames are transmitted, Use READ_ONCE() for xsk_pool and check Tx disabled and carrier in igb_xmit_zc(), Add FIXME for RS bit] Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250106221929.956999-7-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07igb: Add AF_XDP zero-copy Rx supportSriram Yagnaraman
Add support for AF_XDP zero-copy receive path. When AF_XDP zero-copy is enabled, the rx buffers are allocated from the xsk buff pool using igb_alloc_rx_buffers_zc(). Use xsk_pool_get_rx_frame_size() to set SRRCTL rx buf size when zero-copy is enabled. Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> [Kurt: Port to v6.12 and provide napi_id for xdp_rxq_info_reg(), RCT, remove NETDEV_XDP_ACT_XSK_ZEROCOPY, update NTC handling, READ_ONCE() xsk_pool, likelyfy for XDP_REDIRECT case] Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250106221929.956999-6-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07igb: Add XDP finalize and stats update functionsKurt Kanzenbach
Move XDP finalize and Rx statistics update into separate functions. This way, they can be reused by the XDP and XDP/ZC code later. Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250106221929.956999-5-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07igb: Introduce XSK data structures and helpersSriram Yagnaraman
Add the following ring flag: - IGB_RING_FLAG_TX_DISABLED (when xsk pool is being setup) Add a xdp_buff array for use with XSK receive batch API, and a pointer to xsk_pool in igb_adapter. Add enable/disable functions for TX and RX rings. Add enable/disable functions for XSK pool. Add xsk wakeup function. None of the above functionality will be active until NETDEV_XDP_ACT_XSK_ZEROCOPY is advertised in netdev->xdp_features. Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> [Kurt: Add READ/WRITE_ONCE(), synchronize_net(), remove IGB_RING_FLAG_AF_XDP_ZC] Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250106221929.956999-4-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07igb: Introduce igb_xdp_is_enabled()Sriram Yagnaraman
Introduce igb_xdp_is_enabled() to check if an XDP program is assigned to the device. Use that wherever xdp_prog is read and evaluated. Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> [Kurt: Split patches and use READ_ONCE()] Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250106221929.956999-3-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07igb: Remove static qualifiersSriram Yagnaraman
Remove static qualifiers on the following functions to be able to call from XSK specific file that is added in the later patches: - igb_xdp_tx_queue_mapping() - igb_xdp_ring_update_tail() - igb_clean_tx_ring() - igb_clean_rx_ring() - igb_xdp_xmit_back() - igb_process_skb_fields() While at it, inline igb_xdp_tx_queue_mapping() and igb_xdp_ring_update_tail(). These functions are small enough and used in XDP hot paths. Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> [Kurt: Split patches, inline small XDP functions] Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250106221929.956999-2-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07Merge branch 'tools-ynl-decode-link-types-present-in-tests'Jakub Kicinski
Jakub Kicinski says: ==================== tools: ynl: decode link types present in tests Using a kernel built for the net selftest target to run drivers/net tests currently fails, because the net kernel automatically spawns a handful of tunnel devices which YNL can't decode. Fill in those missing link types in rt_link. We need to extend subset support a bit for it to work. v1: https://lore.kernel.org/20250105012523.1722231-1-kuba@kernel.org ==================== Link: https://patch.msgid.link/20250107022820.2087101-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07netlink: specs: rt_link: decode ip6tnl, vti and vti6 link attrsJakub Kicinski
Some of our tests load vti and ip6tnl so not being able to decode the link attrs gets in the way of using Python YNL for testing. Decode link attributes for ip6tnl, vti and vti6. ip6tnl uses IFLA_IPTUN_FLAGS as u32, while ipv4 and sit expect a u16 attribute, so we have a (first?) subset type override... Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250107022820.2087101-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07tools: ynl: print some information about attribute we can't parseJakub Kicinski
When parsing throws an exception one often has to figure out which attribute couldn't be parsed from first principles. For families with large message parsing trees like rtnetlink guessing the attribute can be hard. Print a bit of information as the exception travels out, e.g.: # when dumping rt links Error decoding 'flags' from 'linkinfo-ip6tnl-attrs' Error decoding 'data' from 'linkinfo-attrs' Error decoding 'linkinfo' from 'link-attrs' Traceback (most recent call last): File "/home/kicinski/linux/./tools/net/ynl/cli.py", line 119, in <module> main() File "/home/kicinski/linux/./tools/net/ynl/cli.py", line 100, in main reply = ynl.dump(args.dump, attrs) File "/home/kicinski/linux/tools/net/ynl/lib/ynl.py", line 1064, in dump return self._op(method, vals, dump=True) File "/home/kicinski/linux/tools/net/ynl/lib/ynl.py", line 1058, in _op return self._ops(ops)[0] File "/home/kicinski/linux/tools/net/ynl/lib/ynl.py", line 1045, in _ops rsp_msg = self._decode(decoded.raw_attrs, op.attr_set.name) File "/home/kicinski/linux/tools/net/ynl/lib/ynl.py", line 738, in _decode subdict = self._decode(NlAttrs(attr.raw), attr_spec['nested-attributes'], search_attrs) File "/home/kicinski/linux/tools/net/ynl/lib/ynl.py", line 763, in _decode decoded = self._decode_sub_msg(attr, attr_spec, search_attrs) File "/home/kicinski/linux/tools/net/ynl/lib/ynl.py", line 714, in _decode_sub_msg subdict = self._decode(NlAttrs(attr.raw, offset), msg_format.attr_set) File "/home/kicinski/linux/tools/net/ynl/lib/ynl.py", line 749, in _decode decoded = attr.as_scalar(attr_spec['type'], attr_spec.byte_order) File "/home/kicinski/linux/tools/net/ynl/lib/ynl.py", line 147, in as_scalar return format.unpack(self.raw)[0] struct.error: unpack requires a buffer of 2 bytes The Traceback is what we would previously see, the "Error..." messages are new. We print a message per level (in the stack order). Printing single combined message gets tricky quickly given sub-messages etc. Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250107022820.2087101-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07tools: ynl: correctly handle overrides of fields in subsetJakub Kicinski
We stated in documentation [1] and previous discussions [2] that the need for overriding fields in members of subsets is anticipated. Implement it. Since each attr is now a new object we need to make sure that the modifications are propagated. Specifically C codegen wants to annotate which attrs are used in requests and replies to generate the right validation artifacts. [1] https://docs.kernel.org/next/userspace-api/netlink/specs.html#subset-of [2] https://lore.kernel.org/netdev/20231004171350.1f59cd1d@kernel.org/ Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20250107022820.2087101-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07eth: gve: use appropriate helper to set xdp_featuresJakub Kicinski
Commit f85949f98206 ("xdp: add xdp_set_features_flag utility routine") added routines to inform the core about XDP flag changes. GVE support was added around the same time and missed using them. GVE only changes the flags on error recover or resume. Presumably the flags may change during resume if VM migrated. User would not get the notification and upper devices would not get a chance to recalculate their flags. Fixes: 75eaae158b1b ("gve: Add XDP DROP and TX support for GQI-QPL format") Reviewed-By: Jeroen de Borst <jeroendb@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250106180210.1861784-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07if_vlan: fix kdoc warningsJakub Kicinski
While merging net to net-next I noticed that the kdoc above __vlan_get_protocol_offset() has the wrong function name. Fix that and all the other kdoc warnings in this file. Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250106174620.1855269-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07Merge branch 'net-dsa-cleanup-eee-part-2'Jakub Kicinski
Russell King says: ==================== net: dsa: cleanup EEE (part 2) This is part 2 of the DSA EEE cleanups, removing what has become dead code as a result of the EEE management phylib now does. Patch 1 removes the useless setting of tx_lpi parameters in the ksz driver. Patch 2 does the same for mt753x. Patch 3 removes the DSA core code that calls the get_mac_eee() operation. This needs to be done before removing the implementations because doing otherwise would cause dsa_user_get_eee() to return -EOPNOTSUPP. Patches 4..8 remove the trivial get_mac_eee() implementations from DSA drivers. Patch 9 finally removes the get_mac_eee() method from struct dsa_switch_ops. ==================== Link: https://patch.msgid.link/Z3vDwwsHSxH5D6Pm@shell.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07net: dsa: remove get_mac_eee() methodRussell King (Oracle)
The get_mac_eee() is no longer called by the core DSA code, nor are there any implementations of this method. Remove it. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tUllU-007UzL-KV@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07net: dsa: qca: remove qca8k_get_mac_eee()Russell King (Oracle)
qca8k_get_mac_eee() is no longer called by the core DSA code. Remove it. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tUllP-007UzF-Gk@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07net: dsa: mv88e6xxx: remove mv88e6xxx_get_mac_eee()Russell King (Oracle)
mv88e6xxx_get_mac_eee() is no longer called by the core DSA code. Remove it. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tUllK-007Uz9-D7@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07net: dsa: mt753x: remove ksz_get_mac_eee()Russell King (Oracle)
mt753x_get_mac_eee() is no longer called by the core DSA code. Remove it. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Chester A. Unal <chester.a.unal@arinc9.com> Link: https://patch.msgid.link/E1tUllF-007Uz3-95@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07net: dsa: ksz: remove ksz_get_mac_eee()Russell King (Oracle)
ksz_get_mac_eee() is no longer called by the core DSA code. Remove it. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tUllA-007Uyx-4o@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07net: dsa: b53/bcm_sf2: remove b53_get_mac_eee()Russell King (Oracle)
b53_get_mac_eee() is no longer called by the core DSA code. Remove it. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tUll5-007Uyr-1U@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07net: dsa: no longer call ds->ops->get_mac_eee()Russell King (Oracle)
All implementations of get_mac_eee() now just return zero without doing anything useful. Remove the call to this method in preparation to removing the method from each DSA driver. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tUlkz-007Uyl-UA@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07net: dsa: mt753x: remove setting of tx_lpi parametersRussell King (Oracle)
dsa_user_get_eee() calls the DSA switch get_mac_eee() method followed by phylink_ethtool_get_eee(), which goes on to call phy_ethtool_get_eee(). This overwrites all members of the passed ethtool_keee, which means anything written by the DSA switch get_mac_eee() method will be discarded. Remove setting any members in mt753x_get_mac_eee(). Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Chester A. Unal <chester.a.unal@arinc9.com> Link: https://patch.msgid.link/E1tUlku-007Uyc-RP@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07net: dsa: ksz: remove setting of tx_lpi parametersRussell King (Oracle)
dsa_user_get_eee() calls the DSA switch get_mac_eee() method followed by phylink_ethtool_get_eee(), which goes on to call phy_ethtool_get_eee(). This overwrites all members of the passed ethtool_keee, which means anything written by the DSA switch get_mac_eee() method will be discarded. Remove setting any members in ksz_get_mac_eee(). Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tUlkp-007UyW-OR@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07ipvlan: Fix use-after-free in ipvlan_get_iflink().Kuniyuki Iwashima
syzbot presented an use-after-free report [0] regarding ipvlan and linkwatch. ipvlan does not hold a refcnt of the lower device unlike vlan and macvlan. If the linkwatch work is triggered for the ipvlan dev, the lower dev might have already been freed, resulting in UAF of ipvlan->phy_dev in ipvlan_get_iflink(). We can delay the lower dev unregistration like vlan and macvlan by holding the lower dev's refcnt in dev->netdev_ops->ndo_init() and releasing it in dev->priv_destructor(). Jakub pointed out calling .ndo_XXX after unregister_netdevice() has returned is error prone and suggested [1] addressing this UAF in the core by taking commit 750e51603395 ("net: avoid potential UAF in default_operstate()") further. Let's assume unregistering devices DOWN and use RCU protection in default_operstate() not to race with the device unregistration. [0]: BUG: KASAN: slab-use-after-free in ipvlan_get_iflink+0x84/0x88 drivers/net/ipvlan/ipvlan_main.c:353 Read of size 4 at addr ffff0000d768c0e0 by task kworker/u8:35/6944 CPU: 0 UID: 0 PID: 6944 Comm: kworker/u8:35 Not tainted 6.13.0-rc2-g9bc5c9515b48 #12 4c3cb9e8b4565456f6a355f312ff91f4f29b3c47 Hardware name: linux,dummy-virt (DT) Workqueue: events_unbound linkwatch_event Call trace: show_stack+0x38/0x50 arch/arm64/kernel/stacktrace.c:484 (C) __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0xbc/0x108 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0x16c/0x6f0 mm/kasan/report.c:489 kasan_report+0xc0/0x120 mm/kasan/report.c:602 __asan_report_load4_noabort+0x20/0x30 mm/kasan/report_generic.c:380 ipvlan_get_iflink+0x84/0x88 drivers/net/ipvlan/ipvlan_main.c:353 dev_get_iflink+0x7c/0xd8 net/core/dev.c:674 default_operstate net/core/link_watch.c:45 [inline] rfc2863_policy+0x144/0x360 net/core/link_watch.c:72 linkwatch_do_dev+0x60/0x228 net/core/link_watch.c:175 __linkwatch_run_queue+0x2f4/0x5b8 net/core/link_watch.c:239 linkwatch_event+0x64/0xa8 net/core/link_watch.c:282 process_one_work+0x700/0x1398 kernel/workqueue.c:3229 process_scheduled_works kernel/workqueue.c:3310 [inline] worker_thread+0x8c4/0xe10 kernel/workqueue.c:3391 kthread+0x2b0/0x360 kernel/kthread.c:389 ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:862 Allocated by task 9303: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x30/0x68 mm/kasan/common.c:68 kasan_save_alloc_info+0x44/0x58 mm/kasan/generic.c:568 poison_kmalloc_redzone mm/kasan/common.c:377 [inline] __kasan_kmalloc+0x84/0xa0 mm/kasan/common.c:394 kasan_kmalloc include/linux/kasan.h:260 [inline] __do_kmalloc_node mm/slub.c:4283 [inline] __kmalloc_node_noprof+0x2a0/0x560 mm/slub.c:4289 __kvmalloc_node_noprof+0x9c/0x230 mm/util.c:650 alloc_netdev_mqs+0xb4/0x1118 net/core/dev.c:11209 rtnl_create_link+0x2b8/0xb60 net/core/rtnetlink.c:3595 rtnl_newlink_create+0x19c/0x868 net/core/rtnetlink.c:3771 __rtnl_newlink net/core/rtnetlink.c:3896 [inline] rtnl_newlink+0x122c/0x15c0 net/core/rtnetlink.c:4011 rtnetlink_rcv_msg+0x61c/0x918 net/core/rtnetlink.c:6901 netlink_rcv_skb+0x1dc/0x398 net/netlink/af_netlink.c:2542 rtnetlink_rcv+0x34/0x50 net/core/rtnetlink.c:6928 netlink_unicast_kernel net/netlink/af_netlink.c:1321 [inline] netlink_unicast+0x618/0x838 net/netlink/af_netlink.c:1347 netlink_sendmsg+0x5fc/0x8b0 net/netlink/af_netlink.c:1891 sock_sendmsg_nosec net/socket.c:711 [inline] __sock_sendmsg net/socket.c:726 [inline] __sys_sendto+0x2ec/0x438 net/socket.c:2197 __do_sys_sendto net/socket.c:2204 [inline] __se_sys_sendto net/socket.c:2200 [inline] __arm64_sys_sendto+0xe4/0x110 net/socket.c:2200 __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline] invoke_syscall+0x90/0x278 arch/arm64/kernel/syscall.c:49 el0_svc_common+0x13c/0x250 arch/arm64/kernel/syscall.c:132 do_el0_svc+0x54/0x70 arch/arm64/kernel/syscall.c:151 el0_svc+0x4c/0xa8 arch/arm64/kernel/entry-common.c:744 el0t_64_sync_handler+0x78/0x108 arch/arm64/kernel/entry-common.c:762 el0t_64_sync+0x198/0x1a0 arch/arm64/kernel/entry.S:600 Freed by task 10200: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x30/0x68 mm/kasan/common.c:68 kasan_save_free_info+0x58/0x70 mm/kasan/generic.c:582 poison_slab_object mm/kasan/common.c:247 [inline] __kasan_slab_free+0x48/0x68 mm/kasan/common.c:264 kasan_slab_free include/linux/kasan.h:233 [inline] slab_free_hook mm/slub.c:2338 [inline] slab_free mm/slub.c:4598 [inline] kfree+0x140/0x420 mm/slub.c:4746 kvfree+0x4c/0x68 mm/util.c:693 netdev_release+0x94/0xc8 net/core/net-sysfs.c:2034 device_release+0x98/0x1c0 kobject_cleanup lib/kobject.c:689 [inline] kobject_release lib/kobject.c:720 [inline] kref_put include/linux/kref.h:65 [inline] kobject_put+0x2b0/0x438 lib/kobject.c:737 netdev_run_todo+0xdd8/0xf48 net/core/dev.c:10924 rtnl_unlock net/core/rtnetlink.c:152 [inline] rtnl_net_unlock net/core/rtnetlink.c:209 [inline] rtnl_dellink+0x484/0x680 net/core/rtnetlink.c:3526 rtnetlink_rcv_msg+0x61c/0x918 net/core/rtnetlink.c:6901 netlink_rcv_skb+0x1dc/0x398 net/netlink/af_netlink.c:2542 rtnetlink_rcv+0x34/0x50 net/core/rtnetlink.c:6928 netlink_unicast_kernel net/netlink/af_netlink.c:1321 [inline] netlink_unicast+0x618/0x838 net/netlink/af_netlink.c:1347 netlink_sendmsg+0x5fc/0x8b0 net/netlink/af_netlink.c:1891 sock_sendmsg_nosec net/socket.c:711 [inline] __sock_sendmsg net/socket.c:726 [inline] ____sys_sendmsg+0x410/0x708 net/socket.c:2583 ___sys_sendmsg+0x178/0x1d8 net/socket.c:2637 __sys_sendmsg net/socket.c:2669 [inline] __do_sys_sendmsg net/socket.c:2674 [inline] __se_sys_sendmsg net/socket.c:2672 [inline] __arm64_sys_sendmsg+0x12c/0x1c8 net/socket.c:2672 __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline] invoke_syscall+0x90/0x278 arch/arm64/kernel/syscall.c:49 el0_svc_common+0x13c/0x250 arch/arm64/kernel/syscall.c:132 do_el0_svc+0x54/0x70 arch/arm64/kernel/syscall.c:151 el0_svc+0x4c/0xa8 arch/arm64/kernel/entry-common.c:744 el0t_64_sync_handler+0x78/0x108 arch/arm64/kernel/entry-common.c:762 el0t_64_sync+0x198/0x1a0 arch/arm64/kernel/entry.S:600 The buggy address belongs to the object at ffff0000d768c000 which belongs to the cache kmalloc-cg-4k of size 4096 The buggy address is located 224 bytes inside of freed 4096-byte region [ffff0000d768c000, ffff0000d768d000) The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x117688 head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 memcg:ffff0000c77ef981 flags: 0xbfffe0000000040(head|node=0|zone=2|lastcpupid=0x1ffff) page_type: f5(slab) raw: 0bfffe0000000040 ffff0000c000f500 dead000000000100 dead000000000122 raw: 0000000000000000 0000000000040004 00000001f5000000 ffff0000c77ef981 head: 0bfffe0000000040 ffff0000c000f500 dead000000000100 dead000000000122 head: 0000000000000000 0000000000040004 00000001f5000000 ffff0000c77ef981 head: 0bfffe0000000003 fffffdffc35da201 ffffffffffffffff 0000000000000000 head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff0000d768bf80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff0000d768c000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff0000d768c080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff0000d768c100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff0000d768c180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb Fixes: 8c55facecd7a ("net: linkwatch: only report IF_OPER_LOWERLAYERDOWN if iflink is actually down") Reported-by: syzkaller <syzkaller@googlegroups.com> Suggested-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/netdev/20250102174400.085fd8ac@kernel.org/ [1] Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250106071911.64355-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07Merge branch 'net-hold-per-netns-rtnl-during-netdev-notifier-registration'Jakub Kicinski
Kuniyuki Iwashima says: ==================== net: Hold per-netns RTNL during netdev notifier registration. This series adds per-netns RTNL for registration of the global and per-netns netdev notifiers. v1: https://lore.kernel.org/netdev/20250104063735.36945-1-kuniyu@amazon.com/ ==================== Link: https://patch.msgid.link/20250106070751.63146-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07net: Hold rtnl_net_lock() in (un)?register_netdevice_notifier_dev_net().Kuniyuki Iwashima
(un)?register_netdevice_notifier_dev_net() hold RTNL before triggering the notifier for all netdev in the netns. Let's convert the RTNL to rtnl_net_lock(). Note that move_netdevice_notifiers_dev_net() is assumed to be (but not yet) protected by per-netns RTNL of both src and dst netns; we need to convert wireless and hyperv drivers that call dev_change_net_namespace(). Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250106070751.63146-4-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07net: Hold rtnl_net_lock() in (un)?register_netdevice_notifier_net().Kuniyuki Iwashima
(un)?register_netdevice_notifier_net() hold RTNL before triggering the notifier for all netdev in the netns. Let's convert the RTNL to rtnl_net_lock(). Note that the per-netns netdev notifier is protected by per-netns RTNL. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250106070751.63146-3-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07net: Hold __rtnl_net_lock() in (un)?register_netdevice_notifier().Kuniyuki Iwashima
(un)?register_netdevice_notifier() hold pernet_ops_rwsem and RTNL, iterate all netns, and trigger the notifier for all netdev. Let's hold __rtnl_net_lock() before triggering the notifier. Note that we will need protection for netdev_chain when RTNL is removed. (e.g. blocking_notifier conversion [0] with a lockdep annotation [1]) Link: https://lore.kernel.org/netdev/20250104063735.36945-2-kuniyu@amazon.com/ [0] Link: https://lore.kernel.org/netdev/20250105075957.67334-1-kuniyu@amazon.com/ [1] Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250106070751.63146-2-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07ixgbevf: Remove unused ixgbevf_hv_mbx_opsDr. David Alan Gilbert
The const struct ixgbevf_hv_mbx_ops was added in 2016 as part of commit c6d45171d706 ("ixgbevf: Support Windows hosts (Hyper-V)") but has remained unused. The functions it references are still referenced elsewhere. Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Link: https://patch.msgid.link/20250105122847.27341-1-linux@treblig.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>