summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-07-09gve: make IRQ handlers and page allocation NUMA awareBailey Forrest
All memory in GVE is currently allocated without regard for the NUMA node of the device. Because access to NUMA-local memory access is significantly cheaper than access to a remote node, this change attempts to ensure that page frags used in the RX path, including page pool frags, are allocated on the NUMA node local to the gVNIC device. Note that this attempt is best-effort. If necessary, the driver will still allocate non-local memory, as __GFP_THISNODE is not passed. Descriptor ring allocations are not updated, as dma_alloc_coherent handles that. This change also modifies the IRQ affinity setting to only select CPUs from the node local to the device, preserving the behavior that TX and RX queues of the same index share CPU affinity. Signed-off-by: Bailey Forrest <bcf@google.com> Signed-off-by: Joshua Washington <joshwash@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250707210107.2742029-1-jeroendb@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09net: ethernet: ti: am65-cpsw-nuss: Fix skb size by accounting for ↵Chintan Vankar
skb_shared_info While transitioning from netdev_alloc_ip_align() to build_skb(), memory for the "skb_shared_info" member of an "skb" was not allocated. Fix this by allocating "PAGE_SIZE" as the skb length, accounting for the packet length, headroom and tailroom, thereby including the required memory space for skb_shared_info. Fixes: 8acacc40f733 ("net: ethernet: ti: am65-cpsw: Add minimal XDP support") Reviewed-by: Siddharth Vadapalli <s-vadapalli@ti.com> Signed-off-by: Chintan Vankar <c-vankar@ti.com> Link: https://patch.msgid.link/20250707085201.1898818-1-c-vankar@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09net: thunderx: avoid direct MTU assignment after WRITE_ONCE()Alok Tiwari
The current logic in nicvf_change_mtu() writes the new MTU to netdev->mtu using WRITE_ONCE() before verifying if the hardware update succeeds. However on hardware update failure, it attempts to revert to the original MTU using a direct assignment (netdev->mtu = orig_mtu) which violates the intended of WRITE_ONCE protection introduced in commit 1eb2cded45b3 ("net: annotate writes on dev->mtu from ndo_change_mtu()") Additionally, WRITE_ONCE(netdev->mtu, new_mtu) is unnecessarily performed even when the device is not running. Fix this by: Only writing netdev->mtu after successfully updating the hardware. Skipping hardware update when the device is down, and setting MTU directly. Remove unused variable orig_mtu. This ensures that all writes to netdev->mtu are consistent with WRITE_ONCE expectations and avoids unintended state corruption on failure paths. Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250706194327.1369390-1-alok.a.tiwari@oracle.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09selftests/tc-testing: Create test case for UAF scenario with ↵Victor Nogueira
DRR/NETEM/BLACKHOLE chain Create a tdc test for the UAF scenario with DRR/NETEM/BLACKHOLE chain shared by Lion on his report [1]. [1] https://lore.kernel.org/netdev/45876f14-cf28-4177-8ead-bb769fd9e57a@gmail.com/ Signed-off-by: Victor Nogueira <victor@mojatatu.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Link: https://patch.msgid.link/20250705203638.246350-1-victor@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09atm: clip: Fix NULL pointer dereference in vcc_sendmsg()Yue Haibing
atmarpd_dev_ops does not implement the send method, which may cause crash as bellow. BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD 0 P4D 0 Oops: Oops: 0010 [#1] SMP KASAN NOPTI CPU: 0 UID: 0 PID: 5324 Comm: syz.0.0 Not tainted 6.15.0-rc6-syzkaller-00346-g5723cc3450bc #0 PREEMPT(full) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 RIP: 0010:0x0 Code: Unable to access opcode bytes at 0xffffffffffffffd6. RSP: 0018:ffffc9000d3cf778 EFLAGS: 00010246 RAX: 1ffffffff1910dd1 RBX: 00000000000000c0 RCX: dffffc0000000000 RDX: ffffc9000dc82000 RSI: ffff88803e4c4640 RDI: ffff888052cd0000 RBP: ffffc9000d3cf8d0 R08: ffff888052c9143f R09: 1ffff1100a592287 R10: dffffc0000000000 R11: 0000000000000000 R12: 1ffff92001a79f00 R13: ffff888052cd0000 R14: ffff88803e4c4640 R15: ffffffff8c886e88 FS: 00007fbc762566c0(0000) GS:ffff88808d6c2000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 0000000041f1b000 CR4: 0000000000352ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> vcc_sendmsg+0xa10/0xc50 net/atm/common.c:644 sock_sendmsg_nosec net/socket.c:712 [inline] __sock_sendmsg+0x219/0x270 net/socket.c:727 ____sys_sendmsg+0x52d/0x830 net/socket.c:2566 ___sys_sendmsg+0x21f/0x2a0 net/socket.c:2620 __sys_sendmmsg+0x227/0x430 net/socket.c:2709 __do_sys_sendmmsg net/socket.c:2736 [inline] __se_sys_sendmmsg net/socket.c:2733 [inline] __x64_sys_sendmmsg+0xa0/0xc0 net/socket.c:2733 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xf6/0x210 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot+e34e5e6b5eddb0014def@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/682f82d5.a70a0220.1765ec.0143.GAE@google.com/T Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250705085228.329202-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09Merge branch 'add-microchip-zl3073x-support-part-1'Jakub Kicinski
Ivan Vecera says: ==================== Add Microchip ZL3073x support (part 1) Add support for Microchip Azurite DPLL/PTP/SyncE chip family that provides DPLL and PTP functionality. This series bring first part that adds the core functionality and basic DPLL support. The next part of the series will bring additional DPLL functionality like eSync support, phase offset and frequency offset reporting and phase adjustments. Testing was done by myself and by Prathosh Satish on Microchip EDS2 development board with ZL30732 DPLL chip connected over I2C bus. ==================== Link: https://patch.msgid.link/20250704182202.1641943-1-ivecera@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09dpll: zl3073x: Add support to get/set frequency on pinsIvan Vecera
Add support to get/set frequency on pins. The frequency for input pins (references) is computed in the device according this formula: freq = base_freq * multiplier * (nominator / denominator) where the base_freq comes from the list of supported base frequencies and other parameters are arbitrary numbers. All these parameters are 16-bit unsigned integers. The frequency for output pin is determined by the frequency of synthesizer the output pin is connected to and divisor of the output to which is the given pin belongs. The resulting frequency of the P-pin and the N-pin from this output pair depends on the signal format of this output pair. The device supports so-called N-divided signal formats where for the N-pin there is an additional divisor. The frequencies for both pins from such output pair are computed: P-pin-freq = synth_freq / output_div N-pin-freq = synth_freq / output_div / n_div For other signal-format types both P and N pin have the same frequency based only synth frequency and output divisor. Implement output pin callbacks to get and set frequency. The frequency setting for the output non-N-divided signal format is simple as we have to compute just new output divisor. For N-divided formats it is more complex because by changing of output divisor we change frequency for both P and N pins. In this case if we are changing frequency for P-pin we have to compute also new N-divisor for N-pin to keep its current frequency. From this and the above it follows that the frequency of the N-pin cannot be higher than the frequency of the P-pin and the callback must take this limitation into account. Co-developed-by: Prathosh Satish <Prathosh.Satish@microchip.com> Signed-off-by: Prathosh Satish <Prathosh.Satish@microchip.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20250704182202.1641943-13-ivecera@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09dpll: zl3073x: Implement input pin state setting in automatic modeIvan Vecera
Implement input pin state setting when the DPLL is running in automatic mode. Unlike manual mode, the DPLL mode switching is not used here and the implementation uses special priority value (15) to make the given pin non-selectable. When the user sets state of the pin as disconnected the driver internally sets its priority in HW to 15 that prevents the DPLL to choose this input pin. Conversely, if the pin status is set to selectable, the driver sets the pin priority in HW to the original saved value. Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20250704182202.1641943-12-ivecera@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09dpll: zl3073x: Add support to get/set priority on input pinsIvan Vecera
Add support for getting and setting input pin priority. Implement required callbacks and set appropriate capability for input pins. Although the pin priority make sense only if the DPLL is running in automatic mode we have to expose this capability unconditionally because input pins (references) are shared between all DPLLs where one of them can run in automatic mode while the other one not. Co-developed-by: Prathosh Satish <Prathosh.Satish@microchip.com> Signed-off-by: Prathosh Satish <Prathosh.Satish@microchip.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20250704182202.1641943-11-ivecera@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09dpll: zl3073x: Implement input pin selection in manual modeIvan Vecera
Implement input pin state setting if the DPLL is running in manual mode. The driver indicates manual mode if the DPLL mode is one of ref-lock, forced-holdover, freerun. Use these modes to implement input pin state change between connected and disconnected states. When the user set the particular pin as connected the driver marks this input pin as forced reference and switches the DPLL mode to ref-lock. When the use set the pin as disconnected the driver switches the DPLL to freerun or forced holdover mode. The switch to holdover mode is done if the DPLL has holdover capability (e.g is currently locked with holdover acquired). Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20250704182202.1641943-10-ivecera@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09dpll: zl3073x: Register DPLL devices and pinsIvan Vecera
Enumerate all available DPLL channels and registers a DPLL device for each of them. Check all input references and outputs and register DPLL pins for them. Number of registered DPLL pins depends on configuration of references and outputs. If the reference or output is configured as differential one then only one DPLL pin is registered. Both references and outputs can be also disabled from firmware configuration and in this case no DPLL pins are registered. All registrable references are registered to all available DPLL devices with exception of DPLLs that are configured in NCO (numerically controlled oscillator) mode. In this mode DPLL channel acts as PHC and cannot be locked to any reference. Device outputs are connected to one of synthesizers and each synthesizer is driven by some DPLL channel. So output pins belonging to given output are registered to DPLL device that drives associated synthesizer. Finally add kworker task to monitor async changes on all DPLL channels and input pins and to notify about them DPLL core. Output pins are not monitored as their parameters are not changed asynchronously by the device. Co-developed-by: Prathosh Satish <Prathosh.Satish@microchip.com> Signed-off-by: Prathosh Satish <Prathosh.Satish@microchip.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20250704182202.1641943-9-ivecera@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09dpll: zl3073x: Read DPLL types and pin properties from system firmwareIvan Vecera
Add support for reading of DPLL types and optional pin properties from the system firmware (DT, ACPI...). The DPLL types are stored in property 'dpll-types' as string array and possible values 'pps' and 'eec' are mapped to DPLL enums DPLL_TYPE_PPS and DPLL_TYPE_EEC. The pin properties are stored under 'input-pins' and 'output-pins' sub-nodes and the following ones are supported: * reg integer that specifies pin index * label string that is used by driver as board label * connection-type string that indicates pin connection type * supported-frequencies-hz array of u64 values what frequencies are supported / allowed for given pin with respect to hardware wiring Do not blindly trust system firmware and filter out frequencies that cannot be configured/represented in device (input frequencies have to be factorized by one of the base frequencies and output frequencies have to divide configured synthesizer frequency). Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20250704182202.1641943-8-ivecera@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09dpll: zl3073x: Fetch invariants during probeIvan Vecera
Several configuration parameters will remain constant at runtime, so we can load them during probe to avoid excessive reads from the hardware. Read the following parameters from the device during probe and store them for later use: * enablement status and frequencies of the synthesizers and their associated DPLL channels * enablement status and type (single-ended or differential) of input pins * associated synthesizers, signal format, and enablement status of outputs Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20250704182202.1641943-7-ivecera@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09dpll: Add basic Microchip ZL3073x supportIvan Vecera
Microchip Azurite ZL3073x represents chip family providing DPLL and optionally PHC (PTP) functionality. The chips can be connected be connected over I2C or SPI bus. They have the following characteristics: * up to 5 separate DPLL units (channels) * 5 synthesizers * 10 input pins (references) * 10 outputs * 20 output pins (output pin pair shares one output) * Each reference and output can operate in either differential or single-ended mode (differential mode uses 2 pins) * Each output is connected to one of the synthesizers * Each synthesizer is driven by one of the DPLL unit The device uses 7-bit addresses and 8-bits values. It exposes 8-, 16-, 32- and 48-bits registers in address range <0x000,0x77F>. Due to 7bit addressing, the range is organized into pages of 128 bytes, with each page containing a page selector register at address 0x7F. For reading/writing multi-byte registers, the device supports bulk transfers. Add basic functionality to access device registers, probe functionality both I2C and SPI cases and add devlink support to provide info and to set clock ID parameter. Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20250704182202.1641943-6-ivecera@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09devlink: Add new "clock_id" generic device paramIvan Vecera
Add a new device generic parameter to specify clock ID that should be used by the device for registering DPLL devices and pins. Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20250704182202.1641943-5-ivecera@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09devlink: Add support for u64 parametersIvan Vecera
Only 8, 16 and 32-bit integers are supported for numeric devlink parameters. The subsequent patch adds support for DPLL clock ID that is defined as 64-bit number. Add support for u64 parameter type. Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20250704182202.1641943-4-ivecera@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09dt-bindings: dpll: Add support for Microchip Azurite chip familyIvan Vecera
Add DT bindings for Microchip Azurite DPLL chip family. These chips provide up to 5 independent DPLL channels, 10 differential or single-ended inputs and 10 differential or 20 single-ended outputs. They can be connected via I2C or SPI busses. Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20250704182202.1641943-3-ivecera@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09dt-bindings: dpll: Add DPLL device and pinIvan Vecera
Add a common DT schema for DPLL device and its associated pins. The DPLL (device phase-locked loop) is a device used for precise clock synchronization in networking and telecom hardware. The device includes one or more DPLLs (channels) and one or more physical input/output pins. Each DPLL channel is used either to provide a pulse-per-clock signal or to drive an Ethernet equipment clock. The input and output pins have the following properties: * label: specifies board label * connection type: specifies its usage depending on wiring * list of supported or allowed frequencies: depending on how the pin is connected and where) * embedded sync capability: indicates whether the pin supports this Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20250704182202.1641943-2-ivecera@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09virtio-net: xsk: rx: move the xdp->data adjustment to buf_to_xdp()Bui Quang Minh
This commit does not do any functional changes. It moves xdp->data adjustment for buffer other than first buffer to buf_to_xdp() helper so that the xdp_buff adjustment does not scatter over different functions. Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com> Link: https://patch.msgid.link/20250705075515.34260-1-minhquangbui99@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09Merge branch 'add-vf-drivers-for-wangxun-virtual-functions'Jakub Kicinski
Mengyuan Lou says: ==================== Add vf drivers for wangxun virtual functions Introduces basic support for Wangxun’s virtual function (VF) network drivers, specifically txgbevf and ngbevf. These drivers provide SR-IOV VF functionality for Wangxun 10/25/40G network devices. The first three patches add common APIs for Wangxun VF drivers, including mailbox communication and shared initialization logic.These abstractions are placed in libwx to reduce duplication across VF drivers. Patches 4–8 introduce the txgbevf driver, including: PCI device initialization, Hardware reset, Interrupt setup, Rx/Tx datapath implementation and link status changeing flow. Patches 9–12 implement the ngbevf driver, mirroring the functionality added in txgbevf. v2: https://lore.kernel.org/20250625102058.19898-1-mengyuanlou@net-swift.com v1: https://lore.kernel.org/20250611083559.14175-1-mengyuanlou@net-swift.com ==================== Link: https://patch.msgid.link/20250704094923.652-1-mengyuanlou@net-swift.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09net: ngbevf: add link update flowMengyuan Lou
Add link update flow to wangxun 1G virtual functions. Get link status from pf in mbox, and if it is failed then check the vx_status, because vx_status switching is too slow. Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com> Link: https://patch.msgid.link/20250704094923.652-13-mengyuanlou@net-swift.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09net: ngbevf: init interrupts and request irqsMengyuan Lou
Add specific parameters for irq alloc, then use wx_init_interrupt_scheme to initialize interrupt allocation in probe. Add .ndo_start_xmit support and start all queues. Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com> Link: https://patch.msgid.link/20250704094923.652-12-mengyuanlou@net-swift.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09net: ngbevf: add sw init pci info and reset hardwareMengyuan Lou
Do sw init and reset hw for ngbevf virtual functions, then register netdev. Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com> Link: https://patch.msgid.link/20250704094923.652-11-mengyuanlou@net-swift.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09net: wangxun: add ngbevf buildMengyuan Lou
Add doc build infrastructure for ngbevf driver. Implement the basic PCI driver loading and unloading interface. Initialize the id_table which support 1G virtual functions for Wangxun. Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com> Link: https://patch.msgid.link/20250704094923.652-10-mengyuanlou@net-swift.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09net: txgbevf: add link update flowMengyuan Lou
Add link update flow to wangxun 10/25/40G virtual functions. Get link status from pf in mbox, and if it is failed then check the vx_status, because vx_status switching is too slow. Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com> Link: https://patch.msgid.link/20250704094923.652-9-mengyuanlou@net-swift.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09net: txgbevf: Support Rx and Tx process pathMengyuan Lou
Improve the configuration of Rx and Tx ring. Setup and alloc resources. Configure Rx and Tx unit on hardware. Add .ndo_start_xmit support and start all queues. Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com> Link: https://patch.msgid.link/20250704094923.652-8-mengyuanlou@net-swift.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09net: txgbevf: init interrupts and request irqsMengyuan Lou
Add irq alloc flow functions for vf. Alloc pcie msix irqs for drivers and request_irq for tx/rx rings and misc other events. If the application is successful, config vertors for interrupts. Enable interrupts mask in wxvf_irq_enable. Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com> Link: https://patch.msgid.link/20250704094923.652-7-mengyuanlou@net-swift.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09net: txgbevf: add sw init pci info and reset hardwareMengyuan Lou
Add sw init and reset hw for txgbevf virtual functions which initialize basic parameters, and then register netdev. Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com> Link: https://patch.msgid.link/20250704094923.652-6-mengyuanlou@net-swift.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09net: wangxun: add txgbevf buildMengyuan Lou
Add doc build infrastructure for txgbevf driver. Implement the basic PCI driver loading and unloading interface. Initialize the id_table which support 10/25/40G virtual functions for Wangxun. Ioremap the space of bar0 and bar4 which will be used. Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com> Link: https://patch.msgid.link/20250704094923.652-5-mengyuanlou@net-swift.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09net: libwx: add wangxun vf common apiMengyuan Lou
Add common wx_configure_vf and wx_set_mac_vf for ngbevf and txgbevf. Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com> Link: https://patch.msgid.link/20250704094923.652-4-mengyuanlou@net-swift.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09net: libwx: add base vf api for vf driversMengyuan Lou
Implement mbox_write_and_read_ack functions which are used to set basic functions like set_mac, get_link.etc for vf. Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com> Link: https://patch.msgid.link/20250704094923.652-3-mengyuanlou@net-swift.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09net: libwx: add mailbox api for wangxun vf driversMengyuan Lou
Implements the mailbox interfaces for Wangxun vf drivers which will be used in txgbevf and ngbevf. Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com> Link: https://patch.msgid.link/20250704094923.652-2-mengyuanlou@net-swift.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-10netfilter: nf_tables: adjust lockdep assertions handlingFedor Pchelkin
It's needed to check the return value of lockdep_commit_lock_is_held(), otherwise there's no point in this assertion as it doesn't print any debug information on itself. Found by Linux Verification Center (linuxtesting.org) with Svace static analysis tool. Fixes: b04df3da1b5c ("netfilter: nf_tables: do not defer rule destruction via call_rcu") Reported-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-07-10netfilter: nf_tables: Reintroduce shortened deletion notificationsPhil Sutter
Restore commit 28339b21a365 ("netfilter: nf_tables: do not send complete notification of deletions") and fix it: - Avoid upfront modification of 'event' variable so the conditionals become effective. - Always include NFTA_OBJ_TYPE attribute in object notifications, user space requires it for proper deserialisation. - Catch DESTROY events, too. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-07-10netfilter: nf_tables: Drop dead code from fill_*_info routinesPhil Sutter
This practically reverts commit 28339b21a365 ("netfilter: nf_tables: do not send complete notification of deletions"): The feature was never effective, due to prior modification of 'event' variable the conditional early return never happened. User space also relies upon the current behaviour, so better reintroduce the shortened deletion notifications once it is fixed. Fixes: 28339b21a365 ("netfilter: nf_tables: do not send complete notification of deletions") Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-07-09Merge branch ↵Jakub Kicinski
'atm-clip-fix-infinite-recursion-potential-null-ptr-deref-and-memleak' Kuniyuki Iwashima says: ==================== atm: clip: Fix infinite recursion, potential null-ptr-deref, and memleak. Patch 1 fixes racy access to atmarpd found while checking RTNL usage in clip.c. Patch 2 fixes memory leak by ioctl(ATMARP_MKIP) and ioctl(ATMARPD_CTRL). Patch 3 fixes infinite recursive call of clip_vcc->old_push(), which was reported by syzbot. v1: https://lore.kernel.org/20250702020437.703698-1-kuniyu@google.com ==================== Link: https://patch.msgid.link/20250704062416.1613927-1-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09atm: clip: Fix infinite recursive call of clip_push().Kuniyuki Iwashima
syzbot reported the splat below. [0] This happens if we call ioctl(ATMARP_MKIP) more than once. During the first call, clip_mkip() sets clip_push() to vcc->push(), and the second call copies it to clip_vcc->old_push(). Later, when the socket is close()d, vcc_destroy_socket() passes NULL skb to clip_push(), which calls clip_vcc->old_push(), triggering the infinite recursion. Let's prevent the second ioctl(ATMARP_MKIP) by checking vcc->user_back, which is allocated by the first call as clip_vcc. Note also that we use lock_sock() to prevent racy calls. [0]: BUG: TASK stack guard page was hit at ffffc9000d66fff8 (stack is ffffc9000d670000..ffffc9000d678000) Oops: stack guard page: 0000 [#1] SMP KASAN NOPTI CPU: 0 UID: 0 PID: 5322 Comm: syz.0.0 Not tainted 6.16.0-rc4-syzkaller #0 PREEMPT(full) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 RIP: 0010:clip_push+0x5/0x720 net/atm/clip.c:191 Code: e0 8f aa 8c e8 1c ad 5b fa eb ae 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 55 <41> 57 41 56 41 55 41 54 53 48 83 ec 20 48 89 f3 49 89 fd 48 bd 00 RSP: 0018:ffffc9000d670000 EFLAGS: 00010246 RAX: 1ffff1100235a4a5 RBX: ffff888011ad2508 RCX: ffff8880003c0000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff888037f01000 RBP: dffffc0000000000 R08: ffffffff8fa104f7 R09: 1ffffffff1f4209e R10: dffffc0000000000 R11: ffffffff8a99b300 R12: ffffffff8a99b300 R13: ffff888037f01000 R14: ffff888011ad2500 R15: ffff888037f01578 FS: 000055557ab6d500(0000) GS:ffff88808d250000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffc9000d66fff8 CR3: 0000000043172000 CR4: 0000000000352ef0 Call Trace: <TASK> clip_push+0x6dc/0x720 net/atm/clip.c:200 clip_push+0x6dc/0x720 net/atm/clip.c:200 clip_push+0x6dc/0x720 net/atm/clip.c:200 ... clip_push+0x6dc/0x720 net/atm/clip.c:200 clip_push+0x6dc/0x720 net/atm/clip.c:200 clip_push+0x6dc/0x720 net/atm/clip.c:200 vcc_destroy_socket net/atm/common.c:183 [inline] vcc_release+0x157/0x460 net/atm/common.c:205 __sock_release net/socket.c:647 [inline] sock_close+0xc0/0x240 net/socket.c:1391 __fput+0x449/0xa70 fs/file_table.c:465 task_work_run+0x1d1/0x260 kernel/task_work.c:227 resume_user_mode_work include/linux/resume_user_mode.h:50 [inline] exit_to_user_mode_loop+0xec/0x110 kernel/entry/common.c:114 exit_to_user_mode_prepare include/linux/entry-common.h:330 [inline] syscall_exit_to_user_mode_work include/linux/entry-common.h:414 [inline] syscall_exit_to_user_mode include/linux/entry-common.h:449 [inline] do_syscall_64+0x2bd/0x3b0 arch/x86/entry/syscall_64.c:100 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7ff31c98e929 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007fffb5aa1f78 EFLAGS: 00000246 ORIG_RAX: 00000000000001b4 RAX: 0000000000000000 RBX: 0000000000012747 RCX: 00007ff31c98e929 RDX: 0000000000000000 RSI: 000000000000001e RDI: 0000000000000003 RBP: 00007ff31cbb7ba0 R08: 0000000000000001 R09: 0000000db5aa226f R10: 00007ff31c7ff030 R11: 0000000000000246 R12: 00007ff31cbb608c R13: 00007ff31cbb6080 R14: ffffffffffffffff R15: 00007fffb5aa2090 </TASK> Modules linked in: Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot+0c77cccd6b7cd917b35a@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=2371d94d248d126c1eb1 Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250704062416.1613927-4-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09atm: clip: Fix memory leak of struct clip_vcc.Kuniyuki Iwashima
ioctl(ATMARP_MKIP) allocates struct clip_vcc and set it to vcc->user_back. The code assumes that vcc_destroy_socket() passes NULL skb to vcc->push() when the socket is close()d, and then clip_push() frees clip_vcc. However, ioctl(ATMARPD_CTRL) sets NULL to vcc->push() in atm_init_atmarp(), resulting in memory leak. Let's serialise two ioctl() by lock_sock() and check vcc->push() in atm_init_atmarp() to prevent memleak. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250704062416.1613927-3-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09atm: clip: Fix potential null-ptr-deref in to_atmarpd().Kuniyuki Iwashima
atmarpd is protected by RTNL since commit f3a0592b37b8 ("[ATM]: clip causes unregister hang"). However, it is not enough because to_atmarpd() is called without RTNL, especially clip_neigh_solicit() / neigh_ops->solicit() is unsleepable. Also, there is no RTNL dependency around atmarpd. Let's use a private mutex and RCU to protect access to atmarpd in to_atmarpd(). Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250704062416.1613927-2-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09dt-bindings: net: Add support for Sophgo CV1800 dwmacInochi Amaoto
The GMAC IP on CV1800 series SoC is a standard Synopsys DesignWare MAC (version 3.70a). Add necessary compatible string for this device. Signed-off-by: Inochi Amaoto <inochiama@gmail.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Link: https://patch.msgid.link/20250703021220.124195-1-inochiama@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09PM: sleep: Call pm_restore_gfp_mask() after dpm_resume()Rafael J. Wysocki
Commit 12ffc3b1513e ("PM: Restrict swap use to later in the suspend sequence") changed two pm_restore_gfp_mask() calls in enter_state() and hibernation_restore() into one pm_restore_gfp_mask() call in dpm_resume_end(), but it put that call before the dpm_resume() invocation which is too early (some swap-backing devices may not be ready at that point). Moreover, this code ordering change was not even mentioned in the changelog of the commit mentioned above. Address this by moving that call after the dpm_resume() one. Fixes: 12ffc3b1513e ("PM: Restrict swap use to later in the suspend sequence") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://patch.msgid.link/2797018.mvXUDI8C0e@rjwysocki.net
2025-07-09KVM: x86: avoid underflow when scaling TSC frequencyPaolo Bonzini
In function kvm_guest_time_update(), __scale_tsc() is used to calculate a TSC *frequency* rather than a TSC value. With low-enough ratios, a TSC value that is less than 1 would underflow to 0 and to an infinite while loop in kvm_get_time_scale(): kvm_guest_time_update(struct kvm_vcpu *v) if (kvm_caps.has_tsc_control) tgt_tsc_khz = kvm_scale_tsc(tgt_tsc_khz, v->arch.l1_tsc_scaling_ratio); __scale_tsc(u64 ratio, u64 tsc) ratio=122380531, tsc=2299998, N=48 ratio*tsc >> N = 0.999... -> 0 Later in the function: Call Trace: <TASK> kvm_get_time_scale arch/x86/kvm/x86.c:2458 [inline] kvm_guest_time_update+0x926/0xb00 arch/x86/kvm/x86.c:3268 vcpu_enter_guest.constprop.0+0x1e70/0x3cf0 arch/x86/kvm/x86.c:10678 vcpu_run+0x129/0x8d0 arch/x86/kvm/x86.c:11126 kvm_arch_vcpu_ioctl_run+0x37a/0x13d0 arch/x86/kvm/x86.c:11352 kvm_vcpu_ioctl+0x56b/0xe60 virt/kvm/kvm_main.c:4188 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:871 [inline] __se_sys_ioctl+0x12d/0x190 fs/ioctl.c:857 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x59/0x110 arch/x86/entry/common.c:81 entry_SYSCALL_64_after_hwframe+0x78/0xe2 This can really happen only when fuzzing, since the TSC frequency would have to be nonsensically low. Fixes: 35181e86df97 ("KVM: x86: Add a common TSC scaling function") Reported-by: Yuntao Liu <liuyuntao12@huawei.com> Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-07-09eventpoll: don't decrement ep refcount while still holding the ep mutexLinus Torvalds
Jann Horn points out that epoll is decrementing the ep refcount and then doing a mutex_unlock(&ep->mtx); afterwards. That's very wrong, because it can lead to a use-after-free. That pattern is actually fine for the very last reference, because the code in question will delay the actual call to "ep_free(ep)" until after it has unlocked the mutex. But it's wrong for the much subtler "next to last" case when somebody *else* may also be dropping their reference and free the ep while we're still using the mutex. Note that this is true even if that other user is also using the same ep mutex: mutexes, unlike spinlocks, can not be used for object ownership, even if they guarantee mutual exclusion. A mutex "unlock" operation is not atomic, and as one user is still accessing the mutex as part of unlocking it, another user can come in and get the now released mutex and free the data structure while the first user is still cleaning up. See our mutex documentation in Documentation/locking/mutex-design.rst, in particular the section [1] about semantics: "mutex_unlock() may access the mutex structure even after it has internally released the lock already - so it's not safe for another context to acquire the mutex and assume that the mutex_unlock() context is not using the structure anymore" So if we drop our ep ref before the mutex unlock, but we weren't the last one, we may then unlock the mutex, another user comes in, drops _their_ reference and releases the 'ep' as it now has no users - all while the mutex_unlock() is still accessing it. Fix this by simply moving the ep refcount dropping to outside the mutex: the refcount itself is atomic, and doesn't need mutex protection (that's the whole _point_ of refcounts: unlike mutexes, they are inherently about object lifetimes). Reported-by: Jann Horn <jannh@google.com> Link: https://docs.kernel.org/locking/mutex-design.html#semantics [1] Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-07-09Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: - Fix bogus KASAN splat on EFI runtime stack - Select JUMP_LABEL unconditionally to avoid boot failure with pKVM and the legacy implementation of static keys - Avoid touching GCS registers when 'arm64.nogcs' has been passed on the command-line - Move a 'cpumask_t' off the stack in smp_send_stop() - Don't advertise SME-related hwcaps to userspace when ID_AA64PFR1_EL1 indicates that SME is not implemented - Always check the VMA when handling an Overlay fault - Avoid corrupting TCR2_EL1 during boot * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64/mm: Drop wrong writes into TCR2_EL1 arm64: poe: Handle spurious Overlay faults arm64: Filter out SME hwcaps when FEAT_SME isn't implemented arm64: move smp_send_stop() cpu mask off stack arm64/gcs: Don't try to access GCS registers if arm64.nogcs is enabled arm64: Unconditionally select CONFIG_JUMP_LABEL arm64: efi: Fix KASAN false positive for EFI runtime stack
2025-07-09Merge tag 'pinctrl-v6.16-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: - Mark som pins as invalid for IRQ use in the Qualcomm driver - Fix up the use of device properties on the MA35DX Nuvoton, apparently something went sidewise - Clear the GPIO debounce settings when going down for suspend in the AMD driver. Very good for some AMD laptops that now wake up from suspend again! - Add the compulsory .can_sleep bool flag in the AW9523 driver, should have been there from the beginning, now there are users finding the bug - Drop some bouncing email address from MAINTAINERS * tag 'pinctrl-v6.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: aw9523: fix can_sleep flag for GPIO chip pinctrl: amd: Clear GPIO debounce for suspend pinctrl: nuvoton: Fix boot on ma35dx platforms MAINTAINERS: drop bouncing Lakshmi Sowjanya D pinctrl: qcom: msm: mark certain pins as invalid for interrupts
2025-07-09gpio: of: initialize local variable passed to the .of_xlate() callbackAlexander Stein
of_flags is passed down to GPIO chip's xlate function, so ensure this one is properly initialized as - if the xlate callback does nothing with it - we may end up with various configuration errors like: gpio-720 (enable): multiple pull-up, pull-down or pull-disable enabled, invalid configuration Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com> Link: https://lore.kernel.org/r/20250708083829.658051-1-alexander.stein@ew.tq-group.com [Bartosz: tweaked the commit message] Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-07-09wifi: mac80211: don't complete management TX on SAE commitJohannes Berg
When SAE commit is sent and received in response, there's no ordering for the SAE confirm messages. As such, don't call drivers to stop listening on the channel when the confirm message is still expected. This fixes an issue if the local confirm is transmitted later than the AP's confirm, for iwlwifi (and possibly mt76) the AP's confirm would then get lost since the device isn't on the channel at the time the AP transmit the confirm. For iwlwifi at least, this also improves the overall timing of the authentication handshake (by about 15ms according to the report), likely since the session protection won't be aborted and rescheduled. Note that even before this, mgd_complete_tx() wasn't always called for each call to mgd_prepare_tx() (e.g. in the case of WEP key shared authentication), and the current drivers that have the complete callback don't seem to mind. Document this as well though. Reported-by: Jan Hendrik Farr <kernel@jfarr.cc> Closes: https://lore.kernel.org/all/aB30Ea2kRG24LINR@archlinux/ Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250609213232.12691580e140.I3f1d3127acabcd58348a110ab11044213cf147d3@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-09wifi: cfg80211/mac80211: implement dot11ExtendedRegInfoSupportSomashekhar Puttagangaiah
Implement dot11ExtendedRegInfoSupport to advertise non-AP station regulatory power capability as part of regulatory connectivity element in (Re)Association request frames so that AP can achieve maximum client connectivity. Control field which was interpreted using value of 3-bits B5 to B3, now uses value of 4-bits B6 to B3 to interpret the type of AP. Hence update IEEE80211_HE_6GHZ_OPER_CTRL_REG_INFO to parse 4-bits control field. If older AP still updates only 3-bits value of control field, station can still interpret the value as per section E.2.7 of IEEE 802.11 REVme D7.0 and support the appropriate AP type. Also update IEEE80211_6GHZ_CTRL_REG_INDOOR_SP_AP as the value of standard power AP is changed to 8 instead of 4 so that AP can support both LPI AP and SP AP to maximize the connectivity with stations. For backward compatibility, keeping value 4 as old AP by limiting it to SP AP only. Signed-off-by: Somashekhar Puttagangaiah <somashekhar.puttagangaiah@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250609213232.90cdef116aad.I85da390fbee59355e3855691933e6a5e55c47ac4@changeid [fix kernel-doc] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-09wifi: mac80211: send extended MLD capa/ops if AP has itJohannes Berg
Currently the code only sends extended MLD capa/ops in strict mode, but if the AP has it then it should also be able to parse it. There could be cases where the AP doesn't have it but we would want to advertise it (e.g. if the AP supports nothing but we want to have BTM.), but given the broken deployed APs out there right now this is the best we can do. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Reviewed-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250609213232.c9b8b3a6ca77.I1153d4283d1fbb9e5db60e7b939cc133a6345db5@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-09wifi: mac80211: copy first_part into HW scanBenjamin Berg
cfg80211 now reports whether this is the first part of a scan. Copy that information into the driver request. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250609213231.63f6078bd7be.Ia6e5cee945e6d9617c2f427552d89d23c92eee83@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>