linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2025-07-10	dt-bindings: net: sophgo,sg2044-dwmac: Add support for Sophgo SG2042 dwmac	Inochi Amaoto
	The GMAC IP on SG2042 is a standard Synopsys DesignWare MAC (version 5.00a) with tx clock. Add necessary compatible string for this device. Signed-off-by: Inochi Amaoto <inochiama@gmail.com> Tested-by: Han Gao <rabenda.cn@gmail.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Link: https://patch.msgid.link/20250708064052.507094-2-inochiama@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-10	Merge branch 'further-mt7988-devicetree-work'	Jakub Kicinski
	Frank Wunderlich says: ==================== further mt7988 devicetree work This series continues mt7988 devicetree work - Extend cpu frequency scaling with CCI - GPIO leds - Basic network-support (ethernet controller + builtin switch + SFP Cages) depencies (i hope this list is complete and latest patches/series linked): support interrupt-names is optional again as i re-added the reserved IRQs (they are not unusable as i thought and can allow features in future) https://patchwork.kernel.org/project/netdevbpf/patch/20250619132125.78368-2-linux@fw-web.de/ needs change in mtk ethernet driver for the sram to be read from separate node: https://patchwork.kernel.org/project/netdevbpf/patch/c2b9242229d06af4e468204bcf42daa1535c3a72.1751461762.git.daniel@makrotopia.org/ for SFP-Function (macs currently disabled): PCS clearance which is a 1.5 year discussion currently ongoing Daniel asked netdev for a way 2 go: https://lore.kernel.org/netdev/aEwfME3dYisQtdCj@pidgin.makrotopia.org/ e.g. something like this (one of): * https://patchwork.kernel.org/project/netdevbpf/patch/20250610233134.3588011-4-sean.anderson@linux.dev/ (v6) * https://patchwork.kernel.org/project/netdevbpf/patch/20250511201250.3789083-4-ansuelsmth@gmail.com/ (v4) * https://patchwork.kernel.org/project/netdevbpf/patch/ba4e359584a6b3bc4b3470822c42186d5b0856f9.1721910728.git.daniel@makrotopia.org/ full usxgmii driver: https://patchwork.kernel.org/project/netdevbpf/patch/07845ec900ba41ff992875dce12c622277592c32.1702352117.git.daniel@makrotopia.org/ first PCS-discussion is here: https://patchwork.kernel.org/project/netdevbpf/patch/8aa905080bdb6760875d62cb3b2b41258837f80e.1702352117.git.daniel@makrotopia.org/ some more here: https://lore.kernel.org/netdev/20250511201250.3789083-4-ansuelsmth@gmail.com/ and then dts nodes for sgmiisys+usxgmii+2g5 firmware when above depencies are solved the mac1/2 can be enabled and 2.5G phy/SFP slots will work. ==================== Link: https://patch.msgid.link/20250709111147.11843-1-linux@fw-web.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-10	dt-bindings: net: dsa: mediatek,mt7530: add internal mdio bus	Frank Wunderlich
	Mt7988 buildin switch has own mdio bus where ge-phys are connected. Add related property for this. Signed-off-by: Frank Wunderlich <frank-w@public-files.de> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://patch.msgid.link/20250709111147.11843-7-linux@fw-web.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-10	dt-bindings: net: dsa: mediatek,mt7530: add dsa-port definition for mt7988	Frank Wunderlich
	Add own dsa-port binding for SoC with internal switch where only phy-mode 'internal' is valid. Signed-off-by: Frank Wunderlich <frank-w@public-files.de> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://patch.msgid.link/20250709111147.11843-6-linux@fw-web.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-10	dt-bindings: net: mediatek,net: add sram property	Frank Wunderlich
	Meditak Filogic SoCs (MT798x) have dedicated MMIO-SRAM for dma operations. MT7981 and MT7986 currently use static offset to ethernet MAC register which will be changed in separate patch once this way is accepted. Add "sram" property to map ethernet controller to dedicated mmio-sram node. Signed-off-by: Frank Wunderlich <frank-w@public-files.de> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20250709111147.11843-5-linux@fw-web.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-10	dt-bindings: net: mediatek,net: allow irq names	Frank Wunderlich
	In preparation for MT7988 and RSS/LRO allow the interrupt-names property. In this way driver can request the interrupts by name which is much more readable in the driver code and SoC's dtsi than relying on a specific order. Frame-engine-IRQs (fe0..3): MT7621, MT7628: 1 FE-IRQ MT7622, MT7623: 3 FE-IRQs (only two used by the driver for now) MT7981, MT7986: 4 FE-IRQs (only two used by the driver for now) RSS/LRO IRQs (pdma0..3) additional only on Filogic (MT798x) with count of 4. So all IRQ-names (8) for Filogic. Set boundaries for all compatibles same as irq count. Signed-off-by: Frank Wunderlich <frank-w@public-files.de> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20250709111147.11843-4-linux@fw-web.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-10	dt-bindings: net: mediatek,net: allow up to 8 IRQs	Frank Wunderlich
	Increase the maximum IRQ count to 8 (4 FE + 4 RSS/LRO). Frame-engine-IRQs (max 4): MT7621, MT7628: 1 FE-IRQ MT7622, MT7623: 3 FE-IRQs (only two used by the driver for now) MT7981, MT7986, MT7988: 4 FE-IRQs (only two used by the driver for now) Mediatek Filogic SoCs (mt798x) have 4 additional IRQs for RSS and/or LRO. So MT798x have 8 IRQs in total. MT7981 does not have a ethernet-node yet. MT7986 Ethernet node is updated with RSS/LRO IRQs in this series. MT7988 Ethernet node is added in this series. Signed-off-by: Frank Wunderlich <frank-w@public-files.de> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20250709111147.11843-3-linux@fw-web.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-10	dt-bindings: net: mediatek,net: update mac subnode pattern for mt7988	Frank Wunderlich
	MT7888 have 3 Macs and so its nodes have names from mac0 - mac2. Update pattern to fix this. Signed-off-by: Frank Wunderlich <frank-w@public-files.de> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20250709111147.11843-2-linux@fw-web.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-10	Merge branch 'virtio_udp_tunnel_08_07_2025' of ↵	Jakub Kicinski
	https://github.com/pabeni/linux-devel Paolo Abeni says: ==================== virtio: introduce GSO over UDP tunnel Some virtualized deployments use UDP tunnel pervasively and are impacted negatively by the lack of GSO support for such kind of traffic in the virtual NIC driver. The virtio_net specification recently introduced support for GSO over UDP tunnel, this series updates the virtio implementation to support such a feature. Currently the kernel virtio support limits the feature space to 64, while the virtio specification allows for a larger number of features. Specifically the GSO-over-UDP-tunnel-related virtio features use bits 65-69. The first four patches in this series rework the virtio and vhost feature support to cope with up to 128 bits. The limit is set by a define and could be easily raised in future, as needed. This implementation choice is aimed at keeping the code churn as limited as possible. For the same reason, only the virtio_net driver is reworked to leverage the extended feature space; all other virtio/vhost drivers are unaffected, but could be upgraded to support the extended features space in a later time. The last four patches bring in the actual GSO over UDP tunnel support. As per specification, some additional fields are introduced into the virtio net header to support the new offload. The presence of such fields depends on the negotiated features. New helpers are introduced to convert the UDP-tunneled skb metadata to an extended virtio net header and vice versa. Such helpers are used by the tun and virtio_net driver to cope with the newly supported offloads. Tested with basic stream transfer with all the possible permutations of host kernel/qemu/guest kernel with/without GSO over UDP tunnel support. ==================== Link: https://patch.msgid.link/cover.1751874094.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-10	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	Cross-merge networking fixes after downstream PR (net-6.16-rc6). No conflicts. Adjacent changes: Documentation/devicetree/bindings/net/allwinner,sun8i-a83t-emac.yaml 0a12c435a1d6 ("dt-bindings: net: sun8i-emac: Add A100 EMAC compatible") b3603c0466a8 ("dt-bindings: net: sun8i-emac: Rename A523 EMAC0 to GMAC0") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-10	Merge tag 'net-6.16-rc6' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from Bluetooth. Current release - regressions: - tcp: refine sk_rcvbuf increase for ooo packets - bluetooth: fix attempting to send HCI_Disconnect to BIS handle - rxrpc: fix over large frame size warning - eth: bcmgenet: initialize u64 stats seq counter Previous releases - regressions: - tcp: correct signedness in skb remaining space calculation - sched: abort __tc_modify_qdisc if parent class does not exist - vsock: fix transport_{g2h,h2g} TOCTOU - rxrpc: fix bug due to prealloc collision - tipc: fix use-after-free in tipc_conn_close(). - bluetooth: fix not marking Broadcast Sink BIS as connected - phy: qca808x: fix WoL issue by utilizing at8031_set_wol() - eth: am65-cpsw-nuss: fix skb size by accounting for skb_shared_info Previous releases - always broken: - netlink: fix wraparounds of sk->sk_rmem_alloc. - atm: fix infinite recursive call of clip_push(). - eth: - stmmac: fix interrupt handling for level-triggered mode in DWC_XGMAC2 - rtsn: fix a null pointer dereference in rtsn_probe()" * tag 'net-6.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (37 commits) net/sched: sch_qfq: Fix null-deref in agg_dequeue rxrpc: Fix oops due to non-existence of prealloc backlog struct rxrpc: Fix bug due to prealloc collision MAINTAINERS: remove myself as netronome maintainer selftests/net: packetdrill: add tcp_ooo-before-and-after-accept.pkt tcp: refine sk_rcvbuf increase for ooo packets net/sched: Abort __tc_modify_qdisc if parent class does not exist net: ethernet: ti: am65-cpsw-nuss: Fix skb size by accounting for skb_shared_info net: thunderx: avoid direct MTU assignment after WRITE_ONCE() selftests/tc-testing: Create test case for UAF scenario with DRR/NETEM/BLACKHOLE chain atm: clip: Fix NULL pointer dereference in vcc_sendmsg() atm: clip: Fix infinite recursive call of clip_push(). atm: clip: Fix memory leak of struct clip_vcc. atm: clip: Fix potential null-ptr-deref in to_atmarpd(). net: phy: smsc: Fix link failure in forced mode with Auto-MDIX net: phy: smsc: Force predictable MDI-X state on LAN87xx net: phy: smsc: Fix Auto-MDIX configuration when disabled by strap net: stmmac: Fix interrupt handling for level-triggered mode in DWC_XGMAC2 rxrpc: Fix over large frame size warning net: airoha: Fix an error handling path in airoha_probe() ...
2025-07-10	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm	Linus Torvalds
	Pull KVM fixes from Paolo Bonzini: "Many patches, pretty much all of them small, that accumulated while I was on vacation. ARM: - Remove the last leftovers of the ill-fated FPSIMD host state mapping at EL2 stage-1 - Fix unexpected advertisement to the guest of unimplemented S2 base granule sizes - Gracefully fail initialising pKVM if the interrupt controller isn't GICv3 - Also gracefully fail initialising pKVM if the carveout allocation fails - Fix the computing of the minimum MMIO range required for the host on stage-2 fault - Fix the generation of the GICv3 Maintenance Interrupt in nested mode x86: - Reject SEV{-ES} intra-host migration if one or more vCPUs are actively being created, so as not to create a non-SEV{-ES} vCPU in an SEV{-ES} VM - Use a pre-allocated, per-vCPU buffer for handling de-sparsification of vCPU masks in Hyper-V hypercalls; fixes a "stack frame too large" issue - Allow out-of-range/invalid Xen event channel ports when configuring IRQ routing, to avoid dictating a specific ioctl() ordering to userspace - Conditionally reschedule when setting memory attributes to avoid soft lockups when userspace converts huge swaths of memory to/from private - Add back MWAIT as a required feature for the MONITOR/MWAIT selftest - Add a missing field in struct sev_data_snp_launch_start that resulted in the guest-visible workarounds field being filled at the wrong offset - Skip non-canonical address when processing Hyper-V PV TLB flushes to avoid VM-Fail on INVVPID - Advertise supported TDX TDVMCALLs to userspace - Pass SetupEventNotifyInterrupt arguments to userspace - Fix TSC frequency underflow" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: avoid underflow when scaling TSC frequency KVM: arm64: Remove kvm_arch_vcpu_run_map_fp() KVM: arm64: Fix handling of FEAT_GTG for unimplemented granule sizes KVM: arm64: Don't free hyp pages with pKVM on GICv2 KVM: arm64: Fix error path in init_hyp_mode() KVM: arm64: Adjust range correctly during host stage-2 faults KVM: arm64: nv: Fix MI line level calculation in vgic_v3_nested_update_mi() KVM: x86/hyper-v: Skip non-canonical addresses during PV TLB flush KVM: SVM: Add missing member in SNP_LAUNCH_START command structure Documentation: KVM: Fix unexpected unindent warnings KVM: selftests: Add back the missing check of MONITOR/MWAIT availability KVM: Allow CPU to reschedule while setting per-page memory attributes KVM: x86/xen: Allow 'out of range' event channel ports in IRQ routing table. KVM: x86/hyper-v: Use preallocated per-vCPU buffer for de-sparsified vCPU masks KVM: SVM: Initialize vmsa_pa in VMCB to INVALID_PAGE if VMSA page is NULL KVM: SVM: Reject SEV{-ES} intra host migration if vCPU creation is in-flight KVM: TDX: Report supported optional TDVMCALLs in TDX capabilities KVM: TDX: Exit to userspace for SetupEventNotifyInterrupt
2025-07-10	Merge branch 'net-dsa-rzn1_a5psw-add-compile_test'	Paolo Abeni
	Rosen Penev says: ==================== net: dsa: rzn1_a5psw: add COMPILE_TEST Allows the various bots to test compilation. Also threw in a small devm conversion for enabling clocks. ==================== Link: https://patch.msgid.link/20250708014144.2514-1-rosenp@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-10	net: dsa: rzn1_a5psw: use devm to enable clocks	Rosen Penev
	The remove function has these in the wrong order. The switch should be unregistered last. Simpler to use devm so that the right thing is done. Signed-off-by: Rosen Penev <rosenp@gmail.com> Link: https://patch.msgid.link/20250708014144.2514-3-rosenp@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-10	net: dsa: rzn1_a5psw: add COMPILE_TEST	Rosen Penev
	There's no architecture specific requirement for it to compile. Allows the bots to test compilation properly. Signed-off-by: Rosen Penev <rosenp@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20250708014144.2514-2-rosenp@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-10	net: xsk: introduce XDP_MAX_TX_SKB_BUDGET setsockopt	Jason Xing
	This patch provides a setsockopt method to let applications leverage to adjust how many descs to be handled at most in one send syscall. It mitigates the situation where the default value (32) that is too small leads to higher frequency of triggering send syscall. Considering the prosperity/complexity the applications have, there is no absolutely ideal suggestion fitting all cases. So keep 32 as its default value like before. The patch does the following things: - Add XDP_MAX_TX_SKB_BUDGET socket option. - Set max_tx_budget to 32 by default in the initialization phase as a per-socket granular control. - Set the range of max_tx_budget as [32, xs->tx->nentries]. The idea behind this comes out of real workloads in production. We use a user-level stack with xsk support to accelerate sending packets and minimize triggering syscalls. When the packets are aggregated, it's not hard to hit the upper bound (namely, 32). The moment user-space stack fetches the -EAGAIN error number passed from sendto(), it will loop to try again until all the expected descs from tx ring are sent out to the driver. Enlarging the XDP_MAX_TX_SKB_BUDGET value contributes to less frequency of sendto() and higher throughput/PPS. Here is what I did in production, along with some numbers as follows: For one application I saw lately, I suggested using 128 as max_tx_budget because I saw two limitations without changing any default configuration: 1) XDP_MAX_TX_SKB_BUDGET, 2) socket sndbuf which is 212992 decided by net.core.wmem_default. As to XDP_MAX_TX_SKB_BUDGET, the scenario behind this was I counted how many descs are transmitted to the driver at one time of sendto() based on [1] patch and then I calculated the possibility of hitting the upper bound. Finally I chose 128 as a suitable value because 1) it covers most of the cases, 2) a higher number would not bring evident results. After twisting the parameters, a stable improvement of around 4% for both PPS and throughput and less resources consumption were found to be observed by strace -c -p xxx: 1) %time was decreased by 7.8% 2) error counter was decreased from 18367 to 572 [1]: https://lore.kernel.org/all/20250619093641.70700-1-kerneljasonxing@gmail.com/ Signed-off-by: Jason Xing <kernelxing@tencent.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://patch.msgid.link/20250704160138.48677-1-kerneljasonxing@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-10	net/sched: sch_qfq: Fix null-deref in agg_dequeue	Xiang Mei
	To prevent a potential crash in agg_dequeue (net/sched/sch_qfq.c) when cl->qdisc->ops->peek(cl->qdisc) returns NULL, we check the return value before using it, similar to the existing approach in sch_hfsc.c. To avoid code duplication, the following changes are made: 1. Changed qdisc_warn_nonwc(include/net/pkt_sched.h) into a static inline function. 2. Moved qdisc_peek_len from net/sched/sch_hfsc.c to include/net/pkt_sched.h so that sch_qfq can reuse it. 3. Applied qdisc_peek_len in agg_dequeue to avoid crashing. Signed-off-by: Xiang Mei <xmei5@asu.edu> Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com> Link: https://patch.msgid.link/20250705212143.3982664-1-xmei5@asu.edu Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-09	Merge branch 'net-mlx5-misc-changes-2025-07-09'	Jakub Kicinski
	Tariq Toukan says: ==================== net/mlx5: misc changes 2025-07-09 This series contains misc enhancements to the mlx5 driver. ==================== Link: https://patch.msgid.link/1752009387-13300-1-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	net/mlx5e: RX, Remove unnecessary RQT redirects	Tariq Toukan
	RQTs (Receive Queue Table) should redirect traffic to the channels' RQs when they're active. Otherwise, redirect to the designated "drop RQ". RQTs are created in "inactive" state, pointing to the "drop RQ". In activate and de-activate flows, do not "deactivate" the rest of RQTs (beyond the num of channels), as they are already inactive. This cuts down unnecessary execution of FW commands (MODIFY_RQT), and improves the latency of open/close channels or configuration change. Perf: NIC: Connect-X7. Configuration: 1 combined channel, max num channels 248. Measure time for "interface up + interface down". Before: 0.313 sec After: 0.057 sec (5.5x faster) 247 MODIFY_RQT commands saved in interface up. 247 MODIFY_RQT commands saved in interface down. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/1752009387-13300-6-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	net/mlx5: Warn when write combining is not supported	Maor Gottlieb
	Warn if write combining is not supported, as it can impact latency. Add the warning message to be printed only when the driver actually run the test and detect unsupported state, rather than when inheriting parent's result for SFs. Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Reviewed-by: Michael Guralnik <michaelgur@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/1752009387-13300-5-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	net/mlx5e: Replace recursive VLAN push handling with an iterative loop	Gal Pressman
	mlx5e_tc_act_vlan_add_push_action() uses tail-recursion to walk through a stack of VLAN devices. There is no need for a complicated recursion with unnecessary stack consumption and less obvious code flow, rewrite the function so that it uses a do while loop instead. Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/1752009387-13300-4-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	net/mlx5e: CT: extract a memcmp from a spinlock section	Cosmin Ratiu
	This reduces the time the lock is held and reduces contention. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/1752009387-13300-3-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	net/mlx5e: Remove unused VLAN insertion logic in TX path	Carolina Jubran
	The VLAN insertion capability (`wqe_vlan_insert`) was never enabled on all mlx5 devices. When VLAN TX offload is advertised but this capability is not supported, the driver uses inline headers to insert the VLAN tag. To support this, the driver used to set the `MLX5E_SQ_STATE_VLAN_NEED_L2_INLINE` bit to enforce L2 inline mode when `wqe_vlan_insert` was not supported. Since the capability is disabled on all devices, this logic was always active, and the SQ flag has become redundant. L2 inline is enforced unconditionally for VLAN-tagged packets. The `skb_vlan_tag_present()` check in the else-if block of `mlx5e_sq_xmit_wqe()` is never true by this point in the TX flow, as the VLAN tag has already been inserted by the driver using inline headers. As a result, this code is never executed. Remove the redundant SQ state, dead VLAN insertion code block, and related logic. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/1752009387-13300-2-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	Merge branch 'rxrpc-miscellaneous-fixes'	Jakub Kicinski
	David Howells says: ==================== rxrpc: Miscellaneous fixes Here are some miscellaneous fixes for rxrpc: (1) Fix assertion failure due to preallocation collision. (2) Fix oops due to prealloc backlog struct not yet having been allocated if no service calls have yet been preallocated. ==================== Link: https://patch.msgid.link/20250708211506.2699012-1-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	rxrpc: Fix oops due to non-existence of prealloc backlog struct	David Howells
	If an AF_RXRPC service socket is opened and bound, but calls are preallocated, then rxrpc_alloc_incoming_call() will oops because the rxrpc_backlog struct doesn't get allocated until the first preallocation is made. Fix this by returning NULL from rxrpc_alloc_incoming_call() if there is no backlog struct. This will cause the incoming call to be aborted. Reported-by: Junvyyang, Tencent Zhuque Lab <zhuque@tencent.com> Suggested-by: Junvyyang, Tencent Zhuque Lab <zhuque@tencent.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: LePremierHomme <kwqcheii@proton.me> cc: Marc Dionne <marc.dionne@auristor.com> cc: Willy Tarreau <w@1wt.eu> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org Link: https://patch.msgid.link/20250708211506.2699012-3-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	rxrpc: Fix bug due to prealloc collision	David Howells
	When userspace is using AF_RXRPC to provide a server, it has to preallocate incoming calls and assign to them call IDs that will be used to thread related recvmsg() and sendmsg() together. The preallocated call IDs will automatically be attached to calls as they come in until the pool is empty. To the kernel, the call IDs are just arbitrary numbers, but userspace can use the call ID to hold a pointer to prepared structs. In any case, the user isn't permitted to create two calls with the same call ID (call IDs become available again when the call ends) and EBADSLT should result from sendmsg() if an attempt is made to preallocate a call with an in-use call ID. However, the cleanup in the error handling will trigger both assertions in rxrpc_cleanup_call() because the call isn't marked complete and isn't marked as having been released. Fix this by setting the call state in rxrpc_service_prealloc_one() and then marking it as being released before calling the cleanup function. Fixes: 00e907127e6f ("rxrpc: Preallocate peers, conns and calls for incoming service requests") Reported-by: Junvyyang, Tencent Zhuque Lab <zhuque@tencent.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: LePremierHomme <kwqcheii@proton.me> cc: Marc Dionne <marc.dionne@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org Link: https://patch.msgid.link/20250708211506.2699012-2-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	vsock/test: fix test for null ptr deref when transport changes	Stefano Garzarella
	In test_stream_transport_change_client(), the client sends CONTROL_CONTINUE on each iteration, even when connect() is unsuccessful. This causes a flood of control messages in the server that hangs around for more than 10 seconds after the test finishes, triggering several timeouts and causing subsequent tests to fail. This was discovered in testing a newly proposed test that failed in this way on the client side: ... 33 - SOCK_STREAM transport change null-ptr-deref...ok 34 - SOCK_STREAM ioctl(SIOCINQ) functionality...recv timed out The CONTROL_CONTINUE message is used only to tell to the server to call accept() to consume successful connections, so that subsequent connect() will not fail for finding the queue full. Send CONTROL_CONTINUE message only when the connect() has succeeded, or found the queue full. Note that the second connect() can also succeed if the first one was interrupted after sending the request. Fixes: 3a764d93385c ("vsock/test: Add test for null ptr deref when transport changes") Cc: leonardi@redhat.com Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Luigi Leonardi <leonardi@redhat.com> Link: https://patch.msgid.link/20250708111701.129585-1-sgarzare@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	Merge branch 'net-phy-bcm54811-phy-initialization'	Jakub Kicinski
	says: ==================== net: phy: bcm54811: PHY initialization Proper bcm54811 PHY driver initialization for MII-Lite. The bcm54811 PHY in MLP package must be setup for MII-Lite interface mode by software. Normally, the PHY to MAC interface is selected in hardware by setting the bootstrap pins of the PHY. However, MII and MII-Lite share the same hardware setup and must be distinguished by software, setting appropriate bit in a configuration register. The MII-Lite interface mode is non-standard one, defined by Broadcom for some of their PHYs. The MII-Lite lightness consist in omitting RXER, TXER, CRS and COL signals of the standard MII interface. Absence of COL them makes half-duplex links modes impossible but does not interfere with Broadcom's BroadR-Reach link modes, because they are full-duplex only. To do it in a clean way, MII-Lite must be introduced first, including its limitation to link modes (no half-duplex), because it is a prerequisite for the patch #3 of this series. The patch #4 does not depend on MII-Lite directly but both #3 and #4 are necessary for bcm54811 to work properly without additional configuration steps to be done - for example in the bootloader, before the kernel starts. PATCH 1 - Add MII-Lite PHY interface mode as defined by Broadcom for their two-wire PHYs. It can be used with most Ethernet controllers under certain limitations (no half-duplex link modes etc.). PATCH 2 - Add MII-Lite PHY interface type PATCH 3 - Activation of MII-Lite interface mode on Broadcom bcm5481x PHYs PATCH 4 - Initialize the BCM54811 PHY properly so that it conforms to the datasheet regarding a reserved bit in the LRE Control register, which must be written to zero after every device reset. Ignore the LDS capability bit in LRE Status register on bcm54811. ==================== Link: https://patch.msgid.link/20250708090140.61355-1-kamilh@axis.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	net: phy: bcm54811: PHY initialization	Kamil Horák - 2N
	Reset the bit 12 in PHY's LRE Control register upon initialization. According to the datasheet, this bit must be written to zero after every device reset. Signed-off-by: Kamil Horák - 2N <kamilh@axis.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20250708090140.61355-5-kamilh@axis.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	net: phy: bcm5481x: MII-Lite activation	Kamil Horák - 2N
	Broadcom PHYs featuring the BroadR-Reach two-wire link mode are usually capable to operate in simplified MII mode, without TXER, RXER, CRS and COL signals as defined for the MII. The absence of COL signal makes half-duplex link modes impossible, however, the BroadR-Reach modes are all full-duplex only. Depending on the IC encapsulation, there exist MII-Lite-only PHYs such as bcm54811 in MLP. The PHY itself is hardware-strapped to select among multiple RGMII and MII-Lite modes, but the MII-Lite mode must be also activated by software. Add MII-Lite activation for bcm5481x PHYs. Signed-off-by: Kamil Horák - 2N <kamilh@axis.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/20250708090140.61355-4-kamilh@axis.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	dt-bindings: ethernet-phy: add MII-Lite phy interface type	Kamil Horák - 2N
	Some Broadcom PHYs are capable to operate in simplified MII mode, without TXER, RXER, CRS and COL signals as defined for the MII. The MII-Lite mode can be used on most Ethernet controllers with full MII interface by just leaving the input signals (RXER, CRS, COL) inactive. The absence of COL signal makes half-duplex link modes impossible but does not interfere with BroadR-Reach link modes on Broadcom PHYs, because they are all full-duplex only. Add new interface type "mii-lite" to phy-connection-type enum. Signed-off-by: Kamil Horák - 2N <kamilh@axis.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Acked-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20250708090140.61355-3-kamilh@axis.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	net: phy: MII-Lite PHY interface mode	Kamil Horák - 2N
	Some Broadcom PHYs are capable to operate in simplified MII mode, without TXER, RXER, CRS and COL signals as defined for the MII. The MII-Lite mode can be used on most Ethernet controllers with full MII interface by just leaving the input signals (RXER, CRS, COL) inactive. The absence of COL signal makes half-duplex link modes impossible but does not interfere with BroadR-Reach link modes on Broadcom PHYs, because they are all full-duplex only. Add MII-Lite interface mode, especially for Broadcom two-wire PHYs. Signed-off-by: Kamil Horák - 2N <kamilh@axis.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/20250708090140.61355-2-kamilh@axis.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	MAINTAINERS: remove myself as netronome maintainer	Louis Peens
	I am moving on from Corigine to different things, for the moment slightly removed from kernel development. Right now there is nobody I can in good conscience recommend to take over the maintainer role, but there are still people available for review, so put the driver state to 'Odd Fixes'. Additionally add Simon Horman as reviewer - thanks Simon. Signed-off-by: Louis Peens <louis.peens@corigine.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	net: usb: enable the work after stop usbnet by ip down/up	Zqiang
	Oleksij reported that: The smsc95xx driver fails after one down/up cycle, like this: $ nmcli device set enu1u1 managed no $ p a a 10.10.10.1/24 dev enu1u1 $ ping -c 4 10.10.10.3 $ ip l s dev enu1u1 down $ ip l s dev enu1u1 up $ ping -c 4 10.10.10.3 The second ping does not reach the host. Networking also fails on other interfaces. Enable the work by replacing the disable_work_sync() with cancel_work_sync(). [Jun Miao: completely write the commit changelog] Fixes: 2c04d279e857 ("net: usb: Convert tasklet API to new bottom half workqueue mechanism") Reported-by: Oleksij Rempel <o.rempel@pengutronix.de> Tested-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Zqiang <qiang.zhang@linux.dev> Signed-off-by: Jun Miao <jun.miao@intel.com> Link: https://patch.msgid.link/20250708081653.307815-1-jun.miao@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	Merge branch 'vsock-introduce-siocinq-ioctl-support'	Jakub Kicinski
	Xuewei Niu says: ==================== vsock: Introduce SIOCINQ ioctl support Introduce SIOCINQ ioctl support for vsock, indicating the length of unread bytes. Similar with SIOCOUTQ ioctl, the information is transport-dependent. The first patch adds SIOCINQ ioctl support in AF_VSOCK. Thanks to @dexuan, the second patch is to fix the issue where hyper-v `hvs_stream_has_data()` doesn't return the readable bytes. The third patch wraps the ioctl into `ioctl_int()`, which implements a retry mechanism to prevent immediate failure. The last one adds two test cases to check the functionality. The changes have been tested, and the results are as expected. ==================== Link: https://patch.msgid.link/20250708-siocinq-v6-0-3775f9a9e359@antgroup.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	test/vsock: Add ioctl SIOCINQ tests	Xuewei Niu
	Add SIOCINQ ioctl tests for both SOCK_STREAM and SOCK_SEQPACKET. The client waits for the server to send data, and checks if the SIOCINQ ioctl value matches the data size. After consuming the data, the client checks if the SIOCINQ value is 0. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Tested-by: Luigi Leonardi <leonardi@redhat.com> Reviewed-by: Luigi Leonardi <leonardi@redhat.com> Link: https://patch.msgid.link/20250708-siocinq-v6-4-3775f9a9e359@antgroup.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	test/vsock: Add retry mechanism to ioctl wrapper	Xuewei Niu
	Wrap the ioctl in `ioctl_int()`, which takes a pointer to the actual int value and an expected int value. The function will not return until either the ioctl returns the expected value or a timeout occurs, thus avoiding immediate failure. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Luigi Leonardi <leonardi@redhat.com> Link: https://patch.msgid.link/20250708-siocinq-v6-3-3775f9a9e359@antgroup.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	vsock: Add support for SIOCINQ ioctl	Xuewei Niu
	Add support for SIOCINQ ioctl, indicating the length of bytes unread in the socket. The value is obtained from `vsock_stream_has_data()`. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Luigi Leonardi <leonardi@redhat.com> Link: https://patch.msgid.link/20250708-siocinq-v6-2-3775f9a9e359@antgroup.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	hv_sock: Return the readable bytes in hvs_stream_has_data()	Dexuan Cui
	When hv_sock was originally added, __vsock_stream_recvmsg() and vsock_stream_has_data() actually only needed to know whether there is any readable data or not, so hvs_stream_has_data() was written to return 1 or 0 for simplicity. However, now hvs_stream_has_data() should return the readable bytes because vsock_data_ready() -> vsock_stream_has_data() needs to know the actual bytes rather than a boolean value of 1 or 0. The SIOCINQ ioctl support also needs hvs_stream_has_data() to return the readable bytes. Let hvs_stream_has_data() return the readable bytes of the payload in the next host-to-guest VMBus hv_sock packet. Note: there may be multiple incoming hv_sock packets pending in the VMBus channel's ringbuffer, but so far there is not a VMBus API that allows us to know all the readable bytes in total without reading and caching the payload of the multiple packets, so let's just return the readable bytes of the next single packet. In the future, we'll either add a VMBus API that allows us to know the total readable bytes without touching the data in the ringbuffer, or the hv_sock driver needs to understand the VMBus packet format and parse the packets directly. Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Wei Liu <wei.liu@kernel.org> Link: https://patch.msgid.link/20250708-siocinq-v6-1-3775f9a9e359@antgroup.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	Documentation: xsk: correct the obsolete references and examples	Jason Xing
	The modified lines are mainly related to the following commits[1][2] which remove those tests and examples. Since samples/bpf has been deprecated, we can refer to more examples that are easily searched in the various xdp-projects, like the following link: https://github.com/xdp-project/bpf-examples/tree/main/AF_XDP-example [1] commit f36600634282 ("libbpf: move xsk.{c,h} into selftests/bpf") [2] commit cfb5a2dbf141 ("bpf, samples: Remove AF_XDP samples") Signed-off-by: Jason Xing <kernelxing@tencent.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250708062907.11557-1-kerneljasonxing@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	skbuff: Add MSG_MORE flag to optimize tcp large packet transmission	Feng Yang
	When using sockmap for forwarding, the average latency for different packet sizes after sending 10,000 packets is as follows: size old(us) new(us) 512 56 55 1472 58 58 1600 106 81 3000 145 105 5000 182 125 Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Feng Yang <yangfeng@kylinos.cn> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250708054053.39551-1-yangfeng59949@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	Merge branch 'converge-on-using-secs_to_jiffies-part-two'	Jakub Kicinski
	Easwar Hariharan says: ==================== Converge on using secs_to_jiffies() part two This is the second series (part 1) that converts users of msecs_to_jiffies() that either use the multiply pattern of either of: - msecs_to_jiffies(N1000) or - msecs_to_jiffies(N*MSEC_PER_SEC) where N is a constant or an expression, to avoid the multiplication. The conversion is made with Coccinelle with the secs_to_jiffies() script in scripts/coccinelle/misc. Attention is paid to what the best change can be rather than restricting to what the tool provides. v1: https://lore.kernel.org/20250219-netdev-secs-to-jiffies-part-2-v1-0-c484cc63611b@linux.microsoft.com ==================== Link: https://patch.msgid.link/20250707-netdev-secs-to-jiffies-part-2-v2-0-b7817036342f@linux.microsoft.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	net: ipconfig: convert timeouts to secs_to_jiffies()	Easwar Hariharan
	Commit b35108a51cf7 ("jiffies: Define secs_to_jiffies()") introduced secs_to_jiffies(). As the value here is a multiple of 1000, use secs_to_jiffies() instead of msecs_to_jiffies to avoid the multiplication. This is converted using scripts/coccinelle/misc/secs_to_jiffies.cocci with the following Coccinelle rules: @depends on patch@ expression E; @@ -msecs_to_jiffies(E * 1000) +secs_to_jiffies(E) -msecs_to_jiffies(E * MSEC_PER_SEC) +secs_to_jiffies(E) While here, manually convert a couple timeouts denominated in seconds Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com> Link: https://patch.msgid.link/20250707-netdev-secs-to-jiffies-part-2-v2-2-b7817036342f@linux.microsoft.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	net/smc: convert timeouts to secs_to_jiffies()	Easwar Hariharan
	Commit b35108a51cf7 ("jiffies: Define secs_to_jiffies()") introduced secs_to_jiffies(). As the value here is a multiple of 1000, use secs_to_jiffies() instead of msecs_to_jiffies to avoid the multiplication. This is converted using scripts/coccinelle/misc/secs_to_jiffies.cocci with the following Coccinelle rules: @depends on patch@ expression E; @@ -msecs_to_jiffies(E * 1000) +secs_to_jiffies(E) -msecs_to_jiffies(E * MSEC_PER_SEC) +secs_to_jiffies(E) Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com> Reviewed-by: Dust Li <dust.li@linux.alibaba.com> Reviewed-by: Guangguan Wang <guangguan.wang@linux.alibaba.com> Link: https://patch.msgid.link/20250707-netdev-secs-to-jiffies-part-2-v2-1-b7817036342f@linux.microsoft.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	Merge branch 'tcp-better-memory-control-for-not-yet-accepted-sockets'	Jakub Kicinski
	Eric Dumazet says: ==================== tcp: better memory control for not-yet-accepted sockets Address a possible OOM condition caused by a recent change. Add a new packetdrill test checking the expected behavior. ==================== Link: https://patch.msgid.link/20250707213900.1543248-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	selftests/net: packetdrill: add tcp_ooo-before-and-after-accept.pkt	Eric Dumazet
	Test how new passive flows react to ooo incoming packets. Their sk_rcvbuf can increase only after accept(). Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250707213900.1543248-3-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	tcp: refine sk_rcvbuf increase for ooo packets	Eric Dumazet
	When a passive flow has not been accepted yet, it is not wise to increase sk_rcvbuf when receiving ooo packets. A very busy server might tune down tcp_rmem[1] to better control how much memory can be used by sockets waiting in its listeners accept queues. Fixes: 63ad7dfedfae ("tcp: adjust rcvbuf in presence of reorders") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250707213900.1543248-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	net/sched: Abort __tc_modify_qdisc if parent class does not exist	Victor Nogueira
	Lion's patch [1] revealed an ancient bug in the qdisc API. Whenever a user creates/modifies a qdisc specifying as a parent another qdisc, the qdisc API will, during grafting, detect that the user is not trying to attach to a class and reject. However grafting is performed after qdisc_create (and thus the qdiscs' init callback) is executed. In qdiscs that eventually call qdisc_tree_reduce_backlog during init or change (such as fq, hhf, choke, etc), an issue arises. For example, executing the following commands: sudo tc qdisc add dev lo root handle a: htb default 2 sudo tc qdisc add dev lo parent a: handle beef fq Qdiscs such as fq, hhf, choke, etc unconditionally invoke qdisc_tree_reduce_backlog() in their control path init() or change() which then causes a failure to find the child class; however, that does not stop the unconditional invocation of the assumed child qdisc's qlen_notify with a null class. All these qdiscs make the assumption that class is non-null. The solution is ensure that qdisc_leaf() which looks up the parent class, and is invoked prior to qdisc_create(), should return failure on not finding the class. In this patch, we leverage qdisc_leaf to return ERR_PTRs whenever the parentid doesn't correspond to a class, so that we can detect it earlier on and abort before qdisc_create is called. [1] https://lore.kernel.org/netdev/d912cbd7-193b-4269-9857-525bee8bbb6a@gmail.com/ Fixes: 5e50da01d0ce ("[NET_SCHED]: Fix endless loops (part 2): "simple" qdiscs") Reported-by: syzbot+d8b58d7b0ad89a678a16@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/68663c93.a70a0220.5d25f.0857.GAE@google.com/ Reported-by: syzbot+5eccb463fa89309d8bdc@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/68663c94.a70a0220.5d25f.0858.GAE@google.com/ Reported-by: syzbot+1261670bbdefc5485a06@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/686764a5.a00a0220.c7b3.0013.GAE@google.com/ Reported-by: syzbot+15b96fc3aac35468fe77@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/686764a5.a00a0220.c7b3.0014.GAE@google.com/ Reported-by: syzbot+4dadc5aecf80324d5a51@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/68679e81.a70a0220.29cf51.0016.GAE@google.com/ Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Victor Nogueira <victor@mojatatu.com> Link: https://patch.msgid.link/20250707210801.372995-1-victor@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	gve: make IRQ handlers and page allocation NUMA aware	Bailey Forrest
	All memory in GVE is currently allocated without regard for the NUMA node of the device. Because access to NUMA-local memory access is significantly cheaper than access to a remote node, this change attempts to ensure that page frags used in the RX path, including page pool frags, are allocated on the NUMA node local to the gVNIC device. Note that this attempt is best-effort. If necessary, the driver will still allocate non-local memory, as __GFP_THISNODE is not passed. Descriptor ring allocations are not updated, as dma_alloc_coherent handles that. This change also modifies the IRQ affinity setting to only select CPUs from the node local to the device, preserving the behavior that TX and RX queues of the same index share CPU affinity. Signed-off-by: Bailey Forrest <bcf@google.com> Signed-off-by: Joshua Washington <joshwash@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250707210107.2742029-1-jeroendb@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	net: ethernet: ti: am65-cpsw-nuss: Fix skb size by accounting for ↵	Chintan Vankar
	skb_shared_info While transitioning from netdev_alloc_ip_align() to build_skb(), memory for the "skb_shared_info" member of an "skb" was not allocated. Fix this by allocating "PAGE_SIZE" as the skb length, accounting for the packet length, headroom and tailroom, thereby including the required memory space for skb_shared_info. Fixes: 8acacc40f733 ("net: ethernet: ti: am65-cpsw: Add minimal XDP support") Reviewed-by: Siddharth Vadapalli <s-vadapalli@ti.com> Signed-off-by: Chintan Vankar <c-vankar@ti.com> Link: https://patch.msgid.link/20250707085201.1898818-1-c-vankar@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>