path: root/drivers/net/ethernet/ibm
Age  Commit message  Author
2024-05-07  net: annotate writes on dev->mtu from ndo_change_mtu()  (Eric Dumazet)
Simon reported that ndo_change_mtu() methods were never updated to use WRITE_ONCE(dev->mtu, new_mtu) as hinted in commit 501a90c94510 ("inet: protect against too small mtu values.") We read dev->mtu without holding RTNL in many places, with READ_ONCE() annotations. It is time to take care of ndo_change_mtu() methods to use corresponding WRITE_ONCE() Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Simon Horman <horms@kernel.org> Closes: https://lore.kernel.org/netdev/20240505144608.GB67882@kernel.org/ Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Shannon Nelson <shannon.nelson@amd.com> Link: https://lore.kernel.org/r/20240506102812.3025432-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
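For context, the change amounts to pairing the existing lockless READ_ONCE() readers with a WRITE_ONCE() store in each driver's MTU handler. A minimal sketch of such a handler (driver name and the absence of range checks are illustrative, not taken from this tree):

    #include <linux/netdevice.h>

    /* Sketch of an ndo_change_mtu() handler using the annotated store. */
    static int example_change_mtu(struct net_device *dev, int new_mtu)
    {
            /* dev->mtu may be read without RTNL elsewhere, so annotate the write. */
            WRITE_ONCE(dev->mtu, new_mtu);
            return 0;
    }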
2024-04-24  net: ibm/emac: allocate dummy net_device dynamically  (Breno Leitao)
Embedding net_device into structures prohibits the usage of flexible arrays in the net_device structure. For more details, see the discussion at [1]. Un-embed the net_device from the private struct by converting it into a pointer. Then leverage the new alloc_netdev_dummy() helper to allocate and initialize dummy devices. [1] https://lore.kernel.org/all/20240229225910.79e224cf@kernel.org/ Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: David S. Miller <davem@davemloft.net>
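A rough sketch of the conversion pattern described above, assuming a private structure that previously embedded the dummy device (structure and function names are illustrative, not the emac code):

    #include <linux/netdevice.h>

    struct example_priv {
            struct net_device *dummy_dev;   /* was: struct net_device dummy_dev; */
            struct napi_struct napi;
    };

    static int example_setup(struct example_priv *priv)
    {
            /* Allocate the dummy netdev instead of embedding it. */
            priv->dummy_dev = alloc_netdev_dummy(0);
            if (!priv->dummy_dev)
                    return -ENOMEM;
            return 0;
    }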
2024-04-18  ibmvnic: Return error code on TX scrq flush fail  (Nick Child)
In ibmvnic_xmit() if ibmvnic_tx_scrq_flush() returns H_CLOSED then it will inform upper level networking functions to disable tx queues. H_CLOSED signals that the connection with the vnic server is down and a transport event is expected to recover the device. Previously, ibmvnic_tx_scrq_flush() was hard-coded to return success. Therefore, the queues would remain active until ibmvnic_cleanup() is called within do_reset(). The problem is that do_reset() depends on the RTNL lock. If several ibmvnic devices are resetting then there can be a long wait time until the last device can grab the lock. During this time the tx/rx queues still appear active to upper level functions. FYI, we do make a call to netif_carrier_off() outside the RTNL lock but its calls to dev_deactivate() are also dependent on the RTNL lock. As a result, large amounts of retransmissions were observed in a short period of time, eventually leading to ETIMEOUT. This was specifically seen with HNV devices, likely because of even more RTNL dependencies. Therefore, ensure the return code of ibmvnic_tx_scrq_flush() is propagated to the xmit function to allow for an earlier (and lock-less) response to a transport event. Signed-off-by: Nick Child <nnac123@linux.ibm.com> Link: https://lore.kernel.org/r/20240416164128.387920-1-nnac123@linux.ibm.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
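A hedged sketch of the idea (illustrative only, not the exact driver code): once the flush reports that the CRQ is closed, the xmit path can stop the queues itself instead of waiting for the RTNL-serialized reset path.

    /* Illustrative: react in xmit as soon as the flush reports H_CLOSED. */
    rc = ibmvnic_tx_scrq_flush(adapter, tx_scrq);   /* now returns a status */
    if (rc == H_CLOSED) {
            /* Transport is down; a reset will recover the device later. */
            netif_tx_stop_all_queues(netdev);
            dev_kfree_skb_any(skb);
            return NETDEV_TX_OK;
    }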
2024-01-08  mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER  (Kirill A. Shutemov)
commit 23baf831a32c ("mm, treewide: redefine MAX_ORDER sanely") has changed the definition of MAX_ORDER to be inclusive. This has caused issues with code that was not yet upstream and depended on the previous definition. To draw attention to the altered meaning of the define, rename MAX_ORDER to MAX_PAGE_ORDER. Link: https://lkml.kernel.org/r/20231228144704.14033-2-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-13  ibmvnic: replace deprecated strncpy with strscpy  (Justin Stitt)
`strncpy` is deprecated for use on NUL-terminated destination strings [1] and as such we should prefer more robust and less ambiguous string interfaces. NUL-padding is not required as the buffer is already memset to 0: | memset(adapter->fw_version, 0, 32); Note that another usage of strscpy exists on the same buffer: | strscpy((char *)adapter->fw_version, "N/A", sizeof(adapter->fw_version)); Considering the above, a suitable replacement is `strscpy` [2] because it guarantees NUL-termination on the destination buffer without unnecessary NUL-padding. Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1] Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2] Link: https://github.com/KSPP/linux/issues/90 Cc: linux-hardening@vger.kernel.org Signed-off-by: Justin Stitt <justinstitt@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
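For reference, the replacement pattern described above looks roughly like this; the source variable name is hypothetical, and the buffer size follows the memset quoted in the message:

    /* Before: NUL-termination is not guaranteed if the source fills the buffer. */
    strncpy(adapter->fw_version, (char *)fw_version_from_vios, 32);

    /* After: strscpy() always NUL-terminates and does not pad the remainder. */
    strscpy(adapter->fw_version, (char *)fw_version_from_vios,
            sizeof(adapter->fw_version));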
2023-10-11  netdev: replace napi_reschedule with napi_schedule  (Christian Marangi)
Now that napi_schedule() returns a bool, we can drop napi_reschedule(), which performs the exact same function. The function comes from a very old commit bfe13f54f502 ("ibm_emac: Convert to use napi_struct independent of struct net_device") and its purpose has effectively been deprecated in favour of different logic. Convert every user of napi_reschedule to napi_schedule. Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com> # ath10k Acked-by: Nick Child <nnac123@linux.ibm.com> # ibm Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for can/dev/rx-offload.c Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20231009133754.9834-3-ansuelsmth@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
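The conversion is mechanical; a small before/after sketch (the private structure name is illustrative):

    bool rearmed;

    /* Before: napi_reschedule() returned true if polling was re-armed. */
    rearmed = napi_reschedule(&priv->napi);

    /* After: napi_schedule() now returns the same bool, so use it directly. */
    rearmed = napi_schedule(&priv->napi);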
2023-10-05  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  (Jakub Kicinski)
Cross-merge networking fixes after downstream PR. No conflicts (or adjacent changes of note). Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04  ibmveth: Remove condition to recompute TCP header checksum.  (David Wilder)
In some OVS environments the TCP pseudo header checksum may need to be recomputed. Currently this is only done when the interface instance is configured for "Trunk Mode". We found the issue also occurs in some Kubernetes environments; these environments do not use "Trunk Mode", therefore the condition is removed. Performance tests with this change show only a fractional decrease in throughput (< 0.2%). Fixes: 7525de2516fb ("ibmveth: Set CHECKSUM_PARTIAL if NULL TCP CSUM.") Signed-off-by: David Wilder <dwilder@us.ibm.com> Reviewed-by: Nick Child <nnac123@linux.ibm.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-20  net: ethernet: ibm: Convert to platform remove callback returning void  (Uwe Kleine-König)
The .remove() callback for a platform driver returns an int which makes many driver authors wrongly assume it's possible to do error handling by returning an error code. However the value returned is ignored (apart from emitting a warning) and this typically results in resource leaks. To improve here there is a quest to make the remove callback return void. In the first step of this quest all drivers are converted to .remove_new() which already returns void. Eventually after all drivers are converted, .remove_new() is renamed to .remove(). Trivially convert these drivers from always returning zero in the remove callback to the void returning variant. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
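A minimal sketch of the converted form (driver and function names are placeholders, probe omitted):

    #include <linux/platform_device.h>

    /* After the conversion: the callback returns void and is wired up via
     * .remove_new (previously a .remove callback returning an ignored int). */
    static void example_remove(struct platform_device *pdev)
    {
            /* release driver resources here; there is nothing useful to return */
    }

    static struct platform_driver example_driver = {
            .remove_new = example_remove,
            .driver = {
                    .name = "example",
            },
    };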
2023-08-24  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  (Jakub Kicinski)
Cross-merge networking fixes after downstream PR.

Conflicts:

include/net/inet_sock.h
  f866fbc842de ("ipv4: fix data-races around inet->inet_id")
  c274af224269 ("inet: introduce inet->inet_flags")
  https://lore.kernel.org/all/679ddff6-db6e-4ff6-b177-574e90d0103d@tessares.net/

Adjacent changes:

drivers/net/bonding/bond_alb.c
  e74216b8def3 ("bonding: fix macvlan over alb bond support")
  f11e5bd159b0 ("bonding: support balance-alb with openvswitch")

drivers/net/ethernet/broadcom/bgmac.c
  d6499f0b7c7c ("net: bgmac: Return PTR_ERR() for fixed_phy_register()")
  23a14488ea58 ("net: bgmac: Fix return value check for fixed_phy_register()")

drivers/net/ethernet/broadcom/genet/bcmmii.c
  32bbe64a1386 ("net: bcmgenet: Fix return value check for fixed_phy_register()")
  acf50d1adbf4 ("net: bcmgenet: Return PTR_ERR() for fixed_phy_register()")

net/sctp/socket.c
  f866fbc842de ("ipv4: fix data-races around inet->inet_id")
  b09bde5c3554 ("inet: move inet->mc_loop to inet->inet_frags")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-23  ibmveth: Use dcbf rather than dcbfl  (Michael Ellerman)
When building for power4, newer binutils don't recognise the "dcbfl" extended mnemonic. dcbfl RA, RB is equivalent to dcbf RA, RB, 1. Switch to "dcbf" to avoid the build error. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-10  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  (Jakub Kicinski)
Cross-merge networking fixes after downstream PR. No conflicts.

Adjacent changes:

drivers/net/ethernet/intel/igc/igc_main.c
  06b412589eef ("igc: Add lock to safeguard global Qbv variables")
  d3750076d464 ("igc: Add TransmissionOverrun counter")

drivers/net/ethernet/microsoft/mana/mana_en.c
  a7dfeda6fdec ("net: mana: Fix MANA VF unload when hardware is unresponsive")
  a9ca9f9ceff3 ("page_pool: split types and declarations from page_pool.h")
  92272ec4107e ("eth: add missing xdp.h includes in drivers")

net/mptcp/protocol.h
  511b90e39250 ("mptcp: fix disconnect vs accept race")
  b8dc6d6ce931 ("mptcp: fix rcv buffer auto-tuning")

tools/testing/selftests/net/mptcp/mptcp_join.sh
  c8c101ae390a ("selftests: mptcp: join: fix 'implicit EP' test")
  03668c65d153 ("selftests: mptcp: join: rework detailed report")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-10  ibmvnic: Ensure login failure recovery is safe from other resets  (Nick Child)
If a login request fails, the recovery process should be protected against parallel resets. It is a known issue that freeing and registering CRQs in quick succession can result in a failover CRQ from the VIOS. Processing a failover during login recovery is dangerous for two reasons: 1. This will result in two parallel initialization processes, which can cause serious issues during login. 2. It is possible that the failover CRQ is received but never executed. We get notified of a pending failover through a transport event CRQ. The reset is not performed until an INIT CRQ request is received. Previously, if CRQ init failed during login recovery, the ibmvnic IRQ was freed and the login process returned an error. If failover_pending is true (a transport event was received), then the ibmvnic device would never be able to process the reset since it cannot receive the CRQ_INIT request due to the IRQ being freed. This left the device in an inoperable state. Therefore, the login failure recovery process must be hardened against these possible issues. Possible failovers (due to quick CRQ free and init) must be avoided and any issues during re-initialization should be dealt with instead of being propagated up the stack. This logic is similar to that of ibmvnic_probe(). Fixes: dff515a3e71d ("ibmvnic: Harden device login requests") Signed-off-by: Nick Child <nnac123@linux.ibm.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230809221038.51296-5-nnac123@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-10  ibmvnic: Do partial reset on login failure  (Nick Child)
Perform a partial reset before sending a login request if any of the following are true: 1. If a previous request times out. This can be dangerous because the VIOS could still receive the old login request at any point after the timeout. Therefore, it is best to re-register the CRQs and sub-CRQs before retrying. 2. If the previous request returns an error that is not described in PAPR. PAPR provides procedures if the login returns with partial success or aborted return codes (section L.5.1) but other values do not have a defined procedure. Previously, these conditions just returned an error from the login function rather than trying to resolve the issue. This can cause further issues since most callers of the login function are not prepared to handle an error when logging in. This improper cleanup can lead to the device being permanently DOWN'd. For example, if the VIOS believes that the device is already logged in then it will return INVALID_STATE (-7). If we never re-register CRQs then it will always think that the device is already logged in. This leaves the device inoperable. The partial reset involves freeing the sub-CRQs, freeing the CRQ then registering and initializing a new CRQ and sub-CRQs. This essentially restarts all communication with VIOS to allow for a fresh login attempt that will be unhindered by any previous failed attempts. Fixes: dff515a3e71d ("ibmvnic: Harden device login requests") Signed-off-by: Nick Child <nnac123@linux.ibm.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230809221038.51296-4-nnac123@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-10  ibmvnic: Handle DMA unmapping of login buffs in release functions  (Nick Child)
Rather than leaving the DMA unmapping of the login buffers to the login response handler, move this work into the login release functions. Previously, these functions were only used for freeing the allocated buffers. This could lead to issues if there is more than one outstanding login buffer request, which is possible if a login request times out. If a login request times out, then there is another call to send login. The send login function makes a call to the login buffer release function. In the past, this freed the buffers but did not DMA unmap. Therefore, the VIOS could still write to the old login (now freed) buffer. It is for this reason that it is a good idea to leave the DMA unmap call to the login buffer release functions. Since the login buffer release functions now handle DMA unmapping, remove the duplicate DMA unmapping in handle_login_rsp(). Fixes: dff515a3e71d ("ibmvnic: Harden device login requests") Signed-off-by: Nick Child <nnac123@linux.ibm.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230809221038.51296-3-nnac123@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-10  ibmvnic: Unmap DMA login rsp buffer on send login fail  (Nick Child)
If the LOGIN CRQ fails to send then we must DMA unmap the response buffer. Previously, if the CRQ failed then the memory was freed without DMA unmapping. Fixes: c98d9cc4170d ("ibmvnic: send_login should check for crq errors") Signed-off-by: Nick Child <nnac123@linux.ibm.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230809221038.51296-2-nnac123@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-10  ibmvnic: Enforce stronger sanity checks on login response  (Nick Child)
Ensure that all offsets in a login response buffer are within the size of the allocated response buffer. Any offsets or lengths that surpass the allocation are likely the result of an incomplete response buffer. In these cases, a full reset is necessary. When attempting to login, the ibmvnic device will allocate a response buffer and pass a reference to the VIOS. The VIOS will then send the ibmvnic device a LOGIN_RSP CRQ to signal that the buffer has been filled with data. If the ibmvnic device does not get a response in 20 seconds, the old buffer is freed and a new login request is sent. With 2 outstanding requests, any LOGIN_RSP CRQ's could be for the older login request. If this is the case then the login response buffer (which is for the newer login request) could be incomplete and contain invalid data. Therefore, we must enforce strict sanity checks on the response buffer values. Testing has shown that the `off_rxadd_buff_size` value is filled in last by the VIOS and will be the smoking gun for these circumstances. Until VIOS can implement a mechanism for tracking outstanding response buffers and a method for mapping a LOGIN_RSP CRQ to a particular login response buffer, the best ibmvnic can do in this situation is perform a full reset. Fixes: dff515a3e71d ("ibmvnic: Harden device login requests") Signed-off-by: Nick Child <nnac123@linux.ibm.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230809221038.51296-1-nnac123@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-06  ibmvnic: remove unused rc variable  (Yu Liao)
gcc with W=1 reports:

    drivers/net/ethernet/ibm/ibmvnic.c:194:13: warning: variable 'rc' set but not used [-Wunused-but-set-variable]

This variable is not used so remove it. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202308040609.zQsSXWXI-lkp@intel.com/ Signed-off-by: Yu Liao <liaoyu15@huawei.com> Reviewed-by: Nick Child <nnac123@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-27  net: Explicitly include correct DT includes  (Rob Herring)
The DT of_device.h and of_platform.h headers date back to the separate of_platform_bus_type before it was merged into the regular platform bus. As part of that merge prepping Arm DT support 13 years ago, they "temporarily" include each other. They also include platform_device.h and of.h. As a result, there's a pretty much random mix of those include files used throughout the tree. In order to detangle these headers and replace the implicit includes with struct declarations, users need to explicitly include the correct includes. Acked-by: Alex Elder <elder@linaro.org> Reviewed-by: Bhupesh Sharma <bhupesh.sharma@linaro.org> Reviewed-by: Wei Fang <wei.fang@nxp.com> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230727014944.3972546-1-robh@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-29  ibmvnic: Do not reset dql stats on NON_FATAL err  (Nick Child)
All ibmvnic resets make a call to netdev_tx_reset_queue() when re-opening the device. netdev_tx_reset_queue() resets the num_queued and num_completed byte counters. These stats are used in Byte Queue Limit (BQL) algorithms. The difference between these two stats tracks the number of bytes currently sitting on the physical NIC. ibmvnic increases the number of queued bytes through calls to netdev_tx_sent_queue() in the driver's xmit function. When VIOS reports that it is done transmitting bytes, the ibmvnic device increases the number of completed bytes through calls to netdev_tx_completed_queue(). It is important to note that the driver batches its transmit calls and num_queued is increased every time that an skb is added to the next batch, not necessarily when the batch is sent to VIOS for transmission. Unlike other reset types, a NON_FATAL reset will not flush the sub crq tx buffers. Therefore, it is possible for the batched skb array to be partially full. So if there is a call to netdev_tx_reset_queue() when re-opening the device, the value of num_queued (0) would not account for the skbs that are currently batched. Eventually, when the batch is sent to VIOS, the call to netdev_tx_completed_queue() would increase num_completed to a value greater than num_queued. This causes a BUG_ON crash:

    ibmvnic 30000002: Firmware reports error, cause: adapter problem. Starting recovery...
    ibmvnic 30000002: tx error 600
    ibmvnic 30000002: tx error 600
    ibmvnic 30000002: tx error 600
    ibmvnic 30000002: tx error 600
    ------------[ cut here ]------------
    kernel BUG at lib/dynamic_queue_limits.c:27!
    Oops: Exception in kernel mode, sig: 5 [....]
    NIP dql_completed+0x28/0x1c0
    LR ibmvnic_complete_tx.isra.0+0x23c/0x420 [ibmvnic]
    Call Trace:
    ibmvnic_complete_tx.isra.0+0x3f8/0x420 [ibmvnic] (unreliable)
    ibmvnic_interrupt_tx+0x40/0x70 [ibmvnic]
    __handle_irq_event_percpu+0x98/0x270
    ---[ end trace ]---

Therefore, do not reset the dql stats when performing a NON_FATAL reset. Fixes: 0d973388185d ("ibmvnic: Introduce xmit_more support using batched subCRQ hcalls") Signed-off-by: Nick Child <nnac123@linux.ibm.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
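For readers unfamiliar with BQL, the three counters mentioned above are driven by the following calls; the sketch shows where each one sits in a generic driver, not the exact ibmvnic code:

    struct netdev_queue *txq = netdev_get_tx_queue(netdev, queue_index);

    /* xmit path: account bytes handed to the device (num_queued += skb->len). */
    netdev_tx_sent_queue(txq, skb->len);

    /* completion path: account what the device finished (num_completed += bytes).
     * The BUG_ON in dql_completed() fires if completed ever exceeds queued. */
    netdev_tx_completed_queue(txq, packets, bytes);

    /* open/reset path: zero both counters; skipped for NON_FATAL resets here. */
    netdev_tx_reset_queue(txq);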
2023-04-05  mm, treewide: redefine MAX_ORDER sanely  (Kirill A. Shutemov)
MAX_ORDER is currently defined as the number of orders the page allocator supports: a user can ask the buddy allocator for page orders between 0 and MAX_ORDER-1. This definition is counter-intuitive and has led to a number of bugs all over the kernel. Change the definition of MAX_ORDER to be inclusive: the range of orders a user can ask the buddy allocator for is now 0..MAX_ORDER. [kirill@shutemov.name: fix min() warning] Link: https://lkml.kernel.org/r/20230315153800.32wib3n5rickolvh@box [akpm@linux-foundation.org: fix another min_t warning] [kirill@shutemov.name: fixups per Zi Yan] Link: https://lkml.kernel.org/r/20230316232144.b7ic4cif4kjiabws@box.shutemov.name [akpm@linux-foundation.org: fix underlining in docs] Link: https://lore.kernel.org/oe-kbuild-all/202303191025.VRCTk6mP-lkp@intel.com/ Link: https://lkml.kernel.org/r/20230315113133.11326-11-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-03-16  net: Use of_property_read_bool() for boolean properties  (Rob Herring)
It is preferred to use typed property access functions (i.e. of_property_read_<type> functions) rather than low-level of_get_property/of_find_property functions for reading properties. Convert reading boolean properties to of_property_read_bool(). Reviewed-by: Simon Horman <simon.horman@corigine.com> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for net/can Acked-by: Kalle Valo <kvalo@kernel.org> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Acked-by: Francois Romieu <romieu@fr.zoreil.com> Reviewed-by: Wei Fang <wei.fang@nxp.com> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
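The conversion amounts to replacing a raw presence check with the typed helper; a small sketch (the property name is just an example):

    #include <linux/of.h>

    /* Before: low-level presence test via of_get_property(). */
    bool has_feature_old = of_get_property(np, "example,has-feature", NULL) != NULL;

    /* After: typed boolean accessor. */
    bool has_feature_new = of_property_read_bool(np, "example,has-feature");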
2023-02-24  ibmvnic: Assign XPS map to correct queue index  (Nick Child)
When setting the XPS map value for TX queues, use the index of the transmit queue. Previously, the function was passing the index of the loop that iterates over all queues (RX and TX). This was causing invalid XPS map values. Fixes: 6831582937bd ("ibmvnic: Toggle between queue types in affinity mapping") Signed-off-by: Nick Child <nnac123@linux.ibm.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Link: https://lore.kernel.org/r/20230223153944.44969-1-nnac123@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-31  ibmvnic: Toggle between queue types in affinity mapping  (Nick Child)
Previously, ibmvnic IRQs were assigned to CPU numbers by assigning all the IRQs for transmit queues then assigning all the IRQs for receive queues. With multi-threaded processors, in a heavy RX or TX environment, physical cores would either be overloaded or underutilized (due to the IRQ assignment algorithm). This approach is sub-optimal because IRQs for the same subprocess (RX or TX) would be bound to adjacent CPU numbers, meaning they were more likely to be contending for the same core. For example, in a system with 64 CPUs and 32 queues, the IRQs would be bound to CPUs in the following pattern:

    IRQ type | CPU number
    -----------------------
    TX0      | 0-1
    TX1      | 2-3
    <etc>
    RX0      | 32-33
    RX1      | 34-35
    <etc>

Observe that in SMT-8, the first 4 tx queues would be sharing the same core. A more optimal algorithm would balance the number of RX and TX IRQs across the physical cores. Therefore, to increase performance, distribute RX and TX IRQs across cores by alternating between assigning IRQs for RX and TX queues to CPUs. In a system with 64 CPUs and 32 queues, this results in the following pattern:

    IRQ type | CPU number
    -----------------------
    TX0      | 0-1
    RX0      | 2-3
    TX1      | 4-5
    RX1      | 6-7
    <etc>

Observe that in SMT-8, there is an equal distribution of RX and TX IRQs per core. In the above case, each core handles 2 TX and 2 RX IRQs. Signed-off-by: Nick Child <nnac123@linux.ibm.com> Reviewed-by: Haren Myneni <haren@linux.ibm.com> Link: https://lore.kernel.org/r/20230127214358.318152-1-nnac123@linux.ibm.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-14  ibmvnic: Update XPS assignments during affinity binding  (Nick Child)
Transmit Packet Steering (XPS) maps CPU numbers to transmit queues. By running the same connection on the same set of CPUs, contention for the queue and the cache miss rate can be minimized. When assigning a CPU mask for a transmit queue's IRQ number, assign the same CPU mask as the set of CPUs that XPS should use for that queue. Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Nick Child <nnac123@linux.ibm.com> Reviewed-by: Rick Lindsley <ricklind@linux.ibm.com> Reviewed-by: Haren Myneni <haren@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
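The underlying API here is netif_set_xps_queue(); a hedged sketch of reusing the IRQ affinity mask for the XPS map (variable names are illustrative):

    /* Use the same CPU mask for this tx queue's XPS map as for its IRQ
     * affinity hint, so the IRQ, softirq and transmitting task share CPUs. */
    int rc = netif_set_xps_queue(netdev, queue_cpu_mask, tx_queue_index);
    if (rc)
            netdev_warn(netdev, "XPS assignment failed for txq %d\n",
                        tx_queue_index);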
2022-11-14  ibmvnic: Add hotpluggable CPU callbacks to reassign affinity hints  (Nick Child)
When CPUs are added and removed, ibmvnic devices will reassign hint values. Introduce a new cpu hotplug state CPUHP_IBMVNIC_DEAD to signal to ibmvnic devices that a CPU has been removed and it is time to reset affinity hint assignments. On the other hand, when CPUs are being added, add a state instance to CPUHP_AP_ONLINE_DYN which will trigger a reassignment of affinity hints once the new CPUs are online. This implementation is based on the virtio_net driver. Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Nick Child <nnac123@linux.ibm.com> Reviewed-by: Rick Lindsley <ricklind@linux.ibm.com> Reviewed-by: Haren Myneni <haren@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
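A rough sketch of the hotplug-registration pattern this follows, using the dynamic online state; callback names are placeholders and the dedicated *_DEAD state is omitted for brevity:

    #include <linux/cpuhotplug.h>

    static int example_cpu_online(unsigned int cpu)
    {
            /* recompute and reapply affinity hints now that this CPU exists */
            return 0;
    }

    static int example_cpu_down_prep(unsigned int cpu)
    {
            /* drop hints that reference the CPU that is going away */
            return 0;
    }

    static int example_register_hotplug(void)
    {
            /* CPUHP_AP_ONLINE_DYN hands back a dynamically allocated state id. */
            int state = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "net/example:online",
                                          example_cpu_online, example_cpu_down_prep);
            return state < 0 ? state : 0;
    }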
2022-11-14  ibmvnic: Assign IRQ affinity hints to device queues  (Nick Child)
Assign affinity hints to ibmvnic device queue interrupts. Affinity hints are assigned and removed during sub-crq init and teardown, respectively. This update should improve latency when the hints are utilized, as interrupt lines and processing are more equally distributed among CPUs. This implementation is based on the virtio_net driver. Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Nick Child <nnac123@linux.ibm.com> Reviewed-by: Rick Lindsley <ricklind@linux.ibm.com> Reviewed-by: Haren Myneni <haren@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
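The hint itself is set per interrupt; a minimal sketch assuming one IRQ per sub-CRQ (the exact helper the driver uses may differ):

    #include <linux/interrupt.h>

    /* Suggest a CPU set for this queue's interrupt so that balancing
     * places the IRQ on the intended CPUs. */
    rc = irq_set_affinity_hint(scrq_irq, queue_cpu_mask);
    if (rc)
            pr_warn("failed to set affinity hint for irq %u\n", scrq_irq);

    /* On teardown, clear the hint before freeing the IRQ. */
    irq_set_affinity_hint(scrq_irq, NULL);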
2022-11-10  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  (Jakub Kicinski)
drivers/net/can/pch_can.c
  ae64438be192 ("can: dev: fix skb drop check")
  1dd1b521be85 ("can: remove obsolete PCH CAN driver")
  https://lore.kernel.org/all/20221110102509.1f7d63cc@canb.auug.org.au/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-08  ibmveth: Reduce default tx queues to 8  (Nick Child)
Previously, the default number of transmit queues was 16. Due to resource concerns, set to 8 queues instead. Still allow the user to set more queues (max 16) if they like. Since the driver is virtualized away from the physical NIC, the purpose of multiple queues is purely to allow for parallel calls to the hypervisor. Therefore, there is no noticeable effect on performance by reducing queue count to 8. Fixes: d926793c1de9 ("ibmveth: Implement multi queue on xmit") Reported-by: Dave Taht <dave.taht@gmail.com> Signed-off-by: Nick Child <nnac123@linux.ibm.com> Link: https://lore.kernel.org/r/20221107203215.58206-1-nnac123@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-03  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  (Jakub Kicinski)
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-02  ibmvnic: Free rwi on reset success  (Nick Child)
Free the rwi structure in the event that the last rwi in the list processed successfully. The logic in commit 4f408e1fa6e1 ("ibmvnic: retry reset if there are no other resets") introduces an issue that results in a 32 byte memory leak whenever the last rwi in the list gets processed. Fixes: 4f408e1fa6e1 ("ibmvnic: retry reset if there are no other resets") Signed-off-by: Nick Child <nnac123@linux.ibm.com> Link: https://lore.kernel.org/r/20221031150642.13356-1-nnac123@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-27  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  (Jakub Kicinski)
drivers/net/can/usb/kvaser_usb/kvaser_usb_leaf.c
  2871edb32f46 ("can: kvaser_usb: Fix possible completions during init_completion")
  abb8670938b2 ("can: kvaser_usb_leaf: Ignore stale bus-off after start")
  8d21f5927ae6 ("can: kvaser_usb_leaf: Fix improved state not being reported")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-27  net: ehea: fix possible memory leak in ehea_register_port()  (Yang Yingliang)
If of_device_register() returns an error, the of node and the name allocated in dev_set_name() are leaked. Call put_device() to give up the reference that was set in device_initialize(), so that the of node is put in logical_port_release() and the name is freed in kobject_cleanup(). Fixes: 1acf2318dd13 ("ehea: dynamic add / remove port") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20221025130011.1071357-1-yangyingliang@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
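The fix follows the usual device-model rule that a failed registration still needs put_device() to drop the reference taken at initialization; schematically (ehea structure details and the error message are simplified):

    ret = of_device_register(&port->ofdev);
    if (ret) {
            pr_err("ehea: failed to register port device, ret=%d\n", ret);
            /* Drop the reference from device_initialize(); the release
             * callback then puts the of node and frees the dev_set_name()
             * allocation via kobject_cleanup(). */
            put_device(&port->ofdev.dev);
            return NULL;
    }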
2022-10-24  ibmveth: Always stop tx queues during close  (Nick Child)
netif_stop_all_queues must be called before calling H_FREE_LOGICAL_LAN. As a result, we can remove the pool_config field from the ibmveth adapter structure. Some device configuration changes call ibmveth_close in order to free the current resources held by the device. These functions then make their changes and call ibmveth_open to reallocate and reserve resources for the device. Prior to this commit, the flag pool_config was used to tell ibmveth_close that it should not halt the transmit queue. pool_config was introduced in commit 860f242eb534 ("[PATCH] ibmveth change buffer pools dynamically") to avoid interrupting the tx flow when making rx config changes. Since then, other commits adopted this approach, even if making tx config changes. The issue with this approach was that the hypervisor freed all of the devices control structures after the hcall H_FREE_LOGICAL_LAN was performed but the transmit queues were never stopped. So the higher layers in the network stack would continue transmission but any H_SEND_LOGICAL_LAN hcall would fail with H_PARAMETER until the hypervisor's structures for the device were allocated with the H_REGISTER_LOGICAL_LAN hcall in ibmveth_open. This resulted in no real networking harm but did cause several of these error messages to be logged: "h_send_logical_lan failed with rc=-4" So, instead of trying to keep the transmit queues alive during network configuration changes, just stop the queues, make necessary changes then restart the queues. Signed-off-by: Nick Child <nnac123@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-30  ibmveth: Ethtool set queue support  (Nick Child)
Implement channel management functions to allow dynamic addition and removal of transmit queues. The `ethtool --show-channels` and `ethtool --set-channels` commands can be used to get and set the number of queues, respectively. Allow the ability to add as many transmit queues as available processors but never allow more than the hard maximum of 16. The number of receive queues is one and cannot be modified. Depending on whether the requested number of queues is larger or smaller than the current value, either allocate or free long term buffers. Since long term buffer construction and destruction can occur in two different areas, from either channel set requests or device open/close, define functions for performing this work. If allocation of a new buffer fails, then attempt to revert back to the previous number of queues. Signed-off-by: Nick Child <nnac123@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-30  ibmveth: Implement multi queue on xmit  (Nick Child)
The `ndo_start_xmit` function is protected by a spinlock on the tx queue being used to transmit the skb. Allow concurrent calls to `ndo_start_xmit` by using more than one tx queue. This allows for greater throughput when several jobs are trying to transmit data. Introduce 16 tx queues (leave single rx queue as is) which each correspond to one DMA mapped long term buffer. Signed-off-by: Nick Child <nnac123@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-30  ibmveth: Copy tx skbs into a premapped buffer  (Nick Child)
Rather than DMA mapping and unmapping every outgoing skb, copy the skb into a buffer that was mapped during the driver's open function. Copying the skb and its frags has proven to be more time efficient than mapping and unmapping. As an effect, performance increases by 3-5 Gbits/s. Allocate and DMA map one contiguous 64KB buffer at `ndo_open`. This buffer is maintained until `ibmveth_close` is called. This buffer is large enough to hold the largest possible linear skb. During `ndo_start_xmit`, copy the skb and all of its frags into the contiguous buffer. By manually linearizing all the socket buffers, time is saved during memcpy and the frame can be handled more efficiently in FW. As a result, we no longer need to worry about the firmware limitation of handling a max of 6 frags. So, we only need to maintain 1 descriptor instead of 6 and can hardcode 0 for the other 5 descriptors during h_send_logical_lan. Since DMA allocation/mapping issues can no longer arise in xmit functions, we can further reduce code size by removing the need for a bounce buffer on DMA errors. Signed-off-by: Nick Child <nnac123@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
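Conceptually the xmit path becomes a copy into an already-mapped area plus a single descriptor; a simplified sketch (field and variable names are illustrative, error handling omitted):

    /* tx_buf was allocated and DMA-mapped once in ndo_open(). */
    unsigned int len = skb->len;

    /* Linearize by copying the header and all frags into the flat buffer. */
    skb_copy_bits(skb, 0, tx_buf->addr, len);

    /* One descriptor now covers the whole frame; the other five stay zeroed. */
    desc.fields.flags_len = desc_flags | len;
    desc.fields.address = tx_buf->dma_addr;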
2022-09-28  net: drop the weight argument from netif_napi_add  (Jakub Kicinski)
We tell driver developers to always pass NAPI_POLL_WEIGHT as the weight to netif_napi_add(). This may be confusing to newcomers, drop the weight argument, those who really need to tweak the weight can use netif_napi_add_weight(). Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for CAN Link: https://lore.kernel.org/r/20220927132753.750069-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
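After this change the common case needs no weight at all; a small sketch of both forms (the poll callback name is a placeholder):

    /* Common case: the default weight (NAPI_POLL_WEIGHT) is implied. */
    netif_napi_add(netdev, &priv->napi, example_poll);

    /* Drivers that really need a custom weight say so explicitly. */
    netif_napi_add_weight(netdev, &priv->napi, example_poll, 16);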
2022-09-21  net: ibm: emac: Switch to use dev_err_probe() helper  (Yang Yingliang)
dev_err() can be replaced with dev_err_probe(), which will check if the error code is -EPROBE_DEFER. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
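The helper folds the -EPROBE_DEFER special case into one call; a representative conversion (the message text is illustrative):

    /* Before */
    if (err) {
            dev_err(&pdev->dev, "can't get device resources\n");
            return err;
    }

    /* After: quiet (recorded as the deferral reason) for -EPROBE_DEFER,
     * printed as an error otherwise; returns err either way. */
    if (err)
            return dev_err_probe(&pdev->dev, err, "can't get device resources\n");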
2022-08-31  net: ethernet: move from strlcpy with unused retval to strscpy  (Wolfram Sang)
Follow the advice of the below link and prefer 'strscpy' in this subsystem. Conversion is 1:1 because the return value is not used. Generated by a coccinelle script. Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/ Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Petr Machata <petrm@nvidia.com> # For drivers/net/ethernet/mellanox/mlxsw Acked-by: Geoff Levand <geoff@infradead.org> # For ps3_gelic_net and spider_net_ethtool Acked-by: Tom Lendacky <thomas.lendacky@amd.com> # For drivers/net/ethernet/amd/xgbe/xgbe-ethtool.c Acked-by: Marcin Wojtas <mw@semihalf.com> # For drivers/net/ethernet/marvell/mvpp2 Reviewed-by: Leon Romanovsky <leonro@nvidia.com> # For drivers/net/ethernet/mellanox/mlx{4|5} Reviewed-by: Shay Agroskin <shayagr@amazon.com> # For drivers/net/ethernet/amazon/ena Acked-by: Krzysztof Hałasa <khalasa@piap.pl> # For IXP4xx Ethernet Link: https://lore.kernel.org/r/20220830201457.7984-3-wsa+renesas@sang-engineering.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-07  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  (Jakub Kicinski)
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-04  ibmvnic: Properly dispose of all skbs during a failover.  (Rick Lindsley)
During a reset, there may have been transmits in flight that are no longer valid and cannot be fulfilled. Resetting and clearing the queues is insufficient; each skb also needs to be explicitly freed so that upper levels are not left waiting for confirmation of a transmit that will never happen. If this happens frequently enough, the apparent backlog will cause TCP to begin "congestion control" unnecessarily, culminating in permanently decreased throughput. Fixes: d7c0ef36bde03 ("ibmvnic: Free and re-allocate scrqs when tx/rx scrqs change") Tested-by: Nick Child <nnac123@linux.ibm.com> Reviewed-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Rick Lindsley <ricklind@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-02  net: add skb_[inner_]tcp_all_headers helpers  (Eric Dumazet)
Most drivers use "skb_transport_offset(skb) + tcp_hdrlen(skb)" to compute headers length for a TCP packet, but others use more convoluted (but equivalent) ways. Add skb_tcp_all_headers() and skb_inner_tcp_all_headers() helpers to harmonize this a bit. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
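The helper is simply a name for the common expression; roughly:

    /* Equivalent ways to compute the header bytes preceding the TCP payload. */
    unsigned int hdrs_old = skb_transport_offset(skb) + tcp_hdrlen(skb);
    unsigned int hdrs_new = skb_tcp_all_headers(skb);

    /* Typical use in a driver's TSO path: payload length = total - headers. */
    unsigned int payload = skb->len - skb_tcp_all_headers(skb);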
2022-05-08  eth: switch to netif_napi_add_weight()  (Jakub Kicinski)
Switch all Ethernet drivers which use custom napi weights to the new API. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-05  net: ethernet: Prepare cleanup of powerpc's asm/prom.h  (Christophe Leroy)
powerpc's asm/prom.h includes some headers that it doesn't need itself. In order to clean powerpc's asm/prom.h up in a further step, first clean all files that include asm/prom.h Some files don't need asm/prom.h at all. For those ones, just remove inclusion of asm/prom.h Some files don't need any of the items provided by asm/prom.h, but need some of the headers included by asm/prom.h. For those ones, add the needed headers that are brought by asm/prom.h at the moment and remove asm/prom.h Some files really need asm/prom.h but also need some of the headers included by asm/prom.h. For those one, leave asm/prom.h but also add the needed headers so that they can be removed from asm/prom.h in a later step. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Link: https://lore.kernel.org/r/09a13d592d628de95d30943e59b2170af5b48110.1651663857.git.christophe.leroy@csgroup.eu Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  (Jakub Kicinski)
include/linux/netdevice.h
net/core/dev.c
  6510ea973d8d ("net: Use this_cpu_inc() to increment net->core_stats")
  794c24e9921f ("net-core: rx_otherhost_dropped to core_stats")
  https://lore.kernel.org/all/20220428111903.5f4304e0@canb.auug.org.au/

drivers/net/wan/cosa.c
  d48fea8401cf ("net: cosa: fix error check return value of register_chrdev()")
  89fbca3307d4 ("net: wan: remove support for COSA and SRP synchronous serial boards")
  https://lore.kernel.org/all/20220428112130.1f689e5e@canb.auug.org.au/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28  Revert "ibmvnic: Add ethtool private flag for driver-defined queue limits"  (Dany Madden)
This reverts commit 723ad916134784b317b72f3f6cf0f7ba774e5dae. When a client requests a channel or ring size larger than what the server can support, the server will cap the request to the supported max. So, the client would not be able to successfully request resources that exceed the server limit. Fixes: 723ad9161347 ("ibmvnic: Add ethtool private flag for driver-defined queue limits") Signed-off-by: Dany Madden <drt@linux.ibm.com> Link: https://lore.kernel.org/r/20220427235146.23189-1-drt@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-15  ibmvnic: Allow multiple ltbs in txpool ltb_set  (Sukadev Bhattiprolu)
Allow multiple LTBs in the txpool's ltb_set, i.e. rather than using a single large LTB, use several smaller LTBs. The first n-1 LTBs will all be of the same size. The size of the last LTB in the set depends on the number of buffers and the buffer (mtu) size. This strategy hopefully allows more reuse of the initial LTBs and also reduces the chances of an allocation failure (of the large LTB) when the system is low on memory. Suggested-by: Brian King <brking@linux.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-15  ibmvnic: Allow multiple ltbs in rxpool ltb_set  (Sukadev Bhattiprolu)
Allow multiple LTBs in the rxpool's ltb_set. The first n-1 LTBs will all be of the same size. The size of the last LTB in the set depends on the number of buffers and the buffer (mtu) size. Having a set of LTBs per pool provides a couple of benefits. First, with the current value of IBMVNIC_MAX_LTB_SIZE of 16MB and an MTU of 9000, we need an LTB (DMA buffer) of that size, but the allocation can fail in low memory conditions. With a set of LTBs per pool, we can use several smaller (8MB) LTBs and hopefully have fewer allocation failures. (See also comments in ibmvnic.h on the trade-off with smaller LTBs.) Second, since the kernel limits the size of the DMA buffer to 16MB (based on MAX_ORDER), with a single DMA buffer per pool, the pool is also limited to 16MB. This in turn limits the number of buffers per pool to 1763 when MTU is 9000. With a set of LTBs per pool, we can have up to the max of 4096 buffers per pool even when MTU is 9000. Suggested-by: Brian King <brking@linux.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
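A worked sketch of the sizing arithmetic described above; the constants mirror the values quoted in the message, while the helper and macro names are illustrative:

    #include <linux/kernel.h>

    /* Split one logically contiguous pool into several smaller LTBs. */
    #define EXAMPLE_MAX_LTB_SIZE    (16 << 20)  /* 16 MB, limited by MAX_ORDER */
    #define EXAMPLE_LTB_SIZE        (8 << 20)   /* preferred per-LTB allocation */

    static unsigned int example_num_ltbs(unsigned int num_bufs, unsigned int buf_size)
    {
            unsigned long long total = (unsigned long long)num_bufs * buf_size;

            /* e.g. with MTU 9000 and 4096 buffers the pool needs several 8 MB
             * LTBs, whereas a single LTB would have been capped at 16 MB
             * (roughly 1763 buffers at that MTU). */
            return DIV_ROUND_UP(total, EXAMPLE_LTB_SIZE);
    }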
2022-04-15  ibmvnic: convert rxpool ltb to a set of ltbs  (Sukadev Bhattiprolu)
Define and use interfaces that treat the long term buffer (LTB) of an rxpool as a set of LTBs rather than a single LTB. The set only has one LTB for now. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>