linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2024-12-07	rtase: Refine the if statement	Justin Lai
	Refine the if statement to improve readability. Signed-off-by: Justin Lai <justinlai0215@realtek.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/20241206084851.760475-1-justinlai0215@realtek.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-07	rtnetlink: fix error code in rtnl_newlink()	Dan Carpenter
	If rtnl_get_peer_net() fails, then propagate the error code. Don't return success. Fixes: 48327566769a ("rtnetlink: fix double call of rtnl_link_get_net_ifla()") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/a2d20cd4-387a-4475-887c-bb7d0e88e25a@stanley.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-07	Merge branch 'ocelot-ptp-fixes'	Jakub Kicinski
	Vladimir Oltean says: ==================== Ocelot PTP fixes This is another attempt at "net: mscc: ocelot: be resilient to loss of PTP packets during transmission". https://lore.kernel.org/netdev/20241203164755.16115-1-vladimir.oltean@nxp.com/ The central change is in patch 4/5. It recovers a port from a very reproducible condition where previously, it would become unable to take any hardware TX timestamp. Then we have patches 1/5 and 5/5 which are extra bug fixes. Patches 2/5 and 3/5 are logical changes split out of the monolithic patch previously submitted as v1, so that patch 4/5 is hopefully easier to follow. ==================== Link: https://patch.msgid.link/20241205145519.1236778-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-07	net: mscc: ocelot: perform error cleanup in ocelot_hwstamp_set()	Vladimir Oltean
	An unsupported RX filter will leave the port with TX timestamping still applied as per the new request, rather than the old setting. When parsing the tx_type, don't apply it just yet, but delay that until after we've parsed the rx_filter as well (and potentially returned -ERANGE for that). Similarly, copy_to_user() may fail, which is a rare occurrence, but should still be treated by unwinding what was done. Fixes: 96ca08c05838 ("net: mscc: ocelot: set up traps for PTP packets") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241205145519.1236778-6-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-07	net: mscc: ocelot: be resilient to loss of PTP packets during transmission	Vladimir Oltean
	The Felix DSA driver presents unique challenges that make the simplistic ocelot PTP TX timestamping procedure unreliable: any transmitted packet may be lost in hardware before it ever leaves our local system. This may happen because there is congestion on the DSA conduit, the switch CPU port or even user port (Qdiscs like taprio may delay packets indefinitely by design). The technical problem is that the kernel, i.e. ocelot_port_add_txtstamp_skb(), runs out of timestamp IDs eventually, because it never detects that packets are lost, and keeps the IDs of the lost packets on hold indefinitely. The manifestation of the issue once the entire timestamp ID range becomes busy looks like this in dmesg: mscc_felix 0000:00:00.5: port 0 delivering skb without TX timestamp mscc_felix 0000:00:00.5: port 1 delivering skb without TX timestamp At the surface level, we need a timeout timer so that the kernel knows a timestamp ID is available again. But there is a deeper problem with the implementation, which is the monotonically increasing ocelot_port->ts_id. In the presence of packet loss, it will be impossible to detect that and reuse one of the holes created in the range of free timestamp IDs. What we actually need is a bitmap of 63 timestamp IDs tracking which one is available. That is able to use up holes caused by packet loss, but also gives us a unique opportunity to not implement an actual timer_list for the timeout timer (very complicated in terms of locking). We could only declare a timestamp ID stale on demand (lazily), aka when there's no other timestamp ID available. There are pros and cons to this approach: the implementation is much more simple than per-packet timers would be, but most of the stale packets would be quasi-leaked - not really leaked, but blocked in driver memory, since this algorithm sees no reason to free them. An improved technique would be to check for stale timestamp IDs every time we allocate a new one. Assuming a constant flux of PTP packets, this avoids stale packets being blocked in memory, but of course, packets lost at the end of the flux are still blocked until the flux resumes (nobody left to kick them out). Since implementing per-packet timers is way too complicated, this should be good enough. Testing procedure: Persistently block traffic class 5 and try to run PTP on it: $ tc qdisc replace dev swp3 parent root taprio num_tc 8 \ map 0 1 2 3 4 5 6 7 queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \ base-time 0 sched-entry S 0xdf 100000 flags 0x2 [ 126.948141] mscc_felix 0000:00:00.5: port 3 tc 5 min gate length 0 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 1 octets including FCS $ ptp4l -i swp3 -2 -P -m --socket_priority 5 --fault_reset_interval ASAP --logSyncInterval -3 ptp4l[70.351]: port 1 (swp3): INITIALIZING to LISTENING on INIT_COMPLETE ptp4l[70.354]: port 0 (/var/run/ptp4l): INITIALIZING to LISTENING on INIT_COMPLETE ptp4l[70.358]: port 0 (/var/run/ptp4lro): INITIALIZING to LISTENING on INIT_COMPLETE [ 70.394583] mscc_felix 0000:00:00.5: port 3 timestamp id 0 ptp4l[70.406]: timed out while polling for tx timestamp ptp4l[70.406]: increasing tx_timestamp_timeout or increasing kworker priority may correct this issue, but a driver bug likely causes it ptp4l[70.406]: port 1 (swp3): send peer delay response failed ptp4l[70.407]: port 1 (swp3): clearing fault immediately ptp4l[70.952]: port 1 (swp3): new foreign master d858d7.fffe.00ca6d-1 [ 71.394858] mscc_felix 0000:00:00.5: port 3 timestamp id 1 ptp4l[71.400]: timed out while polling for tx timestamp ptp4l[71.400]: increasing tx_timestamp_timeout or increasing kworker priority may correct this issue, but a driver bug likely causes it ptp4l[71.401]: port 1 (swp3): send peer delay response failed ptp4l[71.401]: port 1 (swp3): clearing fault immediately [ 72.393616] mscc_felix 0000:00:00.5: port 3 timestamp id 2 ptp4l[72.401]: timed out while polling for tx timestamp ptp4l[72.402]: increasing tx_timestamp_timeout or increasing kworker priority may correct this issue, but a driver bug likely causes it ptp4l[72.402]: port 1 (swp3): send peer delay response failed ptp4l[72.402]: port 1 (swp3): clearing fault immediately ptp4l[72.952]: port 1 (swp3): new foreign master d858d7.fffe.00ca6d-1 [ 73.395291] mscc_felix 0000:00:00.5: port 3 timestamp id 3 ptp4l[73.400]: timed out while polling for tx timestamp ptp4l[73.400]: increasing tx_timestamp_timeout or increasing kworker priority may correct this issue, but a driver bug likely causes it ptp4l[73.400]: port 1 (swp3): send peer delay response failed ptp4l[73.400]: port 1 (swp3): clearing fault immediately [ 74.394282] mscc_felix 0000:00:00.5: port 3 timestamp id 4 ptp4l[74.400]: timed out while polling for tx timestamp ptp4l[74.401]: increasing tx_timestamp_timeout or increasing kworker priority may correct this issue, but a driver bug likely causes it ptp4l[74.401]: port 1 (swp3): send peer delay response failed ptp4l[74.401]: port 1 (swp3): clearing fault immediately ptp4l[74.953]: port 1 (swp3): new foreign master d858d7.fffe.00ca6d-1 [ 75.396830] mscc_felix 0000:00:00.5: port 3 invalidating stale timestamp ID 0 which seems lost [ 75.405760] mscc_felix 0000:00:00.5: port 3 timestamp id 0 ptp4l[75.410]: timed out while polling for tx timestamp ptp4l[75.411]: increasing tx_timestamp_timeout or increasing kworker priority may correct this issue, but a driver bug likely causes it ptp4l[75.411]: port 1 (swp3): send peer delay response failed ptp4l[75.411]: port 1 (swp3): clearing fault immediately (...) Remove the blocking condition and see that the port recovers: $ same tc command as above, but use "sched-entry S 0xff" instead $ same ptp4l command as above ptp4l[99.489]: port 1 (swp3): INITIALIZING to LISTENING on INIT_COMPLETE ptp4l[99.490]: port 0 (/var/run/ptp4l): INITIALIZING to LISTENING on INIT_COMPLETE ptp4l[99.492]: port 0 (/var/run/ptp4lro): INITIALIZING to LISTENING on INIT_COMPLETE [ 100.403768] mscc_felix 0000:00:00.5: port 3 invalidating stale timestamp ID 0 which seems lost [ 100.412545] mscc_felix 0000:00:00.5: port 3 invalidating stale timestamp ID 1 which seems lost [ 100.421283] mscc_felix 0000:00:00.5: port 3 invalidating stale timestamp ID 2 which seems lost [ 100.430015] mscc_felix 0000:00:00.5: port 3 invalidating stale timestamp ID 3 which seems lost [ 100.438744] mscc_felix 0000:00:00.5: port 3 invalidating stale timestamp ID 4 which seems lost [ 100.447470] mscc_felix 0000:00:00.5: port 3 timestamp id 0 [ 100.505919] mscc_felix 0000:00:00.5: port 3 timestamp id 0 ptp4l[100.963]: port 1 (swp3): new foreign master d858d7.fffe.00ca6d-1 [ 101.405077] mscc_felix 0000:00:00.5: port 3 timestamp id 0 [ 101.507953] mscc_felix 0000:00:00.5: port 3 timestamp id 0 [ 102.405405] mscc_felix 0000:00:00.5: port 3 timestamp id 0 [ 102.509391] mscc_felix 0000:00:00.5: port 3 timestamp id 0 [ 103.406003] mscc_felix 0000:00:00.5: port 3 timestamp id 0 [ 103.510011] mscc_felix 0000:00:00.5: port 3 timestamp id 0 [ 104.405601] mscc_felix 0000:00:00.5: port 3 timestamp id 0 [ 104.510624] mscc_felix 0000:00:00.5: port 3 timestamp id 0 ptp4l[104.965]: selected best master clock d858d7.fffe.00ca6d ptp4l[104.966]: port 1 (swp3): assuming the grand master role ptp4l[104.967]: port 1 (swp3): LISTENING to GRAND_MASTER on RS_GRAND_MASTER [ 105.106201] mscc_felix 0000:00:00.5: port 3 timestamp id 0 [ 105.232420] mscc_felix 0000:00:00.5: port 3 timestamp id 0 [ 105.359001] mscc_felix 0000:00:00.5: port 3 timestamp id 0 [ 105.405500] mscc_felix 0000:00:00.5: port 3 timestamp id 0 [ 105.485356] mscc_felix 0000:00:00.5: port 3 timestamp id 0 [ 105.511220] mscc_felix 0000:00:00.5: port 3 timestamp id 0 [ 105.610938] mscc_felix 0000:00:00.5: port 3 timestamp id 0 [ 105.737237] mscc_felix 0000:00:00.5: port 3 timestamp id 0 (...) Notice that in this new usage pattern, a non-congested port should basically use timestamp ID 0 all the time, progressing to higher numbers only if there are unacknowledged timestamps in flight. Compare this to the old usage, where the timestamp ID used to monotonically increase modulo OCELOT_MAX_PTP_ID. In terms of implementation, this simplifies the bookkeeping of the ocelot_port :: ts_id and ptp_skbs_in_flight. Since we need to traverse the list of two-step timestampable skbs for each new packet anyway, the information can already be computed and does not need to be stored. Also, ocelot_port->tx_skbs is always accessed under the switch-wide ocelot->ts_id_lock IRQ-unsafe spinlock, so we don't need the skb queue's lock and can use the unlocked primitives safely. This problem was actually detected using the tc-taprio offload, and is causing trouble in TSN scenarios, which Felix (NXP LS1028A / VSC9959) supports but Ocelot (VSC7514) does not. Thus, I've selected the commit to blame as the one adding initial timestamping support for the Felix switch. Fixes: c0bcf537667c ("net: dsa: ocelot: add hardware timestamping support for Felix") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241205145519.1236778-5-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-07	net: mscc: ocelot: ocelot->ts_id_lock and ocelot_port->tx_skbs.lock are IRQ-safe	Vladimir Oltean
	ocelot_get_txtstamp() is a threaded IRQ handler, requested explicitly as such by both ocelot_ptp_rdy_irq_handler() and vsc9959_irq_handler(). As such, it runs with IRQs enabled, and not in hardirq context. Thus, ocelot_port_add_txtstamp_skb() has no reason to turn off IRQs, it cannot be preempted by ocelot_get_txtstamp(). For the same reason, dev_kfree_skb_any_reason() will always evaluate as kfree_skb_reason() in this calling context, so just simplify the dev_kfree_skb_any() call to kfree_skb(). Also, ocelot_port_txtstamp_request() runs from NET_TX softirq context, not with hardirqs enabled. Thus, ocelot_get_txtstamp() which shares the ocelot_port->tx_skbs.lock lock with it, has no reason to disable hardirqs. This is part of a larger rework of the TX timestamping procedure. A logical subportion of the rework has been split into a separate change. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241205145519.1236778-4-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-07	net: mscc: ocelot: improve handling of TX timestamp for unknown skb	Vladimir Oltean
	This condition, theoretically impossible to trigger, is not really handled well. By "continuing", we are skipping the write to SYS_PTP_NXT which advances the timestamp FIFO to the next entry. So we are reading the same FIFO entry all over again, printing stack traces and eventually killing the kernel. No real problem has been observed here. This is part of a larger rework of the timestamp IRQ procedure, with this logical change split out into a patch of its own. We will need to "goto next_ts" for other conditions as well. Fixes: 9fde506e0c53 ("net: mscc: ocelot: warn when a PTP IRQ is raised for an unknown skb") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241205145519.1236778-3-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-07	net: mscc: ocelot: fix memory leak on ocelot_port_add_txtstamp_skb()	Vladimir Oltean
	If ocelot_port_add_txtstamp_skb() fails, for example due to a full PTP timestamp FIFO, we must undo the skb_clone_sk() call with kfree_skb(). Otherwise, the reference to the skb clone is lost. Fixes: 52849bcf0029 ("net: mscc: ocelot: avoid overflowing the PTP timestamp FIFO") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241205145519.1236778-2-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-07	ip: Return drop reason if in_dev is NULL in ip_route_input_rcu().	Kuniyuki Iwashima
	syzkaller reported a warning in __sk_skb_reason_drop(). Commit 61b95c70f344 ("net: ip: make ip_route_input_rcu() return drop reasons") missed a path where -EINVAL is returned. Then, the cited commit started to trigger the warning with the invalid error. Let's fix it by returning SKB_DROP_REASON_NOT_SPECIFIED. [0]: WARNING: CPU: 0 PID: 10 at net/core/skbuff.c:1216 __sk_skb_reason_drop net/core/skbuff.c:1216 [inline] WARNING: CPU: 0 PID: 10 at net/core/skbuff.c:1216 sk_skb_reason_drop+0x97/0x1b0 net/core/skbuff.c:1241 Modules linked in: CPU: 0 UID: 0 PID: 10 Comm: kworker/0:1 Not tainted 6.12.0-10686-gbb18265c3aba #10 1c308307628619808b5a4a0495c4aab5637b0551 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 Workqueue: wg-crypt-wg2 wg_packet_decrypt_worker RIP: 0010:__sk_skb_reason_drop net/core/skbuff.c:1216 [inline] RIP: 0010:sk_skb_reason_drop+0x97/0x1b0 net/core/skbuff.c:1241 Code: 5d 41 5c 41 5d 41 5e e9 e7 9e 95 fd e8 e2 9e 95 fd 31 ff 44 89 e6 e8 58 a1 95 fd 45 85 e4 0f 85 a2 00 00 00 e8 ca 9e 95 fd 90 <0f> 0b 90 e8 c1 9e 95 fd 44 89 e6 bf 01 00 00 00 e8 34 a1 95 fd 41 RSP: 0018:ffa0000000007650 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 000000000000ffff RCX: ffffffff83bc3592 RDX: ff110001002a0000 RSI: ffffffff83bc34d6 RDI: 0000000000000007 RBP: ff11000109ee85f0 R08: 0000000000000001 R09: ffe21c00213dd0da R10: 000000000000ffff R11: 0000000000000000 R12: 00000000ffffffea R13: 0000000000000000 R14: ff11000109ee86d4 R15: ff11000109ee8648 FS: 0000000000000000(0000) GS:ff1100011a000000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020177000 CR3: 0000000108a3d006 CR4: 0000000000771ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000600 PKRU: 55555554 Call Trace: <IRQ> kfree_skb_reason include/linux/skbuff.h:1263 [inline] ip_rcv_finish_core.constprop.0+0x896/0x2320 net/ipv4/ip_input.c:424 ip_list_rcv_finish.constprop.0+0x1b2/0x710 net/ipv4/ip_input.c:610 ip_sublist_rcv net/ipv4/ip_input.c:636 [inline] ip_list_rcv+0x34a/0x460 net/ipv4/ip_input.c:670 __netif_receive_skb_list_ptype net/core/dev.c:5715 [inline] __netif_receive_skb_list_core+0x536/0x900 net/core/dev.c:5762 __netif_receive_skb_list net/core/dev.c:5814 [inline] netif_receive_skb_list_internal+0x77c/0xdc0 net/core/dev.c:5905 gro_normal_list include/net/gro.h:515 [inline] gro_normal_list include/net/gro.h:511 [inline] napi_complete_done+0x219/0x8c0 net/core/dev.c:6256 wg_packet_rx_poll+0xbff/0x1e40 drivers/net/wireguard/receive.c:488 __napi_poll.constprop.0+0xb3/0x530 net/core/dev.c:6877 napi_poll net/core/dev.c:6946 [inline] net_rx_action+0x9eb/0xe30 net/core/dev.c:7068 handle_softirqs+0x1ac/0x740 kernel/softirq.c:554 do_softirq kernel/softirq.c:455 [inline] do_softirq+0x48/0x80 kernel/softirq.c:442 </IRQ> <TASK> __local_bh_enable_ip+0xed/0x110 kernel/softirq.c:382 spin_unlock_bh include/linux/spinlock.h:396 [inline] ptr_ring_consume_bh include/linux/ptr_ring.h:367 [inline] wg_packet_decrypt_worker+0x3ba/0x580 drivers/net/wireguard/receive.c:499 process_one_work+0x940/0x1a70 kernel/workqueue.c:3229 process_scheduled_works kernel/workqueue.c:3310 [inline] worker_thread+0x639/0xe30 kernel/workqueue.c:3391 kthread+0x283/0x350 kernel/kthread.c:389 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:244 </TASK> Fixes: 82d9983ebeb8 ("net: ip: make ip_route_input_noref() return drop reasons") Reported-by: syzkaller <syzkaller@googlegroups.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20241206020715.80207-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-07	Merge branch 'net-net-add-negotiation-of-in-band-capabilities-remainder'	Jakub Kicinski
	Russell King says: ==================== net: net: add negotiation of in-band capabilities (remainder) Here are the last three patches which were not included in the non-RFC posting, but were in the RFC posting. These add the .pcs_inband() method to the Lynx, MTK Lynx and XPCS drivers. ==================== Link: https://patch.msgid.link/Z1F1b8eh8s8T627j@shell.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-07	net: pcs: xpcs: implement pcs_inband_caps() method	Russell King (Oracle)
	Report the PCS inband capabilities to phylink for XPCS. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tJ8NW-006L5V-I9@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-07	net: pcs: pcs-mtk-lynxi: implement pcs_inband_caps() method	Russell King (Oracle)
	Report the PCS in-band capabilities to phylink for the LynxI PCS. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tJ8NR-006L5P-E3@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-07	net: pcs: pcs-lynx: implement pcs_inband_caps() method	Russell King (Oracle)
	Report the PCS in-band capabilities to phylink for the Lynx PCS. Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tJ8NM-006L5J-AH@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-07	tun: fix group permission check	Stas Sergeev
	Currently tun checks the group permission even if the user have matched. Besides going against the usual permission semantic, this has a very interesting implication: if the tun group is not among the supplementary groups of the tun user, then effectively no one can access the tun device. CAP_SYS_ADMIN still can, but its the same as not setting the tun ownership. This patch relaxes the group checking so that either the user match or the group match is enough. This avoids the situation when no one can access the device even though the ownership is properly set. Also I simplified the logic by removing the redundant inversions: tun_not_capable() --> !tun_capable() Signed-off-by: Stas Sergeev <stsp2@yandex.ru> Reviewed-by: Willem de Bruijn <willemb@google.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20241205073614.294773-1-stsp2@yandex.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-07	net: stmmac: fix TSO DMA API usage causing oops	Russell King (Oracle)
	Commit 66600fac7a98 ("net: stmmac: TSO: Fix unbalanced DMA map/unmap for non-paged SKB data") moved the assignment of tx_skbuff_dma[]'s members to be later in stmmac_tso_xmit(). The buf (dma cookie) and len stored in this structure are passed to dma_unmap_single() by stmmac_tx_clean(). The DMA API requires that the dma cookie passed to dma_unmap_single() is the same as the value returned from dma_map_single(). However, by moving the assignment later, this is not the case when priv->dma_cap.addr64 > 32 as "des" is offset by proto_hdr_len. This causes problems such as: dwc-eth-dwmac 2490000.ethernet eth0: Tx DMA map failed and with DMA_API_DEBUG enabled: DMA-API: dwc-eth-dwmac 2490000.ethernet: device driver tries to +free DMA memory it has not allocated [device address=0x000000ffffcf65c0] [size=66 bytes] Fix this by maintaining "des" as the original DMA cookie, and use tso_des to pass the offset DMA cookie to stmmac_tso_allocator(). Full details of the crashes can be found at: https://lore.kernel.org/all/d8112193-0386-4e14-b516-37c2d838171a@nvidia.com/ https://lore.kernel.org/all/klkzp5yn5kq5efgtrow6wbvnc46bcqfxs65nz3qy77ujr5turc@bwwhelz2l4dw/ Reported-by: Jon Hunter <jonathanh@nvidia.com> Reported-by: Thierry Reding <thierry.reding@gmail.com> Fixes: 66600fac7a98 ("net: stmmac: TSO: Fix unbalanced DMA map/unmap for non-paged SKB data") Tested-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Furong Xu <0x1207@gmail.com> Link: https://patch.msgid.link/E1tJXcx-006N4Z-PC@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-07	tools: ynl-gen-c: don't require -o argument	Johannes Berg
	Without -o the tool currently crashes, but it's not marked as required. The only thing we can't do without it is to generate the correct #include for user source files, but we can put a placeholder instead. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://patch.msgid.link/20241206113100.89d35bf124d6.I9228fb704e6d5c9d8e046ef15025a47a48439c1e@changeid Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-07	tools: ynl-gen-c: annotate valid choices for --mode	Johannes Berg
	This makes argparse validate the input and helps users understand which modes are possible. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://patch.msgid.link/20241206113100.e2ab5cf6937c.Ie149a0ca5df713860964b44fe9d9ae547f2e1553@changeid Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-07	Merge tag '6.13-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6	Linus Torvalds
	Pull smb client fixes from Steve French: - DFS fix (for race with tree disconnect and dfs cache worker) - Four fixes for SMB3.1.1 posix extensions: - improve special file support e.g. to Samba, retrieving the file type earlier - reduce roundtrips (e.g. on ls -l, in some cases) * tag '6.13-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: smb: client: fix potential race in cifs_put_tcon() smb3.1.1: fix posix mounts to older servers fs/smb/client: cifs_prime_dcache() for SMB3 POSIX reparse points fs/smb/client: Implement new SMB3 POSIX type fs/smb/client: avoid querying SMB2_OP_QUERY_WSL_EA for SMB3 POSIX
2024-12-07	Merge tag 'scsi-fixes' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Large number of small fixes, all in drivers" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (32 commits) scsi: scsi_debug: Fix hrtimer support for ndelay scsi: storvsc: Do not flag MAINTENANCE_IN return of SRB_STATUS_DATA_OVERRUN as an error scsi: ufs: core: Add missing post notify for power mode change scsi: sg: Fix slab-use-after-free read in sg_release() scsi: ufs: core: sysfs: Prevent div by zero scsi: qla2xxx: Update version to 10.02.09.400-k scsi: qla2xxx: Supported speed displayed incorrectly for VPorts scsi: qla2xxx: Fix NVMe and NPIV connect issue scsi: qla2xxx: Remove check req_sg_cnt should be equal to rsp_sg_cnt scsi: qla2xxx: Fix use after free on unload scsi: qla2xxx: Fix abort in bsg timeout scsi: mpi3mr: Update driver version to 8.12.0.3.50 scsi: mpi3mr: Handling of fault code for insufficient power scsi: mpi3mr: Start controller indexing from 0 scsi: mpi3mr: Fix corrupt config pages PHY state is switched in sysfs scsi: mpi3mr: Synchronize access to ioctl data buffer scsi: mpt3sas: Update driver version to 51.100.00.00 scsi: mpt3sas: Diag-Reset when Doorbell-In-Use bit is set during driver load time scsi: ufs: pltfrm: Dellocate HBA during ufshcd_pltfrm_remove() scsi: ufs: pltfrm: Drop PM runtime reference count after ufshcd_remove() ...
2024-12-07	Merge tag 'block-6.13-20241207' of git://git.kernel.dk/linux	Linus Torvalds
	Pull block fixes from Jens Axboe: - NVMe pull request via Keith: - Target fix using incorrect zero buffer (Nilay) - Device specifc deallocate quirk fixes (Christoph, Keith) - Fabrics fix for handling max command target bugs (Maurizio) - Cocci fix usage for kzalloc (Yu-Chen) - DMA size fix for host memory buffer feature (Christoph) - Fabrics queue cleanup fixes (Chunguang) - CPU hotplug ordering fixes - Add missing MODULE_DESCRIPTION for rnull - bcache error value fix - virtio-blk queue freeze fix * tag 'block-6.13-20241207' of git://git.kernel.dk/linux: blk-mq: move cpuhp callback registering out of q->sysfs_lock blk-mq: register cpuhp callback after hctx is added to xarray table virtio-blk: don't keep queue frozen during system suspend nvme-tcp: simplify nvme_tcp_teardown_io_queues() nvme-tcp: no need to quiesce admin_q in nvme_tcp_teardown_io_queues() nvme-rdma: unquiesce admin_q before destroy it nvme-tcp: fix the memleak while create new ctrl failed nvme-pci: don't use dma_alloc_noncontiguous with 0 merge boundary nvmet: replace kmalloc + memset with kzalloc for data allocation nvme-fabrics: handle zero MAXCMD without closing the connection bcache: revert replacing IS_ERR_OR_NULL with IS_ERR again nvme-pci: remove two deallocate zeroes quirks block: rnull: add missing MODULE_DESCRIPTION nvme: don't apply NVME_QUIRK_DEALLOCATE_ZEROES when DSM is not supported nvmet: use kzalloc instead of ZERO_PAGE in nvme_execute_identify_ns_nvm()
2024-12-07	Merge tag 'io_uring-6.13-20241207' of git://git.kernel.dk/linux	Linus Torvalds
	Pull io_uring fix from Jens Axboe: "A single fix for a parameter type which affects 32-bit" * tag 'io_uring-6.13-20241207' of git://git.kernel.dk/linux: io_uring: Change res2 parameter type in io_uring_cmd_done
2024-12-07	Merge tag 'ubifs-for-linus-6.13-rc2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs Pull jffs2 fix from Richard Weinberger: - Fixup rtime compressor bounds checking * tag 'ubifs-for-linus-6.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs: jffs2: Fix rtime decompressor
2024-12-07	ALSA: usb-audio: Add implicit feedback quirk for Yamaha THR5	Jaakko Salo
	Use implicit feedback from the capture endpoint to fix popping sounds during playback. Link: https://bugzilla.kernel.org/show_bug.cgi?id=219567 Signed-off-by: Jaakko Salo <jaakkos@gmail.com> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20241206164448.8136-1-jaakkos@gmail.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-12-07	headers/cleanup.h: Remove the if_not_guard() facility	Ingo Molnar
	Linus noticed that the new if_not_guard() definition is fragile: "This macro generates actively wrong code if it happens to be inside an if-statement or a loop without a block. IOW, code like this: for (iterate-over-something) if_not_guard(a) return -BUSY; looks like will build fine, but will generate completely incorrect code." The reason is that the __if_not_guard() macro is multi-statement, so while most kernel developers expect macros to be simple or at least compound statements - but for __if_not_guard() it is not so: #define __if_not_guard(_name, _id, args...) \ BUILD_BUG_ON(!__is_cond_ptr(_name)); \ CLASS(_name, _id)(args); \ if (!__guard_ptr(_name)(&_id)) To add insult to injury, the placement of the BUILD_BUG_ON() line makes the macro appear to compile fine, but it will generate incorrect code as Linus reported, for example if used within iteration or conditional statements that will use the first statement of a macro as a loop body or conditional statement body. [ I'd also like to note that the original submission by David Lechner did not contain the BUILD_BUG_ON() line, so it was safer than what we ended up committing. Mea culpa. ] It doesn't appear to be possible to turn this macro into a robust single or compound statement that could be used in single statements, due to the necessity to define an auto scope variable with an open scope and the necessity of it having to expand to a partial 'if' statement with no body. Instead of trying to work around this fragility, just remove the construct before it gets used. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: David Lechner <dlechner@baylibre.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/Z1LBnX9TpZLR5Dkf@gmail.com
2024-12-06	Merge branch 'net-convert-some-udp-tunnel-drivers-to-netdev_pcpu_stat_dstats'	Jakub Kicinski
	Guillaume Nault says: ==================== net: Convert some UDP tunnel drivers to NETDEV_PCPU_STAT_DSTATS. VXLAN, Geneve and Bareudp use various device counters for managing RX and TX statistics: * VXLAN uses the device core_stats for RX and TX drops, tstats for regular RX/TX counters and DEV_STATS_INC() for various types of RX/TX errors. * Geneve uses tstats for regular RX/TX counters and DEV_STATS_INC() for everything else, include RX/TX drops. * Bareudp, was recently converted to follow VXLAN behaviour, that is, device core_stats for RX and TX drops, tstats for regular RX/TX counters and DEV_STATS_INC() for other counter types. Let's consolidate statistics management around the dstats counters instead. This avoids using core_stats in VXLAN and Bareudp, as core_stats is supposed to be used by core networking code only (and not in drivers). This also allows Geneve to avoid using atomic increments when updating RX and TX drop counters, as dstats is per-cpu. Finally, this also simplifies the code as all three modules now handle stats in the same way and with only two different sets of counters (the per-cpu dstats and the atomic DEV_STATS_INC()). Patch 1 creates dstats helper functions that can be used outside of VRF (until then, dstats was VRF-specific). Then patches 2 to 4, convert VXLAN, Geneve and Bareudp, one by one. ==================== Link: https://patch.msgid.link/cover.1733313925.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	bareudp: Handle stats using NETDEV_PCPU_STAT_DSTATS.	Guillaume Nault
	Bareudp uses the TSTATS infrastructure (dev_sw_netstats_()) for RX packet counters. It was also recently converted to use the device core stats (dev_core_stats_()) for RX and TX drops (see commit 788d5d655bc9 ("bareudp: Use pcpu stats to update rx_dropped counter.")). Since core stats are to be avoided in drivers, and for consistency with VXLAN and Geneve, let's convert packet stats handling to DSTATS, which can handle RX/TX stats and packet drops. Statistics that don't fit DSTATS are still updated atomically with DEV_STATS_INC(). Signed-off-by: Guillaume Nault <gnault@redhat.com> Link: https://patch.msgid.link/0f4f8448db3ff449ac6e939872b28cf3f8982da7.1733313925.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	geneve: Handle stats using NETDEV_PCPU_STAT_DSTATS.	Guillaume Nault
	Geneve uses the TSTATS infrastructure (dev_sw_netstats_*()) for RX packet counters. All other counters are handled using atomic increments with DEV_STATS_INC(). Let's convert packet stats handling to DSTATS, which has a per-cpu counter for packet drops too, to avoid the cost of atomic increments in these cases. Statistics that don't fit DSTATS are still updated atomically with DEV_STATS_INC(). Signed-off-by: Guillaume Nault <gnault@redhat.com> Link: https://patch.msgid.link/7af5c09f3c26f0f231fbe383822ca5d1ce0278fa.1733313925.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	vxlan: Handle stats using NETDEV_PCPU_STAT_DSTATS.	Guillaume Nault
	VXLAN uses the TSTATS infrastructure (dev_sw_netstats_()) for RX and TX packet counters. It also uses the device core stats (dev_core_stats_()) for RX and TX drops. Let's consolidate that using the DSTATS infrastructure, which can handle both packet counters and packet drops. Statistics that don't fit DSTATS are still updated atomically with DEV_STATS_INC(). While there, convert the "len" variable of vxlan_encap_bypass() to unsigned int, to respect the types of skb->len and dev_dstats_[rt]x_add(). Signed-off-by: Guillaume Nault <gnault@redhat.com> Link: https://patch.msgid.link/145558b184b3cda77911ca5682b6eb83c3ffed8e.1733313925.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	vrf: Make pcpu_dstats update functions available to other modules.	Guillaume Nault
	Currently vrf is the only module that uses NETDEV_PCPU_STAT_DSTATS. In order to make this kind of statistics available to other modules, we need to define the update functions in netdevice.h. Therefore, let's define dev_dstats_*() functions for RX and TX packet updates (packets, bytes and drops). Use these new functions in vrf.c instead of vrf_rx_stats() and the other manual counter updates. While there, update the type of the "len" variables to "unsigned int", so that there're aligned with both skb->len and the new dstats update functions. Signed-off-by: Guillaume Nault <gnault@redhat.com> Link: https://patch.msgid.link/d7a552ee382c79f4854e7fcc224cf176cd21150d.1733313925.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	Merge branch 'lan78xx-preparations-for-phylink'	Jakub Kicinski
	Oleksij Rempel says: ==================== lan78xx: Preparations for PHYlink This patch set is part of the preparatory work for migrating the lan78xx USB Ethernet driver to the PHYlink framework. During extensive testing, I observed that resetting the USB adapter can lead to various read/write errors. While the errors themselves are acceptable, they generate excessive log messages, resulting in significant log spam. This set improves error handling to reduce logging noise by addressing errors directly and returning early when necessary. Key highlights of this series include: - Enhanced error handling to reduce log spam while preserving the original error values, avoiding unnecessary overwrites. - Improved error reporting using the `%pe` specifier for better clarity in log messages. - Removal of redundant and problematic PHY fixups for LAN8835 and KSZ9031, with detailed explanations in the respective patches. - Cleanup of code structure, including unified `goto` labels for better readability and maintainability, even in simple editors. ==================== Link: https://patch.msgid.link/20241204084142.1152696-1-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	net: usb: lan78xx: Improve error handling in dataport and multicast writes	Oleksij Rempel
	Update `lan78xx_dataport_write` and `lan78xx_deferred_multicast_write` to: - Handle errors during register read/write operations. - Exit immediately on errors and log them using `%pe` for clarity. - Avoid silent failures by propagating error codes properly. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241204084142.1152696-11-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	net: usb: lan78xx: Add error handling to lan78xx_irq_bus_sync_unlock	Oleksij Rempel
	Update `lan78xx_irq_bus_sync_unlock` to handle errors in register read/write operations. If an error occurs, log it and exit the function appropriately. This ensures proper handling of failures during IRQ synchronization. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241204084142.1152696-10-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	net: usb: lan78xx: Add error handling to set_rx_max_frame_length and set_mtu	Oleksij Rempel
	Improve error handling in `lan78xx_set_rx_max_frame_length` by: - Checking return values from register read/write operations and propagating errors. - Exiting immediately on failure to ensure proper error reporting. In `lan78xx_change_mtu`, log errors when changing MTU fails, using `%pe` for clear error representation. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241204084142.1152696-9-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	net: usb: lan78xx: Add error handling to lan78xx_init_ltm	Oleksij Rempel
	Convert `lan78xx_init_ltm` to return error codes and handle errors properly. Previously, errors during the LTM initialization process were not propagated, potentially leading to undetected issues. This patch ensures: - Errors in `lan78xx_read_reg` and `lan78xx_write_reg` are checked and handled. - Errors are logged with detailed messages using `%pe` for clarity. - The function exits immediately on error, returning the error code. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241204084142.1152696-8-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	net: usb: lan78xx: Improve error handling in EEPROM and OTP operations	Oleksij Rempel
	Refine error handling in EEPROM and OTP read/write functions by: - Return error values immediately upon detection. - Avoid overwriting correct error codes with `-EIO`. - Preserve initial error codes as they were appropriate for specific failures. - Use `-ETIMEDOUT` for timeout conditions instead of `-EIO`. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241204084142.1152696-7-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	net: usb: lan78xx: Fix error handling in MII read/write functions	Oleksij Rempel
	Ensure proper error handling in `lan78xx_mdiobus_read` and `lan78xx_mdiobus_write` by checking return values of register read/write operations and returning errors to the caller. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241204084142.1152696-6-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	net: usb: lan78xx: Improve error reporting with %pe specifier	Oleksij Rempel
	Replace integer error codes with the `%pe` format specifier in register read and write error messages. This change provides human-readable error strings, making logs more informative and debugging easier. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241204084142.1152696-5-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	net: usb: lan78xx: move functions to avoid forward definitions	Oleksij Rempel
	Move following functions to avoid forward declarations in the code: - lan78xx_start_hw() - lan78xx_stop_hw() - lan78xx_flush_fifo() - lan78xx_start_tx_path() - lan78xx_stop_tx_path() - lan78xx_flush_tx_fifo() - lan78xx_start_rx_path() - lan78xx_stop_rx_path() - lan78xx_flush_rx_fifo() These functions will be used in an upcoming PHYlink migration patch. No modifications to the functionality of the code are made. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241204084142.1152696-4-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	net: usb: lan78xx: Remove KSZ9031 PHY fixup	Oleksij Rempel
	Remove the KSZ9031RNX PHY fixup from the lan78xx driver. The fixup applied specific RGMII pad skew configurations globally, but these settings violate the RGMII specification and cause more harm than benefit. Key issues with the fixup: 1. Non-Compliant Timing: The fixup's delay settings fall outside the RGMII specification requirements of 1.5 ns to 2.0 ns: - RX Path: Total delay of 2.16 ns (PHY internal delay of 1.2 ns + 0.96 ns skew). - TX Path: Total delay of 0.96 ns, significantly below the RGMII minimum of 1.5 ns. 2. Redundant or Incorrect Configurations: - The RGMII skew registers written by the fixup do not meaningfully alter the PHY's default behavior and fail to account for its internal delays. - The TX_DATA pad skew was not configured, relying on power-on defaults that are insufficient for RGMII compliance. 3. Micrel Driver Support: By setting `PHY_INTERFACE_MODE_RGMII_ID`, the Micrel driver can calculate and assign appropriate skew values for the KSZ9031 PHY. This ensures better timing configurations without relying on external fixups. 4. System Interference: The fixup applied globally, reconfiguring all KSZ9031 PHYs in the system, even those unrelated to the LAN78xx adapter. This could lead to unintended and harmful behavior on unrelated interfaces. While the fixup is removed, a better mechanism is still needed to dynamically determine the optimal combination of PHY and MAC delays to fully meet RGMII requirements without relying on Device Tree or global fixups. This would allow for robust operation across different hardware configurations. The Micrel driver is capable of using the interface mode value to calculate and apply better skew values, providing a configuration much closer to the RGMII specification than the fixup. Removing the fixup ensures better default behavior and prevents harm to other system interfaces. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241204084142.1152696-3-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	net: usb: lan78xx: Remove LAN8835 PHY fixup	Oleksij Rempel
	Remove the PHY fixup for the LAN8835 PHY in the lan78xx driver due to the following reasons: - There is no publicly available information about the LAN8835 PHY. However, it appears to be the integrated PHY used in the LAN7800 and LAN7850 USB Ethernet controllers. These PHYs use the GMII interface, not RGMII as configured by the fixup. - The correct driver for handling the LAN8835 PHY functionality is the Microchip PHY driver (`drivers/net/phy/microchip.c`), which properly supports these integrated PHYs. - The PHY ID `0x0007C130` is actually used by the LAN8742A PHY, which only supports RMII. This interface is incompatible with the LAN78xx MAC, as the LAN7801 (the only LAN78xx version without an integrated PHY) supports only RGMII. - The mask applied for this fixup is overly broad, inadvertently covering both Microchip LAN88xx PHYs and unrelated SMSC LAN8742A PHYs, leading to potential conflicts with other devices. - Testing has shown that removing this fixup for LAN7800 and LAN7850 does not result in any noticeable difference in functionality, as the Microchip PHY driver (`drivers/net/phy/microchip.c`) handles all necessary configurations for these integrated PHYs. - Registering this fixup globally (not limited to USB devices) risks conflicts by unintentionally modifying other interfaces whenever a LAN7801 adapter is connected to the system. Note that both LAN7800 and LAN7850 USB Ethernet controllers use an integrated PHY with the ID `0x0007C132`. Additionally, the LAN7515, a specialized part for Raspberry Pi, includes an integrated LAN7800 USB Ethernet controller and USB hub in a multifunctional chip design, and it also uses the same PHY ID (`0x0007C132`). Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241204084142.1152696-2-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	Merge branch 'net-phylib-eee-cleanups'	Jakub Kicinski
	Russell King says: ==================== net: phylib EEE cleanups Clean up phylib's EEE support. Patches previously posted as RFC as part of the phylink EEE series. Patch 1 changes the Marvell driver to use the state we store in struct phy_device, rather than manually calling phydev->eee_cfg.eee_enabled. Patch 2 avoids genphy_c45_ethtool_get_eee() setting ->eee_enabled, as we copy that from phydev->eee_cfg.eee_enabled later, and after patch 3 mo one uses this after calling genphy_c45_ethtool_get_eee(). In fact, the only caller of this function now is phy_ethtool_get_eee(). As all callers to genphy_c45_eee_is_active() now pass NULL as its is_enabled flag, this is no longer useful. Remove the argument in patch 3. Patch 4 updates the phylib documentation to make it absolutely clear that phy_ethtool_get_eee() now fills in all members of struct ethtool_keee, which is why we now have so many buggy network drivers. ==================== Link: https://patch.msgid.link/Z1GDZlFyF2fsFa3S@shell.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	net: phy: update phy_ethtool_get_eee() documentation	Russell King (Oracle)
	Update the phy_ethtool_get_eee() documentation to make it clear that all members of struct ethtool_keee are written by this function. keee.supported, keee.advertised, keee.lp_advertised and keee.eee_active are all written by genphy_c45_ethtool_get_eee(). keee.tx_lpi_timer, keee.tx_lpi_enabled and keee.eee_enabled are all written by eeecfg_to_eee(). Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/E1tJ9JH-006LIz-SO@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	net: phy: remove genphy_c45_eee_is_active()'s is_enabled arg	Russell King (Oracle)
	All callers to genphy_c45_eee_is_active() now pass NULL as the is_enabled argument, which means we never use the value computed in this function. Remove the argument and clean up this function. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/E1tJ9JC-006LIt-Ne@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	net: phy: avoid genphy_c45_ethtool_get_eee() setting eee_enabled	Russell King (Oracle)
	genphy_c45_ethtool_get_eee() is only called from phy_ethtool_get_eee(), which then calls eeecfg_to_eee(). eeecfg_to_eee() will overwrite keee.eee_enabled, so there's no point setting keee.eee_enabled in genphy_c45_ethtool_get_eee(). Remove this assignment. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/E1tJ9J7-006LIn-Jr@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	net: phy: marvell: use phydev->eee_cfg.eee_enabled	Russell King (Oracle)
	Rather than calling genphy_c45_ethtool_get_eee() to retrieve whether EEE is enabled, use the value stored in the phy_device eee_cfg structure. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/E1tJ9J2-006LIh-Fl@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	net: defer final 'struct net' free in netns dismantle	Eric Dumazet
	Ilya reported a slab-use-after-free in dst_destroy [1] Issue is in xfrm6_net_init() and xfrm4_net_init() : They copy xfrm[46]_dst_ops_template into net->xfrm.xfrm[46]_dst_ops. But net structure might be freed before all the dst callbacks are called. So when dst_destroy() calls later : if (dst->ops->destroy) dst->ops->destroy(dst); dst->ops points to the old net->xfrm.xfrm[46]_dst_ops, which has been freed. See a relevant issue fixed in : ac888d58869b ("net: do not delay dst_entries_add() in dst_release()") A fix is to queue the 'struct net' to be freed after one another cleanup_net() round (and existing rcu_barrier()) [1] BUG: KASAN: slab-use-after-free in dst_destroy (net/core/dst.c:112) Read of size 8 at addr ffff8882137ccab0 by task swapper/37/0 Dec 03 05:46:18 kernel: CPU: 37 UID: 0 PID: 0 Comm: swapper/37 Kdump: loaded Not tainted 6.12.0 #67 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.1-1.el9 04/01/2014 Call Trace: <IRQ> dump_stack_lvl (lib/dump_stack.c:124) print_address_description.constprop.0 (mm/kasan/report.c:378) ? dst_destroy (net/core/dst.c:112) print_report (mm/kasan/report.c:489) ? dst_destroy (net/core/dst.c:112) ? kasan_addr_to_slab (mm/kasan/common.c:37) kasan_report (mm/kasan/report.c:603) ? dst_destroy (net/core/dst.c:112) ? rcu_do_batch (kernel/rcu/tree.c:2567) dst_destroy (net/core/dst.c:112) rcu_do_batch (kernel/rcu/tree.c:2567) ? __pfx_rcu_do_batch (kernel/rcu/tree.c:2491) ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:4339 kernel/locking/lockdep.c:4406) rcu_core (kernel/rcu/tree.c:2825) handle_softirqs (kernel/softirq.c:554) __irq_exit_rcu (kernel/softirq.c:589 kernel/softirq.c:428 kernel/softirq.c:637) irq_exit_rcu (kernel/softirq.c:651) sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1049 arch/x86/kernel/apic/apic.c:1049) </IRQ> <TASK> asm_sysvec_apic_timer_interrupt (./arch/x86/include/asm/idtentry.h:702) RIP: 0010:default_idle (./arch/x86/include/asm/irqflags.h:37 ./arch/x86/include/asm/irqflags.h:92 arch/x86/kernel/process.c:743) Code: 00 4d 29 c8 4c 01 c7 4c 29 c2 e9 6e ff ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 66 90 0f 00 2d c7 c9 27 00 fb f4 <fa> c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 90 RSP: 0018:ffff888100d2fe00 EFLAGS: 00000246 RAX: 00000000001870ed RBX: 1ffff110201a5fc2 RCX: ffffffffb61a3e46 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffb3d4d123 RBP: 0000000000000000 R08: 0000000000000001 R09: ffffed11c7e1835d R10: ffff888e3f0c1aeb R11: 0000000000000000 R12: 0000000000000000 R13: ffff888100d20000 R14: dffffc0000000000 R15: 0000000000000000 ? ct_kernel_exit.constprop.0 (kernel/context_tracking.c:148) ? cpuidle_idle_call (kernel/sched/idle.c:186) default_idle_call (./include/linux/cpuidle.h:143 kernel/sched/idle.c:118) cpuidle_idle_call (kernel/sched/idle.c:186) ? __pfx_cpuidle_idle_call (kernel/sched/idle.c:168) ? lock_release (kernel/locking/lockdep.c:467 kernel/locking/lockdep.c:5848) ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:4347 kernel/locking/lockdep.c:4406) ? tsc_verify_tsc_adjust (arch/x86/kernel/tsc_sync.c:59) do_idle (kernel/sched/idle.c:326) cpu_startup_entry (kernel/sched/idle.c:423 (discriminator 1)) start_secondary (arch/x86/kernel/smpboot.c:202 arch/x86/kernel/smpboot.c:282) ? __pfx_start_secondary (arch/x86/kernel/smpboot.c:232) ? soft_restart_cpu (arch/x86/kernel/head_64.S:452) common_startup_64 (arch/x86/kernel/head_64.S:414) </TASK> Dec 03 05:46:18 kernel: Allocated by task 12184: kasan_save_stack (mm/kasan/common.c:48) kasan_save_track (./arch/x86/include/asm/current.h:49 mm/kasan/common.c:60 mm/kasan/common.c:69) __kasan_slab_alloc (mm/kasan/common.c:319 mm/kasan/common.c:345) kmem_cache_alloc_noprof (mm/slub.c:4085 mm/slub.c:4134 mm/slub.c:4141) copy_net_ns (net/core/net_namespace.c:421 net/core/net_namespace.c:480) create_new_namespaces (kernel/nsproxy.c:110) unshare_nsproxy_namespaces (kernel/nsproxy.c:228 (discriminator 4)) ksys_unshare (kernel/fork.c:3313) __x64_sys_unshare (kernel/fork.c:3382) do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) Dec 03 05:46:18 kernel: Freed by task 11: kasan_save_stack (mm/kasan/common.c:48) kasan_save_track (./arch/x86/include/asm/current.h:49 mm/kasan/common.c:60 mm/kasan/common.c:69) kasan_save_free_info (mm/kasan/generic.c:582) __kasan_slab_free (mm/kasan/common.c:271) kmem_cache_free (mm/slub.c:4579 mm/slub.c:4681) cleanup_net (net/core/net_namespace.c:456 net/core/net_namespace.c:446 net/core/net_namespace.c:647) process_one_work (kernel/workqueue.c:3229) worker_thread (kernel/workqueue.c:3304 kernel/workqueue.c:3391) kthread (kernel/kthread.c:389) ret_from_fork (arch/x86/kernel/process.c:147) ret_from_fork_asm (arch/x86/entry/entry_64.S:257) Dec 03 05:46:18 kernel: Last potentially related work creation: kasan_save_stack (mm/kasan/common.c:48) __kasan_record_aux_stack (mm/kasan/generic.c:541) insert_work (./include/linux/instrumented.h:68 ./include/asm-generic/bitops/instrumented-non-atomic.h:141 kernel/workqueue.c:788 kernel/workqueue.c:795 kernel/workqueue.c:2186) __queue_work (kernel/workqueue.c:2340) queue_work_on (kernel/workqueue.c:2391) xfrm_policy_insert (net/xfrm/xfrm_policy.c:1610) xfrm_add_policy (net/xfrm/xfrm_user.c:2116) xfrm_user_rcv_msg (net/xfrm/xfrm_user.c:3321) netlink_rcv_skb (net/netlink/af_netlink.c:2536) xfrm_netlink_rcv (net/xfrm/xfrm_user.c:3344) netlink_unicast (net/netlink/af_netlink.c:1316 net/netlink/af_netlink.c:1342) netlink_sendmsg (net/netlink/af_netlink.c:1886) sock_write_iter (net/socket.c:729 net/socket.c:744 net/socket.c:1165) vfs_write (fs/read_write.c:590 fs/read_write.c:683) ksys_write (fs/read_write.c:736) do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) Dec 03 05:46:18 kernel: Second to last potentially related work creation: kasan_save_stack (mm/kasan/common.c:48) __kasan_record_aux_stack (mm/kasan/generic.c:541) insert_work (./include/linux/instrumented.h:68 ./include/asm-generic/bitops/instrumented-non-atomic.h:141 kernel/workqueue.c:788 kernel/workqueue.c:795 kernel/workqueue.c:2186) __queue_work (kernel/workqueue.c:2340) queue_work_on (kernel/workqueue.c:2391) __xfrm_state_insert (./include/linux/workqueue.h:723 net/xfrm/xfrm_state.c:1150 net/xfrm/xfrm_state.c:1145 net/xfrm/xfrm_state.c:1513) xfrm_state_update (./include/linux/spinlock.h:396 net/xfrm/xfrm_state.c:1940) xfrm_add_sa (net/xfrm/xfrm_user.c:912) xfrm_user_rcv_msg (net/xfrm/xfrm_user.c:3321) netlink_rcv_skb (net/netlink/af_netlink.c:2536) xfrm_netlink_rcv (net/xfrm/xfrm_user.c:3344) netlink_unicast (net/netlink/af_netlink.c:1316 net/netlink/af_netlink.c:1342) netlink_sendmsg (net/netlink/af_netlink.c:1886) sock_write_iter (net/socket.c:729 net/socket.c:744 net/socket.c:1165) vfs_write (fs/read_write.c:590 fs/read_write.c:683) ksys_write (fs/read_write.c:736) do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) Fixes: a8a572a6b5f2 ("xfrm: dst_entries_init() per-net dst_ops") Reported-by: Ilya Maximets <i.maximets@ovn.org> Closes: https://lore.kernel.org/netdev/CANn89iKKYDVpB=MtmfH7nyv2p=rJWSLedO5k7wSZgtY_tO8WQg@mail.gmail.com/T/#m02c98c3009fe66382b73cfb4db9cf1df6fab3fbf Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20241204125455.3871859-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	net: lapb: increase LAPB_HEADER_LEN	Eric Dumazet
	It is unclear if net/lapb code is supposed to be ready for 8021q. We can at least avoid crashes like the following : skbuff: skb_under_panic: text:ffffffff8aabe1f6 len:24 put:20 head:ffff88802824a400 data:ffff88802824a3fe tail:0x16 end:0x140 dev:nr0.2 ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:206 ! Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 1 UID: 0 PID: 5508 Comm: dhcpcd Not tainted 6.12.0-rc7-syzkaller-00144-g66418447d27b #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/30/2024 RIP: 0010:skb_panic net/core/skbuff.c:206 [inline] RIP: 0010:skb_under_panic+0x14b/0x150 net/core/skbuff.c:216 Code: 0d 8d 48 c7 c6 2e 9e 29 8e 48 8b 54 24 08 8b 0c 24 44 8b 44 24 04 4d 89 e9 50 41 54 41 57 41 56 e8 1a 6f 37 02 48 83 c4 20 90 <0f> 0b 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 RSP: 0018:ffffc90002ddf638 EFLAGS: 00010282 RAX: 0000000000000086 RBX: dffffc0000000000 RCX: 7a24750e538ff600 RDX: 0000000000000000 RSI: 0000000000000201 RDI: 0000000000000000 RBP: ffff888034a86650 R08: ffffffff8174b13c R09: 1ffff920005bbe60 R10: dffffc0000000000 R11: fffff520005bbe61 R12: 0000000000000140 R13: ffff88802824a400 R14: ffff88802824a3fe R15: 0000000000000016 FS: 00007f2a5990d740(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000110c2631fd CR3: 0000000029504000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> skb_push+0xe5/0x100 net/core/skbuff.c:2636 nr_header+0x36/0x320 net/netrom/nr_dev.c:69 dev_hard_header include/linux/netdevice.h:3148 [inline] vlan_dev_hard_header+0x359/0x480 net/8021q/vlan_dev.c:83 dev_hard_header include/linux/netdevice.h:3148 [inline] lapbeth_data_transmit+0x1f6/0x2a0 drivers/net/wan/lapbether.c:257 lapb_data_transmit+0x91/0xb0 net/lapb/lapb_iface.c:447 lapb_transmit_buffer+0x168/0x1f0 net/lapb/lapb_out.c:149 lapb_establish_data_link+0x84/0xd0 lapb_device_event+0x4e0/0x670 notifier_call_chain+0x19f/0x3e0 kernel/notifier.c:93 __dev_notify_flags+0x207/0x400 dev_change_flags+0xf0/0x1a0 net/core/dev.c:8922 devinet_ioctl+0xa4e/0x1aa0 net/ipv4/devinet.c:1188 inet_ioctl+0x3d7/0x4f0 net/ipv4/af_inet.c:1003 sock_do_ioctl+0x158/0x460 net/socket.c:1227 sock_ioctl+0x626/0x8e0 net/socket.c:1346 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:907 [inline] __se_sys_ioctl+0xf9/0x170 fs/ioctl.c:893 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot+fb99d1b0c0f81d94a5e2@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/67506220.050a0220.17bd51.006c.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241204141031.4030267-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	selftests: net: cleanup busy_poller.c	Joe Damato
	Fix various integer type conversions by using strtoull and a temporary variable which is bounds checked before being casted into the appropriate cfg_* variable for use by the test program. While here: - free the strdup'd cfg string for overall hygenie. - initialize napi_id = 0 in setup_queue to avoid warnings on some compilers. Signed-off-by: Joe Damato <jdamato@fastly.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20241204163239.294123-1-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	net: tipc: remove one synchronize_net() from tipc_nametbl_stop()	Eric Dumazet
	tipc_exit_net() is very slow and is abused by syzbot. tipc_nametbl_stop() is called for each netns being dismantled. Calling synchronize_net() right before freeing tn->nametbl is a big hammer. Replace this with kfree_rcu(). Note that RCU is not properly used here, otherwise tn->nametbl should be cleared before the synchronize_net() or kfree_rcu(), or even before the cleanup loop. We might need to fix this at some point. Also note tipc uses other synchronize_rcu() calls, more work is needed to make tipc_exit_net() much faster. List of remaining calls to synchronize_rcu() tipc_detach_loopback() (dev_remove_pack()) tipc_bcast_stop() tipc_sk_rht_destroy() Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241204210234.319484-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06	Merge branch 'bnxt_en-bug-fixes'	Jakub Kicinski
	Michael Chan says: ==================== bnxt_en: Bug fixes There are 2 bug fixes in this series. This first one fixes the issue of setting the gso_type incorrectly for HW GRO packets on 5750X (Thor) chips. This can cause HW GRO packets to be dropped by the stack if they are re-segmented. The second one fixes a potential division by zero crash when dumping FW log coredump. ==================== Link: https://patch.msgid.link/20241204215918.1692597-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>