Age | Commit message (Collapse) | Author |
|
Originally, the total NCM header size was computed as the sum of two
vaguely labelled constants. While accurate, it wasn't particularly clear
where they were coming from.
Use sizes of existing NCM structs where available. Define the total
NDP16 size based on the maximum amount of DPEs that can fit into the
iOS-specific fixed-size header.
This change does not fix any particular issue. Rather, it introduces
intermediate constants that will simplify subsequent commits.
It should also make it clearer for the reader where the constant values
come from.
Cc: stable@vger.kernel.org # 6.5.x
Signed-off-by: Foster Snowhill <forst@pen.gy>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Introduce an rx_error label to reduce repetitions in the header
signature checks.
Store wDatagramIndex and wDatagramLength after endianness conversion to
avoid repeated le16_to_cpu() calls.
Rewrite the loop to return on a null trailing DPE, which is required
by the CDC NCM spec. In case it is missing, fall through to rx_error.
This change does not fix any particular issue. Its purpose is to
simplify a subsequent commit that fixes a potential OoB read by limiting
the maximum amount of processed DPEs.
Cc: stable@vger.kernel.org # 6.5.x
Signed-off-by: Foster Snowhill <forst@pen.gy>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Original code allowed for the start of NDP16 to be anywhere within the
URB based on the `wNdpIndex` value in NTH16. Only the start position of
NDP16 was checked, so it was possible for even the fixed-length part
of NDP16 to extend past the end of URB, leading to an out-of-bounds
read.
On iOS devices, the NDP16 header always directly follows NTH16. Rely on
and check for this specific format.
This, along with NCM-specific minimal URB length check that already
exists, will ensure that the fixed-length part of NDP16 plus a set
amount of DPEs fit within the URB.
Note that this commit alone does not fully address the OoB read.
The limit on the amount of DPEs needs to be enforced separately.
Fixes: a2d274c62e44 ("usbnet: ipheth: add CDC NCM support")
Cc: stable@vger.kernel.org
Signed-off-by: Foster Snowhill <forst@pen.gy>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
By definition, a DPE points at the start of a network frame/datagram.
Thus it makes no sense for it to point at anything that's part of the
NCM header. It is not a security issue, but merely an indication of
a malformed DPE.
Enforce that all DPEs point at the data portion of the URB, past the
NCM header.
Fixes: a2d274c62e44 ("usbnet: ipheth: add CDC NCM support")
Cc: stable@vger.kernel.org
Signed-off-by: Foster Snowhill <forst@pen.gy>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Originally, it was possible for the DPE length check to overflow if
wDatagramIndex + wDatagramLength > U16_MAX. This could lead to an OoB
read.
Move the wDatagramIndex term to the other side of the inequality.
An existing condition ensures that wDatagramIndex < urb->actual_length.
Fixes: a2d274c62e44 ("usbnet: ipheth: add CDC NCM support")
Cc: stable@vger.kernel.org
Signed-off-by: Foster Snowhill <forst@pen.gy>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Pointer arguments passed to ioctls need to pass through compat_ptr() to
work correctly on s390; as explained in Documentation/driver-api/ioctl.rst.
Detect compat mode at runtime and call compat_ptr() for those commands
which do take pointer arguments.
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/lkml/1ba5d3a4-7931-455b-a3ce-85a968a7cb10@app.fastmail.com/
Fixes: d94ba80ebbea ("ptp: Added a brand new class driver for ptp clocks.")
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Link: https://patch.msgid.link/20250125-posix-clock-compat_ioctl-v2-1-11c865c500eb@weissschuh.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Syzkaller reports [1] encountering a common issue of utilizing a wrong
usb endpoint type during URB submitting stage. This, in turn, triggers
a warning shown below.
For now, enable simple endpoint checking (specifically, bulk and
interrupt eps, testing control one is not essential) to mitigate
the issue with a view to do other related cosmetic changes later,
if they are necessary.
[1] Syzkaller report:
usb 1-1: BOGUS urb xfer, pipe 3 != type 1
WARNING: CPU: 1 PID: 2586 at drivers/usb/core/urb.c:503 usb_submit_urb+0xe4b/0x1730 driv>
Modules linked in:
CPU: 1 UID: 0 PID: 2586 Comm: dhcpcd Not tainted 6.11.0-rc4-syzkaller-00069-gfc88bb11617>
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
RIP: 0010:usb_submit_urb+0xe4b/0x1730 drivers/usb/core/urb.c:503
Code: 84 3c 02 00 00 e8 05 e4 fc fc 4c 89 ef e8 fd 25 d7 fe 45 89 e0 89 e9 4c 89 f2 48 8>
RSP: 0018:ffffc9000441f740 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff888112487a00 RCX: ffffffff811a99a9
RDX: ffff88810df6ba80 RSI: ffffffff811a99b6 RDI: 0000000000000001
RBP: 0000000000000003 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000001
R13: ffff8881023bf0a8 R14: ffff888112452a20 R15: ffff888112487a7c
FS: 00007fc04eea5740(0000) GS:ffff8881f6300000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f0a1de9f870 CR3: 000000010dbd0000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
rtl8150_open+0x300/0xe30 drivers/net/usb/rtl8150.c:733
__dev_open+0x2d4/0x4e0 net/core/dev.c:1474
__dev_change_flags+0x561/0x720 net/core/dev.c:8838
dev_change_flags+0x8f/0x160 net/core/dev.c:8910
devinet_ioctl+0x127a/0x1f10 net/ipv4/devinet.c:1177
inet_ioctl+0x3aa/0x3f0 net/ipv4/af_inet.c:1003
sock_do_ioctl+0x116/0x280 net/socket.c:1222
sock_ioctl+0x22e/0x6c0 net/socket.c:1341
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:907 [inline]
__se_sys_ioctl fs/ioctl.c:893 [inline]
__x64_sys_ioctl+0x193/0x220 fs/ioctl.c:893
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fc04ef73d49
...
This change has not been tested on real hardware.
Reported-and-tested-by: syzbot+d7e968426f644b567e31@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=d7e968426f644b567e31
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru>
Link: https://patch.msgid.link/20250124093020.234642-1-n.zhandarovich@fintech.ru
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2025-01-24 (idpf, ice, iavf)
For idpf:
Emil adds memory barrier when accessing control queue descriptors and
restores call to idpf_vc_xn_shutdown() when resetting.
Manoj Vishwanathan expands transaction lock to properly protect xn->salt
value and adds additional debugging information.
Marco Leogrande converts workqueues to be unbound.
For ice:
Przemek fixes incorrect size use for array.
Mateusz removes reporting of invalid parameter and value.
For iavf:
Michal adjusts some VLAN changes to occur without a PF call to avoid
timing issues with the calls.
* '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
iavf: allow changing VLAN state without calling PF
ice: remove invalid parameter of equalizer
ice: fix ice_parser_rt::bst_key array size
idpf: add more info during virtchnl transaction timeout/salt mismatch
idpf: convert workqueues to unbound
idpf: Acquire the lock before accessing the xn->salt
idpf: fix transaction timeouts on reset
idpf: add read memory barrier when checking descriptor done bit
====================
Link: https://patch.msgid.link/20250124213213.1328775-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:
====================
pull request (net): ipsec 2025-01-27
1) Fix incrementing the upper 32 bit sequence numbers for GSO skbs.
From Jianbo Liu.
2) Fix an out-of-bounds read on xfrm state lookup.
From Florian Westphal.
3) Fix secpath handling on packet offload mode.
From Alexandre Cassen.
4) Fix the usage of skb->sk in the xfrm layer.
5) Don't disable preemption while looking up cache state
to fix PREEMPT_RT.
From Sebastian Sewior.
* tag 'ipsec-2025-01-27' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
xfrm: Don't disable preemption while looking up cache state.
xfrm: Fix the usage of skb->sk
xfrm: delete intermediate secpath entry in packet offload mode
xfrm: state: fix out-of-bounds read during lookup
xfrm: replay: Fix the update of replay_esn->oseq_hi for GSO
====================
Link: https://patch.msgid.link/20250127060757.3946314-1-steffen.klassert@secunet.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Matthieu Baerts says:
====================
mptcp: fixes addressing syzbot reports
Recently, a few issues linked to MPTCP have been reported by syzbot. All
the remaining ones are addressed in this series.
- Patch 1: Address "KMSAN: uninit-value in mptcp_incoming_options (2)".
A fix for v5.11.
- Patch 2: Address "WARNING in mptcp_pm_nl_set_flags (2)". A fix for
v5.18.
- Patch 3: Address "WARNING in __mptcp_clean_una (2)". A fix for v6.4,
backported up to v6.1.
====================
Link: https://patch.msgid.link/20250123-net-mptcp-syzbot-issues-v1-0-af73258a726f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Syzbot was able to trigger a data stream corruption:
WARNING: CPU: 0 PID: 9846 at net/mptcp/protocol.c:1024 __mptcp_clean_una+0xddb/0xff0 net/mptcp/protocol.c:1024
Modules linked in:
CPU: 0 UID: 0 PID: 9846 Comm: syz-executor351 Not tainted 6.13.0-rc2-syzkaller-00059-g00a5acdbf398 #0
Hardware name: Google Compute Engine/Google Compute Engine, BIOS Google 11/25/2024
RIP: 0010:__mptcp_clean_una+0xddb/0xff0 net/mptcp/protocol.c:1024
Code: fa ff ff 48 8b 4c 24 18 80 e1 07 fe c1 38 c1 0f 8c 8e fa ff ff 48 8b 7c 24 18 e8 e0 db 54 f6 e9 7f fa ff ff e8 e6 80 ee f5 90 <0f> 0b 90 4c 8b 6c 24 40 4d 89 f4 e9 04 f5 ff ff 44 89 f1 80 e1 07
RSP: 0018:ffffc9000c0cf400 EFLAGS: 00010293
RAX: ffffffff8bb0dd5a RBX: ffff888033f5d230 RCX: ffff888059ce8000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffc9000c0cf518 R08: ffffffff8bb0d1dd R09: 1ffff110170c8928
R10: dffffc0000000000 R11: ffffed10170c8929 R12: 0000000000000000
R13: ffff888033f5d220 R14: dffffc0000000000 R15: ffff8880592b8000
FS: 00007f6e866496c0(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f6e86f491a0 CR3: 00000000310e6000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
__mptcp_clean_una_wakeup+0x7f/0x2d0 net/mptcp/protocol.c:1074
mptcp_release_cb+0x7cb/0xb30 net/mptcp/protocol.c:3493
release_sock+0x1aa/0x1f0 net/core/sock.c:3640
inet_wait_for_connect net/ipv4/af_inet.c:609 [inline]
__inet_stream_connect+0x8bd/0xf30 net/ipv4/af_inet.c:703
mptcp_sendmsg_fastopen+0x2a2/0x530 net/mptcp/protocol.c:1755
mptcp_sendmsg+0x1884/0x1b10 net/mptcp/protocol.c:1830
sock_sendmsg_nosec net/socket.c:711 [inline]
__sock_sendmsg+0x1a6/0x270 net/socket.c:726
____sys_sendmsg+0x52a/0x7e0 net/socket.c:2583
___sys_sendmsg net/socket.c:2637 [inline]
__sys_sendmsg+0x269/0x350 net/socket.c:2669
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f6e86ebfe69
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 b1 1f 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f6e86649168 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f6e86f491b8 RCX: 00007f6e86ebfe69
RDX: 0000000030004001 RSI: 0000000020000080 RDI: 0000000000000003
RBP: 00007f6e86f491b0 R08: 00007f6e866496c0 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f6e86f491bc
R13: 000000000000006e R14: 00007ffe445d9420 R15: 00007ffe445d9508
</TASK>
The root cause is the bad handling of disconnect() generated internally
by the MPTCP protocol in case of connect FASTOPEN errors.
Address the issue increasing the socket disconnect counter even on such
a case, to allow other threads waiting on the same socket lock to
properly error out.
Fixes: c2b2ae3925b6 ("mptcp: handle correctly disconnect() failures")
Cc: stable@vger.kernel.org
Reported-by: syzbot+ebc0b8ae5d3590b2c074@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/67605870.050a0220.37aaf.0137.GAE@google.com
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/537
Tested-by: syzbot+ebc0b8ae5d3590b2c074@syzkaller.appspotmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250123-net-mptcp-syzbot-issues-v1-3-af73258a726f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
With the in-kernel path-manager, it is possible to change the 'fullmesh'
flag. The code in mptcp_pm_nl_fullmesh() expects to change it only on
'subflow' endpoints, to recreate more or less subflows using the linked
address.
Unfortunately, the set_flags() hook was a bit more permissive, and
allowed 'implicit' endpoints to get the 'fullmesh' flag while it is not
allowed before.
That's what syzbot found, triggering the following warning:
WARNING: CPU: 0 PID: 6499 at net/mptcp/pm_netlink.c:1496 __mark_subflow_endp_available net/mptcp/pm_netlink.c:1496 [inline]
WARNING: CPU: 0 PID: 6499 at net/mptcp/pm_netlink.c:1496 mptcp_pm_nl_fullmesh net/mptcp/pm_netlink.c:1980 [inline]
WARNING: CPU: 0 PID: 6499 at net/mptcp/pm_netlink.c:1496 mptcp_nl_set_flags net/mptcp/pm_netlink.c:2003 [inline]
WARNING: CPU: 0 PID: 6499 at net/mptcp/pm_netlink.c:1496 mptcp_pm_nl_set_flags+0x974/0xdc0 net/mptcp/pm_netlink.c:2064
Modules linked in:
CPU: 0 UID: 0 PID: 6499 Comm: syz.1.413 Not tainted 6.13.0-rc5-syzkaller-00172-gd1bf27c4e176 #0
Hardware name: Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
RIP: 0010:__mark_subflow_endp_available net/mptcp/pm_netlink.c:1496 [inline]
RIP: 0010:mptcp_pm_nl_fullmesh net/mptcp/pm_netlink.c:1980 [inline]
RIP: 0010:mptcp_nl_set_flags net/mptcp/pm_netlink.c:2003 [inline]
RIP: 0010:mptcp_pm_nl_set_flags+0x974/0xdc0 net/mptcp/pm_netlink.c:2064
Code: 01 00 00 49 89 c5 e8 fb 45 e8 f5 e9 b8 fc ff ff e8 f1 45 e8 f5 4c 89 f7 be 03 00 00 00 e8 44 1d 0b f9 eb a0 e8 dd 45 e8 f5 90 <0f> 0b 90 e9 17 ff ff ff 89 d9 80 e1 07 38 c1 0f 8c c9 fc ff ff 48
RSP: 0018:ffffc9000d307240 EFLAGS: 00010293
RAX: ffffffff8bb72e03 RBX: 0000000000000000 RCX: ffff88807da88000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffc9000d307430 R08: ffffffff8bb72cf0 R09: 1ffff1100b842a5e
R10: dffffc0000000000 R11: ffffed100b842a5f R12: ffff88801e2e5ac0
R13: ffff88805c214800 R14: ffff88805c2152e8 R15: 1ffff1100b842a5d
FS: 00005555619f6500(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020002840 CR3: 00000000247e6000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
genl_family_rcv_msg_doit net/netlink/genetlink.c:1115 [inline]
genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
genl_rcv_msg+0xb14/0xec0 net/netlink/genetlink.c:1210
netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2542
genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219
netlink_unicast_kernel net/netlink/af_netlink.c:1321 [inline]
netlink_unicast+0x7f6/0x990 net/netlink/af_netlink.c:1347
netlink_sendmsg+0x8e4/0xcb0 net/netlink/af_netlink.c:1891
sock_sendmsg_nosec net/socket.c:711 [inline]
__sock_sendmsg+0x221/0x270 net/socket.c:726
____sys_sendmsg+0x52a/0x7e0 net/socket.c:2583
___sys_sendmsg net/socket.c:2637 [inline]
__sys_sendmsg+0x269/0x350 net/socket.c:2669
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f5fe8785d29
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fff571f5558 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f5fe8975fa0 RCX: 00007f5fe8785d29
RDX: 0000000000000000 RSI: 0000000020000480 RDI: 0000000000000007
RBP: 00007f5fe8801b08 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f5fe8975fa0 R14: 00007f5fe8975fa0 R15: 00000000000011f4
</TASK>
Here, syzbot managed to set the 'fullmesh' flag on an 'implicit' and
used -- according to 'id_avail_bitmap' -- endpoint, causing the PM to
try decrement the local_addr_used counter which is only incremented for
the 'subflow' endpoint.
Note that 'no type' endpoints -- not 'subflow', 'signal', 'implicit' --
are fine, because their ID will not be marked as used in the 'id_avail'
bitmap, and setting 'fullmesh' can help forcing the creation of subflow
when receiving an ADD_ADDR.
Fixes: 73c762c1f07d ("mptcp: set fullmesh flag in pm_netlink")
Cc: stable@vger.kernel.org
Reported-by: syzbot+cd16e79c1e45f3fe0377@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/6786ac51.050a0220.216c54.00a6.GAE@google.com
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/540
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250123-net-mptcp-syzbot-issues-v1-2-af73258a726f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
MPTCP maintains the received sub-options status is the bitmask carrying
the received suboptions and in several bitfields carrying per suboption
additional info.
Zeroing the bitmask before parsing is not enough to ensure a consistent
status, and the MPTCP code has to additionally clear some bitfiled
depending on the actually parsed suboption.
The above schema is fragile, and syzbot managed to trigger a path where
a relevant bitfield is not cleared/initialized:
BUG: KMSAN: uninit-value in __mptcp_expand_seq net/mptcp/options.c:1030 [inline]
BUG: KMSAN: uninit-value in mptcp_expand_seq net/mptcp/protocol.h:864 [inline]
BUG: KMSAN: uninit-value in ack_update_msk net/mptcp/options.c:1060 [inline]
BUG: KMSAN: uninit-value in mptcp_incoming_options+0x2036/0x3d30 net/mptcp/options.c:1209
__mptcp_expand_seq net/mptcp/options.c:1030 [inline]
mptcp_expand_seq net/mptcp/protocol.h:864 [inline]
ack_update_msk net/mptcp/options.c:1060 [inline]
mptcp_incoming_options+0x2036/0x3d30 net/mptcp/options.c:1209
tcp_data_queue+0xb4/0x7be0 net/ipv4/tcp_input.c:5233
tcp_rcv_established+0x1061/0x2510 net/ipv4/tcp_input.c:6264
tcp_v4_do_rcv+0x7f3/0x11a0 net/ipv4/tcp_ipv4.c:1916
tcp_v4_rcv+0x51df/0x5750 net/ipv4/tcp_ipv4.c:2351
ip_protocol_deliver_rcu+0x2a3/0x13d0 net/ipv4/ip_input.c:205
ip_local_deliver_finish+0x336/0x500 net/ipv4/ip_input.c:233
NF_HOOK include/linux/netfilter.h:314 [inline]
ip_local_deliver+0x21f/0x490 net/ipv4/ip_input.c:254
dst_input include/net/dst.h:460 [inline]
ip_rcv_finish+0x4a2/0x520 net/ipv4/ip_input.c:447
NF_HOOK include/linux/netfilter.h:314 [inline]
ip_rcv+0xcd/0x380 net/ipv4/ip_input.c:567
__netif_receive_skb_one_core net/core/dev.c:5704 [inline]
__netif_receive_skb+0x319/0xa00 net/core/dev.c:5817
process_backlog+0x4ad/0xa50 net/core/dev.c:6149
__napi_poll+0xe7/0x980 net/core/dev.c:6902
napi_poll net/core/dev.c:6971 [inline]
net_rx_action+0xa5a/0x19b0 net/core/dev.c:7093
handle_softirqs+0x1a0/0x7c0 kernel/softirq.c:561
__do_softirq+0x14/0x1a kernel/softirq.c:595
do_softirq+0x9a/0x100 kernel/softirq.c:462
__local_bh_enable_ip+0x9f/0xb0 kernel/softirq.c:389
local_bh_enable include/linux/bottom_half.h:33 [inline]
rcu_read_unlock_bh include/linux/rcupdate.h:919 [inline]
__dev_queue_xmit+0x2758/0x57d0 net/core/dev.c:4493
dev_queue_xmit include/linux/netdevice.h:3168 [inline]
neigh_hh_output include/net/neighbour.h:523 [inline]
neigh_output include/net/neighbour.h:537 [inline]
ip_finish_output2+0x187c/0x1b70 net/ipv4/ip_output.c:236
__ip_finish_output+0x287/0x810
ip_finish_output+0x4b/0x600 net/ipv4/ip_output.c:324
NF_HOOK_COND include/linux/netfilter.h:303 [inline]
ip_output+0x15f/0x3f0 net/ipv4/ip_output.c:434
dst_output include/net/dst.h:450 [inline]
ip_local_out net/ipv4/ip_output.c:130 [inline]
__ip_queue_xmit+0x1f2a/0x20d0 net/ipv4/ip_output.c:536
ip_queue_xmit+0x60/0x80 net/ipv4/ip_output.c:550
__tcp_transmit_skb+0x3cea/0x4900 net/ipv4/tcp_output.c:1468
tcp_transmit_skb net/ipv4/tcp_output.c:1486 [inline]
tcp_write_xmit+0x3b90/0x9070 net/ipv4/tcp_output.c:2829
__tcp_push_pending_frames+0xc4/0x380 net/ipv4/tcp_output.c:3012
tcp_send_fin+0x9f6/0xf50 net/ipv4/tcp_output.c:3618
__tcp_close+0x140c/0x1550 net/ipv4/tcp.c:3130
__mptcp_close_ssk+0x74e/0x16f0 net/mptcp/protocol.c:2496
mptcp_close_ssk+0x26b/0x2c0 net/mptcp/protocol.c:2550
mptcp_pm_nl_rm_addr_or_subflow+0x635/0xd10 net/mptcp/pm_netlink.c:889
mptcp_pm_nl_rm_subflow_received net/mptcp/pm_netlink.c:924 [inline]
mptcp_pm_flush_addrs_and_subflows net/mptcp/pm_netlink.c:1688 [inline]
mptcp_nl_flush_addrs_list net/mptcp/pm_netlink.c:1709 [inline]
mptcp_pm_nl_flush_addrs_doit+0xe10/0x1630 net/mptcp/pm_netlink.c:1750
genl_family_rcv_msg_doit net/netlink/genetlink.c:1115 [inline]
genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
genl_rcv_msg+0x1214/0x12c0 net/netlink/genetlink.c:1210
netlink_rcv_skb+0x375/0x650 net/netlink/af_netlink.c:2542
genl_rcv+0x40/0x60 net/netlink/genetlink.c:1219
netlink_unicast_kernel net/netlink/af_netlink.c:1321 [inline]
netlink_unicast+0xf52/0x1260 net/netlink/af_netlink.c:1347
netlink_sendmsg+0x10da/0x11e0 net/netlink/af_netlink.c:1891
sock_sendmsg_nosec net/socket.c:711 [inline]
__sock_sendmsg+0x30f/0x380 net/socket.c:726
____sys_sendmsg+0x877/0xb60 net/socket.c:2583
___sys_sendmsg+0x28d/0x3c0 net/socket.c:2637
__sys_sendmsg net/socket.c:2669 [inline]
__do_sys_sendmsg net/socket.c:2674 [inline]
__se_sys_sendmsg net/socket.c:2672 [inline]
__x64_sys_sendmsg+0x212/0x3c0 net/socket.c:2672
x64_sys_call+0x2ed6/0x3c30 arch/x86/include/generated/asm/syscalls_64.h:47
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xcd/0x1e0 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Uninit was stored to memory at:
mptcp_get_options+0x2c0f/0x2f20 net/mptcp/options.c:397
mptcp_incoming_options+0x19a/0x3d30 net/mptcp/options.c:1150
tcp_data_queue+0xb4/0x7be0 net/ipv4/tcp_input.c:5233
tcp_rcv_established+0x1061/0x2510 net/ipv4/tcp_input.c:6264
tcp_v4_do_rcv+0x7f3/0x11a0 net/ipv4/tcp_ipv4.c:1916
tcp_v4_rcv+0x51df/0x5750 net/ipv4/tcp_ipv4.c:2351
ip_protocol_deliver_rcu+0x2a3/0x13d0 net/ipv4/ip_input.c:205
ip_local_deliver_finish+0x336/0x500 net/ipv4/ip_input.c:233
NF_HOOK include/linux/netfilter.h:314 [inline]
ip_local_deliver+0x21f/0x490 net/ipv4/ip_input.c:254
dst_input include/net/dst.h:460 [inline]
ip_rcv_finish+0x4a2/0x520 net/ipv4/ip_input.c:447
NF_HOOK include/linux/netfilter.h:314 [inline]
ip_rcv+0xcd/0x380 net/ipv4/ip_input.c:567
__netif_receive_skb_one_core net/core/dev.c:5704 [inline]
__netif_receive_skb+0x319/0xa00 net/core/dev.c:5817
process_backlog+0x4ad/0xa50 net/core/dev.c:6149
__napi_poll+0xe7/0x980 net/core/dev.c:6902
napi_poll net/core/dev.c:6971 [inline]
net_rx_action+0xa5a/0x19b0 net/core/dev.c:7093
handle_softirqs+0x1a0/0x7c0 kernel/softirq.c:561
__do_softirq+0x14/0x1a kernel/softirq.c:595
Uninit was stored to memory at:
put_unaligned_be32 include/linux/unaligned.h:68 [inline]
mptcp_write_options+0x17f9/0x3100 net/mptcp/options.c:1417
mptcp_options_write net/ipv4/tcp_output.c:465 [inline]
tcp_options_write+0x6d9/0xe90 net/ipv4/tcp_output.c:759
__tcp_transmit_skb+0x294b/0x4900 net/ipv4/tcp_output.c:1414
tcp_transmit_skb net/ipv4/tcp_output.c:1486 [inline]
tcp_write_xmit+0x3b90/0x9070 net/ipv4/tcp_output.c:2829
__tcp_push_pending_frames+0xc4/0x380 net/ipv4/tcp_output.c:3012
tcp_send_fin+0x9f6/0xf50 net/ipv4/tcp_output.c:3618
__tcp_close+0x140c/0x1550 net/ipv4/tcp.c:3130
__mptcp_close_ssk+0x74e/0x16f0 net/mptcp/protocol.c:2496
mptcp_close_ssk+0x26b/0x2c0 net/mptcp/protocol.c:2550
mptcp_pm_nl_rm_addr_or_subflow+0x635/0xd10 net/mptcp/pm_netlink.c:889
mptcp_pm_nl_rm_subflow_received net/mptcp/pm_netlink.c:924 [inline]
mptcp_pm_flush_addrs_and_subflows net/mptcp/pm_netlink.c:1688 [inline]
mptcp_nl_flush_addrs_list net/mptcp/pm_netlink.c:1709 [inline]
mptcp_pm_nl_flush_addrs_doit+0xe10/0x1630 net/mptcp/pm_netlink.c:1750
genl_family_rcv_msg_doit net/netlink/genetlink.c:1115 [inline]
genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
genl_rcv_msg+0x1214/0x12c0 net/netlink/genetlink.c:1210
netlink_rcv_skb+0x375/0x650 net/netlink/af_netlink.c:2542
genl_rcv+0x40/0x60 net/netlink/genetlink.c:1219
netlink_unicast_kernel net/netlink/af_netlink.c:1321 [inline]
netlink_unicast+0xf52/0x1260 net/netlink/af_netlink.c:1347
netlink_sendmsg+0x10da/0x11e0 net/netlink/af_netlink.c:1891
sock_sendmsg_nosec net/socket.c:711 [inline]
__sock_sendmsg+0x30f/0x380 net/socket.c:726
____sys_sendmsg+0x877/0xb60 net/socket.c:2583
___sys_sendmsg+0x28d/0x3c0 net/socket.c:2637
__sys_sendmsg net/socket.c:2669 [inline]
__do_sys_sendmsg net/socket.c:2674 [inline]
__se_sys_sendmsg net/socket.c:2672 [inline]
__x64_sys_sendmsg+0x212/0x3c0 net/socket.c:2672
x64_sys_call+0x2ed6/0x3c30 arch/x86/include/generated/asm/syscalls_64.h:47
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xcd/0x1e0 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Uninit was stored to memory at:
mptcp_pm_add_addr_signal+0x3d7/0x4c0
mptcp_established_options_add_addr net/mptcp/options.c:666 [inline]
mptcp_established_options+0x1b9b/0x3a00 net/mptcp/options.c:884
tcp_established_options+0x2c4/0x7d0 net/ipv4/tcp_output.c:1012
__tcp_transmit_skb+0x5b7/0x4900 net/ipv4/tcp_output.c:1333
tcp_transmit_skb net/ipv4/tcp_output.c:1486 [inline]
tcp_write_xmit+0x3b90/0x9070 net/ipv4/tcp_output.c:2829
__tcp_push_pending_frames+0xc4/0x380 net/ipv4/tcp_output.c:3012
tcp_send_fin+0x9f6/0xf50 net/ipv4/tcp_output.c:3618
__tcp_close+0x140c/0x1550 net/ipv4/tcp.c:3130
__mptcp_close_ssk+0x74e/0x16f0 net/mptcp/protocol.c:2496
mptcp_close_ssk+0x26b/0x2c0 net/mptcp/protocol.c:2550
mptcp_pm_nl_rm_addr_or_subflow+0x635/0xd10 net/mptcp/pm_netlink.c:889
mptcp_pm_nl_rm_subflow_received net/mptcp/pm_netlink.c:924 [inline]
mptcp_pm_flush_addrs_and_subflows net/mptcp/pm_netlink.c:1688 [inline]
mptcp_nl_flush_addrs_list net/mptcp/pm_netlink.c:1709 [inline]
mptcp_pm_nl_flush_addrs_doit+0xe10/0x1630 net/mptcp/pm_netlink.c:1750
genl_family_rcv_msg_doit net/netlink/genetlink.c:1115 [inline]
genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
genl_rcv_msg+0x1214/0x12c0 net/netlink/genetlink.c:1210
netlink_rcv_skb+0x375/0x650 net/netlink/af_netlink.c:2542
genl_rcv+0x40/0x60 net/netlink/genetlink.c:1219
netlink_unicast_kernel net/netlink/af_netlink.c:1321 [inline]
netlink_unicast+0xf52/0x1260 net/netlink/af_netlink.c:1347
netlink_sendmsg+0x10da/0x11e0 net/netlink/af_netlink.c:1891
sock_sendmsg_nosec net/socket.c:711 [inline]
__sock_sendmsg+0x30f/0x380 net/socket.c:726
____sys_sendmsg+0x877/0xb60 net/socket.c:2583
___sys_sendmsg+0x28d/0x3c0 net/socket.c:2637
__sys_sendmsg net/socket.c:2669 [inline]
__do_sys_sendmsg net/socket.c:2674 [inline]
__se_sys_sendmsg net/socket.c:2672 [inline]
__x64_sys_sendmsg+0x212/0x3c0 net/socket.c:2672
x64_sys_call+0x2ed6/0x3c30 arch/x86/include/generated/asm/syscalls_64.h:47
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xcd/0x1e0 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Uninit was stored to memory at:
mptcp_pm_add_addr_received+0x95f/0xdd0 net/mptcp/pm.c:235
mptcp_incoming_options+0x2983/0x3d30 net/mptcp/options.c:1169
tcp_data_queue+0xb4/0x7be0 net/ipv4/tcp_input.c:5233
tcp_rcv_state_process+0x2a38/0x49d0 net/ipv4/tcp_input.c:6972
tcp_v4_do_rcv+0xbf9/0x11a0 net/ipv4/tcp_ipv4.c:1939
tcp_v4_rcv+0x51df/0x5750 net/ipv4/tcp_ipv4.c:2351
ip_protocol_deliver_rcu+0x2a3/0x13d0 net/ipv4/ip_input.c:205
ip_local_deliver_finish+0x336/0x500 net/ipv4/ip_input.c:233
NF_HOOK include/linux/netfilter.h:314 [inline]
ip_local_deliver+0x21f/0x490 net/ipv4/ip_input.c:254
dst_input include/net/dst.h:460 [inline]
ip_rcv_finish+0x4a2/0x520 net/ipv4/ip_input.c:447
NF_HOOK include/linux/netfilter.h:314 [inline]
ip_rcv+0xcd/0x380 net/ipv4/ip_input.c:567
__netif_receive_skb_one_core net/core/dev.c:5704 [inline]
__netif_receive_skb+0x319/0xa00 net/core/dev.c:5817
process_backlog+0x4ad/0xa50 net/core/dev.c:6149
__napi_poll+0xe7/0x980 net/core/dev.c:6902
napi_poll net/core/dev.c:6971 [inline]
net_rx_action+0xa5a/0x19b0 net/core/dev.c:7093
handle_softirqs+0x1a0/0x7c0 kernel/softirq.c:561
__do_softirq+0x14/0x1a kernel/softirq.c:595
Local variable mp_opt created at:
mptcp_incoming_options+0x119/0x3d30 net/mptcp/options.c:1127
tcp_data_queue+0xb4/0x7be0 net/ipv4/tcp_input.c:5233
The current schema is too fragile; address the issue grouping all the
state-related data together and clearing the whole group instead of
just the bitmask. This also cleans-up the code a bit, as there is no
need to individually clear "random" bitfield in a couple of places
any more.
Fixes: 84dfe3677a6f ("mptcp: send out dedicated ADD_ADDR packet")
Cc: stable@vger.kernel.org
Reported-by: syzbot+23728c2df58b3bd175ad@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/6786ac51.050a0220.216c54.00a7.GAE@google.com
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/541
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250123-net-mptcp-syzbot-issues-v1-1-af73258a726f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
dm is netdev private data and it cannot be
used after free_netdev() call. Using dm after free_netdev()
can cause UAF bug. Fix it by moving free_netdev() at the end of the
function.
This is similar to the issue fixed in commit
ad297cd2db89 ("net: qcom/emac: fix UAF in emac_remove").
This bug is detected by our static analysis tool.
Fixes: cf9e60aa69ae ("net: davicom: Fix regulator not turned off on driver removal")
Signed-off-by: Chenyuan Yang <chenyuan0y@gmail.com>
CC: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://patch.msgid.link/20250123214213.623518-1-chenyuan0y@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In application note (AN13663) for TJA1120, on page 30, there's a figure
with average PHY startup timing values following software reset.
The time it takes for SMI to become operational after software reset
ranges roughly from 500 us to 1500 us.
This commit adds 2000 us delay after MDIO write which triggers software
reset. Without this delay, soft_reset function returns an error and
prevents successful PHY init.
Cc: stable@vger.kernel.org
Fixes: b050f2f15e04 ("phy: nxp-c45: add driver for tja1103")
Signed-off-by: Milos Reljin <milos_reljin@outlook.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/AM8P250MB0124D258E5A71041AF2CC322E1E32@AM8P250MB0124.EURP250.PROD.OUTLOOK.COM
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
KMSAN reported an uninit-value access in vxlan_vnifilter_dump() [1].
If the length of the netlink message payload is less than
sizeof(struct tunnel_msg), vxlan_vnifilter_dump() accesses bytes
beyond the message. This can lead to uninit-value access. Fix this by
returning an error in such situations.
[1]
BUG: KMSAN: uninit-value in vxlan_vnifilter_dump+0x328/0x920 drivers/net/vxlan/vxlan_vnifilter.c:422
vxlan_vnifilter_dump+0x328/0x920 drivers/net/vxlan/vxlan_vnifilter.c:422
rtnl_dumpit+0xd5/0x2f0 net/core/rtnetlink.c:6786
netlink_dump+0x93e/0x15f0 net/netlink/af_netlink.c:2317
__netlink_dump_start+0x716/0xd60 net/netlink/af_netlink.c:2432
netlink_dump_start include/linux/netlink.h:340 [inline]
rtnetlink_dump_start net/core/rtnetlink.c:6815 [inline]
rtnetlink_rcv_msg+0x1256/0x14a0 net/core/rtnetlink.c:6882
netlink_rcv_skb+0x467/0x660 net/netlink/af_netlink.c:2542
rtnetlink_rcv+0x35/0x40 net/core/rtnetlink.c:6944
netlink_unicast_kernel net/netlink/af_netlink.c:1321 [inline]
netlink_unicast+0xed6/0x1290 net/netlink/af_netlink.c:1347
netlink_sendmsg+0x1092/0x1230 net/netlink/af_netlink.c:1891
sock_sendmsg_nosec net/socket.c:711 [inline]
__sock_sendmsg+0x330/0x3d0 net/socket.c:726
____sys_sendmsg+0x7f4/0xb50 net/socket.c:2583
___sys_sendmsg+0x271/0x3b0 net/socket.c:2637
__sys_sendmsg net/socket.c:2669 [inline]
__do_sys_sendmsg net/socket.c:2674 [inline]
__se_sys_sendmsg net/socket.c:2672 [inline]
__x64_sys_sendmsg+0x211/0x3e0 net/socket.c:2672
x64_sys_call+0x3878/0x3d90 arch/x86/include/generated/asm/syscalls_64.h:47
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xd9/0x1d0 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Uninit was created at:
slab_post_alloc_hook mm/slub.c:4110 [inline]
slab_alloc_node mm/slub.c:4153 [inline]
kmem_cache_alloc_node_noprof+0x800/0xe80 mm/slub.c:4205
kmalloc_reserve+0x13b/0x4b0 net/core/skbuff.c:587
__alloc_skb+0x347/0x7d0 net/core/skbuff.c:678
alloc_skb include/linux/skbuff.h:1323 [inline]
netlink_alloc_large_skb+0xa5/0x280 net/netlink/af_netlink.c:1196
netlink_sendmsg+0xac9/0x1230 net/netlink/af_netlink.c:1866
sock_sendmsg_nosec net/socket.c:711 [inline]
__sock_sendmsg+0x330/0x3d0 net/socket.c:726
____sys_sendmsg+0x7f4/0xb50 net/socket.c:2583
___sys_sendmsg+0x271/0x3b0 net/socket.c:2637
__sys_sendmsg net/socket.c:2669 [inline]
__do_sys_sendmsg net/socket.c:2674 [inline]
__se_sys_sendmsg net/socket.c:2672 [inline]
__x64_sys_sendmsg+0x211/0x3e0 net/socket.c:2672
x64_sys_call+0x3878/0x3d90 arch/x86/include/generated/asm/syscalls_64.h:47
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xd9/0x1d0 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
CPU: 0 UID: 0 PID: 30991 Comm: syz.4.10630 Not tainted 6.12.0-10694-gc44daa7e3c73 #29
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/2014
Fixes: f9c4bb0b245c ("vxlan: vni filtering support on collect metadata device")
Reported-by: syzkaller <syzkaller@googlegroups.com>
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250123145746.785768-1-syoshida@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In its address list, afs now retains pointers to and refs on one or more
rxrpc_peer objects. The address list is freed under RCU and at this time,
it puts the refs on those peers.
Now, when an rxrpc_peer object runs out of refs, it gets removed from the
peer hash table and, for that, rxrpc has to take a spinlock. However, it
is now being called from afs's RCU cleanup, which takes place in BH
context - but it is just taking an ordinary spinlock.
The put may also be called from non-BH context, and so there exists the
possibility of deadlock if the BH-based RCU cleanup happens whilst the hash
spinlock is held. This led to the attached lockdep complaint.
Fix this by changing spinlocks of rxnet->peer_hash_lock back to
BH-disabling locks.
================================
WARNING: inconsistent lock state
6.13.0-rc5-build2+ #1223 Tainted: G E
--------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
swapper/1/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
ffff88810babe228 (&rxnet->peer_hash_lock){+.?.}-{3:3}, at: rxrpc_put_peer+0xcb/0x180
{SOFTIRQ-ON-W} state was registered at:
mark_usage+0x164/0x180
__lock_acquire+0x544/0x990
lock_acquire.part.0+0x103/0x280
_raw_spin_lock+0x2f/0x40
rxrpc_peer_keepalive_worker+0x144/0x440
process_one_work+0x486/0x7c0
process_scheduled_works+0x73/0x90
worker_thread+0x1c8/0x2a0
kthread+0x19b/0x1b0
ret_from_fork+0x24/0x40
ret_from_fork_asm+0x1a/0x30
irq event stamp: 972402
hardirqs last enabled at (972402): [<ffffffff8244360e>] _raw_spin_unlock_irqrestore+0x2e/0x50
hardirqs last disabled at (972401): [<ffffffff82443328>] _raw_spin_lock_irqsave+0x18/0x60
softirqs last enabled at (972300): [<ffffffff810ffbbe>] handle_softirqs+0x3ee/0x430
softirqs last disabled at (972313): [<ffffffff810ffc54>] __irq_exit_rcu+0x44/0x110
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&rxnet->peer_hash_lock);
<Interrupt>
lock(&rxnet->peer_hash_lock);
*** DEADLOCK ***
1 lock held by swapper/1/0:
#0: ffffffff83576be0 (rcu_callback){....}-{0:0}, at: rcu_lock_acquire+0x7/0x30
stack backtrace:
CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Tainted: G E 6.13.0-rc5-build2+ #1223
Tainted: [E]=UNSIGNED_MODULE
Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014
Call Trace:
<IRQ>
dump_stack_lvl+0x57/0x80
print_usage_bug.part.0+0x227/0x240
valid_state+0x53/0x70
mark_lock_irq+0xa5/0x2f0
mark_lock+0xf7/0x170
mark_usage+0xe1/0x180
__lock_acquire+0x544/0x990
lock_acquire.part.0+0x103/0x280
_raw_spin_lock+0x2f/0x40
rxrpc_put_peer+0xcb/0x180
afs_free_addrlist+0x46/0x90 [kafs]
rcu_do_batch+0x2d2/0x640
rcu_core+0x2f7/0x350
handle_softirqs+0x1ee/0x430
__irq_exit_rcu+0x44/0x110
irq_exit_rcu+0xa/0x30
sysvec_apic_timer_interrupt+0x7f/0xa0
</IRQ>
Fixes: 72904d7b9bfb ("rxrpc, afs: Allow afs to pin rxrpc_peer objects")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
Link: https://patch.msgid.link/2095618.1737622752@warthog.procyon.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Package build environments like Fedora rpmbuild introduced hardening
options (e.g. -pie -Wl,-z,now) by passing a -spec option to CFLAGS
and LDFLAGS.
Some Makefiles currently override CFLAGS but not LDFLAGS, which leads
to a mismatch and build failure, for example:
/usr/bin/ld: /tmp/ccd2apay.o: relocation R_X86_64_32 against
`.rodata.str1.1' can not be used when making a PIE object; recompile with -fPIE
/usr/bin/ld: failed to set dynamic section sizes: bad value
collect2: error: ld returned 1 exit status
make[1]: *** [../../lib.mk:222: tools/testing/selftests/net/lib/csum] Error 1
openvswitch/Makefile CFLAGS currently do not appear to be used, but
fix it anyway for the case when new tests are introduced in future.
Fixes: 1d0dc857b5d8 ("selftests: drv-net: add checksum tests")
Signed-off-by: Jan Stancek <jstancek@redhat.com>
Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/3d173603ee258f419d0403363765c9f9494ff79a.1737635092.git.jstancek@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Package build environments like Fedora rpmbuild introduced hardening
options (e.g. -pie -Wl,-z,now) by passing a -spec option to CFLAGS
and LDFLAGS.
mptcp Makefile currently overrides CFLAGS but not LDFLAGS, which leads
to a mismatch and build failure, for example:
make[1]: *** [../../lib.mk:222: tools/testing/selftests/net/mptcp/mptcp_sockopt] Error 1
/usr/bin/ld: /tmp/ccqyMVdb.o: relocation R_X86_64_32 against `.rodata.str1.8' can not be used when making a PIE object; recompile with -fPIE
/usr/bin/ld: failed to set dynamic section sizes: bad value
collect2: error: ld returned 1 exit status
Fixes: cc937dad85ae ("selftests: centralize -D_GNU_SOURCE= to CFLAGS in lib.mk")
Signed-off-by: Jan Stancek <jstancek@redhat.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/7abc701da9df39c2d6cd15bc3cf9e6cee445cb96.1737621162.git.jstancek@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Page ppol tried to cache the NAPI ID in page pool info to avoid
having a dependency on the life cycle of the NAPI instance.
Since commit under Fixes the NAPI ID is not populated until
napi_enable() and there's a good chance that page pool is
created before NAPI gets enabled.
Protect the NAPI pointer with the existing page pool mutex,
the reading path already holds it. napi_id itself we need
to READ_ONCE(), it's protected by netdev_lock() which are
not holding in page pool.
Before this patch napi IDs were missing for mlx5:
# ./cli.py --spec netlink/specs/netdev.yaml --dump page-pool-get
[{'id': 144, 'ifindex': 2, 'inflight': 3072, 'inflight-mem': 12582912},
{'id': 143, 'ifindex': 2, 'inflight': 5568, 'inflight-mem': 22806528},
{'id': 142, 'ifindex': 2, 'inflight': 5120, 'inflight-mem': 20971520},
{'id': 141, 'ifindex': 2, 'inflight': 4992, 'inflight-mem': 20447232},
...
After:
[{'id': 144, 'ifindex': 2, 'inflight': 3072, 'inflight-mem': 12582912,
'napi-id': 565},
{'id': 143, 'ifindex': 2, 'inflight': 4224, 'inflight-mem': 17301504,
'napi-id': 525},
{'id': 142, 'ifindex': 2, 'inflight': 4288, 'inflight-mem': 17563648,
'napi-id': 524},
...
Fixes: 86e25f40aa1e ("net: napi: Add napi_config")
Reviewed-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://patch.msgid.link/20250123231620.1086401-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Syzbot reports:
BUG: KMSAN: uninit-value in nsim_get_ringparam+0xa8/0xe0 drivers/net/netdevsim/ethtool.c:77
nsim_get_ringparam+0xa8/0xe0 drivers/net/netdevsim/ethtool.c:77
ethtool_set_ringparam+0x268/0x570 net/ethtool/ioctl.c:2072
__dev_ethtool net/ethtool/ioctl.c:3209 [inline]
dev_ethtool+0x126d/0x2a40 net/ethtool/ioctl.c:3398
dev_ioctl+0xb0e/0x1280 net/core/dev_ioctl.c:759
This is the SET path, where we call GET to either check user request
against max values, or check if any of the settings will change.
The logic in netdevsim is trying to report the default (ENABLED)
if user has not requested any specific setting. The user setting
is recorded in dev->cfg, don't depend on kernel_ringparam being
pre-populated with it.
Fixes: 928459bbda19 ("net: ethtool: populate the default HDS params in the core")
Reported-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot+b3bcd80232d00091e061@syzkaller.appspotmail.com
Tested-by: syzbot+b3bcd80232d00091e061@syzkaller.appspotmail.com
Link: https://patch.msgid.link/20250123221410.1067678-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Paul has been providing very solid reviews for NC-SI changes
lately, so much so I started CCing him on all NC-SI patches.
Make the designation official.
Reviewed-by: Paul Fertser <fercerpav@gmail.com>
Link: https://patch.msgid.link/20250123155540.943243-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Jakub Kicinski says:
====================
eth: fix calling napi_enable() in atomic context
Dan has reported that I missed a lot of drivers which call napi_enable()
in atomic with the naive coccinelle search for spin locks:
https://lore.kernel.org/dcfd56bc-de32-4b11-9e19-d8bd1543745d@stanley.mountain
Fix them. Most of the fixes involve taking the netdev_lock()
before the spin lock. mt76 is special because we can just
move napi_enable() from the BH section.
All patches compile tested only.
v2: https://lore.kernel.org/20250123004520.806855-1-kuba@kernel.org
v1: https://lore.kernel.org/20250121221519.392014-1-kuba@kernel.org
====================
Link: https://patch.msgid.link/20250124031841.1179756-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
mt76 does a lot of:
local_bh_disable();
napi_enable(...napi);
napi_schedule(...napi);
local_bh_enable();
local_bh_disable() is not a real lock, its most likely taken
because napi_schedule() requires that we invoke softirqs at
some point. napi_enable() needs to take a mutex, so move it
from under the BH protection.
Fixes: 413f0271f396 ("net: protect NAPI enablement with netdev_lock()")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/dcfd56bc-de32-4b11-9e19-d8bd1543745d@stanley.mountain
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250124031841.1179756-8-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
napi_enable() may sleep now, take netdev_lock() before rp->lock.
napi_enable() is hidden inside init_registers().
Note that this patch orders netdev_lock after rp->task_lock,
to avoid having to take the netdev_lock() around disable path.
Fixes: 413f0271f396 ("net: protect NAPI enablement with netdev_lock()")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/dcfd56bc-de32-4b11-9e19-d8bd1543745d@stanley.mountain
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250124031841.1179756-7-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
napi_enable() may sleep now, take netdev_lock() before np->lock.
Fixes: 413f0271f396 ("net: protect NAPI enablement with netdev_lock()")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/dcfd56bc-de32-4b11-9e19-d8bd1543745d@stanley.mountain
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250124031841.1179756-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
napi_enable() may sleep now, take netdev_lock() before tp->lock and
tp->rx_lock.
Fixes: 413f0271f396 ("net: protect NAPI enablement with netdev_lock()")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/dcfd56bc-de32-4b11-9e19-d8bd1543745d@stanley.mountain
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Francois Romieu <romieu@fr.zoreil.com>
Link: https://patch.msgid.link/20250124031841.1179756-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
napi_enable() may sleep now, take netdev_lock() before np->lock.
Fixes: 413f0271f396 ("net: protect NAPI enablement with netdev_lock()")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/dcfd56bc-de32-4b11-9e19-d8bd1543745d@stanley.mountain
Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250124031841.1179756-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The local helpers for calling napi_enable() and napi_disable()
don't serve much purpose and they will complicate the fix in
the subsequent patch. Remove them, call the core functions
directly.
Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250124031841.1179756-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
tg3 has a spin lock protecting most of the config,
switch to taking netdev_lock() explicitly on enable/start
paths. Disable/stop paths seem to not be under the spin
lock (since napi_disable() already needs to sleep),
so leave that side as is.
tg3_restart_hw() releases and re-takes the spin lock,
we need to do the same because dev_close() needs to
take netdev_lock().
Fixes: 413f0271f396 ("net: protect NAPI enablement with netdev_lock()")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/dcfd56bc-de32-4b11-9e19-d8bd1543745d@stanley.mountain
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250124031841.1179756-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
netlink reports which attribute was incorrect by sending back
an attribute offset. Offset points to the address of struct nlattr,
but to interpret the type we also need the nesting path.
Attribute IDs have different meaning in different nests
of the same message.
Correct the condition for "is the offset within current attribute".
ynl_attr_data_len() does not include the attribute header,
so the end offset was off by 4 bytes.
This means that we'd always skip over flags and empty nests.
The devmem tests, for example, issues an invalid request with
empty queue nests, resulting in the following error:
YNL failed: Kernel error: missing attribute: .queues.ifindex
The message is incorrect, "queues" nest does not have an "ifindex"
attribute defined. With this fix we decend correctly into the nest:
YNL failed: Kernel error: missing attribute: .queues.id
Fixes: 86878f14d71a ("tools: ynl: user space helpers")
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250124012130.1121227-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The ioctl and sysfs handlers unconditionally call the ->enable callback.
Not all drivers implement that callback, leading to NULL dereferences.
Example of affected drivers: ptp_s390.c, ptp_vclock.c and ptp_mock.c.
Instead use a dummy callback if no better was specified by the driver.
Fixes: d94ba80ebbea ("ptp: Added a brand new class driver for ptp clocks.")
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/20250123-ptp-enable-v1-1-b015834d3a47@weissschuh.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fix a couple of typos/spelling mistakes in the documentation.
Signed-off-by: Khaled Elnaggar <khaledelnaggarlinux@gmail.com>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Link: https://patch.msgid.link/20250123082521.59997-1-khaledelnaggarlinux@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
kvzalloc_node is not doing a runtime check on the node argument
(__alloc_pages_node_noprof does have a VM_BUG_ON, but it expands to
nothing on !CONFIG_DEBUG_VM builds), so doing any ethtool/netlink
operation that calls mlx5e_open on a CPU that's larger that MAX_NUMNODES
triggers OOB access and panic (see the trace below).
Add missing cpu_to_node call to convert cpu id to node id.
[ 165.427394] mlx5_core 0000:5c:00.0 beth1: Link up
[ 166.479327] BUG: unable to handle page fault for address: 0000000800000010
[ 166.494592] #PF: supervisor read access in kernel mode
[ 166.505995] #PF: error_code(0x0000) - not-present page
...
[ 166.816958] Call Trace:
[ 166.822380] <TASK>
[ 166.827034] ? __die_body+0x64/0xb0
[ 166.834774] ? page_fault_oops+0x2cd/0x3f0
[ 166.843862] ? exc_page_fault+0x63/0x130
[ 166.852564] ? asm_exc_page_fault+0x22/0x30
[ 166.861843] ? __kvmalloc_node_noprof+0x43/0xd0
[ 166.871897] ? get_partial_node+0x1c/0x320
[ 166.880983] ? deactivate_slab+0x269/0x2b0
[ 166.890069] ___slab_alloc+0x521/0xa90
[ 166.898389] ? __kvmalloc_node_noprof+0x43/0xd0
[ 166.908442] __kmalloc_node_noprof+0x216/0x3f0
[ 166.918302] ? __kvmalloc_node_noprof+0x43/0xd0
[ 166.928354] __kvmalloc_node_noprof+0x43/0xd0
[ 166.938021] mlx5e_open_channels+0x5e2/0xc00
[ 166.947496] mlx5e_open_locked+0x3e/0xf0
[ 166.956201] mlx5e_open+0x23/0x50
[ 166.963551] __dev_open+0x114/0x1c0
[ 166.971292] __dev_change_flags+0xa2/0x1b0
[ 166.980378] dev_change_flags+0x21/0x60
[ 166.988887] do_setlink+0x38d/0xf20
[ 166.996628] ? ep_poll_callback+0x1b9/0x240
[ 167.005910] ? __nla_validate_parse.llvm.10713395753544950386+0x80/0xd70
[ 167.020782] ? __wake_up_sync_key+0x52/0x80
[ 167.030066] ? __mutex_lock+0xff/0x550
[ 167.038382] ? security_capable+0x50/0x90
[ 167.047279] rtnl_setlink+0x1c9/0x210
[ 167.055403] ? ep_poll_callback+0x1b9/0x240
[ 167.064684] ? security_capable+0x50/0x90
[ 167.073579] rtnetlink_rcv_msg+0x2f9/0x310
[ 167.082667] ? rtnetlink_bind+0x30/0x30
[ 167.091173] netlink_rcv_skb+0xb1/0xe0
[ 167.099492] netlink_unicast+0x20f/0x2e0
[ 167.108191] netlink_sendmsg+0x389/0x420
[ 167.116896] __sys_sendto+0x158/0x1c0
[ 167.125024] __x64_sys_sendto+0x22/0x30
[ 167.133534] do_syscall_64+0x63/0x130
[ 167.141657] ? __irq_exit_rcu.llvm.17843942359718260576+0x52/0xd0
[ 167.155181] entry_SYSCALL_64_after_hwframe+0x4b/0x53
Fixes: bb135e40129d ("net/mlx5e: move XDP_REDIRECT sq to dynamic allocation")
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250123000407.3464715-1-sdf@fomichev.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
syzbot discovered that we remove the debugfs files after we free
the netdev. Try to clean up the relevant dir while the device
is still around.
Reported-by: syzbot+2e5de9e3ab986b71d2bf@syzkaller.appspotmail.com
Fixes: 424be63ad831 ("netdevsim: add UDP tunnel port offload support")
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/20250122224503.762705-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Rose timers only acquire the socket spinlock, without
checking if the socket is owned by one user thread.
Add a check and rearm the timers if needed.
BUG: KASAN: slab-use-after-free in rose_timer_expiry+0x31d/0x360 net/rose/rose_timer.c:174
Read of size 2 at addr ffff88802f09b82a by task swapper/0/0
CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.13.0-rc5-syzkaller-00172-gd1bf27c4e176 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:378 [inline]
print_report+0x169/0x550 mm/kasan/report.c:489
kasan_report+0x143/0x180 mm/kasan/report.c:602
rose_timer_expiry+0x31d/0x360 net/rose/rose_timer.c:174
call_timer_fn+0x187/0x650 kernel/time/timer.c:1793
expire_timers kernel/time/timer.c:1844 [inline]
__run_timers kernel/time/timer.c:2418 [inline]
__run_timer_base+0x66a/0x8e0 kernel/time/timer.c:2430
run_timer_base kernel/time/timer.c:2439 [inline]
run_timer_softirq+0xb7/0x170 kernel/time/timer.c:2449
handle_softirqs+0x2d4/0x9b0 kernel/softirq.c:561
__do_softirq kernel/softirq.c:595 [inline]
invoke_softirq kernel/softirq.c:435 [inline]
__irq_exit_rcu+0xf7/0x220 kernel/softirq.c:662
irq_exit_rcu+0x9/0x30 kernel/softirq.c:678
instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1049 [inline]
sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1049
</IRQ>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250122180244.1861468-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ndo_do_ioctl is no longer used by the appletalk subsystem after commit
45bd1c5ba758 ("net: appletalk: Drop aarp_send_probe_phase1()").
Signed-off-by: 谢致邦 (XIE Zhibang) <Yeking@Red54.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/tencent_4AC6ED413FEA8116B4253D3ED6947FDBCF08@qq.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Copy of the rationale from 790071347a0a1a89e618eedcd51c687ea783aeb3:
Change ndo_set_mac_address to dev_set_mac_address because
dev_set_mac_address provides a way to notify network layer about MAC
change. In other case, services may not aware about MAC change and keep
using old one which set from network adapter driver.
As example, DHCP client from systemd do not update MAC address without
notification from net subsystem which leads to the problem with acquiring
the right address from DHCP server.
Since dev_set_mac_address requires RTNL lock the operation can not be
performed directly in the response handler, see
9e2bbab94b88295dcc57c7580393c9ee08d7314d.
The way of selecting the first suitable MAC address from the list is
changed, instead of having the driver check it this patch just assumes
any valid MAC should be good.
Fixes: b8291cf3d118 ("net/ncsi: Add NC-SI 1.2 Get MC MAC Address command")
Signed-off-by: Paul Fertser <fercerpav@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
First case:
> ip l a l $VF name vlanx type vlan id 100
> ip l d vlanx
> ip l a l $VF name vlanx type vlan id 100
As workqueue can be execute after sometime, there is a window to have
call trace like that:
- iavf_del_vlan
- iavf_add_vlan
- iavf_del_vlans (wq)
It means that our VLAN 100 will change the state from IAVF_VLAN_ACTIVE
to IAVF_VLAN_REMOVE (iavf_del_vlan). After that in iavf_add_vlan state
won't be changed because VLAN 100 is on the filter list. The final
result is that the VLAN 100 filter isn't added in hardware (no
iavf_add_vlans call).
To fix that change the state if the filter wasn't removed yet directly
to active. It is save as IAVF_VLAN_REMOVE means that virtchnl message
wasn't sent yet.
Second case:
> ip l a l $VF name vlanx type vlan id 100
Any type of VF reset ex. change trust
> ip l s $PF vf $VF_NUM trust on
> ip l d vlanx
> ip l a l $VF name vlanx type vlan id 100
In case of reset iavf driver is responsible for readding all filters
that are being used. To do that all VLAN filters state are changed to
IAVF_VLAN_ADD. Here is even longer window for changing VLAN state from
kernel side, as workqueue isn't called immediately. We can have call
trace like that:
- changing to IAVF_VLAN_ADD (after reset)
- iavf_del_vlan (called from kernel ops)
- iavf_del_vlans (wq)
Not exsisitng VLAN filters will be removed from hardware. It isn't a
bug, ice driver will handle it fine. However, we can have call trace
like that:
- changing to IAVF_VLAN_ADD (after reset)
- iavf_del_vlan (called from kernel ops)
- iavf_add_vlan (called from kernel ops)
- iavf_del_vlans (wq)
With fix for previous case we end up with no VLAN filters in hardware.
We have to remove VLAN filters if the state is IAVF_VLAN_ADD and delete
VLAN was called. It is save as IAVF_VLAN_ADD means that virtchnl message
wasn't sent yet.
Fixes: 0c0da0e95105 ("iavf: refactor VLAN filter states")
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
It occurred that in the commit 70838938e89c ("ice: Implement driver
functionality to dump serdes equalizer values") the invalid DRATE parameter
for reading has been added. The output of the command:
$ ethtool -d <ethX>
returns the garbage value in the place where DRATE value should be
stored.
Remove mentioned parameter to prevent return of corrupted data to
userspace.
Fixes: 70838938e89c ("ice: Implement driver functionality to dump serdes equalizer values")
Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Fix &ice_parser_rt::bst_key size. It was wrongly set to 10 instead of 20
in the initial impl commit (see Fixes tag). All usage code assumed it was
of size 20. That was also the initial size present up to v2 of the intro
series [2], but halved by v3 [3] refactor described as "Replace magic
hardcoded values with macros." The introducing series was so big that
some ugliness was unnoticed, same for bugs :/
ICE_BST_KEY_TCAM_SIZE and ICE_BST_TCAM_KEY_SIZE were differing by one.
There was tmp variable @j in the scope of edited function, but was not
used in all places. This ugliness is now gone.
I'm moving ice_parser_rt::pg_prio a few positions up, to fill up one of
the holes in order to compensate for the added 10 bytes to the ::bst_key,
resulting in the same size of the whole as prior to the fix, and minimal
changes in the offsets of the fields.
Extend also the debug dump print of the key to cover all bytes. To not
have string with 20 "%02x" and 20 params, switch to
ice_debug_array_w_prefix().
This fix obsoletes Ahmed's attempt at [1].
[1] https://lore.kernel.org/intel-wired-lan/20240823230847.172295-1-ahmed.zaki@intel.com
[2] https://lore.kernel.org/intel-wired-lan/20230605054641.2865142-13-junfeng.guo@intel.com
[3] https://lore.kernel.org/intel-wired-lan/20230817093442.2576997-13-junfeng.guo@intel.com
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/intel-wired-lan/b1fb6ff9-b69e-4026-9988-3c783d86c2e0@stanley.mountain
Fixes: 9a4c07aaa0f5 ("ice: add parser execution main loop")
CC: Ahmed Zaki <ahmed.zaki@intel.com>
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Add more information related to the transaction like cookie, vc_op,
salt when transaction times out and include similar information
when transaction salt does not match.
Info output for transaction timeout:
-------------------
(op:5015 cookie:45fe vc_op:5015 salt:45 timeout:60000ms)
-------------------
before it was:
-------------------
(op 5015, 60000ms)
-------------------
Signed-off-by: Manoj Vishwanathan <manojvishy@google.com>
Signed-off-by: Brian Vazquez <brianvv@google.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
When a workqueue is created with `WQ_UNBOUND`, its work items are
served by special worker-pools, whose host workers are not bound to
any specific CPU. In the default configuration (i.e. when
`queue_delayed_work` and friends do not specify which CPU to run the
work item on), `WQ_UNBOUND` allows the work item to be executed on any
CPU in the same node of the CPU it was enqueued on. While this
solution potentially sacrifices locality, it avoids contention with
other processes that might dominate the CPU time of the processor the
work item was scheduled on.
This is not just a theoretical problem: in a particular scenario
misconfigured process was hogging most of the time from CPU0, leaving
less than 0.5% of its CPU time to the kworker. The IDPF workqueues
that were using the kworker on CPU0 suffered large completion delays
as a result, causing performance degradation, timeouts and eventual
system crash.
Tested:
* I have also run a manual test to gauge the performance
improvement. The test consists of an antagonist process
(`./stress --cpu 2`) consuming as much of CPU 0 as possible. This
process is run under `taskset 01` to bind it to CPU0, and its
priority is changed with `chrt -pQ 9900 10000 ${pid}` and
`renice -n -20 ${pid}` after start.
Then, the IDPF driver is forced to prefer CPU0 by editing all calls
to `queue_delayed_work`, `mod_delayed_work`, etc... to use CPU 0.
Finally, `ktraces` for the workqueue events are collected.
Without the current patch, the antagonist process can force
arbitrary delays between `workqueue_queue_work` and
`workqueue_execute_start`, that in my tests were as high as
`30ms`. With the current patch applied, the workqueue can be
migrated to another unloaded CPU in the same node, and, keeping
everything else equal, the maximum delay I could see was `6us`.
Fixes: 0fe45467a104 ("idpf: add create vport and netdev configuration")
Signed-off-by: Marco Leogrande <leogrande@google.com>
Signed-off-by: Manoj Vishwanathan <manojvishy@google.com>
Signed-off-by: Brian Vazquez <brianvv@google.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The transaction salt was being accessed before acquiring the
idpf_vc_xn_lock when idpf has to forward the virtchnl reply.
Fixes: 34c21fa894a1 ("idpf: implement virtchnl transaction manager")
Signed-off-by: Manoj Vishwanathan <manojvishy@google.com>
Signed-off-by: David Decotigny <decot@google.com>
Signed-off-by: Brian Vazquez <brianvv@google.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Restore the call to idpf_vc_xn_shutdown() at the beginning of
idpf_vc_core_deinit() provided the function is not called on remove.
In the reset path the mailbox is destroyed, leading to all transactions
timing out.
Fixes: 09d0fb5cb30e ("idpf: deinit virtchnl transaction manager after vport and vectors")
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Add read memory barrier to ensure the order of operations when accessing
control queue descriptors. Specifically, we want to avoid cases where loads
can be reordered:
1. Load #1 is dispatched to read descriptor flags.
2. Load #2 is dispatched to read some other field from the descriptor.
3. Load #2 completes, accessing memory/cache at a point in time when the DD
flag is zero.
4. NIC DMA overwrites the descriptor, now the DD flag is one.
5. Any fields loaded before step 4 are now inconsistent with the actual
descriptor state.
Add read memory barrier between steps 1 and 2, so that load #2 is not
executed until load #1 has completed.
Fixes: 8077c727561a ("idpf: add controlq init and reset checks")
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Suggested-by: Lance Richardson <rlance@google.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
For the state cache lookup xfrm_input_state_lookup() first disables
preemption, to remain on the CPU and then retrieves a per-CPU pointer.
Within the preempt-disable section it also acquires
netns_xfrm::xfrm_state_lock, a spinlock_t. This lock must not be
acquired with explicit disabled preemption (such as by get_cpu())
because this lock becomes a sleeping lock on PREEMPT_RT.
To remain on the same CPU is just an optimisation for the CPU local
lookup. The actual modification of the per-CPU variable happens with
netns_xfrm::xfrm_state_lock acquired.
Remove get_cpu() and use the state_cache_input on the current CPU.
Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Closes: https://lore.kernel.org/all/CAADnVQKkCLaj=roayH=Mjiiqz_svdf1tsC3OE4EC0E=mAD+L1A@mail.gmail.com/
Fixes: 81a331a0e72dd ("xfrm: Add an inbound percpu state cache.")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
syzbot found that calling mr_mfc_uses_dev() for unres entries
would crash [1], because c->mfc_un.res.minvif / c->mfc_un.res.maxvif
alias to "struct sk_buff_head unresolved", which contain two pointers.
This code never worked, lets remove it.
[1]
Unable to handle kernel paging request at virtual address ffff5fff2d536613
KASAN: maybe wild-memory-access in range [0xfffefff96a9b3098-0xfffefff96a9b309f]
Modules linked in:
CPU: 1 UID: 0 PID: 7321 Comm: syz.0.16 Not tainted 6.13.0-rc7-syzkaller-g1950a0af2d55 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : mr_mfc_uses_dev net/ipv4/ipmr_base.c:290 [inline]
pc : mr_table_dump+0x5a4/0x8b0 net/ipv4/ipmr_base.c:334
lr : mr_mfc_uses_dev net/ipv4/ipmr_base.c:289 [inline]
lr : mr_table_dump+0x694/0x8b0 net/ipv4/ipmr_base.c:334
Call trace:
mr_mfc_uses_dev net/ipv4/ipmr_base.c:290 [inline] (P)
mr_table_dump+0x5a4/0x8b0 net/ipv4/ipmr_base.c:334 (P)
mr_rtm_dumproute+0x254/0x454 net/ipv4/ipmr_base.c:382
ipmr_rtm_dumproute+0x248/0x4b4 net/ipv4/ipmr.c:2648
rtnl_dump_all+0x2e4/0x4e8 net/core/rtnetlink.c:4327
rtnl_dumpit+0x98/0x1d0 net/core/rtnetlink.c:6791
netlink_dump+0x4f0/0xbc0 net/netlink/af_netlink.c:2317
netlink_recvmsg+0x56c/0xe64 net/netlink/af_netlink.c:1973
sock_recvmsg_nosec net/socket.c:1033 [inline]
sock_recvmsg net/socket.c:1055 [inline]
sock_read_iter+0x2d8/0x40c net/socket.c:1125
new_sync_read fs/read_write.c:484 [inline]
vfs_read+0x740/0x970 fs/read_write.c:565
ksys_read+0x15c/0x26c fs/read_write.c:708
Fixes: cb167893f41e ("net: Plumb support for filtering ipv4 and ipv6 multicast route dumps")
Reported-by: syzbot+5cfae50c0e5f2c500013@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/678fe2d1.050a0220.15cac.00b3.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250121181241.841212-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Recent change to add more cases to XFAIL has a broken regex,
the matching needs a real regex not a glob pattern.
While at it add the cases Willem pointed out during review.
Fixes: 3030e3d57ba8 ("selftests/net: packetdrill: make tcp buf limited timing tests benign")
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250121143423.215261-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When port is stopped, unlock before returning
Fixes: 413f0271f396 ("net: protect NAPI enablement with netdev_lock()")
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250121005002.3938236-1-harshit.m.mogalapalli@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|