summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2025-07-17net: bridge: Do not offload IGMP/MLD messagesJoseph Huang
Do not offload IGMP/MLD messages as it could lead to IGMP/MLD Reports being unintentionally flooded to Hosts. Instead, let the bridge decide where to send these IGMP/MLD messages. Consider the case where the local host is sending out reports in response to a remote querier like the following: mcast-listener-process (IP_ADD_MEMBERSHIP) \ br0 / \ swp1 swp2 | | QUERIER SOME-OTHER-HOST In the above setup, br0 will want to br_forward() reports for mcast-listener-process's group(s) via swp1 to QUERIER; but since the source hwdom is 0, the report is eligible for tx offloading, and is flooded by hardware to both swp1 and swp2, reaching SOME-OTHER-HOST as well. (Example and illustration provided by Tobias.) Fixes: 472111920f1c ("net: bridge: switchdev: allow the TX data plane forwarding to be offloaded") Signed-off-by: Joseph Huang <Joseph.Huang@garmin.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250716153551.1830255-1-Joseph.Huang@garmin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17net: vlan: fix VLAN 0 refcount imbalance of toggling filtering during runtimeDong Chenchen
Assuming the "rx-vlan-filter" feature is enabled on a net device, the 8021q module will automatically add or remove VLAN 0 when the net device is put administratively up or down, respectively. There are a couple of problems with the above scheme. The first problem is a memory leak that can happen if the "rx-vlan-filter" feature is disabled while the device is running: # ip link add bond1 up type bond mode 0 # ethtool -K bond1 rx-vlan-filter off # ip link del dev bond1 When the device is put administratively down the "rx-vlan-filter" feature is disabled, so the 8021q module will not remove VLAN 0 and the memory will be leaked [1]. Another problem that can happen is that the kernel can automatically delete VLAN 0 when the device is put administratively down despite not adding it when the device was put administratively up since during that time the "rx-vlan-filter" feature was disabled. null-ptr-unref or bug_on[2] will be triggered by unregister_vlan_dev() for refcount imbalance if toggling filtering during runtime: $ ip link add bond0 type bond mode 0 $ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q $ ethtool -K bond0 rx-vlan-filter off $ ifconfig bond0 up $ ethtool -K bond0 rx-vlan-filter on $ ifconfig bond0 down $ ip link del vlan0 Root cause is as below: step1: add vlan0 for real_dev, such as bond, team. register_vlan_dev vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1 step2: disable vlan filter feature and enable real_dev step3: change filter from 0 to 1 vlan_device_event vlan_filter_push_vids ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0 step4: real_dev down vlan_device_event vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0 vlan_info_rcu_free //free vlan0 step5: delete vlan0 unregister_vlan_dev BUG_ON(!vlan_info); //vlan_info is null Fix both problems by noting in the VLAN info whether VLAN 0 was automatically added upon NETDEV_UP and based on that decide whether it should be deleted upon NETDEV_DOWN, regardless of the state of the "rx-vlan-filter" feature. [1] unreferenced object 0xffff8880068e3100 (size 256): comm "ip", pid 384, jiffies 4296130254 hex dump (first 32 bytes): 00 20 30 0d 80 88 ff ff 00 00 00 00 00 00 00 00 . 0............. 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace (crc 81ce31fa): __kmalloc_cache_noprof+0x2b5/0x340 vlan_vid_add+0x434/0x940 vlan_device_event.cold+0x75/0xa8 notifier_call_chain+0xca/0x150 __dev_notify_flags+0xe3/0x250 rtnl_configure_link+0x193/0x260 rtnl_newlink_create+0x383/0x8e0 __rtnl_newlink+0x22c/0xa40 rtnl_newlink+0x627/0xb00 rtnetlink_rcv_msg+0x6fb/0xb70 netlink_rcv_skb+0x11f/0x350 netlink_unicast+0x426/0x710 netlink_sendmsg+0x75a/0xc20 __sock_sendmsg+0xc1/0x150 ____sys_sendmsg+0x5aa/0x7b0 ___sys_sendmsg+0xfc/0x180 [2] kernel BUG at net/8021q/vlan.c:99! Oops: invalid opcode: 0000 [#1] SMP KASAN PTI CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 #61 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1)) RSP: 0018:ffff88810badf310 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8 RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80 R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000 R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0 Call Trace: <TASK> rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) ___sys_sendmsg (net/socket.c:2622) __sys_sendmsg (net/socket.c:2652) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) Fixes: ad1afb003939 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)") Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b Suggested-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250716034504.2285203-2-dongchenchen2@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17tls: always refresh the queue when reading sockJakub Kicinski
After recent changes in net-next TCP compacts skbs much more aggressively. This unearthed a bug in TLS where we may try to operate on an old skb when checking if all skbs in the queue have matching decrypt state and geometry. BUG: KASAN: slab-use-after-free in tls_strp_check_rcv+0x898/0x9a0 [tls] (net/tls/tls_strp.c:436 net/tls/tls_strp.c:530 net/tls/tls_strp.c:544) Read of size 4 at addr ffff888013085750 by task tls/13529 CPU: 2 UID: 0 PID: 13529 Comm: tls Not tainted 6.16.0-rc5-virtme Call Trace: kasan_report+0xca/0x100 tls_strp_check_rcv+0x898/0x9a0 [tls] tls_rx_rec_wait+0x2c9/0x8d0 [tls] tls_sw_recvmsg+0x40f/0x1aa0 [tls] inet_recvmsg+0x1c3/0x1f0 Always reload the queue, fast path is to have the record in the queue when we wake, anyway (IOW the path going down "if !strp->stm.full_len"). Fixes: 0d87bbd39d7f ("tls: strp: make sure the TCP skbs do not have overlapping data") Link: https://patch.msgid.link/20250716143850.1520292-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17phonet/pep: Move call to pn_skb_get_dst_sockaddr() earlier in pep_sock_accept()Nathan Chancellor
A new warning in clang [1] points out a place in pep_sock_accept() where dst is uninitialized then passed as a const pointer to pep_find_pipe(): net/phonet/pep.c:829:37: error: variable 'dst' is uninitialized when passed as a const pointer argument here [-Werror,-Wuninitialized-const-pointer] 829 | newsk = pep_find_pipe(&pn->hlist, &dst, pipe_handle); | ^~~: Move the call to pn_skb_get_dst_sockaddr(), which initializes dst, to before the call to pep_find_pipe(), so that dst is consistently used initialized throughout the function. Cc: stable@vger.kernel.org Fixes: f7ae8d59f661 ("Phonet: allocate sock from accept syscall rather than soft IRQ") Link: https://github.com/llvm/llvm-project/commit/00dacf8c22f065cb52efb14cd091d441f19b319e [1] Closes: https://github.com/ClangBuiltLinux/linux/issues/2101 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Link: https://patch.msgid.link/20250715-net-phonet-fix-uninit-const-pointer-v1-1-8efd1bd188b3@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17Bluetooth: L2CAP: Fix attempting to adjust outgoing MTULuiz Augusto von Dentz
Configuration request only configure the incoming direction of the peer initiating the request, so using the MTU is the other direction shall not be used, that said the spec allows the peer responding to adjust: Bluetooth Core 6.1, Vol 3, Part A, Section 4.5 'Each configuration parameter value (if any is present) in an L2CAP_CONFIGURATION_RSP packet reflects an ‘adjustment’ to a configuration parameter value that has been sent (or, in case of default values, implied) in the corresponding L2CAP_CONFIGURATION_REQ packet.' That said adjusting the MTU in the response shall be limited to ERTM channels only as for older modes the remote stack may not be able to detect the adjustment causing it to silently drop packets. Link: https://github.com/bluez/bluez/issues/1422 Link: https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/issues/149 Link: https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/4793 Fixes: 042bb9603c44 ("Bluetooth: L2CAP: Fix L2CAP MTU negotiation") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-07-17Merge tag 'wireless-next-2025-07-17' of ↵Paolo Abeni
https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Johannes Berg says: ==================== Another set of changes, notably: - cfg80211: fix double-free introduced earlier - mac80211: fix RCU iteration in CSA - iwlwifi: many cleanups (unused FW APIs, PCIe code, WoWLAN) - mac80211: some work around how FIPS affects wifi, which was wrong (RC4 is used by TKIP, not only WEP) - cfg/mac80211: improvements for unsolicated probe response handling * tag 'wireless-next-2025-07-17' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (64 commits) wifi: cfg80211: fix double free for link_sinfo in nl80211_station_dump() wifi: cfg80211: fix off channel operation allowed check for MLO wifi: mac80211: use RCU-safe iteration in ieee80211_csa_finish wifi: mac80211_hwsim: Update comments in header wifi: mac80211: parse unsolicited broadcast probe response data wifi: cfg80211: parse attribute to update unsolicited probe response template wifi: mac80211: don't use TPE data from assoc response wifi: mac80211: handle WLAN_HT_ACTION_NOTIFY_CHANWIDTH async wifi: mac80211: simplify __ieee80211_rx_h_amsdu() loop wifi: mac80211: don't mark keys for inactive links as uploaded wifi: mac80211: only assign chanctx in reconfig wifi: mac80211_hwsim: Declare support for AP scanning wifi: mac80211: clean up cipher suite handling wifi: mac80211: don't send keys to driver when fips_enabled wifi: cfg80211: Fix interface type validation wifi: mac80211: remove ieee80211_link_unreserve_chanctx() return value wifi: mac80211: don't unreserve never reserved chanctx mwl8k: Add missing check after DMA map wifi: mac80211: make VHT opmode NSS ignore a debug message wifi: iwlwifi: remove support of several iwl_ppag_table_cmd versions ... ==================== Link: https://patch.msgid.link/20250717094610.20106-47-johannes@sipsolutions.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-17Merge tag 'nf-25-07-17' of ↵Paolo Abeni
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following batch contains Netfilter fixes for net: 1) Three patches to enhance conntrack selftests for resize and clash resolution, from Florian Westphal. 2) Expand nft_concat_range.sh selftest to improve coverage from error path, from Florian Westphal. 3) Hide clash bit to userspace from netlink dumps until there is a good reason to expose, from Florian Westphal. 4) Revert notification for device registration/unregistration for nftables basechains and flowtables, we decided to go for a better way to handle this through the nfnetlink_hook infrastructure which will come via nf-next, patch from Phil Sutter. 5) Fix crash in conntrack due to race related to SLAB_TYPESAFE_BY_RCU that results in removing a recycled object that is not yet in the hashes. Move IPS_CONFIRM setting after the object is in the hashes. From Florian Westphal. netfilter pull request 25-07-17 * tag 'nf-25-07-17' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nf_conntrack: fix crash due to removal of uninitialised entry Revert "netfilter: nf_tables: Add notifications for hook changes" netfilter: nf_tables: hide clash bit from userspace selftests: netfilter: nft_concat_range.sh: send packets to empty set selftests: netfilter: conntrack_resize.sh: also use udpclash tool selftests: netfilter: add conntrack clash resolution test case selftests: netfilter: conntrack_resize.sh: extend resize test ==================== Link: https://patch.msgid.link/20250717095808.41725-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-17netfilter: nf_conntrack: fix crash due to removal of uninitialised entryFlorian Westphal
A crash in conntrack was reported while trying to unlink the conntrack entry from the hash bucket list: [exception RIP: __nf_ct_delete_from_lists+172] [..] #7 [ff539b5a2b043aa0] nf_ct_delete at ffffffffc124d421 [nf_conntrack] #8 [ff539b5a2b043ad0] nf_ct_gc_expired at ffffffffc124d999 [nf_conntrack] #9 [ff539b5a2b043ae0] __nf_conntrack_find_get at ffffffffc124efbc [nf_conntrack] [..] The nf_conn struct is marked as allocated from slab but appears to be in a partially initialised state: ct hlist pointer is garbage; looks like the ct hash value (hence crash). ct->status is equal to IPS_CONFIRMED|IPS_DYING, which is expected ct->timeout is 30000 (=30s), which is unexpected. Everything else looks like normal udp conntrack entry. If we ignore ct->status and pretend its 0, the entry matches those that are newly allocated but not yet inserted into the hash: - ct hlist pointers are overloaded and store/cache the raw tuple hash - ct->timeout matches the relative time expected for a new udp flow rather than the absolute 'jiffies' value. If it were not for the presence of IPS_CONFIRMED, __nf_conntrack_find_get() would have skipped the entry. Theory is that we did hit following race: cpu x cpu y cpu z found entry E found entry E E is expired <preemption> nf_ct_delete() return E to rcu slab init_conntrack E is re-inited, ct->status set to 0 reply tuplehash hnnode.pprev stores hash value. cpu y found E right before it was deleted on cpu x. E is now re-inited on cpu z. cpu y was preempted before checking for expiry and/or confirm bit. ->refcnt set to 1 E now owned by skb ->timeout set to 30000 If cpu y were to resume now, it would observe E as expired but would skip E due to missing CONFIRMED bit. nf_conntrack_confirm gets called sets: ct->status |= CONFIRMED This is wrong: E is not yet added to hashtable. cpu y resumes, it observes E as expired but CONFIRMED: <resumes> nf_ct_expired() -> yes (ct->timeout is 30s) confirmed bit set. cpu y will try to delete E from the hashtable: nf_ct_delete() -> set DYING bit __nf_ct_delete_from_lists Even this scenario doesn't guarantee a crash: cpu z still holds the table bucket lock(s) so y blocks: wait for spinlock held by z CONFIRMED is set but there is no guarantee ct will be added to hash: "chaintoolong" or "clash resolution" logic both skip the insert step. reply hnnode.pprev still stores the hash value. unlocks spinlock return NF_DROP <unblocks, then crashes on hlist_nulls_del_rcu pprev> In case CPU z does insert the entry into the hashtable, cpu y will unlink E again right away but no crash occurs. Without 'cpu y' race, 'garbage' hlist is of no consequence: ct refcnt remains at 1, eventually skb will be free'd and E gets destroyed via: nf_conntrack_put -> nf_conntrack_destroy -> nf_ct_destroy. To resolve this, move the IPS_CONFIRMED assignment after the table insertion but before the unlock. Pablo points out that the confirm-bit-store could be reordered to happen before hlist add resp. the timeout fixup, so switch to set_bit and before_atomic memory barrier to prevent this. It doesn't matter if other CPUs can observe a newly inserted entry right before the CONFIRMED bit was set: Such event cannot be distinguished from above "E is the old incarnation" case: the entry will be skipped. Also change nf_ct_should_gc() to first check the confirmed bit. The gc sequence is: 1. Check if entry has expired, if not skip to next entry 2. Obtain a reference to the expired entry. 3. Call nf_ct_should_gc() to double-check step 1. nf_ct_should_gc() is thus called only for entries that already failed an expiry check. After this patch, once the confirmed bit check passes ct->timeout has been altered to reflect the absolute 'best before' date instead of a relative time. Step 3 will therefore not remove the entry. Without this change to nf_ct_should_gc() we could still get this sequence: 1. Check if entry has expired. 2. Obtain a reference. 3. Call nf_ct_should_gc() to double-check step 1: 4 - entry is still observed as expired 5 - meanwhile, ct->timeout is corrected to absolute value on other CPU and confirm bit gets set 6 - confirm bit is seen 7 - valid entry is removed again First do check 6), then 4) so the gc expiry check always picks up either confirmed bit unset (entry gets skipped) or expiry re-check failure for re-inited conntrack objects. This change cannot be backported to releases before 5.19. Without commit 8a75a2c17410 ("netfilter: conntrack: remove unconfirmed list") |= IPS_CONFIRMED line cannot be moved without further changes. Cc: Razvan Cojocaru <rzvncj@gmail.com> Link: https://lore.kernel.org/netfilter-devel/20250627142758.25664-1-fw@strlen.de/ Link: https://lore.kernel.org/netfilter-devel/4239da15-83ff-4ca4-939d-faef283471bb@gmail.com/ Fixes: 1397af5bfd7d ("netfilter: conntrack: remove the percpu dying list") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-07-17net: fix segmentation after TCP/UDP fraglist GROFelix Fietkau
Since "net: gro: use cb instead of skb->network_header", the skb network header is no longer set in the GRO path. This breaks fraglist segmentation, which relies on ip_hdr()/tcp_hdr() to check for address/port changes. Fix this regression by selectively setting the network header for merged segment skbs. Fixes: 186b1ea73ad8 ("net: gro: use cb instead of skb->network_header") Signed-off-by: Felix Fietkau <nbd@nbd.name> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250705150622.10699-1-nbd@nbd.name Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-16bpf: Clean up individual BTF_ID codeFeng Yang
Use BTF_ID_LIST_SINGLE(a, b, c) instead of BTF_ID_LIST(a) BTF_ID(b, c) Signed-off-by: Feng Yang <yangfeng@kylinos.cn> Link: https://lore.kernel.org/r/20250710055419.70544-1-yangfeng59949@163.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-07-16ipv6: mcast: Delay put pmc->idev in mld_del_delrec()Yue Haibing
pmc->idev is still used in ip6_mc_clear_src(), so as mld_clear_delrec() does, the reference should be put after ip6_mc_clear_src() return. Fixes: 63ed8de4be81 ("mld: add mc_lock for protecting per-interface mld data") Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Link: https://patch.msgid.link/20250714141957.3301871-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-16ipv6: mcast: Simplify mld_clear_{report|query}()Yue Haibing
Use __skb_queue_purge() instead of re-implementing it. Note that it uses kfree_skb_reason() instead of kfree_skb() internally, and pass SKB_DROP_REASON_QUEUE_PURGE drop reason to the kfree_skb tracepoint. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20250715120709.3941510-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-16tcp: fix UaF in tcp_prune_ofo_queue()Paolo Abeni
The CI reported a UaF in tcp_prune_ofo_queue(): BUG: KASAN: slab-use-after-free in tcp_prune_ofo_queue+0x55d/0x660 Read of size 4 at addr ffff8880134729d8 by task socat/20348 CPU: 0 UID: 0 PID: 20348 Comm: socat Not tainted 6.16.0-rc5-virtme #1 PREEMPT(full) Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Call Trace: <TASK> dump_stack_lvl+0x82/0xd0 print_address_description.constprop.0+0x2c/0x400 print_report+0xb4/0x270 kasan_report+0xca/0x100 tcp_prune_ofo_queue+0x55d/0x660 tcp_try_rmem_schedule+0x855/0x12e0 tcp_data_queue+0x4dd/0x2260 tcp_rcv_established+0x5e8/0x2370 tcp_v4_do_rcv+0x4ba/0x8c0 __release_sock+0x27a/0x390 release_sock+0x53/0x1d0 tcp_sendmsg+0x37/0x50 sock_write_iter+0x3c1/0x520 vfs_write+0xc09/0x1210 ksys_write+0x183/0x1d0 do_syscall_64+0xc1/0x380 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fcf73ef2337 Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24 RSP: 002b:00007ffd4f924708 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fcf73ef2337 RDX: 0000000000002000 RSI: 0000555f11d1a000 RDI: 0000000000000008 RBP: 0000555f11d1a000 R08: 0000000000002000 R09: 0000000000000000 R10: 0000000000000040 R11: 0000000000000246 R12: 0000000000000008 R13: 0000000000002000 R14: 0000555ee1a44570 R15: 0000000000002000 </TASK> Allocated by task 20348: kasan_save_stack+0x24/0x50 kasan_save_track+0x14/0x30 __kasan_slab_alloc+0x59/0x70 kmem_cache_alloc_node_noprof+0x110/0x340 __alloc_skb+0x213/0x2e0 tcp_collapse+0x43f/0xff0 tcp_try_rmem_schedule+0x6b9/0x12e0 tcp_data_queue+0x4dd/0x2260 tcp_rcv_established+0x5e8/0x2370 tcp_v4_do_rcv+0x4ba/0x8c0 __release_sock+0x27a/0x390 release_sock+0x53/0x1d0 tcp_sendmsg+0x37/0x50 sock_write_iter+0x3c1/0x520 vfs_write+0xc09/0x1210 ksys_write+0x183/0x1d0 do_syscall_64+0xc1/0x380 entry_SYSCALL_64_after_hwframe+0x77/0x7f Freed by task 20348: kasan_save_stack+0x24/0x50 kasan_save_track+0x14/0x30 kasan_save_free_info+0x3b/0x60 __kasan_slab_free+0x38/0x50 kmem_cache_free+0x149/0x330 tcp_prune_ofo_queue+0x211/0x660 tcp_try_rmem_schedule+0x855/0x12e0 tcp_data_queue+0x4dd/0x2260 tcp_rcv_established+0x5e8/0x2370 tcp_v4_do_rcv+0x4ba/0x8c0 __release_sock+0x27a/0x390 release_sock+0x53/0x1d0 tcp_sendmsg+0x37/0x50 sock_write_iter+0x3c1/0x520 vfs_write+0xc09/0x1210 ksys_write+0x183/0x1d0 do_syscall_64+0xc1/0x380 entry_SYSCALL_64_after_hwframe+0x77/0x7f The buggy address belongs to the object at ffff888013472900 which belongs to the cache skbuff_head_cache of size 232 The buggy address is located 216 bytes inside of freed 232-byte region [ffff888013472900, ffff8880134729e8) The buggy address belongs to the physical page: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x13472 head: order:1 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 flags: 0x80000000000040(head|node=0|zone=1) page_type: f5(slab) raw: 0080000000000040 ffff88800198fb40 ffffea0000347b10 ffffea00004f5290 raw: 0000000000000000 0000000000120012 00000000f5000000 0000000000000000 head: 0080000000000040 ffff88800198fb40 ffffea0000347b10 ffffea00004f5290 head: 0000000000000000 0000000000120012 00000000f5000000 0000000000000000 head: 0080000000000001 ffffea00004d1c81 00000000ffffffff 00000000ffffffff head: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888013472880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff888013472900: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff888013472980: fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc ^ ffff888013472a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff888013472a80: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb Indeed tcp_prune_ofo_queue() is reusing the skb dropped a few lines above. The caller wants to enqueue 'in_skb', lets check space vs the latter. Fixes: 1d2fbaad7cd8 ("tcp: stronger sk_rcvbuf checks") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: syzbot+865aca08c0533171bf6a@syzkaller.appspotmail.com Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/b78d2d9bdccca29021eed9a0e7097dd8dc00f485.1752567053.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-16ethtool: Don't check for RXFH fields conflict when no input_xfrm is requestedGal Pressman
The requirement of ->get_rxfh_fields() in ethtool_set_rxfh() is there to verify that we have no conflict of input_xfrm with the RSS fields options, there is no point in doing it if input_xfrm is not supported/requested. This is under the assumption that a driver that supports input_xfrm will also support ->get_rxfh_fields(), so add a WARN_ON() to ethtool_check_ops() to verify it, and remove the op NULL check. This fixes the following error in mlx4_en, which doesn't support getting/setting RXFH fields. $ ethtool --set-rxfh-indir eth2 hfunc xor Cannot set RX flow hash configuration: Operation not supported Fixes: 72792461c8e8 ("net: ethtool: don't mux RXFH via rxnfc callbacks") Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20250715140754.489677-1-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-16Bluetooth: hci_dev: replace 'quirks' integer by 'quirk_flags' bitmapChristian Eggers
The 'quirks' member already ran out of bits on some platforms some time ago. Replace the integer member by a bitmap in order to have enough bits in future. Replace raw bit operations by accessor macros. Fixes: ff26b2dd6568 ("Bluetooth: Add quirk for broken READ_VOICE_SETTING") Fixes: 127881334eaa ("Bluetooth: Add quirk for broken READ_PAGE_SCAN_TYPE") Suggested-by: Pauli Virtanen <pav@iki.fi> Tested-by: Ivan Pravdin <ipravdin.official@gmail.com> Signed-off-by: Kiran K <kiran.k@intel.com> Signed-off-by: Christian Eggers <ceggers@arri.de> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-07-16Bluetooth: SMP: Fix using HCI_ERROR_REMOTE_USER_TERM on timeoutLuiz Augusto von Dentz
This replaces the usage of HCI_ERROR_REMOTE_USER_TERM, which as the name suggest is to indicate a regular disconnection initiated by an user, with HCI_ERROR_AUTH_FAILURE to indicate the session has timeout thus any pairing shall be considered as failed. Fixes: 1e91c29eb60c ("Bluetooth: Use hci_disconnect for immediate disconnection from SMP") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-07-16Bluetooth: SMP: If an unallowed command is received consider it a failureLuiz Augusto von Dentz
If a command is received while a bonding is ongoing consider it a pairing failure so the session is cleanup properly and the device is disconnected immediately instead of continuing with other commands that may result in the session to get stuck without ever completing such as the case bellow: > ACL Data RX: Handle 2048 flags 0x02 dlen 21 SMP: Identity Information (0x08) len 16 Identity resolving key[16]: d7e08edef97d3e62cd2331f82d8073b0 > ACL Data RX: Handle 2048 flags 0x02 dlen 21 SMP: Signing Information (0x0a) len 16 Signature key[16]: 1716c536f94e843a9aea8b13ffde477d Bluetooth: hci0: unexpected SMP command 0x0a from XX:XX:XX:XX:XX:XX > ACL Data RX: Handle 2048 flags 0x02 dlen 12 SMP: Identity Address Information (0x09) len 7 Address: XX:XX:XX:XX:XX:XX (Intel Corporate) While accourding to core spec 6.1 the expected order is always BD_ADDR first first then CSRK: When using LE legacy pairing, the keys shall be distributed in the following order: LTK by the Peripheral EDIV and Rand by the Peripheral IRK by the Peripheral BD_ADDR by the Peripheral CSRK by the Peripheral LTK by the Central EDIV and Rand by the Central IRK by the Central BD_ADDR by the Central CSRK by the Central When using LE Secure Connections, the keys shall be distributed in the following order: IRK by the Peripheral BD_ADDR by the Peripheral CSRK by the Peripheral IRK by the Central BD_ADDR by the Central CSRK by the Central According to the Core 6.1 for commands used for key distribution "Key Rejected" can be used: '3.6.1. Key distribution and generation A device may reject a distributed key by sending the Pairing Failed command with the reason set to "Key Rejected". Fixes: b28b4943660f ("Bluetooth: Add strict checks for allowed SMP PDUs") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-07-16Bluetooth: hci_sync: fix connectable extended advertising when using static ↵Alessandro Gasbarroni
random address Currently, the connectable flag used by the setup of an extended advertising instance drives whether we require privacy when trying to pass a random address to the advertising parameters (Own Address). If privacy is not required, then it automatically falls back to using the controller's public address. This can cause problems when using controllers that do not have a public address set, but instead use a static random address. e.g. Assume a BLE controller that does not have a public address set. The controller upon powering is set with a random static address by default by the kernel. < HCI Command: LE Set Random Address (0x08|0x0005) plen 6 Address: E4:AF:26:D8:3E:3A (Static) > HCI Event: Command Complete (0x0e) plen 4 LE Set Random Address (0x08|0x0005) ncmd 1 Status: Success (0x00) Setting non-connectable extended advertisement parameters in bluetoothctl mgmt add-ext-adv-params -r 0x801 -x 0x802 -P 2M -g 1 correctly sets Own address type as Random < HCI Command: LE Set Extended Advertising Parameters (0x08|0x0036) plen 25 ... Own address type: Random (0x01) Setting connectable extended advertisement parameters in bluetoothctl mgmt add-ext-adv-params -r 0x801 -x 0x802 -P 2M -g -c 1 mistakenly sets Own address type to Public (which causes to use Public Address 00:00:00:00:00:00) < HCI Command: LE Set Extended Advertising Parameters (0x08|0x0036) plen 25 ... Own address type: Public (0x00) This causes either the controller to emit an Invalid Parameters error or to mishandle the advertising. This patch makes sure that we use the already set static random address when requesting a connectable extended advertising when we don't require privacy and our public address is not set (00:00:00:00:00:00). Fixes: 3fe318ee72c5 ("Bluetooth: move hci_get_random_address() to hci_sync") Signed-off-by: Alessandro Gasbarroni <alex.gasbarroni@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-07-16Bluetooth: Fix null-ptr-deref in l2cap_sock_resume_cb()Kuniyuki Iwashima
syzbot reported null-ptr-deref in l2cap_sock_resume_cb(). [0] l2cap_sock_resume_cb() has a similar problem that was fixed by commit 1bff51ea59a9 ("Bluetooth: fix use-after-free error in lock_sock_nested()"). Since both l2cap_sock_kill() and l2cap_sock_resume_cb() are executed under l2cap_sock_resume_cb(), we can avoid the issue simply by checking if chan->data is NULL. Let's not access to the killed socket in l2cap_sock_resume_cb(). [0]: BUG: KASAN: null-ptr-deref in instrument_atomic_write include/linux/instrumented.h:82 [inline] BUG: KASAN: null-ptr-deref in clear_bit include/asm-generic/bitops/instrumented-atomic.h:41 [inline] BUG: KASAN: null-ptr-deref in l2cap_sock_resume_cb+0xb4/0x17c net/bluetooth/l2cap_sock.c:1711 Write of size 8 at addr 0000000000000570 by task kworker/u9:0/52 CPU: 1 UID: 0 PID: 52 Comm: kworker/u9:0 Not tainted 6.16.0-rc4-syzkaller-g7482bb149b9f #0 PREEMPT Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025 Workqueue: hci0 hci_rx_work Call trace: show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:501 (C) __dump_stack+0x30/0x40 lib/dump_stack.c:94 dump_stack_lvl+0xd8/0x12c lib/dump_stack.c:120 print_report+0x58/0x84 mm/kasan/report.c:524 kasan_report+0xb0/0x110 mm/kasan/report.c:634 check_region_inline mm/kasan/generic.c:-1 [inline] kasan_check_range+0x264/0x2a4 mm/kasan/generic.c:189 __kasan_check_write+0x20/0x30 mm/kasan/shadow.c:37 instrument_atomic_write include/linux/instrumented.h:82 [inline] clear_bit include/asm-generic/bitops/instrumented-atomic.h:41 [inline] l2cap_sock_resume_cb+0xb4/0x17c net/bluetooth/l2cap_sock.c:1711 l2cap_security_cfm+0x524/0xea0 net/bluetooth/l2cap_core.c:7357 hci_auth_cfm include/net/bluetooth/hci_core.h:2092 [inline] hci_auth_complete_evt+0x2e8/0xa4c net/bluetooth/hci_event.c:3514 hci_event_func net/bluetooth/hci_event.c:7511 [inline] hci_event_packet+0x650/0xe9c net/bluetooth/hci_event.c:7565 hci_rx_work+0x320/0xb18 net/bluetooth/hci_core.c:4070 process_one_work+0x7e8/0x155c kernel/workqueue.c:3238 process_scheduled_works kernel/workqueue.c:3321 [inline] worker_thread+0x958/0xed8 kernel/workqueue.c:3402 kthread+0x5fc/0x75c kernel/kthread.c:464 ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:847 Fixes: d97c899bde33 ("Bluetooth: Introduce L2CAP channel callback for resuming") Reported-by: syzbot+e4d73b165c3892852d22@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/686c12bd.a70a0220.29fe6c.0b13.GAE@google.com/ Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-07-15mptcp: reset fallback status gracefully at disconnect() timePaolo Abeni
mptcp_disconnect() clears the fallback bit unconditionally, without touching the associated flags. The bit clear is safe, as no fallback operation can race with that -- all subflow are already in TCP_CLOSE status thanks to the previous FASTCLOSE -- but we need to consistently reset all the fallback related status. Also acquire the relevant lock, to avoid fouling static analyzers. Fixes: b29fcfb54cd7 ("mptcp: full disconnect implementation") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250714-net-mptcp-fallback-races-v1-3-391aff963322@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-15mptcp: plug races between subflow fail and subflow creationPaolo Abeni
We have races similar to the one addressed by the previous patch between subflow failing and additional subflow creation. They are just harder to trigger. The solution is similar. Use a separate flag to track the condition 'socket state prevent any additional subflow creation' protected by the fallback lock. The socket fallback makes such flag true, and also receiving or sending an MP_FAIL option. The field 'allow_infinite_fallback' is now always touched under the relevant lock, we can drop the ONCE annotation on write. Fixes: 478d770008b0 ("mptcp: send out MP_FAIL when data checksum fails") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250714-net-mptcp-fallback-races-v1-2-391aff963322@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-15mptcp: make fallback action and fallback decision atomicPaolo Abeni
Syzkaller reported the following splat: WARNING: CPU: 1 PID: 7704 at net/mptcp/protocol.h:1223 __mptcp_do_fallback net/mptcp/protocol.h:1223 [inline] WARNING: CPU: 1 PID: 7704 at net/mptcp/protocol.h:1223 mptcp_do_fallback net/mptcp/protocol.h:1244 [inline] WARNING: CPU: 1 PID: 7704 at net/mptcp/protocol.h:1223 check_fully_established net/mptcp/options.c:982 [inline] WARNING: CPU: 1 PID: 7704 at net/mptcp/protocol.h:1223 mptcp_incoming_options+0x21a8/0x2510 net/mptcp/options.c:1153 Modules linked in: CPU: 1 UID: 0 PID: 7704 Comm: syz.3.1419 Not tainted 6.16.0-rc3-gbd5ce2324dba #20 PREEMPT(voluntary) Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 RIP: 0010:__mptcp_do_fallback net/mptcp/protocol.h:1223 [inline] RIP: 0010:mptcp_do_fallback net/mptcp/protocol.h:1244 [inline] RIP: 0010:check_fully_established net/mptcp/options.c:982 [inline] RIP: 0010:mptcp_incoming_options+0x21a8/0x2510 net/mptcp/options.c:1153 Code: 24 18 e8 bb 2a 00 fd e9 1b df ff ff e8 b1 21 0f 00 e8 ec 5f c4 fc 44 0f b7 ac 24 b0 00 00 00 e9 54 f1 ff ff e8 d9 5f c4 fc 90 <0f> 0b 90 e9 b8 f4 ff ff e8 8b 2a 00 fd e9 8d e6 ff ff e8 81 2a 00 RSP: 0018:ffff8880a3f08448 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff8880180a8000 RCX: ffffffff84afcf45 RDX: ffff888090223700 RSI: ffffffff84afdaa7 RDI: 0000000000000001 RBP: ffff888017955780 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: ffff8880180a8910 R14: ffff8880a3e9d058 R15: 0000000000000000 FS: 00005555791b8500(0000) GS:ffff88811c495000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000110c2800b7 CR3: 0000000058e44000 CR4: 0000000000350ef0 Call Trace: <IRQ> tcp_reset+0x26f/0x2b0 net/ipv4/tcp_input.c:4432 tcp_validate_incoming+0x1057/0x1b60 net/ipv4/tcp_input.c:5975 tcp_rcv_established+0x5b5/0x21f0 net/ipv4/tcp_input.c:6166 tcp_v4_do_rcv+0x5dc/0xa70 net/ipv4/tcp_ipv4.c:1925 tcp_v4_rcv+0x3473/0x44a0 net/ipv4/tcp_ipv4.c:2363 ip_protocol_deliver_rcu+0xba/0x480 net/ipv4/ip_input.c:205 ip_local_deliver_finish+0x2f1/0x500 net/ipv4/ip_input.c:233 NF_HOOK include/linux/netfilter.h:317 [inline] NF_HOOK include/linux/netfilter.h:311 [inline] ip_local_deliver+0x1be/0x560 net/ipv4/ip_input.c:254 dst_input include/net/dst.h:469 [inline] ip_rcv_finish net/ipv4/ip_input.c:447 [inline] NF_HOOK include/linux/netfilter.h:317 [inline] NF_HOOK include/linux/netfilter.h:311 [inline] ip_rcv+0x514/0x810 net/ipv4/ip_input.c:567 __netif_receive_skb_one_core+0x197/0x1e0 net/core/dev.c:5975 __netif_receive_skb+0x1f/0x120 net/core/dev.c:6088 process_backlog+0x301/0x1360 net/core/dev.c:6440 __napi_poll.constprop.0+0xba/0x550 net/core/dev.c:7453 napi_poll net/core/dev.c:7517 [inline] net_rx_action+0xb44/0x1010 net/core/dev.c:7644 handle_softirqs+0x1d0/0x770 kernel/softirq.c:579 do_softirq+0x3f/0x90 kernel/softirq.c:480 </IRQ> <TASK> __local_bh_enable_ip+0xed/0x110 kernel/softirq.c:407 local_bh_enable include/linux/bottom_half.h:33 [inline] inet_csk_listen_stop+0x2c5/0x1070 net/ipv4/inet_connection_sock.c:1524 mptcp_check_listen_stop.part.0+0x1cc/0x220 net/mptcp/protocol.c:2985 mptcp_check_listen_stop net/mptcp/mib.h:118 [inline] __mptcp_close+0x9b9/0xbd0 net/mptcp/protocol.c:3000 mptcp_close+0x2f/0x140 net/mptcp/protocol.c:3066 inet_release+0xed/0x200 net/ipv4/af_inet.c:435 inet6_release+0x4f/0x70 net/ipv6/af_inet6.c:487 __sock_release+0xb3/0x270 net/socket.c:649 sock_close+0x1c/0x30 net/socket.c:1439 __fput+0x402/0xb70 fs/file_table.c:465 task_work_run+0x150/0x240 kernel/task_work.c:227 resume_user_mode_work include/linux/resume_user_mode.h:50 [inline] exit_to_user_mode_loop+0xd4/0xe0 kernel/entry/common.c:114 exit_to_user_mode_prepare include/linux/entry-common.h:330 [inline] syscall_exit_to_user_mode_work include/linux/entry-common.h:414 [inline] syscall_exit_to_user_mode include/linux/entry-common.h:449 [inline] do_syscall_64+0x245/0x360 arch/x86/entry/syscall_64.c:100 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fc92f8a36ad Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffcf52802d8 EFLAGS: 00000246 ORIG_RAX: 00000000000001b4 RAX: 0000000000000000 RBX: 00007ffcf52803a8 RCX: 00007fc92f8a36ad RDX: 0000000000000000 RSI: 000000000000001e RDI: 0000000000000003 RBP: 00007fc92fae7ba0 R08: 0000000000000001 R09: 0000002800000000 R10: 00007fc92f700000 R11: 0000000000000246 R12: 00007fc92fae5fac R13: 00007fc92fae5fa0 R14: 0000000000026d00 R15: 0000000000026c51 </TASK> irq event stamp: 4068 hardirqs last enabled at (4076): [<ffffffff81544816>] __up_console_sem+0x76/0x80 kernel/printk/printk.c:344 hardirqs last disabled at (4085): [<ffffffff815447fb>] __up_console_sem+0x5b/0x80 kernel/printk/printk.c:342 softirqs last enabled at (3096): [<ffffffff840e1be0>] local_bh_enable include/linux/bottom_half.h:33 [inline] softirqs last enabled at (3096): [<ffffffff840e1be0>] inet_csk_listen_stop+0x2c0/0x1070 net/ipv4/inet_connection_sock.c:1524 softirqs last disabled at (3097): [<ffffffff813b6b9f>] do_softirq+0x3f/0x90 kernel/softirq.c:480 Since we need to track the 'fallback is possible' condition and the fallback status separately, there are a few possible races open between the check and the actual fallback action. Add a spinlock to protect the fallback related information and use it close all the possible related races. While at it also remove the too-early clearing of allow_infinite_fallback in __mptcp_subflow_connect(): the field will be correctly cleared by subflow_finish_connect() if/when the connection will complete successfully. If fallback is not possible, as per RFC, reset the current subflow. Since the fallback operation can now fail and return value should be checked, rename the helper accordingly. Fixes: 0530020a7c8f ("mptcp: track and update contiguous data status") Cc: stable@vger.kernel.org Reported-by: Matthieu Baerts <matttbe@kernel.org> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/570 Reported-by: syzbot+5cf807c20386d699b524@syzkaller.appspotmail.com Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/555 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250714-net-mptcp-fallback-races-v1-1-391aff963322@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-15ipv6: mcast: Remove unnecessary null check in ip6_mc_find_dev()Yue Haibing
These is no need to check null for idev before return NULL. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250714081732.3109764-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-15don't open-code kernel_accept() in rds_tcp_accept_one()Al Viro
rds_tcp_accept_one() starts with a pretty much verbatim copy of kernel_accept(). Might as well use the real thing... That code went into mainline in 2009, kernel_accept() had been added in Aug 2006, the copyright on rds/tcp_listen.c is "Copyright (c) 2006 Oracle", so it's entirely possible that it predates the introduction of kernel_accept(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Link: https://patch.msgid.link/20250713180134.GC1880847@ZenIV Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-15net: mctp: Add bind lookup testMatt Johnston
Test the preference order of bound socket matches with a series of test packets. Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Link: https://patch.msgid.link/20250710-mctp-bind-v4-8-8ec2f6460c56@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-15net: mctp: Test conflicts of connect() with bind()Matt Johnston
The addition of connect() adds new conflict cases to test. Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Link: https://patch.msgid.link/20250710-mctp-bind-v4-7-8ec2f6460c56@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-15net: mctp: Allow limiting binds to a peer addressMatt Johnston
Prior to calling bind() a program may call connect() on a socket to restrict to a remote peer address. Using connect() is the normal mechanism to specify a remote network peer, so we use that here. In MCTP connect() is only used for bound sockets - send() is not available for MCTP since a tag must be provided for each message. The smctp_type must match between connect() and bind() calls. Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Link: https://patch.msgid.link/20250710-mctp-bind-v4-6-8ec2f6460c56@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-15net: mctp: Use hashtable for bindsMatt Johnston
Ensure that a specific EID (remote or local) bind will match in preference to a MCTP_ADDR_ANY bind. This adds infrastructure for binding a socket to receive messages from a specific remote peer address, a future commit will expose an API for this. Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Link: https://patch.msgid.link/20250710-mctp-bind-v4-5-8ec2f6460c56@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-15net: mctp: Add test for conflicting bind()sMatt Johnston
Test pairwise combinations of bind addresses and types. Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Link: https://patch.msgid.link/20250710-mctp-bind-v4-4-8ec2f6460c56@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-15net: mctp: Treat MCTP_NET_ANY specially in bind()Matt Johnston
When a specific EID is passed as a bind address, it only makes sense to interpret with an actual network ID, so resolve that to the default network at bind time. For bind address of MCTP_ADDR_ANY, we want to be able to capture traffic to any network and address, so keep the current behaviour of matching traffic from any network interface (don't interpret MCTP_NET_ANY as the default network ID). Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Link: https://patch.msgid.link/20250710-mctp-bind-v4-3-8ec2f6460c56@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-15net: mctp: Prevent duplicate bindsMatt Johnston
Disallow bind() calls that have the same arguments as existing bound sockets. Previously multiple sockets could bind() to the same type/local address, with an arbitrary socket receiving matched messages. This is only a partial fix, a future commit will define precedence order for MCTP_ADDR_ANY versus specific EID bind(), which are allowed to exist together. Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Link: https://patch.msgid.link/20250710-mctp-bind-v4-2-8ec2f6460c56@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-15net: mctp: mctp_test_route_extaddr_input cleanupMatt Johnston
The sock was not being released. Other than leaking, the stale socket will conflict with subsequent bind() calls in unrelated MCTP tests. Fixes: 46ee16462fed ("net: mctp: test: Add extaddr routing output test") Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Link: https://patch.msgid.link/20250710-mctp-bind-v4-1-8ec2f6460c56@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-15wifi: cfg80211: fix double free for link_sinfo in nl80211_station_dump()Sarika Sharma
Currently, the link_sinfo structure is being freed twice in nl80211_dump_station(), once after the send_station() call and again in the error handling path. This results in a double free of both link_sinfo and link_sinfo->pertid, which might lead to undefined behavior or kernel crashes. Hence, fix by ensuring cfg80211_sinfo_release_content() is only invoked once during execution of nl80211_station_dump(). Fixes: 49e47223ecc4 ("wifi: cfg80211: allocate memory for link_station info structure") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/all/81f30515-a83d-4b05-a9d1-e349969df9e9@sabinyo.mountain/ Reported-by: syzbot+4ba6272678aa468132c8@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/68655325.a70a0220.5d25f.0316.GAE@google.com Signed-off-by: Sarika Sharma <quic_sarishar@quicinc.com> Link: https://patch.msgid.link/20250714084405.178066-1-quic_sarishar@quicinc.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-15wifi: cfg80211: fix off channel operation allowed check for MLOAditya Kumar Singh
In cfg80211_off_channel_oper_allowed(), the current logic disallows off-channel operations if any link operates on a radar channel, assuming such channels cannot be vacated. This assumption holds for non-MLO interfaces but not for MLO. With MLO and multi-radio devices, different links may operate on separate radios. This allows one link to scan off-channel while another remains on a radar channel. For example, in a 5 GHz split-phy setup, the lower band can scan while the upper band stays on a radar channel. Off-channel operations can be allowed if the radio/link onto which the input channel falls is different from the radio/link which has an active radar channel. Therefore, fix cfg80211_off_channel_oper_allowed() by returning false only if the requested channel maps to the same radio as an active radar channel. Allow off-channel operations when the requested channel is on a different radio, as in MLO with multi-radio setups. Signed-off-by: Aditya Kumar Singh <aditya.kumar.singh@oss.qualcomm.com> Signed-off-by: Amith A <quic_amitajit@quicinc.com> Link: https://patch.msgid.link/20250714040742.538550-1-quic_amitajit@quicinc.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-15wifi: mac80211: use RCU-safe iteration in ieee80211_csa_finishMaharaja Kennadyrajan
The ieee80211_csa_finish() function currently uses for_each_sdata_link() to iterate over links of sdata. However, this macro internally uses wiphy_dereference(), which expects the wiphy->mtx lock to be held. When ieee80211_csa_finish() is invoked under an RCU read-side critical section (e.g., under rcu_read_lock()), this leads to a warning from the RCU debugging framework. WARNING: suspicious RCU usage net/mac80211/cfg.c:3830 suspicious rcu_dereference_protected() usage! This warning is triggered because wiphy_dereference() is not safe to use without holding the wiphy mutex, and it is being used in an RCU context without the required locking. Fix this by introducing and using a new macro, for_each_sdata_link_rcu(), which performs RCU-safe iteration over sdata links using list_for_each_entry_rcu() and rcu_dereference(). This ensures that the link pointers are accessed safely under RCU and eliminates the warning. Fixes: f600832794c9 ("wifi: mac80211: restructure tx profile retrieval for MLO MBSSID") Signed-off-by: Maharaja Kennadyrajan <maharaja.kennadyrajan@oss.qualcomm.com> Link: https://patch.msgid.link/20250711033846.40455-1-maharaja.kennadyrajan@oss.qualcomm.com [unindent like the non-RCU macro] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-15ipv6: mcast: Avoid a duplicate pointer check in mld_del_delrec()Yue Haibing
Avoid duplicate non-null pointer check for pmc in mld_del_delrec(). No functional changes. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250714081949.3109947-1-yuehaibing@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-15wifi: mac80211: parse unsolicited broadcast probe response dataYuvarani V
During commands like channel switch and color change, the updated unsolicited broadcast probe response template may be provided. However, this data is not parsed or acted upon in mac80211. Add support to parse it and set the BSS changed flag BSS_CHANGED_UNSOL_BCAST_PROBE_RESP so that drivers could take further action. Signed-off-by: Yuvarani V <quic_yuvarani@quicinc.com> Signed-off-by: Aditya Kumar Singh <aditya.kumar.singh@oss.qualcomm.com> Link: https://patch.msgid.link/20250710-update_unsol_bcast_probe_resp-v2-2-31aca39d3b30@oss.qualcomm.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-15wifi: cfg80211: parse attribute to update unsolicited probe response templateYuvarani V
At present, the updated unsolicited broadcast probe response template is not processed during userspace commands such as channel switch or color change. This leads to an issue where older incorrect unsolicited probe response is still used during these events. Add support to parse the netlink attribute and store it so that mac80211/drivers can use it to set the BSS_CHANGED_UNSOL_BCAST_PROBE_RESP flag in order to send the updated unsolicited broadcast probe response templates during these events. Signed-off-by: Yuvarani V <quic_yuvarani@quicinc.com> Signed-off-by: Aditya Kumar Singh <aditya.kumar.singh@oss.qualcomm.com> Link: https://patch.msgid.link/20250710-update_unsol_bcast_probe_resp-v2-1-31aca39d3b30@oss.qualcomm.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-15wifi: mac80211: don't use TPE data from assoc responseJohannes Berg
Since there's no TPE element in the (re)assoc response, trying to use the data from it just leads to using the defaults, even though the real values had been set during authentication from the discovered BSS information. Fix this by simply not handling the TPE data in assoc response since it's not intended to be present, if it changes later the necessary changes will be made by tracking beacons later. As a side effect, by passing the real frame subtype, now print a correct value for ML reconfiguration responses. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250709233537.caa1ca853f5a.I588271f386731978163aa9d84ae75d6f79633e16@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-15wifi: mac80211: handle WLAN_HT_ACTION_NOTIFY_CHANWIDTH asyncMiri Korenblit
If this action frame, with the value of IEEE80211_HT_CHANWIDTH_ANY, arrives right after a beacon that changed the operational bandwidth from 20 MHz to 40 MHz, then updating the rate control bandwidth to 40 can race with updating the chanctx width (that happens in the beacon proccesing) back to 40 MHz: cpu0 cpu1 ieee80211_rx_mgmt_beacon ieee80211_config_bw ieee80211_link_change_chanreq (*)ieee80211_link_update_chanreq ieee80211_rx_h_action (**)ieee80211_sta_cur_vht_bw (***) ieee80211_recalc_chanctx_chantype in (**), the maximum between the capability width and the bss width is returned. But the bss width was just updated to 40 in (*), so the action frame handling code will increase the width of the rate control before the chanctx was increased (in ***), leading to a FW error (at least in iwlwifi driver. But this is wrong regardless). Fix this by simply handling the action frame async, so it won't race with the beacon proccessing. Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218632 Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250709233537.bb9dc6f36c35.I39782d6077424e075974c3bee4277761494a1527@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-15wifi: mac80211: simplify __ieee80211_rx_h_amsdu() loopJohannes Berg
The loop handling individual subframes can be simplified to not use a somewhat confusing goto inside the loop. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250709233537.a217a1e8c667.I5283df9627912c06c8327b5786d6b715c6f3a4e1@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-15wifi: mac80211: don't mark keys for inactive links as uploadedMiri Korenblit
During resume, the driver can call ieee80211_add_gtk_rekey for keys that are not programmed into the device, e.g. keys of inactive links. Don't mark such a key as uploaded to avoid removing it later from the driver/device. Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250709233537.655094412b0b.Iacae31af3ba2a705da0a9baea976c2f799d65dc4@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-15wifi: mac80211: only assign chanctx in reconfigMiri Korenblit
At the end of reconfig we are activating all the links that were active before the error. During the activation, _ieee80211_link_use_channel will unassign and re-assign the chanctx from/to the link. But we only need to do the assign, as we are re-building the state as it was before the reconfig. Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250709233537.6245c3ae7031.Ia5f68992c7c112bea8a426c9339f50c88be3a9ca@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-15wifi: mac80211: clean up cipher suite handlingJohannes Berg
Under the previous commit's assumption that FIPS isn't supported by hardware, we don't need to modify the cipher suite list, but just need to use the software one instead of the driver's in this case, so clean up the code. Also fix it to exclude TKIP in this case, since that's also dependent on RC4. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250709233537.cff427e8f8a5.I744d1ea6a37e3ea55ae8bc3e770acee734eff268@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-15wifi: mac80211: don't send keys to driver when fips_enabledJohannes Berg
When fips_enabled is set, don't send any keys to the driver (including possibly WoWLAN KEK/KCK material), assuming that no device exists with the necessary certifications. If this turns out to be false in the future, we can add a HW flag. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250709233537.e5eebc2b19d8.I968ef8c9ffb48d464ada78685bd25d22349fb063@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-15wifi: mac80211: remove ieee80211_link_unreserve_chanctx() return valueJohannes Berg
All the paths that could return an error are considered misuses of the function and WARN already, and none of the callers ever check the return value. Remove it. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250709233537.5b436ee3c20c.Ieff61ec510939adb5fe6da4840557b649b3aa820@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-15wifi: mac80211: don't unreserve never reserved chanctxJohannes Berg
If a link has no chanctx, indicating it is an inactive link that we tracked CSA for, then attempting to unreserve the reserved chanctx will throw a warning and fail, since there never was a reserved chanctx. Skip the unreserve. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250709233537.022192f4b1ae.Ib58156ac13e674a9f4d714735be0764a244c0aae@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-15wifi: mac80211: make VHT opmode NSS ignore a debug messageJohannes Berg
There's no need to always print it, it's only useful for debugging specific client issues. Make it a debug message. Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Link: https://lore.kernel.org/linux-wireless/20250529070922.3467-1-pmenzel@molgen.mpg.de/ Link: https://patch.msgid.link/20250708105849.22448-2-johannes@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-14tcp: stronger sk_rcvbuf checksEric Dumazet
Currently, TCP stack accepts incoming packet if sizes of receive queues are below sk->sk_rcvbuf limit. This can cause memory overshoot if the packet is big, like an 1/2 MB BIG TCP one. Refine the check to take into account the incoming skb truesize. Note that we still accept the packet if the receive queue is empty, to not completely freeze TCP flows in pathological conditions. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250711114006.480026-8-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-14tcp: add const to tcp_try_rmem_schedule() and sk_rmem_schedule() skbEric Dumazet
These functions to not modify the skb, add a const qualifier. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250711114006.480026-7-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>