summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
3 daysMerge tag 'net-6.16-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bluetooth and wireless. Current release - regressions: - af_unix: allow passing cred for embryo without SO_PASSCRED/SO_PASSPIDFD Current release - new code bugs: - eth: airoha: correct enable mask for RX queues 16-31 - veth: prevent NULL pointer dereference in veth_xdp_rcv when peer disappears under traffic - ipv6: move fib6_config_validate() to ip6_route_add(), prevent invalid routes Previous releases - regressions: - phy: phy_caps: don't skip better duplex match on non-exact match - dsa: b53: fix untagged traffic sent via cpu tagged with VID 0 - Revert "wifi: mwifiex: Fix HT40 bandwidth issue.", it caused transient packet loss, exact reason not fully understood, yet Previous releases - always broken: - net: clear the dst when BPF is changing skb protocol (IPv4 <> IPv6) - sched: sfq: fix a potential crash on gso_skb handling - Bluetooth: intel: improve rx buffer posting to avoid causing issues in the firmware - eth: intel: i40e: make reset handling robust against multiple requests - eth: mlx5: ensure FW pages are always allocated on the local NUMA node, even when device is configure to 'serve' another node - wifi: ath12k: fix GCC_GCC_PCIE_HOT_RST definition for WCN7850, prevent kernel crashes - wifi: ath11k: avoid burning CPU in ath11k_debugfs_fw_stats_request() for 3 sec if fw_stats_done is not set" * tag 'net-6.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (70 commits) selftests: drv-net: rss_ctx: Add test for ntuple rules targeting default RSS context net: ethtool: Don't check if RSS context exists in case of context 0 af_unix: Allow passing cred for embryo without SO_PASSCRED/SO_PASSPIDFD. ipv6: Move fib6_config_validate() to ip6_route_add(). net: drv: netdevsim: don't napi_complete() from netpoll net/mlx5: HWS, Add error checking to hws_bwc_rule_complex_hash_node_get() veth: prevent NULL pointer dereference in veth_xdp_rcv net_sched: remove qdisc_tree_flush_backlog() net_sched: ets: fix a race in ets_qdisc_change() net_sched: tbf: fix a race in tbf_change() net_sched: red: fix a race in __red_change() net_sched: prio: fix a race in prio_tune() net_sched: sch_sfq: reject invalid perturb period net: phy: phy_caps: Don't skip better duplex macth on non-exact match MAINTAINERS: Update Kuniyuki Iwashima's email address. selftests: net: add test case for NAT46 looping back dst net: clear the dst when changing skb protocol net/mlx5e: Fix number of lanes to UNKNOWN when using data_rate_oper net/mlx5e: Fix leak of Geneve TLV option object net/mlx5: HWS, make sure the uplink is the last destination ...
3 daysMerge tag 'wireless-2025-06-12' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Johannes Berg says: ==================== Another quick round of updates: - revert mwifiex HT40 that was causing issues - many ath10k/ath11k/ath12k fixes - re-add some iwlwifi code I lost in a merge - use kfree_sensitive() on an error path in cfg80211 * tag 'wireless-2025-06-12' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: wifi: cfg80211: use kfree_sensitive() for connkeys cleanup wifi: iwlwifi: fix merge damage related to iwl_pci_resume Revert "wifi: mwifiex: Fix HT40 bandwidth issue." wifi: ath12k: fix uaf in ath12k_core_init() wifi: ath12k: Fix hal_reo_cmd_status kernel-doc wifi: ath12k: fix GCC_GCC_PCIE_HOT_RST definition for WCN7850 wifi: ath11k: validate ath11k_crypto_mode on top of ath11k_core_qmi_firmware_ready wifi: ath11k: consistently use ath11k_mac_get_fw_stats() wifi: ath11k: move locking outside of ath11k_mac_get_fw_stats() wifi: ath11k: adjust unlock sequence in ath11k_update_stats_event() wifi: ath11k: move some firmware stats related functions outside of debugfs wifi: ath11k: don't wait when there is no vdev started wifi: ath11k: don't use static variables in ath11k_debugfs_fw_stats_process() wifi: ath11k: avoid burning CPU in ath11k_debugfs_fw_stats_request() wil6210: fix support for sparrow chipsets wifi: ath10k: Avoid vdev delete timeout when firmware is already down ath10k: snoc: fix unbalanced IRQ enable in crash recovery ==================== Link: https://patch.msgid.link/20250612082519.11447-3-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 daysnet: ethtool: Don't check if RSS context exists in case of context 0Gal Pressman
Context 0 (default context) always exists, there is no need to check whether it exists or not when adding a flow steering rule. The existing check fails when creating a flow steering rule for context 0 as it is not stored in the rss_ctx xarray. For example: $ ethtool --config-ntuple eth2 flow-type tcp4 dst-ip 194.237.147.23 dst-port 19983 context 0 loc 618 rmgr: Cannot insert RX class rule: Invalid argument Cannot insert classification rule An example usecase for this could be: - A high-priority rule (loc 0) directing specific port traffic to context 0. - A low-priority rule (loc 1) directing all other TCP traffic to context 1. This is a user-visible regression that was caught in our testing environment, it was not reported by a user yet. Fixes: de7f7582dff2 ("net: ethtool: prevent flow steering to RSS contexts which don't exist") Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Nimrod Oren <noren@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/20250612071958.1696361-2-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 daysMerge tag 'for-net-2025-06-11' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - eir: Fix NULL pointer deference on eir_get_service_data - eir: Fix possible crashes on eir_create_adv_data - hci_sync: Fix broadcast/PA when using an existing instance - ISO: Fix using BT_SK_PA_SYNC to detect BIS sockets - ISO: Fix not using bc_sid as advertisement SID - MGMT: Fix sparse errors * tag 'for-net-2025-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: MGMT: Fix sparse errors Bluetooth: ISO: Fix not using bc_sid as advertisement SID Bluetooth: ISO: Fix using BT_SK_PA_SYNC to detect BIS sockets Bluetooth: eir: Fix possible crashes on eir_create_adv_data Bluetooth: hci_sync: Fix broadcast/PA when using an existing instance Bluetooth: Fix NULL pointer deference on eir_get_service_data ==================== Link: https://patch.msgid.link/20250611204944.1559356-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 daysaf_unix: Allow passing cred for embryo without SO_PASSCRED/SO_PASSPIDFD.Kuniyuki Iwashima
Before the cited commit, the kernel unconditionally embedded SCM credentials to skb for embryo sockets even when both the sender and listener disabled SO_PASSCRED and SO_PASSPIDFD. Now, the credentials are added to skb only when configured by the sender or the listener. However, as reported in the link below, it caused a regression for some programs that assume credentials are included in every skb, but sometimes not now. The only problematic scenario would be that a socket starts listening before setting the option. Then, there will be 2 types of non-small race window, where a client can send skb without credentials, which the peer receives as an "invalid" message (and aborts the connection it seems ?): Client Server ------ ------ s1.listen() <-- No SO_PASS{CRED,PIDFD} s2.connect() s2.send() <-- w/o cred s1.setsockopt(SO_PASS{CRED,PIDFD}) s2.send() <-- w/ cred or Client Server ------ ------ s1.listen() <-- No SO_PASS{CRED,PIDFD} s2.connect() s2.send() <-- w/o cred s3, _ = s1.accept() <-- Inherit cred options s2.send() <-- w/o cred but not set yet s3.setsockopt(SO_PASS{CRED,PIDFD}) s2.send() <-- w/ cred It's unfortunate that buggy programs depend on the behaviour, but let's restore the previous behaviour. Fixes: 3f84d577b79d ("af_unix: Inherit sk_flags at connect().") Reported-by: Jacek Łuczak <difrost.kernel@gmail.com> Closes: https://lore.kernel.org/all/68d38b0b-1666-4974-85d4-15575789c8d4@gmail.com/ Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Tested-by: Christian Heusel <christian@heusel.eu> Tested-by: André Almeida <andrealmeid@igalia.com> Tested-by: Jacek Łuczak <difrost.kernel@gmail.com> Link: https://patch.msgid.link/20250611202758.3075858-1-kuni1840@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 daysipv6: Move fib6_config_validate() to ip6_route_add().Kuniyuki Iwashima
syzkaller created an IPv6 route from a malformed packet, which has a prefix len > 128, triggering the splat below. [0] This is a similar issue fixed by commit 586ceac9acb7 ("ipv6: Restore fib6_config validation for SIOCADDRT."). The cited commit removed fib6_config validation from some callers of ip6_add_route(). Let's move the validation back to ip6_route_add() and ip6_route_multipath_add(). [0]: UBSAN: array-index-out-of-bounds in ./include/net/ipv6.h:616:34 index 20 is out of range for type '__u8 [16]' CPU: 1 UID: 0 PID: 7444 Comm: syz.0.708 Not tainted 6.16.0-rc1-syzkaller-g19272b37aa4f #0 PREEMPT Hardware name: riscv-virtio,qemu (DT) Call Trace: [<ffffffff80078a80>] dump_backtrace+0x2e/0x3c arch/riscv/kernel/stacktrace.c:132 [<ffffffff8000327a>] show_stack+0x30/0x3c arch/riscv/kernel/stacktrace.c:138 [<ffffffff80061012>] __dump_stack lib/dump_stack.c:94 [inline] [<ffffffff80061012>] dump_stack_lvl+0x12e/0x1a6 lib/dump_stack.c:120 [<ffffffff800610a6>] dump_stack+0x1c/0x24 lib/dump_stack.c:129 [<ffffffff8001c0ea>] ubsan_epilogue+0x14/0x46 lib/ubsan.c:233 [<ffffffff819ba290>] __ubsan_handle_out_of_bounds+0xf6/0xf8 lib/ubsan.c:455 [<ffffffff85b363a4>] ipv6_addr_prefix include/net/ipv6.h:616 [inline] [<ffffffff85b363a4>] ip6_route_info_create+0x8f8/0x96e net/ipv6/route.c:3793 [<ffffffff85b635da>] ip6_route_add+0x2a/0x1aa net/ipv6/route.c:3889 [<ffffffff85b02e08>] addrconf_prefix_route+0x2c4/0x4e8 net/ipv6/addrconf.c:2487 [<ffffffff85b23bb2>] addrconf_prefix_rcv+0x1720/0x1e62 net/ipv6/addrconf.c:2878 [<ffffffff85b92664>] ndisc_router_discovery+0x1a06/0x3504 net/ipv6/ndisc.c:1570 [<ffffffff85b99038>] ndisc_rcv+0x500/0x600 net/ipv6/ndisc.c:1874 [<ffffffff85bc2c18>] icmpv6_rcv+0x145e/0x1e0a net/ipv6/icmp.c:988 [<ffffffff85af6798>] ip6_protocol_deliver_rcu+0x18a/0x1976 net/ipv6/ip6_input.c:436 [<ffffffff85af8078>] ip6_input_finish+0xf4/0x174 net/ipv6/ip6_input.c:480 [<ffffffff85af8262>] NF_HOOK include/linux/netfilter.h:317 [inline] [<ffffffff85af8262>] NF_HOOK include/linux/netfilter.h:311 [inline] [<ffffffff85af8262>] ip6_input+0x16a/0x70c net/ipv6/ip6_input.c:491 [<ffffffff85af8dcc>] ip6_mc_input+0x5c8/0x1268 net/ipv6/ip6_input.c:588 [<ffffffff85af6112>] dst_input include/net/dst.h:469 [inline] [<ffffffff85af6112>] ip6_rcv_finish net/ipv6/ip6_input.c:79 [inline] [<ffffffff85af6112>] NF_HOOK include/linux/netfilter.h:317 [inline] [<ffffffff85af6112>] NF_HOOK include/linux/netfilter.h:311 [inline] [<ffffffff85af6112>] ipv6_rcv+0x5ae/0x6e0 net/ipv6/ip6_input.c:309 [<ffffffff85087e84>] __netif_receive_skb_one_core+0x106/0x16e net/core/dev.c:5977 [<ffffffff85088104>] __netif_receive_skb+0x2c/0x144 net/core/dev.c:6090 [<ffffffff850883c6>] netif_receive_skb_internal net/core/dev.c:6176 [inline] [<ffffffff850883c6>] netif_receive_skb+0x1aa/0xbf2 net/core/dev.c:6235 [<ffffffff8328656e>] tun_rx_batched.isra.0+0x430/0x686 drivers/net/tun.c:1485 [<ffffffff8329ed3a>] tun_get_user+0x2952/0x3d6c drivers/net/tun.c:1938 [<ffffffff832a21e0>] tun_chr_write_iter+0xc4/0x21c drivers/net/tun.c:1984 [<ffffffff80b9b9ae>] new_sync_write fs/read_write.c:593 [inline] [<ffffffff80b9b9ae>] vfs_write+0x56c/0xa9a fs/read_write.c:686 [<ffffffff80b9c2be>] ksys_write+0x126/0x228 fs/read_write.c:738 [<ffffffff80b9c42e>] __do_sys_write fs/read_write.c:749 [inline] [<ffffffff80b9c42e>] __se_sys_write fs/read_write.c:746 [inline] [<ffffffff80b9c42e>] __riscv_sys_write+0x6e/0x94 fs/read_write.c:746 [<ffffffff80076912>] syscall_handler+0x94/0x118 arch/riscv/include/asm/syscall.h:112 [<ffffffff8637e31e>] do_trap_ecall_u+0x396/0x530 arch/riscv/kernel/traps.c:341 [<ffffffff863a69e2>] handle_exception+0x146/0x152 arch/riscv/kernel/entry.S:197 Fixes: fa76c1674f2e ("ipv6: Move some validation from ip6_route_info_create() to rtm_to_fib6_config().") Reported-by: syzbot+4c2358694722d304c44e@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/6849b8c3.a00a0220.1eb5f5.00f0.GAE@google.com/ Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250611193551.2999991-1-kuni1840@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 daysnet_sched: ets: fix a race in ets_qdisc_change()Eric Dumazet
Gerrard Tai reported a race condition in ETS, whenever SFQ perturb timer fires at the wrong time. The race is as follows: CPU 0 CPU 1 [1]: lock root [2]: qdisc_tree_flush_backlog() [3]: unlock root | | [5]: lock root | [6]: rehash | [7]: qdisc_tree_reduce_backlog() | [4]: qdisc_put() This can be abused to underflow a parent's qlen. Calling qdisc_purge_queue() instead of qdisc_tree_flush_backlog() should fix the race, because all packets will be purged from the qdisc before releasing the lock. Fixes: b05972f01e7d ("net: sched: tbf: don't call qdisc_put() while holding tree lock") Reported-by: Gerrard Tai <gerrard.tai@starlabs.sg> Suggested-by: Gerrard Tai <gerrard.tai@starlabs.sg> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250611111515.1983366-5-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 daysnet_sched: tbf: fix a race in tbf_change()Eric Dumazet
Gerrard Tai reported a race condition in TBF, whenever SFQ perturb timer fires at the wrong time. The race is as follows: CPU 0 CPU 1 [1]: lock root [2]: qdisc_tree_flush_backlog() [3]: unlock root | | [5]: lock root | [6]: rehash | [7]: qdisc_tree_reduce_backlog() | [4]: qdisc_put() This can be abused to underflow a parent's qlen. Calling qdisc_purge_queue() instead of qdisc_tree_flush_backlog() should fix the race, because all packets will be purged from the qdisc before releasing the lock. Fixes: b05972f01e7d ("net: sched: tbf: don't call qdisc_put() while holding tree lock") Reported-by: Gerrard Tai <gerrard.tai@starlabs.sg> Suggested-by: Gerrard Tai <gerrard.tai@starlabs.sg> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Zhengchao Shao <shaozhengchao@huawei.com> Link: https://patch.msgid.link/20250611111515.1983366-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 daysnet_sched: red: fix a race in __red_change()Eric Dumazet
Gerrard Tai reported a race condition in RED, whenever SFQ perturb timer fires at the wrong time. The race is as follows: CPU 0 CPU 1 [1]: lock root [2]: qdisc_tree_flush_backlog() [3]: unlock root | | [5]: lock root | [6]: rehash | [7]: qdisc_tree_reduce_backlog() | [4]: qdisc_put() This can be abused to underflow a parent's qlen. Calling qdisc_purge_queue() instead of qdisc_tree_flush_backlog() should fix the race, because all packets will be purged from the qdisc before releasing the lock. Fixes: 0c8d13ac9607 ("net: sched: red: delay destroying child qdisc on replace") Reported-by: Gerrard Tai <gerrard.tai@starlabs.sg> Suggested-by: Gerrard Tai <gerrard.tai@starlabs.sg> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250611111515.1983366-3-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 daysnet_sched: prio: fix a race in prio_tune()Eric Dumazet
Gerrard Tai reported a race condition in PRIO, whenever SFQ perturb timer fires at the wrong time. The race is as follows: CPU 0 CPU 1 [1]: lock root [2]: qdisc_tree_flush_backlog() [3]: unlock root | | [5]: lock root | [6]: rehash | [7]: qdisc_tree_reduce_backlog() | [4]: qdisc_put() This can be abused to underflow a parent's qlen. Calling qdisc_purge_queue() instead of qdisc_tree_flush_backlog() should fix the race, because all packets will be purged from the qdisc before releasing the lock. Fixes: 7b8e0b6e6599 ("net: sched: prio: delay destroying child qdiscs on change") Reported-by: Gerrard Tai <gerrard.tai@starlabs.sg> Suggested-by: Gerrard Tai <gerrard.tai@starlabs.sg> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250611111515.1983366-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 daysnet_sched: sch_sfq: reject invalid perturb periodEric Dumazet
Gerrard Tai reported that SFQ perturb_period has no range check yet, and this can be used to trigger a race condition fixed in a separate patch. We want to make sure ctl->perturb_period * HZ will not overflow and is positive. Tested: tc qd add dev lo root sfq perturb -10 # negative value : error Error: sch_sfq: invalid perturb period. tc qd add dev lo root sfq perturb 1000000000 # too big : error Error: sch_sfq: invalid perturb period. tc qd add dev lo root sfq perturb 2000000 # acceptable value tc -s -d qd sh dev lo qdisc sfq 8005: root refcnt 2 limit 127p quantum 64Kb depth 127 flows 128 divisor 1024 perturb 2000000sec Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: Gerrard Tai <gerrard.tai@starlabs.sg> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250611083501.1810459-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 daysnet: clear the dst when changing skb protocolJakub Kicinski
A not-so-careful NAT46 BPF program can crash the kernel if it indiscriminately flips ingress packets from v4 to v6: BUG: kernel NULL pointer dereference, address: 0000000000000000 ip6_rcv_core (net/ipv6/ip6_input.c:190:20) ipv6_rcv (net/ipv6/ip6_input.c:306:8) process_backlog (net/core/dev.c:6186:4) napi_poll (net/core/dev.c:6906:9) net_rx_action (net/core/dev.c:7028:13) do_softirq (kernel/softirq.c:462:3) netif_rx (net/core/dev.c:5326:3) dev_loopback_xmit (net/core/dev.c:4015:2) ip_mc_finish_output (net/ipv4/ip_output.c:363:8) NF_HOOK (./include/linux/netfilter.h:314:9) ip_mc_output (net/ipv4/ip_output.c:400:5) dst_output (./include/net/dst.h:459:9) ip_local_out (net/ipv4/ip_output.c:130:9) ip_send_skb (net/ipv4/ip_output.c:1496:8) udp_send_skb (net/ipv4/udp.c:1040:8) udp_sendmsg (net/ipv4/udp.c:1328:10) The output interface has a 4->6 program attached at ingress. We try to loop the multicast skb back to the sending socket. Ingress BPF runs as part of netif_rx(), pushes a valid v6 hdr and changes skb->protocol to v6. We enter ip6_rcv_core which tries to use skb_dst(). But the dst is still an IPv4 one left after IPv4 mcast output. Clear the dst in all BPF helpers which change the protocol. Try to preserve metadata dsts, those may carry non-routing metadata. Cc: stable@vger.kernel.org Reviewed-by: Maciej Żenczykowski <maze@google.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Fixes: d219df60a70e ("bpf: Add ipip6 and ip6ip decap support for bpf_skb_adjust_room()") Fixes: 1b00e0dfe7d0 ("bpf: update skb->protocol in bpf_skb_net_grow") Fixes: 6578171a7ff0 ("bpf: add bpf_skb_change_proto helper") Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250610001245.1981782-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 daysBluetooth: MGMT: Fix sparse errorsLuiz Augusto von Dentz
This fixes the following errors: net/bluetooth/mgmt.c:5400:59: sparse: sparse: incorrect type in argument 3 (different base types) @@ expected unsigned short [usertype] handle @@ got restricted __le16 [usertype] monitor_handle @@ net/bluetooth/mgmt.c:5400:59: sparse: expected unsigned short [usertype] handle net/bluetooth/mgmt.c:5400:59: sparse: got restricted __le16 [usertype] monitor_handle Fixes: e6ed54e86aae ("Bluetooth: MGMT: Fix UAF on mgmt_remove_adv_monitor_complete") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202506060347.ux2O1p7L-lkp@intel.com/ Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: ISO: Fix not using bc_sid as advertisement SIDLuiz Augusto von Dentz
Currently bc_sid is being ignore when acting as Broadcast Source role, so this fix it by passing the bc_sid and then use it when programming the PA: < HCI Command: LE Set Exte.. (0x08|0x0036) plen 25 Handle: 0x01 Properties: 0x0000 Min advertising interval: 140.000 msec (0x00e0) Max advertising interval: 140.000 msec (0x00e0) Channel map: 37, 38, 39 (0x07) Own address type: Random (0x01) Peer address type: Public (0x00) Peer address: 00:00:00:00:00:00 (OUI 00-00-00) Filter policy: Allow Scan Request from Any, Allow Connect Request from Any (0x00) TX power: Host has no preference (0x7f) Primary PHY: LE 1M (0x01) Secondary max skip: 0x00 Secondary PHY: LE 2M (0x02) SID: 0x01 Scan request notifications: Disabled (0x00) Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: ISO: Fix using BT_SK_PA_SYNC to detect BIS socketsLuiz Augusto von Dentz
BT_SK_PA_SYNC is only valid for Broadcast Sinks which means socket used for Broadcast Sources wouldn't be able to use the likes of getpeername to read out the sockaddr_iso_bc fields which may have been update (e.g. bc_sid). Fixes: 0a766a0affb5 ("Bluetooth: ISO: Fix getpeername not returning sockaddr_iso_bc fields") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: eir: Fix possible crashes on eir_create_adv_dataLuiz Augusto von Dentz
eir_create_adv_data may attempt to add EIR_FLAGS and EIR_TX_POWER without checking if that would fit. Link: https://github.com/bluez/bluez/issues/1117#issuecomment-2958244066 Fixes: 01ce70b0a274 ("Bluetooth: eir: Move EIR/Adv Data functions to its own file") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: hci_sync: Fix broadcast/PA when using an existing instanceLuiz Augusto von Dentz
When using and existing adv_info instance for broadcast source it needs to be updated to periodic first before it can be reused, also in case the existing instance already have data hci_set_adv_instance_data cannot be used directly since it would overwrite the existing data so this reappend the original data after the Broadcast ID, if one was generated. Example: bluetoothctl># Add PBP to EA so it can be later referenced as the BIS ID bluetoothctl> advertise.service 0x1856 0x00 0x00 bluetoothctl> advertise on ... < HCI Command: LE Set Extended Advertising Data (0x08|0x0037) plen 13 Handle: 0x01 Operation: Complete extended advertising data (0x03) Fragment preference: Minimize fragmentation (0x01) Data length: 0x09 Service Data: Public Broadcast Announcement (0x1856) Data[2]: 0000 Flags: 0x06 LE General Discoverable Mode BR/EDR Not Supported ... bluetoothctl># Attempt to acquire Broadcast Source transport bluetoothctl>transport.acquire /org/bluez/hci0/pac_bcast0/fd0 ... < HCI Command: LE Set Extended Advertising Data (0x08|0x0037) plen 255 Handle: 0x01 Operation: Complete extended advertising data (0x03) Fragment preference: Minimize fragmentation (0x01) Data length: 0x0e Service Data: Broadcast Audio Announcement (0x1852) Broadcast ID: 11371620 (0xad8464) Service Data: Public Broadcast Announcement (0x1856) Data[2]: 0000 Flags: 0x06 LE General Discoverable Mode BR/EDR Not Supported Link: https://github.com/bluez/bluez/issues/1117 Fixes: eca0ae4aea66 ("Bluetooth: Add initial implementation of BIS connections") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
4 daysBluetooth: Fix NULL pointer deference on eir_get_service_dataLuiz Augusto von Dentz
The len parameter is considered optional so it can be NULL so it cannot be used for skipping to next entry of EIR_SERVICE_DATA. Fixes: 8f9ae5b3ae80 ("Bluetooth: eir: Add helpers for managing service data") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
5 dayswifi: cfg80211: use kfree_sensitive() for connkeys cleanupZilin Guan
The nl80211_parse_connkeys() function currently uses kfree() to release the 'result' structure in error handling paths. However, if an error occurs due to result->def being less than 0, the 'result' structure may contain sensitive information. To prevent potential leakage of sensitive data, replace kfree() with kfree_sensitive() when freeing 'result'. This change aligns with the approach used in its caller, nl80211_join_ibss(), enhancing the overall security of the wireless subsystem. Signed-off-by: Zilin Guan <zilin@seu.edu.cn> Link: https://patch.msgid.link/20250523110156.4017111-1-zilin@seu.edu.cn Signed-off-by: Johannes Berg <johannes.berg@intel.com>
6 daysMerge tag 'for-net-2025-06-05' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - MGMT: Fix UAF on mgmt_remove_adv_monitor_complete - MGMT: Protect mgmt_pending list with its own lock - hci_core: fix list_for_each_entry_rcu usage - btintel_pcie: Increase the tx and rx descriptor count - btintel_pcie: Reduce driver buffer posting to prevent race condition - btintel_pcie: Fix driver not posting maximum rx buffers * tag 'for-net-2025-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: MGMT: Protect mgmt_pending list with its own lock Bluetooth: MGMT: Fix UAF on mgmt_remove_adv_monitor_complete Bluetooth: btintel_pcie: Reduce driver buffer posting to prevent race condition Bluetooth: btintel_pcie: Increase the tx and rx descriptor count Bluetooth: btintel_pcie: Fix driver not posting maximum rx buffers Bluetooth: hci_core: fix list_for_each_entry_rcu usage ==================== Link: https://patch.msgid.link/20250605191136.904411-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
6 daysnet_sched: sch_sfq: fix a potential crash on gso_skb handlingEric Dumazet
SFQ has an assumption of always being able to queue at least one packet. However, after the blamed commit, sch->q.len can be inflated by packets in sch->gso_skb, and an enqueue() on an empty SFQ qdisc can be followed by an immediate drop. Fix sfq_drop() to properly clear q->tail in this situation. Tested: ip netns add lb ip link add dev to-lb type veth peer name in-lb netns lb ethtool -K to-lb tso off # force qdisc to requeue gso_skb ip netns exec lb ethtool -K in-lb gro on # enable NAPI ip link set dev to-lb up ip -netns lb link set dev in-lb up ip addr add dev to-lb 192.168.20.1/24 ip -netns lb addr add dev in-lb 192.168.20.2/24 tc qdisc replace dev to-lb root sfq limit 100 ip netns exec lb netserver netperf -H 192.168.20.2 -l 100 & netperf -H 192.168.20.2 -l 100 & netperf -H 192.168.20.2 -l 100 & netperf -H 192.168.20.2 -l 100 & Fixes: a53851e2c321 ("net: sched: explicit locking in gso_cpu fallback") Reported-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de> Closes: https://lore.kernel.org/netdev/9da42688-bfaa-4364-8797-e9271f3bdaef@hetzner-cloud.de/ Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250606165127.3629486-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 daystreewide, timers: Rename from_timer() to timer_container_of()Ingo Molnar
Move this API to the canonical timer_*() namespace. [ tglx: Redone against pre rc1 ] Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/aB2X0jCKQO56WdMt@gmail.com
10 daysMerge tag 'net-6.16-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from CAN, wireless, Bluetooth, and Netfilter. Current release - regressions: - Revert "kunit: configs: Enable CONFIG_INIT_STACK_ALL_PATTERN in all_tests", makes kunit error out if compiler is old - wifi: iwlwifi: mvm: fix assert on suspend - rxrpc: fix return from none_validate_challenge() Current release - new code bugs: - ovpn: couple of fixes for socket cleanup and UDP-tunnel teardown - can: kvaser_pciefd: refine error prone echo_skb_max handling logic - fix net_devmem_bind_dmabuf() stub when DEVMEM not compiled - eth: airoha: fixes for config / accel in bridge mode Previous releases - regressions: - Bluetooth: hci_qca: move the SoC type check to the right place, fix GPIO integration - prevent a NULL deref in rtnl_create_link() after locking changes - fix udp gso skb_segment after pull from frag_list - hv_netvsc: fix potential deadlock in netvsc_vf_setxdp() Previous releases - always broken: - netfilter: - nf_nat: also check reverse tuple to obtain clashing entry - nf_set_pipapo_avx2: fix initial map fill (zeroing) - fix the helper for incremental update of packet checksums after modifying the IP address, used by ILA and BPF - eth: - stmmac: prevent div by 0 when clock rate is misconfigured - ice: fix Tx scheduler handling of XDP and changing queue count - eth: fix support for the RGMII interface when delays configured" * tag 'net-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (76 commits) calipso: unlock rcu before returning -EAFNOSUPPORT seg6: Fix validation of nexthop addresses net: prevent a NULL deref in rtnl_create_link() net: annotate data-races around cleanup_net_task selftests: drv-net: tso: make bkg() wait for socat to quit selftests: drv-net: tso: fix the GRE device name selftests: drv-net: add configs for the TSO test wireguard: device: enable threaded NAPI netlink: specs: rt-link: decode ip6gre netlink: specs: rt-link: add missing byte-order properties net: wwan: mhi_wwan_mbim: use correct mux_id for multiplexing wifi: cfg80211/mac80211: correctly parse S1G beacon optional elements net: dsa: b53: do not touch DLL_IQQD on bcm53115 net: dsa: b53: allow RGMII for bcm63xx RGMII ports net: dsa: b53: do not configure bcm63xx's IMP port interface net: dsa: b53: do not enable RGMII delay on bcm63xx net: dsa: b53: do not enable EEE on bcm63xx net: ti: icssg-prueth: Fix swapped TX stats for MII interfaces. selftests: netfilter: nft_nat.sh: add test for reverse clash with nat netfilter: nf_nat: also check reverse tuple to obtain clashing entry ...
10 daysBluetooth: MGMT: Protect mgmt_pending list with its own lockLuiz Augusto von Dentz
This uses a mutex to protect from concurrent access of mgmt_pending list which can cause crashes like: ================================================================== BUG: KASAN: slab-use-after-free in hci_sock_get_channel+0x60/0x68 net/bluetooth/hci_sock.c:91 Read of size 2 at addr ffff0000c48885b2 by task syz.4.334/7318 CPU: 0 UID: 0 PID: 7318 Comm: syz.4.334 Not tainted 6.15.0-rc7-syzkaller-g187899f4124a #0 PREEMPT Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025 Call trace: show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:466 (C) __dump_stack+0x30/0x40 lib/dump_stack.c:94 dump_stack_lvl+0xd8/0x12c lib/dump_stack.c:120 print_address_description+0xa8/0x254 mm/kasan/report.c:408 print_report+0x68/0x84 mm/kasan/report.c:521 kasan_report+0xb0/0x110 mm/kasan/report.c:634 __asan_report_load2_noabort+0x20/0x2c mm/kasan/report_generic.c:379 hci_sock_get_channel+0x60/0x68 net/bluetooth/hci_sock.c:91 mgmt_pending_find+0x7c/0x140 net/bluetooth/mgmt_util.c:223 pending_find net/bluetooth/mgmt.c:947 [inline] remove_adv_monitor+0x44/0x1a4 net/bluetooth/mgmt.c:5445 hci_mgmt_cmd+0x780/0xc00 net/bluetooth/hci_sock.c:1712 hci_sock_sendmsg+0x544/0xbb0 net/bluetooth/hci_sock.c:1832 sock_sendmsg_nosec net/socket.c:712 [inline] __sock_sendmsg net/socket.c:727 [inline] sock_write_iter+0x25c/0x378 net/socket.c:1131 new_sync_write fs/read_write.c:591 [inline] vfs_write+0x62c/0x97c fs/read_write.c:684 ksys_write+0x120/0x210 fs/read_write.c:736 __do_sys_write fs/read_write.c:747 [inline] __se_sys_write fs/read_write.c:744 [inline] __arm64_sys_write+0x7c/0x90 fs/read_write.c:744 __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline] invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49 el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132 do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151 el0_svc+0x58/0x17c arch/arm64/kernel/entry-common.c:767 el0t_64_sync_handler+0x78/0x108 arch/arm64/kernel/entry-common.c:786 el0t_64_sync+0x198/0x19c arch/arm64/kernel/entry.S:600 Allocated by task 7037: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x40/0x78 mm/kasan/common.c:68 kasan_save_alloc_info+0x44/0x54 mm/kasan/generic.c:562 poison_kmalloc_redzone mm/kasan/common.c:377 [inline] __kasan_kmalloc+0x9c/0xb4 mm/kasan/common.c:394 kasan_kmalloc include/linux/kasan.h:260 [inline] __do_kmalloc_node mm/slub.c:4327 [inline] __kmalloc_noprof+0x2fc/0x4c8 mm/slub.c:4339 kmalloc_noprof include/linux/slab.h:909 [inline] sk_prot_alloc+0xc4/0x1f0 net/core/sock.c:2198 sk_alloc+0x44/0x3ac net/core/sock.c:2254 bt_sock_alloc+0x4c/0x300 net/bluetooth/af_bluetooth.c:148 hci_sock_create+0xa8/0x194 net/bluetooth/hci_sock.c:2202 bt_sock_create+0x14c/0x24c net/bluetooth/af_bluetooth.c:132 __sock_create+0x43c/0x91c net/socket.c:1541 sock_create net/socket.c:1599 [inline] __sys_socket_create net/socket.c:1636 [inline] __sys_socket+0xd4/0x1c0 net/socket.c:1683 __do_sys_socket net/socket.c:1697 [inline] __se_sys_socket net/socket.c:1695 [inline] __arm64_sys_socket+0x7c/0x94 net/socket.c:1695 __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline] invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49 el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132 do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151 el0_svc+0x58/0x17c arch/arm64/kernel/entry-common.c:767 el0t_64_sync_handler+0x78/0x108 arch/arm64/kernel/entry-common.c:786 el0t_64_sync+0x198/0x19c arch/arm64/kernel/entry.S:600 Freed by task 6607: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x40/0x78 mm/kasan/common.c:68 kasan_save_free_info+0x58/0x70 mm/kasan/generic.c:576 poison_slab_object mm/kasan/common.c:247 [inline] __kasan_slab_free+0x68/0x88 mm/kasan/common.c:264 kasan_slab_free include/linux/kasan.h:233 [inline] slab_free_hook mm/slub.c:2380 [inline] slab_free mm/slub.c:4642 [inline] kfree+0x17c/0x474 mm/slub.c:4841 sk_prot_free net/core/sock.c:2237 [inline] __sk_destruct+0x4f4/0x760 net/core/sock.c:2332 sk_destruct net/core/sock.c:2360 [inline] __sk_free+0x320/0x430 net/core/sock.c:2371 sk_free+0x60/0xc8 net/core/sock.c:2382 sock_put include/net/sock.h:1944 [inline] mgmt_pending_free+0x88/0x118 net/bluetooth/mgmt_util.c:290 mgmt_pending_remove+0xec/0x104 net/bluetooth/mgmt_util.c:298 mgmt_set_powered_complete+0x418/0x5cc net/bluetooth/mgmt.c:1355 hci_cmd_sync_work+0x204/0x33c net/bluetooth/hci_sync.c:334 process_one_work+0x7e8/0x156c kernel/workqueue.c:3238 process_scheduled_works kernel/workqueue.c:3319 [inline] worker_thread+0x958/0xed8 kernel/workqueue.c:3400 kthread+0x5fc/0x75c kernel/kthread.c:464 ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:847 Fixes: a380b6cff1a2 ("Bluetooth: Add generic mgmt helper API") Closes: https://syzkaller.appspot.com/bug?extid=0a7039d5d9986ff4ecec Closes: https://syzkaller.appspot.com/bug?extid=cc0cc52e7f43dc9e6df1 Reported-by: syzbot+0a7039d5d9986ff4ecec@syzkaller.appspotmail.com Tested-by: syzbot+0a7039d5d9986ff4ecec@syzkaller.appspotmail.com Tested-by: syzbot+cc0cc52e7f43dc9e6df1@syzkaller.appspotmail.com Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
10 daysBluetooth: MGMT: Fix UAF on mgmt_remove_adv_monitor_completeLuiz Augusto von Dentz
This reworks MGMT_OP_REMOVE_ADV_MONITOR to not use mgmt_pending_add to avoid crashes like bellow: ================================================================== BUG: KASAN: slab-use-after-free in mgmt_remove_adv_monitor_complete+0xe5/0x540 net/bluetooth/mgmt.c:5406 Read of size 8 at addr ffff88801c53f318 by task kworker/u5:5/5341 CPU: 0 UID: 0 PID: 5341 Comm: kworker/u5:5 Not tainted 6.15.0-syzkaller-10402-g4cb6c8af8591 #0 PREEMPT(full) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 Workqueue: hci0 hci_cmd_sync_work Call Trace: <TASK> dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:408 [inline] print_report+0xd2/0x2b0 mm/kasan/report.c:521 kasan_report+0x118/0x150 mm/kasan/report.c:634 mgmt_remove_adv_monitor_complete+0xe5/0x540 net/bluetooth/mgmt.c:5406 hci_cmd_sync_work+0x261/0x3a0 net/bluetooth/hci_sync.c:334 process_one_work kernel/workqueue.c:3238 [inline] process_scheduled_works+0xade/0x17b0 kernel/workqueue.c:3321 worker_thread+0x8a0/0xda0 kernel/workqueue.c:3402 kthread+0x711/0x8a0 kernel/kthread.c:464 ret_from_fork+0x3fc/0x770 arch/x86/kernel/process.c:148 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245 </TASK> Allocated by task 5987: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3e/0x80 mm/kasan/common.c:68 poison_kmalloc_redzone mm/kasan/common.c:377 [inline] __kasan_kmalloc+0x93/0xb0 mm/kasan/common.c:394 kasan_kmalloc include/linux/kasan.h:260 [inline] __kmalloc_cache_noprof+0x230/0x3d0 mm/slub.c:4358 kmalloc_noprof include/linux/slab.h:905 [inline] kzalloc_noprof include/linux/slab.h:1039 [inline] mgmt_pending_new+0x65/0x240 net/bluetooth/mgmt_util.c:252 mgmt_pending_add+0x34/0x120 net/bluetooth/mgmt_util.c:279 remove_adv_monitor+0x103/0x1b0 net/bluetooth/mgmt.c:5454 hci_mgmt_cmd+0x9c9/0xef0 net/bluetooth/hci_sock.c:1719 hci_sock_sendmsg+0x6ca/0xef0 net/bluetooth/hci_sock.c:1839 sock_sendmsg_nosec net/socket.c:712 [inline] __sock_sendmsg+0x219/0x270 net/socket.c:727 sock_write_iter+0x258/0x330 net/socket.c:1131 new_sync_write fs/read_write.c:593 [inline] vfs_write+0x548/0xa90 fs/read_write.c:686 ksys_write+0x145/0x250 fs/read_write.c:738 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f Freed by task 5989: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3e/0x80 mm/kasan/common.c:68 kasan_save_free_info+0x46/0x50 mm/kasan/generic.c:576 poison_slab_object mm/kasan/common.c:247 [inline] __kasan_slab_free+0x62/0x70 mm/kasan/common.c:264 kasan_slab_free include/linux/kasan.h:233 [inline] slab_free_hook mm/slub.c:2380 [inline] slab_free mm/slub.c:4642 [inline] kfree+0x18e/0x440 mm/slub.c:4841 mgmt_pending_foreach+0xc9/0x120 net/bluetooth/mgmt_util.c:242 mgmt_index_removed+0x10d/0x2f0 net/bluetooth/mgmt.c:9366 hci_sock_bind+0xbe9/0x1000 net/bluetooth/hci_sock.c:1314 __sys_bind_socket net/socket.c:1810 [inline] __sys_bind+0x2c3/0x3e0 net/socket.c:1841 __do_sys_bind net/socket.c:1846 [inline] __se_sys_bind net/socket.c:1844 [inline] __x64_sys_bind+0x7a/0x90 net/socket.c:1844 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f Fixes: 66bd095ab5d4 ("Bluetooth: advmon offload MSFT remove monitor") Closes: https://syzkaller.appspot.com/bug?extid=feb0dc579bbe30a13190 Reported-by: syzbot+feb0dc579bbe30a13190@syzkaller.appspotmail.com Tested-by: syzbot+feb0dc579bbe30a13190@syzkaller.appspotmail.com Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
10 daysBluetooth: hci_core: fix list_for_each_entry_rcu usagePauli Virtanen
Releasing + re-acquiring RCU lock inside list_for_each_entry_rcu() loop body is not correct. Fix by taking the update-side hdev->lock instead. Fixes: c7eaf80bfb0c ("Bluetooth: Fix hci_link_tx_to RCU lock usage") Signed-off-by: Pauli Virtanen <pav@iki.fi> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
10 dayscalipso: unlock rcu before returning -EAFNOSUPPORTEric Dumazet
syzbot reported that a recent patch forgot to unlock rcu in the error path. Adopt the convention that netlbl_conn_setattr() is already using. Fixes: 6e9f2df1c550 ("calipso: Don't call calipso functions for AF_INET sk.") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Kuniyuki Iwashima <kuniyu@amazon.com> Acked-by: Paul Moore <paul@paul-moore.com> Link: https://patch.msgid.link/20250604133826.1667664-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 daysseg6: Fix validation of nexthop addressesIdo Schimmel
The kernel currently validates that the length of the provided nexthop address does not exceed the specified length. This can lead to the kernel reading uninitialized memory if user space provided a shorter length than the specified one. Fix by validating that the provided length exactly matches the specified one. Fixes: d1df6fd8a1d2 ("ipv6: sr: define core operations for seg6local lightweight tunnel") Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20250604113252.371528-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 daysnet: prevent a NULL deref in rtnl_create_link()Eric Dumazet
At the time rtnl_create_link() is running, dev->netdev_ops is NULL, we must not use netdev_lock_ops() or risk a NULL deref if CONFIG_NET_SHAPER is defined. Use netif_set_group() instead of dev_set_group(). RIP: 0010:netdev_need_ops_lock include/net/netdev_lock.h:33 [inline] RIP: 0010:netdev_lock_ops include/net/netdev_lock.h:41 [inline] RIP: 0010:dev_set_group+0xc0/0x230 net/core/dev_api.c:82 Call Trace: <TASK> rtnl_create_link+0x748/0xd10 net/core/rtnetlink.c:3674 rtnl_newlink_create+0x25c/0xb00 net/core/rtnetlink.c:3813 __rtnl_newlink net/core/rtnetlink.c:3940 [inline] rtnl_newlink+0x16d6/0x1c70 net/core/rtnetlink.c:4055 rtnetlink_rcv_msg+0x7cf/0xb70 net/core/rtnetlink.c:6944 netlink_rcv_skb+0x208/0x470 net/netlink/af_netlink.c:2534 netlink_unicast_kernel net/netlink/af_netlink.c:1313 [inline] netlink_unicast+0x75b/0x8d0 net/netlink/af_netlink.c:1339 netlink_sendmsg+0x805/0xb30 net/netlink/af_netlink.c:1883 sock_sendmsg_nosec net/socket.c:712 [inline] Reported-by: syzbot+9fc858ba0312b42b577e@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/6840265f.a00a0220.d4325.0009.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: 7e4d784f5810 ("net: hold netdev instance lock during rtnetlink operations") Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250604105815.1516973-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 daysnet: annotate data-races around cleanup_net_taskEric Dumazet
from_cleanup_net() reads cleanup_net_task locklessly. Add READ_ONCE()/WRITE_ONCE() annotations to avoid a potential KCSAN warning, even if the race is harmless. Fixes: 0734d7c3d93c ("net: expedite synchronize_net() for cleanup_net()") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Link: https://patch.msgid.link/20250604093928.1323333-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 daysMerge tag 'wireless-2025-06-05' of ↵Paolo Abeni
https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Johannes Berg says: ==================== Couple of quick fixes: - iwlwifi/iwlmld crash on certain error paths - iwlwifi/iwlmld regulatory data mixup - iwlwifi/iwlmld suspend/resume fix - iwlwifi MSI (without -X) fix - cfg80211/mac80211 S1G parsing fixes * tag 'wireless-2025-06-05' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: wifi: cfg80211/mac80211: correctly parse S1G beacon optional elements wifi: iwlwifi: mld: Move regulatory domain initialization wifi: iwlwifi: pcie: fix non-MSIX handshake register wifi: iwlwifi: mld: avoid panic on init failure wifi: iwlwifi: mvm: fix assert on suspend ==================== Link: https://patch.msgid.link/20250605095443.17874-6-johannes@sipsolutions.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>
11 dayswifi: cfg80211/mac80211: correctly parse S1G beacon optional elementsLachlan Hodges
S1G beacons are not traditional beacons but a type of extension frame. Extension frames contain the frame control and duration fields, followed by zero or more optional fields before the frame body. These optional fields are distinct from the variable length elements. The presence of optional fields is indicated in the frame control field. To correctly locate the elements offset, the frame control must be parsed to identify which optional fields are present. Currently, mac80211 parses S1G beacons based on fixed assumptions about the frame layout, without inspecting the frame control field. This can result in incorrect offsets to the "variable" portion of the frame. Properly parse S1G beacon frames by using the field lengths defined in IEEE 802.11-2024, section 9.3.4.3, ensuring that the elements offset is calculated accurately. Fixes: 9eaffe5078ca ("cfg80211: convert S1G beacon to scan results") Fixes: cd418ba63f0c ("mac80211: convert S1G beacon to scan results") Signed-off-by: Lachlan Hodges <lachlan.hodges@morsemicro.com> Link: https://patch.msgid.link/20250603053538.468562-1-lachlan.hodges@morsemicro.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
11 daysnetfilter: nf_nat: also check reverse tuple to obtain clashing entryFlorian Westphal
The logic added in the blamed commit was supposed to only omit nat source port allocation if neither the existing nor the new entry are subject to NAT. However, its not enough to lookup the conntrack based on the proposed tuple, we must also check the reverse direction. Otherwise there are esoteric cases where the collision is in the reverse direction because that colliding connection has a port rewrite, but the new entry doesn't. In this case, we only check the new entry and then erronously conclude that no clash exists anymore. The existing (udp) tuple is: a:p -> b:P, with nat translation to s:P, i.e. pure daddr rewrite, reverse tuple in conntrack table is s:P -> a:p. When another UDP packet is sent directly to s, i.e. a:p->s:P, this is correctly detected as a colliding entry: tuple is taken by existing reply tuple in reverse direction. But the colliding conntrack is only searched for with unreversed direction, and we can't find such entry matching a:p->s:P. The incorrect conclusion is that the clashing entry has timed out and that no port address translation is required. Such conntrack will then be discarded at nf_confirm time because the proposed reverse direction clashes with an existing mapping in the conntrack table. Search for the reverse tuple too, this will then check the NAT bits of the colliding entry and triggers port reallocation. Followp patch extends nft_nat.sh selftest to cover this scenario. The IPS_SEQ_ADJUST change is also a bug fix: Instead of checking for SEQ_ADJ this tested for SEEN_REPLY and ASSURED by accident -- _BIT is only for use with the test_bit() API. This bug has little consequence in practice, because the sequence number adjustments are only useful for TCP which doesn't support clash resolution. The existing test case (conntrack_reverse_clash.sh) exercise a race condition path (parallel conntrack creation on different CPUs), so the colliding entries have neither SEEN_REPLY nor ASSURED set. Thanks to Yafang Shao and Shaun Brady for an initial investigation of this bug. Fixes: d8f84a9bc7c4 ("netfilter: nf_nat: don't try nat source port reallocation for reverse dir clash") Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1795 Reported-by: Yafang Shao <laoar.shao@gmail.com> Reported-by: Shaun Brady <brady.1345@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Tested-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
11 daysnetfilter: nf_set_pipapo_avx2: fix initial map fillFlorian Westphal
If the first field doesn't cover the entire start map, then we must zero out the remainder, else we leak those bits into the next match round map. The early fix was incomplete and did only fix up the generic C implementation. A followup patch adds a test case to nft_concat_range.sh. Fixes: 791a615b7ad2 ("netfilter: nf_set_pipapo: fix initial map fill") Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 daysMerge tag 'nfs-for-6.16-1' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds
Pull NFS clent updates from Anna Schumaker: "New Features: - Implement the Sunrpc rfc2203 rpcsec_gss sequence number cache - Add support for FALLOC_FL_ZERO_RANGE on NFS v4.2 - Add a localio sysfs attribute Stable Fixes: - Fix double-unlock bug in nfs_return_empty_folio() - Don't check for OPEN feature support in v4.1 - Always probe for LOCALIO support asynchronously - Prevent hang on NFS mounts with xprtsec=[m]tls Other Bugfixes: - xattr handlers should check for absent nfs filehandles - Fix setattr caching of TIME_[MODIFY|ACCESS]_SET when timestamps are delegated - Fix listxattr to return selinux security labels - Connect to NFSv3 DS using TLS if MDS connection uses TLS - Clear SB_RDONLY before getting a superblock, and ignore when remounting - Fix incorrect handling of NFS error codes in nfs4_do_mkdir() - Various nfs_localio fixes from Neil Brown that include fixing an rcu compilation error found by older gcc versions. - Update stats on flexfiles pNFS DSes when receiving NFS4ERR_DELAY Cleanups: - Add a refcount tracker for struct net in the nfs_client - Allow FREE_STATEID to clean up delegations - Always set NLINK even if the server doesn't support it - Cleanups to the NFS folio writeback code - Remove dead code from xs_tcp_tls_setup_socket()" * tag 'nfs-for-6.16-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (30 commits) flexfiles/pNFS: update stats on NFS4ERR_DELAY for v4.1 DSes nfs_localio: change nfsd_file_put_local() to take a pointer to __rcu pointer nfs_localio: protect race between nfs_uuid_put() and nfs_close_local_fh() nfs_localio: duplicate nfs_close_local_fh() nfs_localio: simplify interface to nfsd for getting nfsd_file nfs_localio: always hold nfsd net ref with nfsd_file ref nfs_localio: use cmpxchg() to install new nfs_file_localio SUNRPC: Remove dead code from xs_tcp_tls_setup_socket() SUNRPC: Prevent hang on NFS mount with xprtsec=[m]tls nfs: fix incorrect handling of large-number NFS errors in nfs4_do_mkdir() nfs: ignore SB_RDONLY when remounting nfs nfs: clear SB_RDONLY before getting superblock NFS: always probe for LOCALIO support asynchronously pnfs/flexfiles: connect to NFSv3 DS using TLS if MDS connection uses TLS NFS: add localio to sysfs nfs: use writeback_iter directly nfs: refactor nfs_do_writepage nfs: don't return AOP_WRITEPAGE_ACTIVATE from nfs_do_writepage nfs: fold nfs_page_async_flush into nfs_do_writepage NFSv4: Always set NLINK even if the server doesn't support it ...
13 daysMerge tag 'vfs-6.16-rc1.netfs' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull netfs updates from Christian Brauner: - The main API document has been extensively updated/rewritten - Fix an oops in write-retry due to mis-resetting the I/O iterator - Fix the recording of transferred bytes for short DIO reads - Fix a request's work item to not require a reference, thereby avoiding the need to get rid of it in BH/IRQ context - Fix waiting and waking to be consistent about the waitqueue used - Remove NETFS_SREQ_SEEK_DATA_READ, NETFS_INVALID_WRITE, NETFS_ICTX_WRITETHROUGH, NETFS_READ_HOLE_CLEAR, NETFS_RREQ_DONT_UNLOCK_FOLIOS, and NETFS_RREQ_BLOCKED - Reorder structs to eliminate holes - Remove netfs_io_request::ractl - Only provide proc_link field if CONFIG_PROC_FS=y - Remove folio_queue::marks3 - Fix undifferentiation of DIO reads from unbuffered reads * tag 'vfs-6.16-rc1.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: netfs: Fix undifferentiation of DIO reads from unbuffered reads netfs: Fix wait/wake to be consistent about the waitqueue used netfs: Fix the request's work item to not require a ref netfs: Fix setting of transferred bytes with short DIO reads netfs: Fix oops in write-retry from mis-resetting the subreq iterator fs/netfs: remove unused flag NETFS_RREQ_BLOCKED fs/netfs: remove unused flag NETFS_RREQ_DONT_UNLOCK_FOLIOS folio_queue: remove unused field `marks3` fs/netfs: declare field `proc_link` only if CONFIG_PROC_FS=y fs/netfs: remove `netfs_io_request.ractl` fs/netfs: reorder struct fields to eliminate holes fs/netfs: remove unused enum choice NETFS_READ_HOLE_CLEAR fs/netfs: remove unused flag NETFS_ICTX_WRITETHROUGH fs/netfs: remove unused source NETFS_INVALID_WRITE fs/netfs: remove unused flag NETFS_SREQ_SEEK_DATA_READ
14 daysnet: fix udp gso skb_segment after pull from frag_listShiming Cheng
Commit a1e40ac5b5e9 ("net: gso: fix udp gso fraglist segmentation after pull from frag_list") detected invalid geometry in frag_list skbs and redirects them from skb_segment_list to more robust skb_segment. But some packets with modified geometry can also hit bugs in that code. We don't know how many such cases exist. Addressing each one by one also requires touching the complex skb_segment code, which risks introducing bugs for other types of skbs. Instead, linearize all these packets that fail the basic invariants on gso fraglist skbs. That is more robust. If only part of the fraglist payload is pulled into head_skb, it will always cause exception when splitting skbs by skb_segment. For detailed call stack information, see below. Valid SKB_GSO_FRAGLIST skbs - consist of two or more segments - the head_skb holds the protocol headers plus first gso_size - one or more frag_list skbs hold exactly one segment - all but the last must be gso_size Optional datapath hooks such as NAT and BPF (bpf_skb_pull_data) can modify fraglist skbs, breaking these invariants. In extreme cases they pull one part of data into skb linear. For UDP, this causes three payloads with lengths of (11,11,10) bytes were pulled tail to become (12,10,10) bytes. The skbs no longer meets the above SKB_GSO_FRAGLIST conditions because payload was pulled into head_skb, it needs to be linearized before pass to regular skb_segment. skb_segment+0xcd0/0xd14 __udp_gso_segment+0x334/0x5f4 udp4_ufo_fragment+0x118/0x15c inet_gso_segment+0x164/0x338 skb_mac_gso_segment+0xc4/0x13c __skb_gso_segment+0xc4/0x124 validate_xmit_skb+0x9c/0x2c0 validate_xmit_skb_list+0x4c/0x80 sch_direct_xmit+0x70/0x404 __dev_queue_xmit+0x64c/0xe5c neigh_resolve_output+0x178/0x1c4 ip_finish_output2+0x37c/0x47c __ip_finish_output+0x194/0x240 ip_finish_output+0x20/0xf4 ip_output+0x100/0x1a0 NF_HOOK+0xc4/0x16c ip_forward+0x314/0x32c ip_rcv+0x90/0x118 __netif_receive_skb+0x74/0x124 process_backlog+0xe8/0x1a4 __napi_poll+0x5c/0x1f8 net_rx_action+0x154/0x314 handle_softirqs+0x154/0x4b8 [118.376811] [C201134] rxq0_pus: [name:bug&]kernel BUG at net/core/skbuff.c:4278! [118.376829] [C201134] rxq0_pus: [name:traps&]Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP [118.470774] [C201134] rxq0_pus: [name:mrdump&]Kernel Offset: 0x178cc00000 from 0xffffffc008000000 [118.470810] [C201134] rxq0_pus: [name:mrdump&]PHYS_OFFSET: 0x40000000 [118.470827] [C201134] rxq0_pus: [name:mrdump&]pstate: 60400005 (nZCv daif +PAN -UAO) [118.470848] [C201134] rxq0_pus: [name:mrdump&]pc : [0xffffffd79598aefc] skb_segment+0xcd0/0xd14 [118.470900] [C201134] rxq0_pus: [name:mrdump&]lr : [0xffffffd79598a5e8] skb_segment+0x3bc/0xd14 [118.470928] [C201134] rxq0_pus: [name:mrdump&]sp : ffffffc008013770 Fixes: a1e40ac5b5e9 ("gso: fix udp gso fraglist segmentation after pull from frag_list") Signed-off-by: Shiming Cheng <shiming.cheng@mediatek.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-05-30bpf: Fix L4 csum update on IPv6 in CHECKSUM_COMPLETEPaul Chaignon
In Cilium, we use bpf_csum_diff + bpf_l4_csum_replace to, among other things, update the L4 checksum after reverse SNATing IPv6 packets. That use case is however not currently supported and leads to invalid skb->csum values in some cases. This patch adds support for IPv6 address changes in bpf_l4_csum_update via a new flag. When calling bpf_l4_csum_replace in Cilium, it ends up calling inet_proto_csum_replace_by_diff: 1: void inet_proto_csum_replace_by_diff(__sum16 *sum, struct sk_buff *skb, 2: __wsum diff, bool pseudohdr) 3: { 4: if (skb->ip_summed != CHECKSUM_PARTIAL) { 5: csum_replace_by_diff(sum, diff); 6: if (skb->ip_summed == CHECKSUM_COMPLETE && pseudohdr) 7: skb->csum = ~csum_sub(diff, skb->csum); 8: } else if (pseudohdr) { 9: *sum = ~csum_fold(csum_add(diff, csum_unfold(*sum))); 10: } 11: } The bug happens when we're in the CHECKSUM_COMPLETE state. We've just updated one of the IPv6 addresses. The helper now updates the L4 header checksum on line 5. Next, it updates skb->csum on line 7. It shouldn't. For an IPv6 packet, the updates of the IPv6 address and of the L4 checksum will cancel each other. The checksums are set such that computing a checksum over the packet including its checksum will result in a sum of 0. So the same is true here when we update the L4 checksum on line 5. We'll update it as to cancel the previous IPv6 address update. Hence skb->csum should remain untouched in this case. The same bug doesn't affect IPv4 packets because, in that case, three fields are updated: the IPv4 address, the IP checksum, and the L4 checksum. The change to the IPv4 address and one of the checksums still cancel each other in skb->csum, but we're left with one checksum update and should therefore update skb->csum accordingly. That's exactly what inet_proto_csum_replace_by_diff does. This special case for IPv6 L4 checksums is also described atop inet_proto_csum_replace16, the function we should be using in this case. This patch introduces a new bpf_l4_csum_replace flag, BPF_F_IPV6, to indicate that we're updating the L4 checksum of an IPv6 packet. When the flag is set, inet_proto_csum_replace_by_diff will skip the skb->csum update. Fixes: 7d672345ed295 ("bpf: add generic bpf_csum_diff helper") Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://patch.msgid.link/96a6bc3a443e6f0b21ff7b7834000e17fb549e05.1748509484.git.paul.chaignon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-30net: Fix checksum update for ILA adj-transportPaul Chaignon
During ILA address translations, the L4 checksums can be handled in different ways. One of them, adj-transport, consist in parsing the transport layer and updating any found checksum. This logic relies on inet_proto_csum_replace_by_diff and produces an incorrect skb->csum when in state CHECKSUM_COMPLETE. This bug can be reproduced with a simple ILA to SIR mapping, assuming packets are received with CHECKSUM_COMPLETE: $ ip a show dev eth0 14: eth0@if15: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000 link/ether 62:ae:35:9e:0f:8d brd ff:ff:ff:ff:ff:ff link-netnsid 0 inet6 3333:0:0:1::c078/64 scope global valid_lft forever preferred_lft forever inet6 fd00:10:244:1::c078/128 scope global nodad valid_lft forever preferred_lft forever inet6 fe80::60ae:35ff:fe9e:f8d/64 scope link proto kernel_ll valid_lft forever preferred_lft forever $ ip ila add loc_match fd00:10:244:1 loc 3333:0:0:1 \ csum-mode adj-transport ident-type luid dev eth0 Then I hit [fd00:10:244:1::c078]:8000 with a server listening only on [3333:0:0:1::c078]:8000. With the bug, the SYN packet is dropped with SKB_DROP_REASON_TCP_CSUM after inet_proto_csum_replace_by_diff changed skb->csum. The translation and drop are visible on pwru [1] traces: IFACE TUPLE FUNC eth0:9 [fd00:10:244:3::3d8]:51420->[fd00:10:244:1::c078]:8000(tcp) ipv6_rcv eth0:9 [fd00:10:244:3::3d8]:51420->[fd00:10:244:1::c078]:8000(tcp) ip6_rcv_core eth0:9 [fd00:10:244:3::3d8]:51420->[fd00:10:244:1::c078]:8000(tcp) nf_hook_slow eth0:9 [fd00:10:244:3::3d8]:51420->[fd00:10:244:1::c078]:8000(tcp) inet_proto_csum_replace_by_diff eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) tcp_v6_early_demux eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) ip6_route_input eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) ip6_input eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) ip6_input_finish eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) ip6_protocol_deliver_rcu eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) raw6_local_deliver eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) ipv6_raw_deliver eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) tcp_v6_rcv eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) __skb_checksum_complete eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) kfree_skb_reason(SKB_DROP_REASON_TCP_CSUM) eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) skb_release_head_state eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) skb_release_data eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) skb_free_head eth0:9 [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp) kfree_skbmem This is happening because inet_proto_csum_replace_by_diff is updating skb->csum when it shouldn't. The L4 checksum is updated such that it "cancels" the IPv6 address change in terms of checksum computation, so the impact on skb->csum is null. Note this would be different for an IPv4 packet since three fields would be updated: the IPv4 address, the IP checksum, and the L4 checksum. Two would cancel each other and skb->csum would still need to be updated to take the L4 checksum change into account. This patch fixes it by passing an ipv6 flag to inet_proto_csum_replace_by_diff, to skip the skb->csum update if we're in the IPv6 case. Note the behavior of the only other user of inet_proto_csum_replace_by_diff, the BPF subsystem, is left as is in this patch and fixed in the subsequent patch. With the fix, using the reproduction from above, I can confirm skb->csum is not touched by inet_proto_csum_replace_by_diff and the TCP SYN proceeds to the application after the ILA translation. Link: https://github.com/cilium/pwru [1] Fixes: 65d7ab8de582 ("net: Identifier Locator Addressing module") Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://patch.msgid.link/b5539869e3550d46068504feb02d37653d939c0b.1748509484.git.paul.chaignon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-30hv_netvsc: fix potential deadlock in netvsc_vf_setxdp()Saurabh Sengar
The MANA driver's probe registers netdevice via the following call chain: mana_probe() register_netdev() register_netdevice() register_netdevice() calls notifier callback for netvsc driver, holding the netdev mutex via netdev_lock_ops(). Further this netvsc notifier callback end up attempting to acquire the same lock again in dev_xdp_propagate() leading to deadlock. netvsc_netdev_event() netvsc_vf_setxdp() dev_xdp_propagate() This deadlock was not observed so far because net_shaper_ops was never set, and thus the lock was effectively a no-op in this case. Fix this by using netif_xdp_propagate() instead of dev_xdp_propagate() to avoid recursive locking in this path. And, since no deadlock is observed on the other path which is via netvsc_probe, add the lock exclusivly for that path. Also, clean up the unregistration path by removing the unnecessary call to netvsc_vf_setxdp(), since unregister_netdevice_many_notify() already performs this cleanup via dev_xdp_uninstall(). Fixes: 97246d6d21c2 ("net: hold netdev instance lock during ndo_bpf") Cc: stable@vger.kernel.org Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com> Tested-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com> Link: https://patch.msgid.link/1748513910-23963-1-git-send-email-ssengar@linux.microsoft.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-30net: Fix net_devmem_bind_dmabuf for non-devmem configsPranjal Shrivastava
Fix the signature of the net_devmem_bind_dmabuf API for CONFIG_NET_DEVMEM=n. Fixes: bd61848900bf ("net: devmem: Implement TX path") Signed-off-by: Pranjal Shrivastava <praan@google.com> Link: https://patch.msgid.link/20250528211058.1826608-1-praan@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-30net: dsa: tag_brcm: legacy: fix pskb_may_pull lengthÁlvaro Fernández Rojas
BRCM_LEG_PORT_ID was incorrectly used for pskb_may_pull length. The correct check is BRCM_LEG_TAG_LEN + VLAN_HLEN, or 10 bytes. Fixes: 964dbf186eaa ("net: dsa: tag_brcm: add support for legacy tags") Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20250529124406.2513779-1-noltari@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-30Bluetooth: L2CAP: Fix not responding with L2CAP_CR_LE_ENCRYPTIONLuiz Augusto von Dentz
Depending on the security set the response to L2CAP_LE_CONN_REQ shall be just L2CAP_CR_LE_ENCRYPTION if only encryption when BT_SECURITY_MEDIUM is selected since that means security mode 2 which doesn't require authentication which is something that is covered in the qualification test L2CAP/LE/CFC/BV-25-C. Link: https://github.com/bluez/bluez/issues/1270 Fixes: 27e2d4c8d28b ("Bluetooth: Add basic LE L2CAP connect request receiving support") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-05-30Bluetooth: MGMT: reject malformed HCI_CMD_SYNC commandsDmitry Antipov
In 'mgmt_hci_cmd_sync()', check whether the size of parameters passed in 'struct mgmt_cp_hci_cmd_sync' matches the total size of the data (i.e. 'sizeof(struct mgmt_cp_hci_cmd_sync)' plus trailing bytes). Otherwise, large invalid 'params_len' will cause 'hci_cmd_sync_alloc()' to do 'skb_put_data()' from an area beyond the one actually passed to 'mgmt_hci_cmd_sync()'. Reported-by: syzbot+5fe2d5bfbfbec0b675a0@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=5fe2d5bfbfbec0b675a0 Fixes: 827af4787e74 ("Bluetooth: MGMT: Add initial implementation of MGMT_OP_HCI_CMD_SYNC") Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-05-29net: tipc: fix refcount warning in tipc_aead_encryptCharalampos Mitrodimas
syzbot reported a refcount warning [1] caused by calling get_net() on a network namespace that is being destroyed (refcount=0). This happens when a TIPC discovery timer fires during network namespace cleanup. The recently added get_net() call in commit e279024617134 ("net/tipc: fix slab-use-after-free Read in tipc_aead_encrypt_done") attempts to hold a reference to the network namespace. However, if the namespace is already being destroyed, its refcount might be zero, leading to the use-after-free warning. Replace get_net() with maybe_get_net(), which safely checks if the refcount is non-zero before incrementing it. If the namespace is being destroyed, return -ENODEV early, after releasing the bearer reference. [1]: https://lore.kernel.org/all/68342b55.a70a0220.253bc2.0091.GAE@google.com/T/#m12019cf9ae77e1954f666914640efa36d52704a2 Reported-by: syzbot+f0c4a4aba757549ae26c@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/68342b55.a70a0220.253bc2.0091.GAE@google.com/T/#m12019cf9ae77e1954f666914640efa36d52704a2 Fixes: e27902461713 ("net/tipc: fix slab-use-after-free Read in tipc_aead_encrypt_done") Signed-off-by: Charalampos Mitrodimas <charmitro@posteo.net> Reviewed-by: Tung Nguyen <tung.quang.nguyen@est.tech> Link: https://patch.msgid.link/20250527-net-tipc-warning-v2-1-df3dc398a047@posteo.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-29rxrpc: Fix return from none_validate_challenge()David Howells
Fix the return value of none_validate_challenge() to be explicitly true (which indicates the source packet should simply be discarded) rather than implicitly true (because rxrpc_abort_conn() always returns -EPROTO which gets converted to true). Note that this change doesn't change the behaviour of the code (which is correct by accident) and, in any case, we *shouldn't* get a CHALLENGE packet to an rxnull connection (ie. no security). Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lists.infradead.org/pipermail/linux-afs/2025-April/009738.html Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Jakub Kicinski <kuba@kernel.org> cc: "David S. Miller" <davem@davemloft.net> cc: Eric Dumazet <edumazet@google.com> cc: Paolo Abeni <pabeni@redhat.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/10720.1748358103@warthog.procyon.org.uk Fixes: 5800b1cf3fd8 ("rxrpc: Allow CHALLENGEs to the passed to the app for a RESPONSE") Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-28page_pool: Fix use-after-free in page_pool_recycle_in_ringDong Chenchen
syzbot reported a uaf in page_pool_recycle_in_ring: BUG: KASAN: slab-use-after-free in lock_release+0x151/0xa30 kernel/locking/lockdep.c:5862 Read of size 8 at addr ffff8880286045a0 by task syz.0.284/6943 CPU: 0 UID: 0 PID: 6943 Comm: syz.0.284 Not tainted 6.13.0-rc3-syzkaller-gdfa94ce54f41 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0x169/0x550 mm/kasan/report.c:489 kasan_report+0x143/0x180 mm/kasan/report.c:602 lock_release+0x151/0xa30 kernel/locking/lockdep.c:5862 __raw_spin_unlock_bh include/linux/spinlock_api_smp.h:165 [inline] _raw_spin_unlock_bh+0x1b/0x40 kernel/locking/spinlock.c:210 spin_unlock_bh include/linux/spinlock.h:396 [inline] ptr_ring_produce_bh include/linux/ptr_ring.h:164 [inline] page_pool_recycle_in_ring net/core/page_pool.c:707 [inline] page_pool_put_unrefed_netmem+0x748/0xb00 net/core/page_pool.c:826 page_pool_put_netmem include/net/page_pool/helpers.h:323 [inline] page_pool_put_full_netmem include/net/page_pool/helpers.h:353 [inline] napi_pp_put_page+0x149/0x2b0 net/core/skbuff.c:1036 skb_pp_recycle net/core/skbuff.c:1047 [inline] skb_free_head net/core/skbuff.c:1094 [inline] skb_release_data+0x6c4/0x8a0 net/core/skbuff.c:1125 skb_release_all net/core/skbuff.c:1190 [inline] __kfree_skb net/core/skbuff.c:1204 [inline] sk_skb_reason_drop+0x1c9/0x380 net/core/skbuff.c:1242 kfree_skb_reason include/linux/skbuff.h:1263 [inline] __skb_queue_purge_reason include/linux/skbuff.h:3343 [inline] root cause is: page_pool_recycle_in_ring ptr_ring_produce spin_lock(&r->producer_lock); WRITE_ONCE(r->queue[r->producer++], ptr) //recycle last page to pool page_pool_release page_pool_scrub page_pool_empty_ring ptr_ring_consume page_pool_return_page //release all page __page_pool_destroy free_percpu(pool->recycle_stats); free(pool) //free spin_unlock(&r->producer_lock); //pool->ring uaf read recycle_stat_inc(pool, ring); page_pool can be free while page pool recycle the last page in ring. Add producer-lock barrier to page_pool_release to prevent the page pool from being free before all pages have been recycled. recycle_stat_inc() is empty when CONFIG_PAGE_POOL_STATS is not enabled, which will trigger Wempty-body build warning. Add definition for pool stat macro to fix warning. Suggested-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/netdev/20250513083123.3514193-1-dongchenchen2@huawei.com Fixes: ff7d6b27f894 ("page_pool: refurbish version of page_pool code") Reported-by: syzbot+204a4382fcb3311f3858@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=204a4382fcb3311f3858 Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Reviewed-by: Mina Almasry <almasrymina@google.com> Link: https://patch.msgid.link/20250527114152.3119109-1-dongchenchen2@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-28Fix sock_exceed_buf_limit not being triggered in __sk_mem_raise_allocatedTengteng Yang
When a process under memory pressure is not part of any cgroup and the charged flag is false, trace_sock_exceed_buf_limit was not called as expected. This regression was introduced by commit 2def8ff3fdb6 ("sock: Code cleanup on __sk_mem_raise_allocated()"). The fix changes the default value of charged to true while preserving existing logic. Fixes: 2def8ff3fdb6 ("sock: Code cleanup on __sk_mem_raise_allocated()") Signed-off-by: Abel Wu <wuyun.abel@bytedance.com> Signed-off-by: Tengteng Yang <yangtengteng@bytedance.com> Link: https://patch.msgid.link/20250527030419.67693-1-yangtengteng@bytedance.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-28Merge tag 'bpf-next-6.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Pull bpf updates from Alexei Starovoitov: - Fix and improve BTF deduplication of identical BTF types (Alan Maguire and Andrii Nakryiko) - Support up to 12 arguments in BPF trampoline on arm64 (Xu Kuohai and Alexis Lothoré) - Support load-acquire and store-release instructions in BPF JIT on riscv64 (Andrea Parri) - Fix uninitialized values in BPF_{CORE,PROBE}_READ macros (Anton Protopopov) - Streamline allowed helpers across program types (Feng Yang) - Support atomic update for hashtab of BPF maps (Hou Tao) - Implement json output for BPF helpers (Ihor Solodrai) - Several s390 JIT fixes (Ilya Leoshkevich) - Various sockmap fixes (Jiayuan Chen) - Support mmap of vmlinux BTF data (Lorenz Bauer) - Support BPF rbtree traversal and list peeking (Martin KaFai Lau) - Tests for sockmap/sockhash redirection (Michal Luczaj) - Introduce kfuncs for memory reads into dynptrs (Mykyta Yatsenko) - Add support for dma-buf iterators in BPF (T.J. Mercier) - The verifier support for __bpf_trap() (Yonghong Song) * tag 'bpf-next-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (135 commits) bpf, arm64: Remove unused-but-set function and variable. selftests/bpf: Add tests with stack ptr register in conditional jmp bpf: Do not include stack ptr register in precision backtracking bookkeeping selftests/bpf: enable many-args tests for arm64 bpf, arm64: Support up to 12 function arguments bpf: Check rcu_read_lock_trace_held() in bpf_map_lookup_percpu_elem() bpf: Avoid __bpf_prog_ret0_warn when jit fails bpftool: Add support for custom BTF path in prog load/loadall selftests/bpf: Add unit tests with __bpf_trap() kfunc bpf: Warn with __bpf_trap() kfunc maybe due to uninitialized variable bpf: Remove special_kfunc_set from verifier selftests/bpf: Add test for open coded dmabuf_iter selftests/bpf: Add test for dmabuf_iter bpf: Add open coded dmabuf iterator bpf: Add dmabuf iterator dma-buf: Rename debugfs symbols bpf: Fix error return value in bpf_copy_from_user_dynptr libbpf: Use mmap to parse vmlinux BTF from sysfs selftests: bpf: Add a test for mmapable vmlinux BTF btf: Allow mmap of vmlinux btf ...
2025-05-28Merge tag 'net-next-6.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Paolo Abeni: "Core: - Implement the Device Memory TCP transmit path, allowing zero-copy data transmission on top of TCP from e.g. GPU memory to the wire. - Move all the IPv6 routing tables management outside the RTNL scope, under its own lock and RCU. The route control path is now 3x times faster. - Convert queue related netlink ops to instance lock, reducing again the scope of the RTNL lock. This improves the control plane scalability. - Refactor the software crc32c implementation, removing unneeded abstraction layers and improving significantly the related micro-benchmarks. - Optimize the GRO engine for UDP-tunneled traffic, for a 10% performance improvement in related stream tests. - Cover more per-CPU storage with local nested BH locking; this is a prep work to remove the current per-CPU lock in local_bh_disable() on PREMPT_RT. - Introduce and use nlmsg_payload helper, combining buffer bounds verification with accessing payload carried by netlink messages. Netfilter: - Rewrite the procfs conntrack table implementation, improving considerably the dump performance. A lot of user-space tools still use this interface. - Implement support for wildcard netdevice in netdev basechain and flowtables. - Integrate conntrack information into nft trace infrastructure. - Export set count and backend name to userspace, for better introspection. BPF: - BPF qdisc support: BPF-qdisc can be implemented with BPF struct_ops programs and can be controlled in similar way to traditional qdiscs using the "tc qdisc" command. - Refactor the UDP socket iterator, addressing long standing issues WRT duplicate hits or missed sockets. Protocols: - Improve TCP receive buffer auto-tuning and increase the default upper bound for the receive buffer; overall this improves the single flow maximum thoughput on 200Gbs link by over 60%. - Add AFS GSSAPI security class to AF_RXRPC; it provides transport security for connections to the AFS fileserver and VL server. - Improve TCP multipath routing, so that the sources address always matches the nexthop device. - Introduce SO_PASSRIGHTS for AF_UNIX, to allow disabling SCM_RIGHTS, and thus preventing DoS caused by passing around problematic FDs. - Retire DCCP socket. DCCP only receives updates for bugs, and major distros disable it by default. Its removal allows for better organisation of TCP fields to reduce the number of cache lines hit in the fast path. - Extend TCP drop-reason support to cover PAWS checks. Driver API: - Reorganize PTP ioctl flag support to require an explicit opt-in for the drivers, avoiding the problem of drivers not rejecting new unsupported flags. - Converted several device drivers to timestamping APIs. - Introduce per-PHY ethtool dump helpers, improving the support for dump operations targeting PHYs. Tests and tooling: - Add support for classic netlink in user space C codegen, so that ynl-c can now read, create and modify links, routes addresses and qdisc layer configuration. - Add ynl sub-types for binary attributes, allowing ynl-c to output known struct instead of raw binary data, clarifying the classic netlink output. - Extend MPTCP selftests to improve the code-coverage. - Add tests for XDP tail adjustment in AF_XDP. New hardware / drivers: - OpenVPN virtual driver: offload OpenVPN data channels processing to the kernel-space, increasing the data transfer throughput WRT the user-space implementation. - Renesas glue driver for the gigabit ethernet RZ/V2H(P) SoC. - Broadcom asp-v3.0 ethernet driver. - AMD Renoir ethernet device. - ReakTek MT9888 2.5G ethernet PHY driver. - Aeonsemi 10G C45 PHYs driver. Drivers: - Ethernet high-speed NICs: - nVidia/Mellanox (mlx5): - refactor the steering table handling to significantly reduce the amount of memory used - add support for complex matches in H/W flow steering - improve flow streeing error handling - convert to netdev instance locking - Intel (100G, ice, igb, ixgbe, idpf): - ice: add switchdev support for LLDP traffic over VF - ixgbe: add firmware manipulation and regions devlink support - igb: introduce support for frame transmission premption - igb: adds persistent NAPI configuration - idpf: introduce RDMA support - idpf: add initial PTP support - Meta (fbnic): - extend hardware stats coverage - add devlink dev flash support - Broadcom (bnxt): - add support for RX-side device memory TCP - Wangxun (txgbe): - implement support for udp tunnel offload - complete PTP and SRIOV support for AML 25G/10G devices - Ethernet NICs embedded and virtual: - Google (gve): - add device memory TCP TX support - Amazon (ena): - support persistent per-NAPI config - Airoha: - add H/W support for L2 traffic offload - add per flow stats for flow offloading - RealTek (rtl8211): add support for WoL magic packet - Synopsys (stmmac): - dwmac-socfpga 1000BaseX support - add Loongson-2K3000 support - introduce support for hardware-accelerated VLAN stripping - Broadcom (bcmgenet): - expose more H/W stats - Freescale (enetc, dpaa2-eth): - enetc: add MAC filter, VLAN filter RSS and loopback support - dpaa2-eth: convert to H/W timestamping APIs - vxlan: convert FDB table to rhashtable, for better scalabilty - veth: apply qdisc backpressure on full ring to reduce TX drops - Ethernet switches: - Microchip (kzZ88x3): add ETS scheduler support - Ethernet PHYs: - RealTek (rtl8211): - add support for WoL magic packet - add support for PHY LEDs - CAN: - Adds RZ/G3E CANFD support to the rcar_canfd driver. - Preparatory work for CAN-XL support. - Add self-tests framework with support for CAN physical interfaces. - WiFi: - mac80211: - scan improvements with multi-link operation (MLO) - Qualcomm (ath12k): - enable AHB support for IPQ5332 - add monitor interface support to QCN9274 - add multi-link operation support to WCN7850 - add 802.11d scan offload support to WCN7850 - monitor mode for WCN7850, better 6 GHz regulatory - Qualcomm (ath11k): - restore hibernation support - MediaTek (mt76): - WiFi-7 improvements - implement support for mt7990 - Intel (iwlwifi): - enhanced multi-link single-radio (EMLSR) support on 5 GHz links - rework device configuration - RealTek (rtw88): - improve throughput for RTL8814AU - RealTek (rtw89): - add multi-link operation support - STA/P2P concurrency improvements - support different SAR configs by antenna - Bluetooth: - introduce HCI Driver protocol - btintel_pcie: do not generate coredump for diagnostic events - btusb: add HCI Drv commands for configuring altsetting - btusb: add RTL8851BE device 0x0bda:0xb850 - btusb: add new VID/PID 13d3/3584 for MT7922 - btusb: add new VID/PID 13d3/3630 and 13d3/3613 for MT7925 - btnxpuart: implement host-wakeup feature" * tag 'net-next-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1611 commits) selftests/bpf: Fix bpf selftest build warning selftests: netfilter: Fix skip of wildcard interface test net: phy: mscc: Stop clearing the the UDPv4 checksum for L2 frames net: openvswitch: Fix the dead loop of MPLS parse calipso: Don't call calipso functions for AF_INET sk. selftests/tc-testing: Add a test for HFSC eltree double add with reentrant enqueue behaviour on netem net_sched: hfsc: Address reentrant enqueue adding class to eltree twice octeontx2-pf: QOS: Refactor TC_HTB_LEAF_DEL_LAST callback octeontx2-pf: QOS: Perform cache sync on send queue teardown net: mana: Add support for Multi Vports on Bare metal net: devmem: ncdevmem: remove unused variable net: devmem: ksft: upgrade rx test to send 1K data net: devmem: ksft: add 5 tuple FS support net: devmem: ksft: add exit_wait to make rx test pass net: devmem: ksft: add ipv4 support net: devmem: preserve sockc_err page_pool: fix ugly page_pool formatting net: devmem: move list_add to net_devmem_bind_dmabuf. selftests: netfilter: nft_queue.sh: include file transfer duration in log message net: phy: mscc: Fix memory leak when using one step timestamping ...