summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-07-17neighbour: Move neigh_find_table() to neigh_get().Kuniyuki Iwashima
neigh_valid_get_req() calls neigh_find_table() to fetch neigh_tables[]. neigh_find_table() uses rcu_dereference_rtnl(), but RTNL actually does not protect it at all; neigh_table_clear() can be called without RTNL and only waits for RCU readers by synchronize_rcu(). Fortunately, there is no bug because IPv4 is built-in, IPv6 cannot be unloaded, and DECNET was removed. To fetch neigh_tables[] by rcu_dereference() later, let's move neigh_find_table() from neigh_valid_get_req() to neigh_get(). Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250716221221.442239-5-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17neighbour: Allocate skb in neigh_get().Kuniyuki Iwashima
We will remove RTNL for neigh_get() and run it under RCU instead. neigh_get_reply() and pneigh_get_reply() allocate skb with GFP_KERNEL. Let's move the allocation before __dev_get_by_index() in neigh_get(). Now, neigh_get_reply() and pneigh_get_reply() are inlined and rtnl_unicast() is factorised. We will convert pneigh_lookup() to __pneigh_lookup() later. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250716221221.442239-4-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17neighbour: Move two validations from neigh_get() to neigh_valid_get_req().Kuniyuki Iwashima
We will remove RTNL for neigh_get() and run it under RCU instead. neigh_get() returns -EINVAL in the following cases: * NDA_DST is not specified * Both ndm->ndm_ifindex and NTF_PROXY are not specified These validations do not require RCU. Let's move them to neigh_valid_get_req(). While at it, the extack string for the first case is replaced with NL_SET_ERR_ATTR_MISS(). Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250716221221.442239-3-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17neighbour: Make neigh_valid_get_req() return ndmsg.Kuniyuki Iwashima
neigh_get() passes 4 local variable pointers to neigh_valid_get_req(). If it returns a pointer of struct ndmsg, we do not need to pass two of them. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250716221221.442239-2-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17Merge branch 'ethtool-rss-support-rss_set-via-netlink'Jakub Kicinski
Jakub Kicinski says: ==================== ethtool: rss: support RSS_SET via Netlink Support configuring RSS settings via Netlink. Creating and removing contexts remains for the following series. v2: https://lore.kernel.org/20250714222729.743282-1-kuba@kernel.org v1: https://lore.kernel.org/20250711015303.3688717-1-kuba@kernel.org ==================== Link: https://patch.msgid.link/20250716000331.1378807-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17selftests: drv-net: rss_api: test input-xfrm and hash fieldsJakub Kicinski
Test configuring input-xfrm and hash fields with all the limitations. Tested on mlx5 (CX6): # ./ksft-net-drv/drivers/net/hw/rss_api.py TAP version 13 1..10 ok 1 rss_api.test_rxfh_nl_set_fail ok 2 rss_api.test_rxfh_nl_set_indir ok 3 rss_api.test_rxfh_nl_set_indir_ctx ok 4 rss_api.test_rxfh_indir_ntf ok 5 rss_api.test_rxfh_indir_ctx_ntf ok 6 rss_api.test_rxfh_nl_set_key ok 7 rss_api.test_rxfh_fields ok 8 rss_api.test_rxfh_fields_set ok 9 rss_api.test_rxfh_fields_set_xfrm ok 10 rss_api.test_rxfh_fields_ntf # Totals: pass:10 fail:0 xfail:0 xpass:0 skip:0 error:0 Link: https://patch.msgid.link/20250716000331.1378807-12-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17ethtool: rss: support setting flow hashing fieldsJakub Kicinski
Add support for ETHTOOL_SRXFH (setting hashing fields) in RSS_SET. The tricky part is dealing with symmetric hashing. In netlink user can change the hashing fields and symmetric hash in one request, in IOCTL the two used to be set via different uAPI requests. Since fields and hash function config are still separate driver callbacks - changes to the two are not atomic. Keep things simple and validate the settings against both pre- and post- change ones. Meaning that we will reject the config request if user tries to correct the flow fields and set input_xfrm in one request, or disables input_xfrm and makes flow fields non-symmetric. We can adjust it later if there's a real need. Starting simple feels right, and potentially partially applying the settings isn't nice, either. Reviewed-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20250716000331.1378807-11-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17ethtool: rss: support setting input-xfrm via NetlinkJakub Kicinski
Support configuring symmetric hashing via Netlink. We have the flow field config prepared as part of SET handling, so scan it for conflicts instead of querying the driver again. Reviewed-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20250716000331.1378807-10-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17netlink: specs: define input-xfrm enum in the specJakub Kicinski
Help YNL decode the values for input-xfrm by defining the possible values in the spec. Don't define "no change" as it's an IOCTL artifact with no use in Netlink. With this change on mlx5 input-xfrm gets decoded: # ynl --family ethtool --dump rss-get [{'header': {'dev-index': 2, 'dev-name': 'eth0'}, 'hfunc': 1, 'hkey': b'V\xa8\xf9\x9 ...', 'indir': [0, 1, ... ], 'input-xfrm': {'sym-or-xor'}, <<< 'flow-hash': {'ah4': {'ip-dst', 'ip-src'}, 'ah6': {'ip-dst', 'ip-src'}, 'esp4': {'ip-dst', 'ip-src'}, 'esp6': {'ip-dst', 'ip-src'}, 'ip4': {'ip-dst', 'ip-src'}, 'ip6': {'ip-dst', 'ip-src'}, 'tcp4': {'l4-b-0-1', 'ip-dst', 'l4-b-2-3', 'ip-src'}, 'tcp6': {'l4-b-0-1', 'ip-dst', 'l4-b-2-3', 'ip-src'}, 'udp4': {'l4-b-0-1', 'ip-dst', 'l4-b-2-3', 'ip-src'}, 'udp6': {'l4-b-0-1', 'ip-dst', 'l4-b-2-3', 'ip-src'}} }] Reviewed-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20250716000331.1378807-9-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17selftests: drv-net: rss_api: test setting hashing key via NetlinkJakub Kicinski
Test setting hashing key via Netlink. # ./tools/testing/selftests/drivers/net/hw/rss_api.py TAP version 13 1..7 ok 1 rss_api.test_rxfh_nl_set_fail ok 2 rss_api.test_rxfh_nl_set_indir ok 3 rss_api.test_rxfh_nl_set_indir_ctx ok 4 rss_api.test_rxfh_indir_ntf ok 5 rss_api.test_rxfh_indir_ctx_ntf ok 6 rss_api.test_rxfh_nl_set_key ok 7 rss_api.test_rxfh_fields # Totals: pass:7 fail:0 xfail:0 xpass:0 skip:0 error:0 Link: https://patch.msgid.link/20250716000331.1378807-8-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17ethtool: rss: support setting hkey via NetlinkJakub Kicinski
Support setting RSS hashing key via ethtool Netlink. Use the Netlink policy to make sure user doesn't pass an empty key, "resetting" the key is not a thing. Reviewed-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20250716000331.1378807-7-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17ethtool: rss: support setting hfunc via NetlinkJakub Kicinski
Support setting RSS hash function / algo via ethtool Netlink. Like IOCTL we don't validate that the function is within the range known to the kernel. The drivers do a pretty good job validating the inputs, and the IDs are technically "dynamically queried" rather than part of uAPI. Only change should be that in Netlink we don't support user explicitly passing ETH_RSS_HASH_NO_CHANGE (0), if no change is requested the attribute should be absent. The ETH_RSS_HASH_NO_CHANGE is retained in driver-facing API for consistency (not that I see a strong reason for it). Reviewed-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20250716000331.1378807-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17selftests: drv-net: rss_api: test setting indirection table via NetlinkJakub Kicinski
Test setting indirection table via Netlink. # ./tools/testing/selftests/drivers/net/hw/rss_api.py TAP version 13 1..6 ok 1 rss_api.test_rxfh_nl_set_fail ok 2 rss_api.test_rxfh_nl_set_indir ok 3 rss_api.test_rxfh_nl_set_indir_ctx ok 4 rss_api.test_rxfh_indir_ntf ok 5 rss_api.test_rxfh_indir_ctx_ntf ok 6 rss_api.test_rxfh_fields # Totals: pass:6 fail:0 xfail:0 xpass:0 skip:0 error:0 Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/20250716000331.1378807-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17tools: ynl: support packing binary arrays of scalarsJakub Kicinski
We support decoding a binary type with a scalar subtype already, add support for sending such arrays to the kernel. While at it also support using "None" to indicate that the binary attribute should be empty. I couldn't decide whether empty binary should be [] or None, but there should be no harm in supporting both. Link: https://patch.msgid.link/20250716000331.1378807-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17selftests: drv-net: rss_api: factor out checking min queue countJakub Kicinski
Multiple tests check min queue count, create a helper. Link: https://patch.msgid.link/20250716000331.1378807-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17ethtool: rss: initial RSS_SET (indirection table handling)Jakub Kicinski
Add initial support for RSS_SET, for now only operations on the indirection table are supported. Unlike the ioctl don't check if at least one parameter is being changed. This is how other ethtool-nl ops behave, so pick the ethtool-nl consistency vs copying ioctl behavior. There are two special cases here: 1) resetting the table to defaults; 2) support for tables of different size. For (1) I use an empty Netlink attribute (array of size 0). (2) may require some background. AFAICT a lot of modern devices allow allocating RSS tables of different sizes. mlx5 can upsize its tables, bnxt has some "table size calculation", and Intel folks asked about RSS table sizing in context of resource allocation in the past. The ethtool IOCTL API has a concept of table size, but right now the user is expected to provide a table exactly the size the device requests. Some drivers may change the table size at runtime (in response to queue count changes) but the user is not in control of this. What's not great is that all RSS contexts share the same table size. For example a device with 128 queues enabled, 16 RSS contexts 8 queues in each will likely have 256 entry tables for each of the 16 contexts, while 32 would be more than enough given each context only has 8 queues. To address this the Netlink API should avoid enforcing table size at the uAPI level, and should allow the user to express the min table size they expect. To fully solve (2) we will need more driver plumbing but at the uAPI level this patch allows the user to specify a table size smaller than what the device advertises. The device table size must be a multiple of the user requested table size. We then replicate the user-provided table to fill the full device size table. This addresses the "allow the user to express the min table size" objective, while not enforcing any fixed size. From Netlink perspective .get_rxfh_indir_size() is now de facto the "max" table size supported by the device. We may choose to support table replication in ethtool, too, when we actually plumb this thru the device APIs. Initially I was considering moving full pattern generation to the kernel (which queues to use, at which frequency and what min sequence length). I don't think this complexity would buy us much and most if not all devices have pow-2 table sizes, which simplifies the replication a lot. Reviewed-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20250716000331.1378807-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17net/mlx5e: TX, Fix dma unmapping for devmem txDragos Tatulea
net_iovs should have the dma address set to 0 so that netmem_dma_unmap_page_attrs() correctly skips the unmap. This was not done in mlx5 when support for devmem tx was added and resulted in the splat below when the platform iommu was enabled. This patch addresses the issue by using netmem_dma_unmap_addr_set() which handles the net_iov case when setting the dma address. A small refactoring of mlx5e_dma_push() was required to be able to use this API. The function was split in two versions and each version called accordingly. Note that netmem_dma_unmap_addr_set() introduces an additional if case. Splat: WARNING: CPU: 14 PID: 2587 at drivers/iommu/dma-iommu.c:1228 iommu_dma_unmap_page+0x7d/0x90 Modules linked in: [...] Unloaded tainted modules: i10nm_edac(E):1 fjes(E):1 CPU: 14 UID: 0 PID: 2587 Comm: ncdevmem Tainted: G S E 6.15.0+ #3 PREEMPT(voluntary) Tainted: [S]=CPU_OUT_OF_SPEC, [E]=UNSIGNED_MODULE Hardware name: HPE ProLiant DL380 Gen10 Plus/ProLiant DL380 Gen10 Plus, BIOS U46 06/01/2022 RIP: 0010:iommu_dma_unmap_page+0x7d/0x90 Code: [...] RSP: 0000:ff6b1e3ea0b2fc58 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ff46ef2d0a2340c8 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001 RBP: 0000000000000001 R08: 0000000000000000 R09: ffffffff8827a120 R10: 0000000000000000 R11: 0000000000000000 R12: 00000000d8000000 R13: 0000000000000008 R14: 0000000000000001 R15: 0000000000000000 FS: 00007feb69adf740(0000) GS:ff46ef2c779f1000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007feb69cca000 CR3: 0000000154b97006 CR4: 0000000000773ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> dma_unmap_page_attrs+0x227/0x250 mlx5e_poll_tx_cq+0x163/0x510 [mlx5_core] mlx5e_napi_poll+0x94/0x720 [mlx5_core] __napi_poll+0x28/0x1f0 net_rx_action+0x33a/0x420 ? mlx5e_completion_event+0x3d/0x40 [mlx5_core] handle_softirqs+0xe8/0x2f0 __irq_exit_rcu+0xcd/0xf0 common_interrupt+0x47/0xa0 asm_common_interrupt+0x26/0x40 RIP: 0033:0x7feb69cd08ec Code: [...] RSP: 002b:00007ffc01b8c880 EFLAGS: 00000246 RAX: 00000000c3a60cf7 RBX: 0000000000045e12 RCX: 000000000000000e RDX: 00000000000035b4 RSI: 0000000000000000 RDI: 00007ffc01b8c8c0 RBP: 00007ffc01b8c8b0 R08: 0000000000000000 R09: 0000000000000064 R10: 00007ffc01b8c8c0 R11: 0000000000000000 R12: 00007feb69cca000 R13: 00007ffc01b90e48 R14: 0000000000427e18 R15: 00007feb69d07000 </TASK> Cc: Mina Almasry <almasrymina@google.com> Reported-by: Stanislav Fomichev <stfomichev@gmail.com> Closes: https://lore.kernel.org/all/aFM6r9kFHeTdj-25@mini-arch/ Fixes: 5a842c288cfa ("net/mlx5e: Add TX support for netmems") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/1752649242-147678-1-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.16-rc7). Conflicts: Documentation/netlink/specs/ovpn.yaml 880d43ca9aa4 ("netlink: specs: clean up spaces in brackets") af52020fc599 ("ovpn: reject unexpected netlink attributes") drivers/net/phy/phy_device.c a44312d58e78 ("net: phy: Don't register LEDs for genphy") f0f2b992d818 ("net: phy: Don't register LEDs for genphy") https://lore.kernel.org/20250710114926.7ec3a64f@kernel.org drivers/net/wireless/intel/iwlwifi/fw/regulatory.c drivers/net/wireless/intel/iwlwifi/mld/regulatory.c 5fde0fcbd760 ("wifi: iwlwifi: mask reserved bits in chan_state_active_bitmap") ea045a0de3b9 ("wifi: iwlwifi: add support for accepting raw DSM tables by firmware") net/ipv6/mcast.c ae3264a25a46 ("ipv6: mcast: Delay put pmc->idev in mld_del_delrec()") a8594c956cc9 ("ipv6: mcast: Avoid a duplicate pointer check in mld_del_delrec()") https://lore.kernel.org/8cc52891-3653-4b03-a45e-05464fe495cf@kernel.org No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17Merge tag 'net-6.16-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from Bluetooth, CAN, WiFi and Netfilter. More code here than I would have liked. That said, better now than next week. Nothing particularly scary stands out. The improvement to the OpenVPN input validation is a bit large but better get them in before the code makes it to a final release. Some of the changes we got from sub-trees could have been split better between the fix and -next refactoring, IMHO, that has been communicated. We have one known regression in a TI AM65 board not getting link. The investigation is going a bit slow, a number of people are on vacation. We'll try to wrap it up, but don't think it should hold up the release. Current release - fix to a fix: - Bluetooth: L2CAP: fix attempting to adjust outgoing MTU, it broke some headphones and speakers Current release - regressions: - wifi: ath12k: fix packets received in WBM error ring with REO LUT enabled, fix Rx performance regression - wifi: iwlwifi: - fix crash due to a botched indexing conversion - mask reserved bits in chan_state_active_bitmap, avoid FW assert() Current release - new code bugs: - nf_conntrack: fix crash due to removal of uninitialised entry - eth: airoha: fix potential UaF in airoha_npu_get() Previous releases - regressions: - net: fix segmentation after TCP/UDP fraglist GRO - af_packet: fix the SO_SNDTIMEO constraint not taking effect and a potential soft lockup waiting for a completion - rpl: fix UaF in rpl_do_srh_inline() for sneaky skb geometry - virtio-net: fix recursive rtnl_lock() during probe() - eth: stmmac: populate entire system_counterval_t in get_time_fn() - eth: libwx: fix a number of crashes in the driver Rx path - hv_netvsc: prevent IPv6 addrconf after IFF_SLAVE lost that meaning Previous releases - always broken: - mptcp: fix races in handling connection fallback to pure TCP - rxrpc: assorted error handling and race fixes - sched: another batch of "security" fixes for qdiscs (QFQ, HTB) - tls: always refresh the queue when reading sock, avoid UaF - phy: don't register LEDs for genphy, avoid deadlock - Bluetooth: btintel: check if controller is ISO capable on btintel_classify_pkt_type(), work around FW returning incorrect capabilities Misc: - make OpenVPN Netlink input checking more strict before it makes it to a final release - wifi: cfg80211: remove scan request n_channels __counted_by, it's only yielding false positives" * tag 'net-6.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (66 commits) rxrpc: Fix to use conn aborts for conn-wide failures rxrpc: Fix transmission of an abort in response to an abort rxrpc: Fix notification vs call-release vs recvmsg rxrpc: Fix recv-recv race of completed call rxrpc: Fix irq-disabled in local_bh_enable() selftests/tc-testing: Test htb_dequeue_tree with deactivation and row emptying net/sched: Return NULL when htb_lookup_leaf encounters an empty rbtree net: bridge: Do not offload IGMP/MLD messages selftests: Add test cases for vlan_filter modification during runtime net: vlan: fix VLAN 0 refcount imbalance of toggling filtering during runtime tls: always refresh the queue when reading sock virtio-net: fix recursived rtnl_lock() during probe() net/mlx5: Update the list of the PCI supported devices hv_netvsc: Set VF priv_flags to IFF_NO_ADDRCONF before open to prevent IPv6 addrconf phonet/pep: Move call to pn_skb_get_dst_sockaddr() earlier in pep_sock_accept() Bluetooth: L2CAP: Fix attempting to adjust outgoing MTU netfilter: nf_conntrack: fix crash due to removal of uninitialised entry net: fix segmentation after TCP/UDP fraglist GRO ipv6: mcast: Delay put pmc->idev in mld_del_delrec() net: airoha: fix potential use-after-free in airoha_npu_get() ...
2025-07-17Merge tag 'pm-6.16-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These address three issues introduced during the current development cycle and related to system suspend and hibernation, one triggering when asynchronous suspend of devices fails, one possibly affecting memory management in the core suspend code error path, and one due to duplicate filesystems freezing during system suspend: - Fix a deadlock that may occur on asynchronous device suspend failures due to missing completion updates in error paths (Rafael Wysocki) - Drop a misplaced pm_restore_gfp_mask() call, which may cause swap to be accessed too early if system suspend fails, from suspend_devices_and_enter() (Rafael Wysocki) - Remove duplicate filesystems_freeze/thaw() calls, which sometimes cause systems to be unable to resume, from enter_state() (Zihuan Zhang)" * tag 'pm-6.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM: sleep: Update power.completion for all devices on errors PM: suspend: clean up redundant filesystems_freeze/thaw() handling PM: suspend: Drop a misplaced pm_restore_gfp_mask() call
2025-07-17Merge tag 'for-net-2025-07-17' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - hci_sync: fix connectable extended advertising when using static random address - hci_core: fix typos in macros - hci_core: add missing braces when using macro parameters - hci_core: replace 'quirks' integer by 'quirk_flags' bitmap - SMP: If an unallowed command is received consider it a failure - SMP: Fix using HCI_ERROR_REMOTE_USER_TERM on timeout - L2CAP: Fix null-ptr-deref in l2cap_sock_resume_cb() - L2CAP: Fix attempting to adjust outgoing MTU - btintel: Check if controller is ISO capable on btintel_classify_pkt_type - btusb: QCA: Fix downloading wrong NVM for WCN6855 GF variant without board ID * tag 'for-net-2025-07-17' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: L2CAP: Fix attempting to adjust outgoing MTU Bluetooth: btusb: QCA: Fix downloading wrong NVM for WCN6855 GF variant without board ID Bluetooth: hci_dev: replace 'quirks' integer by 'quirk_flags' bitmap Bluetooth: hci_core: add missing braces when using macro parameters Bluetooth: hci_core: fix typos in macros Bluetooth: SMP: Fix using HCI_ERROR_REMOTE_USER_TERM on timeout Bluetooth: SMP: If an unallowed command is received consider it a failure Bluetooth: btintel: Check if controller is ISO capable on btintel_classify_pkt_type Bluetooth: hci_sync: fix connectable extended advertising when using static random address Bluetooth: Fix null-ptr-deref in l2cap_sock_resume_cb() ==================== Link: https://patch.msgid.link/20250717142849.537425-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17Merge branch 'rxrpc-miscellaneous-fixes'Jakub Kicinski
David Howells says: ==================== rxrpc: Miscellaneous fixes Here are some fixes for rxrpc: (1) Fix the calling of IP routing code with IRQs disabled. (2) Fix a recvmsg/recvmsg race when the first completes a call. (3) Fix a race between notification, recvmsg and sendmsg releasing a call. (4) Fix abort of abort. (5) Fix call-level aborts that should be connection-level aborts. ==================== Link: https://patch.msgid.link/20250717074350.3767366-1-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17rxrpc: Fix to use conn aborts for conn-wide failuresDavid Howells
Fix rxrpc to use connection-level aborts for things that affect the whole connection, such as the service ID not matching a local service. Fixes: 57af281e5389 ("rxrpc: Tidy up abort generation infrastructure") Reported-by: Jeffrey Altman <jaltman@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org Link: https://patch.msgid.link/20250717074350.3767366-6-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17rxrpc: Fix transmission of an abort in response to an abortDavid Howells
Under some circumstances, such as when a server socket is closing, ABORT packets will be generated in response to incoming packets. Unfortunately, this also may include generating aborts in response to incoming aborts - which may cause a cycle. It appears this may be made possible by giving the client a multicast address. Fix this such that rxrpc_reject_packet() will refuse to generate aborts in response to aborts. Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Junvyyang, Tencent Zhuque Lab <zhuque@tencent.com> cc: LePremierHomme <kwqcheii@proton.me> cc: Linus Torvalds <torvalds@linux-foundation.org> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org Link: https://patch.msgid.link/20250717074350.3767366-5-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17rxrpc: Fix notification vs call-release vs recvmsgDavid Howells
When a call is released, rxrpc takes the spinlock and removes it from ->recvmsg_q in an effort to prevent racing recvmsg() invocations from seeing the same call. Now, rxrpc_recvmsg() only takes the spinlock when actually removing a call from the queue; it doesn't, however, take it in the lead up to that when it checks to see if the queue is empty. It *does* hold the socket lock, which prevents a recvmsg/recvmsg race - but this doesn't prevent sendmsg from ending the call because sendmsg() drops the socket lock and relies on the call->user_mutex. Fix this by firstly removing the bit in rxrpc_release_call() that dequeues the released call and, instead, rely on recvmsg() to simply discard released calls (done in a preceding fix). Secondly, rxrpc_notify_socket() is abandoned if the call is already marked as released rather than trying to be clever by setting both pointers in call->recvmsg_link to NULL to trick list_empty(). This isn't perfect and can still race, resulting in a released call on the queue, but recvmsg() will now clean that up. Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Junvyyang, Tencent Zhuque Lab <zhuque@tencent.com> cc: LePremierHomme <kwqcheii@proton.me> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org Link: https://patch.msgid.link/20250717074350.3767366-4-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17rxrpc: Fix recv-recv race of completed callDavid Howells
If a call receives an event (such as incoming data), the call gets placed on the socket's queue and a thread in recvmsg can be awakened to go and process it. Once the thread has picked up the call off of the queue, further events will cause it to be requeued, and once the socket lock is dropped (recvmsg uses call->user_mutex to allow the socket to be used in parallel), a second thread can come in and its recvmsg can pop the call off the socket queue again. In such a case, the first thread will be receiving stuff from the call and the second thread will be blocked on call->user_mutex. The first thread can, at this point, process both the event that it picked call for and the event that the second thread picked the call for and may see the call terminate - in which case the call will be "released", decoupling the call from the user call ID assigned to it (RXRPC_USER_CALL_ID in the control message). The first thread will return okay, but then the second thread will wake up holding the user_mutex and, if it sees that the call has been released by the first thread, it will BUG thusly: kernel BUG at net/rxrpc/recvmsg.c:474! Fix this by just dequeuing the call and ignoring it if it is seen to be already released. We can't tell userspace about it anyway as the user call ID has become stale. Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code") Reported-by: Junvyyang, Tencent Zhuque Lab <zhuque@tencent.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> cc: LePremierHomme <kwqcheii@proton.me> cc: Marc Dionne <marc.dionne@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org Link: https://patch.msgid.link/20250717074350.3767366-3-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17rxrpc: Fix irq-disabled in local_bh_enable()David Howells
The rxrpc_assess_MTU_size() function calls down into the IP layer to find out the MTU size for a route. When accepting an incoming call, this is called from rxrpc_new_incoming_call() which holds interrupts disabled across the code that calls down to it. Unfortunately, the IP layer uses local_bh_enable() which, config dependent, throws a warning if IRQs are enabled: WARNING: CPU: 1 PID: 5544 at kernel/softirq.c:387 __local_bh_enable_ip+0x43/0xd0 ... RIP: 0010:__local_bh_enable_ip+0x43/0xd0 ... Call Trace: <TASK> rt_cache_route+0x7e/0xa0 rt_set_nexthop.isra.0+0x3b3/0x3f0 __mkroute_output+0x43a/0x460 ip_route_output_key_hash+0xf7/0x140 ip_route_output_flow+0x1b/0x90 rxrpc_assess_MTU_size.isra.0+0x2a0/0x590 rxrpc_new_incoming_peer+0x46/0x120 rxrpc_alloc_incoming_call+0x1b1/0x400 rxrpc_new_incoming_call+0x1da/0x5e0 rxrpc_input_packet+0x827/0x900 rxrpc_io_thread+0x403/0xb60 kthread+0x2f7/0x310 ret_from_fork+0x2a/0x230 ret_from_fork_asm+0x1a/0x30 ... hardirqs last enabled at (23): _raw_spin_unlock_irq+0x24/0x50 hardirqs last disabled at (24): _raw_read_lock_irq+0x17/0x70 softirqs last enabled at (0): copy_process+0xc61/0x2730 softirqs last disabled at (25): rt_add_uncached_list+0x3c/0x90 Fix this by moving the call to rxrpc_assess_MTU_size() out of rxrpc_init_peer() and further up the stack where it can be done without interrupts disabled. It shouldn't be a problem for rxrpc_new_incoming_call() to do it after the locks are dropped as pmtud is going to be performed by the I/O thread - and we're in the I/O thread at this point. Fixes: a2ea9a907260 ("rxrpc: Use irq-disabling spinlocks between app and I/O thread") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Junvyyang, Tencent Zhuque Lab <zhuque@tencent.com> cc: LePremierHomme <kwqcheii@proton.me> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org Link: https://patch.msgid.link/20250717074350.3767366-2-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17selftests/tc-testing: Test htb_dequeue_tree with deactivation and row emptyingWilliam Liu
Ensure that any deactivation and row emptying that occurs during htb_dequeue_tree does not cause a kernel panic. This scenario originally triggered a kernel BUG_ON, and we are checking for a graceful fail now. Signed-off-by: William Liu <will@willsroot.io> Signed-off-by: Savino Dicanosa <savy@syst3mfailure.io> Link: https://patch.msgid.link/20250717022912.221426-1-will@willsroot.io Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17net/sched: Return NULL when htb_lookup_leaf encounters an empty rbtreeWilliam Liu
htb_lookup_leaf has a BUG_ON that can trigger with the following: tc qdisc del dev lo root tc qdisc add dev lo root handle 1: htb default 1 tc class add dev lo parent 1: classid 1:1 htb rate 64bit tc qdisc add dev lo parent 1:1 handle 2: netem tc qdisc add dev lo parent 2:1 handle 3: blackhole ping -I lo -c1 -W0.001 127.0.0.1 The root cause is the following: 1. htb_dequeue calls htb_dequeue_tree which calls the dequeue handler on the selected leaf qdisc 2. netem_dequeue calls enqueue on the child qdisc 3. blackhole_enqueue drops the packet and returns a value that is not just NET_XMIT_SUCCESS 4. Because of this, netem_dequeue calls qdisc_tree_reduce_backlog, and since qlen is now 0, it calls htb_qlen_notify -> htb_deactivate -> htb_deactiviate_prios -> htb_remove_class_from_row -> htb_safe_rb_erase 5. As this is the only class in the selected hprio rbtree, __rb_change_child in __rb_erase_augmented sets the rb_root pointer to NULL 6. Because blackhole_dequeue returns NULL, netem_dequeue returns NULL, which causes htb_dequeue_tree to call htb_lookup_leaf with the same hprio rbtree, and fail the BUG_ON The function graph for this scenario is shown here: 0) | htb_enqueue() { 0) + 13.635 us | netem_enqueue(); 0) 4.719 us | htb_activate_prios(); 0) # 2249.199 us | } 0) | htb_dequeue() { 0) 2.355 us | htb_lookup_leaf(); 0) | netem_dequeue() { 0) + 11.061 us | blackhole_enqueue(); 0) | qdisc_tree_reduce_backlog() { 0) | qdisc_lookup_rcu() { 0) 1.873 us | qdisc_match_from_root(); 0) 6.292 us | } 0) 1.894 us | htb_search(); 0) | htb_qlen_notify() { 0) 2.655 us | htb_deactivate_prios(); 0) 6.933 us | } 0) + 25.227 us | } 0) 1.983 us | blackhole_dequeue(); 0) + 86.553 us | } 0) # 2932.761 us | qdisc_warn_nonwc(); 0) | htb_lookup_leaf() { 0) | BUG_ON(); ------------------------------------------ The full original bug report can be seen here [1]. We can fix this just by returning NULL instead of the BUG_ON, as htb_dequeue_tree returns NULL when htb_lookup_leaf returns NULL. [1] https://lore.kernel.org/netdev/pF5XOOIim0IuEfhI-SOxTgRvNoDwuux7UHKnE_Y5-zVd4wmGvNk2ceHjKb8ORnzw0cGwfmVu42g9dL7XyJLf1NEzaztboTWcm0Ogxuojoeo=@willsroot.io/ Fixes: 512bb43eb542 ("pkt_sched: sch_htb: Optimize WARN_ONs in htb_dequeue_tree() etc.") Signed-off-by: William Liu <will@willsroot.io> Signed-off-by: Savino Dicanosa <savy@syst3mfailure.io> Link: https://patch.msgid.link/20250717022816.221364-1-will@willsroot.io Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17net: bridge: Do not offload IGMP/MLD messagesJoseph Huang
Do not offload IGMP/MLD messages as it could lead to IGMP/MLD Reports being unintentionally flooded to Hosts. Instead, let the bridge decide where to send these IGMP/MLD messages. Consider the case where the local host is sending out reports in response to a remote querier like the following: mcast-listener-process (IP_ADD_MEMBERSHIP) \ br0 / \ swp1 swp2 | | QUERIER SOME-OTHER-HOST In the above setup, br0 will want to br_forward() reports for mcast-listener-process's group(s) via swp1 to QUERIER; but since the source hwdom is 0, the report is eligible for tx offloading, and is flooded by hardware to both swp1 and swp2, reaching SOME-OTHER-HOST as well. (Example and illustration provided by Tobias.) Fixes: 472111920f1c ("net: bridge: switchdev: allow the TX data plane forwarding to be offloaded") Signed-off-by: Joseph Huang <Joseph.Huang@garmin.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250716153551.1830255-1-Joseph.Huang@garmin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17Merge branch ↵Jakub Kicinski
'net-vlan-fix-vlan-0-refcount-imbalance-of-toggling-filtering-during-runtime' Dong Chenchen says: ==================== net: vlan: fix VLAN 0 refcount imbalance of toggling filtering during runtime Fix VLAN 0 refcount imbalance of toggling filtering during runtime. ==================== Link: https://patch.msgid.link/20250716034504.2285203-1-dongchenchen2@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17selftests: Add test cases for vlan_filter modification during runtimeDong Chenchen
Add test cases for vlan_filter modification during runtime, which may triger null-ptr-ref or memory leak of vlan0. Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Link: https://patch.msgid.link/20250716034504.2285203-3-dongchenchen2@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17net: vlan: fix VLAN 0 refcount imbalance of toggling filtering during runtimeDong Chenchen
Assuming the "rx-vlan-filter" feature is enabled on a net device, the 8021q module will automatically add or remove VLAN 0 when the net device is put administratively up or down, respectively. There are a couple of problems with the above scheme. The first problem is a memory leak that can happen if the "rx-vlan-filter" feature is disabled while the device is running: # ip link add bond1 up type bond mode 0 # ethtool -K bond1 rx-vlan-filter off # ip link del dev bond1 When the device is put administratively down the "rx-vlan-filter" feature is disabled, so the 8021q module will not remove VLAN 0 and the memory will be leaked [1]. Another problem that can happen is that the kernel can automatically delete VLAN 0 when the device is put administratively down despite not adding it when the device was put administratively up since during that time the "rx-vlan-filter" feature was disabled. null-ptr-unref or bug_on[2] will be triggered by unregister_vlan_dev() for refcount imbalance if toggling filtering during runtime: $ ip link add bond0 type bond mode 0 $ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q $ ethtool -K bond0 rx-vlan-filter off $ ifconfig bond0 up $ ethtool -K bond0 rx-vlan-filter on $ ifconfig bond0 down $ ip link del vlan0 Root cause is as below: step1: add vlan0 for real_dev, such as bond, team. register_vlan_dev vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1 step2: disable vlan filter feature and enable real_dev step3: change filter from 0 to 1 vlan_device_event vlan_filter_push_vids ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0 step4: real_dev down vlan_device_event vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0 vlan_info_rcu_free //free vlan0 step5: delete vlan0 unregister_vlan_dev BUG_ON(!vlan_info); //vlan_info is null Fix both problems by noting in the VLAN info whether VLAN 0 was automatically added upon NETDEV_UP and based on that decide whether it should be deleted upon NETDEV_DOWN, regardless of the state of the "rx-vlan-filter" feature. [1] unreferenced object 0xffff8880068e3100 (size 256): comm "ip", pid 384, jiffies 4296130254 hex dump (first 32 bytes): 00 20 30 0d 80 88 ff ff 00 00 00 00 00 00 00 00 . 0............. 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace (crc 81ce31fa): __kmalloc_cache_noprof+0x2b5/0x340 vlan_vid_add+0x434/0x940 vlan_device_event.cold+0x75/0xa8 notifier_call_chain+0xca/0x150 __dev_notify_flags+0xe3/0x250 rtnl_configure_link+0x193/0x260 rtnl_newlink_create+0x383/0x8e0 __rtnl_newlink+0x22c/0xa40 rtnl_newlink+0x627/0xb00 rtnetlink_rcv_msg+0x6fb/0xb70 netlink_rcv_skb+0x11f/0x350 netlink_unicast+0x426/0x710 netlink_sendmsg+0x75a/0xc20 __sock_sendmsg+0xc1/0x150 ____sys_sendmsg+0x5aa/0x7b0 ___sys_sendmsg+0xfc/0x180 [2] kernel BUG at net/8021q/vlan.c:99! Oops: invalid opcode: 0000 [#1] SMP KASAN PTI CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 #61 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1)) RSP: 0018:ffff88810badf310 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8 RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80 R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000 R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0 Call Trace: <TASK> rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) ___sys_sendmsg (net/socket.c:2622) __sys_sendmsg (net/socket.c:2652) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) Fixes: ad1afb003939 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)") Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b Suggested-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250716034504.2285203-2-dongchenchen2@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17Merge tag 'ovpn-net-20250716' of https://github.com/OpenVPN/ovpn-net-nextJakub Kicinski
Antonio Quartulli says: ==================== This bugfix batch includes the following changes: * properly propagate sk mark to skb->mark field * reject unexpected incoming netlink attributes * reset GSO state when moving skb from transport to tunnel layer * tag 'ovpn-net-20250716' of https://github.com/OpenVPN/ovpn-net-next: ovpn: reset GSO metadata after decapsulation ovpn: reject unexpected netlink attributes ovpn: propagate socket mark to skb in UDP ==================== Link: https://patch.msgid.link/20250716115443.16763-1-antonio@openvpn.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17tls: always refresh the queue when reading sockJakub Kicinski
After recent changes in net-next TCP compacts skbs much more aggressively. This unearthed a bug in TLS where we may try to operate on an old skb when checking if all skbs in the queue have matching decrypt state and geometry. BUG: KASAN: slab-use-after-free in tls_strp_check_rcv+0x898/0x9a0 [tls] (net/tls/tls_strp.c:436 net/tls/tls_strp.c:530 net/tls/tls_strp.c:544) Read of size 4 at addr ffff888013085750 by task tls/13529 CPU: 2 UID: 0 PID: 13529 Comm: tls Not tainted 6.16.0-rc5-virtme Call Trace: kasan_report+0xca/0x100 tls_strp_check_rcv+0x898/0x9a0 [tls] tls_rx_rec_wait+0x2c9/0x8d0 [tls] tls_sw_recvmsg+0x40f/0x1aa0 [tls] inet_recvmsg+0x1c3/0x1f0 Always reload the queue, fast path is to have the record in the queue when we wake, anyway (IOW the path going down "if !strp->stm.full_len"). Fixes: 0d87bbd39d7f ("tls: strp: make sure the TCP skbs do not have overlapping data") Link: https://patch.msgid.link/20250716143850.1520292-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17virtio-net: fix recursived rtnl_lock() during probe()Zigit Zo
The deadlock appears in a stack trace like: virtnet_probe() rtnl_lock() virtio_config_changed_work() netdev_notify_peers() rtnl_lock() It happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing. The config_work in probe() will get scheduled until virtnet_open() enables the config change notification via virtio_config_driver_enable(). Fixes: df28de7b0050 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <zuozhijie@bytedance.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20250716115717.1472430-1-zuozhijie@bytedance.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17net/mlx5: Update the list of the PCI supported devicesMaor Gottlieb
Add the upcoming ConnectX-10 device ID to the table of supported PCI device IDs. Cc: stable@vger.kernel.org Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1752650969-148501-1-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17hv_netvsc: Set VF priv_flags to IFF_NO_ADDRCONF before open to prevent IPv6 ↵Li Tian
addrconf Set an additional flag IFF_NO_ADDRCONF to prevent ipv6 addrconf. Commit under Fixes added a new flag change that was not made to hv_netvsc resulting in the VF being assinged an IPv6. Fixes: 8a321cf7becc ("net: add IFF_NO_ADDRCONF and use it in bonding to prevent ipv6 addrconf") Suggested-by: Cathy Avery <cavery@redhat.com> Signed-off-by: Li Tian <litian@redhat.com> Reviewed-by: Xin Long <lucien.xin@gmail.com> Link: https://patch.msgid.link/20250716002607.4927-1-litian@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17phonet/pep: Move call to pn_skb_get_dst_sockaddr() earlier in pep_sock_accept()Nathan Chancellor
A new warning in clang [1] points out a place in pep_sock_accept() where dst is uninitialized then passed as a const pointer to pep_find_pipe(): net/phonet/pep.c:829:37: error: variable 'dst' is uninitialized when passed as a const pointer argument here [-Werror,-Wuninitialized-const-pointer] 829 | newsk = pep_find_pipe(&pn->hlist, &dst, pipe_handle); | ^~~: Move the call to pn_skb_get_dst_sockaddr(), which initializes dst, to before the call to pep_find_pipe(), so that dst is consistently used initialized throughout the function. Cc: stable@vger.kernel.org Fixes: f7ae8d59f661 ("Phonet: allocate sock from accept syscall rather than soft IRQ") Link: https://github.com/llvm/llvm-project/commit/00dacf8c22f065cb52efb14cd091d441f19b319e [1] Closes: https://github.com/ClangBuiltLinux/linux/issues/2101 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Link: https://patch.msgid.link/20250715-net-phonet-fix-uninit-const-pointer-v1-1-8efd1bd188b3@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17Bluetooth: L2CAP: Fix attempting to adjust outgoing MTULuiz Augusto von Dentz
Configuration request only configure the incoming direction of the peer initiating the request, so using the MTU is the other direction shall not be used, that said the spec allows the peer responding to adjust: Bluetooth Core 6.1, Vol 3, Part A, Section 4.5 'Each configuration parameter value (if any is present) in an L2CAP_CONFIGURATION_RSP packet reflects an ‘adjustment’ to a configuration parameter value that has been sent (or, in case of default values, implied) in the corresponding L2CAP_CONFIGURATION_REQ packet.' That said adjusting the MTU in the response shall be limited to ERTM channels only as for older modes the remote stack may not be able to detect the adjustment causing it to silently drop packets. Link: https://github.com/bluez/bluez/issues/1422 Link: https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/issues/149 Link: https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/4793 Fixes: 042bb9603c44 ("Bluetooth: L2CAP: Fix L2CAP MTU negotiation") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-07-17Merge branch 'ppp-replace-per-cpu-recursion-counter-with-lock-owner-field'Paolo Abeni
Sebastian Andrzej Siewior says: ==================== ppp: Replace per-CPU recursion counter with lock-owner field This is another approach to avoid relying on local_bh_disable() for locking of per-CPU in ppp. I redid it with the per-CPU lock and local_lock_nested_bh() as discussed in v1. The xmit_recursion counter has been removed since it served the same purpose as the owner field. Both were updated and checked. The xmit_recursion looks like a counter in ppp_channel_push() but at this point, the counter should always be 0 so it always serves as a boolean. Therefore I removed it. I do admit that this looks easier to review. v2 https://lore.kernel.org/all/20250710162403.402739-1-bigeasy@linutronix.de/ v1 https://lore.kernel.org/all/20250627105013.Qtv54bEk@linutronix.de/ ==================== Link: https://patch.msgid.link/20250715150806.700536-1-bigeasy@linutronix.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-17ppp: Replace per-CPU recursion counter with lock-owner fieldSebastian Andrzej Siewior
The per-CPU variable ppp::xmit_recursion is protecting against recursion due to wrong configuration of the ppp unit. The per-CPU variable relies on disabled BH for its locking. Without per-CPU locking in local_bh_disable() on PREEMPT_RT this data structure requires explicit locking. The ppp::xmit_recursion is used as a per-CPU boolean. The counter is checked early in the send routing and the transmit path is only entered if the counter is zero. Then the counter is incremented to avoid recursion. It used to detect recursion on channel::downl and ppp::wlock. Create a struct ppp_xmit_recursion and move the counter into it. Add local_lock_t to the struct and use local_lock_nested_bh() for locking. Due to possible nesting, the lock cannot be acquired unconditionally but it requires an owner field to identify recursion before attempting to acquire the lock. The counter is incremented and checked only after the lock is acquired. Since it functions as a boolean rather than a count, and its role is now superseded by the owner field, it can be safely removed. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://patch.msgid.link/20250715150806.700536-2-bigeasy@linutronix.de Reviewed-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-17Merge branch 'dpll-zl3073x-add-misc-features'Paolo Abeni
Ivan Vecera says: ==================== dpll: zl3073x: Add misc features Add several new features missing in initial submission: * Embedded sync for both pin types * Phase offset reporting for connected input pin * Selectable phase offset monitoring (aka all inputs phase monitor) * Phase adjustments for both pin types * Fractional frequency offset reporting for input pins Everything was tested on Microchip EVB-LAN9668 EDS2 development board. ==================== Link: https://patch.msgid.link/20250715144633.149156-1-ivecera@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-17dpll: zl3073x: Add support to get fractional frequency offsetIvan Vecera
Adds support to get fractional frequency offset for input pins. Implement the appropriate callback and function that periodicaly performs reference frequency measurement and notifies DPLL core about changes. Reviewed-by: Jiri Pirko <jiri@nvidia.com> Tested-by: Prathosh Satish <prathosh.satish@microchip.com> Co-developed-by: Prathosh Satish <Prathosh.Satish@microchip.com> Signed-off-by: Prathosh Satish <Prathosh.Satish@microchip.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Link: https://patch.msgid.link/20250715144633.149156-6-ivecera@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-17dpll: zl3073x: Add support to adjust phaseIvan Vecera
Add support to get/set phase adjustment for both input and output pins. The phase adjustment is implemented using reference and output phase offset compensation registers. For input pins the adjustment value can be arbitrary number but for outputs the value has to be a multiple of half synthesizer clock cycles. Reviewed-by: Jiri Pirko <jiri@nvidia.com> Tested-by: Prathosh Satish <prathosh.satish@microchip.com> Co-developed-by: Prathosh Satish <Prathosh.Satish@microchip.com> Signed-off-by: Prathosh Satish <Prathosh.Satish@microchip.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Link: https://patch.msgid.link/20250715144633.149156-5-ivecera@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-17dpll: zl3073x: Implement phase offset monitor featureIvan Vecera
Implement phase offset monitor feature to allow a user to monitor phase offsets across all available inputs. The device firmware periodically performs phase offsets measurements for all available DPLL channels and input references. The driver can ask the firmware to fill appropriate latch registers with measured values. There are 2 sets of latch registers for phase offsets reporting: 1) DPLL-to-connected-ref: up to 5 registers that contain values for phase offset between particular DPLL channel and its connected input reference. 2) selected-DPLL-to-ref: 10 registers that contain values for phase offsets between selected DPLL channel and all available input references. Both are filled with single read request so the driver can read DPLL-to-connected-ref phase offset for all DPLL channels at once. This was implemented in the previous patch. To read selected-DPLL-to-ref registers for all DPLLs a separate read request has to be sent to device firmware for each DPLL channel. To implement phase offset monitor feature: * Extend zl3073x_ref_phase_offsets_update() to select given DPLL channel in phase offset read request. The caller can set channel==-1 if it will not read Type2 registers. * Use this extended function to update phase offset latch registers during zl3073x_dpll_changes_check() call if phase monitor is enabled * Extend zl3073x_dpll_pin_phase_offset_check() to check phase offset changes for all available input references * Extend zl3073x_dpll_input_pin_phase_offset_get() to report phase offset values for all available input references * Implement phase offset monitor callbacks to enable/disable this feature Reviewed-by: Jiri Pirko <jiri@nvidia.com> Tested-by: Prathosh Satish <prathosh.satish@microchip.com> Co-developed-by: Prathosh Satish <Prathosh.Satish@microchip.com> Signed-off-by: Prathosh Satish <Prathosh.Satish@microchip.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Link: https://patch.msgid.link/20250715144633.149156-4-ivecera@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-17dpll: zl3073x: Add support to get phase offset on connected input pinIvan Vecera
Add support to get phase offset for the connected input pin. Implement the appropriate callback and function that performs DPLL to connected reference phase error measurement and notifies DPLL core about changes. The measurement is performed internally by device on background 40 times per second but the measured value is read each second and compared with previous value. Reviewed-by: Jiri Pirko <jiri@nvidia.com> Tested-by: Prathosh Satish <prathosh.satish@microchip.com> Co-developed-by: Prathosh Satish <Prathosh.Satish@microchip.com> Signed-off-by: Prathosh Satish <Prathosh.Satish@microchip.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Link: https://patch.msgid.link/20250715144633.149156-3-ivecera@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-17dpll: zl3073x: Add support to get/set esync on pinsIvan Vecera
Add support to get/set embedded sync for both input and output pins. The DPLL is able to lock on input reference when the embedded sync frequency is 1 PPS and pulse width 25%. The esync on outputs are more versatille and theoretically supports any esync frequency that divides current output frequency but for now support the same that supported on input pins (1 PPS & 25% pulse). Note that for the output pins the esync divisor shares the same register used for N-divided signal formats. Due to this the esync cannot be enabled on outputs configured with one of the N-divided signal formats. Reviewed-by: Jiri Pirko <jiri@nvidia.com> Tested-by: Prathosh Satish <prathosh.satish@microchip.com> Co-developed-by: Prathosh Satish <Prathosh.Satish@microchip.com> Signed-off-by: Prathosh Satish <Prathosh.Satish@microchip.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Link: https://patch.msgid.link/20250715144633.149156-2-ivecera@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-17Merge tag 'wireless-next-2025-07-17' of ↵Paolo Abeni
https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Johannes Berg says: ==================== Another set of changes, notably: - cfg80211: fix double-free introduced earlier - mac80211: fix RCU iteration in CSA - iwlwifi: many cleanups (unused FW APIs, PCIe code, WoWLAN) - mac80211: some work around how FIPS affects wifi, which was wrong (RC4 is used by TKIP, not only WEP) - cfg/mac80211: improvements for unsolicated probe response handling * tag 'wireless-next-2025-07-17' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (64 commits) wifi: cfg80211: fix double free for link_sinfo in nl80211_station_dump() wifi: cfg80211: fix off channel operation allowed check for MLO wifi: mac80211: use RCU-safe iteration in ieee80211_csa_finish wifi: mac80211_hwsim: Update comments in header wifi: mac80211: parse unsolicited broadcast probe response data wifi: cfg80211: parse attribute to update unsolicited probe response template wifi: mac80211: don't use TPE data from assoc response wifi: mac80211: handle WLAN_HT_ACTION_NOTIFY_CHANWIDTH async wifi: mac80211: simplify __ieee80211_rx_h_amsdu() loop wifi: mac80211: don't mark keys for inactive links as uploaded wifi: mac80211: only assign chanctx in reconfig wifi: mac80211_hwsim: Declare support for AP scanning wifi: mac80211: clean up cipher suite handling wifi: mac80211: don't send keys to driver when fips_enabled wifi: cfg80211: Fix interface type validation wifi: mac80211: remove ieee80211_link_unreserve_chanctx() return value wifi: mac80211: don't unreserve never reserved chanctx mwl8k: Add missing check after DMA map wifi: mac80211: make VHT opmode NSS ignore a debug message wifi: iwlwifi: remove support of several iwl_ppag_table_cmd versions ... ==================== Link: https://patch.msgid.link/20250717094610.20106-47-johannes@sipsolutions.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-17Merge tag 'wireless-2025-07-17' of ↵Paolo Abeni
https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Johannes Berg says: ==================== Couple of fixes: - ath12k performance regression from -rc1 - cfg80211 counted_by() removal for scan request as it doesn't match usage and keeps complaining - iwlwifi crash with certain older devices - iwlwifi missing an error path unlock - iwlwifi compatibility with certain BIOS updates * tag 'wireless-2025-07-17' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: wifi: iwlwifi: Fix botched indexing conversion wifi: cfg80211: remove scan request n_channels counted_by wifi: ath12k: Fix packets received in WBM error ring with REO LUT enabled wifi: iwlwifi: mask reserved bits in chan_state_active_bitmap wifi: iwlwifi: pcie: fix locking on invalid TOP reset ==================== Link: https://patch.msgid.link/20250717091831.18787-5-johannes@sipsolutions.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>