summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-01-19Merge tag 'sched_urgent_for_v6.13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Borislav Petkov: - Do not adjust the weight of empty group entities and avoid scheduling artifacts - Avoid scheduling lag by computing lag properly and thus address an EEVDF entity placement issue * tag 'sched_urgent_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/fair: Fix update_cfs_group() vs DELAY_DEQUEUE sched/fair: Fix EEVDF entity placement bug causing scheduling lag
2025-01-19selftests/efivarfs: fix tests for failed write removalJames Bottomley
The current self tests expect the zero size remnants that failed variable creation leaves. Update the tests to verify these are now absent. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-01-19Merge branch 'efivarfs' into nextArd Biesheuvel
2025-01-19efivarfs: fix error on write to new variable leaving remnantsJames Bottomley
Make variable cleanup go through the fops release mechanism and use zero inode size as the indicator to delete the file. Since all EFI variables must have an initial u32 attribute, zero size occurs either because the update deleted the variable or because an unsuccessful write after create caused the size never to be set in the first place. In the case of multiple racing opens and closes, the open is counted to ensure that the zero size check is done on the last close. Even though this fixes the bug that a create either not followed by a write or followed by a write that errored would leave a remnant file for the variable, the file will appear momentarily globally visible until the last close of the fd deletes it. This is safe because the normal filesystem operations will mediate any races; however, it is still possible for a directory listing at that instant between create and close contain a zero size variable that doesn't exist in the EFI table. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-01-19efivarfs: remove unused efivarfs_listJames Bottomley
Remove all function helpers and mentions of the efivarfs_list now that all consumers of the list have been removed and entry management goes exclusively through the inode. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-01-19efivarfs: move variable lifetime management into the inodesJames Bottomley
Make the inodes the default management vehicle for struct efivar_entry, so they are now all freed automatically if the file is removed and on unmount in kill_litter_super(). Remove the now superfluous iterator to free the entries after kill_litter_super(). Also fixes a bug where some entry freeing was missing causing efivarfs to leak memory. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-01-19selftests/efivarfs: add check for disallowing file truncationJames Bottomley
Now that the ability of arbitrary writes to set the inode size is fixed, verify that a variable file accepts a truncation operation but does not change the stat size because of it. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-01-19efivarfs: prevent setting of zero size on the inodes in the cacheJames Bottomley
Current efivarfs uses simple_setattr which allows the setting of any size in the inode cache. This is wrong because a zero size file is used to indicate an "uncommitted" variable, so by simple means of truncating the file (as root) any variable may be turned to look like it's uncommitted. Fix by adding an efivarfs_setattr routine which does not allow updating of the cached inode size (which now only comes from the underlying variable). Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-01-19netfilter: flowtable: add CLOSING statePablo Neira Ayuso
tcp rst/fin packet triggers an immediate teardown of the flow which results in sending flows back to the classic forwarding path. This behaviour was introduced by: da5984e51063 ("netfilter: nf_flow_table: add support for sending flows back to the slow path") b6f27d322a0a ("netfilter: nf_flow_table: tear down TCP flows if RST or FIN was seen") whose goal is to expedite removal of flow entries from the hardware table. Before these patches, the flow was released after the flow entry timed out. However, this approach leads to packet races when restoring the conntrack state as well as late flow re-offload situations when the TCP connection is ending. This patch adds a new CLOSING state that is is entered when tcp rst/fin packet is seen. This allows for an early removal of the flow entry from the hardware table. But the flow entry still remains in software, so tcp packets to shut down the flow are not sent back to slow path. If syn packet is seen from this new CLOSING state, then this flow enters teardown state, ct state is set to TCP_CONNTRACK_CLOSE state and packet is sent to slow path, so this TCP reopen scenario can be handled by conntrack. TCP_CONNTRACK_CLOSE provides a small timeout that aims at quickly releasing this stale entry from the conntrack table. Moreover, skip hardware re-offload from flowtable software packet if the flow is in CLOSING state. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-01-19netfilter: flowtable: teardown flow if cached mtu is stalePablo Neira Ayuso
Tear down the flow entry in the unlikely case that the interface mtu changes, this gives the flow a chance to refresh the cached mtu, otherwise such refresh does not occur until flow entry expires. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-01-19netfilter: conntrack: rework offload nf_conn timeout extension logicFlorian Westphal
Offload nf_conn entries may not see traffic for a very long time. To prevent incorrect 'ct is stale' checks during nf_conntrack table lookup, the gc worker extends the timeout nf_conn entries marked for offload to a large value. The existing logic suffers from a few problems. Garbage collection runs without locks, its unlikely but possible that @ct is removed right after the 'offload' bit test. In that case, the timeout of a new/reallocated nf_conn entry will be increased. Prevent this by obtaining a reference count on the ct object and re-check of the confirmed and offload bits. If those are not set, the ct is being removed, skip the timeout extension in this case. Parallel teardown is also problematic: cpu1 cpu2 gc_worker calls flow_offload_teardown() tests OFFLOAD bit, set clear OFFLOAD bit ct->timeout is repaired (e.g. set to timeout[UDP_CT_REPLIED]) nf_ct_offload_timeout() called expire value is fetched <INTERRUPT> -> NF_CT_DAY timeout for flow that isn't offloaded (and might not see any further packets). Use cmpxchg: if ct->timeout was repaired after the 2nd 'offload bit' test passed, then ct->timeout will only be updated of ct->timeout was not altered in between. As we already have a gc worker for flowtable entries, ct->timeout repair can be handled from the flowtable gc worker. This avoids having flowtable specific logic in the conntrack core and avoids checking entries that were never offloaded. This allows to remove the nf_ct_offload_timeout helper. Its safe to use in the add case, but not on teardown. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-01-19netfilter: conntrack: remove skb argument from nf_ct_refreshFlorian Westphal
Its not used (and could be NULL), so remove it. This allows to use nf_ct_refresh in places where we don't have an skb without having to double-check that skb == NULL would be safe. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-01-19netfilter: nft_flow_offload: update tcp state flags under lockFlorian Westphal
The conntrack entry is already public, there is a small chance that another CPU is handling a packet in reply direction and racing with the tcp state update. Move this under ct spinlock. This is done once, when ct is about to be offloaded, so this should not result in a noticeable performance hit. Fixes: 8437a6209f76 ("netfilter: nft_flow_offload: set liberal tracking mode for tcp") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-01-19netfilter: nft_flow_offload: clear tcp MAXACK flag before moving to slowpathFlorian Westphal
This state reset is racy, no locks are held here. Since commit 8437a6209f76 ("netfilter: nft_flow_offload: set liberal tracking mode for tcp"), the window checks are disabled for normal data packets, but MAXACK flag is checked when validating TCP resets. Clear the flag so tcp reset validation checks are ignored. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-01-19netfilter: nf_tables: Simplify chain netdev notifierPhil Sutter
With conditional chain deletion gone, callback code simplifies: Instead of filling an nft_ctx object, just pass basechain to the per-chain function. Also plain list_for_each_entry() is safe now. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-01-19netfilter: nf_tables: Tolerate chains with no remaining hooksPhil Sutter
Do not drop a netdev-family chain if the last interface it is registered for vanishes. Users dumping and storing the ruleset upon shutdown to restore it upon next boot may otherwise lose the chain and all contained rules. They will still lose the list of devices, a later patch will fix that. For now, this aligns the event handler's behaviour with that for flowtables. The controversal situation at netns exit should be no problem here: event handler will unregister the hooks, core nftables cleanup code will drop the chain itself. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-01-19netfilter: nf_tables: Compare netdev hooks based on stored namePhil Sutter
The 1:1 relationship between nft_hook and nf_hook_ops is about to break, so choose the stored ifname to uniquely identify hooks. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-01-19netfilter: nf_tables: Use stored ifname in netdev hook dumpsPhil Sutter
The stored ifname and ops.dev->name may deviate after creation due to interface name changes. Prefer the more deterministic stored name in dumps which also helps avoiding inadvertent changes to stored ruleset dumps. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-01-19netfilter: nf_tables: Store user-defined hook ifnamePhil Sutter
Prepare for hooks with NULL ops.dev pointer (due to non-existent device) and store the interface name and length as specified by the user upon creation. No functional change intended. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-01-19netfilter: nf_tables: Flowtable hook's pf value never variesPhil Sutter
When checking for duplicate hooks in nft_register_flowtable_net_hooks(), comparing ops.pf value is pointless as it is always NFPROTO_NETDEV with flowtable hooks. Dropping the check leaves the search identical to the one in nft_hook_list_find() so call that function instead of open coding. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-01-19netfilter: br_netfilter: remove unused conditional and dead codeAntoine Tenart
The SKB_DROP_REASON_IP_INADDRERRORS drop reason is never returned from any function, as such it cannot be returned from the ip_route_input call tree. The 'reason != SKB_DROP_REASON_IP_INADDRERRORS' conditional is thus always true. Looking back at history, commit 50038bf38e65 ("net: ip: make ip_route_input() return drop reasons") changed the ip_route_input returned value check in br_nf_pre_routing_finish from -EHOSTUNREACH to SKB_DROP_REASON_IP_INADDRERRORS. It turns out -EHOSTUNREACH could not be returned either from the ip_route_input call tree and this since commit 251da4130115 ("ipv4: Cache ip_error() routes even when not forwarding."). Not a fix as this won't change the behavior. While at it use kfree_skb_reason. Signed-off-by: Antoine Tenart <atenart@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-01-19netfilter: nf_tables: fix set size with rbtree backendPablo Neira Ayuso
The existing rbtree implementation uses singleton elements to represent ranges, however, userspace provides a set size according to the number of ranges in the set. Adjust provided userspace set size to the number of singleton elements in the kernel by multiplying the range by two. Check if the no-match all-zero element is already in the set, in such case release one slot in the set size. Fixes: 0ed6389c483d ("netfilter: nf_tables: rename set implementations") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-01-19io_uring/fdinfo: fix io_uring_show_fdinfo() misuse of ->d_inameAl Viro
Output of io_uring_show_fdinfo() has several problems: * racy use of ->d_iname * junk if the name is long - in that case it's not stored in ->d_iname at all * lack of quoting (names can contain newlines, etc. - or be equal to "<none>", for that matter). * lines for empty slots are pointless noise - we already have the total amount, so having just the non-empty ones would carry the same information. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-01-19RDMA/qib: Constify 'struct bin_attribute'Thomas Weißschuh
The sysfs core now allows instances of 'struct bin_attribute' to be moved into read-only memory. Make use of that to protect them against accidental or malicious modifications. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://patch.msgid.link/20250114-sysfs-const-bin_attr-infiniband-v1-2-397aaa94d453@weissschuh.net Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-01-19RDMA/hfi1: Constify 'struct bin_attribute'Thomas Weißschuh
The sysfs core now allows instances of 'struct bin_attribute' to be moved into read-only memory. Make use of that to protect them against accidental or malicious modifications. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://patch.msgid.link/20250114-sysfs-const-bin_attr-infiniband-v1-1-397aaa94d453@weissschuh.net Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-01-19rhashtable: Fix rhashtable_try_insert testHerbert Xu
The test on whether rhashtable_insert_one did an insertion relies on the value returned by rhashtable_lookup_one. Unfortunately that value is overwritten after rhashtable_insert_one returns. Fix this by moving the test before data gets overwritten. Simplify the test as only data == NULL matters. Finally move atomic_inc back within the lock as otherwise it may be reordered with the atomic_dec on the removal side, potentially leading to an underflow. Reported-by: Michael Kelley <mhklinux@outlook.com> Fixes: e1d3422c95f0 ("rhashtable: Fix potential deadlock by moving schedule_work outside lock") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Tested-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Breno Leitao <leitao@debian.org> Tested-by: Mikhail Zaslonko <zaslonko@linux.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-01-19dt-bindings: crypto: qcom,inline-crypto-engine: Document the SM8750 ICEGaurav Kashyap
Document the Inline Crypto Engine (ICE) on the SM8750 Platform. Signed-off-by: Gaurav Kashyap <quic_gaurkash@quicinc.com> Signed-off-by: Melody Olvera <quic_molvera@quicinc.com> Acked-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-01-19dt-bindings: crypto: qcom,prng: Document SM8750 RNGGaurav Kashyap
Document SM8750 compatible for the True Random Number Generator. Signed-off-by: Gaurav Kashyap <quic_gaurkash@quicinc.com> Signed-off-by: Melody Olvera <quic_molvera@quicinc.com> Acked-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-01-19dt-bindings: crypto: qcom-qce: Document the SM8750 crypto engineGaurav Kashyap
Document the crypto engine on the SM8750 Platform. Signed-off-by: Gaurav Kashyap <quic_gaurkash@quicinc.com> Signed-off-by: Melody Olvera <quic_molvera@quicinc.com> Acked-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-01-19crypto: asymmetric_keys - Remove unused key_being_used_for[]Dr. David Alan Gilbert
key_being_used_for[] is an unused array of textual names for the elements of the enum key_being_used_for. It was added in 2015 by commit 99db44350672 ("PKCS#7: Appropriately restrict authenticated attributes and content type") Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-01-19padata: avoid UAF for reorder_workChen Ridong
Although the previous patch can avoid ps and ps UAF for _do_serial, it can not avoid potential UAF issue for reorder_work. This issue can happen just as below: crypto_request crypto_request crypto_del_alg padata_do_serial ... padata_reorder // processes all remaining // requests then breaks while (1) { if (!padata) break; ... } padata_do_serial // new request added list_add // sees the new request queue_work(reorder_work) padata_reorder queue_work_on(squeue->work) ... <kworker context> padata_serial_worker // completes new request, // no more outstanding // requests crypto_del_alg // free pd <kworker context> invoke_padata_reorder // UAF of pd To avoid UAF for 'reorder_work', get 'pd' ref before put 'reorder_work' into the 'serial_wq' and put 'pd' ref until the 'serial_wq' finish. Fixes: bbefa1dd6a6d ("crypto: pcrypt - Avoid deadlock by using per-instance padata queues") Signed-off-by: Chen Ridong <chenridong@huawei.com> Acked-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-01-19padata: fix UAF in padata_reorderChen Ridong
A bug was found when run ltp test: BUG: KASAN: slab-use-after-free in padata_find_next+0x29/0x1a0 Read of size 4 at addr ffff88bbfe003524 by task kworker/u113:2/3039206 CPU: 0 PID: 3039206 Comm: kworker/u113:2 Kdump: loaded Not tainted 6.6.0+ Workqueue: pdecrypt_parallel padata_parallel_worker Call Trace: <TASK> dump_stack_lvl+0x32/0x50 print_address_description.constprop.0+0x6b/0x3d0 print_report+0xdd/0x2c0 kasan_report+0xa5/0xd0 padata_find_next+0x29/0x1a0 padata_reorder+0x131/0x220 padata_parallel_worker+0x3d/0xc0 process_one_work+0x2ec/0x5a0 If 'mdelay(10)' is added before calling 'padata_find_next' in the 'padata_reorder' function, this issue could be reproduced easily with ltp test (pcrypt_aead01). This can be explained as bellow: pcrypt_aead_encrypt ... padata_do_parallel refcount_inc(&pd->refcnt); // add refcnt ... padata_do_serial padata_reorder // pd while (1) { padata_find_next(pd, true); // using pd queue_work_on ... padata_serial_worker crypto_del_alg padata_put_pd_cnt // sub refcnt padata_free_shell padata_put_pd(ps->pd); // pd is freed // loop again, but pd is freed // call padata_find_next, UAF } In the padata_reorder function, when it loops in 'while', if the alg is deleted, the refcnt may be decreased to 0 before entering 'padata_find_next', which leads to UAF. As mentioned in [1], do_serial is supposed to be called with BHs disabled and always happen under RCU protection, to address this issue, add synchronize_rcu() in 'padata_free_shell' wait for all _do_serial calls to finish. [1] https://lore.kernel.org/all/20221028160401.cccypv4euxikusiq@parnassus.localdomain/ [2] https://lore.kernel.org/linux-kernel/jfjz5d7zwbytztackem7ibzalm5lnxldi2eofeiczqmqs2m7o6@fq426cwnjtkm/ Fixes: b128a3040935 ("padata: allocate workqueue internally") Signed-off-by: Chen Ridong <chenridong@huawei.com> Signed-off-by: Qu Zicheng <quzicheng@huawei.com> Acked-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-01-19padata: add pd get/put refcnt helperChen Ridong
Add helpers for pd to get/put refcnt to make code consice. Signed-off-by: Chen Ridong <chenridong@huawei.com> Acked-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-01-18Merge tag 'batadv-next-pullrequest-20250117' of ↵Jakub Kicinski
git://git.open-mesh.org/linux-merge Simon Wunderlich says: ==================== This cleanup patchset includes the following patches: - bump version strings, by Simon Wunderlich - Reorder includes for distributed-arp-table.c, by Sven Eckelmann - Fix translation table change handling, by Remi Pommarel (2 patches) - Map VID 0 to untagged TT VLAN, by Sven Eckelmann - Update MAINTAINERS/mailmap e-mail addresses, by the respective authors (4 patches) - netlink: reduce duplicate code by returning interfaces, by Linus Lüssing * tag 'batadv-next-pullrequest-20250117' of git://git.open-mesh.org/linux-merge: batman-adv: netlink: reduce duplicate code by returning interfaces MAINTAINERS: mailmap: add entries for Antonio Quartulli mailmap: add entries for Sven Eckelmann mailmap: add entries for Simon Wunderlich MAINTAINERS: update email address of Marek Linder batman-adv: Map VID 0 to untagged TT VLAN batman-adv: Don't keep redundant TT change events batman-adv: Remove atomic usage for tt.local_changes batman-adv: Reorder includes for distributed-arp-table.c batman-adv: Start new development cycle ==================== Link: https://patch.msgid.link/20250117123910.219278-1-sw@simonwunderlich.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-18Merge tag 'for-net-next-2025-01-15' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Luiz Augusto von Dentz says: ==================== bluetooth-next pull request for net-next: - btusb: Add new VID/PID 13d3/3610 for MT7922 - btusb: Add new VID/PID 13d3/3628 for MT7925 - btusb: Add MT7921e device 13d3:3576 - btusb: Add RTL8851BE device 13d3:3600 - btusb: Add ID 0x2c7c:0x0130 for Qualcomm WCN785x - btusb: add sysfs attribute to control USB alt setting - qca: Expand firmware-name property - qca: Fix poor RF performance for WCN6855 - L2CAP: handle NULL sock pointer in l2cap_sock_alloc - Allow reset via sysfs - ISO: Allow BIG re-sync - dt-bindings: Utilize PMU abstraction for WCN6750 - MGMT: Mark LL Privacy as stable * tag 'for-net-next-2025-01-15' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next: (23 commits) Bluetooth: MGMT: Fix slab-use-after-free Read in mgmt_remove_adv_monitor_sync Bluetooth: qca: Fix poor RF performance for WCN6855 Bluetooth: Allow reset via sysfs Bluetooth: Get rid of cmd_timeout and use the reset callback Bluetooth: Remove the cmd timeout count in btusb Bluetooth: Use str_enable_disable-like helpers Bluetooth: btmtk: Remove resetting mt7921 before downloading the fw Bluetooth: L2CAP: handle NULL sock pointer in l2cap_sock_alloc Bluetooth: btusb: Add RTL8851BE device 13d3:3600 dt-bindings: bluetooth: Utilize PMU abstraction for WCN6750 Bluetooth: btusb: Add MT7921e device 13d3:3576 Bluetooth: btrtl: check for NULL in btrtl_setup_realtek() Bluetooth: btbcm: Fix NULL deref in btbcm_get_board_name() Bluetooth: qca: Expand firmware-name to load specific rampatch Bluetooth: qca: Update firmware-name to support board specific nvm dt-bindings: net: bluetooth: qca: Expand firmware-name property Bluetooth: btusb: Add new VID/PID 13d3/3628 for MT7925 Bluetooth: btusb: Add new VID/PID 13d3/3610 for MT7922 Bluetooth: btusb: add sysfs attribute to control USB alt setting Bluetooth: btusb: Add ID 0x2c7c:0x0130 for Qualcomm WCN785x ... ==================== Link: https://patch.msgid.link/20250117213203.3921910-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-18Merge tag 'wireless-next-2025-01-17' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Kalle Valo says: ==================== wireless-next patches for v6.14 Most likely the last "new features" pull request for v6.14 and this is a bigger one. Multi-Link Operation (MLO) work continues both in stack in drivers. Few new devices supported and usual fixes all over. Major changes: cfg80211 * Emergency Preparedness Communication Services (EPCS) station mode support mac80211 * an option to filter a sta from being flushed * some support for RX Operating Mode Indication (OMI) power saving * support for adding and removing station links for MLO iwlwifi * new device ids * rework firmware error handling and restart rtw88 * RTL8812A: RFE type 2 support * LED support rtw89 * variant info to support RTL8922AE-VS mt76 * mt7996: single wiphy multiband support (preparation for MLO) * mt7996: support for more variants * mt792x: P2P_DEVICE support * mt7921u: TP-Link TXE50UH support ath12k * enable MLO for QCN9274 (although it seems to be broken with dual band devices) * MLO radar detection support * debugfs: transmit buffer OFDMA, AST entry and puncture stats * tag 'wireless-next-2025-01-17' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (322 commits) wifi: brcmfmac: fix NULL pointer dereference in brcmf_txfinalize() wifi: rtw88: add RTW88_LEDS depends on LEDS_CLASS to Kconfig wifi: wilc1000: unregister wiphy only after netdev registration wifi: cfg80211: adjust allocation of colocated AP data wifi: mac80211: fix memory leak in ieee80211_mgd_assoc_ml_reconf() wifi: ath12k: fix key cache handling wifi: ath12k: Fix uninitialized variable access in ath12k_mac_allocate() function wifi: ath12k: Remove ath12k_get_num_hw() helper function wifi: ath12k: Refactor the ath12k_hw get helper function argument wifi: ath12k: Refactor ath12k_hw set helper function argument wifi: mt76: mt7996: add implicit beamforming support for mt7992 wifi: mt76: mt7996: fix beacon command during disabling wifi: mt76: mt7996: fix ldpc setting wifi: mt76: mt7996: fix definition of tx descriptor wifi: mt76: connac: adjust phy capabilities based on band constraints wifi: mt76: mt7996: fix incorrect indexing of MIB FW event wifi: mt76: mt7996: fix HE Phy capability wifi: mt76: mt7996: fix the capability of reception of EHT MU PPDU wifi: mt76: mt7996: add max mpdu len capability wifi: mt76: mt7921: avoid undesired changes of the preset regulatory domain ... ==================== Link: https://patch.msgid.link/20250117203529.72D45C4CEDD@smtp.kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-18net: introduce netdev_napi_exit()Eric Dumazet
After 1b23cdbd2bbc ("net: protect netdev->napi_list with netdev_lock()") it makes sense to iterate through dev->napi_list while holding the device lock. Also call synchronize_net() at most one time. Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250117232113.1612899-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-18net: phy: remove leftovers from switch to linkmode bitmapsHeiner Kallweit
We have some leftovers from the switch to linkmode bitmaps which - have never been used - are not used any longer - have no user outside phy_device.c So remove them. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/5493b96e-88bb-4230-a911-322659ec5167@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-18net: sched: Disallow replacing of child qdisc from one parent to anotherJamal Hadi Salim
Lion Ackermann was able to create a UAF which can be abused for privilege escalation with the following script Step 1. create root qdisc tc qdisc add dev lo root handle 1:0 drr step2. a class for packet aggregation do demonstrate uaf tc class add dev lo classid 1:1 drr step3. a class for nesting tc class add dev lo classid 1:2 drr step4. a class to graft qdisc to tc class add dev lo classid 1:3 drr step5. tc qdisc add dev lo parent 1:1 handle 2:0 plug limit 1024 step6. tc qdisc add dev lo parent 1:2 handle 3:0 drr step7. tc class add dev lo classid 3:1 drr step 8. tc qdisc add dev lo parent 3:1 handle 4:0 pfifo step 9. Display the class/qdisc layout tc class ls dev lo class drr 1:1 root leaf 2: quantum 64Kb class drr 1:2 root leaf 3: quantum 64Kb class drr 3:1 root leaf 4: quantum 64Kb tc qdisc ls qdisc drr 1: dev lo root refcnt 2 qdisc plug 2: dev lo parent 1:1 qdisc pfifo 4: dev lo parent 3:1 limit 1000p qdisc drr 3: dev lo parent 1:2 step10. trigger the bug <=== prevented by this patch tc qdisc replace dev lo parent 1:3 handle 4:0 step 11. Redisplay again the qdiscs/classes tc class ls dev lo class drr 1:1 root leaf 2: quantum 64Kb class drr 1:2 root leaf 3: quantum 64Kb class drr 1:3 root leaf 4: quantum 64Kb class drr 3:1 root leaf 4: quantum 64Kb tc qdisc ls qdisc drr 1: dev lo root refcnt 2 qdisc plug 2: dev lo parent 1:1 qdisc pfifo 4: dev lo parent 3:1 refcnt 2 limit 1000p qdisc drr 3: dev lo parent 1:2 Observe that a) parent for 4:0 does not change despite the replace request. There can only be one parent. b) refcount has gone up by two for 4:0 and c) both class 1:3 and 3:1 are pointing to it. Step 12. send one packet to plug echo "" | socat -u STDIN UDP4-DATAGRAM:127.0.0.1:8888,priority=$((0x10001)) step13. send one packet to the grafted fifo echo "" | socat -u STDIN UDP4-DATAGRAM:127.0.0.1:8888,priority=$((0x10003)) step14. lets trigger the uaf tc class delete dev lo classid 1:3 tc class delete dev lo classid 1:1 The semantics of "replace" is for a del/add _on the same node_ and not a delete from one node(3:1) and add to another node (1:3) as in step10. While we could "fix" with a more complex approach there could be consequences to expectations so the patch takes the preventive approach of "disallow such config". Joint work with Lion Ackermann <nnamrec@gmail.com> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250116013713.900000-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-18net: avoid race between device unregistration and ethnl opsAntoine Tenart
The following trace can be seen if a device is being unregistered while its number of channels are being modified. DEBUG_LOCKS_WARN_ON(lock->magic != lock) WARNING: CPU: 3 PID: 3754 at kernel/locking/mutex.c:564 __mutex_lock+0xc8a/0x1120 CPU: 3 UID: 0 PID: 3754 Comm: ethtool Not tainted 6.13.0-rc6+ #771 RIP: 0010:__mutex_lock+0xc8a/0x1120 Call Trace: <TASK> ethtool_check_max_channel+0x1ea/0x880 ethnl_set_channels+0x3c3/0xb10 ethnl_default_set_doit+0x306/0x650 genl_family_rcv_msg_doit+0x1e3/0x2c0 genl_rcv_msg+0x432/0x6f0 netlink_rcv_skb+0x13d/0x3b0 genl_rcv+0x28/0x40 netlink_unicast+0x42e/0x720 netlink_sendmsg+0x765/0xc20 __sys_sendto+0x3ac/0x420 __x64_sys_sendto+0xe0/0x1c0 do_syscall_64+0x95/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e This is because unregister_netdevice_many_notify might run before the rtnl lock section of ethnl operations, eg. set_channels in the above example. In this example the rss lock would be destroyed by the device unregistration path before being used again, but in general running ethnl operations while dismantle has started is not a good idea. Fix this by denying any operation on devices being unregistered. A check was already there in ethnl_ops_begin, but not wide enough. Note that the same issue cannot be seen on the ioctl version (__dev_ethtool) because the device reference is retrieved from within the rtnl lock section there. Once dismantle started, the net device is unlisted and no reference will be found. Fixes: dde91ccfa25f ("ethtool: do not perform operations on net devices being unregistered") Signed-off-by: Antoine Tenart <atenart@kernel.org> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/20250116092159.50890-1-atenart@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-18net: destroy dev->lock later in free_netdev()Eric Dumazet
syzbot complained that free_netdev() was calling netif_napi_del() after dev->lock mutex has been destroyed. This fires a warning for CONFIG_DEBUG_MUTEXES=y builds. Move mutex_destroy(&dev->lock) near the end of free_netdev(). [1] DEBUG_LOCKS_WARN_ON(lock->magic != lock) WARNING: CPU: 0 PID: 5971 at kernel/locking/mutex.c:564 __mutex_lock_common kernel/locking/mutex.c:564 [inline] WARNING: CPU: 0 PID: 5971 at kernel/locking/mutex.c:564 __mutex_lock+0xdac/0xee0 kernel/locking/mutex.c:735 Modules linked in: CPU: 0 UID: 0 PID: 5971 Comm: syz-executor Not tainted 6.13.0-rc7-syzkaller-01131-g8d20dcda404d #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 12/27/2024 RIP: 0010:__mutex_lock_common kernel/locking/mutex.c:564 [inline] RIP: 0010:__mutex_lock+0xdac/0xee0 kernel/locking/mutex.c:735 Code: 0f b6 04 38 84 c0 0f 85 1a 01 00 00 83 3d 6f 40 4c 04 00 75 19 90 48 c7 c7 60 84 0a 8c 48 c7 c6 00 85 0a 8c e8 f5 dc 91 f5 90 <0f> 0b 90 90 90 e9 c7 f3 ff ff 90 0f 0b 90 e9 29 f8 ff ff 90 0f 0b RSP: 0018:ffffc90003317580 EFLAGS: 00010246 RAX: ee0f97edaf7b7d00 RBX: ffff8880299f8cb0 RCX: ffff8880323c9e00 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffffc90003317710 R08: ffffffff81602ac2 R09: 1ffff110170c519a R10: dffffc0000000000 R11: ffffed10170c519b R12: 0000000000000000 R13: 0000000000000000 R14: 1ffff92000662ec4 R15: dffffc0000000000 FS: 000055557a046500(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fd581d46ff8 CR3: 000000006f870000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> netdev_lock include/linux/netdevice.h:2691 [inline] __netif_napi_del include/linux/netdevice.h:2829 [inline] netif_napi_del include/linux/netdevice.h:2848 [inline] free_netdev+0x2d9/0x610 net/core/dev.c:11621 netdev_run_todo+0xf21/0x10d0 net/core/dev.c:11189 nsim_destroy+0x3c3/0x620 drivers/net/netdevsim/netdev.c:1028 __nsim_dev_port_del+0x14b/0x1b0 drivers/net/netdevsim/dev.c:1428 nsim_dev_port_del_all drivers/net/netdevsim/dev.c:1440 [inline] nsim_dev_reload_destroy+0x28a/0x490 drivers/net/netdevsim/dev.c:1661 nsim_drv_remove+0x58/0x160 drivers/net/netdevsim/dev.c:1676 device_remove drivers/base/dd.c:567 [inline] Fixes: 1b23cdbd2bbc ("net: protect netdev->napi_list with netdev_lock()") Reported-by: syzbot+85ff1051228a04613a32@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/678add43.050a0220.303755.0016.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250117224626.1427577-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-18eth: bnxt: fix string truncation warning in FW versionJakub Kicinski
W=1 builds with gcc 14.2.1 report: drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c:4193:32: error: ‘%s’ directive output may be truncated writing up to 31 bytes into a region of size 27 [-Werror=format-truncation=] 4193 | "/pkg %s", buf); It's upset that we let buf be full length but then we use 5 characters for "/pkg ". The builds is also clear with clang version 19.1.5 now. Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250117183726.1481524-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-18mptcp: sysctl: add syn_retrans_before_tcp_fallbackMatthieu Baerts (NGI0)
The number of SYN + MPC retransmissions before falling back to TCP was fixed to 2. This is certainly a good default value, but having a fixed number can be a problem in some environments. The current behaviour means that if all packets are dropped, there will be: - The initial SYN + MPC - 2 retransmissions with MPC - The next ones will be without MPTCP. So typically ~3 seconds before falling back to TCP. In some networks where some temporally blackholes are unfortunately frequent, or when a client tries to initiate connections while the network is not ready yet, this can cause new connections not to have MPTCP connections. In such environments, it is now possible to increase the number of SYN retransmissions with MPTCP options to make sure MPTCP is used. Interesting values are: - 0: the first retransmission will be done without MPTCP options: quite aggressive, but also a higher risk of detecting false-positive MPTCP blackholes. - >= 128: all SYN retransmissions will keep the MPTCP options: back to the < 6.12 behaviour. The default behaviour is not changed here. Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250117-net-next-mptcp-syn_retrans_before_tcp_fallback-v1-1-ab4b187099b0@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-18Merge branch 'fix-race-conditions-in-ndo_get_stats64'Jakub Kicinski
Shinas Rasheed says: ==================== Fix race conditions in ndo_get_stats64 Fix race conditions in ndo_get_stats64 by storing tx/rx stats locally and not availing per queue resources which could be torn down during interface stop. Also remove stats fetch from firmware which is currently unnecessary ==================== Link: https://patch.msgid.link/20250117094653.2588578-1-srasheed@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-18octeon_ep_vf: update tx/rx stats locally for persistenceShinas Rasheed
Update tx/rx stats locally, so that ndo_get_stats64() can use that and not rely on per queue resources to obtain statistics. The latter used to cause race conditions when the device stopped. Signed-off-by: Shinas Rasheed <srasheed@marvell.com> Link: https://patch.msgid.link/20250117094653.2588578-5-srasheed@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-18octeon_ep_vf: remove firmware stats fetch in ndo_get_stats64Shinas Rasheed
The firmware stats fetch call that happens in ndo_get_stats64() is currently not required, and causes a warning to issue. The corresponding warn log for the PF is given below: [ 123.316837] ------------[ cut here ]------------ [ 123.316840] Voluntary context switch within RCU read-side critical section! [ 123.316917] pc : rcu_note_context_switch+0x2e4/0x300 [ 123.316919] lr : rcu_note_context_switch+0x2e4/0x300 [ 123.316947] Call trace: [ 123.316949] rcu_note_context_switch+0x2e4/0x300 [ 123.316952] __schedule+0x84/0x584 [ 123.316955] schedule+0x38/0x90 [ 123.316956] schedule_timeout+0xa0/0x1d4 [ 123.316959] octep_send_mbox_req+0x190/0x230 [octeon_ep] [ 123.316966] octep_ctrl_net_get_if_stats+0x78/0x100 [octeon_ep] [ 123.316970] octep_get_stats64+0xd4/0xf0 [octeon_ep] [ 123.316975] dev_get_stats+0x4c/0x114 [ 123.316977] dev_seq_printf_stats+0x3c/0x11c [ 123.316980] dev_seq_show+0x1c/0x40 [ 123.316982] seq_read_iter+0x3cc/0x4e0 [ 123.316985] seq_read+0xc8/0x110 [ 123.316987] proc_reg_read+0x9c/0xec [ 123.316990] vfs_read+0xc8/0x2ec [ 123.316993] ksys_read+0x70/0x100 [ 123.316995] __arm64_sys_read+0x20/0x30 [ 123.316997] invoke_syscall.constprop.0+0x7c/0xd0 [ 123.317000] do_el0_svc+0xb4/0xd0 [ 123.317002] el0_svc+0xe8/0x1f4 [ 123.317005] el0t_64_sync_handler+0x134/0x150 [ 123.317006] el0t_64_sync+0x17c/0x180 [ 123.317008] ---[ end trace 63399811432ab69b ]--- Fixes: c3fad23cdc06 ("octeon_ep_vf: add support for ndo ops") Signed-off-by: Shinas Rasheed <srasheed@marvell.com> Link: https://patch.msgid.link/20250117094653.2588578-4-srasheed@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-18octeon_ep: update tx/rx stats locally for persistenceShinas Rasheed
Update tx/rx stats locally, so that ndo_get_stats64() can use that and not rely on per queue resources to obtain statistics. The latter used to cause race conditions when the device stopped. Signed-off-by: Shinas Rasheed <srasheed@marvell.com> Link: https://patch.msgid.link/20250117094653.2588578-3-srasheed@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-18octeon_ep: remove firmware stats fetch in ndo_get_stats64Shinas Rasheed
The firmware stats fetch call that happens in ndo_get_stats64() is currently not required, and causes a warning to issue. The warn log is given below: [ 123.316837] ------------[ cut here ]------------ [ 123.316840] Voluntary context switch within RCU read-side critical section! [ 123.316917] pc : rcu_note_context_switch+0x2e4/0x300 [ 123.316919] lr : rcu_note_context_switch+0x2e4/0x300 [ 123.316947] Call trace: [ 123.316949] rcu_note_context_switch+0x2e4/0x300 [ 123.316952] __schedule+0x84/0x584 [ 123.316955] schedule+0x38/0x90 [ 123.316956] schedule_timeout+0xa0/0x1d4 [ 123.316959] octep_send_mbox_req+0x190/0x230 [octeon_ep] [ 123.316966] octep_ctrl_net_get_if_stats+0x78/0x100 [octeon_ep] [ 123.316970] octep_get_stats64+0xd4/0xf0 [octeon_ep] [ 123.316975] dev_get_stats+0x4c/0x114 [ 123.316977] dev_seq_printf_stats+0x3c/0x11c [ 123.316980] dev_seq_show+0x1c/0x40 [ 123.316982] seq_read_iter+0x3cc/0x4e0 [ 123.316985] seq_read+0xc8/0x110 [ 123.316987] proc_reg_read+0x9c/0xec [ 123.316990] vfs_read+0xc8/0x2ec [ 123.316993] ksys_read+0x70/0x100 [ 123.316995] __arm64_sys_read+0x20/0x30 [ 123.316997] invoke_syscall.constprop.0+0x7c/0xd0 [ 123.317000] do_el0_svc+0xb4/0xd0 [ 123.317002] el0_svc+0xe8/0x1f4 [ 123.317005] el0t_64_sync_handler+0x134/0x150 [ 123.317006] el0t_64_sync+0x17c/0x180 [ 123.317008] ---[ end trace 63399811432ab69b ]--- Fixes: 6a610a46bad1 ("octeon_ep: add support for ndo ops") Signed-off-by: Shinas Rasheed <srasheed@marvell.com> Link: https://patch.msgid.link/20250117094653.2588578-2-srasheed@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-18Merge branch 'net-xilinx-axienet-enable-adaptive-irq-coalescing-with-dim'Jakub Kicinski
Sean Anderson says: ==================== net: xilinx: axienet: Report an error for bad coalesce settings ==================== Link: https://patch.msgid.link/20250116232954.2696930-1-sean.anderson@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-18net: xilinx: axienet: Report an error for bad coalesce settingsSean Anderson
Instead of silently ignoring invalid/unsupported settings, report an error. Additionally, relax the check for non-zero usecs to apply only when it will be used (i.e. when frames != 1). Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Reviewed by: Shannon Nelson <shannon.nelson@amd.com> Link: https://patch.msgid.link/20250116232954.2696930-3-sean.anderson@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>