Age | Commit message (Collapse) | Author |
|
User space may pass non-nul-terminated NFTA_DEVICE_NAME attribute values
to indicate a suffix wildcard.
Expect for multiple devices to match the given prefix in
nft_netdev_hook_alloc() and populate 'ops_list' with them all.
When checking for duplicate hooks, compare the shortest prefix so a
device may never match more than a single hook spec.
Finally respect the stored prefix length when hooking into new devices
from event handlers.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
No point in having err_hook_alloc, just call return directly. Also
rename err_hook_dev - it's not about the hook's device but freeing the
hook itself.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
For the sake of simplicity, treat them like consecutive NETDEV_REGISTER
and NETDEV_UNREGISTER events. If the new name matches a hook spec and
registration fails, escalate the error and keep things as they are.
To avoid unregistering the newly registered hook again during the
following fake NETDEV_UNREGISTER event, leave hooks alone if their
interface spec matches the new name.
Note how this patch also skips for NETDEV_REGISTER if the device is
already registered. This is not yet possible as the new name would have
to match the old one. This will change with wildcard interface specs,
though.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Handling NETDEV_CHANGENAME events has to traverse all chains/flowtables
twice, prepare for this. No functional change intended.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Hook into new devices if their name matches the hook spec.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Put NETDEV_UNREGISTER handling code into a switch, no functional change
intended as the function is only called for that event yet.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Supporting a 1:n relationship between nft_hook and nf_hook_ops is
convenient since a chain's or flowtable's nft_hooks may remain in place
despite matching interfaces disappearing. This stabilizes ruleset dumps
in that regard and opens the possibility to claim newly added interfaces
which match the spec. Also it prepares for wildcard interface specs
since these will potentially match multiple interfaces.
All spots dealing with hook registration are updated to handle a list of
multiple nf_hook_ops, but nft_netdev_hook_alloc() only adds a single
item for now to retain the old behaviour. The only expected functional
change here is how vanishing interfaces are handled: Instead of dropping
the respective nft_hook, only the matching nf_hook_ops are dropped.
To safely remove individual ops from the list in netdev handlers, an
rcu_head is added to struct nf_hook_ops so kfree_rcu() may be used.
There is at least nft_flowtable_find_dev() which may be iterating
through the list at the same time.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
The function accesses only the hook's ops field, pass it directly. This
prepares for nft_hooks holding a list of nf_hook_ops in future.
While at it, make use of the function in
__nft_unregister_flowtable_net_hooks() as well.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Facilitate binding and registering of a flowtable hook via a single
function call.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Also a pretty dull wrapper around the hook->ops.dev comparison for now.
Will search the embedded nf_hook_ops list in future. The ugly cast to
eliminate the const qualifier will vanish then, too.
Since this future list will be RCU-protected, also introduce an _rcu()
variant here.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Pointless wrappers around kfree() for now, prep work for an embedded
list of nf_hook_ops.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Add the minimal relevant info needed for userspace ("nftables monitor
trace") to provide the conntrack view of the packet:
- state (new, related, established)
- direction (original, reply)
- status (e.g., if connection is subject to dnat)
- id (allows to query ctnetlink for remaining conntrack state info)
Example:
trace id a62 inet filter PRE_RAW packet: iif "enp0s3" ether [..]
[..]
trace id a62 inet filter PRE_MANGLE conntrack: ct direction original ct state new ct id 32
trace id a62 inet filter PRE_MANGLE packet: [..]
[..]
trace id a62 inet filter IN conntrack: ct direction original ct state new ct status dnat-done ct id 32
[..]
In this case one can see that while NAT is active, the new connection
isn't subject to a translation.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
While nf_conntrack_id() doesn't need any functionaliy from conntrack, it
does reside in nf_conntrack_core.c -- callers add a module
dependency on conntrack.
Followup patch will need to compute the conntrack id from nf_tables_trace.c
to include it in nf_trace messages emitted to userspace via netlink.
I don't want to introduce a module dependency between nf_tables and
conntrack for this.
Since trace is slowpath, the added indirection is ok.
One alternative is to move nf_conntrack_id to the netfilter/core.c,
but I don't see a compelling reason so far.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
nf_dup_skb_recursion is a per-CPU variable and relies on disabled BH for its
locking. Without per-CPU locking in local_bh_disable() on PREEMPT_RT
this data structure requires explicit locking.
Move nf_dup_skb_recursion to struct netdev_xmit, provide wrappers.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
nft_pcpu_tun_ctx is a per-CPU variable and relies on disabled BH for its
locking. Without per-CPU locking in local_bh_disable() on PREEMPT_RT
this data structure requires explicit locking.
Make a struct with a nft_inner_tun_ctx member (original
nft_pcpu_tun_ctx) and a local_lock_t and use local_lock_nested_bh() for
locking. This change adds only lockdep coverage and does not alter the
functional behaviour for !PREEMPT_RT.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
nf_skb_duplicated is a per-CPU variable and relies on disabled BH for its
locking. Without per-CPU locking in local_bh_disable() on PREEMPT_RT
this data structure requires explicit locking.
Due to the recursion involved, the simplest change is to make it a
per-task variable.
Move the per-CPU variable nf_skb_duplicated to task_struct and name it
in_nf_duplicate. Add it to the existing bitfield so it doesn't use
additional memory.
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ben Segall <bsegall@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Valentin Schneider <vschneid@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
When dumping a nft_tunnel with more than one geneve_opt configured the
netlink attribute hierarchy should be as follow:
NFTA_TUNNEL_KEY_OPTS
|
|--NFTA_TUNNEL_KEY_OPTS_GENEVE
| |
| |--NFTA_TUNNEL_KEY_GENEVE_CLASS
| |--NFTA_TUNNEL_KEY_GENEVE_TYPE
| |--NFTA_TUNNEL_KEY_GENEVE_DATA
|
|--NFTA_TUNNEL_KEY_OPTS_GENEVE
| |
| |--NFTA_TUNNEL_KEY_GENEVE_CLASS
| |--NFTA_TUNNEL_KEY_GENEVE_TYPE
| |--NFTA_TUNNEL_KEY_GENEVE_DATA
|
|--NFTA_TUNNEL_KEY_OPTS_GENEVE
...
Otherwise, userspace tools won't be able to fetch the geneve options
configured correctly.
Fixes: 925d844696d9 ("netfilter: nft_tunnel: add support for geneve opts")
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Replace the existing VRF test with a more comprehensive one.
It tests following combinations:
- fib type (returns address type, e.g. unicast)
- fib oif (route output interface index
- both with and without 'iif' keyword (changes result, e.g.
'fib daddr type local' will be true when the destination address
is configured on the local machine, but
'fib daddr . iif type local' will only be true when the destination
address is configured on the incoming interface.
Add all types of addresses to test with for both ipv4 and ipv6:
- local address on the incoming interface
- local address on another interface
- local address on another interface thats part of a vrf
- address on another host
The ruleset stores obtained results from 'fib' in nftables sets and
then queries the sets to check that it has the expected results.
Perform one pass while packets are coming in on interface NOT part of
a VRF and then again when it was added and make sure fib returns the
expected routes and address types for the various addresses in the
setup.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
fib has two modes:
1. Obtain output device according to source or destination address
2. Obtain the type of the address, e.g. local, unicast, multicast.
'fib daddr type' should return 'local' if the address is configured
in this netns or unicast otherwise.
'fib daddr . iif type' should return 'local' if the address is configured
on the input interface or unicast otherwise, i.e. more restrictive.
However, if the interface is part of a VRF, then 'fib daddr type'
returns unicast even if the address is configured on the incoming
interface.
This is broken for both ipv4 and ipv6.
In the ipv4 case, inet_dev_addr_type must only be used if the
'iif' or 'oif' (strict mode) was requested.
Else inet_addr_type_dev_table() needs to be used and the correct
dev argument must be passed as well so the correct fib (vrf) table
is used.
In the ipv6 case, the bug is similar, without strict mode, dev is NULL
so .flowi6_l3mdev will be set to 0.
Add a new 'nft_fib_l3mdev_master_ifindex_rcu()' helper and use that
to init the .l3mdev structure member.
For ipv6, use it from nft_fib6_flowi_init() which gets called from
both the 'type' and the 'route' mode eval functions.
This provides consistent behaviour for all modes for both ipv4 and ipv6:
If strict matching is requested, the input respectively output device
of the netfilter hooks is used.
Otherwise, use skb->dev to obtain the l3mdev ifindex.
Without this, most type checks in updated nft_fib.sh selftest fail:
FAIL: did not find veth0 . 10.9.9.1 . local in fibtype4
FAIL: did not find veth0 . dead:1::1 . local in fibtype6
FAIL: did not find veth0 . dead:9::1 . local in fibtype6
FAIL: did not find tvrf . 10.0.1.1 . local in fibtype4
FAIL: did not find tvrf . 10.9.9.1 . local in fibtype4
FAIL: did not find tvrf . dead:1::1 . local in fibtype6
FAIL: did not find tvrf . dead:9::1 . local in fibtype6
FAIL: fib expression address types match (iif in vrf)
(fib errounously returns 'unicast' for all of them, even
though all of these addresses are local to the vrf).
Fixes: f6d0cbcf09c5 ("netfilter: nf_tables: add fib expression")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Kuniyuki Iwashima says:
====================
af_unix: Introduce SO_PASSRIGHTS.
As long as recvmsg() or recvmmsg() is used with cmsg, it is not
possible to avoid receiving file descriptors via SCM_RIGHTS.
This series introduces a new socket option, SO_PASSRIGHTS, to allow
disabling SCM_RIGHTS. The option is enabled by default.
See patch 8 for background/context.
This series is related to [0], but is split into a separate series,
as most of the patches are specific to af_unix.
The v2 of the BPF LSM extension part will be posted later, once
this series is merged into net-next and has landed in bpf-next.
[0]: https://lore.kernel.org/bpf/20250505215802.48449-1-kuniyu@amazon.com/
Changes:
v5:
* Patch 4
* Fix BPF selftest failure (setget_sockopt.c)
v4: https://lore.kernel.org/netdev/20250515224946.6931-1-kuniyu@amazon.com/
* Patch 6
* Group sk->sk_scm_XXX bits by struct
* Patch 9
* Remove errno handling
v3: https://lore.kernel.org/netdev/20250514165226.40410-1-kuniyu@amazon.com/
* Patch 3
* Remove inline in scm.c
* Patch 4 & 5 & 8
* Return -EOPNOTSUPP in getsockopt()
* Patch 5
* Add CONFIG_SECURITY_NETWORK check for SO_PASSSEC
* Patch 6
* Add kdoc for sk_scm_unused
* Update sk_scm_XXX under lock_sock() in setsockopt()
* Patch 7
* Update changelog (recent change -> aed6ecef55d7)
v2: https://lore.kernel.org/netdev/20250510015652.9931-1-kuniyu@amazon.com/
* Added patch 4 & 5 to reuse sk_txrehash for scm_recv() flags
v1: https://lore.kernel.org/netdev/20250508013021.79654-1-kuniyu@amazon.com/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
scm_rights.c has various patterns of tests to exercise GC.
Let's add cases where SO_PASSRIGHTS is disabled.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
As long as recvmsg() or recvmmsg() is used with cmsg, it is not
possible to avoid receiving file descriptors via SCM_RIGHTS.
This behaviour has occasionally been flagged as problematic, as
it can be (ab)used to trigger DoS during close(), for example, by
passing a FUSE-controlled fd or a hung NFS fd.
For instance, as noted on the uAPI Group page [0], an untrusted peer
could send a file descriptor pointing to a hung NFS mount and then
close it. Once the receiver calls recvmsg() with msg_control, the
descriptor is automatically installed, and then the responsibility
for the final close() now falls on the receiver, which may result
in blocking the process for a long time.
Regarding this, systemd calls cmsg_close_all() [1] after each
recvmsg() to close() unwanted file descriptors sent via SCM_RIGHTS.
However, this cannot work around the issue at all, because the final
fput() may still occur on the receiver's side once sendmsg() with
SCM_RIGHTS succeeds. Also, even filtering by LSM at recvmsg() does
not work for the same reason.
Thus, we need a better way to refuse SCM_RIGHTS at sendmsg().
Let's introduce SO_PASSRIGHTS to disable SCM_RIGHTS.
Note that this option is enabled by default for backward
compatibility.
Link: https://uapi-group.org/kernel-features/#disabling-reception-of-scm_rights-for-af_unix-sockets #[0]
Link: https://github.com/systemd/systemd/blob/v257.5/src/basic/fd-util.c#L612-L628 #[1]
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
For SOCK_STREAM embryo sockets, the SO_PASS{CRED,PIDFD,SEC} options
are inherited from the parent listen()ing socket.
Currently, this inheritance happens at accept(), because these
attributes were stored in sk->sk_socket->flags and the struct socket
is not allocated until accept().
This leads to unintentional behaviour.
When a peer sends data to an embryo socket in the accept() queue,
unix_maybe_add_creds() embeds credentials into the skb, even if
neither the peer nor the listener has enabled these options.
If the option is enabled, the embryo socket receives the ancillary
data after accept(). If not, the data is silently discarded.
This conservative approach works for SO_PASS{CRED,PIDFD,SEC}, but
would not for SO_PASSRIGHTS; once an SCM_RIGHTS with a hung file
descriptor was sent, it'd be game over.
To avoid this, we will need to preserve SOCK_PASSRIGHTS even on embryo
sockets.
Commit aed6ecef55d7 ("af_unix: Save listener for embryo socket.")
made it possible to access the parent's flags in sendmsg() via
unix_sk(other)->listener->sk->sk_socket->flags, but this introduces
an unnecessary condition that is irrelevant for most sockets,
accept()ed sockets and clients.
Therefore, we moved SOCK_PASSXXX into struct sock.
Let’s inherit sk->sk_scm_recv_flags at connect() to avoid receiving
SCM_RIGHTS on embryo sockets created from a parent with SO_PASSRIGHTS=0.
Note that the parent socket is locked in connect() so we don't need
READ_ONCE() for sk_scm_recv_flags.
Now, we can remove !other->sk_socket check in unix_maybe_add_creds()
to avoid slow SOCK_PASS{CRED,PIDFD} handling for embryo sockets
created from a parent with SO_PASS{CRED,PIDFD}=0.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
As explained in the next patch, SO_PASSRIGHTS would have a problem
if we assigned a corresponding bit to socket->flags, so it must be
managed in struct sock.
Mixing socket->flags and sk->sk_flags for similar options will look
confusing, and sk->sk_flags does not have enough space on 32bit system.
Also, as mentioned in commit 16e572626961 ("af_unix: dont send
SCM_CREDENTIALS by default"), SOCK_PASSCRED and SOCK_PASSPID handling
is known to be slow, and managing the flags in struct socket cannot
avoid that for embryo sockets.
Let's move SOCK_PASS{CRED,PIDFD,SEC} to struct sock.
While at it, other SOCK_XXX flags in net.h are grouped as enum.
Note that assign_bit() was atomic, so the writer side is moved down
after lock_sock() in setsockopt(), but the bit is only read once
in sendmsg() and recvmsg(), so lock_sock() is not needed there.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
SCM_CREDENTIALS and SCM_SECURITY can be recv()ed by calling
scm_recv() or scm_recv_unix(), and SCM_PIDFD is only used by
scm_recv_unix().
scm_recv() is called from AF_NETLINK and AF_BLUETOOTH.
scm_recv_unix() is literally called from AF_UNIX.
Let's restrict SO_PASSCRED and SO_PASSSEC to such sockets and
SO_PASSPIDFD to AF_UNIX only.
Later, SOCK_PASS{CRED,PIDFD,SEC} will be moved to struct sock
and united with another field.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
sk->sk_txrehash is only used for TCP.
Let's restrict SO_TXREHASH to TCP to reflect this.
Later, we will make sk_txrehash a part of the union for other
protocol families.
Note that we need to modify BPF selftest not to get/set
SO_TEREHASH for non-TCP sockets.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
scm_recv() has been placed in scm.h since the pre-git era for no
particular reason (I think), which makes the file really fragile.
For example, when you move SOCK_PASSCRED from include/linux/net.h to
enum sock_flags in include/net/sock.h, you will see weird build failure
due to terrible dependency.
To avoid the build failure in the future, let's move scm_recv(_unix())?
and its callees to scm.c.
Note that only scm_recv() needs to be exported for Bluetooth.
scm_send() should be moved to scm.c too, but I'll revisit later.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We will move SOCK_PASS{CRED,PIDFD,SEC} from struct socket.flags
to struct sock for better handling with SOCK_PASSRIGHTS.
Then, we don't need to access struct socket in maybe_add_creds().
Let's pass struct sock to maybe_add_creds() and its caller
queue_oob().
While at it, we append the unix_ prefix and fix double spaces
around the pid assignment.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently, the same checks for SOCK_PASSCRED and SOCK_PASSPIDFD
are scattered across many places.
Let's centralise the bit tests to make the following changes cleaner.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Johannes Berg says:
====================
Lots of new things, notably:
* ath12k: monitor mode for WCN7850, better 6 GHz regulatory
* brcmfmac: SAE for some Cypress devices
* iwlwifi: rework device configuration
* mac80211: scan improvements with MLO
* mt76: EHT improvements, new device IDs
* rtw88: throughput improvements
* rtw89: MLO, STA/P2P concurrency improvements, SAR
* tag 'wireless-next-2025-05-22' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (389 commits)
wifi: mt76: mt7925: add rfkill_poll for hardware rfkill
wifi: mt76: support power delta calculation for 5 TX paths
wifi: mt76: fix available_antennas setting
wifi: mt76: mt7996: fix RX buffer size of MCU event
wifi: mt76: mt7996: change max beacon size
wifi: mt76: mt7996: fix invalid NSS setting when TX path differs from NSS
wifi: mt76: mt7996: drop fragments with multicast or broadcast RA
wifi: mt76: mt7996: set EHT max ampdu length capability
wifi: mt76: mt7996: fix beamformee SS field
wifi: mt76: remove capability of partial bandwidth UL MU-MIMO
wifi: mt76: mt7925: add test mode support
wifi: mt76: mt7925: extend MCU support for testmode
wifi: mt76: mt7925: ensure all MCU commands wait for response
wifi: mt76: mt7925: refine the sniffer commnad
wifi: mt76: mt7925: prevent multiple scan commands
wifi: mt76: mt7915: Fix null-ptr-deref in mt7915_mmio_wed_init()
wifi: mt76: mt7996: Fix null-ptr-deref in mt7996_mmio_wed_init()
wifi: mt76: mt7925: add RNR scan support for 6GHz
wifi: mt76: add mt76_connac_mcu_build_rnr_scan_param routine
wifi: mt76: scan: Fix 'mlink' dereferenced before IS_ERR_OR_NULL check
...
====================
Link: https://patch.msgid.link/20250522165501.189958-50-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Luiz Augusto von Dentz says:
====================
bluetooth-next pull request for net-next:
core:
- Add support for SIOCETHTOOL ETHTOOL_GET_TS_INFO
- Separate CIS_LINK and BIS_LINK link types
- Introduce HCI Driver protocol
drivers:
- btintel_pcie: Do not generate coredump for diagnostic events
- btusb: Add HCI Drv commands for configuring altsetting
- btusb: Add RTL8851BE device 0x0bda:0xb850
- btusb: Add new VID/PID 13d3/3584 for MT7922
- btusb: Add new VID/PID 13d3/3630 and 13d3/3613 for MT7925
- btnxpuart: Implement host-wakeup feature
* tag 'for-net-next-2025-05-22' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next: (23 commits)
Bluetooth: btintel: Check dsbr size from EFI variable
Bluetooth: MGMT: iterate over mesh commands in mgmt_mesh_foreach()
Bluetooth: btusb: Add new VID/PID 13d3/3584 for MT7922
Bluetooth: btusb: use skb_pull to avoid unsafe access in QCA dump handling
Bluetooth: L2CAP: Fix not checking l2cap_chan security level
Bluetooth: separate CIS_LINK and BIS_LINK link types
Bluetooth: btusb: Add new VID/PID 13d3/3630 for MT7925
Bluetooth: add support for SIOCETHTOOL ETHTOOL_GET_TS_INFO
Bluetooth: btintel_pcie: Dump debug registers on error
Bluetooth: ISO: Fix getpeername not returning sockaddr_iso_bc fields
Bluetooth: ISO: Fix not using SID from adv report
Revert "Bluetooth: btusb: add sysfs attribute to control USB alt setting"
Revert "Bluetooth: btusb: Configure altsetting for HCI_USER_CHANNEL"
Bluetooth: btusb: Add HCI Drv commands for configuring altsetting
Bluetooth: Introduce HCI Driver protocol
Bluetooth: btnxpuart: Implement host-wakeup feature
dt-bindings: net: bluetooth: nxp: Add support for host-wakeup
Bluetooth: btusb: Add RTL8851BE device 0x0bda:0xb850
Bluetooth: hci_uart: Remove unnecessary NULL check before release_firmware()
Bluetooth: btmtksdio: Fix wakeup source leaks on device unbind
...
====================
Link: https://patch.msgid.link/20250522171048.3307873-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since the size of struct btintel_dsbr is already known, we can just
start there instead of querying the EFI variable size. If the final
result doesn't match what we expect also fail. This fixes a stack buffer
overflow when the EFI variable is larger than struct btintel_dsbr.
Reported-by: zepta <z3ptaa@gmail.com>
Closes: https://lore.kernel.org/all/CAPBS6KoaWV9=dtjTESZiU6KK__OZX0KpDk-=JEH8jCHFLUYv3Q@mail.gmail.com
Fixes: eb9e749c0182 ("Bluetooth: btintel: Allow configuring drive strength of BRI")
Signed-off-by: Kees Cook <kees@kernel.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
In 'mgmt_mesh_foreach()', iterate over mesh commands
rather than generic mgmt ones. Compile tested only.
Fixes: b338d91703fa ("Bluetooth: Implement support for Mesh")
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
A new variant of MT7922 wireless device has been identified.
The device introduces itself as MEDIATEK MT7922,
so treat it as MediaTek device.
With this patch, btusb driver works as expected:
[ 3.151162] Bluetooth: Core ver 2.22
[ 3.151185] Bluetooth: HCI device and connection manager initialized
[ 3.151189] Bluetooth: HCI socket layer initialized
[ 3.151191] Bluetooth: L2CAP socket layer initialized
[ 3.151194] Bluetooth: SCO socket layer initialized
[ 3.295718] Bluetooth: hci0: HW/SW Version: 0x008a008a, Build Time: 20241106163512
[ 4.676634] Bluetooth: BNEP (Ethernet Emulation) ver 1.3
[ 4.676637] Bluetooth: BNEP filters: protocol multicast
[ 4.676640] Bluetooth: BNEP socket layer initialized
[ 5.560453] Bluetooth: hci0: Device setup in 2320660 usecs
[ 5.560457] Bluetooth: hci0: HCI Enhanced Setup Synchronous Connection command is advertised, but not supported.
[ 5.619197] Bluetooth: hci0: AOSP extensions version v1.00
[ 5.619204] Bluetooth: hci0: AOSP quality report is supported
[ 5.619301] Bluetooth: MGMT ver 1.23
[ 6.741247] Bluetooth: RFCOMM TTY layer initialized
[ 6.741258] Bluetooth: RFCOMM socket layer initialized
[ 6.741261] Bluetooth: RFCOMM ver 1.11
lspci output:
04:00.0 Network controller: MEDIATEK Corp. MT7922 802.11ax PCI Express Wireless Network Adapter
USB information:
T: Bus=01 Lev=01 Prnt=01 Port=04 Cnt=02 Dev#= 3 Spd=480 MxCh= 0
D: Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1
P: Vendor=13d3 ProdID=3584 Rev= 1.00
S: Manufacturer=MediaTek Inc.
S: Product=Wireless_Device
S: SerialNumber=000000000
C:* #Ifs= 3 Cfg#= 1 Atr=e0 MxPwr=100mA
A: FirstIf#= 0 IfCount= 3 Cls=e0(wlcon) Sub=01 Prot=01
I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=125us
E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms
I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms
I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms
I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms
I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms
I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms
I: If#= 1 Alt= 6 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E: Ad=83(I) Atr=01(Isoc) MxPS= 63 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 63 Ivl=1ms
I: If#= 2 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E: Ad=8a(I) Atr=03(Int.) MxPS= 64 Ivl=125us
E: Ad=0a(O) Atr=03(Int.) MxPS= 64 Ivl=125us
I:* If#= 2 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E: Ad=8a(I) Atr=03(Int.) MxPS= 512 Ivl=125us
Signed-off-by: Liwei Sun <sunliweis@126.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
Use skb_pull() and skb_pull_data() to safely parse QCA dump packets.
This avoids direct pointer math on skb->data, which could lead to
invalid access if the packet is shorter than expected.
Fixes: 20981ce2d5a5 ("Bluetooth: btusb: Add WCN6855 devcoredump support")
Signed-off-by: En-Wei Wu <en-wei.wu@canonical.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
l2cap_check_enc_key_size shall check the security level of the
l2cap_chan rather than the hci_conn since for incoming connection
request that may be different as hci_conn may already been
encrypted using a different security level.
Fixes: 522e9ed157e3 ("Bluetooth: l2cap: Check encryption key size on incoming connection")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
Cross-merge networking fixes after downstream PR (net-6.15-rc8).
Conflicts:
80f2ab46c2ee ("irdma: free iwdev->rf after removing MSI-X")
4bcc063939a5 ("ice, irdma: fix an off by one in error handling code")
c24a65b6a27c ("iidc/ice/irdma: Update IDC to support multiple consumers")
https://lore.kernel.org/20250513130630.280ee6c5@canb.auug.org.au
No extra adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"This is somewhat larger than what I hoped for, with a few PRs from
subsystems and follow-ups for the recent netdev locking changes,
anyhow there are no known pending regressions.
Including fixes from bluetooth, ipsec and CAN.
Current release - regressions:
- eth: team: grab team lock during team_change_rx_flags
- eth: bnxt_en: fix netdev locking in ULP IRQ functions
Current release - new code bugs:
- xfrm: ipcomp: fix truesize computation on receive
- eth: airoha: fix page recycling in airoha_qdma_rx_process()
Previous releases - regressions:
- sched: hfsc: fix qlen accounting bug when using peek in
hfsc_enqueue()
- mr: consolidate the ipmr_can_free_table() checks.
- bridge: netfilter: fix forwarding of fragmented packets
- xsk: bring back busy polling support in XDP_COPY
- can:
- add missing rcu read protection for procfs content
- kvaser_pciefd: force IRQ edge in case of nested IRQ
Previous releases - always broken:
- xfrm: espintcp: remove encap socket caching to avoid reference leak
- bluetooth: use skb_pull to avoid unsafe access in QCA dump handling
- eth: idpf:
- fix null-ptr-deref in idpf_features_check
- fix idpf_vport_splitq_napi_poll()
- eth: hibmcge: fix wrong ndo.open() after reset fail issue"
* tag 'net-6.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (40 commits)
octeontx2-af: Fix APR entry mapping based on APR_LMT_CFG
octeontx2-af: Set LMT_ENA bit for APR table entries
net/tipc: fix slab-use-after-free Read in tipc_aead_encrypt_done
octeontx2-pf: Avoid adding dcbnl_ops for LBK and SDP vf
selftests/tc-testing: Add an HFSC qlen accounting test
sch_hfsc: Fix qlen accounting bug when using peek in hfsc_enqueue()
idpf: fix idpf_vport_splitq_napi_poll()
net: hibmcge: fix wrong ndo.open() after reset fail issue.
net: hibmcge: fix incorrect statistics update issue
xsk: Bring back busy polling support in XDP_COPY
can: slcan: allow reception of short error messages
net: lan743x: Restore SGMII CTRL register on resume
bnxt_en: Fix netdev locking in ULP IRQ functions
MAINTAINERS: Drop myself to reviewer for ravb driver
net: dwmac-sun8i: Use parsed internal PHY address instead of 1
net: ethernet: ti: am65-cpsw: Lower random mac address error print to info
can: kvaser_pciefd: Continue parsing DMA buf after dropped RX
can: kvaser_pciefd: Fix echo_skb race
can: kvaser_pciefd: Force IRQ edge in case of nested IRQ
idpf: fix null-ptr-deref in idpf_features_check
...
|
|
Tariq Toukan says:
====================
net/mlx5: Convert mlx5 to netdev instance locking
Cosmin Ratiu says:
mlx5 manages multiple netdevices, from basic Ethernet to Infiniband
netdevs. This patch series converts the driver to use netdev instance
locking for everything in preparation for TCP devmem Zero Copy.
Because mlx5 is tightly coupled with the ipoib driver, a series of
changes first happen in ipoib to allow it to work with mlx5 netdevs that
use instance locking:
IB/IPoIB: Enqueue separate work_structs for each flushed interface
IB/IPoIB: Replace vlan_rwsem with the netdev instance lock
IB/IPoIB: Allow using netdevs that require the instance lock
A small patch then avoids dropping RTNL during firmware update:
net/mlx5e: Don't drop RTNL during firmware flash
The main patch then converts all mlx5 netdevs to use instance locking:
net/mlx5e: Convert mlx5 netdevs to instance locking
====================
Link: https://patch.msgid.link/1747829342-1018757-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch convert mlx5 to use the new netdev instance lock in addition
to the pre-existing state_lock (and the RTNL).
mlx5e_priv.state_lock was already used throughout mlx5 to protect
against concurrent state modifications on the same netdev, usually in
addition to the RTNL. The new netdev instance lock will eventually
replace it, but for now, it is acquired in addition to the existing
locks in the order RTNL -> instance lock -> state_lock.
All three netdev types handled by mlx5 are converted to the new style of
locking, because they share a lot of code related to initializing
channels and dealing with NAPI, so it's better to convert all three
rather than introduce different assumptions deep in the call stack
depending on the type of device.
Because of the nature of the call graphs in mlx5, it wasn't possible to
incrementally convert parts of the driver to use the new lock, since
either all call paths into NAPI have to possess the new lock if the
*_locked variants are used, or none of them can have the lock.
One area which required extra care is the interaction between closing
channels and devlink health reporter tasks.
Previously, the recovery tasks were unconditionally acquiring the
RTNL, which could lead to deadlocks in these scenarios:
T1: mlx5e_close (== .ndo_stop(), has RTNL) -> mlx5e_close_locked
-> mlx5e_close_channels -> mlx5e_ptp_close
-> mlx5e_ptp_close_queues -> mlx5e_ptp_close_txqsqs
-> mlx5e_ptp_close_txqsq
-> cancel_work_sync(&ptpsq->report_unhealthy_work) waits for
T2: mlx5e_ptpsq_unhealthy_work -> mlx5e_reporter_tx_ptpsq_unhealthy
-> mlx5e_health_report -> devlink_health_report
-> devlink_health_reporter_recover
-> mlx5e_tx_reporter_ptpsq_unhealthy_recover which does:
rtnl_lock(); => Deadlock.
Another similar instance of this is:
T1: mlx5e_close (== .ndo_stop(), has RTNL) -> mlx5e_close_locked
-> mlx5e_close_channels -> mlx5e_ptp_close
-> mlx5e_ptp_close_queues -> mlx5e_ptp_close_txqsqs
-> mlx5e_ptp_close_txqsq
-> cancel_work_sync(&sq->recover_work) waits for
T2: mlx5e_tx_err_cqe_work -> mlx5e_reporter_tx_err_cqe
-> mlx5e_health_report -> devlink_health_report
-> devlink_health_reporter_recover
-> mlx5e_tx_reporter_err_cqe_recover which does:
rtnl_lock(); => Another deadlock.
Fix that by using the same pattern previously done in
mlx5e_tx_timeout_work, where the RTNL was repeatedly tried to be
acquired until either:
a) it is successfully acquired or
b) there's no need for the work to be done any more (channel is being
closed).
Now, for all three recovery tasks, the instance lock is repeatedly tried
to be acquired until successful or the channel/SQ is closed.
As a side-effect, drop the !test_bit(MLX5E_STATE_OPENED, &priv->state)
check from mlx5e_tx_timeout_work, it's weaker than
!test_bit(MLX5E_STATE_CHANNELS_ACTIVE, &priv->state) and unnecessary.
Future patches will introduce new call paths (from netdev queue
management ops) which can close channels (and call cancel_work_sync on
the recovery tasks) without the RTNL lock and only with the netdev
instance lock.
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1747829342-1018757-6-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There's no explanation in the original commit of why that was done, but
presumably flashing takes a long time and holding RTNL for so long
blocks other interactions with the netdev layer.
However, the stack is moving towards netdev instance locking and
dropping and reacquiring RTNL in the context of flashing introduces
locking ordering issues: RTNL must be acquired before the netdev
instance lock and released after it.
This patch therefore takes the simpler approach by no longer dropping
and reacquiring the RTNL, as soon RTNL for ethtool will be removed,
leaving only the instance lock to protect against races.
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1747829342-1018757-5-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
After the last patch removing vlan_rwsem, it is an incremental step to
allow ipoib to work with netdevs that require the instance lock.
In several places, netdev_lock() is changed to netdev_lock_ops_to_full()
which takes care of not acquiring the lock again when the netdev is
already locked.
In ipoib_ib_tx_timeout_work() and __ipoib_ib_dev_flush() for HEAVY
flushes, the netdev lock is acquired/released. This is needed because
these functions end up calling .ndo_stop()/.ndo_open() on subinterfaces,
and the device may expect the netdev instance lock to be held.
ipoib_set_mode() now explicitly acquires ops lock while manipulating the
features, mtu and tx queues.
Finally, ipoib_napi_enable()/ipoib_napi_disable() now use the *_locked
variants of the napi_enable()/napi_disable() calls and optionally
acquire the netdev lock themselves depending on the dev they operate on.
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1747829342-1018757-4-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
vlan_rwsem was added more than a decade ago to work around a deadlock
involving the original mutex being acquired twice, once from the wq.
Subsequent changes then tweaked it to partially protect access to
ipoib_dev_priv->child_intfs together with the RTNL. Flushing the wq
synchronously was also since then refactored to happen separately.
This semaphore unfortunately prevents updating ipoib to work with
devices that require the netdev lock, because of lock ordering issues
between RTNL, vlan_rwsem and the netdev instance locks of parent and
child devices.
To uncomplicate things, this commit replaces vlan_rwsem with the netdev
instance lock of the parent device. Both parent child_intfs list and the
children's list membership in it require holding the parent netdev
instance lock.
All call paths were carefully reviewed and no-longer-needed ASSERT_RTNL
calls were dropped. Some non-trivial changes:
- ipoib_match_gid_pkey_addr() now only acquires the instance lock and
iterates through child_intfs for the first level of recursion (the
parent), as it's not possible to have multiple levels of nested
subinterfaces.
- ipoib_open() and ipoib_stop() schedule tasks on the global workqueue
to open/stop child interfaces to avoid potentially acquiring nested
netdev instance locks. To avoid the device going away between the task
scheduling and execution, netdev_hold/netdev_put are used.
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1747829342-1018757-3-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Previously, flushing a netdevice involved first flushing all child
devices from the flush task itself. That requires holding the lock that
protects the list for the entire duration of the flush.
This poses a problem when converting from vlan_rwsem to the netdev
instance lock (next patch), because holding the parent lock while
trying to acquire a child lock makes lockdep unhappy, rightfully.
Fix this by splitting a big flush task into individual flush tasks
(all are already created in their respective ipoib_dev_priv structs)
and defining a helper function to enqueue all of them while holding the
list lock.
In ipoib_set_mac, the function is not used and the task is enqueued
directly, because in the subsequent patches locking is changed and this
function may be called with the netdev instance lock held.
This is effectively a noop, the wq is single-threaded and ordered and
will execute the same flush operations in the same order as before.
Furthermore, there should be no new races because
ipoib_parent_unregister_pre() calls flush_workqueue() after stopping new
work generation to wait for pending work to complete. flush_workqueue()
waits for all currently enqueued work to finish before returning.
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1747829342-1018757-2-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
"This deals with a crash in the Qualcomm pin controller GPIO
parts when using hogs.
The first patch to gpiolib makes gpiochip_line_is_valid()
NULL-tolerant.
The second patch fixes the actual problem"
* tag 'pinctrl-v6.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: qcom: switch to devm_register_sys_off_handler()
gpiolib: don't crash on enabling GPIO HOG pins
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A collection of small fixes for 6.15 final. It became slightly a
higher amount than expected, but all look easy and safe to apply:
- A fix for PCM core race spotted by fuzzing
- ASoC topology fix for single DAI link
- UAF fix for ASoC SOF Intel HD-audio at reloading
- ASoC SOF Intel and Mediatek fixes
- Trivial HD-audio quirks as usual"
* tag 'sound-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/realtek - Add new HP ZBook laptop with micmute led fixup
ALSA: hda/realtek: Add support for HP Agusta using CS35L41 HDA
ALSA: hda/realtek: Add quirk for Lenovo Yoga Pro 7 14ASP10
ALSA: hda/realtek - restore auto-mute mode for Dell Chrome platform
ALSA: pcm: Fix race of buffer access at PCM OSS layer
ASoC: SOF: Intel: hda: Fix UAF when reloading module
ASoc: SOF: topology: connect DAI to a single DAI link
ASoC: SOF: Intel: hda-bus: Use PIO mode on ACE2+ platforms
ASoC: SOF: ipc4-pcm: Delay reporting is only supported for playback direction
ASoC: SOF: ipc4-control: Use SOF_CTRL_CMD_BINARY as numid for bytes_ext
ASoC: mediatek: mt8188-mt6359: Depend on MT6359_ACCDET set or disabled
ASoC: mediatek: mt8188-mt6359: select CONFIG_SND_SOC_MT6359_ACCDET
|
|
When xdp is attached or detached, dev->ndo_bpf() is called by
do_setlink(), and it acquires netdev_lock() if needed.
Unlike other drivers, the bnxt driver is protected by netdev_lock while
xdp is attached/detached because it sets dev->request_ops_lock to true.
So, the bnxt_xdp(), that is callback of ->ndo_bpf should not acquire
netdev_lock().
But the xdp_features_{set | clear}_redirect_target() was changed to
acquire netdev_lock() internally.
It causes a deadlock.
To fix this problem, bnxt driver should use
xdp_features_{set | clear}_redirect_target_locked() instead.
Splat looks like:
============================================
WARNING: possible recursive locking detected
6.15.0-rc6+ #1 Not tainted
--------------------------------------------
bpftool/1745 is trying to acquire lock:
ffff888131b85038 (&dev->lock){+.+.}-{4:4}, at: xdp_features_set_redirect_target+0x1f/0x80
but task is already holding lock:
ffff888131b85038 (&dev->lock){+.+.}-{4:4}, at: do_setlink.constprop.0+0x24e/0x35d0
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&dev->lock);
lock(&dev->lock);
*** DEADLOCK ***
May be due to missing lock nesting notation
3 locks held by bpftool/1745:
#0: ffffffffa56131c8 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_setlink+0x1fe/0x570
#1: ffffffffaafa75a0 (&net->rtnl_mutex){+.+.}-{4:4}, at: rtnl_setlink+0x236/0x570
#2: ffff888131b85038 (&dev->lock){+.+.}-{4:4}, at: do_setlink.constprop.0+0x24e/0x35d0
stack backtrace:
CPU: 1 UID: 0 PID: 1745 Comm: bpftool Not tainted 6.15.0-rc6+ #1 PREEMPT(undef)
Hardware name: ASUS System Product Name/PRIME Z690-P D4, BIOS 0603 11/01/2021
Call Trace:
<TASK>
dump_stack_lvl+0x7a/0xd0
print_deadlock_bug+0x294/0x3d0
__lock_acquire+0x153b/0x28f0
lock_acquire+0x184/0x340
? xdp_features_set_redirect_target+0x1f/0x80
__mutex_lock+0x1ac/0x18a0
? xdp_features_set_redirect_target+0x1f/0x80
? xdp_features_set_redirect_target+0x1f/0x80
? __pfx_bnxt_rx_page_skb+0x10/0x10 [bnxt_en
? __pfx___mutex_lock+0x10/0x10
? __pfx_netdev_update_features+0x10/0x10
? bnxt_set_rx_skb_mode+0x284/0x540 [bnxt_en
? __pfx_bnxt_set_rx_skb_mode+0x10/0x10 [bnxt_en
? xdp_features_set_redirect_target+0x1f/0x80
xdp_features_set_redirect_target+0x1f/0x80
bnxt_xdp+0x34e/0x730 [bnxt_en 11cbcce8fa11cff1dddd7ef358d6219e4ca9add3]
dev_xdp_install+0x3f4/0x830
? __pfx_bnxt_xdp+0x10/0x10 [bnxt_en 11cbcce8fa11cff1dddd7ef358d6219e4ca9add3]
? __pfx_dev_xdp_install+0x10/0x10
dev_xdp_attach+0x560/0xf70
dev_change_xdp_fd+0x22d/0x280
do_setlink.constprop.0+0x2989/0x35d0
? __pfx_do_setlink.constprop.0+0x10/0x10
? lock_acquire+0x184/0x340
? find_held_lock+0x32/0x90
? rtnl_setlink+0x236/0x570
? rcu_is_watching+0x11/0xb0
? trace_contention_end+0xdc/0x120
? __mutex_lock+0x946/0x18a0
? __pfx___mutex_lock+0x10/0x10
? __lock_acquire+0xa95/0x28f0
? rcu_is_watching+0x11/0xb0
? rcu_is_watching+0x11/0xb0
? cap_capable+0x172/0x350
rtnl_setlink+0x2cd/0x570
Fixes: 03df156dd3a6 ("xdp: double protect netdev->xdp_flags with netdev->lock")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250520071155.2462843-1-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
With a VRF, ipv4 and ipv6 FIB expression behave differently.
fib daddr . iif oif
Will return the input interface name for ipv4, but the real device
for ipv6. Example:
If VRF device name is tvrf and real (incoming) device is veth0.
First round is ok, both ipv4 and ipv6 will yield 'veth0'.
But in the second round (incoming device will be set to "tvrf"), ipv4
will yield "tvrf" whereas ipv6 returns "veth0" for the second round too.
This makes ipv6 behave like ipv4.
A followup patch will add a test case for this, without this change
it will fail with:
get element inet t fibif6iif { tvrf . dead:1::99 . tvrf }
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FAIL: did not find tvrf . dead:1::99 . tvrf in fibif6iif
Alternatively we could either not do anything at all or change
ipv4 to also return the lower/real device, however, nft (userspace)
doc says "iif: if fib lookup provides a route then check its output
interface is identical to the packets input interface." which is what
the nft fib ipv4 behaviour is.
Fixes: f6d0cbcf09c5 ("netfilter: nf_tables: add fib expression")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
It was located in conntrack_vrf.sh because that already had the VRF bits.
Lets not add to this and move it to nft_fib.sh where this belongs.
No functional changes for the subtest intended.
The subtest is limited, it only covered 'fib oif'
(route output interface query) when the incoming interface is part
of a VRF.
Next we can extend it to cover 'fib type' for VRFs and also check fib
results when there is an unrelated VRF in same netns.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
fib can either lookup the interface id/name of the output interface that
would be used for the given address, or it can check for the type of the
address according to the fib, e.g. local, unicast, multicast and so on.
This can be used to e.g. make a locally configured address only reachable
through its interface.
Example: given eth0:10.1.1.1 and eth1:10.1.2.1 then 'fib daddr type' for
10.1.1.1 arriving on eth1 will be 'local', but 'fib daddr . iif type' is
expected to return 'unicast', whereas 'fib daddr' and 'fib daddr . iif'
are expected to indicate 'local' if such a packet arrives on eth0.
So far nft_fib.sh only covered oif/oifname, not type.
Repeat tests both with default and a policy (ip rule) based setup.
Also try to run all remaining tests even if a subtest has failed.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|