summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2024-11-15netfilter: bitwise: add support for doing AND, OR and XOR directlyJeremy Sowden
Hitherto, these operations have been converted in user space to mask-and-xor operations on one register and two immediate values, and it is the latter which have been evaluated by the kernel. We add support for evaluating these operations directly in kernel space on one register and either an immediate value or a second register. Pablo made a few changes to the original patch: - EINVAL if NFTA_BITWISE_SREG2 is used with fast version. - Allow _AND,_OR,_XOR with _DATA != sizeof(u32) - Dump _SREG2 or _DATA with _AND,_OR,_XOR Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-15netfilter: bitwise: rename some boolean operation functionsJeremy Sowden
In the next patch we add support for doing AND, OR and XOR operations directly in the kernel, so rename some functions and an enum constant related to mask-and-xor boolean operations. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-15netfilter: nf_dup4: Convert nf_dup_ipv4_route() to dscp_t.Guillaume Nault
Use ip4h_dscp() instead of reading iph->tos directly. ip4h_dscp() returns a dscp_t value which is temporarily converted back to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to dscp_t in the future, we'll only have to remove that inet_dscp_to_dsfield() call. Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-15netfilter: nft_fib: Convert nft_fib4_eval() to dscp_t.Guillaume Nault
Use ip4h_dscp() instead of reading iph->tos directly. ip4h_dscp() returns a dscp_t value which is temporarily converted back to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to dscp_t in the future, we'll only have to remove that inet_dscp_to_dsfield() call. Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-15netfilter: rpfilter: Convert rpfilter_mt() to dscp_t.Guillaume Nault
Use ip4h_dscp() instead of reading iph->tos directly. ip4h_dscp() returns a dscp_t value which is temporarily converted back to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to dscp_t in the future, we'll only have to remove that inet_dscp_to_dsfield() call. Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-15netfilter: flow_offload: Convert nft_flow_route() to dscp_t.Guillaume Nault
Use ip4h_dscp()instead of reading ip_hdr()->tos directly. ip4h_dscp() returns a dscp_t value which is temporarily converted back to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to dscp_t in the future, we'll only have to remove that inet_dscp_to_dsfield() call. Also, remove the comment about the net/ip.h include file, since it's now required for the ip4h_dscp() helper too. Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-15netfilter: ipv4: Convert ip_route_me_harder() to dscp_t.Guillaume Nault
Use ip4h_dscp()instead of reading iph->tos directly. ip4h_dscp() returns a dscp_t value which is temporarily converted back to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to dscp_t in the future, we'll only have to remove that inet_dscp_to_dsfield() call. Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-15xfrm: Fix acquire state insertion.Steffen Klassert
A recent commit jumped over the dst hash computation and left the symbol uninitialized. Fix this by explicitly computing the dst hash before it is used. Fixes: 0045e3d80613 ("xfrm: Cache used outbound xfrm states at the policy.") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2024-11-14net: ethtool: account for RSS+RXNFC add semantics when checking channel countEdward Cree
In ethtool_check_max_channel(), the new RX count must not only cover the max queue indices in RSS indirection tables and RXNFC destinations separately, but must also, for RXNFC rules with FLOW_RSS, cover the sum of the destination queue and the maximum index in the associated RSS context's indirection table, since that is the highest queue that the rule can actually deliver traffic to. It could be argued that the max queue across all custom RSS contexts (ethtool_get_max_rss_ctx_channel()) need no longer be considered, since any context to which packets can actually be delivered will be targeted by some RXNFC rule and its max will thus be allowed for by ethtool_get_max_rxnfc_channel(). For simplicity we keep both checks, so even RSS contexts unused by any RXNFC rule must fit the channel count. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/43257d375434bef388e36181492aa4c458b88336.1731499022.git.ecree.xilinx@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14net: ethtool: only allow set_rxnfc with rss + ring_cookie if driver opts inEdward Cree
Ethtool ntuple filters with FLOW_RSS were originally defined as adding the base queue ID (ring_cookie) to the value from the indirection table, so that the same table could distribute over more than one set of queues when used by different filters. However, some drivers / hardware ignore the ring_cookie, and simply use the indirection table entries as queue IDs directly. Thus, for drivers which have not opted in by setting ethtool_ops.cap_rss_rxnfc_adds to declare that they support the original (addition) semantics, reject in ethtool_set_rxnfc any filter which combines FLOW_RSS and a nonzero ring. (For a ring_cookie of zero, both behaviours are equivalent.) Set the cap bit in sfc, as it is known to support this feature. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com> Link: https://patch.msgid.link/cc3da0844083b0e301a33092a6299e4042b65221.1731499022.git.ecree.xilinx@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14Merge tag 'for-netdev' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Martin KaFai Lau says: ==================== pull-request: bpf-next 2024-11-14 We've added 9 non-merge commits during the last 4 day(s) which contain a total of 3 files changed, 226 insertions(+), 84 deletions(-). The main changes are: 1) Fixes to bpf_msg_push/pop_data and test_sockmap. The changes has dependency on the other changes in the bpf-next/net branch, from Zijian Zhang. 2) Drop netns codes from mptcp test. Reuse the common helpers in test_progs, from Geliang Tang. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: bpf, sockmap: Fix sk_msg_reset_curr bpf, sockmap: Several fixes to bpf_msg_pop_data bpf, sockmap: Several fixes to bpf_msg_push_data selftests/bpf: Add more tests for test_txmsg_push_pop in test_sockmap selftests/bpf: Add push/pop checking for msg_verify_data in test_sockmap selftests/bpf: Fix total_bytes in msg_loop_rx in test_sockmap selftests/bpf: Fix SENDPAGE data logic in test_sockmap selftests/bpf: Add txmsg_pass to pull/push/pop in test_sockmap selftests/bpf: Drop netns helpers in mptcp ==================== Link: https://patch.msgid.link/20241114202832.3187927-1-martin.lau@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14bpf: lwtunnel: Prepare bpf_lwt_xmit_reroute() to future .flowi4_tos conversion.Guillaume Nault
Use ip4h_dscp() to get the DSCP from the IPv4 header, then convert the dscp_t value to __u8 with inet_dscp_to_dsfield(). Then, when we'll convert .flowi4_tos to dscp_t, we'll just have to drop the inet_dscp_to_dsfield() call. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/8338a12377c44f698a651d1ce357dd92bdf18120.1731064982.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14bpf: ipv4: Prepare __bpf_redirect_neigh_v4() to future .flowi4_tos conversion.Guillaume Nault
Use ip4h_dscp() to get the DSCP from the IPv4 header, then convert the dscp_t value to __u8 with inet_dscp_to_dsfield(). Then, when we'll convert .flowi4_tos to dscp_t, we'll just have to drop the inet_dscp_to_dsfield() call. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/35eacc8955003e434afb1365d404193cc98a9579.1731064982.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14Bluetooth: MGMT: Add initial implementation of MGMT_OP_HCI_CMD_SYNCLuiz Augusto von Dentz
This adds the initial implementation of MGMT_OP_HCI_CMD_SYNC as documented in mgmt-api (BlueZ tree): Send HCI command and wait for event Command =========================================== Command Code: 0x005B Controller Index: <controller id> Command Parameters: Opcode (2 Octets) Event (1 Octet) Timeout (1 Octet) Parameter Length (2 Octets) Parameter (variable) Return Parameters: Response (1-variable Octets) This command may be used to send a HCI command and wait for an (optional) event. The HCI command is specified by the Opcode, any arbitrary is supported including vendor commands, but contrary to the like of Raw/User channel it is run as an HCI command send by the kernel since it uses its command synchronization thus it is possible to wait for a specific event as a response. Setting event to 0x00 will cause the command to wait for either HCI Command Status or HCI Command Complete. Timeout is specified in seconds, setting it to 0 will cause the default timeout to be used. Possible errors: Failed Invalid Parameters Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-11-14Bluetooth: fix use-after-free in device_for_each_child()Dmitry Antipov
Syzbot has reported the following KASAN splat: BUG: KASAN: slab-use-after-free in device_for_each_child+0x18f/0x1a0 Read of size 8 at addr ffff88801f605308 by task kbnepd bnep0/4980 CPU: 0 UID: 0 PID: 4980 Comm: kbnepd bnep0 Not tainted 6.12.0-rc4-00161-gae90f6a6170d #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x100/0x190 ? device_for_each_child+0x18f/0x1a0 print_report+0x13a/0x4cb ? __virt_addr_valid+0x5e/0x590 ? __phys_addr+0xc6/0x150 ? device_for_each_child+0x18f/0x1a0 kasan_report+0xda/0x110 ? device_for_each_child+0x18f/0x1a0 ? __pfx_dev_memalloc_noio+0x10/0x10 device_for_each_child+0x18f/0x1a0 ? __pfx_device_for_each_child+0x10/0x10 pm_runtime_set_memalloc_noio+0xf2/0x180 netdev_unregister_kobject+0x1ed/0x270 unregister_netdevice_many_notify+0x123c/0x1d80 ? __mutex_trylock_common+0xde/0x250 ? __pfx_unregister_netdevice_many_notify+0x10/0x10 ? trace_contention_end+0xe6/0x140 ? __mutex_lock+0x4e7/0x8f0 ? __pfx_lock_acquire.part.0+0x10/0x10 ? rcu_is_watching+0x12/0xc0 ? unregister_netdev+0x12/0x30 unregister_netdevice_queue+0x30d/0x3f0 ? __pfx_unregister_netdevice_queue+0x10/0x10 ? __pfx_down_write+0x10/0x10 unregister_netdev+0x1c/0x30 bnep_session+0x1fb3/0x2ab0 ? __pfx_bnep_session+0x10/0x10 ? __pfx_lock_release+0x10/0x10 ? __pfx_woken_wake_function+0x10/0x10 ? __kthread_parkme+0x132/0x200 ? __pfx_bnep_session+0x10/0x10 ? kthread+0x13a/0x370 ? __pfx_bnep_session+0x10/0x10 kthread+0x2b7/0x370 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x48/0x80 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> Allocated by task 4974: kasan_save_stack+0x30/0x50 kasan_save_track+0x14/0x30 __kasan_kmalloc+0xaa/0xb0 __kmalloc_noprof+0x1d1/0x440 hci_alloc_dev_priv+0x1d/0x2820 __vhci_create_device+0xef/0x7d0 vhci_write+0x2c7/0x480 vfs_write+0x6a0/0xfc0 ksys_write+0x12f/0x260 do_syscall_64+0xc7/0x250 entry_SYSCALL_64_after_hwframe+0x77/0x7f Freed by task 4979: kasan_save_stack+0x30/0x50 kasan_save_track+0x14/0x30 kasan_save_free_info+0x3b/0x60 __kasan_slab_free+0x4f/0x70 kfree+0x141/0x490 hci_release_dev+0x4d9/0x600 bt_host_release+0x6a/0xb0 device_release+0xa4/0x240 kobject_put+0x1ec/0x5a0 put_device+0x1f/0x30 vhci_release+0x81/0xf0 __fput+0x3f6/0xb30 task_work_run+0x151/0x250 do_exit+0xa79/0x2c30 do_group_exit+0xd5/0x2a0 get_signal+0x1fcd/0x2210 arch_do_signal_or_restart+0x93/0x780 syscall_exit_to_user_mode+0x140/0x290 do_syscall_64+0xd4/0x250 entry_SYSCALL_64_after_hwframe+0x77/0x7f In 'hci_conn_del_sysfs()', 'device_unregister()' may be called when an underlying (kobject) reference counter is greater than 1. This means that reparenting (happened when the device is actually freed) is delayed and, during that delay, parent controller device (hciX) may be deleted. Since the latter may create a dangling pointer to freed parent, avoid that scenario by reparenting to NULL explicitly. Reported-by: syzbot+6cf5652d3df49fae2e3f@syzkaller.appspotmail.com Tested-by: syzbot+6cf5652d3df49fae2e3f@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=6cf5652d3df49fae2e3f Fixes: a85fb91e3d72 ("Bluetooth: Fix double free in hci_conn_cleanup") Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-11-14Bluetooth: hci_core: Fix calling mgmt_device_connectedLuiz Augusto von Dentz
Since 61a939c68ee0 ("Bluetooth: Queue incoming ACL data until BT_CONNECTED state is reached") there is no long the need to call mgmt_device_connected as ACL data will be queued until BT_CONNECTED state. Link: https://bugzilla.kernel.org/show_bug.cgi?id=219458 Link: https://github.com/bluez/bluez/issues/1014 Fixes: 333b4fd11e89 ("Bluetooth: L2CAP: Fix uaf in l2cap_connect") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-11-14Bluetooth: ISO: Send BIG Create Sync via hci_syncIulia Tanasescu
Before issuing the LE BIG Create Sync command, an available BIG handle is chosen by iterating through the conn_hash list and finding the first unused value. If a BIG is terminated, the associated hcons are removed from the list and the LE BIG Terminate Sync command is sent via hci_sync queue. However, a new LE BIG Create sync command might be issued via hci_send_cmd, before the previous BIG sync was terminated. This can cause the same BIG handle to be reused and the LE BIG Create Sync to fail with Command Disallowed. < HCI Command: LE Broadcast Isochronous Group Create Sync (0x08|0x006b) BIG Handle: 0x00 BIG Sync Handle: 0x0002 Encryption: Unencrypted (0x00) Broadcast Code[16]: 00000000000000000000000000000000 Maximum Number Subevents: 0x00 Timeout: 20000 ms (0x07d0) Number of BIS: 1 BIS ID: 0x01 > HCI Event: Command Status (0x0f) plen 4 LE Broadcast Isochronous Group Create Sync (0x08|0x006b) ncmd 1 Status: Command Disallowed (0x0c) < HCI Command: LE Broadcast Isochronous Group Terminate Sync (0x08|0x006c) BIG Handle: 0x00 This commit fixes the ordering of the LE BIG Create Sync/LE BIG Terminate Sync commands, to make sure that either the previous BIG sync is terminated before reusing the handle, or that a new handle is chosen for a new sync. Fixes: eca0ae4aea66 ("Bluetooth: Add initial implementation of BIS connections") Signed-off-by: Iulia Tanasescu <iulia.tanasescu@nxp.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-11-14Bluetooth: hci_conn: Remove alloc from critical sectionIulia Tanasescu
This removes the kzalloc memory allocation inside critical section in create_pa_sync, fixing the following message that appears when the kernel is compiled with CONFIG_DEBUG_ATOMIC_SLEEP enabled: BUG: sleeping function called from invalid context at include/linux/sched/mm.h:321 Signed-off-by: Iulia Tanasescu <iulia.tanasescu@nxp.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-11-14Bluetooth: ISO: Use kref to track lifetime of iso_connLuiz Augusto von Dentz
This make use of kref to keep track of reference of iso_conn which allows better tracking of its lifetime with usage of things like kref_get_unless_zero in a similar way as used in l2cap_chan. In addition to it remove call to iso_sock_set_timer on iso_sock_disconn since at that point it is useless to set a timer as the sk will be freed there is nothing to be done in iso_sock_timeout. Fixes: ccf74f2390d6 ("Bluetooth: Add BTPROTO_ISO socket type") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-11-14Bluetooth: SCO: Use kref to track lifetime of sco_connLuiz Augusto von Dentz
This make use of kref to keep track of reference of sco_conn which allows better tracking of its lifetime with usage of things like kref_get_unless_zero in a similar way as used in l2cap_chan. In addition to it remove call to sco_sock_set_timer on __sco_sock_close since at that point it is useless to set a timer as the sk will be freed there is nothing to be done in sco_sock_timeout. Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-11-14Bluetooth: ISO: Update hci_conn_hash_lookup_big for Broadcast slaveIulia Tanasescu
Currently, hci_conn_hash_lookup_big only checks for BIS master connections, by filtering out connections with the destination address set. This commit updates this function to also consider BIS slave connections, since it is also used for a Broadcast Receiver to set an available BIG handle before issuing the LE BIG Create Sync command. Signed-off-by: Iulia Tanasescu <iulia.tanasescu@nxp.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-11-14Bluetooth: ISO: Do not emit LE BIG Create Sync if previous is pendingIulia Tanasescu
The Bluetooth Core spec does not allow a LE BIG Create sync command to be sent to Controller if another one is pending (Vol 4, Part E, page 2586). In order to avoid this issue, the HCI_CONN_CREATE_BIG_SYNC was added to mark that the LE BIG Create Sync command has been sent for a hcon. Once the BIG Sync Established event is received, the hcon flag is erased and the next pending hcon is handled. Signed-off-by: Iulia Tanasescu <iulia.tanasescu@nxp.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-11-14Bluetooth: ISO: Fix matching parent socket for BIS slaveIulia Tanasescu
Currently, when a BIS slave connection is notified to the ISO layer, the parent socket is tried to be matched by the HCI_EVT_LE_BIG_SYNC_ESTABILISHED event. However, a BIS slave connection is notified to the ISO layer after the Command Complete for the LE Setup ISO Data Path command is received. This causes the parent to be incorrectly matched if multiple listen sockets are present. This commit adds a fix by matching the parent based on the BIG handle set in the notified connection. Signed-off-by: Iulia Tanasescu <iulia.tanasescu@nxp.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-11-14Bluetooth: ISO: Do not emit LE PA Create Sync if previous is pendingIulia Tanasescu
The Bluetooth Core spec does not allow a LE PA Create sync command to be sent to Controller if another one is pending (Vol 4, Part E, page 2493). In order to avoid this issue, the HCI_CONN_CREATE_PA_SYNC was added to mark that the LE PA Create Sync command has been sent for a hcon. Once the PA Sync Established event is received, the hcon flag is erased and the next pending hcon is handled. Signed-off-by: Iulia Tanasescu <iulia.tanasescu@nxp.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-11-14Bluetooth: Support new quirks for ATS2851Danil Pylaev
This adds support for quirks for broken extended create connection, and write auth payload timeout. Signed-off-by: Danil Pylaev <danstiv404@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-11-14Bluetooth: Fix type of len in rfcomm_sock_getsockopt{,_old}()Andrej Shadura
Commit 9bf4e919ccad worked around an issue introduced after an innocuous optimisation change in LLVM main: > len is defined as an 'int' because it is assigned from > '__user int *optlen'. However, it is clamped against the result of > sizeof(), which has a type of 'size_t' ('unsigned long' for 64-bit > platforms). This is done with min_t() because min() requires compatible > types, which results in both len and the result of sizeof() being casted > to 'unsigned int', meaning len changes signs and the result of sizeof() > is truncated. From there, len is passed to copy_to_user(), which has a > third parameter type of 'unsigned long', so it is widened and changes > signs again. This excessive casting in combination with the KCSAN > instrumentation causes LLVM to fail to eliminate the __bad_copy_from() > call, failing the build. The same issue occurs in rfcomm in functions rfcomm_sock_getsockopt and rfcomm_sock_getsockopt_old. Change the type of len to size_t in both rfcomm_sock_getsockopt and rfcomm_sock_getsockopt_old and replace min_t() with min(). Cc: stable@vger.kernel.org Co-authored-by: Aleksei Vetrov <vvvvvv@google.com> Improves: 9bf4e919ccad ("Bluetooth: Fix type of len in {l2cap,sco}_sock_getsockopt_old()") Link: https://github.com/ClangBuiltLinux/linux/issues/2007 Link: https://github.com/llvm/llvm-project/issues/85647 Signed-off-by: Andrej Shadura <andrew.shadura@collabora.co.uk> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-11-14Bluetooth: hci_core: Fix not checking skb length on hci_scodata_packetLuiz Augusto von Dentz
This fixes not checking if skb really contains an SCO header otherwise the code may attempt to access some uninitilized/invalid memory past the valid skb->data. Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-11-14Bluetooth: hci_core: Fix not checking skb length on hci_acldata_packetLuiz Augusto von Dentz
This fixes not checking if skb really contains an ACL header otherwise the code may attempt to access some uninitilized/invalid memory past the valid skb->data. Reported-by: syzbot+6ea290ba76d8c1eb1ac2@syzkaller.appspotmail.com Tested-by: syzbot+6ea290ba76d8c1eb1ac2@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=6ea290ba76d8c1eb1ac2 Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-11-14Bluetooth: hci_conn: Use disable_delayed_work_syncLuiz Augusto von Dentz
This makes use of disable_delayed_work_sync instead cancel_delayed_work_sync as it not only cancel the ongoing work but also disables new submit which is disarable since the object holding the work is about to be freed. Reported-by: syzbot+2446dd3cb07277388db6@syzkaller.appspotmail.com Tested-by: syzbot+2446dd3cb07277388db6@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=2446dd3cb07277388db6 Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-11-14Bluetooth: hci_conn: Reduce hci_conn_drop() calls in two functionsMarkus Elfring
An hci_conn_drop() call was immediately used after a null pointer check for an hci_conn_link() call in two function implementations. Thus call such a function only once instead directly before the checks. This issue was transformed by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-11-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.12-rc8). Conflicts: tools/testing/selftests/net/.gitignore 252e01e68241 ("selftests: net: add netlink-dumps to .gitignore") be43a6b23829 ("selftests: ncdevmem: Move ncdevmem under drivers/net/hw") https://lore.kernel.org/all/20241113122359.1b95180a@canb.auug.org.au/ drivers/net/phy/phylink.c 671154f174e0 ("net: phylink: ensure PHY momentary link-fails are handled") 7530ea26c810 ("net: phylink: remove "using_mac_select_pcs"") Adjacent changes: drivers/net/ethernet/stmicro/stmmac/dwmac-intel-plat.c 5b366eae7193 ("stmmac: dwmac-intel-plat: fix call balance of tx_clk handling routines") e96321fad3ad ("net: ethernet: Switch back to struct platform_driver::remove()") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14Merge tag 'net-6.12-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from bluetooth. Quite calm week. No new regression under investigation. Current release - regressions: - eth: revert "igb: Disable threaded IRQ for igb_msix_other" Current release - new code bugs: - bluetooth: btintel: direct exception event to bluetooth stack Previous releases - regressions: - core: fix data-races around sk->sk_forward_alloc - netlink: terminate outstanding dump on socket close - mptcp: error out earlier on disconnect - vsock: fix accept_queue memory leak - phylink: ensure PHY momentary link-fails are handled - eth: mlx5: - fix null-ptr-deref in add rule err flow - lock FTE when checking if active - eth: dwmac-mediatek: fix inverted handling of mediatek,mac-wol Previous releases - always broken: - sched: fix u32's systematic failure to free IDR entries for hnodes. - sctp: fix possible UAF in sctp_v6_available() - eth: bonding: add ns target multicast address to slave device - eth: mlx5: fix msix vectors to respect platform limit - eth: icssg-prueth: fix 1 PPS sync" * tag 'net-6.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (38 commits) net: sched: u32: Add test case for systematic hnode IDR leaks selftests: bonding: add ns multicast group testing bonding: add ns target multicast address to slave device net: ti: icssg-prueth: Fix 1 PPS sync stmmac: dwmac-intel-plat: fix call balance of tx_clk handling routines net: Make copy_safe_from_sockptr() match documentation net: stmmac: dwmac-mediatek: Fix inverted handling of mediatek,mac-wol ipmr: Fix access to mfc_cache_list without lock held samples: pktgen: correct dev to DEV net: phylink: ensure PHY momentary link-fails are handled mptcp: pm: use _rcu variant under rcu_read_lock mptcp: hold pm lock when deleting entry mptcp: update local address flags when setting it net: sched: cls_u32: Fix u32's systematic failure to free IDR entries for hnodes. MAINTAINERS: Re-add cancelled Renesas driver sections Revert "igb: Disable threaded IRQ for igb_msix_other" Bluetooth: btintel: Direct exception event to bluetooth stack Bluetooth: hci_core: Fix calling mgmt_device_connected virtio/vsock: Improve MSG_ZEROCOPY error handling vsock: Fix sk_error_queue memory leak ...
2024-11-14netfilter: ipset: add missing range check in bitmap_ip_uadtJeongjun Park
When tb[IPSET_ATTR_IP_TO] is not present but tb[IPSET_ATTR_CIDR] exists, the values of ip and ip_to are slightly swapped. Therefore, the range check for ip should be done later, but this part is missing and it seems that the vulnerability occurs. So we should add missing range checks and remove unnecessary range checks. Cc: <stable@vger.kernel.org> Reported-by: syzbot+58c872f7790a4d2ac951@syzkaller.appspotmail.com Fixes: 72205fc68bd1 ("netfilter: ipset: bitmap:ip set type support") Signed-off-by: Jeongjun Park <aha310510@gmail.com> Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-14netfilter: nf_tables: allocate element update information dynamicallyFlorian Westphal
Move the timeout/expire/flag members from nft_trans_one_elem struct into a dybamically allocated structure, only needed when timeout update was requested. This halves size of nft_trans_one_elem struct and allows to compact up to 124 elements in one transaction container rather than 62. This halves memory requirements for a large flush or insert transaction, where ->update remains NULL. Care has to be taken to release the extra data in all spots, including abort path. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-14netfilter: nf_tables: switch trans_elem to real flex arrayFlorian Westphal
When queueing a set element add or removal operation to the transaction log, check if the previous operation already asks for a the identical operation on the same set. If so, store the element reference in the preceding operation. This significantlty reduces memory consumption when many set add/delete operations appear in a single transaction. Example: 10k elements require 937kb of memory (10k allocations from kmalloc-96 slab). Assuming we can compact 4 elements in the same set, 468 kbytes are needed (64 bytes for base struct, nft_trans_elemn, 32 bytes for nft_trans_one_elem structure, so 2500 allocations from kmalloc-192 slab). For large batch updates we can compact up to 62 elements into one single nft_trans_elem structure (~65% mem reduction): (64 bytes for base struct, nft_trans_elem, 32 byte for nft_trans_one_elem struct). We can halve size of nft_trans_one_elem struct by moving timeout/expire/update_flags into a dynamically allocated structure, this allows to store 124 elements in a 2k slab nft_trans_elem struct. This is done in a followup patch. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-14netfilter: nf_tables: prepare nft audit for set element compactionFlorian Westphal
nftables audit log format emits the number of added/deleted rules, sets, set elements and so on, to userspace: table=t1 family=2 entries=4 op=nft_register_set ~~~~~~~~~ At this time, the 'entries' key is the number of transactions that will be applied. The upcoming set element compression will coalesce subsequent adds/deletes to the same set requests in the same transaction request to conseve memory. Without this patch, we'd under-report the number of altered elements. Increment the audit counter by the number of elements to keep the reported entries value the same. Without this, nft_audit.sh selftest fails because the recorded (expected) entries key is smaller than the expected one. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-14netfilter: nf_tables: prepare for multiple elements in nft_trans_elem structureFlorian Westphal
Add helpers to release the individual elements contained in the trans_elem container structure. No functional change intended. Followup patch will add 'nelems' member and will turn 'priv' into a flexible array. These helpers can then loop over all elements. Care needs to be taken to handle a mix of new elements and existing elements that are being updated (e.g. timeout refresh). Before this patch, NEWSETELEM transaction with update is released early so nft_trans_set_elem_destroy() won't get called, so we need to skip elements marked as update. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-14netfilter: nf_tables: add nft_trans_commit_list_add_elem helperFlorian Westphal
Add and use a wrapper to append trans_elem structures to the transaction log. Unlike the existing helper, pass a gfp_t to indicate if sleeping is allowed. This will be used by a followup patch to realloc nft_trans_elem structures after they gain a flexible array member to reduce number of such container structures on the transaction list. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-14netfilter: bpf: Pass string literal as format argument of request_module()Simon Horman
Both gcc-14 and clang-18 report that passing a non-string literal as the format argument of request_module() is potentially insecure. E.g. clang-18 says: .../nf_bpf_link.c:46:24: warning: format string is not a string literal (potentially insecure) [-Wformat-security] 46 | err = request_module(mod); | ^~~ .../kmod.h:25:55: note: expanded from macro 'request_module' 25 | #define request_module(mod...) __request_module(true, mod) | ^~~ .../nf_bpf_link.c:46:24: note: treat the string as an argument to avoid this 46 | err = request_module(mod); | ^ | "%s", .../kmod.h:25:55: note: expanded from macro 'request_module' 25 | #define request_module(mod...) __request_module(true, mod) | ^ It is always the case where the contents of mod is safe to pass as the format argument. That is, in my understanding, it never contains any format escape sequences. But, it seems better to be safe than sorry. And, as a bonus, compiler output becomes less verbose by addressing this issue as suggested by clang-18. Compile tested only. Signed-off-by: Simon Horman <horms@kernel.org> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-14netfilter: nfnetlink: Report extack policy errors for batched opsDonald Hunter
The nftables batch processing does not currently populate extack with policy errors. Fix this by passing extack when parsing batch messages. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-14xfrm: replace deprecated strncpy with strscpy_padDaniel Yang
The function strncpy is deprecated since it does not guarantee the destination buffer is NULL terminated. Recommended replacement is strscpy. The padded version was used to remain consistent with the other strscpy_pad usage in the modified function. Signed-off-by: Daniel Yang <danielyangkang@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2024-11-14xfrm: Add error handling when nla_put_u32() returns an errorEverest K.C
Error handling is missing when call to nla_put_u32() fails. Handle the error when the call to nla_put_u32() returns an error. The error was reported by Coverity Scan. Report: CID 1601525: (#1 of 1): Unused value (UNUSED_VALUE) returned_value: Assigning value from nla_put_u32(skb, XFRMA_SA_PCPU, x->pcpu_num) to err here, but that stored value is overwritten before it can be used Fixes: 1ddf9916ac09 ("xfrm: Add support for per cpu xfrm state handling.") Signed-off-by: Everest K.C. <everestkc@everestkc.com.np> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2024-11-13ipmr: Fix access to mfc_cache_list without lock heldBreno Leitao
Accessing `mr_table->mfc_cache_list` is protected by an RCU lock. In the following code flow, the RCU read lock is not held, causing the following error when `RCU_PROVE` is not held. The same problem might show up in the IPv6 code path. 6.12.0-rc5-kbuilder-01145-gbac17284bdcb #33 Tainted: G E N ----------------------------- net/ipv4/ipmr_base.c:313 RCU-list traversed in non-reader section!! rcu_scheduler_active = 2, debug_locks = 1 2 locks held by RetransmitAggre/3519: #0: ffff88816188c6c0 (nlk_cb_mutex-ROUTE){+.+.}-{3:3}, at: __netlink_dump_start+0x8a/0x290 #1: ffffffff83fcf7a8 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_dumpit+0x6b/0x90 stack backtrace: lockdep_rcu_suspicious mr_table_dump ipmr_rtm_dumproute rtnl_dump_all rtnl_dumpit netlink_dump __netlink_dump_start rtnetlink_rcv_msg netlink_rcv_skb netlink_unicast netlink_sendmsg This is not a problem per see, since the RTNL lock is held here, so, it is safe to iterate in the list without the RCU read lock, as suggested by Eric. To alleviate the concern, modify the code to use list_for_each_entry_rcu() with the RTNL-held argument. The annotation will raise an error only if RTNL or RCU read lock are missing during iteration, signaling a legitimate problem, otherwise it will avoid this false positive. This will solve the IPv6 case as well, since ip6mr_rtm_dumproute() calls this function as well. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20241108-ipmr_rcu-v2-1-c718998e209b@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-13mptcp: pm: use _rcu variant under rcu_read_lockMatthieu Baerts (NGI0)
In mptcp_pm_create_subflow_or_signal_addr(), rcu_read_(un)lock() are used as expected to iterate over the list of local addresses, but list_for_each_entry() was used instead of list_for_each_entry_rcu() in __lookup_addr(). It is important to use this variant which adds the required READ_ONCE() (and diagnostic checks if enabled). Because __lookup_addr() is also used in mptcp_pm_nl_set_flags() where it is called under the pernet->lock and not rcu_read_lock(), an extra condition is then passed to help the diagnostic checks making sure either the associated spin lock or the RCU lock is held. Fixes: 86e39e04482b ("mptcp: keep track of local endpoint still available for each msk") Cc: stable@vger.kernel.org Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20241112-net-mptcp-misc-6-12-pm-v1-3-b835580cefa8@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-13mptcp: hold pm lock when deleting entryGeliang Tang
When traversing userspace_pm_local_addr_list and deleting an entry from it in mptcp_pm_nl_remove_doit(), msk->pm.lock should be held. This patch holds this lock before mptcp_userspace_pm_lookup_addr_by_id() and releases it after list_move() in mptcp_pm_nl_remove_doit(). Fixes: d9a4594edabf ("mptcp: netlink: Add MPTCP_PM_CMD_REMOVE") Cc: stable@vger.kernel.org Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20241112-net-mptcp-misc-6-12-pm-v1-2-b835580cefa8@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-13mptcp: update local address flags when setting itGeliang Tang
Just like in-kernel pm, when userspace pm does set_flags, it needs to send out MP_PRIO signal, and also modify the flags of the corresponding address entry in the local address list. This patch implements the missing logic. Traverse all address entries on userspace_pm_local_addr_list to find the local address entry, if bkup is true, set the flags of this entry with FLAG_BACKUP, otherwise, clear FLAG_BACKUP. Fixes: 892f396c8e68 ("mptcp: netlink: issue MP_PRIO signals from userspace PMs") Cc: stable@vger.kernel.org Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20241112-net-mptcp-misc-6-12-pm-v1-1-b835580cefa8@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-13Merge tag 'wireless-next-2024-11-13' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Kalle Valo says: ==================== wireless-next patches for v6.13 Most likely the last -next pull request for v6.13. Most changes are in Realtek and Qualcomm drivers, otherwise not really anything noteworthy. Major changes: mac80211 * EHT 1024 aggregation size for transmissions ath12k * switch to using wiphy_lock() and remove ar->conf_mutex * firmware coredump collection support * add debugfs support for a multitude of statistics ath11k * dt: document WCN6855 hardware inputs ath9k * remove include/linux/ath9k_platform.h ath5k * Arcadyan ARV45XX AR2417 & Gigaset SX76[23] AR241[34]A support rtw88: * 8821au and 8812au USB adapters support rtw89 * thermal protection * firmware secure boot for WiFi 6 chip * tag 'wireless-next-2024-11-13' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (154 commits) Revert "wifi: iwlegacy: do not skip frames with bad FCS" wifi: mac80211: pass MBSSID config by reference wifi: mac80211: Support EHT 1024 aggregation size in TX net: rfkill: gpio: Add check for clk_enable() wifi: brcmfmac: Fix oops due to NULL pointer dereference in brcmf_sdiod_sglist_rw() wifi: Switch back to struct platform_driver::remove() wifi: ipw2x00: libipw_rx_any(): fix bad alignment wifi: brcmfmac: release 'root' node in all execution paths wifi: iwlwifi: mvm: don't call power_update_mac in fast suspend wifi: iwlwifi: s/IWL_MVM_INVALID_STA/IWL_INVALID_STA wifi: iwlwifi: bump minimum API version in BZ/SC to 92 wifi: iwlwifi: move IWL_LMAC_*_INDEX to fw/api/context.h wifi: iwlwifi: be less noisy if the NIC is dead in S3 wifi: iwlwifi: mvm: tell iwlmei when we finished suspending wifi: iwlwifi: allow fast resume on ax200 wifi: iwlwifi: mvm: support new initiator and responder command version wifi: iwlwifi: mvm: use wiphy locked debugfs for low-latency wifi: iwlwifi: mvm: MLO scan upon channel condition degradation wifi: iwlwifi: mvm: support new versions of the wowlan APIs wifi: iwlwifi: mvm: allow always calling iwl_mvm_get_bss_vif() ... ==================== Link: https://patch.msgid.link/20241113172918.A8A11C4CEC3@smtp.kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfAlexei Starovoitov
Cross-merge bpf fixes after downstream PR. In particular to bring the fix in commit aa30eb3260b2 ("bpf: Force checkpoint when jmp history is too long"). The follow up verifier work depends on it. And the fix in commit 6801cf7890f2 ("selftests/bpf: Use -4095 as the bad address for bits iterator"). It's fixing instability of BPF CI on s390 arch. No conflicts. Adjacent changes in: Auto-merging arch/Kconfig Auto-merging kernel/bpf/helpers.c Auto-merging kernel/bpf/memalloc.c Auto-merging kernel/bpf/verifier.c Auto-merging mm/slab_common.c Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-11-13Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfLinus Torvalds
Pull bpf fixes from Daniel Borkmann: - Fix a mismatching RCU unlock flavor in bpf_out_neigh_v6 (Jiawei Ye) - Fix BPF sockmap with kTLS to reject vsock and unix sockets upon kTLS context retrieval (Zijian Zhang) - Fix BPF bits iterator selftest for s390x (Hou Tao) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf: Fix mismatched RCU unlock flavour in bpf_out_neigh_v6 bpf: Add sk_is_inet and IS_ICSK check in tls_sw_has_ctx_tx/rx selftests/bpf: Use -4095 as the bad address for bits iterator
2024-11-12net: page_pool: do not count normal frag allocation in statsJakub Kicinski
Commit 0f6deac3a079 ("net: page_pool: add page allocation stats for two fast page allocate path") added increments for "fast path" allocation to page frag alloc. It mentions performance degradation analysis but the details are unclear. Could be that the author was simply surprised by the alloc stats not matching packet count. In my experience the key metric for page pool is the recycling rate. Page return stats, however, count returned _pages_ not frags. This makes it impossible to calculate recycling rate for drivers using the frag API. Here is example output of the page-pool YNL sample for a driver allocating 1200B frags (4k pages) with nearly perfect recycling: $ ./page-pool eth0[2] page pools: 32 (zombies: 0) refs: 291648 bytes: 1194590208 (refs: 0 bytes: 0) recycling: 33.3% (alloc: 4557:2256365862 recycle: 200476245:551541893) The recycling rate is reported as 33.3% because we give out 4096 // 1200 = 3 frags for every recycled page. Effectively revert the aforementioned commit. This also aligns with the stats we would see for drivers which do the fragmentation themselves, although that's not a strong reason in itself. On the (very unlikely) path where we can reuse the current page let's bump the "cached" stat. The fact that we don't put the page in the cache is just an optimization. Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Link: https://patch.msgid.link/20241109023303.3366500-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>