summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2023-12-26net/smc: support SMCv2.x supplemental features negotiationWen Gu
This patch adds SMCv2.x supplemental features negotiation. Supported SMCv2.x supplemental features are represented by feature_mask in FCE field. The negotiation process is as follows. Server Client Proposal(features(c-mask bits)) <----------------------------------------- Accept(features(s-mask bits)) -----------------------------------------> Confirm(features(s&c-mask bits)) <----------------------------------------- Signed-off-by: Wen Gu <guwen@linux.alibaba.com> Reviewed-and-tested-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-26net/smc: unify the structs of accept or confirm message for v1 and v2Wen Gu
The structs of CLC accept and confirm messages for SMCv1 and SMCv2 are separately defined and often casted to each other in the code, which may increase the risk of errors caused by future divergence of them. So unify them into one struct for better maintainability. Suggested-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: Wen Gu <guwen@linux.alibaba.com> Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-26net/smc: introduce sub-functions for smc_clc_send_confirm_accept()Wen Gu
There is a large if-else block in smc_clc_send_confirm_accept() and it is better to split it into two sub-functions. Suggested-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: Wen Gu <guwen@linux.alibaba.com> Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-26net/smc: rename some 'fce' to 'fce_v2x' for clarityWen Gu
Rename some functions or variables with 'fce' in their name but used in SMCv2.1 as 'fce_v2x' for clarity. Signed-off-by: Wen Gu <guwen@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-25nfc: Do not send datagram if socket state isn't LLCP_BOUNDSiddh Raman Pant
As we know we cannot send the datagram (state can be set to LLCP_CLOSED by nfc_llcp_socket_release()), there is no need to proceed further. Thus, bail out early from llcp_sock_sendmsg(). Signed-off-by: Siddh Raman Pant <code@siddh.me> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Suman Ghosh <sumang@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-25nfc: llcp_core: Hold a ref to llcp_local->dev when holding a ref to llcp_localSiddh Raman Pant
llcp_sock_sendmsg() calls nfc_llcp_send_ui_frame() which in turn calls nfc_alloc_send_skb(), which accesses the nfc_dev from the llcp_sock for getting the headroom and tailroom needed for skb allocation. Parallelly the nfc_dev can be freed, as the refcount is decreased via nfc_free_device(), leading to a UAF reported by Syzkaller, which can be summarized as follows: (1) llcp_sock_sendmsg() -> nfc_llcp_send_ui_frame() -> nfc_alloc_send_skb() -> Dereference *nfc_dev (2) virtual_ncidev_close() -> nci_free_device() -> nfc_free_device() -> put_device() -> nfc_release() -> Free *nfc_dev When a reference to llcp_local is acquired, we do not acquire the same for the nfc_dev. This leads to freeing even when the llcp_local is in use, and this is the case with the UAF described above too. Thus, when we acquire a reference to llcp_local, we should acquire a reference to nfc_dev, and release the references appropriately later. References for llcp_local is initialized in nfc_llcp_register_device() (which is called by nfc_register_device()). Thus, we should acquire a reference to nfc_dev there. nfc_unregister_device() calls nfc_llcp_unregister_device() which in turn calls nfc_llcp_local_put(). Thus, the reference to nfc_dev is appropriately released later. Reported-and-tested-by: syzbot+bbe84a4010eeea00982d@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=bbe84a4010eeea00982d Fixes: c7aa12252f51 ("NFC: Take a reference on the LLCP local pointer when creating a socket") Reviewed-by: Suman Ghosh <sumang@marvell.com> Signed-off-by: Siddh Raman Pant <code@siddh.me> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-24rxrpc: Create a procfile to display outstanding client conn bundlesDavid Howells
Create /proc/net/rxrpc/bundles to display outstanding rxrpc client connection bundles. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2023-12-24rxrpc, afs: Allow afs to pin rxrpc_peer objectsDavid Howells
Change rxrpc's API such that: (1) A new function, rxrpc_kernel_lookup_peer(), is provided to look up an rxrpc_peer record for a remote address and a corresponding function, rxrpc_kernel_put_peer(), is provided to dispose of it again. (2) When setting up a call, the rxrpc_peer object used during a call is now passed in rather than being set up by rxrpc_connect_call(). For afs, this meenat passing it to rxrpc_kernel_begin_call() rather than the full address (the service ID then has to be passed in as a separate parameter). (3) A new function, rxrpc_kernel_remote_addr(), is added so that afs can get a pointer to the transport address for display purposed, and another, rxrpc_kernel_remote_srx(), to gain a pointer to the full rxrpc address. (4) The function to retrieve the RTT from a call, rxrpc_kernel_get_srtt(), is then altered to take a peer. This now returns the RTT or -1 if there are insufficient samples. (5) Rename rxrpc_kernel_get_peer() to rxrpc_kernel_call_get_peer(). (6) Provide a new function, rxrpc_kernel_get_peer(), to get a ref on a peer the caller already has. This allows the afs filesystem to pin the rxrpc_peer records that it is using, allowing faster lookups and pointer comparisons rather than comparing sockaddr_rxrpc contents. It also makes it easier to get hold of the RTT. The following changes are made to afs: (1) The addr_list struct's addrs[] elements now hold a peer struct pointer and a service ID rather than a sockaddr_rxrpc. (2) When displaying the transport address, rxrpc_kernel_remote_addr() is used. (3) The port arg is removed from afs_alloc_addrlist() since it's always overridden. (4) afs_merge_fs_addr4() and afs_merge_fs_addr6() do peer lookup and may now return an error that must be handled. (5) afs_find_server() now takes a peer pointer to specify the address. (6) afs_find_server(), afs_compare_fs_alists() and afs_merge_fs_addr[46]{} now do peer pointer comparison rather than address comparison. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2023-12-24rxrpc_find_service_conn_rcu: fix the usage of read_seqbegin_or_lock()Oleg Nesterov
rxrpc_find_service_conn_rcu() should make the "seq" counter odd on the second pass, otherwise read_seqbegin_or_lock() never takes the lock. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/20231117164846.GA10410@redhat.com/
2023-12-22tipc: Remove some excess struct member documentationJonathan Corbet
Remove documentation for nonexistent struct members, addressing these warnings: ./net/tipc/link.c:228: warning: Excess struct member 'media_addr' description in 'tipc_link' ./net/tipc/link.c:228: warning: Excess struct member 'timer' description in 'tipc_link' ./net/tipc/link.c:228: warning: Excess struct member 'refcnt' description in 'tipc_link' ./net/tipc/link.c:228: warning: Excess struct member 'proto_msg' description in 'tipc_link' ./net/tipc/link.c:228: warning: Excess struct member 'pmsg' description in 'tipc_link' ./net/tipc/link.c:228: warning: Excess struct member 'backlog_limit' description in 'tipc_link' ./net/tipc/link.c:228: warning: Excess struct member 'exp_msg_count' description in 'tipc_link' ./net/tipc/link.c:228: warning: Excess struct member 'reset_rcv_checkpt' description in 'tipc_link' ./net/tipc/link.c:228: warning: Excess struct member 'transmitq' description in 'tipc_link' ./net/tipc/link.c:228: warning: Excess struct member 'snt_nxt' description in 'tipc_link' ./net/tipc/link.c:228: warning: Excess struct member 'deferred_queue' description in 'tipc_link' ./net/tipc/link.c:228: warning: Excess struct member 'unacked_window' description in 'tipc_link' ./net/tipc/link.c:228: warning: Excess struct member 'next_out' description in 'tipc_link' ./net/tipc/link.c:228: warning: Excess struct member 'long_msg_seq_no' description in 'tipc_link' ./net/tipc/link.c:228: warning: Excess struct member 'bc_rcvr' description in 'tipc_link' Signed-off-by: Jonathan Corbet <corbet@lwn.net> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-22tcp: Remove dead code and fields for bhash2.Kuniyuki Iwashima
Now all sockets including TIME_WAIT are linked to bhash2 using sock_common.skc_bind_node. We no longer use inet_bind2_bucket.deathrow, sock.sk_bind2_node, and inet_timewait_sock.tw_bind2_node. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-22tcp: Link sk and twsk to tb2->owners using skc_bind_node.Kuniyuki Iwashima
Now we can use sk_bind_node/tw_bind_node for bhash2, which means we need not link TIME_WAIT sockets separately. The dead code and sk_bind2_node will be removed in the next patch. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-22tcp: Unlink sk from bhash.Kuniyuki Iwashima
Now we do not use tb->owners and can unlink sockets from bhash. sk_bind_node/tw_bind_node are available for bhash2 and will be used in the following patch. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-22tcp: Check hlist_empty(&tb->bhash2) instead of hlist_empty(&tb->owners).Kuniyuki Iwashima
We use hlist_empty(&tb->owners) to check if the bhash bucket has a socket. We can check the child bhash2 buckets instead. For this to work, the bhash2 bucket must be freed before the bhash bucket. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-22tcp: Iterate tb->bhash2 in inet_csk_bind_conflict().Kuniyuki Iwashima
Sockets in bhash are also linked to bhash2, but TIME_WAIT sockets are linked separately in tb2->deathrow. Let's replace tb->owners iteration in inet_csk_bind_conflict() with two iterations over tb2->owners and tb2->deathrow. This can be done safely under bhash's lock because socket insertion/ deletion in bhash2 happens with bhash's lock held. Note that twsk_for_each_bound_bhash() will be removed later. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-22tcp: Rearrange tests in inet_csk_bind_conflict().Kuniyuki Iwashima
The following patch adds code in the !inet_use_bhash2_on_bind(sk) case in inet_csk_bind_conflict(). To avoid adding nest and make the change cleaner, this patch rearranges tests in inet_csk_bind_conflict(). Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-22tcp: Link bhash2 to bhash.Kuniyuki Iwashima
bhash2 added a new member sk_bind2_node in struct sock to link sockets to bhash2 in addition to bhash. bhash is still needed to search conflicting sockets efficiently from a port for the wildcard address. However, bhash itself need not have sockets. If we link each bhash2 bucket to the corresponding bhash bucket, we can iterate the same set of the sockets from bhash2 via bhash. This patch links bhash2 to bhash only, and the actual use will be in the later patches. Finally, we will remove sk_bind2_node. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-22tcp: Rename tb in inet_bind2_bucket_(init|create)().Kuniyuki Iwashima
Later, we no longer link sockets to bhash. Instead, each bhash2 bucket is linked to the corresponding bhash bucket. Then, we pass the bhash bucket to bhash2 allocation functions as tb. However, tb is already used in inet_bind2_bucket_create() and inet_bind2_bucket_init() as the bhash2 bucket. To make the following diff clear, let's use tb2 for the bhash2 bucket there. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-22tcp: Save address type in inet_bind2_bucket.Kuniyuki Iwashima
inet_bind2_bucket_addr_match() and inet_bind2_bucket_match_addr_any() are called for each bhash2 bucket to check conflicts. Thus, we call ipv6_addr_any() and ipv6_addr_v4mapped() over and over during bind(). Let's avoid calling them by saving the address type in inet_bind2_bucket. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-22tcp: Save v4 address as v4-mapped-v6 in inet_bind2_bucket.v6_rcv_saddr.Kuniyuki Iwashima
In bhash2, IPv4/IPv6 addresses are saved in two union members, which complicate address checks in inet_bind2_bucket_addr_match() and inet_bind2_bucket_match_addr_any() considering uninitialised memory and v4-mapped-v6 conflicts. Let's simplify that by saving IPv4 address as v4-mapped-v6 address and defining tb2.rcv_saddr as tb2.v6_rcv_saddr.s6_addr32[3]. Then, we can compare v6 address as is, and after checking v4-mapped-v6, we can compare v4 address easily. Also, we can remove tb2->family. Note these functions will be further refactored in the next patch. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-22tcp: Rearrange tests in inet_bind2_bucket_(addr_match|match_addr_any)().Kuniyuki Iwashima
The protocol family tests in inet_bind2_bucket_addr_match() and inet_bind2_bucket_match_addr_any() are ordered as follows. if (sk->sk_family != tb2->family) else if (sk->sk_family == AF_INET6) else This patch rearranges them so that AF_INET6 socket is handled first to make the following patch tidy, where tb2->family will be removed. if (sk->sk_family == AF_INET6) else if (tb2->family == AF_INET6) else Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-22tcp: Use bhash2 for v4-mapped-v6 non-wildcard address.Kuniyuki Iwashima
While checking port availability in bind() or listen(), we used only bhash for all v4-mapped-v6 addresses. But there is no good reason not to use bhash2 for v4-mapped-v6 non-wildcard addresses. Let's do it by returning true in inet_use_bhash2_on_bind(). Then, we also need to add a test in inet_bind2_bucket_match_addr_any() so that ::ffff:X.X.X.X will match with 0.0.0.0. Note that sk->sk_rcv_saddr is initialised for v4-mapped-v6 sk in __inet6_bind(). Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-22Bluetooth: Fix atomicity violation in {min,max}_key_size_setGui-Dong Han
In min_key_size_set(): if (val > hdev->le_max_key_size || val < SMP_MIN_ENC_KEY_SIZE) return -EINVAL; hci_dev_lock(hdev); hdev->le_min_key_size = val; hci_dev_unlock(hdev); In max_key_size_set(): if (val > SMP_MAX_ENC_KEY_SIZE || val < hdev->le_min_key_size) return -EINVAL; hci_dev_lock(hdev); hdev->le_max_key_size = val; hci_dev_unlock(hdev); The atomicity violation occurs due to concurrent execution of set_min and set_max funcs.Consider a scenario where setmin writes a new, valid 'min' value, and concurrently, setmax writes a value that is greater than the old 'min' but smaller than the new 'min'. In this case, setmax might check against the old 'min' value (before acquiring the lock) but write its value after the 'min' has been updated by setmin. This leads to a situation where the 'max' value ends up being smaller than the 'min' value, which is an inconsistency. This possible bug is found by an experimental static analysis tool developed by our team, BassCheck[1]. This tool analyzes the locking APIs to extract function pairs that can be concurrently executed, and then analyzes the instructions in the paired functions to identify possible concurrency bugs including data races and atomicity violations. The above possible bug is reported when our tool analyzes the source code of Linux 5.17. To resolve this issue, it is suggested to encompass the validity checks within the locked sections in both set_min and set_max funcs. The modification ensures that the validation of 'val' against the current min/max values is atomic, thus maintaining the integrity of the settings. With this patch applied, our tool no longer reports the bug, with the kernel configuration allyesconfig for x86_64. Due to the lack of associated hardware, we cannot test the patch in runtime testing, and just verify it according to the code logic. [1] https://sites.google.com/view/basscheck/ Fixes: 18f81241b74f ("Bluetooth: Move {min,max}_key_size debugfs ...") Cc: stable@vger.kernel.org Signed-off-by: Gui-Dong Han <2045gemini@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-12-22Bluetooth: L2CAP: Fix possible multiple reject sendFrédéric Danis
In case of an incomplete command or a command with a null identifier 2 reject packets will be sent, one with the identifier and one with 0. Consuming the data of the command will prevent it. This allows to send a reject packet for each corrupted command in a multi-command packet. Signed-off-by: Frédéric Danis <frederic.danis@collabora.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-12-22Bluetooth: hci_sync: fix BR/EDR wakeup bugclancy shang
when Bluetooth set the event mask and enter suspend, the controller has hci mode change event coming, it cause controller can not enter sleep mode. so it should to set the hci mode change event mask before enter suspend. Signed-off-by: clancy shang <clancy.shang@quectel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-12-22Bluetooth: hci_conn: Check non NULL function before calling for HFP offloadZijun Hu
For some controllers such as QCA2066, it does not need to send HCI_Configure_Data_Path to configure non-HCI data transport path to support HFP offload, their device drivers may set hdev->get_codec_config_data as NULL, so Explicitly add this non NULL checking before calling the function. Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-12-22Bluetooth: ISO: Avoid creating child socket if PA sync is terminatingIulia Tanasescu
When a PA sync socket is closed, the associated hcon is also unlinked and cleaned up. If there are no other hcons marked with the HCI_CONN_PA_SYNC flag, HCI_OP_LE_PA_TERM_SYNC is sent to controller. Between the time of the command and the moment PA sync is terminated in controller, residual BIGInfo reports might continue to come. This causes a new PA sync hcon to be added, and a new socket to be notified to user space. This commit fixs this by adding a flag on a Broadcast listening socket to mark when the PA sync child has been closed. This flag is checked when BIGInfo reports are indicated in iso_connect_ind, to avoid recreating a hcon and socket if residual reports arrive before PA sync is terminated. Signed-off-by: Iulia Tanasescu <iulia.tanasescu@nxp.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-12-22Bluetooth: Fix bogus check for re-auth no supported with non-sspLuiz Augusto von Dentz
This reverts 19f8def031bfa50c579149b200bfeeb919727b27 "Bluetooth: Fix auth_complete_evt for legacy units" which seems to be working around a bug on a broken controller rather then any limitation imposed by the Bluetooth spec, in fact if there ws not possible to re-auth the command shall fail not succeed. Fixes: 19f8def031bf ("Bluetooth: Fix auth_complete_evt for legacy units") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-12-22Bluetooth: hci_core: Remove le_restart_scan workLuiz Augusto von Dentz
This removes le_restart_scan work and instead just disables controller duplicate filtering when discovery result_filtering is enabled and HCI_QUIRK_STRICT_DUPLICATE_FILTER is set. Link: https://github.com/bluez/bluez/issues/573 Link: https://github.com/bluez/bluez/issues/572 Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-12-22Bluetooth: Add documentation to exported functions in libYuran Pereira
Most functions in `net/bluetooth/lib.c` lack propper documentation. This patch adds documentation to all exported functions in `net/bluetooth/lib.c`. Unnecessary or redundant comments are also removed to ensure the file looks clean. Signed-off-by: Yuran Pereira <yuran.pereira@hotmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-12-22Bluetooth: ISO: Reassociate a socket with an active BISIulia Tanasescu
For ISO Broadcast, all BISes from a BIG have the same lifespan - they cannot be created or terminated independently from each other. This links together all BIS hcons that are part of the same BIG, so all hcons are kept alive as long as the BIG is active. If multiple BIS sockets are opened for a BIG handle, and only part of them are closed at some point, the associated hcons will be marked as open. If new sockets will later be opened for the same BIG, they will be reassociated with the open BIS hcons. All BIS hcons will be cleaned up and the BIG will be terminated when the last BIS socket is closed from userspace. Signed-off-by: Iulia Tanasescu <iulia.tanasescu@nxp.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-12-22Bluetooth: ISO: Allow binding a PA sync socketIulia Tanasescu
This makes it possible to bind a PA sync socket to a number of BISes before issuing the BIG Create Sync command. Signed-off-by: Iulia Tanasescu <iulia.tanasescu@nxp.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-12-22Merge tag '9p-for-6.7-rc7' of https://github.com/martinetd/linuxLinus Torvalds
Pull 9p fixes from Dominique Martinet: "Two small fixes scheduled for stable trees: A tracepoint fix that's been reading past the end of messages forever, but semi-recently also went over the end of the buffer. And a potential incorrectly freeing garbage in pdu parsing error path" * tag '9p-for-6.7-rc7' of https://github.com/martinetd/linux: net: 9p: avoid freeing uninit memory in p9pdu_vreadf 9p: prevent read overrun in protocol dump tracepoint
2023-12-22netfilter: nf_tables: validate chain type update if availablePablo Neira Ayuso
Parse netlink attribute containing the chain type in this update, to bail out if this is different from the existing type. Otherwise, it is possible to define a chain with the same name, hook and priority but different type, which is silently ignored. Fixes: 96518518cc41 ("netfilter: add nftables") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-12-22netfilter: ctnetlink: support filtering by zoneFelix Huettner
conntrack zones are heavily used by tools like openvswitch to run multiple virtual "routers" on a single machine. In this context each conntrack zone matches to a single router, thereby preventing overlapping IPs from becoming issues. In these systems it is common to operate on all conntrack entries of a given zone, e.g. to delete them when a router is deleted. Previously this required these tools to dump the full conntrack table and filter out the relevant entries in userspace potentially causing performance issues. To do this we reuse the existing CTA_ZONE attribute. This was previous parsed but not used during dump and flush requests. Now if CTA_ZONE is set we filter these operations based on the provided zone. However this means that users that previously passed CTA_ZONE will experience a difference in functionality. Alternatively CTA_FILTER could have been used for the same functionality. However it is not yet supported during flush requests and is only available when using AF_INET or AF_INET6. Co-developed-by: Luca Czesla <luca.czesla@mail.schwarz> Signed-off-by: Luca Czesla <luca.czesla@mail.schwarz> Co-developed-by: Max Lamprecht <max.lamprecht@mail.schwarz> Signed-off-by: Max Lamprecht <max.lamprecht@mail.schwarz> Signed-off-by: Felix Huettner <felix.huettner@mail.schwarz> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-12-22netfilter: nf_tables: mark newset as dead on transaction abortFlorian Westphal
If a transaction is aborted, we should mark the to-be-released NEWSET dead, just like commit path does for DEL and DESTROYSET commands. In both cases all remaining elements will be released via set->ops->destroy(). The existing abort code does NOT post the actual release to the work queue. Also the entire __nf_tables_abort() function is wrapped in gc_seq begin/end pair. Therefore, async gc worker will never try to release the pending set elements, as gc sequence is always stale. It might be possible to speed up transaction aborts via work queue too, this would result in a race and a possible use-after-free. So fix this before it becomes an issue. Fixes: 5f68718b34a5 ("netfilter: nf_tables: GC transaction API to avoid race with control plane") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-12-22netfilter: nft_set_pipapo: prefer gfp_kernel allocationFlorian Westphal
No need to use GFP_ATOMIC here. Fixes: f6c383b8c31a ("netfilter: nf_tables: adapt set backend to use GC transaction API") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-12-22netfilter: nf_tables: Add locking for NFT_MSG_GETSETELEM_RESET requestsPhil Sutter
Set expressions' dump callbacks are not concurrency-safe per-se with reset bit set. If two CPUs reset the same element at the same time, values may underrun at least with element-attached counters and quotas. Prevent this by introducing dedicated callbacks for nfnetlink and the asynchronous dump handling to serialize access. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-12-22netfilter: nf_tables: Introduce nft_set_dump_ctx_init()Phil Sutter
This is a wrapper around nft_ctx_init() for use in nf_tables_getsetelem() and a resetting equivalent introduced later. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-12-22netfilter: nf_tables: Pass const set to nft_get_set_elemPhil Sutter
The function is not supposed to alter the set, passing the pointer as const is fine and merely requires to adjust signatures of two called functions as well. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-12-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netPaolo Abeni
Cross-merge networking fixes after downstream PR. Adjacent changes: drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c 23c93c3b6275 ("bnxt_en: do not map packet buffers twice") 6d1add95536b ("bnxt_en: Modify TX ring indexing logic.") tools/testing/selftests/net/Makefile 2258b666482d ("selftests: add vlan hw filter tests") a0bc96c0cd6e ("selftests: net: verify fq per-band packet limit") Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-21wifi: mac80211: add a driver callback to check active_linksMiri Korenblit
During ieee80211_set_active_links() we do (among the others): 1. Call drv_change_vif_links() with both old_active and new_active 2. Unassign the chanctx for the removed link(s) (if any) 3. Assign chanctx to the added link(s) (if any) 4. Call drv_change_vif_links() with the new_active links bitmap The problem here is that during step #1 the driver doesn't know whether we will activate multiple links simultaneously or are just doing a link switch, so it can't check there if multiple links are supported/enabled. (Some of the drivers might enable/disable this option dynamically) And during step #3, in which the driver already knows that, returning an error code (for example when multiple links are not supported or disabled), will cause a warning, and we will still complete the transition to the new_active links. (It is hard to undo things in that stage, since we released channels etc.) Therefore add a driver callback to check if the desired new_active links will be supported by the driver or not. This callback will be called in the beginning of ieee80211_set_active_links() so we won't do anything before we are sure it is supported. Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Reviewed-by: Gregory Greenman <gregory.greenman@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Link: https://msgid.link/20231220133549.64c4d70b33b8.I79708619be76b8ecd4ef3975205b8f903e24a2cd@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-12-21wifi: mac80211: fix advertised TTLM schedulingAyala Beker
Handle a case of time overflow, where the switch time might be smaller than the partial TSF in the beacon. Additionally, apply advertised TTLM earlier in order to be ready on time on the newly activated links. Fixes: 702e80470a33 ("wifi: mac80211: support handling of advertised TID-to-link mapping") Signed-off-by: Ayala Beker <ayala.beker@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20231220133549.15079c34e5c8.I0dd50bcceff5953080cdd7aee5118b72c78c6507@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-12-21wifi: cfg80211: avoid double free if updating BSS failsBenjamin Berg
cfg80211_update_known_bss will always consume the passed IEs. As such, cfg80211_update_assoc_bss_entry also needs to always set the pointers to NULL so that no double free can occur. Note that hitting this would probably require being connected to a hidden BSS which is then doing a channel switch while also switching to be not hidden anymore at the same time. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20231220133549.8891edb28d51.Id09c5145363e990ff5237decd58296302e2d53c8@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-12-21wifi: cfg80211: ensure cfg80211_bss_update frees IEs on errorBenjamin Berg
cfg80211_bss_update is expected to consume the IEs that are passed into it in the temporary internal BSS. This did not happen in some error cases (which are also WARN_ON paths), so change the code to use a common label and use that everywhere. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20231220133549.8e72ea105e17.Ic81e9431e980419360e97502ce8c75c58793f05a@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-12-21wifi: cfg80211: free beacon_ies when overridden from hidden BSSBenjamin Berg
This is a more of a cosmetic fix. The branch will only be taken if proberesp_ies is set, which implies that beacon_ies is not set unless we are connected to an AP that just did a channel switch. And, in that case we should have found the BSS in the internal storage to begin with. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20231220133549.b898e22dadff.Id8c4c10aedd176ef2e18a4cad747b299f150f9df@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-12-21wifi: mac80211: allow 64-bit radiotap timestampsJohannes Berg
When reporting the radiotap timestamp, the mactime field is usually unused, we take the data from the device_timestamp. However, there can be cases where the radiotap timestamp is better reported as a 64-bit value, so since the mactime is free, add a flag to support using the mactime as a 64-bit radiotap timestamp. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Reviewed-by: Gregory Greenman <gregory.greenman@intel.com> Reviewed-by: Benjamin Berg <benjamin.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20231220133549.00c8b9234f0c.Ie3ce5eae33cce88fa01178e7aea94661ded1ac24@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-12-21wifi: mac80211: rework RX timestamp flagsJohannes Berg
We only have a single flag free, and before using that for another mactime flag, instead refactor the mactime flags to use a 2-bit field. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Reviewed-by: Gregory Greenman <gregory.greenman@intel.com> Reviewed-by: Benjamin Berg <benjamin.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20231220133549.d0e664832d14.I20c8900106f9bf81316bed778b1e3ce145785274@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-12-21wifi: cfg80211: handle UHB AP and STA power typeMukesh Sisodiya
UHB AP send supported power type(LPI, SP, VLP) in beacon and probe response IE and STA should connect to these AP only if their regulatory support the AP power type. Beacon/Probe response are reported to userspace with reason "STA regulatory not supporting to connect to AP based on transmitted power type" and it should not connect to AP. Signed-off-by: Mukesh Sisodiya <mukesh.sisodiya@intel.com> Reviewed-by: Gregory Greenman <gregory.greenman@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20231220133549.cbfbef9170a9.I432f78438de18aa9f5c9006be12e41dc34cc47c5@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-12-21wifi: mac80211: Schedule regulatory channels check on bandwith changeAndrei Otcheretianski
Some devices may support concurrent DFS operation which relies on the BSS channel width for its relaxations. Notify cfg80211 about BW change so it can schedule regulatory checks. Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Reviewed-by: Gregory Greenman <gregory.greenman@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20231220133549.e08f8e9ebc67.If8915d13e203ebd380579f55fd9148e9b3f43306@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>