summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2025-07-08wifi: mac80211: fix rx link assignment for non-MLO stationsHari Chandrakanthan
Currently, ieee80211_rx_data_set_sta() does not correctly handle the case where the interface supports multiple links (MLO), but the station does not (non-MLO). This can lead to incorrect link assignment or unexpected warnings when accessing link information. Hence, add a fix to check if the station lacks valid link support and use its default link ID for rx->link assignment. If the station unexpectedly has valid links, fall back to the default link. This ensures correct link association and prevents potential issues in mixed MLO/non-MLO environments. Signed-off-by: Hari Chandrakanthan <quic_haric@quicinc.com> Signed-off-by: Sarika Sharma <quic_sarishar@quicinc.com> Link: https://patch.msgid.link/20250630084119.3583593-1-quic_sarishar@quicinc.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-08wifi: cfg80211: move away from using a fake platform deviceGreg Kroah-Hartman
Downloading regulatory "firmware" needs a device to hang off of, and so a platform device seemed like the simplest way to do this. Now that we have a faux device interface, use that instead as this "regulatory device" is not anything resembling a platform device at all. Cc: Johannes Berg <johannes@sipsolutions.net> Cc: <linux-wireless@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://patch.msgid.link/2025070116-growing-skeptic-494c@gregkh Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-07Merge tag 'for-net-2025-07-03' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - hci_sync: Fix not disabling advertising instance - hci_core: Remove check of BDADDR_ANY in hci_conn_hash_lookup_big_state - hci_sync: Fix attempting to send HCI_Disconnect to BIS handle - hci_event: Fix not marking Broadcast Sink BIS as connected * tag 'for-net-2025-07-03' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: hci_event: Fix not marking Broadcast Sink BIS as connected Bluetooth: hci_sync: Fix attempting to send HCI_Disconnect to BIS handle Bluetooth: hci_core: Remove check of BDADDR_ANY in hci_conn_hash_lookup_big_state Bluetooth: hci_sync: Fix not disabling advertising instance ==================== Link: https://patch.msgid.link/20250703160409.1791514-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-07netpoll: move Ethernet setup to push_eth() helperBreno Leitao
Refactor Ethernet header population into dedicated function, completing the layered abstraction with: - push_eth() for link layer - push_udp() for transport - push_ipv4()/push_ipv6() for network Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250702-netpoll_untagle_ip-v2-6-13cf3db24e2b@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-07netpoll: factor out UDP header setup into push_udp() helperBreno Leitao
Move UDP header construction from netpoll_send_udp() into a new static helper function push_udp(). This completes the protocol layer refactoring by: 1. Creating a dedicated helper for UDP header assembly 2. Removing UDP-specific logic from the main send function 3. Establishing a consistent pattern with existing IPv4/IPv6 helpers: - push_udp() - push_ipv4() - push_ipv6() The change improves code organization and maintains the encapsulation pattern established in previous refactorings. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250702-netpoll_untagle_ip-v2-5-13cf3db24e2b@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-07netpoll: factor out IPv4 header setup into push_ipv4() helperBreno Leitao
Move IPv4 header construction from netpoll_send_udp() into a new static helper function push_ipv4(). This completes the refactoring started with IPv6 header handling, creating symmetric helper functions for both IP versions. Changes include: 1. Extracting IPv4 header setup logic into push_ipv4() 2. Replacing inline IPv4 code with helper call 3. Moving eth assignment after helper calls for consistency The refactoring reduces code duplication and improves maintainability by isolating IP version-specific logic. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250702-netpoll_untagle_ip-v2-4-13cf3db24e2b@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-07netpoll: factor out IPv6 header setup into push_ipv6() helperBreno Leitao
Move IPv6 header construction from netpoll_send_udp() into a new static helper function, push_ipv6(). This refactoring reduces code duplication and improves readability in netpoll_send_udp(). Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250702-netpoll_untagle_ip-v2-3-13cf3db24e2b@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-07netpoll: factor out UDP checksum calculation into helperBreno Leitao
Extract UDP checksum calculation logic from netpoll_send_udp() into a new static helper function netpoll_udp_checksum(). This reduces code duplication and improves readability for both IPv4 and IPv6 cases. No functional change intended. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250702-netpoll_untagle_ip-v2-2-13cf3db24e2b@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-07netpoll: Improve code clarity with explicit struct size calculationsBreno Leitao
Replace pointer-dereference sizeof() operations with explicit struct names for improved readability and maintainability. This change: 1. Replaces `sizeof(*udph)` with `sizeof(struct udphdr)` 2. Replaces `sizeof(*ip6h)` with `sizeof(struct ipv6hdr)` 3. Replaces `sizeof(*iph)` with `sizeof(struct iphdr)` This will make it easy to move code in the upcoming patches. No functional changes are introduced by this patch. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250702-netpoll_untagle_ip-v2-1-13cf3db24e2b@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-07net: remove RTNL use for /proc/sys/net/core/rps_default_maskEric Dumazet
Use a dedicated mutex instead. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250702061558.1585870-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-07page_pool: rename __page_pool_alloc_pages_slow() to ↵Byungchul Park
__page_pool_alloc_netmems_slow() Now that __page_pool_alloc_pages_slow() is for allocating netmem, not struct page, rename it to __page_pool_alloc_netmems_slow() to reflect what it does. Signed-off-by: Byungchul Park <byungchul@sk.com> Reviewed-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250702053256.4594-4-byungchul@sk.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-07page_pool: rename __page_pool_release_page_dma() to ↵Byungchul Park
__page_pool_release_netmem_dma() Now that __page_pool_release_page_dma() is for releasing netmem, not struct page, rename it to __page_pool_release_netmem_dma() to reflect what it does. Signed-off-by: Byungchul Park <byungchul@sk.com> Reviewed-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Link: https://patch.msgid.link/20250702053256.4594-3-byungchul@sk.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-07page_pool: rename page_pool_return_page() to page_pool_return_netmem()Byungchul Park
Now that page_pool_return_page() is for returning netmem, not struct page, rename it to page_pool_return_netmem() to reflect what it does. Signed-off-by: Byungchul Park <byungchul@sk.com> Reviewed-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Link: https://patch.msgid.link/20250702053256.4594-2-byungchul@sk.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-07tipc: Fix use-after-free in tipc_conn_close().Kuniyuki Iwashima
syzbot reported a null-ptr-deref in tipc_conn_close() during netns dismantle. [0] tipc_topsrv_stop() iterates tipc_net(net)->topsrv->conn_idr and calls tipc_conn_close() for each tipc_conn. The problem is that tipc_conn_close() is called after releasing the IDR lock. At the same time, there might be tipc_conn_recv_work() running and it could call tipc_conn_close() for the same tipc_conn and release its last ->kref. Once we release the IDR lock in tipc_topsrv_stop(), there is no guarantee that the tipc_conn is alive. Let's hold the ref before releasing the lock and put the ref after tipc_conn_close() in tipc_topsrv_stop(). [0]: BUG: KASAN: use-after-free in tipc_conn_close+0x122/0x140 net/tipc/topsrv.c:165 Read of size 8 at addr ffff888099305a08 by task kworker/u4:3/435 CPU: 0 PID: 435 Comm: kworker/u4:3 Not tainted 4.19.204-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: netns cleanup_net Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1fc/0x2ef lib/dump_stack.c:118 print_address_description.cold+0x54/0x219 mm/kasan/report.c:256 kasan_report_error.cold+0x8a/0x1b9 mm/kasan/report.c:354 kasan_report mm/kasan/report.c:412 [inline] __asan_report_load8_noabort+0x88/0x90 mm/kasan/report.c:433 tipc_conn_close+0x122/0x140 net/tipc/topsrv.c:165 tipc_topsrv_stop net/tipc/topsrv.c:701 [inline] tipc_topsrv_exit_net+0x27b/0x5c0 net/tipc/topsrv.c:722 ops_exit_list+0xa5/0x150 net/core/net_namespace.c:153 cleanup_net+0x3b4/0x8b0 net/core/net_namespace.c:553 process_one_work+0x864/0x1570 kernel/workqueue.c:2153 worker_thread+0x64c/0x1130 kernel/workqueue.c:2296 kthread+0x33f/0x460 kernel/kthread.c:259 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:415 Allocated by task 23: kmem_cache_alloc_trace+0x12f/0x380 mm/slab.c:3625 kmalloc include/linux/slab.h:515 [inline] kzalloc include/linux/slab.h:709 [inline] tipc_conn_alloc+0x43/0x4f0 net/tipc/topsrv.c:192 tipc_topsrv_accept+0x1b5/0x280 net/tipc/topsrv.c:470 process_one_work+0x864/0x1570 kernel/workqueue.c:2153 worker_thread+0x64c/0x1130 kernel/workqueue.c:2296 kthread+0x33f/0x460 kernel/kthread.c:259 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:415 Freed by task 23: __cache_free mm/slab.c:3503 [inline] kfree+0xcc/0x210 mm/slab.c:3822 tipc_conn_kref_release net/tipc/topsrv.c:150 [inline] kref_put include/linux/kref.h:70 [inline] conn_put+0x2cd/0x3a0 net/tipc/topsrv.c:155 process_one_work+0x864/0x1570 kernel/workqueue.c:2153 worker_thread+0x64c/0x1130 kernel/workqueue.c:2296 kthread+0x33f/0x460 kernel/kthread.c:259 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:415 The buggy address belongs to the object at ffff888099305a00 which belongs to the cache kmalloc-512 of size 512 The buggy address is located 8 bytes inside of 512-byte region [ffff888099305a00, ffff888099305c00) The buggy address belongs to the page: page:ffffea000264c140 count:1 mapcount:0 mapping:ffff88813bff0940 index:0x0 flags: 0xfff00000000100(slab) raw: 00fff00000000100 ffffea00028b6b88 ffffea0002cd2b08 ffff88813bff0940 raw: 0000000000000000 ffff888099305000 0000000100000006 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888099305900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff888099305980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff888099305a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff888099305a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff888099305b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb Fixes: c5fa7b3cf3cb ("tipc: introduce new TIPC server infrastructure") Reported-by: syzbot+d333febcf8f4bc5f6110@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=27169a847a70550d17be Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Tung Nguyen <tung.quang.nguyen@est.tech> Link: https://patch.msgid.link/20250702014350.692213-1-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-07netlink: Fix wraparounds of sk->sk_rmem_alloc.Kuniyuki Iwashima
Netlink has this pattern in some places if (atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf) atomic_add(skb->truesize, &sk->sk_rmem_alloc); , which has the same problem fixed by commit 5a465a0da13e ("udp: Fix multiple wraparounds of sk->sk_rmem_alloc."). For example, if we set INT_MAX to SO_RCVBUFFORCE, the condition is always false as the two operands are of int. Then, a single socket can eat as many skb as possible until OOM happens, and we can see multiple wraparounds of sk->sk_rmem_alloc. Let's fix it by using atomic_add_return() and comparing the two variables as unsigned int. Before: [root@fedora ~]# ss -f netlink Recv-Q Send-Q Local Address:Port Peer Address:Port -1668710080 0 rtnl:nl_wraparound/293 * After: [root@fedora ~]# ss -f netlink Recv-Q Send-Q Local Address:Port Peer Address:Port 2147483072 0 rtnl:nl_wraparound/290 * ^ `--- INT_MAX - 576 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: Jason Baron <jbaron@akamai.com> Closes: https://lore.kernel.org/netdev/cover.1750285100.git.jbaron@akamai.com/ Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250704054824.1580222-1-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-07net: openvswitch: allow providing upcall pid for the 'execute' commandIlya Maximets
When a packet enters OVS datapath and there is no flow to handle it, packet goes to userspace through a MISS upcall. With per-CPU upcall dispatch mechanism, we're using the current CPU id to select the Netlink PID on which to send this packet. This allows us to send packets from the same traffic flow through the same handler. The handler will process the packet, install required flow into the kernel and re-inject the original packet via OVS_PACKET_CMD_EXECUTE. While handling OVS_PACKET_CMD_EXECUTE, however, we may hit a recirculation action that will pass the (likely modified) packet through the flow lookup again. And if the flow is not found, the packet will be sent to userspace again through another MISS upcall. However, the handler thread in userspace is likely running on a different CPU core, and the OVS_PACKET_CMD_EXECUTE request is handled in the syscall context of that thread. So, when the time comes to send the packet through another upcall, the per-CPU dispatch will choose a different Netlink PID, and this packet will end up processed by a different handler thread on a different CPU. The process continues as long as there are new recirculations, each time the packet goes to a different handler thread before it is sent out of the OVS datapath to the destination port. In real setups the number of recirculations can go up to 4 or 5, sometimes more. There is always a chance to re-order packets while processing upcalls, because userspace will first install the flow and then re-inject the original packet. So, there is a race window when the flow is already installed and the second packet can match it and be forwarded to the destination before the first packet is re-injected. But the fact that packets are going through multiple upcalls handled by different userspace threads makes the reordering noticeably more likely, because we not only have a race between the kernel and a userspace handler (which is hard to avoid), but also between multiple userspace handlers. For example, let's assume that 10 packets got enqueued through a MISS upcall for handler-1, it will start processing them, will install the flow into the kernel and start re-injecting packets back, from where they will go through another MISS to handler-2. Handler-2 will install the flow into the kernel and start re-injecting the packets, while handler-1 continues to re-inject the last of the 10 packets, they will hit the flow installed by handler-2 and be forwarded without going to the handler-2, while handler-2 still re-injects the first of these 10 packets. Given multiple recirculations and misses, these 10 packets may end up completely mixed up on the output from the datapath. Let's allow userspace to specify on which Netlink PID the packets should be upcalled while processing OVS_PACKET_CMD_EXECUTE. This makes it possible to ensure that all the packets are processed by the same handler thread in the userspace even with them being upcalled multiple times in the process. Packets will remain in order since they will be enqueued to the same socket and re-injected in the same order. This doesn't eliminate re-ordering as stated above, since we still have a race between kernel and the userspace thread, but it allows to eliminate races between multiple userspace threads. Userspace knows the PID of the socket on which the original upcall is received, so there is no need to send it up from the kernel. Solution requires storing the value somewhere for the duration of the packet processing. There are two potential places for this: our skb extension or the per-CPU storage. It's not clear which is better, so just following currently used scheme of storing this kind of things along the skb. We still have a decent amount of space in the cb. Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Flavio Leitner <fbl@sysclose.org> Acked-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Aaron Conole <aconole@redhat.com> Link: https://patch.msgid.link/20250702155043.2331772-1-i.maximets@ovn.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-07wifi: prevent A-MSDU attacks in mesh networksMathy Vanhoef
This patch is a mitigation to prevent the A-MSDU spoofing vulnerability for mesh networks. The initial update to the IEEE 802.11 standard, in response to the FragAttacks, missed this case (CVE-2025-27558). It can be considered a variant of CVE-2020-24588 but for mesh networks. This patch tries to detect if a standard MSDU was turned into an A-MSDU by an adversary. This is done by parsing a received A-MSDU as a standard MSDU, calculating the length of the Mesh Control header, and seeing if the 6 bytes after this header equal the start of an rfc1042 header. If equal, this is a strong indication of an ongoing attack attempt. This defense was tested with mac80211_hwsim against a mesh network that uses an empty Mesh Address Extension field, i.e., when four addresses are used, and when using a 12-byte Mesh Address Extension field, i.e., when six addresses are used. Functionality of normal MSDUs and A-MSDUs was also tested, and confirmed working, when using both an empty and 12-byte Mesh Address Extension field. It was also tested with mac80211_hwsim that A-MSDU attacks in non-mesh networks keep being detected and prevented. Note that the vulnerability being patched, and the defense being implemented, was also discussed in the following paper and in the following IEEE 802.11 presentation: https://papers.mathyvanhoef.com/wisec2025.pdf https://mentor.ieee.org/802.11/dcn/25/11-25-0949-00-000m-a-msdu-mesh-spoof-protection.docx Cc: stable@vger.kernel.org Signed-off-by: Mathy Vanhoef <Mathy.Vanhoef@kuleuven.be> Link: https://patch.msgid.link/20250616004635.224344-1-Mathy.Vanhoef@kuleuven.be Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-07wifi: mac80211: reject VHT opmode for unsupported channel widthsMoon Hee Lee
VHT operating mode notifications are not defined for channel widths below 20 MHz. In particular, 5 MHz and 10 MHz are not valid under the VHT specification and must be rejected. Without this check, malformed notifications using these widths may reach ieee80211_chan_width_to_rx_bw(), leading to a WARN_ON due to invalid input. This issue was reported by syzbot. Reject these unsupported widths early in sta_link_apply_parameters() when opmode_notif is used. The accepted set includes 20, 40, 80, 160, and 80+80 MHz, which are valid for VHT. While 320 MHz is not defined for VHT, it is allowed to avoid rejecting HE or EHT clients that may still send a VHT opmode notification. Reported-by: syzbot+ededba317ddeca8b3f08@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=ededba317ddeca8b3f08 Fixes: 751e7489c1d7 ("wifi: mac80211: expose ieee80211_chan_width_to_rx_bw() to drivers") Tested-by: syzbot+ededba317ddeca8b3f08@syzkaller.appspotmail.com Signed-off-by: Moon Hee Lee <moonhee.lee.ca@gmail.com> Link: https://patch.msgid.link/20250703193756.46622-2-moonhee.lee.ca@gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-07wifi: mac80211: fix non-transmitted BSSID profile searchJohannes Berg
When the non-transmitted BSSID profile is found, immediately return from the search to not return the wrong profile_len when the profile is found in a multiple BSSID element that isn't the last one in the frame. Fixes: 5023b14cf4df ("mac80211: support profile split between elements") Reported-by: Michael-CY Lee <michael-cy.lee@mediatek.com> Link: https://patch.msgid.link/20250630154501.f26cd45a0ecd.I28e0525d06e8a99e555707301bca29265cf20dc8@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-07wifi: mac80211: clear frame buffer to never leak stackJohannes Berg
In disconnect paths paths, local frame buffers are used to build deauthentication frames to send them over the air and as notifications to userspace. Some internal error paths (that, given no other bugs, cannot happen) don't always initialize the buffers before sending them to userspace, so in the presence of other bugs they can leak stack content. Initialize the buffers to avoid the possibility of this happening. Suggested-by: Zhongqiu Han <quic_zhonhan@quicinc.com> Link: https://patch.msgid.link/20250701072213.13004-2-johannes@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-07wifi: mac80211: correctly identify S1G short beaconLachlan Hodges
mac80211 identifies a short beacon by the presence of the next TBTT field, however the standard actually doesn't explicitly state that the next TBTT can't be in a long beacon or even that it is required in a short beacon - and as a result this validation does not work for all vendor implementations. The standard explicitly states that an S1G long beacon shall contain the S1G beacon compatibility element as the first element in a beacon transmitted at a TBTT that is not a TSBTT (Target Short Beacon Transmission Time) as per IEEE80211-2024 11.1.3.10.1. This is validated by 9.3.4.3 Table 9-76 which states that the S1G beacon compatibility element is only allowed in the full set and is not allowed in the minimum set of elements permitted for use within short beacons. Correctly identify short beacons by the lack of an S1G beacon compatibility element as the first element in an S1G beacon frame. Fixes: 9eaffe5078ca ("cfg80211: convert S1G beacon to scan results") Signed-off-by: Simon Wadsworth <simon@morsemicro.com> Signed-off-by: Lachlan Hodges <lachlan.hodges@morsemicro.com> Link: https://patch.msgid.link/20250701075541.162619-1-lachlan.hodges@morsemicro.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-04libceph: Rename hmac_sha256() to ceph_hmac_sha256()Eric Biggers
Rename hmac_sha256() to ceph_hmac_sha256(), to avoid a naming conflict with the upcoming hmac_sha256() library function. This code will be able to use the HMAC-SHA256 library, but that's left for a later commit. Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250630160645.3198-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-04af_unix: enable handing out pidfds for reaped tasks in SCM_PIDFDAlexander Mikhalitsyn
Now everything is ready to pass PIDFD_STALE to pidfd_prepare(). This will allow opening pidfd for reaped tasks. Cc: linux-kernel@vger.kernel.org Cc: netdev@vger.kernel.org Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Simon Horman <horms@kernel.org> Cc: Willem de Bruijn <willemb@google.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Brauner <brauner@kernel.org> Cc: Kuniyuki Iwashima <kuniyu@google.com> Cc: Lennart Poettering <mzxreary@0pointer.de> Cc: Luca Boccassi <bluca@debian.org> Cc: David Rheinsberg <david@readahead.eu> Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Link: https://lore.kernel.org/20250703222314.309967-7-aleksandr.mikhalitsyn@canonical.com Reviewed-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-04af_unix: stash pidfs dentry when neededAlexander Mikhalitsyn
We need to ensure that pidfs dentry is allocated when we meet any struct pid for the first time. This will allows us to open pidfd even after the task it corresponds to is reaped. Basically, we need to identify all places where we fill skb/scm_cookie with struct pid reference for the first time and call pidfs_register_pid(). Tricky thing here is that we have a few places where this happends depending on what userspace is doing: - [__scm_replace_pid()] explicitly sending an SCM_CREDENTIALS message and specified pid in a numeric format - [unix_maybe_add_creds()] enabled SO_PASSCRED/SO_PASSPIDFD but didn't send SCM_CREDENTIALS explicitly - [scm_send()] force_creds is true. Netlink case, we don't need to touch it. Cc: linux-kernel@vger.kernel.org Cc: netdev@vger.kernel.org Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Simon Horman <horms@kernel.org> Cc: Leon Romanovsky <leon@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Brauner <brauner@kernel.org> Cc: Kuniyuki Iwashima <kuniyu@google.com> Cc: Lennart Poettering <mzxreary@0pointer.de> Cc: Luca Boccassi <bluca@debian.org> Cc: David Rheinsberg <david@readahead.eu> Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Link: https://lore.kernel.org/20250703222314.309967-6-aleksandr.mikhalitsyn@canonical.com Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-04af_unix/scm: fix whitespace errorsAlexander Mikhalitsyn
Fix whitespace/formatting errors. Cc: linux-kernel@vger.kernel.org Cc: netdev@vger.kernel.org Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Simon Horman <horms@kernel.org> Cc: Leon Romanovsky <leon@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Brauner <brauner@kernel.org> Cc: Kuniyuki Iwashima <kuniyu@google.com> Cc: Lennart Poettering <mzxreary@0pointer.de> Cc: Luca Boccassi <bluca@debian.org> Cc: David Rheinsberg <david@readahead.eu> Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Link: https://lore.kernel.org/20250703222314.309967-5-aleksandr.mikhalitsyn@canonical.com Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-04af_unix: introduce and use scm_replace_pid() helperAlexander Mikhalitsyn
Existing logic in __scm_send() related to filling an struct scm_cookie with a proper struct pid reference is already pretty tricky. Let's simplify it a bit by introducing a new helper. This helper will be extended in one of the next patches. Cc: linux-kernel@vger.kernel.org Cc: netdev@vger.kernel.org Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Simon Horman <horms@kernel.org> Cc: Willem de Bruijn <willemb@google.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Brauner <brauner@kernel.org> Cc: Kuniyuki Iwashima <kuniyu@google.com> Cc: Lennart Poettering <mzxreary@0pointer.de> Cc: Luca Boccassi <bluca@debian.org> Cc: David Rheinsberg <david@readahead.eu> Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Link: https://lore.kernel.org/20250703222314.309967-4-aleksandr.mikhalitsyn@canonical.com Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-04af_unix: introduce unix_skb_to_scm helperAlexander Mikhalitsyn
Instead of open-coding let's consolidate this logic in a separate helper. This will simplify further changes. Cc: linux-kernel@vger.kernel.org Cc: netdev@vger.kernel.org Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Simon Horman <horms@kernel.org> Cc: Leon Romanovsky <leon@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Brauner <brauner@kernel.org> Cc: Kuniyuki Iwashima <kuniyu@google.com> Cc: Lennart Poettering <mzxreary@0pointer.de> Cc: Luca Boccassi <bluca@debian.org> Cc: David Rheinsberg <david@readahead.eu> Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Link: https://lore.kernel.org/20250703222314.309967-3-aleksandr.mikhalitsyn@canonical.com Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-04af_unix: rework unix_maybe_add_creds() to allow sleepAlexander Mikhalitsyn
As a preparation for the next patches we need to allow sleeping in unix_maybe_add_creds() and also return err. Currently, we can't do that as unix_maybe_add_creds() is being called under unix_state_lock(). There is no need for this, really. So let's move call sites of this helper a bit and do necessary function signature changes. Cc: linux-kernel@vger.kernel.org Cc: netdev@vger.kernel.org Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Simon Horman <horms@kernel.org> Cc: Leon Romanovsky <leon@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Brauner <brauner@kernel.org> Cc: Kuniyuki Iwashima <kuniyu@google.com> Cc: Lennart Poettering <mzxreary@0pointer.de> Cc: Luca Boccassi <bluca@debian.org> Cc: David Rheinsberg <david@readahead.eu> Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Link: https://lore.kernel.org/20250703222314.309967-2-aleksandr.mikhalitsyn@canonical.com Reviewed-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-04xfrm: Skip redundant statistics update for crypto offloadJianbo Liu
In the crypto offload path, every packet is still processed by the software stack. The state's statistics required for the expiration check are being updated in software. However, the code also calls xfrm_dev_state_update_stats(), which triggers a query to the hardware device to fetch statistics. This hardware query is redundant and introduces unnecessary performance overhead. Skip this call when it's crypto offload (not packet offload) to avoid the unnecessary hardware access, thereby improving performance. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2025-07-04xfrm: interface: fix use-after-free after changing collect_md xfrm interfaceEyal Birger
collect_md property on xfrm interfaces can only be set on device creation, thus xfrmi_changelink() should fail when called on such interfaces. The check to enforce this was done only in the case where the xi was returned from xfrmi_locate() which doesn't look for the collect_md interface, and thus the validation was never reached. Calling changelink would thus errornously place the special interface xi in the xfrmi_net->xfrmi hash, but since it also exists in the xfrmi_net->collect_md_xfrmi pointer it would lead to a double free when the net namespace was taken down [1]. Change the check to use the xi from netdev_priv which is available earlier in the function to prevent changes in xfrm collect_md interfaces. [1] resulting oops: [ 8.516540] kernel BUG at net/core/dev.c:12029! [ 8.516552] Oops: invalid opcode: 0000 [#1] SMP NOPTI [ 8.516559] CPU: 0 UID: 0 PID: 12 Comm: kworker/u80:0 Not tainted 6.15.0-virtme #5 PREEMPT(voluntary) [ 8.516565] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 8.516569] Workqueue: netns cleanup_net [ 8.516579] RIP: 0010:unregister_netdevice_many_notify+0x101/0xab0 [ 8.516590] Code: 90 0f 0b 90 48 8b b0 78 01 00 00 48 8b 90 80 01 00 00 48 89 56 08 48 89 32 4c 89 80 78 01 00 00 48 89 b8 80 01 00 00 eb ac 90 <0f> 0b 48 8b 45 00 4c 8d a0 88 fe ff ff 48 39 c5 74 5c 41 80 bc 24 [ 8.516593] RSP: 0018:ffffa93b8006bd30 EFLAGS: 00010206 [ 8.516598] RAX: ffff98fe4226e000 RBX: ffffa93b8006bd58 RCX: ffffa93b8006bc60 [ 8.516601] RDX: 0000000000000004 RSI: 0000000000000000 RDI: dead000000000122 [ 8.516603] RBP: ffffa93b8006bdd8 R08: dead000000000100 R09: ffff98fe4133c100 [ 8.516605] R10: 0000000000000000 R11: 00000000000003d2 R12: ffffa93b8006be00 [ 8.516608] R13: ffffffff96c1a510 R14: ffffffff96c1a510 R15: ffffa93b8006be00 [ 8.516615] FS: 0000000000000000(0000) GS:ffff98fee73b7000(0000) knlGS:0000000000000000 [ 8.516619] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 8.516622] CR2: 00007fcd2abd0700 CR3: 000000003aa40000 CR4: 0000000000752ef0 [ 8.516625] PKRU: 55555554 [ 8.516627] Call Trace: [ 8.516632] <TASK> [ 8.516635] ? rtnl_is_locked+0x15/0x20 [ 8.516641] ? unregister_netdevice_queue+0x29/0xf0 [ 8.516650] ops_undo_list+0x1f2/0x220 [ 8.516659] cleanup_net+0x1ad/0x2e0 [ 8.516664] process_one_work+0x160/0x380 [ 8.516673] worker_thread+0x2aa/0x3c0 [ 8.516679] ? __pfx_worker_thread+0x10/0x10 [ 8.516686] kthread+0xfb/0x200 [ 8.516690] ? __pfx_kthread+0x10/0x10 [ 8.516693] ? __pfx_kthread+0x10/0x10 [ 8.516697] ret_from_fork+0x82/0xf0 [ 8.516705] ? __pfx_kthread+0x10/0x10 [ 8.516709] ret_from_fork_asm+0x1a/0x30 [ 8.516718] </TASK> Fixes: abc340b38ba2 ("xfrm: interface: support collect metadata mode") Reported-by: Lonial Con <kongln9170@gmail.com> Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2025-07-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netPaolo Abeni
Cross-merge networking fixes after downstream PR (net-6.16-rc5). No conflicts. No adjacent changes. Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-03Merge tag 'net-6.16-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from Bluetooth. Current release - new code bugs: - eth: - txgbe: fix the issue of TX failure - ngbe: specify IRQ vector when the number of VFs is 7 Previous releases - regressions: - sched: always pass notifications when child class becomes empty - ipv4: fix stat increase when udp early demux drops the packet - bluetooth: prevent unintended pause by checking if advertising is active - virtio: fix error reporting in virtqueue_resize - eth: - virtio-net: - ensure the received length does not exceed allocated size - fix the xsk frame's length check - lan78xx: fix WARN in __netif_napi_del_locked on disconnect Previous releases - always broken: - bluetooth: mesh: check instances prior disabling advertising - eth: - idpf: convert control queue mutex to a spinlock - dpaa2: fix xdp_rxq_info leak - amd-xgbe: align CL37 AN sequence as per databook" * tag 'net-6.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (38 commits) vsock/vmci: Clear the vmci transport packet properly when initializing it dt-bindings: net: sophgo,sg2044-dwmac: Drop status from the example net: ngbe: specify IRQ vector when the number of VFs is 7 net: wangxun: revert the adjustment of the IRQ vector sequence net: txgbe: request MISC IRQ in ndo_open virtio_net: Enforce minimum TX ring size for reliability virtio_net: Cleanup '2+MAX_SKB_FRAGS' virtio_ring: Fix error reporting in virtqueue_resize virtio-net: xsk: rx: fix the frame's length check virtio-net: use the check_mergeable_len helper virtio-net: remove redundant truesize check with PAGE_SIZE virtio-net: ensure the received length does not exceed allocated size net: ipv4: fix stat increase when udp early demux drops the packet net: libwx: fix the incorrect display of the queue number amd-xgbe: do not double read link status net/sched: Always pass notifications when child class becomes empty nui: Fix dma_mapping_error() check rose: fix dangling neighbour pointers in rose_rt_device_down() enic: fix incorrect MTU comparison in enic_change_mtu() amd-xgbe: align CL37 AN sequence as per databook ...
2025-07-03Bluetooth: hci_event: Fix not marking Broadcast Sink BIS as connectedLuiz Augusto von Dentz
Upon receiving HCI_EVT_LE_BIG_SYNC_ESTABLISHED with status 0x00 (success) the corresponding BIS hci_conn state shall be set to BT_CONNECTED otherwise they will be left with BT_OPEN which is invalid at that point, also create the debugfs and sysfs entries following the same logic as the likes of Broadcast Source BIS and CIS connections. Fixes: f777d8827817 ("Bluetooth: ISO: Notify user space about failed bis connections") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-07-03Bluetooth: hci_sync: Fix attempting to send HCI_Disconnect to BIS handleLuiz Augusto von Dentz
BIS/PA connections do have their own cleanup proceedure which are performed by hci_conn_cleanup/bis_cleanup. Fixes: 23205562ffc8 ("Bluetooth: separate CIS_LINK and BIS_LINK link types") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-07-03Bluetooth: hci_sync: Fix not disabling advertising instanceLuiz Augusto von Dentz
As the code comments on hci_setup_ext_adv_instance_sync suggests the advertising instance needs to be disabled in order to update its parameters, but it was wrongly checking that !adv->pending. Fixes: cba6b758711c ("Bluetooth: hci_sync: Make use of hci_cmd_sync_queue set 2") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-07-03ipv6: Cleanup fib6_drop_pcpu_from()Yue Haibing
Since commit 0e2338749192 ("ipv6: fix races in ip6_dst_destroy()"), 'table' is unused in __fib6_drop_pcpu_from(), no need pass it from fib6_drop_pcpu_from(). Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250701041235.1333687-1-yuehaibing@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-03netfilter: conntrack: remove DCCP protocol supportPablo Neira Ayuso
The DCCP socket family has now been removed from this tree, see: 8bb3212be4b4 ("Merge branch 'net-retire-dccp-socket'") Remove connection tracking and NAT support for this protocol, this should not pose a problem because no DCCP traffic is expected to be seen on the wire. As for the code for matching on dccp header for iptables and nftables, mark it as deprecated and keep it in place. Ruleset restoration is an atomic operation. Without dccp matching support, an astray match on dccp could break this operation leaving your computer with no policy in place, so let's follow a more conservative approach for matches. Add CONFIG_NFT_EXTHDR_DCCP which is set to 'n' by default to deprecate dccp extension support. Similarly, label CONFIG_NETFILTER_XT_MATCH_DCCP as deprecated too and also set it to 'n' by default. Code to match on DCCP protocol from ebtables also remains in place, this is just a few checks on IPPROTO_DCCP from _check() path which is exercised when ruleset is loaded. There is another use of IPPROTO_DCCP from the _check() path in the iptables multiport match. Another check for IPPROTO_DCCP from the packet in the reject target is also removed. So let's schedule removal of the dccp matching for a second stage, this should not interfer with the dccp retirement since this is only matching on the dccp header. Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-07-03vsock/vmci: Clear the vmci transport packet properly when initializing itHarshaVardhana S A
In vmci_transport_packet_init memset the vmci_transport_packet before populating the fields to avoid any uninitialised data being left in the structure. Cc: Bryan Tan <bryan-bt.tan@broadcom.com> Cc: Vishnu Dasa <vishnu.dasa@broadcom.com> Cc: Broadcom internal kernel review list Cc: Stefano Garzarella <sgarzare@redhat.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Simon Horman <horms@kernel.org> Cc: virtualization@lists.linux.dev Cc: netdev@vger.kernel.org Cc: stable <stable@kernel.org> Signed-off-by: HarshaVardhana S A <harshavardhana.sa@broadcom.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Fixes: d021c344051a ("VSOCK: Introduce VM Sockets") Acked-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://patch.msgid.link/20250701122254.2397440-1-gregkh@linuxfoundation.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-02rpc_create_client_dir(): return 0 or -E...Al Viro
Callers couldn't care less which dentry did we get - anything valid is treated as success. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-07-02rpc_create_client_dir(): don't bother with rpc_populate()Al Viro
not for a single file... Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-07-02rpc_new_dir(): the last argument is always NULLAl Viro
All callers except the one in rpc_populate() pass explicit NULL there; rpc_populate() passes its last argument ('private') instead, but in the only call of rpc_populate() that creates any subdirectories (when creating fixed subdirectories of root) private itself is NULL. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-07-02rpc_pipe: expand the calls of rpc_mkdir_populate()Al Viro
... and get rid of convoluted callbacks. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-07-02rpc_gssd_dummy_populate(): don't bother with rpc_populate()Al Viro
Just have it create gssd (in root), clntXX in gssd, then info and gssd in clntXX - all with explicit rpc_new_dir()/rpc_new_file()/rpc_mkpipe_dentry(). Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-07-02rpc_mkpipe_dentry(): switch to simple_start_creating()Al Viro
... and make sure we set the fs-private part of inode up before attaching it to dentry. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-07-02rpc_pipe: saner primitive for creating regular filesAl Viro
rpc_new_file(); similar to rpc_new_dir(), except that here we pass file_operations as well. Callers don't care about dentry, just return 0 or -E... Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-07-02rpc_pipe: saner primitive for creating subdirectoriesAl Viro
All users of __rpc_mkdir() have the same form - start_creating(), followed, in case of success, by __rpc_mkdir() and unlocking parent. Combine that into a single helper, expanding __rpc_mkdir() into it, along with the call of __rpc_create_common() in it. Don't mess with d_drop() + d_add() - just d_instantiate() and be done with that. The reason __rpc_create_common() goes for that dance is that dentry it gets might or might not be hashed; here we know it's hashed. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-07-02rpc_pipe: don't overdo directory lockingAl Viro
Don't try to hold directories locked more than VFS requires; lock just before getting a child to be made positive (using simple_start_creating()) and unlock as soon as the child is created. There's no benefit in keeping the parent locked while populating the child - it won't stop dcache lookups anyway. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-07-02rpc_mkpipe_dentry(): saner calling conventionsAl Viro
Instead of returning a dentry or ERR_PTR(-E...), return 0 and store dentry into pipe->dentry on success and return -E... on failure. Callers are happier that way... NOTE: dummy rpc_pipe is getting ->dentry set; we never access that, since we 1) never call rpc_unlink() for it (dentry is taken out by ->kill_sb()) 2) never call rpc_queue_upcall() for it (writing to that sucker fails; no downcalls are ever submitted, so no replies are going to arrive) IOW, having that ->dentry set (and left dangling) is harmless, if ugly; cleaner solution will take more massage. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-07-02rpc_unlink(): saner calling conventionsAl Viro
1) pass it pipe instead of pipe->dentry 2) zero pipe->dentry afterwards 3) it always returns 0; why bother? Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-07-02rpc_populate(): lift cleanup into callersAl Viro
rpc_populate() is called either from fill_super (where we don't need to remove any files on failure - rpc_kill_sb() will take them all out anyway) or from rpc_mkdir_populate(), where we need to remove the directory we'd been trying to populate along with whatever we'd put into it before we failed. Simpler to combine that into simple_recursive_removal() there. Note that rpc_pipe is overlocking directories quite a bit - locked parent is no obstacle to finding a child in dcache, so keeping it locked won't prevent userland observing a partially built subtree. All we need is to follow minimal VFS requirements; it's not as if clients used directory locking for exclusion - tree changes are serialized, but that's done on ->pipefs_sb_lock. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>