summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2024-03-06Bluetooth: hci_sync: Attempt to dequeue connection attemptLuiz Augusto von Dentz
If connection is still queued/pending in the cmd_sync queue it means no command has been generated and it should be safe to just dequeue the callback when it is being aborted. Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-03-06Bluetooth: hci_sync: Add helper functions to manipulate cmd_sync queueLuiz Augusto von Dentz
This adds functions to queue, dequeue and lookup into the cmd_sync list. Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-03-06Bluetooth: hci_conn: Fix UAF Write in __hci_acl_create_connection_syncLuiz Augusto von Dentz
This fixes the UAF on __hci_acl_create_connection_sync caused by connection abortion, it uses the same logic as to LE_LINK which uses hci_cmd_sync_cancel to prevent the callback to run if the connection is abort prematurely. Reported-by: syzbot+3f0a39be7a2035700868@syzkaller.appspotmail.com Fixes: 45340097ce6e ("Bluetooth: hci_conn: Only do ACL connections sequentially") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-03-06Bluetooth: hci_conn: Always use sk_timeo as conn_timeoutLuiz Augusto von Dentz
This aligns the use socket sk_timeo as conn_timeout when initiating a connection and then use it when scheduling the resulting HCI command, that way the command is actually aborted synchronously thus not blocking commands generated by hci_abort_conn_sync to inform the controller the connection is to be aborted. Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-03-06Bluetooth: Remove pending ACL connection attemptsJonas Dreßler
With the last commit we moved to using the hci_sync queue for "Create Connection" requests, removing the need for retrying the paging after finished/failed "Create Connection" requests and after the end of inquiries. hci_conn_check_pending() was used to trigger this retry, we can remove it now. Note that we can also remove the special handling for COMMAND_DISALLOWED errors in the completion handler of "Create Connection", because "Create Connection" requests are now always serialized. This is somewhat reverting commit 4c67bc74f016 ("[Bluetooth] Support concurrent connect requests"). With this, the BT_CONNECT2 state of ACL hci_conn objects should now be back to meaning only one thing: That we received a "Connection Request" from another device (see hci_conn_request_evt), but the response to that is going to be deferred. Signed-off-by: Jonas Dreßler <verdre@v0yd.nl> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-03-06Bluetooth: hci_conn: Only do ACL connections sequentiallyJonas Dreßler
Pretty much all bluetooth chipsets only support paging a single device at a time, and if they don't reject a secondary "Create Connection" request while another is still ongoing, they'll most likely serialize those requests in the firware. With commit 4c67bc74f016 ("[Bluetooth] Support concurrent connect requests") we started adding some serialization of our own in case the adapter returns "Command Disallowed" HCI error. This commit was using the BT_CONNECT2 state for the serialization, this state is also used for a few more things (most notably to indicate we're waiting for an inquiry to cancel) and therefore a bit unreliable. Also not all BT firwares would respond with "Command Disallowed" on too many connection requests, some will also respond with "Hardware Failure" (BCM4378), and others will error out later and send a "Connect Complete" event with error "Rejected Limited Resources" (Marvell 88W8897). We can clean things up a bit and also make the serialization more reliable by using our hci_sync machinery to always do "Create Connection" requests in a sequential manner. This is very similar to what we're already doing for establishing LE connections, and it works well there. Note that this causes a test failure in mgmt-tester (test "Pair Device - Power off 1") because the hci_abort_conn_sync() changes the error we return on timeout of the "Create Connection". We'll fix this on the mgmt-tester side by adjusting the expected error for the test. Signed-off-by: Jonas Dreßler <verdre@v0yd.nl> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-03-06Bluetooth: Remove BT_HSLuiz Augusto von Dentz
High Speed, Alternate MAC and PHY (AMP) extension, has been removed from Bluetooth Core specification on 5.3: https://www.bluetooth.com/blog/new-core-specification-v5-3-feature-enhancements/ Fixes: 244bc377591c ("Bluetooth: Add BT_HS config option") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-03-06Bluetooth: hci_core: Cancel request on command timeoutLuiz Augusto von Dentz
If command has timed out call __hci_cmd_sync_cancel to notify the hci_req since it will inevitably cause a timeout. This also rework the code around __hci_cmd_sync_cancel since it was wrongly assuming it needs to cancel timer as well, but sometimes the timers have not been started or in fact they already had timed out in which case they don't need to be cancel yet again. Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-03-06Bluetooth: hci_event: Use HCI error defines instead of magic valuesJonas Dreßler
We have error defines already, so let's use them. Signed-off-by: Jonas Dreßler <verdre@v0yd.nl> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-03-06Bluetooth: Add new state HCI_POWERING_DOWNJonas Dreßler
Add a new state HCI_POWERING_DOWN that indicates that the device is currently powering down, this will be useful for the next commit. Signed-off-by: Jonas Dreßler <verdre@v0yd.nl> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-03-06Bluetooth: Remove HCI_POWER_OFF_TIMEOUTJonas Dreßler
With commit cf75ad8b41d2 ("Bluetooth: hci_sync: Convert MGMT_SET_POWERED"), the power off sequence got refactored so that this timeout was no longer necessary, let's remove the leftover define from the header too. Fixes: cf75ad8b41d2 ("Bluetooth: hci_sync: Convert MGMT_SET_POWERED") Signed-off-by: Jonas Dreßler <verdre@v0yd.nl> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-03-06mac802154: fix llsec key resources release in mac802154_llsec_key_delFedor Pchelkin
mac802154_llsec_key_del() can free resources of a key directly without following the RCU rules for waiting before the end of a grace period. This may lead to use-after-free in case llsec_lookup_key() is traversing the list of keys in parallel with a key deletion: refcount_t: addition on 0; use-after-free. WARNING: CPU: 4 PID: 16000 at lib/refcount.c:25 refcount_warn_saturate+0x162/0x2a0 Modules linked in: CPU: 4 PID: 16000 Comm: wpan-ping Not tainted 6.7.0 #19 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 RIP: 0010:refcount_warn_saturate+0x162/0x2a0 Call Trace: <TASK> llsec_lookup_key.isra.0+0x890/0x9e0 mac802154_llsec_encrypt+0x30c/0x9c0 ieee802154_subif_start_xmit+0x24/0x1e0 dev_hard_start_xmit+0x13e/0x690 sch_direct_xmit+0x2ae/0xbc0 __dev_queue_xmit+0x11dd/0x3c20 dgram_sendmsg+0x90b/0xd60 __sys_sendto+0x466/0x4c0 __x64_sys_sendto+0xe0/0x1c0 do_syscall_64+0x45/0xf0 entry_SYSCALL_64_after_hwframe+0x6e/0x76 Also, ieee802154_llsec_key_entry structures are not freed by mac802154_llsec_key_del(): unreferenced object 0xffff8880613b6980 (size 64): comm "iwpan", pid 2176, jiffies 4294761134 (age 60.475s) hex dump (first 32 bytes): 78 0d 8f 18 80 88 ff ff 22 01 00 00 00 00 ad de x......."....... 00 00 00 00 00 00 00 00 03 00 cd ab 00 00 00 00 ................ backtrace: [<ffffffff81dcfa62>] __kmem_cache_alloc_node+0x1e2/0x2d0 [<ffffffff81c43865>] kmalloc_trace+0x25/0xc0 [<ffffffff88968b09>] mac802154_llsec_key_add+0xac9/0xcf0 [<ffffffff8896e41a>] ieee802154_add_llsec_key+0x5a/0x80 [<ffffffff8892adc6>] nl802154_add_llsec_key+0x426/0x5b0 [<ffffffff86ff293e>] genl_family_rcv_msg_doit+0x1fe/0x2f0 [<ffffffff86ff46d1>] genl_rcv_msg+0x531/0x7d0 [<ffffffff86fee7a9>] netlink_rcv_skb+0x169/0x440 [<ffffffff86ff1d88>] genl_rcv+0x28/0x40 [<ffffffff86fec15c>] netlink_unicast+0x53c/0x820 [<ffffffff86fecd8b>] netlink_sendmsg+0x93b/0xe60 [<ffffffff86b91b35>] ____sys_sendmsg+0xac5/0xca0 [<ffffffff86b9c3dd>] ___sys_sendmsg+0x11d/0x1c0 [<ffffffff86b9c65a>] __sys_sendmsg+0xfa/0x1d0 [<ffffffff88eadbf5>] do_syscall_64+0x45/0xf0 [<ffffffff890000ea>] entry_SYSCALL_64_after_hwframe+0x6e/0x76 Handle the proper resource release in the RCU callback function mac802154_llsec_key_del_rcu(). Note that if llsec_lookup_key() finds a key, it gets a refcount via llsec_key_get() and locally copies key id from key_entry (which is a list element). So it's safe to call llsec_key_put() and free the list entry after the RCU grace period elapses. Found by Linux Verification Center (linuxtesting.org). Fixes: 5d637d5aabd8 ("mac802154: add llsec structures and mutators") Cc: stable@vger.kernel.org Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Acked-by: Alexander Aring <aahringo@redhat.com> Message-ID: <20240228163840.6667-1-pchelkin@ispras.ru> Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
2024-03-06ethtool: Add GTP RSS hash options to ethtool.hTakeru Hayasaka
This is a patch that enables RSS functionality for GTP packets using ethtool. A user can include TEID and make RSS work for GTP-U over IPv4 by doing the following:`ethtool -N ens3 rx-flow-hash gtpu4 sde` In addition to gtpu(4|6), we now support gtpc(4|6),gtpc(4|6)t,gtpu(4|6)e, gtpu(4|6)u, and gtpu(4|6)d. gtpc(4|6): Used for GTP-C in IPv4 and IPv6, where the GTP header format does not include a TEID. gtpc(4|6)t: Used for GTP-C in IPv4 and IPv6, with a GTP header format that includes a TEID. gtpu(4|6): Used for GTP-U in both IPv4 and IPv6 scenarios. gtpu(4|6)e: Used for GTP-U with extended headers in both IPv4 and IPv6. gtpu(4|6)u: Used when the PSC (PDU session container) in the GTP-U extended header includes Uplink, applicable to both IPv4 and IPv6. gtpu(4|6)d: Used when the PSC in the GTP-U extended header includes Downlink, for both IPv4 and IPv6. GTP generates a flow that includes an ID called TEID to identify the tunnel. This tunnel is created for each UE (User Equipment).By performing RSS based on this flow, it is possible to apply RSS for each communication unit from the UE. Without this, RSS would only be effective within the range of IP addresses. For instance, the PGW can only perform RSS within the IP range of the SGW. Problematic from a load distribution perspective, especially if there's a bias in the terminals connected to a particular base station.This case can be solved by using this patch. Signed-off-by: Takeru Hayasaka <hayatake396@gmail.com> Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-06Merge tag 'vfs-6.8-release.fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Get rid of copy_mc flag in iov_iter which really only makes sense for the core dumping code so move it out of the generic iov iter code and make it coredump's problem. See the detailed commit description. - Revert fs/aio: Make io_cancel() generate completions again The initial fix here was predicated on the assumption that calling ki_cancel() didn't complete aio requests. However, that turned out to be wrong since the two drivers that actually make use of this set a cancellation function that performs the cancellation correctly. So revert this change. - Ensure that the test for IOCB_AIO_RW always happens before the read from ki_ctx. * tag 'vfs-6.8-release.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: iov_iter: get rid of 'copy_mc' flag fs/aio: Check IOCB_AIO_RW before the struct aio_kiocb conversion Revert "fs/aio: Make io_cancel() generate completions again"
2024-03-06inet: Add getsockopt support for IP_ROUTER_ALERT and IPV6_ROUTER_ALERTJuntong Deng
Currently getsockopt does not support IP_ROUTER_ALERT and IPV6_ROUTER_ALERT, and we are unable to get the values of these two socket options through getsockopt. This patch adds getsockopt support for IP_ROUTER_ALERT and IPV6_ROUTER_ALERT. Signed-off-by: Juntong Deng <juntong.deng@outlook.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-06iov_iter: get rid of 'copy_mc' flagLinus Torvalds
This flag is only set by one single user: the magical core dumping code that looks up user pages one by one, and then writes them out using their kernel addresses (by using a BVEC_ITER). That actually ends up being a huge problem, because while we do use copy_mc_to_kernel() for this case and it is able to handle the possible machine checks involved, nothing else is really ready to handle the failures caused by the machine check. In particular, as reported by Tong Tiangen, we don't actually support fault_in_iov_iter_readable() on a machine check area. As a result, the usual logic for writing things to a file under a filesystem lock, which involves doing a copy with page faults disabled and then if that fails trying to fault pages in without holding the locks with fault_in_iov_iter_readable() does not work at all. We could decide to always just make the MC copy "succeed" (and filling the destination with zeroes), and that would then create a core dump file that just ignores any machine checks. But honestly, this single special case has been problematic before, and means that all the normal iov_iter code ends up slightly more complex and slower. See for example commit c9eec08bac96 ("iov_iter: Don't deal with iter->copy_mc in memcpy_from_iter_mc()") where David Howells re-organized the code just to avoid having to check the 'copy_mc' flags inside the inner iov_iter loops. So considering that we have exactly one user, and that one user is a non-critical special case that doesn't actually ever trigger in real life (Tong found this with manual error injection), the sane solution is to just decide that the onus on handling the machine check lines on that user instead. Ergo, do the copy_mc_to_kernel() in the core dump logic itself, copying the user data to a stable kernel page before writing it out. Fixes: f1982740f5e7 ("iov_iter: Convert iterate*() to inline funcs") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Tong Tiangen <tongtiangen@huawei.com> Link: https://lore.kernel.org/r/20240305133336.3804360-1-tongtiangen@huawei.com Link: https://lore.kernel.org/all/4e80924d-9c85-f13a-722a-6a5d2b1c225a@huawei.com/ Tested-by: David Howells <dhowells@redhat.com> Reviewed-by: David Howells <dhowells@redhat.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Reported-by: Tong Tiangen <tongtiangen@huawei.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-03-05net: phy: Add phy_support_eee() indicating MAC support EEEAndrew Lunn
In order for EEE to operate, both the MAC and the PHY need to support it, similar to how pause works. With some exception - a number of PHYs have SmartEEE or AutoGrEEEn support in order to provide some EEE-like power savings with non-EEE capable MACs. Copy the pause concept and add the call phy_support_eee() which the MAC makes after connecting the PHY to indicate it supports EEE. phylib will then advertise EEE when auto-neg is performed. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://lore.kernel.org/r/20240302195306.3207716-6-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05net: phy: Keep track of EEE configurationAndrew Lunn
Have phylib keep track of the EEE configuration. This simplifies the MAC drivers, in that they don't need to store it. Future patches to phylib will also make use of this information to further simplify the MAC drivers. Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://lore.kernel.org/r/20240302195306.3207716-4-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05net: phy: Add phydev->enable_tx_lpi to simplify adjust link callbacksAndrew Lunn
MAC drivers which support EEE need to know the results of the EEE auto-neg in order to program the hardware to perform EEE or not. The oddly named phy_init_eee() can be used to determine this, it returns 0 if EEE should be used, or a negative error code, e.g. -EOPPROTONOTSUPPORT if the PHY does not support EEE or negotiate resulted in it not being used. However, many MAC drivers get this wrong. Add phydev->enable_tx_lpi which indicates the result of the autoneg for EEE, including if EEE is administratively disabled with ethtool. The MAC driver can then access this in the same way as link speed and duplex in the adjust link callback. If enable_tx_lpi is true, the MAC should send low power indications and does not need to consider anything else with respect to EEE. Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://lore.kernel.org/r/20240302195306.3207716-3-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05net: add helpers for EEE configurationRussell King
Add helpers that phylib and phylink can use to manage EEE configuration and determine whether the MAC should be permitted to use LPI based on that configuration. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://lore.kernel.org/r/20240302195306.3207716-2-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05dpll: move all dpll<>netdev helpers to dpll codeJakub Kicinski
Older versions of GCC really want to know the full definition of the type involved in rcu_assign_pointer(). struct dpll_pin is defined in a local header, net/core can't reach it. Move all the netdev <> dpll code into dpll, where the type is known. Otherwise we'd need multiple function calls to jump between the compilation units. This is the same problem the commit under fixes was trying to address, but with rcu_assign_pointer() not rcu_dereference(). Some of the exports are not needed, networking core can't be a module, we only need exports for the helpers used by drivers. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://lore.kernel.org/all/35a869c8-52e8-177-1d4d-e57578b99b6@linux-m68k.org/ Fixes: 640f41ed33b5 ("dpll: fix build failure due to rcu_dereference_check() on unknown type") Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240305013532.694866-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05rxrpc: Use ktimes for call timeout tracking and set the timer lazilyDavid Howells
Track the call timeouts as ktimes rather than jiffies as the latter's granularity is too high and only set the timer at the end of the event handling function. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: "David S. Miller" <davem@davemloft.net> cc: Eric Dumazet <edumazet@google.com> cc: Jakub Kicinski <kuba@kernel.org> cc: Paolo Abeni <pabeni@redhat.com> cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org
2024-03-05rxrpc: Differentiate PING ACK transmission traces.David Howells
There are three points that transmit PING ACKs and all of them use the same trace string. Change two of them to use different strings. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: "David S. Miller" <davem@davemloft.net> cc: Eric Dumazet <edumazet@google.com> cc: Jakub Kicinski <kuba@kernel.org> cc: Paolo Abeni <pabeni@redhat.com> cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org
2024-03-05Merge tag 'hyperv-fixes-signed-20240303' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv fixes from Wei Liu: - Multiple fixes, cleanups and documentations for Hyper-V core code and drivers * tag 'hyperv-fixes-signed-20240303' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: Drivers: hv: vmbus: make hv_bus const x86/hyperv: Allow 15-bit APIC IDs for VTL platforms x86/hyperv: Make encrypted/decrypted changes safe for load_unaligned_zeropad() x86/mm: Regularize set_memory_p() parameters and make non-static x86/hyperv: Use slow_virt_to_phys() in page transition hypervisor callback Documentation: hyperv: Add overview of PCI pass-thru device support Drivers: hv: vmbus: Update indentation in create_gpadl_header() Drivers: hv: vmbus: Remove duplication and cleanup code in create_gpadl_header() fbdev/hyperv_fb: Fix logic error for Gen2 VMs in hvfb_getmem() Drivers: hv: vmbus: Calculate ring buffer size for more efficient use of memory hv_utils: Allow implicit ICTIMESYNCFLAG_SYNC
2024-03-05nfc: core: make nfc_class constantRicardo B. Marliere
Since commit 43a7206b0963 ("driver core: class: make class_register() take a const *"), the driver core allows for struct class to be in read-only memory, so move the nfc_class structure to be declared at build time placing it into read-only memory, instead of having to be dynamically allocated at boot time. Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240302-class_cleanup-net-next-v1-6-8fa378595b93@marliere.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05net: Re-use and set mono_delivery_time bit for userspace tstamp packetsAbhishek Chauhan
Bridge driver today has no support to forward the userspace timestamp packets and ends up resetting the timestamp. ETF qdisc checks the packet coming from userspace and encounters to be 0 thereby dropping time sensitive packets. These changes will allow userspace timestamps packets to be forwarded from the bridge to NIC drivers. Setting the same bit (mono_delivery_time) to avoid dropping of userspace tstamp packets in the forwarding path. Existing functionality of mono_delivery_time remains unaltered here, instead just extended with userspace tstamp support for bridge forwarding path. Signed-off-by: Abhishek Chauhan <quic_abchauha@quicinc.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20240301201348.2815102-1-quic_abchauha@quicinc.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-05net: gro: enable fast path for more casesEric Dumazet
Currently the so-called GRO fast path is only enabled for napi_frags_skb() callers. After the prior patch, we no longer have to clear frag0 whenever we pulled bytes to skb->head. We therefore can initialize frag0 to skb->data so that GRO fast path can be used in the following additional cases: - Drivers using header split (populating skb->data with headers, and having payload in one or more page fragments). - Drivers not using any page frag (entire packet is in skb->data) Add a likely() in skb_gro_may_pull() to help the compiler to generate better code if possible. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-05net: gro: change skb_gro_network_header()Eric Dumazet
Change skb_gro_network_header() to accept a const sk_buff and to no longer check if frag0 is NULL or not. This allows to remove skb_gro_frag0_invalidate() which is seen in profiles when header-split is enabled. sk_buff parameter is constified for skb_gro_header_fast(), inet_gro_compute_pseudo() and ip6_gro_compute_pseudo(). Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-05net: gro: rename skb_gro_header_hard()Eric Dumazet
skb_gro_header_hard() is renamed to skb_gro_may_pull() to match the convention used by common helpers like pskb_may_pull(). This means the condition is inverted: if (skb_gro_header_hard(skb, hlen)) slow_path(); becomes: if (!skb_gro_may_pull(skb, hlen)) slow_path(); Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-05net: introduce page_frag_cache_drain()Yunsheng Lin
When draining a page_frag_cache, most user are doing the similar steps, so introduce an API to avoid code duplication. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-05mm/page_alloc: modify page_frag_alloc_align() to accept align as an argumentYunsheng Lin
napi_alloc_frag_align() and netdev_alloc_frag_align() accept align as an argument, and they are thin wrappers around the __napi_alloc_frag_align() and __netdev_alloc_frag_align() APIs doing the alignment checking and align mask conversion, in order to call page_frag_alloc_align() directly. The intention here is to keep the alignment checking and the alignmask conversion in in-line wrapper to avoid those kind of operations during execution time since it can usually be handled during compile time. We are going to use page_frag_alloc_align() in vhost_net.c, it need the same kind of alignment checking and alignmask conversion, so split up page_frag_alloc_align into an inline wrapper doing the above operation, and add __page_frag_alloc_align() which is passed with the align mask the original function expected as suggested by Alexander. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> CC: Alexander Duyck <alexander.duyck@gmail.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-04Merge tag 'mlx5-fixes-2024-03-01' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2024-03-01 This series provides bug fixes to mlx5 driver. Please pull and let me know if there is any problem. * tag 'mlx5-fixes-2024-03-01' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5e: Switch to using _bh variant of of spinlock API in port timestamping NAPI poll context net/mlx5e: Use a memory barrier to enforce PTP WQ xmit submission tracking occurs after populating the metadata_map net/mlx5e: Fix MACsec state loss upon state update in offload path net/mlx5e: Change the warning when ignore_flow_level is not supported net/mlx5: Check capability for fw_reset net/mlx5: Fix fw reporter diagnose output net/mlx5: E-switch, Change flow rule destination checking Revert "net/mlx5e: Check the number of elements before walk TC rhashtable" Revert "net/mlx5: Block entering switchdev mode with ns inconsistency" ==================== Link: https://lore.kernel.org/r/20240302070318.62997-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-04tcp: align tcp_sock_write_rx groupEric Dumazet
Stephen Rothwell and kernel test robot reported that some arches (parisc, hexagon) and/or compilers would not like blamed commit. Lets make sure tcp_sock_write_rx group does not start with a hole. While we are at it, correct tcp_sock_write_tx CACHELINE_ASSERT_GROUP_SIZE() since after the blamed commit, we went to 105 bytes. Fixes: 99123622050f ("tcp: remove some holes in struct tcp_sock") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/netdev/20240301121108.5d39e4f9@canb.auug.org.au/ Closes: https://lore.kernel.org/oe-kbuild-all/202403011451.csPYOS3C-lkp@intel.com/ Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Simon Horman <horms@kernel.org> # build-tested Link: https://lore.kernel.org/r/20240301171945.2958176-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-04wifi: mac80211: introduce a feature flag for quiet in CSAJohannes Berg
When doing CSA in multi-link, there really isn't a need to stop transmissions entirely. Add a feature flag for drivers to indicate they can handle quiet in CSA (be it by parsing themselves, or by implementing drv_pre_channel_switch()), to make that possible. Also clean up the csa_block_tx handling: it clearly cannot handle multi-link due to the way queues are stopped, move it to the sdata. Drivers should be doing it themselves for working properly during CSA in MLO anyway. Also rename it to indicate that it reflects TX was blocked at mac80211. Reviewed-by: Miriam Rachel Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20240228095719.258439191541.I2469d206e2bf5cb244cfde2b4bbc2ae6d1cd3dd9@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-03-04wifi: mac80211: pass link conf to abort_channel_switchJohannes Berg
Pass the link conf to the abort_channel_switch driver method so the driver can handle things correctly. Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Reviewed-by: Miriam Rachel Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20240228095718.27f621106ddd.Iadd3d69b722ffe5934779a32a0e4e596a4e33ed4@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-03-04wifi: mac80211: pass link_id to channel switch opsJohannes Berg
For CSA to work correctly in multi-link scenarios, pass the link_id to the relevant callbacks. While at it, unify/deduplicate the tracing for them. Reviewed-by: Miriam Rachel Korenblit <miriam.rachel.korenblit@intel.com> Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Link: https://msgid.link/20240228095718.b7726635c054.I0be5d00af4acb48cfbd23a9dbf067f9aeb66469d@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-03-04wifi: cfg80211: allow cfg80211_defragment_element() without outputJohannes Berg
If we just want to determine the length of the fragmented data, we basically need the same logic, and really we want it to be _literally_ the same logic, so it cannot be out of sync in any way. Allow calling cfg80211_defragment_element() without an output buffer, where it then just returns the required output size. Also add this to the tests, just to exercise it, using the pre-calculated length to really do the defragmentation, which checks that this is sufficient. Reviewed-by: Miriam Rachel Korenblit <miriam.rachel.korenblit@intel.com> Reviewed-by: Benjamin Berg <benjamin.berg@intel.com> Link: https://msgid.link/20240228095718.6d6565b9e3f2.Ib441903f4b8644ba04b1c766f90580ee6f54fc66@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-03-04wifi: cfg80211: expose cfg80211_iter_rnr() to driversJohannes Berg
In mac80211 we'll need to look at reduced neighbor report entries for channel switch purposes, so export the iteration function to make that simpler. Reviewed-by: Miriam Rachel Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20240228095718.0954809964ef.I53e95c017aa71f14e8d1057afbbc75982ddb43df@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-03-04wifi: mac80211: add ieee80211_vif_link_active() helperJohannes Berg
We sometimes need to check if a link is active, and this is complicated by the fact that active_links has no bits set when the vif isn't (acting as) an MLD. Add a small new helper ieee80211_vif_link_active() to make that a bit easier, and use it in a few places. Reviewed-by: Ilan Peer <ilan.peer@intel.com> Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Reviewed-by: Miriam Rachel Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20240228094901.688760aff5f7.I06892a503f5ecb9563fbd678d35d08daf7a044b0@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-03-04wifi: mac80211: add link id to ieee80211_gtk_rekey_add()Shaul Triebitz
In MLO, we need the link id in the GTK key to be given by the driver after rekeying in wowlan, so add that. Signed-off-by: Shaul Triebitz <shaul.triebitz@intel.com> Reviewed-by: Miriam Rachel Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20240228094500.ce1bfc83a680.I43a6f8ab2804ee07116a37d5b9ec601b843464b1@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-03-04tracing/net_sched: Fix tracepoints that save qdisc_dev() as a stringSteven Rostedt (Google)
I'm updating __assign_str() and will be removing the second parameter. To make sure that it does not break anything, I make sure that it matches the __string() field, as that is where the string is actually going to be saved in. To make sure there's nothing that breaks, I added a WARN_ON() to make sure that what was used in __string() is the same that is used in __assign_str(). In doing this change, an error was triggered as __assign_str() now expects the string passed in to be a char * value. I instead had the following warning: include/trace/events/qdisc.h: In function ‘trace_event_raw_event_qdisc_reset’: include/trace/events/qdisc.h:91:35: error: passing argument 1 of 'strcmp' from incompatible pointer type [-Werror=incompatible-pointer-types] 91 | __assign_str(dev, qdisc_dev(q)); That's because the qdisc_enqueue() and qdisc_reset() pass in qdisc_dev(q) to __assign_str() and to __string(). But that function returns a pointer to struct net_device and not a string. It appears that these events are just saving the pointer as a string and then reading it as a string as well. Use qdisc_dev(q)->name to save the device instead. Fixes: a34dac0b90552 ("net_sched: add tracepoints for qdisc_reset() and qdisc_destroy()") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-04net: adopt skb_network_header_len() more broadlyEric Dumazet
(skb_transport_header(skb) - skb_network_header(skb)) can be replaced by skb_network_header_len(skb) Add a DEBUG_NET_WARN_ON_ONCE() in skb_network_header_len() to catch cases were the transport_header was not set. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-02Merge tag 'for-netdev' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2024-02-29 We've added 119 non-merge commits during the last 32 day(s) which contain a total of 150 files changed, 3589 insertions(+), 995 deletions(-). The main changes are: 1) Extend the BPF verifier to enable static subprog calls in spin lock critical sections, from Kumar Kartikeya Dwivedi. 2) Fix confusing and incorrect inference of PTR_TO_CTX argument type in BPF global subprogs, from Andrii Nakryiko. 3) Larger batch of riscv BPF JIT improvements and enabling inlining of the bpf_kptr_xchg() for RV64, from Pu Lehui. 4) Allow skeleton users to change the values of the fields in struct_ops maps at runtime, from Kui-Feng Lee. 5) Extend the verifier's capabilities of tracking scalars when they are spilled to stack, especially when the spill or fill is narrowing, from Maxim Mikityanskiy & Eduard Zingerman. 6) Various BPF selftest improvements to fix errors under gcc BPF backend, from Jose E. Marchesi. 7) Avoid module loading failure when the module trying to register a struct_ops has its BTF section stripped, from Geliang Tang. 8) Annotate all kfuncs in .BTF_ids section which eventually allows for automatic kfunc prototype generation from bpftool, from Daniel Xu. 9) Several updates to the instruction-set.rst IETF standardization document, from Dave Thaler. 10) Shrink the size of struct bpf_map resp. bpf_array, from Alexei Starovoitov. 11) Initial small subset of BPF verifier prepwork for sleepable bpf_timer, from Benjamin Tissoires. 12) Fix bpftool to be more portable to musl libc by using POSIX's basename(), from Arnaldo Carvalho de Melo. 13) Add libbpf support to gcc in CORE macro definitions, from Cupertino Miranda. 14) Remove a duplicate type check in perf_event_bpf_event, from Florian Lehner. 15) Fix bpf_spin_{un,}lock BPF helpers to actually annotate them with notrace correctly, from Yonghong Song. 16) Replace the deprecated bpf_lpm_trie_key 0-length array with flexible array to fix build warnings, from Kees Cook. 17) Fix resolve_btfids cross-compilation to non host-native endianness, from Viktor Malik. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (119 commits) selftests/bpf: Test if shadow types work correctly. bpftool: Add an example for struct_ops map and shadow type. bpftool: Generated shadow variables for struct_ops maps. libbpf: Convert st_ops->data to shadow type. libbpf: Set btf_value_type_id of struct bpf_map for struct_ops. bpf: Replace bpf_lpm_trie_key 0-length array with flexible array bpf, arm64: use bpf_prog_pack for memory management arm64: patching: implement text_poke API bpf, arm64: support exceptions arm64: stacktrace: Implement arch_bpf_stack_walk() for the BPF JIT bpf: add is_async_callback_calling_insn() helper bpf: introduce in_sleepable() helper bpf: allow more maps in sleepable bpf programs selftests/bpf: Test case for lacking CFI stub functions. bpf: Check cfi_stubs before registering a struct_ops type. bpf: Clarify batch lookup/lookup_and_delete semantics bpf, docs: specify which BPF_ABS and BPF_IND fields were zero bpf, docs: Fix typos in instruction-set.rst selftests/bpf: update tcp_custom_syncookie to use scalar packet offset bpf: Shrink size of struct bpf_map/bpf_array. ... ==================== Link: https://lore.kernel.org/r/20240301001625.8800-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-02block: define bvec_iter as __packed __aligned(4)Ming Lei
In commit 19416123ab3e ("block: define 'struct bvec_iter' as packed"), what we need is to save the 4byte padding, and avoid `bio` to spread on one extra cache line. It is enough to define it as '__packed __aligned(4)', as '__packed' alone means byte aligned, and can cause compiler to generate horrible code on architectures that don't support unaligned access in case that bvec_iter is embedded in other structures. Cc: Mikulas Patocka <mpatocka@redhat.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Fixes: 19416123ab3e ("block: define 'struct bvec_iter' as packed") Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-03-01net/mlx5: Check capability for fw_resetMoshe Shemesh
Functions which can't access MFRL (Management Firmware Reset Level) register, have no use of fw_reset structures or events. Remove fw_reset structures allocation and registration for fw reset events notifications for these functions. Having the devlink param enable_remote_dev_reset on functions that don't have this capability is misleading as these functions are not allowed to influence the reset flow. Hence, this patch removes this parameter for such functions. In addition, return not supported on devlink reload action fw_activate for these functions. Fixes: 38b9f903f22b ("net/mlx5: Handle sync reset request event") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-03-01Merge tag 'sound-6.8-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "The amount of changes wasn't as small as wished, but all reasonably small fixes. There is a PCM core API change, which is for correcting the behavior change we took in 6.8. The rest are device-specific fixes for ASoC AMD, Qualcomm, Cirrus codecs, HD-audio quirks & co" * tag 'sound-6.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ASoC: amd: yc: Fix non-functional mic on Lenovo 21J2 ASoC: amd: yc: add new YC platform variant (0x63) support ALSA: hda/realtek - ALC285 reduce pop noise from Headphone port ASoC: amd: yc: Add Lenovo ThinkBook 21J0 into DMI quirk table ALSA: hda/realtek: Add special fixup for Lenovo 14IRP8 ASoC: soc-card: Fix missing locking in snd_soc_card_get_kcontrol() ALSA: hda/realtek: tas2781: enable subwoofer volume control ALSA: pcm: clarify and fix default msbits value for all formats ASoC: qcom: Fix uninitialized pointer dmactl ALSA: hda/realtek: fix mute/micmute LED For HP mt440 ALSA: Drop leftover snd-rtctimer stuff from Makefile ALSA: ump: Fix the discard error code from snd_ump_legacy_open() ALSA: hda/realtek: Enable Mute LED on HP 840 G8 (MB 8AB8) ASoC: cs35l56: Must clear HALO_STATE before issuing SYSTEM_RESET ALSA: hda/realtek: Fix top speaker connection on Dell Inspiron 16 Plus 7630 ALSA: firewire-lib: fix to check cycle continuity
2024-03-01Merge tag 'drm-fixes-2024-03-01' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds
Pull drm fixes from Dave Airlie: "Bunch of fixes, xe, amdgpu, nouveau and tegra all have a few. Then drm/bridge including some drivers/soc fallout fixes. The biggest thing in here is a new unit test for some buddy allocator fixes, otherwise a misc fbcon, ttm unit test and one msm revert. Seems pretty normal for this stage. buddy: - two allocation fixes + unit test fbcon: - font restore syzkaller fix ttm: - kunit test fix bridge: - fix aux-hpd leaks - fix aux-hpd registration - fix use after free in soc/qcom - fix boot on soc/qcom xe: - A couple of tracepoint updates from Priyanka and Lucas - Make sure BINDs are completed before accepting UNBINDs on LR vms - Don't arbitrarily restrict max number of batched binds - Add uapi for dumpable bos (agreed on IRC) - Remove unused uapi flags and a leftover comment - A couple of fixes related to the execlist backend msm: - DP: Revert a change which was causing a HDP regression amdgpu: - Fix potential buffer overflow - Fix power min cap - Suspend/resume fix - SI PM fix - eDP fix nouveau: - fix a misreported VRAM sizing - fix a regression in suspend/resume due to freeing tegra: - host1x reset fix - only remove existing driver if display is possible" * tag 'drm-fixes-2024-03-01' of https://gitlab.freedesktop.org/drm/kernel: (32 commits) drm/nouveau: keep DMA buffers required for suspend/resume nouveau: report byte usage in VRAM usage drm/xe/xe_trace: Add move_lacks_source detail to xe_bo_move trace drm/xe: Deny unbinds if uapi ufence pending drm/xe: Expose user fence from xe_sync_entry drm/xe: Use pointers in trace events drm/xe/xe_bo_move: Enhance xe_bo_move trace drm/xe/mmio: fix build warning for BAR resize on 32-bit drm/xe: get rid of MAX_BINDS drm/xe: Use vmalloc for array of bind allocation in bind IOCTL drm/xe: Don't support execlists in xe_gt_tlb_invalidation layer drm/xe: Fix execlist splat drm/xe/uapi: Remove unused flags drm/xe/uapi: Remove DRM_XE_VM_BIND_FLAG_ASYNC comment left over drm/xe: Add uapi for dumpable bos drm/amd/display: Add monitor patch for specific eDP Revert "drm/msm/dp: use drm_bridge_hpd_notify() to report HPD status changes" drm/tests/drm_buddy: add alloc_range_bias test drm/buddy: check range allocation matches alignment drm/buddy: fix range bias ...
2024-03-01net: bql: fix building with BQL disabledArnd Bergmann
It is now possible to disable BQL, but that causes the cpsw driver to break: drivers/net/ethernet/ti/am65-cpsw-nuss.c:297:28: error: no member named 'dql' in 'struct netdev_queue' 297 | dql_avail(&netif_txq->dql), There is already a helper function in net/sch_generic.h that could be used to help here. Move its implementation into the common linux/netdevice.h along with the other bql interfaces and change both users over to the new interface. Fixes: ea7f3cfaa588 ("net: bql: allow the config to be disabled") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-01Simplify net_dbg_ratelimited() dummyGeert Uytterhoeven
There is no need to wrap calls to the no_printk() helper inside an always-false check, as no_printk() already does that internally. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-01ipv6: annotate data-races around idev->cnf.ignore_routes_with_linkdownEric Dumazet
idev->cnf.ignore_routes_with_linkdown can be used without any locks, add appropriate annotations. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>