AgeCommit message (Collapse)Author
2025-03-07net: bcmgenet: BCM7712 is GENETv5 compatibleDoug Berger
The major revision of the GENET core in the BCM7712 SoC was bumped to 7 but it is compatible with the GENETv5 implementation. This commit maps the version accordingly to avoid a warning. Signed-off-by: Doug Berger <opendmb@gmail.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20250306192643.2383632-5-opendmb@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07net: bcmgenet: move feature flags to bcmgenet_privDoug Berger
The feature flags are moved and consolidated to the primary private driver structure and are now initialized from the platform device data rather than the hardware parameters to allow finer control over which platforms use which features. Signed-off-by: Doug Berger <opendmb@gmail.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20250306192643.2383632-4-opendmb@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07net: bcmgenet: add bcmgenet_has_* helpersDoug Berger
Introduce helper functions to indicate whether the driver should make use of a particular feature that it supports. These helpers abstract the implementation of how the feature availability is encoded. Signed-off-by: Doug Berger <opendmb@gmail.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20250306192643.2383632-3-opendmb@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
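The helper pattern described above can be sketched in a few lines of C. This is a minimal illustration and not the driver's code: the `example_*` names and flag bits are hypothetical stand-ins, since the commit message does not list the actual helpers or how the flags are encoded.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits; the driver's real encoding may differ. */
#define EXAMPLE_FLAG_40BITS  (1u << 0)
#define EXAMPLE_FLAG_EXT     (1u << 1)

/* Hypothetical stand-in for the driver's private structure. */
struct example_priv {
	uint32_t flags;
};

/*
 * Callers ask "should this feature be used?" and never test the bits
 * directly, so the encoding can change without touching call sites.
 */
static inline bool example_has_40bits(const struct example_priv *priv)
{
	return priv->flags & EXAMPLE_FLAG_40BITS;
}

static inline bool example_has_ext(const struct example_priv *priv)
{
	return priv->flags & EXAMPLE_FLAG_EXT;
}
```

With helpers like these, moving the flags to another structure or changing their source only requires editing the helper bodies.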
2025-03-07net: bcmgenet: bcmgenet_hw_params clean upDoug Berger
The entries of the bcmgenet_hw_params array are broken out to remove unused and duplicate entries and are made read only since they should not change for a specific version of the GENET hardware. Signed-off-by: Doug Berger <opendmb@gmail.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20250306192643.2383632-2-opendmb@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07net/mlx5: Fill out devlink dev info only for PFsJiri Pirko
Firmware version query is supported only on the PFs. Because of this, the following kernel warning is logged: [ 188.590344] mlx5_core 0000:08:00.2: mlx5_fw_version_query:816:(pid 1453): fw query isn't supported by the FW Fix it by restricting the query and devlink info to the PF. Fixes: 8338d9378895 ("net/mlx5: Added devlink info callback") Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Link: https://patch.msgid.link/20250306212529.429329-1-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07netmem: prevent TX of unreadable skbsMina Almasry
Currently on stable trees we have support for netmem/devmem RX but not TX. It is not safe to forward/redirect an RX unreadable netmem packet into the device's TX path, as the device may call dma-mapping APIs on dma addrs that should not be passed to it.

Fix this by preventing the xmit of unreadable skbs.

Tested by configuring tc redirect:

  sudo tc qdisc add dev eth1 ingress
  sudo tc filter add dev eth1 ingress protocol ip prio 1 flower ip_proto \
    tcp src_ip 192.168.1.12 action mirred egress redirect dev eth1

Before, I see unreadable skbs in the driver's TX path passed to dma mapping APIs.

After, I don't see unreadable skbs in the driver's TX path passed to dma mapping APIs.

Fixes: 65249feb6b3d ("net: add support for skbs with unreadable frags")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Mina Almasry <almasrymina@google.com>
Link: https://patch.msgid.link/20250306215520.1415465-1-almasrymina@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07net: stmmac: remove write-only priv->speedRussell King (Oracle)
priv->speed is only ever written to in two locations, but never read. Therefore, it serves no useful purpose. Remove this unnecessary struct member. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/E1tqLJJ-005aQm-Mv@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07gve: convert to use netmem for DQO RDA modeHarshitha Ramamurthy
To add netmem support to the gve driver, add a union to the struct gve_rx_slot_page_info. netmem_ref is used for the DQO queue format's raw DMA addressing (RDA) mode. The struct page is retained for other use cases. Then, switch to using relevant netmem helper functions for page pool and skb frag management. Reviewed-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20250307003905.601175-1-hramamurthy@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07Merge tag 'for-net-2025-03-07' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - btusb: Configure altsetting for HCI_USER_CHANNEL - hci_event: Fix enabling passive scanning - revert: "hci_core: Fix sleeping function called from invalid context" - SCO: fix sco_conn refcounting on sco_conn_ready * tag 'for-net-2025-03-07' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Revert "Bluetooth: hci_core: Fix sleeping function called from invalid context" Bluetooth: hci_event: Fix enabling passive scanning Bluetooth: SCO: fix sco_conn refcounting on sco_conn_ready Bluetooth: btusb: Configure altsetting for HCI_USER_CHANNEL ==================== Link: https://patch.msgid.link/20250307181854.99433-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07net: ethtool: use correct device pointer in ethnl_default_dump_one()Eric Dumazet
ethnl_default_dump_one() operates on the device provided in its @dev parameter, not from ctx->req_info->dev. syzbot reported: Oops: general protection fault, probably for non-canonical address 0xdffffc0000000197: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000cb8-0x0000000000000cbf] RIP: 0010:netdev_need_ops_lock include/linux/netdevice.h:2792 [inline] RIP: 0010:netdev_lock_ops include/linux/netdevice.h:2803 [inline] RIP: 0010:ethnl_default_dump_one net/ethtool/netlink.c:557 [inline] RIP: 0010:ethnl_default_dumpit+0x447/0xd40 net/ethtool/netlink.c:593 Call Trace: <TASK> genl_dumpit+0x10d/0x1b0 net/netlink/genetlink.c:1027 netlink_dump+0x64d/0xe10 net/netlink/af_netlink.c:2309 __netlink_dump_start+0x5a2/0x790 net/netlink/af_netlink.c:2424 genl_family_rcv_msg_dumpit net/netlink/genetlink.c:1076 [inline] genl_family_rcv_msg net/netlink/genetlink.c:1192 [inline] genl_rcv_msg+0x894/0xec0 net/netlink/genetlink.c:1210 netlink_rcv_skb+0x206/0x480 net/netlink/af_netlink.c:2534 genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219 netlink_unicast_kernel net/netlink/af_netlink.c:1313 [inline] netlink_unicast+0x7f6/0x990 net/netlink/af_netlink.c:1339 netlink_sendmsg+0x8de/0xcb0 net/netlink/af_netlink.c:1883 sock_sendmsg_nosec net/socket.c:709 [inline] __sock_sendmsg+0x221/0x270 net/socket.c:724 ____sys_sendmsg+0x53a/0x860 net/socket.c:2564 ___sys_sendmsg net/socket.c:2618 [inline] __sys_sendmsg+0x269/0x350 net/socket.c:2650 Fixes: 2bcf4772e45a ("net: ethtool: try to protect all callback with netdev instance lock") Reported-by: syzbot+3da2442641f0c6a705a2@syzkaller.appspotmail.com Closes: https://lore.kernel.org/lkml/67caaf5e.050a0220.15b4b9.007a.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250307083544.1659135-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07bpf: fix a possible NULL deref in bpf_map_offload_map_alloc()Eric Dumazet
Call bpf_dev_offload_check() before netdev_lock_ops(). This is needed if attr->map_ifindex is not valid. Oops: general protection fault, probably for non-canonical address 0xdffffc0000000197: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000cb8-0x0000000000000cbf] RIP: 0010:netdev_need_ops_lock include/linux/netdevice.h:2792 [inline] RIP: 0010:netdev_lock_ops include/linux/netdevice.h:2803 [inline] RIP: 0010:bpf_map_offload_map_alloc+0x19a/0x910 kernel/bpf/offload.c:533 Call Trace: <TASK> map_create+0x946/0x11c0 kernel/bpf/syscall.c:1455 __sys_bpf+0x6d3/0x820 kernel/bpf/syscall.c:5777 __do_sys_bpf kernel/bpf/syscall.c:5902 [inline] __se_sys_bpf kernel/bpf/syscall.c:5900 [inline] __x64_sys_bpf+0x7c/0x90 kernel/bpf/syscall.c:5900 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 Fixes: 97246d6d21c2 ("net: hold netdev instance lock during ndo_bpf") Reported-by: syzbot+0c7bfd8cf3aecec92708@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/67caa2b1.050a0220.15b4b9.0077.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250307074303.1497911-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
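The ordering fix above, validating the device before taking its lock, can be sketched as follows. This is a hedged illustration only: every `example_*` name is hypothetical, and the lock is modeled as a plain counter rather than the kernel's netdev instance lock.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a network device. */
struct example_dev {
	int lock_depth;       /* models the per-device ops lock */
	int offload_capable;  /* models "device supports offload" */
};

static void example_lock_ops(struct example_dev *dev)
{
	dev->lock_depth++;    /* dereferences dev, so dev must be valid here */
}

static void example_unlock_ops(struct example_dev *dev)
{
	dev->lock_depth--;
}

/* Returns 0 on success or a negative errno value. */
static int example_offload_map_alloc(struct example_dev *dev)
{
	/*
	 * Check first: a failed ifindex lookup leaves dev NULL, and the
	 * lock helper would dereference it.
	 */
	if (!dev || !dev->offload_capable)
		return -EINVAL;

	example_lock_ops(dev);
	/* ... device-specific allocation would happen here ... */
	example_unlock_ops(dev);
	return 0;
}
```

Running the check before the lock means the invalid-ifindex case returns an error without ever touching the device pointer.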
2025-03-07Merge tag 'for-netdev' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Martin KaFai Lau says: ==================== pull-request: bpf-next 2025-03-06 We've added 6 non-merge commits during the last 13 day(s) which contain a total of 6 files changed, 230 insertions(+), 56 deletions(-). The main changes are: 1) Add XDP metadata support for tun driver, from Marcus Wichelmann. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: selftests/bpf: Fix file descriptor assertion in open_tuntap helper selftests/bpf: Add test for XDP metadata support in tun driver selftests/bpf: Refactor xdp_context_functional test and bpf program selftests/bpf: Move open_tuntap to network helpers net: tun: Enable transfer of XDP metadata to skb net: tun: Enable XDP metadata support ==================== Link: https://patch.msgid.link/20250307055335.441298-1-martin.lau@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07selftests/net: add proc_net_pktgen to .gitignoreWillem de Bruijn
Ensure git doesn't pick up this new target. Fixes: 03544faad761 ("selftest: net: add proc_net_pktgen") Signed-off-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250307031356.368350-1-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07Merge branch 'riscv-sophgo-add-ethernet-support-for-sg2044'Jakub Kicinski
Inochi Amaoto says: ==================== riscv: sophgo: Add ethernet support for SG2044 The ethernet controller of SG2044 is Synopsys DesignWare IP with custom clock. Add glue layer for it. v6: https://lore.kernel.org/20250305063920.803601-1-inochiama@gmail.com v5: https://lore.kernel.org/20250216123953.1252523-1-inochiama@gmail.com v4: https://lore.kernel.org/20250209013054.816580-1-inochiama@gmail.com v3: https://lore.kernel.org/20241223005843.483805-1-inochiama@gmail.com RFC: https://lore.kernel.org/20241101014327.513732-1-inochiama@gmail.com v2: https://lore.kernel.org/20241025011000.244350-1-inochiama@gmail.com v1: https://lore.kernel.org/20241021103617.653386-1-inochiama@gmail.com ==================== Link: https://patch.msgid.link/20250307011623.440792-1-inochiama@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07net: stmmac: Add glue layer for Sophgo SG2044 SoCInochi Amaoto
Adds Sophgo dwmac driver support on the Sophgo SG2044 SoC. Signed-off-by: Inochi Amaoto <inochiama@gmail.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/20250307011623.440792-5-inochiama@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07net: stmmac: platform: Add snps,dwmac-5.30a IP compatible stringInochi Amaoto
Add the "snps,dwmac-5.30a" compatible string for the 5.30a version, which avoids having to define some platform data in the glue layer. Signed-off-by: Inochi Amaoto <inochiama@gmail.com> Reviewed-by: Romain Gantois <romain.gantois@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/20250307011623.440792-4-inochiama@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07net: stmmac: platform: Group GMAC4 compatible checkInochi Amaoto
Use of_device_compatible_match to group existing compatible check of GMAC4 device. Signed-off-by: Inochi Amaoto <inochiama@gmail.com> Reviewed-by: Romain Gantois <romain.gantois@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/20250307011623.440792-3-inochiama@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
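Grouping compatible checks into one table lookup can be illustrated with a user-space sketch. The matcher below is a hypothetical stand-in that only mimics the idea of checking a node's compatible strings against a NULL-terminated list; it is not the kernel's `of_device_compatible_match()`, and the table contents are illustrative rather than the driver's exact list.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical user-space matcher: returns true if any of the node's
 * compatible strings appears in the wanted list. Both lists are
 * NULL-terminated.
 */
static bool example_compatible_match(const char *const *node_compat,
				     const char *const *wanted)
{
	for (; *node_compat; node_compat++)
		for (const char *const *w = wanted; *w; w++)
			if (strcmp(*node_compat, *w) == 0)
				return true;
	return false;
}

/* One table replaces a chain of per-string checks (entries illustrative). */
static const char *const example_gmac4_compats[] = {
	"snps,dwmac-4.00",
	"snps,dwmac-4.10a",
	"snps,dwmac-4.20a",
	"snps,dwmac-5.10a",
	NULL,
};

static bool example_is_gmac4(const char *const *node_compat)
{
	return example_compatible_match(node_compat, example_gmac4_compats);
}
```

Supporting a new IP revision then becomes a one-line addition to the table instead of another branch in the probe code.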
2025-03-07dt-bindings: net: Add support for Sophgo SG2044 dwmacInochi Amaoto
The GMAC IP on SG2044 is almost a standard Synopsys DesignWare MAC (version 5.30a) with some extra clock. Add necessary compatible string for this device. Signed-off-by: Inochi Amaoto <inochiama@gmail.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20250307011623.440792-2-inochiama@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07net: phylink: Remove unused phylink_init_eeeDr. David Alan Gilbert
phylink_init_eee() is currently unused. It was last added in 2019 by commit 86e58135bc4a ("net: phylink: add phylink_init_eee() helper") but that commit didn't actually wire up a user. It had previously been removed in 2017 by commit 939eae25d9a5 ("phylink: remove phylink_init_eee()"). Remove it again. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/20250306184534.246152-1-linux@treblig.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07Merge tag 's390-6.14-6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Vasily Gorbik: - Fix return address recovery of traced function in ftrace to ensure reliable stack unwinding - Fix compiler warnings and runtime crashes of vDSO selftests on s390 by introducing a dedicated GNU hash bucket pointer with correct 32-bit entry size - Fix test_monitor_call() inline asm, which misses CC clobber, by switching to an instruction that doesn't modify CC * tag 's390-6.14-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/ftrace: Fix return address recovery of traced function selftests/vDSO: Fix GNU hash table entry size for s390x s390/traps: Fix test_monitor_call() inline assembly
2025-03-08rust: lockdep: Use Pin for all LockClassKey usagesMitchell Levy
Reintroduce dynamically-allocated LockClassKeys such that they are automatically (de)registered. Require that all usages of LockClassKeys ensure that they are Pin'd. Currently, only `'static` LockClassKeys are supported, so Pin is redundant. However, it is intended that dynamically-allocated LockClassKeys will eventually be supported, so using Pin from the outset will make that change simpler. Closes: https://github.com/Rust-for-Linux/linux/issues/1102 Suggested-by: Benno Lossin <benno.lossin@proton.me> Suggested-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Mitchell Levy <levymitchell0@gmail.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Benno Lossin <benno.lossin@proton.me> Link: https://lore.kernel.org/r/20250307232717.1759087-12-boqun.feng@gmail.com
2025-03-08rust: sync: condvar: Add wait_interruptible_freezable()Alice Ryhl
To support waiting for a `CondVar` as a freezable process, add a wait_interruptible_freezable() function. Binder needs this function in the appropriate places to freeze a process where some of its threads are blocked on the Binder driver. [ Boqun: Cleaned up the changelog and documentation. ] Signed-off-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250307232717.1759087-10-boqun.feng@gmail.com
2025-03-08rust: sync: lock: Add an example for Guard::lock_ref()Boqun Feng
To provide examples on usage of `Guard::lock_ref()` along with the unit test, an "assert a lock is held by a guard" example is added. (Also apply feedback from Benno.) Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Benno Lossin <benno.lossin@proton.me> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20250223072114.3715-1-boqun.feng@gmail.com Link: https://lore.kernel.org/r/20250307232717.1759087-9-boqun.feng@gmail.com
2025-03-08rust: sync: Add accessor for the lock behind a given guardAlice Ryhl
In order to assert a particular `Guard` is associated with a particular `Lock`, add an accessor to obtain a reference to the underlying `Lock` of a `Guard`. Binder needs this assertion to ensure unsafe list operations are done with the correct lock held. [Boqun: Capitalize the title and reword the commit log] Signed-off-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Fiona Behrens <me@kloenk.dev> Link: https://lore.kernel.org/r/20250205-guard-get-lock-v2-1-ba32a8c1d5b7@google.com Link: https://lore.kernel.org/r/20250307232717.1759087-8-boqun.feng@gmail.com
2025-03-08locking/lockdep: Add kasan_check_byte() check in lock_acquire()Waiman Long
KASAN instrumentation of lockdep has been disabled, as we don't need KASAN to check the validity of lockdep internal data structures and incur unnecessary performance overhead. However, the lockdep_map pointer passed in externally may not be valid (e.g. use-after-free) and we run the risk of using garbage data resulting in false lockdep reports. Add kasan_check_byte() call in lock_acquire() for non kernel core data object to catch invalid lockdep_map and print out a KASAN report before any lockdep splat, if any. Suggested-by: Marco Elver <elver@google.com> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Marco Elver <elver@google.com> Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com> Link: https://lore.kernel.org/r/20250214195242.2480920-1-longman@redhat.com Link: https://lore.kernel.org/r/20250307232717.1759087-7-boqun.feng@gmail.com
2025-03-08locking/lockdep: Disable KASAN instrumentation of lockdep.cWaiman Long
Both KASAN and LOCKDEP are commonly enabled in building a debug kernel. Each of them can significantly slow down the speed of a debug kernel. Enabling KASAN instrumentation of the LOCKDEP code will further slow things down.

Since LOCKDEP is a high overhead debugging tool, it will never get enabled in a production kernel. The LOCKDEP code is also pretty mature and is unlikely to get major changes. There is also a possibility of recursion similar to KCSAN.

To evaluate the performance impact of disabling KASAN instrumentation of lockdep.c, the time to do a parallel build of the Linux defconfig kernel was used as the benchmark. Two x86-64 systems (Skylake & Zen 2) and an arm64 system were used as test beds. Two sets of non-RT and RT kernels with similar configurations except mainly CONFIG_PREEMPT_RT were used for evaluation.

For the Skylake system:

  Kernel                                       Run time           Sys time
  ------                                       --------           --------
  Non-debug kernel (baseline)                  0m47.642s          4m19.811s

  [CONFIG_KASAN_INLINE=y]
  Debug kernel                                 2m11.108s (x2.8)   38m20.467s (x8.9)
  Debug kernel (patched)                       1m49.602s (x2.3)   31m28.501s (x7.3)
  Debug kernel (patched + mitigations=off)     1m30.988s (x1.9)   26m41.993s (x6.2)

  RT kernel (baseline)                         0m54.871s          7m15.340s

  [CONFIG_KASAN_INLINE=n]
  RT debug kernel                              6m07.151s (x6.7)   135m47.428s (x18.7)
  RT debug kernel (patched)                    3m42.434s (x4.1)   74m51.636s (x10.3)
  RT debug kernel (patched + mitigations=off)  2m40.383s (x2.9)   57m54.369s (x8.0)

  [CONFIG_KASAN_INLINE=y]
  RT debug kernel                              3m22.155s (x3.7)   77m53.018s (x10.7)
  RT debug kernel (patched)                    2m36.700s (x2.9)   54m31.195s (x7.5)
  RT debug kernel (patched + mitigations=off)  2m06.110s (x2.3)   45m49.493s (x6.3)

For the Zen 2 system:

  Kernel                                       Run time           Sys time
  ------                                       --------           --------
  Non-debug kernel (baseline)                  1m42.806s          39m48.714s

  [CONFIG_KASAN_INLINE=y]
  Debug kernel                                 4m04.524s (x2.4)   125m35.904s (x3.2)
  Debug kernel (patched)                       3m56.241s (x2.3)   127m22.378s (x3.2)
  Debug kernel (patched + mitigations=off)     2m38.157s (x1.5)   92m35.680s (x2.3)

  RT kernel (baseline)                         1m51.500s          14m56.322s

  [CONFIG_KASAN_INLINE=n]
  RT debug kernel                              16m04.962s (x8.7)  244m36.463s (x16.4)
  RT debug kernel (patched)                    9m09.073s (x4.9)   129m28.439s (x8.7)
  RT debug kernel (patched + mitigations=off)  3m31.662s (x1.9)   51m01.391s (x3.4)

For the arm64 system:

  Kernel                                       Run time           Sys time
  ------                                       --------           --------
  Non-debug kernel (baseline)                  1m56.844s          8m47.150s
  Debug kernel                                 3m54.774s (x2.0)   92m30.098s (x10.5)
  Debug kernel (patched)                       3m32.429s (x1.8)   77m40.779s (x8.8)

  RT kernel (baseline)                         4m01.641s          18m16.777s

  [CONFIG_KASAN_INLINE=n]
  RT debug kernel                              19m32.977s (x4.9)  304m23.965s (x16.7)
  RT debug kernel (patched)                    16m28.354s (x4.1)  234m18.149s (x12.8)

Turning the mitigations off doesn't seem to have any noticeable impact on the performance of the arm64 system. So the mitigations=off entries aren't included.

For the x86 CPUs, CPU mitigations have a much bigger impact on performance, especially for the RT debug kernel with CONFIG_KASAN_INLINE=n. The SRSO mitigation in Zen 2 has an especially big impact on the debug kernel. It also accounts for the majority of the slowdown with mitigations on, because the patched RET instruction slows down function returns. A lot of helper functions that are normally compiled out or inlined may become real function calls in the debug kernel.

With !CONFIG_KASAN_INLINE, the KASAN instrumentation inserts a lot of __asan_loadX*() and __kasan_check_read() function calls into the memory access portions of the code. Lockdep's __lock_acquire() function, for instance, has 66 __asan_loadX*() and 6 __kasan_check_read() calls added with KASAN instrumentation. Of course, the actual numbers may vary depending on the compiler used and the exact version of the lockdep code.

With the Skylake test system, the parallel kernel build time reductions of the RT debug kernel with this patch are:

  CONFIG_KASAN_INLINE=n: -37%
  CONFIG_KASAN_INLINE=y: -22%

The time reduction is less with CONFIG_KASAN_INLINE=y, but it is still significant.

Setting CONFIG_KASAN_INLINE=y can result in a significant performance improvement. The major drawback is a significant increase in the size of kernel text. In the case of vmlinux, its text size increases from 45997948 to 67606807. That is a 47% size increase (about 21 Mbytes). The size increase of other kernel modules should be similar.

With the newly added rtmutex and lockdep lock events, the relevant event counts for the test runs with the Skylake system were:

  Event type        Debug kernel    RT debug kernel
  ----------        ------------    ---------------
  lockdep_acquire   1,968,663,277   5,425,313,953
  rtlock_slowlock   -               401,701,156
  rtmutex_slowlock  -               139,672

The number of __lock_acquire() calls in the RT debug kernel is 2.8 times that of the non-RT debug kernel with the same workload. Since the __lock_acquire() function is a big hitter in terms of performance slowdown, this makes the RT debug kernel much slower than the non-RT one. The average lock nesting depth is likely to be higher in the RT debug kernel too, leading to longer execution time in the __lock_acquire() function.

As the small advantage of enabling KASAN instrumentation to catch potential memory access errors in the lockdep debugging tool is probably not worth the drawback of further slowing down a debug kernel, disable KASAN instrumentation in the lockdep code to allow the debug kernels to regain some performance, especially the RT debug kernels.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Marco Elver <elver@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250307232717.1759087-6-boqun.feng@gmail.com
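The size and slowdown figures quoted in this commit message can be rechecked with a little arithmetic. The sketch below uses only numbers that appear in the message (the two vmlinux text sizes, and the Skylake baseline and debug-kernel run times); the function names are illustrative, not part of any kernel tooling.

```c
#include <stdint.h>

/* vmlinux text sizes quoted in the commit message, in bytes. */
static const uint64_t text_size_before = 45997948;
static const uint64_t text_size_after  = 67606807;

/* Absolute growth of the kernel text, in bytes. */
static uint64_t text_growth_bytes(void)
{
	return text_size_after - text_size_before;
}

/* Relative growth of the kernel text, in percent of the original size. */
static double text_growth_percent(void)
{
	return 100.0 * (double)text_growth_bytes() / (double)text_size_before;
}

/* Slowdown factor of a measured run time against a baseline (seconds). */
static double slowdown_factor(double measured_s, double baseline_s)
{
	return measured_s / baseline_s;
}
```

The growth comes to 21608859 bytes, just under 47% of the original text size, which matches the "47% size increase (about 21 Mbytes)" in the message. Likewise, `slowdown_factor(131.108, 47.642)` for the Skylake debug kernel (2m11.108s against 0m47.642s) gives roughly 2.75, consistent with the x2.8 shown in the table.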
2025-03-08locking/lock_events: Add locking events for lockdepWaiman Long
Add some lock events to lockdep to profile its behavior. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250307232717.1759087-5-boqun.feng@gmail.com
2025-03-08locking/lock_events: Add locking events for rtmutex slow pathsWaiman Long
Add locking events for rtlock_slowlock() and rt_mutex_slowlock() for profiling the slow path behavior of rt_spin_lock() and rt_mutex_lock(). Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250307232717.1759087-4-boqun.feng@gmail.com
2025-03-08Merge branch 'locking/urgent' into locking/core, to pick up locking fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-03-08locking/semaphore: Use wake_q to wake up processes outside lock critical sectionWaiman Long
A circular lock dependency splat has been seen involving down_trylock():

  ======================================================
  WARNING: possible circular locking dependency detected
  6.12.0-41.el10.s390x+debug
  ------------------------------------------------------
  dd/32479 is trying to acquire lock:
  0015a20accd0d4f8 ((console_sem).lock){-.-.}-{2:2}, at: down_trylock+0x26/0x90

  but task is already holding lock:
  000000017e461698 (&zone->lock){-.-.}-{2:2}, at: rmqueue_bulk+0xac/0x8f0

  the existing dependency chain (in reverse order) is:
  -> #4 (&zone->lock){-.-.}-{2:2}:
  -> #3 (hrtimer_bases.lock){-.-.}-{2:2}:
  -> #2 (&rq->__lock){-.-.}-{2:2}:
  -> #1 (&p->pi_lock){-.-.}-{2:2}:
  -> #0 ((console_sem).lock){-.-.}-{2:2}:

The console_sem -> pi_lock dependency is due to calling try_to_wake_up() while holding the console_sem raw_spinlock. This dependency can be broken by using wake_q to do the wakeup instead of calling try_to_wake_up() under the console_sem lock. This will also make the semaphore's raw_spinlock become a terminal lock without taking any further locks underneath it.

The hrtimer_bases.lock is a raw_spinlock while zone->lock is a spinlock. The hrtimer_bases.lock -> zone->lock dependency happens via the debug_objects_fill_pool() helper function in the debugobjects code.

  -> #4 (&zone->lock){-.-.}-{2:2}:
         __lock_acquire+0xe86/0x1cc0
         lock_acquire.part.0+0x258/0x630
         lock_acquire+0xb8/0xe0
         _raw_spin_lock_irqsave+0xb4/0x120
         rmqueue_bulk+0xac/0x8f0
         __rmqueue_pcplist+0x580/0x830
         rmqueue_pcplist+0xfc/0x470
         rmqueue.isra.0+0xdec/0x11b0
         get_page_from_freelist+0x2ee/0xeb0
         __alloc_pages_noprof+0x2c2/0x520
         alloc_pages_mpol_noprof+0x1fc/0x4d0
         alloc_pages_noprof+0x8c/0xe0
         allocate_slab+0x320/0x460
         ___slab_alloc+0xa58/0x12b0
         __slab_alloc.isra.0+0x42/0x60
         kmem_cache_alloc_noprof+0x304/0x350
         fill_pool+0xf6/0x450
         debug_object_activate+0xfe/0x360
         enqueue_hrtimer+0x34/0x190
         __run_hrtimer+0x3c8/0x4c0
         __hrtimer_run_queues+0x1b2/0x260
         hrtimer_interrupt+0x316/0x760
         do_IRQ+0x9a/0xe0
         do_irq_async+0xf6/0x160

Normally a raw_spinlock to spinlock dependency is not legitimate and triggers a warning if CONFIG_PROVE_RAW_LOCK_NESTING is enabled, but debug_objects_fill_pool() is an exception as it explicitly allows this dependency for non-PREEMPT_RT kernels without causing a PROVE_RAW_LOCK_NESTING lockdep splat. As a result, this dependency is legitimate and not a bug.

Anyway, semaphore is the only locking primitive left that still uses try_to_wake_up() to do the wakeup inside the critical section; all the other locking primitives have been migrated to use wake_q to do the wakeup outside of the critical section. It is also possible that there are other circular locking dependencies involving printk/console_sem or other existing/new semaphores lurking somewhere which may show up in the future. Let's just do the migration to wake_q now to avoid headaches like this.

Reported-by: syzbot+ed801a886dfdbfe7136d@syzkaller.appspotmail.com
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250307232717.1759087-3-boqun.feng@gmail.com
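The deferred-wakeup idea behind wake_q can be sketched in user-space C. This is a simplified single-threaded model and not the kernel's semaphore or wake_q implementation: all `example_*` names are hypothetical, the lock is a boolean, and a "wakeup" just records whether the lock was held at the time it ran.

```c
#include <stdbool.h>
#include <stddef.h>

#define EXAMPLE_MAX_WAITERS 8

struct example_waiter {
	bool woken;
	bool woken_under_lock;  /* true if the wakeup ran inside the lock */
};

/* Local queue of wakeups to perform after the lock is dropped. */
struct example_wake_q {
	struct example_waiter *pending[EXAMPLE_MAX_WAITERS];
	size_t nr;
};

struct example_sem {
	bool locked;            /* models the semaphore's internal spinlock */
	unsigned int count;
	struct example_waiter *waiters[EXAMPLE_MAX_WAITERS];
	size_t nr_waiters;
};

static void example_wake(const struct example_sem *sem,
			 struct example_waiter *w)
{
	w->woken = true;
	w->woken_under_lock = sem->locked;
}

static void example_up(struct example_sem *sem)
{
	struct example_wake_q wake_q = { .nr = 0 };

	sem->locked = true;                     /* enter critical section */
	if (sem->nr_waiters == 0) {
		sem->count++;
	} else {
		/* Hand over to the first waiter, but only queue its wakeup. */
		wake_q.pending[wake_q.nr++] = sem->waiters[0];
		for (size_t i = 1; i < sem->nr_waiters; i++)
			sem->waiters[i - 1] = sem->waiters[i];
		sem->nr_waiters--;
	}
	sem->locked = false;                    /* leave critical section */

	/* Wakeups run here, so they cannot nest other locks under ours. */
	for (size_t i = 0; i < wake_q.nr; i++)
		example_wake(sem, wake_q.pending[i]);
}
```

Because the wakeup runs only after the lock is released, any locks the wakeup path takes are never acquired underneath the semaphore's lock, which is what makes it a terminal lock in the sense the commit message describes.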
2025-03-08locking/rtmutex: Use the 'struct' keyword in kernel-doc commentRandy Dunlap
Add the "struct" keyword to prevent a kernel-doc warning: rtmutex_common.h:67: warning: cannot understand function prototype: 'struct rt_wake_q_head ' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Waiman Long <longman@redhat.com> Link: https://lore.kernel.org/r/20250307232717.1759087-2-boqun.feng@gmail.com
2025-03-08rust: lockdep: Remove support for dynamically allocated LockClassKeysMitchell Levy
Currently, dynamically allocated LockClassKeys can be used from the Rust side without having them registered. This is a soundness issue, so remove them. Fixes: 6ea5aa08857a ("rust: sync: introduce `LockClassKey`") Suggested-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Mitchell Levy <levymitchell0@gmail.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Benno Lossin <benno.lossin@proton.me> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250307232717.1759087-11-boqun.feng@gmail.com
2025-03-08x86/mm: Define PTRS_PER_PMD for assembly code tooIngo Molnar
Andy reported the following build warning from head_32.S: In file included from arch/x86/kernel/head_32.S:29: arch/x86/include/asm/pgtable_32.h:59:5: error: "PTRS_PER_PMD" is not defined, evaluates to 0 [-Werror=undef] 59 | #if PTRS_PER_PMD > 1 The reason is that on 2-level i386 paging the folded in PMD's PTRS_PER_PMD constant is not defined in assembly headers, only in generic MM C headers. Instead of trying to fish out the definition from the generic headers, just define it - it even has a comment for it already... Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/Z8oa8AUVyi2HWfo9@gmail.com
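The underlying preprocessor behavior is that an identifier that is not defined as a macro evaluates to 0 inside `#if`, which -Wundef reports. A minimal sketch of the "just define it" approach follows; the `EXAMPLE_*` names are hypothetical and the values only illustrate a folded PMD with a single entry, not the kernel's actual headers.

```c
/*
 * With the PMD level folded away, the folded level has exactly one
 * entry. Defining the constant explicitly keeps "#if" comparisons
 * well-defined instead of relying on "undefined evaluates to 0".
 */
#ifndef EXAMPLE_PTRS_PER_PMD
#define EXAMPLE_PTRS_PER_PMD 1
#endif

#if EXAMPLE_PTRS_PER_PMD > 1
#define EXAMPLE_PAGING_LEVELS 3
#else
#define EXAMPLE_PAGING_LEVELS 2
#endif

static inline int example_paging_levels(void)
{
	return EXAMPLE_PAGING_LEVELS;
}
```

The comparison picks the same branch either way here, but with the explicit definition the code no longer depends on the implicit-zero rule and builds cleanly with -Werror=undef.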
2025-03-07x86/boot: Drop CRC-32 checksum and the build tool that generates itArd Biesheuvel
Apart from some sanity checks on the size of setup.bin, the only remaining task carried out by the arch/x86/boot/tools/build.c build tool is generating the CRC-32 checksum of the bzImage. This feature was added in commit 7d6e737c8d2698b6 ("x86: add a crc32 checksum to the kernel image.") without any motivation (or any commit log text, for that matter). This checksum is not verified by any known bootloader, and given that a) the checksum of the entire bzImage is reported by most tools (zlib, rhash) as 0xffffffff and not 0x0 as documented, b) the checksum is corrupted when the image is signed for secure boot, which means that no distro ships x86 images with valid CRCs, it seems quite unlikely that this checksum is being used, so let's just drop it, along with the tool that generates it. Instead, use simple file concatenation and truncation to combine the two pieces into bzImage, and replace the checks on the size of the setup block with a couple of ASSERT()s in the linker script. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ian Campbell <ijc@hellion.org.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250307164801.885261-2-ardb+git@google.com
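One way to see why tools would report 0xffffffff for the whole image is sketched below. It assumes the build tool appended the raw (un-inverted) CRC-32 register in little-endian order after the image data; the commit message does not spell this out, so treat that as an assumption. The function names are illustrative.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise update of the reflected CRC-32 register (polynomial 0xEDB88320). */
static uint32_t crc32_update(uint32_t reg, const uint8_t *buf, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		reg ^= buf[i];
		for (int bit = 0; bit < 8; bit++)
			reg = (reg >> 1) ^ (0xEDB88320u & (0u - (reg & 1u)));
	}
	return reg;
}

/* CRC-32 as common tools report it: all-ones start, inverted at the end. */
static uint32_t crc32_standard(const uint8_t *buf, size_t len)
{
	return ~crc32_update(0xFFFFFFFFu, buf, len);
}

/*
 * Append the raw register (no final inversion), little-endian.
 * The caller must provide room for len + 4 bytes. Returns the new length.
 */
static size_t append_raw_crc(uint8_t *buf, size_t len)
{
	uint32_t reg = crc32_update(0xFFFFFFFFu, buf, len);

	for (int i = 0; i < 4; i++)
		buf[len + i] = (uint8_t)(reg >> (8 * i));
	return len + 4;
}
```

Feeding the register's own bytes back in clears it to zero, and the final inversion of a standard CRC-32 then turns that zero into 0xffffffff. Under the stated assumption, that is the value a standard tool reports for the full image, rather than the documented 0x0.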
2025-03-07Merge tag 'slab-for-6.14-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab Pull slab fix from Vlastimil Babka: - Stable fix for kmem_cache_destroy() called from a WQ_MEM_RECLAIM workqueue causing a warning due to the new kvfree_rcu_barrier() (Uladzislau Rezki) * tag 'slab-for-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: mm/slab/kvfree_rcu: Switch to WQ_MEM_RECLAIM wq
2025-03-07Merge tag 'acpi-6.14-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fix from Rafael Wysocki: "Restore the previous behavior of the ACPI platform_profile sysfs interface that has been changed recently in a way incompatible with the existing user space (Mario Limonciello)" * tag 'acpi-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: platform/x86/amd: pmf: Add balanced-performance to hidden choices platform/x86/amd: pmf: Add 'quiet' to hidden choices ACPI: platform_profile: Add support for hidden choices
2025-03-07Merge tag 'execve-v6.14-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull core dumping fix from Kees Cook: - Only sort VMAs when core_sort_vma sysctl is set * tag 'execve-v6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: coredump: Only sort VMAs when core_sort_vma sysctl is set
2025-03-07Merge tag 'for-6.14-rc5-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:

 - fix leaked extent map after error when reading chunks
 - replace use of deprecated strncpy
 - in zoned mode, fix the range end when unlocking an extent range, which caused a hang

* tag 'for-6.14-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: fix a leaked chunk map issue in read_one_chunk()
  btrfs: replace deprecated strncpy() with strscpy()
  btrfs: zoned: fix extent range end unlock in cow_file_range()
2025-03-07Merge tag 'block-6.14-20250306' of git://git.kernel.dk/linuxLinus Torvalds
Pull block fixes from Jens Axboe:

 - NVMe pull request via Keith:
     - TCP use after free fix on polling (Sagi)
     - Controller memory buffer cleanup fixes (Icenowy)
     - Free leaking requests on bad user passthrough commands (Keith)
     - TCP error message fix (Maurizio)
     - TCP corruption fix on partial PDU (Maurizio)
     - TCP memory ordering fix for weakly ordered archs (Meir)
     - Type coercion fix on message error for TCP (Dan)

 - Name the RQF flags enum, fixing issues with anon enums and BPF import of it

 - ublk parameter setting fix

 - GPT partition 7-bit conversion fix

* tag 'block-6.14-20250306' of git://git.kernel.dk/linux:
  block: Name the RQF flags enum
  nvme-tcp: fix signedness bug in nvme_tcp_init_connection()
  block: fix conversion of GPT partition name to 7-bit
  ublk: set_params: properly check if parameters can be applied
  nvmet-tcp: Fix a possible sporadic response drops in weakly ordered arch
  nvme-tcp: fix potential memory corruption in nvme_tcp_recv_pdu()
  nvme-tcp: Fix a C2HTermReq error message
  nvmet: remove old function prototype
  nvme-ioctl: fix leaked requests on mapping error
  nvme-pci: skip CMB blocks incompatible with PCI P2P DMA
  nvme-pci: clean up CMBMSC when registering CMB fails
  nvme-tcp: fix possible UAF in nvme_tcp_poll
2025-03-07Merge tag 'io_uring-6.14-20250306' of git://git.kernel.dk/linuxLinus Torvalds
Pull io_uring fix from Jens Axboe: "A single fix for a regression introduced in the 6.14 merge window, causing stalls/hangs with IOPOLL reads or writes" * tag 'io_uring-6.14-20250306' of git://git.kernel.dk/linux: io_uring/rw: ensure reissue path is correctly handled for IOPOLL
2025-03-07io_uring: Remove unused declaration io_alloc_async_data()Yue Haibing
Commit ef623a647f42 ("io_uring: Move old async data allocation helper to header") left behind this unused declaration. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Link: https://lore.kernel.org/r/20250305013454.3635021-1-yuehaibing@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-07Merge tag 'sched-urgent-2025-03-07' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc scheduler fixes from Ingo Molnar: - Fix deadline scheduler sysctl parameter setting bug - Fix RT scheduler sysctl parameter setting bug - Fix possible memory corruption in child_cfs_rq_on_list() * tag 'sched-urgent-2025-03-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/rt: Update limit of sched_rt sysctl in documentation sched/deadline: Use online cpus for validating runtime sched/fair: Fix potential memory corruption in child_cfs_rq_on_list
2025-03-07cpupower: Make lib versioning scheme more obvious and fix version linkThomas Renninger
Library versioning was broken:

  libcpupower.so.0.0.1
  libcpupower.so -> libcpupower.so.0.0.1
  libcpupower.so.1 -> libcpupower.so.0.0.1

and is fixed by this patch to:

  libcpupower.so.1.0.1
  libcpupower.so -> libcpupower.so.1.0.1
  libcpupower.so.1 -> libcpupower.so.1.0.1

Link: https://lore.kernel.org/r/20250307094334.39587-1-trenn@suse.de
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Shuah Khan <shuah@kernel.org>
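The mismatch described above is that the `.so.1` link pointed at a file whose own version started with 0. A minimal sketch of the naming convention follows; the helper names `shared_lib_names` and `soname_matches` are hypothetical and are not part of the cpupower build.

```python
def shared_lib_names(base: str, major: int, minor: int, patch: int) -> dict:
    """Return the real file name and the two conventional symlinks."""
    real = f"{base}.so.{major}.{minor}.{patch}"
    return {
        "real": real,
        "links": {
            f"{base}.so": real,          # development link used at link time
            f"{base}.so.{major}": real,  # soname link used at run time
        },
    }


def soname_matches(real: str, soname_link: str) -> bool:
    """The soname link's major number must prefix the real file's version."""
    return real.startswith(soname_link + ".")


names = shared_lib_names("libcpupower", 1, 0, 1)
print(names["real"])                                            # libcpupower.so.1.0.1
print(soname_matches("libcpupower.so.0.0.1", "libcpupower.so.1"))  # False (old layout)
print(soname_matches("libcpupower.so.1.0.1", "libcpupower.so.1"))  # True (fixed layout)
```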
2025-03-07Merge tag 'perf-urgent-2025-03-07' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf event fixes from Ingo Molnar: "Fix a race between PMU registration and event creation, and fix pmus_lock vs. pmus_srcu lock ordering" * tag 'perf-urgent-2025-03-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/core: Fix perf_pmu_register() vs. perf_init_event() perf/core: Fix pmus_lock vs. pmus_srcu ordering
2025-03-07elf: add remaining SHF_ flag macrosTimur Tabi
Add the remaining SHF_ flags, as listed in the "Executable and Linkable Format" Wikipedia page and the System V Application Binary Interface[1]. This allows drivers to load and parse ELF images that use some of those flags. In particular, an upcoming change to the Nouveau GPU driver will use some of the flags. Link: https://refspecs.linuxfoundation.org/elf/gabi4+/ch4.sheader.html#sh_flags [1] Signed-off-by: Timur Tabi <ttabi@nvidia.com> Link: https://lore.kernel.org/r/20250307171417.267488-1-ttabi@nvidia.com Signed-off-by: Kees Cook <kees@kernel.org>
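For orientation, here is a sketch of how a section header's sh_flags field decodes against the SHF_ values listed in the System V gABI. The values are taken from that specification; which of them this patch adds to the kernel header is not spelled out above, and the helper `decode_sh_flags` is hypothetical. Python is used only for illustration, since the kernel header itself is C.

```python
# Generic SHF_ bit values from the System V gABI section header table.
SHF_FLAGS = {
    0x1: "SHF_WRITE",
    0x2: "SHF_ALLOC",
    0x4: "SHF_EXECINSTR",
    0x10: "SHF_MERGE",
    0x20: "SHF_STRINGS",
    0x40: "SHF_INFO_LINK",
    0x80: "SHF_LINK_ORDER",
    0x100: "SHF_OS_NONCONFORMING",
    0x200: "SHF_GROUP",
    0x400: "SHF_TLS",
}
SHF_MASKOS = 0x0FF00000    # bits reserved for OS-specific semantics
SHF_MASKPROC = 0xF0000000  # bits reserved for processor-specific semantics


def decode_sh_flags(sh_flags: int) -> list:
    """Return the names of the generic flags set in sh_flags."""
    names = [name for bit, name in SHF_FLAGS.items() if sh_flags & bit]
    if sh_flags & SHF_MASKOS:
        names.append("OS-specific bits")
    if sh_flags & SHF_MASKPROC:
        names.append("processor-specific bits")
    return names


# A mergeable, allocated string section sets ALLOC | MERGE | STRINGS.
print(decode_sh_flags(0x32))  # ['SHF_ALLOC', 'SHF_MERGE', 'SHF_STRINGS']
```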
2025-03-07selinux: support wildcard network interface namesChristian Göttsche
Add support for wildcard matching of network interface names. This is useful for auto-generated interfaces, for example podman creates network interfaces for containers with the naming scheme podman0, podman1, podman2, ... To maintain backward compatibility, guard this feature with a new policy capability 'netif_wildcard'. Netifcon definitions are matched in the order given by the policy, so userspace tools should sort them in a reasonable order (more specific names before broader wildcards). Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
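Why the order of netifcon definitions matters can be seen in a small first-match sketch. It uses Python's `fnmatchcase` as a stand-in for the kernel's matching, which is an assumption; the type labels and the helper `first_match` are invented for illustration.

```python
from fnmatch import fnmatchcase


def first_match(ifname: str, netifcons: list):
    """Return the label of the first definition whose pattern matches."""
    for pattern, label in netifcons:
        if fnmatchcase(ifname, pattern):
            return label
    return None


# Hypothetical labels; the specific entry is listed before the wildcard.
specific_first = [("podman0", "special_netif_t"), ("podman*", "container_netif_t")]
# Same entries, but the wildcard shadows the specific one.
wildcard_first = [("podman*", "container_netif_t"), ("podman0", "special_netif_t")]

print(first_match("podman0", specific_first))  # special_netif_t
print(first_match("podman0", wildcard_first))  # container_netif_t
print(first_match("podman7", specific_first))  # container_netif_t
```

With first-match semantics, a broad pattern placed early hides every later entry it covers, which is why the sorting is left to the userspace tools that generate the policy.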
2025-03-07Merge tag 'x86-urgent-2025-03-07' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc x86 fixes from Ingo Molnar: - Fix CPUID leaf 0x2 parsing bugs - Sanitize very early boot parameters to avoid crash - Fix size overflows in the SGX code - Make CALL_NOSPEC use consistent * tag 'x86-urgent-2025-03-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot: Sanitize boot params before parsing command line x86/sgx: Fix size overflows in sgx_encl_create() x86/cpu: Properly parse CPUID leaf 0x2 TLB descriptor 0x63 x86/cpu: Validate CPUID leaf 0x2 EDX output x86/cacheinfo: Validate CPUID leaf 0x2 EDX output x86/speculation: Add a conditional CS prefix to CALL_NOSPEC x86/speculation: Simplify and make CALL_NOSPEC consistent
2025-03-07selinux: Chain up tool resolving errors in install_policy.shTim Schumacher
Subshell evaluations are not exempt from errexit, so if a command is not available, `which` will fail and exit the script as a whole. This causes the helpful error messages to not be printed if they are tacked on using a `$?` comparison. Resolve the issue by using chains of logical operators, which are not subject to the effects of errexit. Fixes: e37c1877ba5b1 ("scripts/selinux: modernize mdp") Signed-off-by: Tim Schumacher <tim.schumacher1@huawei.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
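The difference between the two styles can be sketched by running two tiny shell snippets from Python. The snippets are illustrative and are not taken from install_policy.sh; `command -v` stands in for `which`, and the tool name is deliberately nonexistent. The sketch assumes a POSIX `sh` is available on the PATH.

```python
import subprocess

# Style 1: check $? after the assignment. Under errexit the failed
# command substitution ends the script before the check is reached.
NAIVE = """
set -e
tool=$(command -v no_such_tool_for_demo)
if [ $? -ne 0 ]; then
    echo "please install no_such_tool_for_demo"
    exit 1
fi
"""

# Style 2: chain the error handling with ||. Commands on the left of
# || are exempt from errexit, so the message is printed.
CHAINED = """
set -e
tool=$(command -v no_such_tool_for_demo) || {
    echo "please install no_such_tool_for_demo"
    exit 1
}
"""


def run(script: str) -> subprocess.CompletedProcess:
    """Run a snippet with sh and capture its output."""
    return subprocess.run(["sh", "-c", script], capture_output=True, text=True)


naive = run(NAIVE)
chained = run(CHAINED)
print(repr(naive.stdout), naive.returncode != 0)  # no message, nonzero exit
print(repr(chained.stdout), chained.returncode)   # message printed, exit 1
```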
2025-03-07arm64: lib: Use MOPS for usercopy routinesKristina Martšenko
Similarly to what was done with the memcpy() routines, make copy_to_user(), copy_from_user() and clear_user() also use the Armv8.8 FEAT_MOPS instructions. Both MOPS implementation options (A and B) are supported, including asymmetric systems. The exception fixup code fixes up the registers according to the option used. In case of a fault the routines return precisely how much was not copied (as required by the comment in include/linux/uaccess.h), as unprivileged versions of CPY/SET are guaranteed not to have written past the addresses reported in the GPRs. The MOPS instructions could possibly be inlined into callers (and patched to branch to the generic implementation if not detected; similarly to what x86 does), but as a first step this patch just uses them in the out-of-line routines. Signed-off-by: Kristina Martšenko <kristina.martsenko@arm.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20250228170006.390100-4-kristina.martsenko@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-03-07arm64: mm: Handle PAN faults on uaccess CPY* instructionsKristina Martšenko
A subsequent patch will use CPY* instructions to copy between user and kernel memory. Add handling for PAN faults caused by an intended kernel memory access erroneously accessing user memory, in order to make it easier to debug kernel bugs and to keep the same behavior as with regular loads/stores. Signed-off-by: Kristina Martšenko <kristina.martsenko@arm.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20250228170006.390100-3-kristina.martsenko@arm.com [catalin.marinas@arm.com: Folded the extable search into insn_may_access_user()] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>