Age | Commit message (Collapse) | Author |
|
The major revision of the GENET core in the BCM7712 SoC was bumped
to 7 but it is compatible with the GENETv5 implementation. This
commit maps the version accordingly to avoid a warning.
Signed-off-by: Doug Berger <opendmb@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250306192643.2383632-5-opendmb@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The feature flags are moved and consolidated to the primary
private driver structure and are now initialized from the
platform device data rather than the hardware parameters to
allow finer control over which platforms use which features.
Signed-off-by: Doug Berger <opendmb@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250306192643.2383632-4-opendmb@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Introduce helper functions to indicate whether the driver should
make use of a particular feature that it supports. These helpers
abstract the implementation of how the feature availability is
encoded.
Signed-off-by: Doug Berger <opendmb@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250306192643.2383632-3-opendmb@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The entries of the bcmgenet_hw_params array are broken out to
remove unused and duplicate entries and are made read only since
they should not change for a specific version of the GENET
hardware.
Signed-off-by: Doug Berger <opendmb@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250306192643.2383632-2-opendmb@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Firmware version query is supported on the PFs. Due to this
following kernel warning log is observed:
[ 188.590344] mlx5_core 0000:08:00.2: mlx5_fw_version_query:816:(pid 1453): fw query isn't supported by the FW
Fix it by restricting the query and devlink info to the PF.
Fixes: 8338d9378895 ("net/mlx5: Added devlink info callback")
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Link: https://patch.msgid.link/20250306212529.429329-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently on stable trees we have support for netmem/devmem RX but not
TX. It is not safe to forward/redirect an RX unreadable netmem packet
into the device's TX path, as the device may call dma-mapping APIs on
dma addrs that should not be passed to it.
Fix this by preventing the xmit of unreadable skbs.
Tested by configuring tc redirect:
sudo tc qdisc add dev eth1 ingress
sudo tc filter add dev eth1 ingress protocol ip prio 1 flower ip_proto \
tcp src_ip 192.168.1.12 action mirred egress redirect dev eth1
Before, I see unreadable skbs in the driver's TX path passed to dma
mapping APIs.
After, I don't see unreadable skbs in the driver's TX path passed to dma
mapping APIs.
Fixes: 65249feb6b3d ("net: add support for skbs with unreadable frags")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Mina Almasry <almasrymina@google.com>
Link: https://patch.msgid.link/20250306215520.1415465-1-almasrymina@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
priv->speed is only ever written to in two locations, but never
read. Therefore, it serves no useful purpose. Remove this unnecessary
struct member.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/E1tqLJJ-005aQm-Mv@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
To add netmem support to the gve driver, add a union
to the struct gve_rx_slot_page_info. netmem_ref is used for
DQO queue format's raw DMA addressing(RDA) mode. The struct
page is retained for other usecases.
Then, switch to using relevant netmem helper functions for
page pool and skb frag management.
Reviewed-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com>
Link: https://patch.msgid.link/20250307003905.601175-1-hramamurthy@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:
====================
bluetooth pull request for net:
- btusb: Configure altsetting for HCI_USER_CHANNEL
- hci_event: Fix enabling passive scanning
- revert: "hci_core: Fix sleeping function called from invalid context"
- SCO: fix sco_conn refcounting on sco_conn_ready
* tag 'for-net-2025-03-07' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
Revert "Bluetooth: hci_core: Fix sleeping function called from invalid context"
Bluetooth: hci_event: Fix enabling passive scanning
Bluetooth: SCO: fix sco_conn refcounting on sco_conn_ready
Bluetooth: btusb: Configure altsetting for HCI_USER_CHANNEL
====================
Link: https://patch.msgid.link/20250307181854.99433-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ethnl_default_dump_one() operates on the device provided in its @dev
parameter, not from ctx->req_info->dev.
syzbot reported:
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000197: 0000 [#1] PREEMPT SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000cb8-0x0000000000000cbf]
RIP: 0010:netdev_need_ops_lock include/linux/netdevice.h:2792 [inline]
RIP: 0010:netdev_lock_ops include/linux/netdevice.h:2803 [inline]
RIP: 0010:ethnl_default_dump_one net/ethtool/netlink.c:557 [inline]
RIP: 0010:ethnl_default_dumpit+0x447/0xd40 net/ethtool/netlink.c:593
Call Trace:
<TASK>
genl_dumpit+0x10d/0x1b0 net/netlink/genetlink.c:1027
netlink_dump+0x64d/0xe10 net/netlink/af_netlink.c:2309
__netlink_dump_start+0x5a2/0x790 net/netlink/af_netlink.c:2424
genl_family_rcv_msg_dumpit net/netlink/genetlink.c:1076 [inline]
genl_family_rcv_msg net/netlink/genetlink.c:1192 [inline]
genl_rcv_msg+0x894/0xec0 net/netlink/genetlink.c:1210
netlink_rcv_skb+0x206/0x480 net/netlink/af_netlink.c:2534
genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219
netlink_unicast_kernel net/netlink/af_netlink.c:1313 [inline]
netlink_unicast+0x7f6/0x990 net/netlink/af_netlink.c:1339
netlink_sendmsg+0x8de/0xcb0 net/netlink/af_netlink.c:1883
sock_sendmsg_nosec net/socket.c:709 [inline]
__sock_sendmsg+0x221/0x270 net/socket.c:724
____sys_sendmsg+0x53a/0x860 net/socket.c:2564
___sys_sendmsg net/socket.c:2618 [inline]
__sys_sendmsg+0x269/0x350 net/socket.c:2650
Fixes: 2bcf4772e45a ("net: ethtool: try to protect all callback with netdev instance lock")
Reported-by: syzbot+3da2442641f0c6a705a2@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/lkml/67caaf5e.050a0220.15b4b9.007a.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250307083544.1659135-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Call bpf_dev_offload_check() before netdev_lock_ops().
This is needed if attr->map_ifindex is not valid.
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000197: 0000 [#1] PREEMPT SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000cb8-0x0000000000000cbf]
RIP: 0010:netdev_need_ops_lock include/linux/netdevice.h:2792 [inline]
RIP: 0010:netdev_lock_ops include/linux/netdevice.h:2803 [inline]
RIP: 0010:bpf_map_offload_map_alloc+0x19a/0x910 kernel/bpf/offload.c:533
Call Trace:
<TASK>
map_create+0x946/0x11c0 kernel/bpf/syscall.c:1455
__sys_bpf+0x6d3/0x820 kernel/bpf/syscall.c:5777
__do_sys_bpf kernel/bpf/syscall.c:5902 [inline]
__se_sys_bpf kernel/bpf/syscall.c:5900 [inline]
__x64_sys_bpf+0x7c/0x90 kernel/bpf/syscall.c:5900
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
Fixes: 97246d6d21c2 ("net: hold netdev instance lock during ndo_bpf")
Reported-by: syzbot+0c7bfd8cf3aecec92708@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/67caa2b1.050a0220.15b4b9.0077.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250307074303.1497911-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Martin KaFai Lau says:
====================
pull-request: bpf-next 2025-03-06
We've added 6 non-merge commits during the last 13 day(s) which contain
a total of 6 files changed, 230 insertions(+), 56 deletions(-).
The main changes are:
1) Add XDP metadata support for tun driver, from Marcus Wichelmann.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
selftests/bpf: Fix file descriptor assertion in open_tuntap helper
selftests/bpf: Add test for XDP metadata support in tun driver
selftests/bpf: Refactor xdp_context_functional test and bpf program
selftests/bpf: Move open_tuntap to network helpers
net: tun: Enable transfer of XDP metadata to skb
net: tun: Enable XDP metadata support
====================
Link: https://patch.msgid.link/20250307055335.441298-1-martin.lau@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Ensure git doesn't pick up this new target.
Fixes: 03544faad761 ("selftest: net: add proc_net_pktgen")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250307031356.368350-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Inochi Amaoto says:
====================
riscv: sophgo: Add ethernet support for SG2044
The ethernet controller of SG2044 is Synopsys DesignWare IP with
custom clock. Add glue layer for it.
v6: https://lore.kernel.org/20250305063920.803601-1-inochiama@gmail.com
v5: https://lore.kernel.org/20250216123953.1252523-1-inochiama@gmail.com
v4: https://lore.kernel.org/20250209013054.816580-1-inochiama@gmail.com
v3: https://lore.kernel.org/20241223005843.483805-1-inochiama@gmail.com
RFC: https://lore.kernel.org/20241101014327.513732-1-inochiama@gmail.com
v2: https://lore.kernel.org/20241025011000.244350-1-inochiama@gmail.com
v1: https://lore.kernel.org/20241021103617.653386-1-inochiama@gmail.com
====================
Link: https://patch.msgid.link/20250307011623.440792-1-inochiama@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Adds Sophgo dwmac driver support on the Sophgo SG2044 SoC.
Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/20250307011623.440792-5-inochiama@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add "snps,dwmac-5.30a" compatible string for 5.30a version that can avoid
to define some platform data in the glue layer.
Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
Reviewed-by: Romain Gantois <romain.gantois@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/20250307011623.440792-4-inochiama@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use of_device_compatible_match to group existing compatible
check of GMAC4 device.
Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
Reviewed-by: Romain Gantois <romain.gantois@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/20250307011623.440792-3-inochiama@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The GMAC IP on SG2044 is almost a standard Synopsys DesignWare
MAC (version 5.30a) with some extra clock.
Add necessary compatible string for this device.
Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://patch.msgid.link/20250307011623.440792-2-inochiama@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
phylink_init_eee() is currently unused.
It was last added in 2019 by
commit 86e58135bc4a ("net: phylink: add phylink_init_eee() helper")
but it didn't actually wire a use up.
It had previous been removed in 2017 by
commit 939eae25d9a5 ("phylink: remove phylink_init_eee()").
Remove it again.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/20250306184534.246152-1-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Vasily Gorbik:
- Fix return address recovery of traced function in ftrace to ensure
reliable stack unwinding
- Fix compiler warnings and runtime crashes of vDSO selftests on s390
by introducing a dedicated GNU hash bucket pointer with correct
32-bit entry size
- Fix test_monitor_call() inline asm, which misses CC clobber, by
switching to an instruction that doesn't modify CC
* tag 's390-6.14-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/ftrace: Fix return address recovery of traced function
selftests/vDSO: Fix GNU hash table entry size for s390x
s390/traps: Fix test_monitor_call() inline assembly
|
|
Reintroduce dynamically-allocated LockClassKeys such that they are
automatically (de)registered. Require that all usages of LockClassKeys
ensure that they are Pin'd.
Currently, only `'static` LockClassKeys are supported, so Pin is
redundant. However, it is intended that dynamically-allocated
LockClassKeys will eventually be supported, so using Pin from the outset
will make that change simpler.
Closes: https://github.com/Rust-for-Linux/linux/issues/1102
Suggested-by: Benno Lossin <benno.lossin@proton.me>
Suggested-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Mitchell Levy <levymitchell0@gmail.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Benno Lossin <benno.lossin@proton.me>
Link: https://lore.kernel.org/r/20250307232717.1759087-12-boqun.feng@gmail.com
|
|
To support waiting for a `CondVar` as a freezable process, add a
wait_interruptible_freezable() function.
Binder needs this function in the appropriate places to freeze a process
where some of its threads are blocked on the Binder driver.
[ Boqun: Cleaned up the changelog and documentation. ]
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250307232717.1759087-10-boqun.feng@gmail.com
|
|
To provide examples on usage of `Guard::lock_ref()` along with the unit
test, an "assert a lock is held by a guard" example is added.
(Also apply feedback from Benno.)
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Benno Lossin <benno.lossin@proton.me>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20250223072114.3715-1-boqun.feng@gmail.com
Link: https://lore.kernel.org/r/20250307232717.1759087-9-boqun.feng@gmail.com
|
|
In order to assert a particular `Guard` is associated with a particular
`Lock`, add an accessor to obtain a reference to the underlying `Lock`
of a `Guard`.
Binder needs this assertion to ensure unsafe list operations are done
with the correct lock held.
[Boqun: Capitalize the title and reword the commit log]
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Fiona Behrens <me@kloenk.dev>
Link: https://lore.kernel.org/r/20250205-guard-get-lock-v2-1-ba32a8c1d5b7@google.com
Link: https://lore.kernel.org/r/20250307232717.1759087-8-boqun.feng@gmail.com
|
|
KASAN instrumentation of lockdep has been disabled, as we don't need
KASAN to check the validity of lockdep internal data structures and
incur unnecessary performance overhead. However, the lockdep_map pointer
passed in externally may not be valid (e.g. use-after-free) and we run
the risk of using garbage data resulting in false lockdep reports.
Add kasan_check_byte() call in lock_acquire() for non kernel core data
object to catch invalid lockdep_map and print out a KASAN report before
any lockdep splat, if any.
Suggested-by: Marco Elver <elver@google.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Marco Elver <elver@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Link: https://lore.kernel.org/r/20250214195242.2480920-1-longman@redhat.com
Link: https://lore.kernel.org/r/20250307232717.1759087-7-boqun.feng@gmail.com
|
|
Both KASAN and LOCKDEP are commonly enabled in building a debug kernel.
Each of them can significantly slow down the speed of a debug kernel.
Enabling KASAN instrumentation of the LOCKDEP code will further slow
things down.
Since LOCKDEP is a high overhead debugging tool, it will never get
enabled in a production kernel. The LOCKDEP code is also pretty mature
and is unlikely to get major changes. There is also a possibility of
recursion similar to KCSAN.
To evaluate the performance impact of disabling KASAN instrumentation
of lockdep.c, the time to do a parallel build of the Linux defconfig
kernel was used as the benchmark. Two x86-64 systems (Skylake & Zen 2)
and an arm64 system were used as test beds. Two sets of non-RT and RT
kernels with similar configurations except mainly CONFIG_PREEMPT_RT
were used for evaluation.
For the Skylake system:
Kernel Run time Sys time
------ -------- --------
Non-debug kernel (baseline) 0m47.642s 4m19.811s
[CONFIG_KASAN_INLINE=y]
Debug kernel 2m11.108s (x2.8) 38m20.467s (x8.9)
Debug kernel (patched) 1m49.602s (x2.3) 31m28.501s (x7.3)
Debug kernel
(patched + mitigations=off) 1m30.988s (x1.9) 26m41.993s (x6.2)
RT kernel (baseline) 0m54.871s 7m15.340s
[CONFIG_KASAN_INLINE=n]
RT debug kernel 6m07.151s (x6.7) 135m47.428s (x18.7)
RT debug kernel (patched) 3m42.434s (x4.1) 74m51.636s (x10.3)
RT debug kernel
(patched + mitigations=off) 2m40.383s (x2.9) 57m54.369s (x8.0)
[CONFIG_KASAN_INLINE=y]
RT debug kernel 3m22.155s (x3.7) 77m53.018s (x10.7)
RT debug kernel (patched) 2m36.700s (x2.9) 54m31.195s (x7.5)
RT debug kernel
(patched + mitigations=off) 2m06.110s (x2.3) 45m49.493s (x6.3)
For the Zen 2 system:
Kernel Run time Sys time
------ -------- --------
Non-debug kernel (baseline) 1m42.806s 39m48.714s
[CONFIG_KASAN_INLINE=y]
Debug kernel 4m04.524s (x2.4) 125m35.904s (x3.2)
Debug kernel (patched) 3m56.241s (x2.3) 127m22.378s (x3.2)
Debug kernel
(patched + mitigations=off) 2m38.157s (x1.5) 92m35.680s (x2.3)
RT kernel (baseline) 1m51.500s 14m56.322s
[CONFIG_KASAN_INLINE=n]
RT debug kernel 16m04.962s (x8.7) 244m36.463s (x16.4)
RT debug kernel (patched) 9m09.073s (x4.9) 129m28.439s (x8.7)
RT debug kernel
(patched + mitigations=off) 3m31.662s (x1.9) 51m01.391s (x3.4)
For the arm64 system:
Kernel Run time Sys time
------ -------- --------
Non-debug kernel (baseline) 1m56.844s 8m47.150s
Debug kernel 3m54.774s (x2.0) 92m30.098s (x10.5)
Debug kernel (patched) 3m32.429s (x1.8) 77m40.779s (x8.8)
RT kernel (baseline) 4m01.641s 18m16.777s
[CONFIG_KASAN_INLINE=n]
RT debug kernel 19m32.977s (x4.9) 304m23.965s (x16.7)
RT debug kernel (patched) 16m28.354s (x4.1) 234m18.149s (x12.8)
Turning the mitigations off doesn't seems to have any noticeable impact
on the performance of the arm64 system. So the mitigation=off entries
aren't included.
For the x86 CPUs, CPU mitigations has a much bigger
impact on performance, especially the RT debug kernel with
CONFIG_KASAN_INLINE=n. The SRSO mitigation in Zen 2 has an especially
big impact on the debug kernel. It is also the majority of the slowdown
with mitigations on. It is because the patched RET instruction slows
down function returns. A lot of helper functions that are normally
compiled out or inlined may become real function calls in the debug
kernel.
With !CONFIG_KASAN_INLINE, the KASAN instrumentation inserts a
lot of __asan_loadX*() and __kasan_check_read() function calls to memory
access portion of the code. The lockdep's __lock_acquire() function,
for instance, has 66 __asan_loadX*() and 6 __kasan_check_read() calls
added with KASAN instrumentation. Of course, the actual numbers may vary
depending on the compiler used and the exact version of the lockdep code.
With the Skylake test system, the parallel kernel build times reduction
of the RT debug kernel with this patch are:
CONFIG_KASAN_INLINE=n: -37%
CONFIG_KASAN_INLINE=y: -22%
The time reduction is less with CONFIG_KASAN_INLINE=y, but it is still
significant.
Setting CONFIG_KASAN_INLINE=y can result in a significant performance
improvement. The major drawback is a significant increase in the size
of kernel text. In the case of vmlinux, its text size increases from
45997948 to 67606807. That is a 47% size increase (about 21 Mbytes). The
size increase of other kernel modules should be similar.
With the newly added rtmutex and lockdep lock events, the relevant
event counts for the test runs with the Skylake system were:
Event type Debug kernel RT debug kernel
---------- ------------ ---------------
lockdep_acquire 1,968,663,277 5,425,313,953
rtlock_slowlock - 401,701,156
rtmutex_slowlock - 139,672
The __lock_acquire() calls in the RT debug kernel are x2.8 times of the
non-RT debug kernel with the same workload. Since the __lock_acquire()
function is a big hitter in term of performance slowdown, this makes
the RT debug kernel much slower than the non-RT one. The average lock
nesting depth is likely to be higher in the RT debug kernel too leading
to longer execution time in the __lock_acquire() function.
As the small advantage of enabling KASAN instrumentation to catch
potential memory access error in the lockdep debugging tool is probably
not worth the drawback of further slowing down a debug kernel, disable
KASAN instrumentation in the lockdep code to allow the debug kernels
to regain some performance back, especially for the RT debug kernels.
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Marco Elver <elver@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250307232717.1759087-6-boqun.feng@gmail.com
|
|
Add some lock events to lockdep to profile its behavior.
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250307232717.1759087-5-boqun.feng@gmail.com
|
|
Add locking events for rtlock_slowlock() and rt_mutex_slowlock() for
profiling the slow path behavior of rt_spin_lock() and rt_mutex_lock().
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250307232717.1759087-4-boqun.feng@gmail.com
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
A circular lock dependency splat has been seen involving down_trylock():
======================================================
WARNING: possible circular locking dependency detected
6.12.0-41.el10.s390x+debug
------------------------------------------------------
dd/32479 is trying to acquire lock:
0015a20accd0d4f8 ((console_sem).lock){-.-.}-{2:2}, at: down_trylock+0x26/0x90
but task is already holding lock:
000000017e461698 (&zone->lock){-.-.}-{2:2}, at: rmqueue_bulk+0xac/0x8f0
the existing dependency chain (in reverse order) is:
-> #4 (&zone->lock){-.-.}-{2:2}:
-> #3 (hrtimer_bases.lock){-.-.}-{2:2}:
-> #2 (&rq->__lock){-.-.}-{2:2}:
-> #1 (&p->pi_lock){-.-.}-{2:2}:
-> #0 ((console_sem).lock){-.-.}-{2:2}:
The console_sem -> pi_lock dependency is due to calling try_to_wake_up()
while holding the console_sem raw_spinlock. This dependency can be broken
by using wake_q to do the wakeup instead of calling try_to_wake_up()
under the console_sem lock. This will also make the semaphore's
raw_spinlock become a terminal lock without taking any further locks
underneath it.
The hrtimer_bases.lock is a raw_spinlock while zone->lock is a
spinlock. The hrtimer_bases.lock -> zone->lock dependency happens via
the debug_objects_fill_pool() helper function in the debugobjects code.
-> #4 (&zone->lock){-.-.}-{2:2}:
__lock_acquire+0xe86/0x1cc0
lock_acquire.part.0+0x258/0x630
lock_acquire+0xb8/0xe0
_raw_spin_lock_irqsave+0xb4/0x120
rmqueue_bulk+0xac/0x8f0
__rmqueue_pcplist+0x580/0x830
rmqueue_pcplist+0xfc/0x470
rmqueue.isra.0+0xdec/0x11b0
get_page_from_freelist+0x2ee/0xeb0
__alloc_pages_noprof+0x2c2/0x520
alloc_pages_mpol_noprof+0x1fc/0x4d0
alloc_pages_noprof+0x8c/0xe0
allocate_slab+0x320/0x460
___slab_alloc+0xa58/0x12b0
__slab_alloc.isra.0+0x42/0x60
kmem_cache_alloc_noprof+0x304/0x350
fill_pool+0xf6/0x450
debug_object_activate+0xfe/0x360
enqueue_hrtimer+0x34/0x190
__run_hrtimer+0x3c8/0x4c0
__hrtimer_run_queues+0x1b2/0x260
hrtimer_interrupt+0x316/0x760
do_IRQ+0x9a/0xe0
do_irq_async+0xf6/0x160
Normally a raw_spinlock to spinlock dependency is not legitimate
and will be warned if CONFIG_PROVE_RAW_LOCK_NESTING is enabled,
but debug_objects_fill_pool() is an exception as it explicitly
allows this dependency for non-PREEMPT_RT kernel without causing
PROVE_RAW_LOCK_NESTING lockdep splat. As a result, this dependency is
legitimate and not a bug.
Anyway, semaphore is the only locking primitive left that is still
using try_to_wake_up() to do wakeup inside critical section, all the
other locking primitives had been migrated to use wake_q to do wakeup
outside of the critical section. It is also possible that there are
other circular locking dependencies involving printk/console_sem or
other existing/new semaphores lurking somewhere which may show up in
the future. Let just do the migration now to wake_q to avoid headache
like this.
Reported-by: yzbot+ed801a886dfdbfe7136d@syzkaller.appspotmail.com
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250307232717.1759087-3-boqun.feng@gmail.com
|
|
Add the "struct" keyword to prevent a kernel-doc warning:
rtmutex_common.h:67: warning: cannot understand function prototype: 'struct rt_wake_q_head '
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Waiman Long <longman@redhat.com>
Link: https://lore.kernel.org/r/20250307232717.1759087-2-boqun.feng@gmail.com
|
|
Currently, dynamically allocated LockCLassKeys can be used from the Rust
side without having them registered. This is a soundness issue, so
remove them.
Fixes: 6ea5aa08857a ("rust: sync: introduce `LockClassKey`")
Suggested-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Mitchell Levy <levymitchell0@gmail.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Benno Lossin <benno.lossin@proton.me>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250307232717.1759087-11-boqun.feng@gmail.com
|
|
Andy reported the following build warning from head_32.S:
In file included from arch/x86/kernel/head_32.S:29:
arch/x86/include/asm/pgtable_32.h:59:5: error: "PTRS_PER_PMD" is not defined, evaluates to 0 [-Werror=undef]
59 | #if PTRS_PER_PMD > 1
The reason is that on 2-level i386 paging the folded in PMD's
PTRS_PER_PMD constant is not defined in assembly headers,
only in generic MM C headers.
Instead of trying to fish out the definition from the generic
headers, just define it - it even has a comment for it already...
Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/Z8oa8AUVyi2HWfo9@gmail.com
|
|
Apart from some sanity checks on the size of setup.bin, the only
remaining task carried out by the arch/x86/boot/tools/build.c build tool
is generating the CRC-32 checksum of the bzImage. This feature was added
in commit
7d6e737c8d2698b6 ("x86: add a crc32 checksum to the kernel image.")
without any motivation (or any commit log text, for that matter). This
checksum is not verified by any known bootloader, and given that
a) the checksum of the entire bzImage is reported by most tools (zlib,
rhash) as 0xffffffff and not 0x0 as documented,
b) the checksum is corrupted when the image is signed for secure boot,
which means that no distro ships x86 images with valid CRCs,
it seems quite unlikely that this checksum is being used, so let's just
drop it, along with the tool that generates it.
Instead, use simple file concatenation and truncation to combine the two
pieces into bzImage, and replace the checks on the size of the setup
block with a couple of ASSERT()s in the linker script.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ian Campbell <ijc@hellion.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250307164801.885261-2-ardb+git@google.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab fix from Vlastimil Babka:
- Stable fix for kmem_cache_destroy() called from a WQ_MEM_RECLAIM
workqueue causing a warning due to the new kvfree_rcu_barrier()
(Uladzislau Rezki)
* tag 'slab-for-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
mm/slab/kvfree_rcu: Switch to WQ_MEM_RECLAIM wq
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
"Restore the previous behavior of the ACPI platform_profile sysfs
interface that has been changed recently in a way incompatible with
the existing user space (Mario Limonciello)"
* tag 'acpi-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
platform/x86/amd: pmf: Add balanced-performance to hidden choices
platform/x86/amd: pmf: Add 'quiet' to hidden choices
ACPI: platform_profile: Add support for hidden choices
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull core dumping fix from Kees Cook:
- Only sort VMAs when core_sort_vma sysctl is set
* tag 'execve-v6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
coredump: Only sort VMAs when core_sort_vma sysctl is set
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- fix leaked extent map after error when reading chunks
- replace use of deprecated strncpy
- in zoned mode, fixed range when ulocking extent range, causing a hang
* tag 'for-6.14-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix a leaked chunk map issue in read_one_chunk()
btrfs: replace deprecated strncpy() with strscpy()
btrfs: zoned: fix extent range end unlock in cow_file_range()
|
|
Pull block fixes from Jens Axboe:
- NVMe pull request via Keith:
- TCP use after free fix on polling (Sagi)
- Controller memory buffer cleanup fixes (Icenowy)
- Free leaking requests on bad user passthrough commands (Keith)
- TCP error message fix (Maurizio)
- TCP corruption fix on partial PDU (Maurizio)
- TCP memory ordering fix for weakly ordered archs (Meir)
- Type coercion fix on message error for TCP (Dan)
- Name the RQF flags enum, fixing issues with anon enums and BPF import
of it
- ublk parameter setting fix
- GPT partition 7-bit conversion fix
* tag 'block-6.14-20250306' of git://git.kernel.dk/linux:
block: Name the RQF flags enum
nvme-tcp: fix signedness bug in nvme_tcp_init_connection()
block: fix conversion of GPT partition name to 7-bit
ublk: set_params: properly check if parameters can be applied
nvmet-tcp: Fix a possible sporadic response drops in weakly ordered arch
nvme-tcp: fix potential memory corruption in nvme_tcp_recv_pdu()
nvme-tcp: Fix a C2HTermReq error message
nvmet: remove old function prototype
nvme-ioctl: fix leaked requests on mapping error
nvme-pci: skip CMB blocks incompatible with PCI P2P DMA
nvme-pci: clean up CMBMSC when registering CMB fails
nvme-tcp: fix possible UAF in nvme_tcp_poll
|
|
Pull io_uring fix from Jens Axboe:
"A single fix for a regression introduced in the 6.14 merge window,
causing stalls/hangs with IOPOLL reads or writes"
* tag 'io_uring-6.14-20250306' of git://git.kernel.dk/linux:
io_uring/rw: ensure reissue path is correctly handled for IOPOLL
|
|
Commit ef623a647f42 ("io_uring: Move old async data allocation helper
to header") leave behind this unused declaration.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Link: https://lore.kernel.org/r/20250305013454.3635021-1-yuehaibing@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc scheduler fixes from Ingo Molnar:
- Fix deadline scheduler sysctl parameter setting bug
- Fix RT scheduler sysctl parameter setting bug
- Fix possible memory corruption in child_cfs_rq_on_list()
* tag 'sched-urgent-2025-03-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/rt: Update limit of sched_rt sysctl in documentation
sched/deadline: Use online cpus for validating runtime
sched/fair: Fix potential memory corruption in child_cfs_rq_on_list
|
|
library versioning was broken:
libcpupower.so.0.0.1
libcpupower.so -> libcpupower.so.0.0.1
libcpupower.so.1 -> libcpupower.so.0.0.1
and is fixed by this patch to:
libcpupower.so.1.0.1
libcpupower.so -> libcpupower.so.1.0.1
libcpupower.so.1 -> libcpupower.so.1.0.1
Link: https://lore.kernel.org/r/20250307094334.39587-1-trenn@suse.de
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Shuah Khan <shuah@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf event fixes from Ingo Molnar:
"Fix a race between PMU registration and event creation, and fix
pmus_lock vs. pmus_srcu lock ordering"
* tag 'perf-urgent-2025-03-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Fix perf_pmu_register() vs. perf_init_event()
perf/core: Fix pmus_lock vs. pmus_srcu ordering
|
|
Add the remaining SHF_ flags, as listed in the "Executable and
Linkable Format" Wikipedia page and the System V Application Binary
Interface[1]. This allows drivers to load and parse ELF images that use
some of those flags.
In particular, an upcoming change to the Nouveau GPU driver will
use some of the flags.
Link: https://refspecs.linuxfoundation.org/elf/gabi4+/ch4.sheader.html#sh_flags [1]
Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Link: https://lore.kernel.org/r/20250307171417.267488-1-ttabi@nvidia.com
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
Add support for wildcard matching of network interface names. This is
useful for auto-generated interfaces, for example podman creates network
interfaces for containers with the naming scheme podman0, podman1,
podman2, ...
To maintain backward compatibility guard this feature with a new policy
capability 'netif_wildcard'.
Netifcon definitions are compared against in the order given by the
policy, so userspace tools should sort them in a reasonable order.
Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 fixes from Ingo Molnar:
- Fix CPUID leaf 0x2 parsing bugs
- Sanitize very early boot parameters to avoid crash
- Fix size overflows in the SGX code
- Make CALL_NOSPEC use consistent
* tag 'x86-urgent-2025-03-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/boot: Sanitize boot params before parsing command line
x86/sgx: Fix size overflows in sgx_encl_create()
x86/cpu: Properly parse CPUID leaf 0x2 TLB descriptor 0x63
x86/cpu: Validate CPUID leaf 0x2 EDX output
x86/cacheinfo: Validate CPUID leaf 0x2 EDX output
x86/speculation: Add a conditional CS prefix to CALL_NOSPEC
x86/speculation: Simplify and make CALL_NOSPEC consistent
|
|
Subshell evaluations are not exempt from errexit, so if a command is
not available, `which` will fail and exit the script as a whole.
This causes the helpful error messages to not be printed if they are
tacked on using a `$?` comparison.
Resolve the issue by using chains of logical operators, which are not
subject to the effects of errexit.
Fixes: e37c1877ba5b1 ("scripts/selinux: modernize mdp")
Signed-off-by: Tim Schumacher <tim.schumacher1@huawei.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
Similarly to what was done with the memcpy() routines, make
copy_to_user(), copy_from_user() and clear_user() also use the Armv8.8
FEAT_MOPS instructions.
Both MOPS implementation options (A and B) are supported, including
asymmetric systems. The exception fixup code fixes up the registers
according to the option used.
In case of a fault the routines return precisely how much was not copied
(as required by the comment in include/linux/uaccess.h), as unprivileged
versions of CPY/SET are guaranteed not to have written past the
addresses reported in the GPRs.
The MOPS instructions could possibly be inlined into callers (and
patched to branch to the generic implementation if not detected;
similarly to what x86 does), but as a first step this patch just uses
them in the out-of-line routines.
Signed-off-by: Kristina Martšenko <kristina.martsenko@arm.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20250228170006.390100-4-kristina.martsenko@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
A subsequent patch will use CPY* instructions to copy between user and
kernel memory. Add handling for PAN faults caused by an intended kernel
memory access erroneously accessing user memory, in order to make it
easier to debug kernel bugs and to keep the same behavior as with
regular loads/stores.
Signed-off-by: Kristina Martšenko <kristina.martsenko@arm.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20250228170006.390100-3-kristina.martsenko@arm.com
[catalin.marinas@arm.com: Folded the extable search into insn_may_access_user()]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|