Age | Commit message (Collapse) | Author |
|
In LoongArch, get_numa_distances_cnt() isn't in use, resulting in a
compiler warning.
Fix follow errors with clang-18 when W=1e:
arch/loongarch/kernel/acpi.c:259:28: error: unused function 'get_numa_distances_cnt' [-Werror,-Wunused-function]
259 | static inline unsigned int get_numa_distances_cnt(struct acpi_table_slit *slit)
| ^~~~~~~~~~~~~~~~~~~~~~
1 error generated.
Link: https://lore.kernel.org/all/Z7bHPVUH4lAezk0E@kernel.org/
Signed-off-by: Yuli Wang <wangyuli@uniontech.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
When compiling on LoongArch, there exists the following objtool warning
in arch/loongarch/kernel/machine_kexec.o:
kexec_reboot() falls through to next function crash_shutdown_secondary()
Avoid using unreachable() as it can (and will in the absence of UBSAN)
generate fall-through code. Use BUG() so we get a "break BRK_BUG" trap
(with unreachable annotation).
Cc: stable@vger.kernel.org # 6.12+
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Fix below kernel warning:
fs/binfmt_elf_fdpic.c:1024:52: warning: variable 'excess1' set but not
used [-Wunused-but-set-variable]
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: sunliming <sunliming@kylinos.cn>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20250308022754.75013-1-sunliming@linux.dev
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
Limit integer wrap-around mitigation to only the "size_t" type (for
now). Notably this covers all special functions/builtins that return
"size_t", like sizeof(). This remains an experimental feature and is
likely to be replaced with type annotations.
Reviewed-by: Justin Stitt <justinstitt@google.com>
Link: https://lore.kernel.org/r/20250307041914.937329-3-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
To make integer wrap-around mitigation actually useful, the associated
sanitizers must not instrument cases where the wrap-around is explicitly
defined (e.g. "-2UL"), being tested for (e.g. "if (a + b < a)"), or
where it has no impact on code flow (e.g. "while (var--)"). Enable
pattern exclusions for the integer wrap sanitizers.
Reviewed-by: Justin Stitt <justinstitt@google.com>
Link: https://lore.kernel.org/r/20250307041914.937329-2-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
Since we're going to approach integer overflow mitigation a type at a
time, we need to enable all of the associated sanitizers, and then opt
into types one at a time.
Rename the existing "signed wrap" sanitizer to just the entire topic area:
"integer wrap". Enable the implicit integer truncation sanitizers, with
required callbacks and tests.
Notably, this requires features (currently) only available in Clang,
so we can depend on the cc-option tests to determine availability
instead of doing version tests.
Link: https://lore.kernel.org/r/20250307041914.937329-1-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
run-script-ask.sh had an incorrect file extension. This helper script
is not used by kselftests.
Fixes: 2a69962be4a7 ("samples/check-exec: Add an enlighten "inc" interpreter and 28 tests")
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20250306180559.1289243-1-mic@digikod.net
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
current->group_leader is stable, no need to take rcu_read_lock() and do
get/put_task_struct().
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20250219161417.GA20851@redhat.com
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
The function __netpoll_send_skb() is being invoked without holding the
RCU read lock. This oversight triggers a warning message when
CONFIG_PROVE_RCU_LIST is enabled:
net/core/netpoll.c:330 suspicious rcu_dereference_check() usage!
netpoll_send_skb
netpoll_send_udp
write_ext_msg
console_flush_all
console_unlock
vprintk_emit
To prevent npinfo from disappearing unexpectedly, ensure that
__netpoll_send_skb() is protected with the RCU read lock.
Fixes: 2899656b494dcd1 ("netpoll: take rcu_read_lock_bh() in netpoll_send_skb_on_dev()")
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250306-netpoll_rcu_v2-v2-1-bc4f5c51742a@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Andrei Botila says:
====================
net: phy: nxp-c45-tja11xx: add errata for TJA112XA/B
This patch series implements two errata for TJA1120 and TJA1121.
The first errata applicable to both RGMII and SGMII version
of TJA1120 and TJA1121 deals with achieving full silicon performance.
The workaround in this case is putting the PHY in managed mode and
applying a series of PHY writes before the link gest established.
The second errata applicable only to SGMII version of TJA1120 and
TJA1121 deals with achieving a stable operation of SGMII after a
startup event.
The workaround puts the SGMII PCS into power down mode and back up
after restart or wakeup from sleep.
====================
Link: https://patch.msgid.link/20250304160619.181046-1-andrei.botila@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
TJA1120B/TJA1121B can achieve a stable operation of SGMII after
a startup event by putting the SGMII PCS into power down mode and
restart afterwards.
It is necessary to put the SGMII PCS into power down mode and back up.
Cc: stable@vger.kernel.org
Fixes: f1fe5dff2b8a ("net: phy: nxp-c45-tja11xx: add TJA1120 support")
Signed-off-by: Andrei Botila <andrei.botila@oss.nxp.com>
Link: https://patch.msgid.link/20250304160619.181046-3-andrei.botila@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The most recent sillicon versions of TJA1120 and TJA1121 can achieve
full silicon performance by putting the PHY in managed mode.
It is necessary to apply these PHY writes before link gets established.
Application of this fix is required after restart of device and wakeup
from sleep.
Cc: stable@vger.kernel.org
Fixes: f1fe5dff2b8a ("net: phy: nxp-c45-tja11xx: add TJA1120 support")
Signed-off-by: Andrei Botila <andrei.botila@oss.nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250304160619.181046-2-andrei.botila@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use skb_cow_head() prior to modifying the TX SKB. This is necessary
when the SKB has been cloned, to avoid modifying other shared clones.
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Fixes: f5b8abf9fc3d ("mctp i2c: MCTP I2C binding driver")
Link: https://patch.msgid.link/20250306-matt-mctp-i2c-cow-v1-1-293827212681@codeconstruct.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use skb_cow_head() prior to modifying the tx skb. This is necessary
when the skb has been cloned, to avoid modifying other shared clones.
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Fixes: c8755b29b58e ("mctp i3c: MCTP I3C driver")
Link: https://patch.msgid.link/20250306-matt-i3c-cow-head-v1-1-d5e6a5495227@codeconstruct.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ATU Load operations could fail silently if there's not enough space
on the device to hold the new entry. When this happens, the symptom
depends on the unknown flood settings. If unknown multicast flood is
disabled, the multicast packets are dropped when the ATU table is
full. If unknown multicast flood is enabled, the multicast packets
will be flooded to all ports. Either way, IGMP snooping is broken
when the ATU Load operation fails silently.
Do a Read-After-Write verification after each fdb/mdb add operation
to make sure that the operation was really successful, and return
-ENOSPC otherwise.
Fixes: defb05b9b9b4 ("net: dsa: mv88e6xxx: Add support for fdb_add, fdb_del, and fdb_getnext")
Signed-off-by: Joseph Huang <Joseph.Huang@garmin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250306172306.3859214-1-Joseph.Huang@garmin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Firmware version query is supported on the PFs. Due to this
following kernel warning log is observed:
[ 188.590344] mlx5_core 0000:08:00.2: mlx5_fw_version_query:816:(pid 1453): fw query isn't supported by the FW
Fix it by restricting the query and devlink info to the PF.
Fixes: 8338d9378895 ("net/mlx5: Added devlink info callback")
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Link: https://patch.msgid.link/20250306212529.429329-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently on stable trees we have support for netmem/devmem RX but not
TX. It is not safe to forward/redirect an RX unreadable netmem packet
into the device's TX path, as the device may call dma-mapping APIs on
dma addrs that should not be passed to it.
Fix this by preventing the xmit of unreadable skbs.
Tested by configuring tc redirect:
sudo tc qdisc add dev eth1 ingress
sudo tc filter add dev eth1 ingress protocol ip prio 1 flower ip_proto \
tcp src_ip 192.168.1.12 action mirred egress redirect dev eth1
Before, I see unreadable skbs in the driver's TX path passed to dma
mapping APIs.
After, I don't see unreadable skbs in the driver's TX path passed to dma
mapping APIs.
Fixes: 65249feb6b3d ("net: add support for skbs with unreadable frags")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Mina Almasry <almasrymina@google.com>
Link: https://patch.msgid.link/20250306215520.1415465-1-almasrymina@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:
====================
bluetooth pull request for net:
- btusb: Configure altsetting for HCI_USER_CHANNEL
- hci_event: Fix enabling passive scanning
- revert: "hci_core: Fix sleeping function called from invalid context"
- SCO: fix sco_conn refcounting on sco_conn_ready
* tag 'for-net-2025-03-07' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
Revert "Bluetooth: hci_core: Fix sleeping function called from invalid context"
Bluetooth: hci_event: Fix enabling passive scanning
Bluetooth: SCO: fix sco_conn refcounting on sco_conn_ready
Bluetooth: btusb: Configure altsetting for HCI_USER_CHANNEL
====================
Link: https://patch.msgid.link/20250307181854.99433-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Vasily Gorbik:
- Fix return address recovery of traced function in ftrace to ensure
reliable stack unwinding
- Fix compiler warnings and runtime crashes of vDSO selftests on s390
by introducing a dedicated GNU hash bucket pointer with correct
32-bit entry size
- Fix test_monitor_call() inline asm, which misses CC clobber, by
switching to an instruction that doesn't modify CC
* tag 's390-6.14-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/ftrace: Fix return address recovery of traced function
selftests/vDSO: Fix GNU hash table entry size for s390x
s390/traps: Fix test_monitor_call() inline assembly
|
|
Reintroduce dynamically-allocated LockClassKeys such that they are
automatically (de)registered. Require that all usages of LockClassKeys
ensure that they are Pin'd.
Currently, only `'static` LockClassKeys are supported, so Pin is
redundant. However, it is intended that dynamically-allocated
LockClassKeys will eventually be supported, so using Pin from the outset
will make that change simpler.
Closes: https://github.com/Rust-for-Linux/linux/issues/1102
Suggested-by: Benno Lossin <benno.lossin@proton.me>
Suggested-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Mitchell Levy <levymitchell0@gmail.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Benno Lossin <benno.lossin@proton.me>
Link: https://lore.kernel.org/r/20250307232717.1759087-12-boqun.feng@gmail.com
|
|
To support waiting for a `CondVar` as a freezable process, add a
wait_interruptible_freezable() function.
Binder needs this function in the appropriate places to freeze a process
where some of its threads are blocked on the Binder driver.
[ Boqun: Cleaned up the changelog and documentation. ]
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250307232717.1759087-10-boqun.feng@gmail.com
|
|
To provide examples on usage of `Guard::lock_ref()` along with the unit
test, an "assert a lock is held by a guard" example is added.
(Also apply feedback from Benno.)
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Benno Lossin <benno.lossin@proton.me>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20250223072114.3715-1-boqun.feng@gmail.com
Link: https://lore.kernel.org/r/20250307232717.1759087-9-boqun.feng@gmail.com
|
|
In order to assert a particular `Guard` is associated with a particular
`Lock`, add an accessor to obtain a reference to the underlying `Lock`
of a `Guard`.
Binder needs this assertion to ensure unsafe list operations are done
with the correct lock held.
[Boqun: Capitalize the title and reword the commit log]
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Fiona Behrens <me@kloenk.dev>
Link: https://lore.kernel.org/r/20250205-guard-get-lock-v2-1-ba32a8c1d5b7@google.com
Link: https://lore.kernel.org/r/20250307232717.1759087-8-boqun.feng@gmail.com
|
|
KASAN instrumentation of lockdep has been disabled, as we don't need
KASAN to check the validity of lockdep internal data structures and
incur unnecessary performance overhead. However, the lockdep_map pointer
passed in externally may not be valid (e.g. use-after-free) and we run
the risk of using garbage data resulting in false lockdep reports.
Add kasan_check_byte() call in lock_acquire() for non kernel core data
object to catch invalid lockdep_map and print out a KASAN report before
any lockdep splat, if any.
Suggested-by: Marco Elver <elver@google.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Marco Elver <elver@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Link: https://lore.kernel.org/r/20250214195242.2480920-1-longman@redhat.com
Link: https://lore.kernel.org/r/20250307232717.1759087-7-boqun.feng@gmail.com
|
|
Both KASAN and LOCKDEP are commonly enabled in building a debug kernel.
Each of them can significantly slow down the speed of a debug kernel.
Enabling KASAN instrumentation of the LOCKDEP code will further slow
things down.
Since LOCKDEP is a high overhead debugging tool, it will never get
enabled in a production kernel. The LOCKDEP code is also pretty mature
and is unlikely to get major changes. There is also a possibility of
recursion similar to KCSAN.
To evaluate the performance impact of disabling KASAN instrumentation
of lockdep.c, the time to do a parallel build of the Linux defconfig
kernel was used as the benchmark. Two x86-64 systems (Skylake & Zen 2)
and an arm64 system were used as test beds. Two sets of non-RT and RT
kernels with similar configurations except mainly CONFIG_PREEMPT_RT
were used for evaluation.
For the Skylake system:
Kernel Run time Sys time
------ -------- --------
Non-debug kernel (baseline) 0m47.642s 4m19.811s
[CONFIG_KASAN_INLINE=y]
Debug kernel 2m11.108s (x2.8) 38m20.467s (x8.9)
Debug kernel (patched) 1m49.602s (x2.3) 31m28.501s (x7.3)
Debug kernel
(patched + mitigations=off) 1m30.988s (x1.9) 26m41.993s (x6.2)
RT kernel (baseline) 0m54.871s 7m15.340s
[CONFIG_KASAN_INLINE=n]
RT debug kernel 6m07.151s (x6.7) 135m47.428s (x18.7)
RT debug kernel (patched) 3m42.434s (x4.1) 74m51.636s (x10.3)
RT debug kernel
(patched + mitigations=off) 2m40.383s (x2.9) 57m54.369s (x8.0)
[CONFIG_KASAN_INLINE=y]
RT debug kernel 3m22.155s (x3.7) 77m53.018s (x10.7)
RT debug kernel (patched) 2m36.700s (x2.9) 54m31.195s (x7.5)
RT debug kernel
(patched + mitigations=off) 2m06.110s (x2.3) 45m49.493s (x6.3)
For the Zen 2 system:
Kernel Run time Sys time
------ -------- --------
Non-debug kernel (baseline) 1m42.806s 39m48.714s
[CONFIG_KASAN_INLINE=y]
Debug kernel 4m04.524s (x2.4) 125m35.904s (x3.2)
Debug kernel (patched) 3m56.241s (x2.3) 127m22.378s (x3.2)
Debug kernel
(patched + mitigations=off) 2m38.157s (x1.5) 92m35.680s (x2.3)
RT kernel (baseline) 1m51.500s 14m56.322s
[CONFIG_KASAN_INLINE=n]
RT debug kernel 16m04.962s (x8.7) 244m36.463s (x16.4)
RT debug kernel (patched) 9m09.073s (x4.9) 129m28.439s (x8.7)
RT debug kernel
(patched + mitigations=off) 3m31.662s (x1.9) 51m01.391s (x3.4)
For the arm64 system:
Kernel Run time Sys time
------ -------- --------
Non-debug kernel (baseline) 1m56.844s 8m47.150s
Debug kernel 3m54.774s (x2.0) 92m30.098s (x10.5)
Debug kernel (patched) 3m32.429s (x1.8) 77m40.779s (x8.8)
RT kernel (baseline) 4m01.641s 18m16.777s
[CONFIG_KASAN_INLINE=n]
RT debug kernel 19m32.977s (x4.9) 304m23.965s (x16.7)
RT debug kernel (patched) 16m28.354s (x4.1) 234m18.149s (x12.8)
Turning the mitigations off doesn't seems to have any noticeable impact
on the performance of the arm64 system. So the mitigation=off entries
aren't included.
For the x86 CPUs, CPU mitigations has a much bigger
impact on performance, especially the RT debug kernel with
CONFIG_KASAN_INLINE=n. The SRSO mitigation in Zen 2 has an especially
big impact on the debug kernel. It is also the majority of the slowdown
with mitigations on. It is because the patched RET instruction slows
down function returns. A lot of helper functions that are normally
compiled out or inlined may become real function calls in the debug
kernel.
With !CONFIG_KASAN_INLINE, the KASAN instrumentation inserts a
lot of __asan_loadX*() and __kasan_check_read() function calls to memory
access portion of the code. The lockdep's __lock_acquire() function,
for instance, has 66 __asan_loadX*() and 6 __kasan_check_read() calls
added with KASAN instrumentation. Of course, the actual numbers may vary
depending on the compiler used and the exact version of the lockdep code.
With the Skylake test system, the parallel kernel build times reduction
of the RT debug kernel with this patch are:
CONFIG_KASAN_INLINE=n: -37%
CONFIG_KASAN_INLINE=y: -22%
The time reduction is less with CONFIG_KASAN_INLINE=y, but it is still
significant.
Setting CONFIG_KASAN_INLINE=y can result in a significant performance
improvement. The major drawback is a significant increase in the size
of kernel text. In the case of vmlinux, its text size increases from
45997948 to 67606807. That is a 47% size increase (about 21 Mbytes). The
size increase of other kernel modules should be similar.
With the newly added rtmutex and lockdep lock events, the relevant
event counts for the test runs with the Skylake system were:
Event type Debug kernel RT debug kernel
---------- ------------ ---------------
lockdep_acquire 1,968,663,277 5,425,313,953
rtlock_slowlock - 401,701,156
rtmutex_slowlock - 139,672
The __lock_acquire() calls in the RT debug kernel are x2.8 times of the
non-RT debug kernel with the same workload. Since the __lock_acquire()
function is a big hitter in term of performance slowdown, this makes
the RT debug kernel much slower than the non-RT one. The average lock
nesting depth is likely to be higher in the RT debug kernel too leading
to longer execution time in the __lock_acquire() function.
As the small advantage of enabling KASAN instrumentation to catch
potential memory access error in the lockdep debugging tool is probably
not worth the drawback of further slowing down a debug kernel, disable
KASAN instrumentation in the lockdep code to allow the debug kernels
to regain some performance back, especially for the RT debug kernels.
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Marco Elver <elver@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250307232717.1759087-6-boqun.feng@gmail.com
|
|
Add some lock events to lockdep to profile its behavior.
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250307232717.1759087-5-boqun.feng@gmail.com
|
|
Add locking events for rtlock_slowlock() and rt_mutex_slowlock() for
profiling the slow path behavior of rt_spin_lock() and rt_mutex_lock().
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250307232717.1759087-4-boqun.feng@gmail.com
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
A circular lock dependency splat has been seen involving down_trylock():
======================================================
WARNING: possible circular locking dependency detected
6.12.0-41.el10.s390x+debug
------------------------------------------------------
dd/32479 is trying to acquire lock:
0015a20accd0d4f8 ((console_sem).lock){-.-.}-{2:2}, at: down_trylock+0x26/0x90
but task is already holding lock:
000000017e461698 (&zone->lock){-.-.}-{2:2}, at: rmqueue_bulk+0xac/0x8f0
the existing dependency chain (in reverse order) is:
-> #4 (&zone->lock){-.-.}-{2:2}:
-> #3 (hrtimer_bases.lock){-.-.}-{2:2}:
-> #2 (&rq->__lock){-.-.}-{2:2}:
-> #1 (&p->pi_lock){-.-.}-{2:2}:
-> #0 ((console_sem).lock){-.-.}-{2:2}:
The console_sem -> pi_lock dependency is due to calling try_to_wake_up()
while holding the console_sem raw_spinlock. This dependency can be broken
by using wake_q to do the wakeup instead of calling try_to_wake_up()
under the console_sem lock. This will also make the semaphore's
raw_spinlock become a terminal lock without taking any further locks
underneath it.
The hrtimer_bases.lock is a raw_spinlock while zone->lock is a
spinlock. The hrtimer_bases.lock -> zone->lock dependency happens via
the debug_objects_fill_pool() helper function in the debugobjects code.
-> #4 (&zone->lock){-.-.}-{2:2}:
__lock_acquire+0xe86/0x1cc0
lock_acquire.part.0+0x258/0x630
lock_acquire+0xb8/0xe0
_raw_spin_lock_irqsave+0xb4/0x120
rmqueue_bulk+0xac/0x8f0
__rmqueue_pcplist+0x580/0x830
rmqueue_pcplist+0xfc/0x470
rmqueue.isra.0+0xdec/0x11b0
get_page_from_freelist+0x2ee/0xeb0
__alloc_pages_noprof+0x2c2/0x520
alloc_pages_mpol_noprof+0x1fc/0x4d0
alloc_pages_noprof+0x8c/0xe0
allocate_slab+0x320/0x460
___slab_alloc+0xa58/0x12b0
__slab_alloc.isra.0+0x42/0x60
kmem_cache_alloc_noprof+0x304/0x350
fill_pool+0xf6/0x450
debug_object_activate+0xfe/0x360
enqueue_hrtimer+0x34/0x190
__run_hrtimer+0x3c8/0x4c0
__hrtimer_run_queues+0x1b2/0x260
hrtimer_interrupt+0x316/0x760
do_IRQ+0x9a/0xe0
do_irq_async+0xf6/0x160
Normally a raw_spinlock to spinlock dependency is not legitimate
and will be warned if CONFIG_PROVE_RAW_LOCK_NESTING is enabled,
but debug_objects_fill_pool() is an exception as it explicitly
allows this dependency for non-PREEMPT_RT kernel without causing
PROVE_RAW_LOCK_NESTING lockdep splat. As a result, this dependency is
legitimate and not a bug.
Anyway, semaphore is the only locking primitive left that is still
using try_to_wake_up() to do wakeup inside critical section, all the
other locking primitives had been migrated to use wake_q to do wakeup
outside of the critical section. It is also possible that there are
other circular locking dependencies involving printk/console_sem or
other existing/new semaphores lurking somewhere which may show up in
the future. Let just do the migration now to wake_q to avoid headache
like this.
Reported-by: yzbot+ed801a886dfdbfe7136d@syzkaller.appspotmail.com
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250307232717.1759087-3-boqun.feng@gmail.com
|
|
Add the "struct" keyword to prevent a kernel-doc warning:
rtmutex_common.h:67: warning: cannot understand function prototype: 'struct rt_wake_q_head '
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Waiman Long <longman@redhat.com>
Link: https://lore.kernel.org/r/20250307232717.1759087-2-boqun.feng@gmail.com
|
|
Currently, dynamically allocated LockCLassKeys can be used from the Rust
side without having them registered. This is a soundness issue, so
remove them.
Fixes: 6ea5aa08857a ("rust: sync: introduce `LockClassKey`")
Suggested-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Mitchell Levy <levymitchell0@gmail.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Benno Lossin <benno.lossin@proton.me>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250307232717.1759087-11-boqun.feng@gmail.com
|
|
Andy reported the following build warning from head_32.S:
In file included from arch/x86/kernel/head_32.S:29:
arch/x86/include/asm/pgtable_32.h:59:5: error: "PTRS_PER_PMD" is not defined, evaluates to 0 [-Werror=undef]
59 | #if PTRS_PER_PMD > 1
The reason is that on 2-level i386 paging the folded in PMD's
PTRS_PER_PMD constant is not defined in assembly headers,
only in generic MM C headers.
Instead of trying to fish out the definition from the generic
headers, just define it - it even has a comment for it already...
Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/Z8oa8AUVyi2HWfo9@gmail.com
|
|
Apart from some sanity checks on the size of setup.bin, the only
remaining task carried out by the arch/x86/boot/tools/build.c build tool
is generating the CRC-32 checksum of the bzImage. This feature was added
in commit
7d6e737c8d2698b6 ("x86: add a crc32 checksum to the kernel image.")
without any motivation (or any commit log text, for that matter). This
checksum is not verified by any known bootloader, and given that
a) the checksum of the entire bzImage is reported by most tools (zlib,
rhash) as 0xffffffff and not 0x0 as documented,
b) the checksum is corrupted when the image is signed for secure boot,
which means that no distro ships x86 images with valid CRCs,
it seems quite unlikely that this checksum is being used, so let's just
drop it, along with the tool that generates it.
Instead, use simple file concatenation and truncation to combine the two
pieces into bzImage, and replace the checks on the size of the setup
block with a couple of ASSERT()s in the linker script.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ian Campbell <ijc@hellion.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250307164801.885261-2-ardb+git@google.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab fix from Vlastimil Babka:
- Stable fix for kmem_cache_destroy() called from a WQ_MEM_RECLAIM
workqueue causing a warning due to the new kvfree_rcu_barrier()
(Uladzislau Rezki)
* tag 'slab-for-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
mm/slab/kvfree_rcu: Switch to WQ_MEM_RECLAIM wq
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
"Restore the previous behavior of the ACPI platform_profile sysfs
interface that has been changed recently in a way incompatible with
the existing user space (Mario Limonciello)"
* tag 'acpi-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
platform/x86/amd: pmf: Add balanced-performance to hidden choices
platform/x86/amd: pmf: Add 'quiet' to hidden choices
ACPI: platform_profile: Add support for hidden choices
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull core dumping fix from Kees Cook:
- Only sort VMAs when core_sort_vma sysctl is set
* tag 'execve-v6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
coredump: Only sort VMAs when core_sort_vma sysctl is set
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- fix leaked extent map after error when reading chunks
- replace use of deprecated strncpy
- in zoned mode, fixed range when ulocking extent range, causing a hang
* tag 'for-6.14-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix a leaked chunk map issue in read_one_chunk()
btrfs: replace deprecated strncpy() with strscpy()
btrfs: zoned: fix extent range end unlock in cow_file_range()
|
|
Pull block fixes from Jens Axboe:
- NVMe pull request via Keith:
- TCP use after free fix on polling (Sagi)
- Controller memory buffer cleanup fixes (Icenowy)
- Free leaking requests on bad user passthrough commands (Keith)
- TCP error message fix (Maurizio)
- TCP corruption fix on partial PDU (Maurizio)
- TCP memory ordering fix for weakly ordered archs (Meir)
- Type coercion fix on message error for TCP (Dan)
- Name the RQF flags enum, fixing issues with anon enums and BPF import
of it
- ublk parameter setting fix
- GPT partition 7-bit conversion fix
* tag 'block-6.14-20250306' of git://git.kernel.dk/linux:
block: Name the RQF flags enum
nvme-tcp: fix signedness bug in nvme_tcp_init_connection()
block: fix conversion of GPT partition name to 7-bit
ublk: set_params: properly check if parameters can be applied
nvmet-tcp: Fix a possible sporadic response drops in weakly ordered arch
nvme-tcp: fix potential memory corruption in nvme_tcp_recv_pdu()
nvme-tcp: Fix a C2HTermReq error message
nvmet: remove old function prototype
nvme-ioctl: fix leaked requests on mapping error
nvme-pci: skip CMB blocks incompatible with PCI P2P DMA
nvme-pci: clean up CMBMSC when registering CMB fails
nvme-tcp: fix possible UAF in nvme_tcp_poll
|
|
Pull io_uring fix from Jens Axboe:
"A single fix for a regression introduced in the 6.14 merge window,
causing stalls/hangs with IOPOLL reads or writes"
* tag 'io_uring-6.14-20250306' of git://git.kernel.dk/linux:
io_uring/rw: ensure reissue path is correctly handled for IOPOLL
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc scheduler fixes from Ingo Molnar:
- Fix deadline scheduler sysctl parameter setting bug
- Fix RT scheduler sysctl parameter setting bug
- Fix possible memory corruption in child_cfs_rq_on_list()
* tag 'sched-urgent-2025-03-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/rt: Update limit of sched_rt sysctl in documentation
sched/deadline: Use online cpus for validating runtime
sched/fair: Fix potential memory corruption in child_cfs_rq_on_list
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf event fixes from Ingo Molnar:
"Fix a race between PMU registration and event creation, and fix
pmus_lock vs. pmus_srcu lock ordering"
* tag 'perf-urgent-2025-03-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Fix perf_pmu_register() vs. perf_init_event()
perf/core: Fix pmus_lock vs. pmus_srcu ordering
|
|
Add the remaining SHF_ flags, as listed in the "Executable and
Linkable Format" Wikipedia page and the System V Application Binary
Interface[1]. This allows drivers to load and parse ELF images that use
some of those flags.
In particular, an upcoming change to the Nouveau GPU driver will
use some of the flags.
Link: https://refspecs.linuxfoundation.org/elf/gabi4+/ch4.sheader.html#sh_flags [1]
Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Link: https://lore.kernel.org/r/20250307171417.267488-1-ttabi@nvidia.com
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 fixes from Ingo Molnar:
- Fix CPUID leaf 0x2 parsing bugs
- Sanitize very early boot parameters to avoid crash
- Fix size overflows in the SGX code
- Make CALL_NOSPEC use consistent
* tag 'x86-urgent-2025-03-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/boot: Sanitize boot params before parsing command line
x86/sgx: Fix size overflows in sgx_encl_create()
x86/cpu: Properly parse CPUID leaf 0x2 TLB descriptor 0x63
x86/cpu: Validate CPUID leaf 0x2 EDX output
x86/cacheinfo: Validate CPUID leaf 0x2 EDX output
x86/speculation: Add a conditional CS prefix to CALL_NOSPEC
x86/speculation: Simplify and make CALL_NOSPEC consistent
|
|
This reverts commit 4d94f05558271654670d18c26c912da0c1c15549 which has
problems (see [1]) and is no longer needed since 581dd2dc168f
("Bluetooth: hci_event: Fix using rcu_read_(un)lock while iterating")
has reworked the code where the original bug has been found.
[1] Link: https://lore.kernel.org/linux-bluetooth/877c55ci1r.wl-tiwai@suse.de/T/#t
Fixes: 4d94f0555827 ("Bluetooth: hci_core: Fix sleeping function called from invalid context")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
- xgene-hwmon: Fix a NULL vs IS_ERR_OR_NULL() check
- ad7314: Return error if leading zero bits are non-zero
- ntc_thermistor: Update/fix the ncpXXxh103 sensor table
- pmbus: Initialise page count in pmbus_identify()
- peci/dimmtemp: Do not provide fake threshold data
* tag 'hwmon-for-v6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: fix a NULL vs IS_ERR_OR_NULL() check in xgene_hwmon_probe()
hwmon: (ad7314) Validate leading zero bits and return error
hwmon: (ntc_thermistor) Fix the ncpXXxh103 sensor table
hwmon: (pmbus) Initialise page count in pmbus_identify()
hwmon: (peci/dimmtemp) Do not provide fake thresholds data
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
- protect gpio-aggregator against module unload
- use raw spinlock in gpio-rcar to fix a lockdep splat
- fix OF node leak in gpio-rcar
* tag 'gpio-fixes-for-v6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: rcar: Fix missing of_node_put() call
gpio: rcar: Use raw_spinlock to protect register access
gpio: aggregator: protect driver attr handlers against module unload
|
|
Passive scanning shall only be enabled when disconnecting LE links,
otherwise it may start result in triggering scanning when e.g. an ISO
link disconnects:
> HCI Event: LE Meta Event (0x3e) plen 29
LE Connected Isochronous Stream Established (0x19)
Status: Success (0x00)
Connection Handle: 257
CIG Synchronization Delay: 0 us (0x000000)
CIS Synchronization Delay: 0 us (0x000000)
Central to Peripheral Latency: 10000 us (0x002710)
Peripheral to Central Latency: 10000 us (0x002710)
Central to Peripheral PHY: LE 2M (0x02)
Peripheral to Central PHY: LE 2M (0x02)
Number of Subevents: 1
Central to Peripheral Burst Number: 1
Peripheral to Central Burst Number: 1
Central to Peripheral Flush Timeout: 2
Peripheral to Central Flush Timeout: 2
Central to Peripheral MTU: 320
Peripheral to Central MTU: 160
ISO Interval: 10.00 msec (0x0008)
...
> HCI Event: Disconnect Complete (0x05) plen 4
Status: Success (0x00)
Handle: 257
Reason: Remote User Terminated Connection (0x13)
< HCI Command: LE Set Extended Scan Enable (0x08|0x0042) plen 6
Extended scan: Enabled (0x01)
Filter duplicates: Enabled (0x01)
Duration: 0 msec (0x0000)
Period: 0.00 sec (0x0000)
Fixes: 9fcb18ef3acb ("Bluetooth: Introduce LE auto connect options")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
sco_conn refcount shall not be incremented a second time if the sk
already owns the refcount, so hold only when adding new chan.
Add sco_conn_hold() for clarity, as refcnt is never zero here due to the
sco_conn_add().
Fixes SCO socket shutdown not actually closing the SCO connection.
Fixes: ed9588554943 ("Bluetooth: SCO: remove the redundant sco_conn_put")
Signed-off-by: Pauli Virtanen <pav@iki.fi>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
Automatically configure the altsetting for HCI_USER_CHANNEL when a SCO
is connected.
The motivation is to enable the HCI_USER_CHANNEL user to send out SCO
data through USB Bluetooth chips, which is mainly used for bidirectional
audio transfer (voice call). This was not capable because:
- Per Bluetooth Core Spec v5, Vol 4, Part B, 2.1, the corresponding
alternate setting should be set based on the air mode in order to
transfer SCO data, but
- The Linux Bluetooth HCI_USER_CHANNEL exposes the Bluetooth Host
Controller Interface to the user space, which is something above the
USB layer. The user space is not able to configure the USB alt while
keeping the channel open.
This patch intercepts the HCI_EV_SYNC_CONN_COMPLETE packets in btusb,
extracts the air mode, and configures the alt setting in btusb.
This patch is tested on ChromeOS devices. The USB Bluetooth models
(CVSD, TRANS alt3 and alt6) could work without a customized kernel.
Fixes: b16b327edb4d ("Bluetooth: btusb: add sysfs attribute to control USB alt setting")
Signed-off-by: Hsin-chen Chuang <chharry@chromium.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Ilpo Järvinen:
- amd/pmf:
- Initialize 'cb_mutex'
- Support for new version of PMF-TA
- intel-hid: Fix volume buttons on Microsoft Surface Go 4 tablet
- intel/vsec: Add Diamond Rapids support
- thinkpad_acpi: Add battery quirk for ThinkPad X131e
* tag 'platform-drivers-x86-v6.14-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86/amd/pmf: Update PMF Driver for Compatibility with new PMF-TA
platform/x86/amd/pmf: Propagate PMF-TA return codes
platform/x86/intel/vsec: Add Diamond Rapids support
platform/x86: thinkpad_acpi: Add battery quirk for ThinkPad X131e
platform/x86: intel-hid: fix volume buttons on Microsoft Surface Go 4 tablet
platform/x86/amd/pmf: Initialize and clean up `cb_mutex`
|