Age | Commit message (Collapse) | Author |
|
Pull bcachefs fixes from Kent Overstreet:
- Some fixes to help with filesystem analysis: ensure superblock
error count gets written if we go ERO, don't discard the journal
aggressively (so it's available for list_journal -a)
- Fix lost wakeup on arm causing us to get stuck when reading btree
nodes
- Fix fsck failing to exit on ctrl-c
- An additional fix for filesystems with misaligned bucket sizes: we
now ensure that allocations are properly aligned
- Setting background target but not promote target will now leave that
data cached on the foreground target, as it used to
- Revert a change to when we allocate the VFS superblock, this was done
for implementing blk_holder_ops but ended up not being needed, and
allocating a superblock and not setting SB_BORN while we do recovery
caused sync() calls and other things to hang
- Assorted fixes for harmless error messages that caused concern to
users
* tag 'bcachefs-2025-05-08' of git://evilpiepirate.org/bcachefs:
bcachefs: Don't aggressively discard the journal
bcachefs: Ensure superblock gets written when we go ERO
bcachefs: Filter out harmless EROFS error messages
bcachefs: journal_shutdown is EROFS, not EIO
bcachefs: Call bch2_fs_start before getting vfs superblock
bcachefs: fix hung task timeout in journal read
bcachefs: Add missing barriers before wake_up_bit()
bcachefs: Ensure proper write alignment
bcachefs: Improve want_cached_ptr()
bcachefs: thread_with_stdio: fix spinning instead of exiting
|
|
in probe()
With UBSAN enabled, we're getting the following trace:
UBSAN: array-index-out-of-bounds in .../drivers/clk/clk-s2mps11.c:186:3
index 0 is out of range for type 'struct clk_hw *[] __counted_by(num)' (aka 'struct clk_hw *[]')
This is because commit f316cdff8d67 ("clk: Annotate struct
clk_hw_onecell_data with __counted_by") annotated the hws member of
that struct with __counted_by, which informs the bounds sanitizer about
the number of elements in hws, so that it can warn when hws is accessed
out of bounds.
As noted in that change, the __counted_by member must be initialised
with the number of elements before the first array access happens,
otherwise there will be a warning from each access prior to the
initialisation because the number of elements is zero. This occurs in
s2mps11_clk_probe() due to ::num being assigned after ::hws access.
Move the assignment to satisfy the requirement of assign-before-access.
Cc: stable@vger.kernel.org
Fixes: f316cdff8d67 ("clk: Annotate struct clk_hw_onecell_data with __counted_by")
Signed-off-by: André Draszik <andre.draszik@linaro.org>
Link: https://lore.kernel.org/r/20250326-s2mps11-ubsan-v1-1-fcc6fce5c8a9@linaro.org
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
Without CONFIG_DRM_XE_GPUSVM set, GPU SVM is not initialized thus below
warning pops. Refine the flush work code to be controlled by the config
to avoid below warning:
"
[ 453.132028] ------------[ cut here ]------------
[ 453.132527] WARNING: CPU: 9 PID: 4491 at kernel/workqueue.c:4205 __flush_work+0x379/0x3a0
[ 453.133355] Modules linked in: xe drm_ttm_helper ttm gpu_sched drm_buddy drm_suballoc_helper drm_gpuvm drm_exec
[ 453.134352] CPU: 9 UID: 0 PID: 4491 Comm: xe_exec_mix_mod Tainted: G U W 6.15.0-rc3+ #7 PREEMPT(full)
[ 453.135405] Tainted: [U]=USER, [W]=WARN
...
[ 453.136921] RIP: 0010:__flush_work+0x379/0x3a0
[ 453.137417] Code: 8b 45 00 48 8b 55 08 89 c7 48 c1 e8 04 83 e7 08 83 e0 0f 83 cf 02 89 c6 48 0f ba 6d 00 03 e9 d5 fe ff ff 0f 0b e9 db fd ff ff <0f> 0b 45 31 e4 e9 d1 fd ff ff 0f 0b e9 03 ff ff ff 0f 0b e9 d6 fe
[ 453.139250] RSP: 0018:ffffc90000c67b18 EFLAGS: 00010246
[ 453.139782] RAX: 0000000000000000 RBX: ffff888108a24000 RCX: 0000000000002000
[ 453.140521] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8881016d61c8
[ 453.141253] RBP: ffff8881016d61c8 R08: 0000000000000000 R09: 0000000000000000
[ 453.141985] R10: 0000000000000000 R11: 0000000008a24000 R12: 0000000000000001
[ 453.142709] R13: 0000000000000002 R14: 0000000000000000 R15: ffff888107db8c00
[ 453.143450] FS: 00007f44853d4c80(0000) GS:ffff8882f469b000(0000) knlGS:0000000000000000
[ 453.144276] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 453.144853] CR2: 00007f4487629228 CR3: 00000001016aa000 CR4: 00000000000406f0
[ 453.145594] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 453.146320] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 453.147061] Call Trace:
[ 453.147336] <TASK>
[ 453.147579] ? tick_nohz_tick_stopped+0xd/0x30
[ 453.148067] ? xas_load+0x9/0xb0
[ 453.148435] ? xa_load+0x6f/0xb0
[ 453.148781] __xe_vm_bind_ioctl+0xbd5/0x1500 [xe]
[ 453.149338] ? dev_printk_emit+0x48/0x70
[ 453.149762] ? _dev_printk+0x57/0x80
[ 453.150148] ? drm_ioctl+0x17c/0x440
[ 453.150544] ? __drm_dev_vprintk+0x36/0x90
[ 453.150983] ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe]
[ 453.151575] ? drm_ioctl_kernel+0x9f/0xf0
[ 453.151998] ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe]
[ 453.152560] drm_ioctl_kernel+0x9f/0xf0
[ 453.152968] drm_ioctl+0x20f/0x440
[ 453.153332] ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe]
[ 453.153893] ? ioctl_has_perm.constprop.0.isra.0+0xae/0x100
[ 453.154489] ? memory_bm_test_bit+0x5/0x60
[ 453.154935] xe_drm_ioctl+0x47/0x70 [xe]
[ 453.155419] __x64_sys_ioctl+0x8d/0xc0
[ 453.155824] do_syscall_64+0x47/0x110
[ 453.156228] entry_SYSCALL_64_after_hwframe+0x76/0x7e
"
v2 (Matt):
refine commit message to have more details
add Fixes tag
move the code to xe_svm.h which already have the config
remove a blank line per codestyle suggestion
Fixes: 63f6e480d115 ("drm/xe: Add SVM garbage collector")
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250502170052.1787973-1-shuicheng.lin@intel.com
(cherry picked from commit 9d80698bcd97a5ad1088bcbb055e73fd068895e2)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
While an event tears down all links to it as an aux, the iteration
happens on the event's group leader instead of the group itself.
If the event is a group leader, it has no effect because the event is
also its own group leader. But otherwise there would be a risk to detach
all the siblings events from the wrong group leader.
It just happens to work because each sibling's aux link is tested
against the right event before proceeding. Also the ctx lock is the same
for the events and their group leader so the iteration is safe.
Yet the iteration is confusing. Clarify the actual intent.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20250424161128.29176-5-frederic@kernel.org
|
|
The CPU hotplug handlers are called twice: at prepare and online stage.
Their role is to:
1) Enable/disable a CPU context. This is irrelevant and even buggy at
the prepare stage because the CPU is still offline. On early
secondary CPU up, creating an event attached to that CPU might
silently fail because the CPU context is observed as online but the
context installation's IPI failure is ignored.
2) Update the scope cpumasks and re-migrate the events accordingly in
the CPU down case. This is irrelevant at the prepare stage.
3) Remove the events attached to the context of the offlining CPU. It
even uses an (unnecessary) IPI for it. This is also irrelevant at the
prepare stage.
Also none of the *_PREPARE and *_STARTING architecture perf related CPU
hotplug callbacks rely on CPUHP_PERF_PREPARE.
CPUHP_AP_PERF_ONLINE is enough and the right place to perform the work.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20250424161128.29176-4-frederic@kernel.org
|
|
The following commit:
da916e96e2de ("perf: Make perf_pmu_unregister() useable")
has introduced two significant event's parent lifecycle changes:
1) An event that has exited now has EVENT_TOMBSTONE as a parent.
This can result in a situation where the delayed wakeup irq_work can
accidentally dereference EVENT_TOMBSTONE on:
CPU 0 CPU 1
----- -----
__schedule()
local_irq_disable()
rq_lock()
<NMI>
perf_event_overflow()
irq_work_queue(&child->pending_irq)
</NMI>
perf_event_task_sched_out()
raw_spin_lock(&ctx->lock)
ctx_sched_out()
ctx->is_active = 0
event_sched_out(child)
raw_spin_unlock(&ctx->lock)
perf_event_release_kernel(parent)
perf_remove_from_context(child)
raw_spin_lock_irq(&ctx->lock)
// Sees !ctx->is_active
// Removes from context inline
__perf_remove_from_context(child)
perf_child_detach(child)
event->parent = EVENT_TOMBSTONE
raw_spin_rq_unlock_irq(rq);
<IRQ>
perf_pending_irq()
perf_event_wakeup(child)
ring_buffer_wakeup(child)
rcu_dereference(child->parent->rb) <--- CRASH
This also concerns the call to kill_fasync() on parent->fasync.
2) The final parent reference count decrement can now happen before the
the final child reference count decrement. ie: the parent can now
be freed before its child. On PREEMPT_RT, this can result in a
situation where the delayed wakeup irq_work can accidentally
dereference a freed parent:
CPU 0 CPU 1 CPU 2
----- ----- ------
perf_pmu_unregister()
pmu_detach_events()
pmu_get_event()
atomic_long_inc_not_zero(&child->refcount)
<NMI>
perf_event_overflow()
irq_work_queue(&child->pending_irq);
</NMI>
<IRQ>
irq_work_run()
wake_irq_workd()
</IRQ>
preempt_schedule_irq()
=========> SWITCH to workd
irq_work_run_list()
perf_pending_irq()
perf_event_wakeup(child)
ring_buffer_wakeup(child)
event = child->parent
perf_event_release_kernel(parent)
// Not last ref, PMU holds it
put_event(child)
// Last ref
put_event(parent)
free_event()
call_rcu(...)
rcu_core()
free_event_rcu()
rcu_dereference(event->rb) <--- CRASH
This also concerns the call to kill_fasync() on parent->fasync.
The "easy" solution to 1) is to check that event->parent is not
EVENT_TOMBSTONE on perf_event_wakeup() (including both ring buffer
and fasync uses).
The "easy" solution to 2) is to turn perf_event_wakeup() to wholefully
run under rcu_read_lock().
However because of 2), sanity would prescribe to make event::parent
an __rcu pointer and annotate each and every users to prove they are
reliable.
Propose an alternate solution and restore the stable pointer to the
parent until all its children have called _free_event() themselves to
avoid any further accident. Also revert the EVENT_TOMBSTONE design
that is mostly here to determine which caller of perf_event_exit_event()
must perform the refcount decrement on a child event matching the
increment in inherit_event().
Arrange instead for checking the attach state of an event prior to its
removal and decrement the refcount of the child accordingly.
Fixes: da916e96e2de ("perf: Make perf_pmu_unregister() useable")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
|
|
When inherit_event() fails after the child allocation but before the
parent refcount has been incremented, calling put_event() wrongly
decrements the reference to the parent, risking to free it too early.
Also pmu_get_event() can't be holding a reference to the child
concurrently at this point since it is under pmus_srcu critical section.
Fix it with restoring the deleted free_event() function and call it on
the failing child in order to free it directly under the verified
assumption that its refcount is only 1. The refcount to the parent is
then voluntarily omitted.
Fixes: da916e96e2de ("perf: Make perf_pmu_unregister() useable")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20250424161128.29176-2-frederic@kernel.org
|
|
xe_force_wake_get() is dependent on xe_pm_runtime_get(), so for
the release path, xe_force_wake_put() should be called first then
xe_pm_runtime_put().
Combine the error path and normal path together with goto.
Fixes: 85d547608ef5 ("drm/xe/xe_gt_debugfs: Update handling of xe_force_wake_get return")
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://lore.kernel.org/r/20250507022302.2187527-1-shuicheng.lin@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 432cd94efdca06296cc5e76d673546f58aa90ee1)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
The workqueue used for the reset worker is marked as WQ_MEM_RECLAIM,
while the GSC one isn't (and can't be as we need to do memory
allocations in the gsc worker). Therefore, we can't flush the latter
from the former.
The reason why we had such a flush was to avoid interrupting either
the GSC FW load or in progress GSC proxy operations. GSC proxy
operations fall into 2 categories:
1) GSC proxy init: this only happens once immediately after GSC FW load
and does not support being interrupted. The only way to recover from
an interruption of the proxy init is to do an FLR and re-load the GSC.
2) GSC proxy request: this can happen in response to a request that
the driver sends to the GSC. If this is interrupted, the GSC FW will
timeout and the driver request will be failed, but overall the GSC
will keep working fine.
Flushing the work allowed us to avoid interruption in both cases (unless
the hang came from the GSC engine itself, in which case we're toast
anyway). However, a failure on a proxy request is tolerable if we're in
a scenario where we're triggering a GT reset (i.e., something is already
gone pretty wrong), so what we really need to avoid is interrupting
the init flow, which we can do by polling on the register that reports
when the proxy init is complete (as that ensure us that all the load and
init operations have been completed).
Note that during suspend we still want to do a flush of the worker to
make sure it completes any operations involving the HW before the power
is cut.
v2: fix spelling in commit msg, rename waiter function (Julia)
Fixes: dd0e89e5edc2 ("drm/xe/gsc: GSC FW load")
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4830
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
Link: https://lore.kernel.org/r/20250502155104.2201469-1-daniele.ceraolospurio@intel.com
(cherry picked from commit 12370bfcc4f0bdf70279ec5b570eb298963422b5)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
LNCF registers report wrong values when XE_FORCEWAKE_GT
only is held. Holding XE_FORCEWAKE_ALL ensures correct
operations on LNCF regs.
V2(Himal):
- Use xe_force_wake_ref_has_domain
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1999
Fixes: a6a4ea6d7d37 ("drm/xe: Add mocs kunit")
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250428082357.1730068-1-tejas.upadhyay@intel.com
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
(cherry picked from commit 70a2585e582058e94fe4381a337be42dec800337)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
For an unknown reason the math to determine the PF queue size does is
not correct - compute UMD applications are overflowing the PF queue
which is fatal. A multippier of 8 fixes the problem.
Fixes: 3338e4f90c14 ("drm/xe: Use topology to determine page fault queue size")
Cc: stable@vger.kernel.org
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jagmeet Randhawa <jagmeet.randhawa@intel.com>
Link: https://lore.kernel.org/r/20250408155915.78770-1-matthew.brost@intel.com
(cherry picked from commit 29582e0ea75c95668d168b12406e3c56cf5a73c4)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
Pull vfio fix from Alex Williamson:
- Fix an issue in vfio-pci huge_fault handling by aligning faults to
the order, resulting in deterministic use of huge pages. This
avoids a race where simultaneous aligned and unaligned faults to
the same PMD can result in a VM_FAULT_OOM and subsequent VM crash.
(Alex Williamson)
* tag 'vfio-v6.15-rc6' of https://github.com/awilliam/linux-vfio:
vfio/pci: Align huge faults to order
|
|
The netkit program is not a cgroup bpf program and should not be shown
in the output of the "bpftool cgroup show" command.
However, if the netkit device happens to have ifindex 3,
the "bpftool cgroup show" command will output the netkit
bpf program as well:
> ip -d link show dev nk1
3: nk1@if2: ...
link/ether ...
netkit mode ...
> bpftool net show
tc:
nk1(3) netkit/peer tw_ns_nk2phy prog_id 469447
> bpftool cgroup show /sys/fs/cgroup/...
ID AttachType AttachFlags Name
... ... ...
469447 netkit_peer tw_ns_nk2phy
The reason is that the target_fd (which is the cgroup_fd here) and
the target_ifindex are in a union in the uapi/linux/bpf.h. The bpftool
iterates all values in "enum bpf_attach_type" which includes
non cgroup attach types like netkit. The cgroup_fd is usually 3 here,
so the bug is triggered when the netkit ifindex just happens
to be 3 as well.
The bpftool's cgroup.c already has a list of cgroup-only attach type
defined in "cgroup_attach_types[]". This patch fixes it by iterating
over "cgroup_attach_types[]" instead of "__MAX_BPF_ATTACH_TYPE".
Cc: Quentin Monnet <qmo@kernel.org>
Reported-by: Takshak Chahande <ctakshak@meta.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <qmo@kernel.org>
Link: https://lore.kernel.org/r/20250507203232.1420762-1-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Since commit 2070887fdeac ("futex: fix restart in wait_requeue_pi"),
futex_wait_requeue_pi() no longer uses restart_block. Update the comment
accordingly.
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250428193445.4571-1-namcao@linutronix.de
|
|
The original PPTT code had a bug where the processor subtable length
was not correctly validated when encountering a truncated
acpi_pptt_processor node.
Commit 7ab4f0e37a0f4 ("ACPI PPTT: Fix coding mistakes in a couple of
sizeof() calls") attempted to fix this by validating the size is as
large as the acpi_pptt_processor node structure. This introduced a
regression where the last processor node in the PPTT table is ignored
if it doesn't contain any private resources. That results errors like:
ACPI PPTT: PPTT table found, but unable to locate core XX (XX)
ACPI: SPE must be homogeneous
Furthermore, it fails in a common case where the node length isn't
equal to the acpi_pptt_processor structure size, leaving the original
bug in a modified form.
Correct the regression by adjusting the loop termination conditions as
suggested by the bug reporters. An additional check performed after
the subtable node type is detected, validates the acpi_pptt_processor
node is fully contained in the PPTT table. Repeating the check in
acpi_pptt_leaf_node() is largely redundant as the node is already
known to be fully contained in the table.
The case where a final truncated node's parent property is accepted,
but the node itself is rejected should not be considered a bug.
Fixes: 7ab4f0e37a0f4 ("ACPI PPTT: Fix coding mistakes in a couple of sizeof() calls")
Reported-by: Maximilian Heyne <mheyne@amazon.de>
Closes: https://lore.kernel.org/linux-acpi/20250506-draco-taped-15f475cd@mheyne-amazon/
Reported-by: Yicong Yang <yangyicong@hisilicon.com>
Closes: https://lore.kernel.org/linux-acpi/20250507035124.28071-1-yangyicong@huawei.com/
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Tested-by: Yicong Yang <yangyicong@hisilicon.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Maximilian Heyne <mheyne@amazon.de>
Cc: All applicable <stable@vger.kernel.org> # 7ab4f0e37a0f4: ACPI PPTT: Fix coding mistakes ...
Link: https://patch.msgid.link/20250508023025.1301030-1-jeremy.linton@arm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Move this API to the canonical timer_*() namespace.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250507175338.672442-10-mingo@kernel.org
|
|
Move this API to the canonical timer_*() namespace.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250507175338.672442-9-mingo@kernel.org
|
|
Move this API to the canonical timers_*() namespace.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250507175338.672442-8-mingo@kernel.org
|
|
Move this macro to the canonical TIMER_* namespace.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250507175338.672442-7-mingo@kernel.org
|
|
Move this API to the canonical __timer_*() namespace.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250507175338.672442-6-mingo@kernel.org
|
|
Move this API to the canonical __timer_*() namespace.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250507175338.672442-5-mingo@kernel.org
|
|
Move this API to the canonical timer_*() namespace.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250507175338.672442-4-mingo@kernel.org
|
|
Move this API to the canonical timer_*() namespace.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250507175338.672442-3-mingo@kernel.org
|
|
Pull in the platform MSI/GIC changes which are seperate for the PCI
endpoint driver updates.
|
|
The latest userspace runtime allows generating commands which do not
have any argument. Remove the corresponding check in driver IOCTL to
enable this use case.
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://lore.kernel.org/r/20250507161500.2339701-1-lizhi.hou@amd.com
Link: https://lore.kernel.org/r/20250507161500.2339701-1-lizhi.hou@amd.com
|
|
This reverts commit f5c68a4e84f9feca3be578199ec648b676db2030.
It is again possible to build "allmodconfig" with the randstruct GCC
plugin, so enable it for COMPILE_TEST to catch future bugs.
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
The recent fix in commit c2ea09b193d2 ("randstruct: gcc-plugin: Remove
bogus void member") has fixed another issue: it was not always detecting
composite structures made only of function pointers and structures of
function pointers. Add a test for this case, and break out the layout
tests since this issue is actually a problem for Clang as well[1].
Link: https://github.com/llvm/llvm-project/issues/138355 [1]
Link: https://lore.kernel.org/r/20250502224116.work.591-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
Perform basic validation about layout randomization and initialization
tracking when using CONFIG_RANDSTRUCT=y. Tested using:
$ ./tools/testing/kunit/kunit.py run \
--kconfig_add CONFIG_RANDSTRUCT_FULL=y \
randstruct
[17:22:30] ================= randstruct (2 subtests) ==================
[17:22:30] [PASSED] randstruct_layout
[17:22:30] [PASSED] randstruct_initializers
[17:22:30] =================== [PASSED] randstruct ====================
[17:22:30] ============================================================
[17:22:30] Testing complete. Ran 2 tests: passed: 2
[17:22:30] Elapsed time: 5.091s total, 0.001s configuring, 4.974s building, 0.086s running
Adding "--make_option LLVM=1" can be used to test Clang, which also
passes.
Acked-by: David Gow <davidgow@google.com>
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
When building the randomized replacement tree of struct members, the
randstruct GCC plugin would insert, as the first member, a 0-sized void
member. This appears as though it was done to catch non-designated
("unnamed") static initializers, which wouldn't be stable since they
depend on the original struct layout order.
This was accomplished by having the side-effect of the "void member"
tripping an assert in GCC internals (count_type_elements) if the member
list ever needed to be counted (e.g. for figuring out the order of members
during a non-designated initialization), which would catch impossible type
(void) in the struct:
security/landlock/fs.c: In function ‘hook_file_ioctl_common’:
security/landlock/fs.c:1745:61: internal compiler error: in count_type_elements, at expr.cc:7075
1745 | .u.op = &(struct lsm_ioctlop_audit) {
| ^
static HOST_WIDE_INT
count_type_elements (const_tree type, bool for_ctor_p)
{
switch (TREE_CODE (type))
...
case VOID_TYPE:
default:
gcc_unreachable ();
}
}
However this is a redundant safety measure since randstruct uses the
__designated_initializer attribute both internally and within the
__randomized_layout attribute macro so that this would be enforced
by the compiler directly even when randstruct was not enabled (via
-Wdesignated-init).
A recent change in Landlock ended up tripping the same member counting
routine when using a full-struct copy initializer as part of an anonymous
initializer. This, however, is a false positive as the initializer is
copying between identical structs (and hence identical layouts). The
"path" member is "struct path", a randomized struct, and is being copied
to from another "struct path", the "f_path" member:
landlock_log_denial(landlock_cred(file->f_cred), &(struct landlock_request) {
.type = LANDLOCK_REQUEST_FS_ACCESS,
.audit = {
.type = LSM_AUDIT_DATA_IOCTL_OP,
.u.op = &(struct lsm_ioctlop_audit) {
.path = file->f_path,
.cmd = cmd,
},
},
...
As can be seen with the coming randstruct KUnit test, there appears to
be no behavioral problems with this kind of initialization when the void
member is removed from the randstruct GCC plugin, so remove it.
Reported-by: "Dr. David Alan Gilbert" <linux@treblig.org>
Closes: https://lore.kernel.org/lkml/Z_PRaKx7q70MKgCA@gallifrey/
Reported-by: Mark Brown <broonie@kernel.org>
Closes: https://lore.kernel.org/lkml/20250407-kbuild-disable-gcc-plugins-v1-1-5d46ae583f5e@kernel.org/
Reported-by: WangYuli <wangyuli@uniontech.com>
Closes: https://lore.kernel.org/lkml/337D5D4887277B27+3c677db3-a8b9-47f0-93a4-7809355f1381@uniontech.com/
Fixes: 313dd1b62921 ("gcc-plugins: Add the randstruct plugin")
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
After a recent change [1] in clang's randstruct implementation to
randomize structures that only contain function pointers, there is an
error because qede_ll_ops get randomized but does not use a designated
initializer for the first member:
drivers/net/ethernet/qlogic/qede/qede_main.c:206:2: error: a randomized struct can only be initialized with a designated initializer
206 | {
| ^
Explicitly initialize the common member using a designated initializer
to fix the build.
Cc: stable@vger.kernel.org
Fixes: 035f7f87b729 ("randstruct: Enable Clang support")
Link: https://github.com/llvm/llvm-project/commit/04364fb888eea6db9811510607bed4b200bcb082 [1]
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/20250507-qede-fix-clang-randstruct-v1-1-5ccc15626fba@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
Recent fixes to the randstruct GCC plugin allowed it to notice
that this structure is entirely function pointers and is therefore
subject to randomization, but doing so requires that it always use
designated initializers. Explicitly specify the "common" member as being
initialized. Silences:
drivers/scsi/qedf/qedf_main.c:702:9: error: positional initialization of field in 'struct' declared with 'designated_init' attribute [-Werror=designated-init]
702 | {
| ^
Fixes: 035f7f87b729 ("randstruct: Enable Clang support")
Link: https://lore.kernel.org/r/20250502224156.work.617-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
GCC 15's new -Wunterminated-string-initialization notices that the 16
character lookup table "zero_uuid" (which is not used as a C-String)
needs to be marked as "nonstring":
drivers/md/bcache/super.c: In function 'uuid_find_empty':
drivers/md/bcache/super.c:549:43: warning: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (17 chars into 16 available) [-Wunterminated-string-initialization]
549 | static const char zero_uuid[16] = "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0";
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Add the annotation (since it is not used as a C-String), and switch the
initializer to an array of bytes rather than an empty initializer,
as preferred by Coly Li.
Suggested-by: Coly Li <colyli@kernel.org>
Link: https://lore.kernel.org/lkml/389A9925-0990-422C-A1B3-0195FAA73288@coly.li/
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
Since the integer wrapping sanitizer's behavior depends on its associated
.scl file, we must force a full rebuild if the file changes. If not,
instrumentation may differ between targets based on when they were built.
Generate a new header file, integer-wrap.h, any time the Clang .scl
file changes. Include the header file in compiler-version.h when its
associated feature name, INTEGER_WRAP, is defined. This will be picked
up by fixdep and force rebuilds where needed.
Acked-by: Justin Stitt <justinstitt@google.com>
Link: https://lore.kernel.org/r/20250503184623.2572355-3-kees@kernel.org
Reviewed-by: Nicolas Schier <n.schier@avm.de>
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
While the randstruct GCC plugin was being rebuilt if the randstruct seed
changed, Clang builds did not notice the change. This could result in
differing struct layouts in a target depending on when it was built.
Include the existing generated header file in compiler-version.h when
its associated feature name, RANDSTRUCT, is defined. This will be picked
up by fixdep and force rebuilds where needed.
Link: https://lore.kernel.org/r/20250503184623.2572355-2-kees@kernel.org
Reviewed-by: Nicolas Schier <n.schier@avm.de>
Tested-by: Nicolas Schier <n.schier@avm.de>
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
There was no dependency between the plugins changing and the rest of the
kernel being built. This could cause strange behaviors as instrumentation
could vary between targets depending on when they were built.
Generate a new header file, gcc-plugins.h, any time the GCC plugins
change. Include the header file in compiler-version.h when its associated
feature name, GCC_PLUGINS, is defined. This will be picked up by fixdep
and force rebuilds where needed.
Add a generic "touch" kbuild command, which will be used again in
a following patch. Add a "normalize_path" string helper to make the
"TOUCH" output less ugly.
Link: https://lore.kernel.org/r/20250503184623.2572355-1-kees@kernel.org
Tested-by: Nicolas Schier <n.schier@avm.de>
Reviewed-by: Nicolas Schier <n.schier@avm.de>
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
Variable Length Arrays (VLAs) on the stack must not be used in the kernel.
Function parameter VLAs[1] should be usable, but -Wvla will warn for
those. For example, this will produce a warning but it is not using a
stack VLA:
int something(size_t n, int array[n]) { ...
Clang has no way yet to distinguish between the VLA types[2], so
depend on GCC for now to keep stack VLAs out of the tree by using GCC's
-Wvla-larger-than=N option (though GCC may split -Wvla similarly[3] to
how Clang is planning to).
While GCC 8+ supports -Wvla-larger-than, only 9+ supports ...=0[4],
so use -Wvla-larger-than=1. Adjust mm/kasan/Makefile to remove it from
CFLAGS (GCC <9 appears unable to disable the warning correctly[5]).
The VLA usage in lib/test_ubsan.c was removed in commit 9d7ca61b1366
("lib/test_ubsan.c: VLA no longer used in kernel") so the lib/Makefile
disabling of VLA checking can be entirely removed.
Link: https://en.cppreference.com/w/c/language/array [1]
Link: https://github.com/llvm/llvm-project/issues/57098 [2]
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98217 [3]
Link: https://lore.kernel.org/lkml/7780883c-0ac8-4aaa-b850-469e33b50672@linux.ibm.com/ [4]
Link: https://lore.kernel.org/r/202505071331.4iOzqmuE-lkp@intel.com/ [5]
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Link: https://lore.kernel.org/r/20250418213235.work.532-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
Simplifies CONFIG_CC_HAS_COUNTED_BY by removing the build test and
relying solely on gcc/clang version numbering (GCC_VERSION >= 150100 and
CLANG_VERSION >= 190103).
The build test was used to allow unreleased gcc 15.0 builds to use the
__counted_by attribute. Now that gcc 15.1.0 has been released, this is
not needed anymore. Note: This will disable __counted_by on unreleased
gcc 15.0 builds.
clang version support for __counted_by remains unchanged.
Link: https://lore.kernel.org/all/Zw8iawAF5W2uzGuh@archlinux/T/#m204c09f63c076586a02d194b87dffc7e81b8de7b
Link: https://lore.kernel.org/r/20241029140036.577804-2-kernel@jfarr.cc
Suggested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Jan Hendrik Farr <kernel@jfarr.cc>
Link: https://lore.kernel.org/r/20250430184231.671365-2-kernel@jfarr.cc
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
Currently, to statically initialize the struct members of the `type`
object created by _DEFINE_FLEX(), the internal `obj` member must be
explicitly referenced at the call site. See:
struct flex {
int a;
int b;
struct foo flex_array[];
};
_DEFINE_FLEX(struct flex, instance, flex_array,
FIXED_SIZE, = {
.obj = {
.a = 0,
.b = 1,
},
});
This leaks _DEFINE_FLEX() internal implementation details and make
the helper harder to use and read.
Fix this and allow for a more natural and intuitive C99 init-style:
_DEFINE_FLEX(struct flex, instance, flex_array,
FIXED_SIZE, = {
.a = 0,
.b = 1,
});
Note that before these changes, the `initializer` argument was optional,
but now it's required.
Also, update "counter" member initialization in DEFINE_FLEX().
Fixes: 26dd68d293fd ("overflow: add DEFINE_FLEX() for on-stack allocs")
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/aBQVeyKfLOkO9Yss@kspp
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
Add a couple of tests for new STACK_FLEX_ARRAY_SIZE() helper.
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/c127631a03cdd7f59bfa091b9666a93bf69d0322.1745355442.git.gustavoars@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
Add new STACK_FLEX_ARRAY_SIZE() helper to get the size of a
flexible-array member defined using DEFINE_FLEX()/DEFINE_RAW_FLEX()
at compile time.
This is essentially the same as ARRAY_SIZE() but for on-stack
flexible-array members.
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/83d53744e11c80eb3f03765238cbe648855f4168.1745355442.git.gustavoars@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/alexghiti/linux into fixes
riscv fixes for 6.15-rc6
- A fix to handle compressed halfword load/store instructions misaligned accesses
- A fix to allow user memory access while handling a misaligned access
- 2 fixes to return an error if the pointer masking extension is not implemented on the platform but userspace still tries to access it, which caused oops on some early platforms
- A fix to prevent the stripping of .rela.dyn so that a vmlinux loaded by kexec can successfully boot
* tag 'riscv-fixes-6.15-rc6' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/alexghiti/linux:
riscv: Disallow PR_GET_TAGGED_ADDR_CTRL without Supm
scripts: Do not strip .rela.dyn section
riscv: Fix kernel crash due to PR_SET_TAGGED_ADDR_CTRL
riscv: misaligned: use get_user() instead of __get_user()
riscv: misaligned: enable IRQs while handling misaligned accesses
riscv: misaligned: factorize trap handling
riscv: misaligned: Add handling for ZCB instructions
|
|
Cross-merge networking fixes after downstream PR (net-6.15-rc6).
No conflicts.
Adjacent changes:
net/core/dev.c:
08e9f2d584c4 ("net: Lock netdevices during dev_shutdown")
a82dc19db136 ("net: avoid potential race between netdev_get_by_index_lock() and netns switch")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add driver for the Novatek NT37801 or NT37810 AMOLED DSI 1440x3200
panel in CMD mode, used on Qualcomm MTP8750 board (SM8750).
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20250508-sm8750-display-panel-v2-2-3ca072e3d1fa@linaro.org
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/20250508-sm8750-display-panel-v2-2-3ca072e3d1fa@linaro.org
|
|
Add bindings for the Novatek NT37801 or NT37810 AMOLED DSI panel.
Sources, like downstream DTS, schematics and hardware manuals, use two
model names (NT37801 and NT37810), so choose one and hope it is correct.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20250508-sm8750-display-panel-v2-1-3ca072e3d1fa@linaro.org
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/20250508-sm8750-display-panel-v2-1-3ca072e3d1fa@linaro.org
|
|
Convert the Truly NT35597 2K display panel bindings to dt-schema.
The vdispp-supply & vdispn-supply are not marked as required since
in practice they are not defined in sdm845-mtp.dts which is the
only used of these bindings.
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20250507-topic-misc-truly-nt35597-yaml-v1-1-bc719ad8dfff@linaro.org
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/20250507-topic-misc-truly-nt35597-yaml-v1-1-bc719ad8dfff@linaro.org
|
|
Reading back the remapped HDP flush register seems to cause
problems on some platforms. All we need is a read, so read back
the memcfg register.
Fixes: 689275140cb8 ("drm/amdgpu/hdp7.0: do a posting read when flushing HDP")
Reported-by: Alexey Klimov <alexey.klimov@linaro.org>
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit dbc064adfcf9095e7d895bea87b2f75c1ab23236)
Cc: stable@vger.kernel.org
|
|
Reading back the remapped HDP flush register seems to cause
problems on some platforms. All we need is a read, so read back
the memcfg register.
Fixes: abe1cbaec6cf ("drm/amdgpu/hdp6.0: do a posting read when flushing HDP")
Reported-by: Alexey Klimov <alexey.klimov@linaro.org>
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 84141ff615951359c9a99696fd79a36c465ed847)
Cc: stable@vger.kernel.org
|
|
Reading back the remapped HDP flush register seems to cause
problems on some platforms. All we need is a read, so read back
the memcfg register.
Fixes: f756dbac1ce1 ("drm/amdgpu/hdp5.2: do a posting read when flushing HDP")
Reported-by: Alexey Klimov <alexey.klimov@linaro.org>
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 4a89b7698e771914b4d5b571600c76e2fdcbe2a9)
Cc: stable@vger.kernel.org
|
|
Reading back the remapped HDP flush register seems to cause
problems on some platforms. All we need is a read, so read back
the memcfg register.
Fixes: cf424020e040 ("drm/amdgpu/hdp5.0: do a posting read when flushing HDP")
Reported-by: Alexey Klimov <alexey.klimov@linaro.org>
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit a5cb344033c7598762e89255e8ff52827abb57a4)
Cc: stable@vger.kernel.org
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from CAN, WiFi and netfilter.
We have still a comple of regressions open due to the recent
drivers locking refactor. The patches are in-flight, but not
ready yet.
Current release - regressions:
- core: lock netdevices during dev_shutdown
- sch_htb: make htb_deactivate() idempotent
- eth: virtio-net: don't re-enable refill work too early
Current release - new code bugs:
- eth: icssg-prueth: fix kernel panic during concurrent Tx queue
access
Previous releases - regressions:
- gre: fix again IPv6 link-local address generation.
- eth: b53: fix learning on VLAN unaware bridges
Previous releases - always broken:
- wifi: fix out-of-bounds access during multi-link element
defragmentation
- can:
- initialize spin lock on device probe
- fix order of unregistration calls
- openvswitch: fix unsafe attribute parsing in output_userspace()
- eth:
- virtio-net: fix total qstat values
- mtk_eth_soc: reset all TX queues on DMA free
- fbnic: firmware IPC mailbox fixes"
* tag 'net-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (55 commits)
virtio-net: fix total qstat values
net: export a helper for adding up queue stats
fbnic: Do not allow mailbox to toggle to ready outside fbnic_mbx_poll_tx_ready
fbnic: Pull fbnic_fw_xmit_cap_msg use out of interrupt context
fbnic: Improve responsiveness of fbnic_mbx_poll_tx_ready
fbnic: Cleanup handling of completions
fbnic: Actually flush_tx instead of stalling out
fbnic: Add additional handling of IRQs
fbnic: Gate AXI read/write enabling on FW mailbox
fbnic: Fix initialization of mailbox descriptor rings
net: dsa: b53: do not set learning and unicast/multicast on up
net: dsa: b53: fix learning on VLAN unaware bridges
net: dsa: b53: fix toggling vlan_filtering
net: dsa: b53: do not program vlans when vlan filtering is off
net: dsa: b53: do not allow to configure VLAN 0
net: dsa: b53: always rejoin default untagged VLAN on bridge leave
net: dsa: b53: fix VLAN ID for untagged vlan on bridge leave
net: dsa: b53: fix flushing old pvid VLAN on pvid change
net: dsa: b53: fix clearing PVID of a port
net: dsa: b53: keep CPU port always tagged again
...
|