Age | Commit message (Collapse) | Author |
|
handle 'tp_vec[]' based requests
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250411054105.2341982-26-mingo@kernel.org
|
|
existing APIs
Instead of constructing a vector on-stack, just use the already
available batch-patching vector - which should always be empty
at this point.
This will allow subsequent simplifications.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250411054105.2341982-25-mingo@kernel.org
|
|
smp_text_poke_batch_finish()/smp_text_poke_batch_flush()/text_poke_addr_ordered()
There's this weird hack used by smp_text_poke_batch_finish() to indicate
a 'forced flush':
smp_text_poke_batch_flush(NULL);
Just open-code the vector-flush in a straightforward fashion:
smp_text_poke_batch_process(tp_vec, tp_vec_nr);
tp_vec_nr = 0;
And get rid of !addr hack from text_poke_addr_ordered().
Leave a WARN_ON_ONCE(), just in case some external code learned
to rely on this behavior.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250411054105.2341982-24-mingo@kernel.org
|
|
tp_order_fail() uses inverted logic: it returns true in case something
is false, which is only a plus at the IOCCC.
Instead rename it to regular parity as 'text_poke_addr_ordered()',
and adjust the code accordingly.
Also add a comment explaining how the address ordering should be
understood.
No change in functionality intended.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250411054105.2341982-23-mingo@kernel.org
|
|
It's possible to escape the text_mutex-held assert in
smp_text_poke_batch_process() if the caller uses a properly
batched and sorted series of patch requests, so add
an explicit lockdep_assert_held() to make sure it's
held by all callers.
All text_poke_int3_*() APIs will call either smp_text_poke_batch_process()
or smp_text_poke_batch_flush() internally.
The text_mutex must be held, because tp_vec and tp_vec_nr et al
are all globals, and the INT3 patching machinery itself relies on
external serialization.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250411054105.2341982-22-mingo@kernel.org
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250411054105.2341982-21-mingo@kernel.org
|
|
Make it clear that this structure is part of the INT3 based
SMP patching facility, not the regular text_poke*() MM-switch
based facility.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250411054105.2341982-19-mingo@kernel.org
|
|
This name is actively confusing as well, because the simple text_poke*()
APIs use MM-switching based code patching, while text_poke_loc_init()
is part of the INT3 based text_poke_int3_*() machinery that is an
additional layer of functionality on top of regular text_poke*() functionality.
Rename it to text_poke_int3_loc_init() to make it clear which layer
it belongs to.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250411054105.2341982-18-mingo@kernel.org
|
|
This name is actively confusing as well, because the simple text_poke*()
APIs use MM-switching based code patching, while text_poke_queue()
is part of the INT3 based text_poke_int3_*() machinery that is an
additional layer of functionality on top of regular text_poke*() functionality.
Rename it to smp_text_poke_batch_add() to make it clear which layer
it belongs to.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250411054105.2341982-17-mingo@kernel.org
|
|
This name is actively confusing as well, because the simple text_poke*()
APIs use MM-switching based code patching, while text_poke_finish()
is part of the INT3 based text_poke_int3_*() machinery that is an
additional layer of functionality on top of regular text_poke*() functionality.
Rename it to smp_text_poke_batch_finish() to make it clear which layer
it belongs to.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250411054105.2341982-16-mingo@kernel.org
|
|
This name is actually actively confusing, because the simple text_poke*()
APIs use MM-switching based code patching, while text_poke_flush()
is part of the INT3 based text_poke_int3_*() machinery that is an
additional layer of functionality on top of regular text_poke*() functionality.
Rename it to smp_text_poke_batch_flush() to make it clear which layer
it belongs to.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250411054105.2341982-15-mingo@kernel.org
|
|
'temp_mm_state_t' abstraction
So the temp_mm_state_t abstraction used by use_temporary_mm() and
unuse_temporary_mm() is super confusing:
- The whole machinery is about temporarily switching to the
text_poke_mm utility MM that got allocated during bootup
for text-patching purposes alone:
temp_mm_state_t prev;
/*
* Loading the temporary mm behaves as a compiler barrier, which
* guarantees that the PTE will be set at the time memcpy() is done.
*/
prev = use_temporary_mm(text_poke_mm);
- Yet the value that gets saved in the temp_mm_state_t variable
is not the temporary MM ... but the previous MM...
- Ie. we temporarily put the non-temporary MM into a variable
that has the temp_mm_state_t type. This makes no sense whatsoever.
- The confusion continues in unuse_temporary_mm():
static inline void unuse_temporary_mm(temp_mm_state_t prev_state)
Here we unuse an MM that is ... not the temporary MM, but the
previous MM. :-/
Fix up all this confusion by removing the unnecessary layer of
abstraction and using a bog-standard 'struct mm_struct *prev_mm'
variable to save the MM to.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250411054105.2341982-14-mingo@kernel.org
|
|
The idtentry macro in entry_64.S hasn't had a create_gap
option for 5 years - update the comment.
(Also clean up the entire comment block while at it.)
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250411054105.2341982-13-mingo@kernel.org
|
|
It's declared in <asm/text-patching.h> already.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250411054105.2341982-12-mingo@kernel.org
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250411054105.2341982-11-mingo@kernel.org
|
|
Put it into the text_poke_* namespace of <asm/text-patching.h>.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250411054105.2341982-10-mingo@kernel.org
|
|
Put it into the text_poke_* namespace of <asm/text-patching.h>.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250411054105.2341982-9-mingo@kernel.org
|
|
All related functions in this subsystem already have a
text_poke_int3_ prefix - add it to the trap handler
as well.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250411054105.2341982-8-mingo@kernel.org
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250411054105.2341982-7-mingo@kernel.org
|
|
'smp_text_poke_batch_process()'
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250411054105.2341982-6-mingo@kernel.org
|
|
Make it clear that these reference counts lock access
to text_poke_array.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250411054105.2341982-5-mingo@kernel.org
|
|
text_poke_int3_vec'
Follow the INT3 text-poking nomenclature, and also adopt the
'vector' name for the entire object, instead of the rather
opaque 'descriptor' naming.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250411054105.2341982-4-mingo@kernel.org
|
|
bit more
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20250411054105.2341982-3-mingo@kernel.org
|
|
sharing in poke_int3_handler()
eBPF programs can be run 50,000,000 times per second on busy servers.
Whenever /proc/sys/kernel/bpf_stats_enabled is turned off,
hundreds of calls sites are patched from text_poke_bp_batch()
and we see a huge loss of performance due to false sharing
on bp_desc.refs lasting up to three seconds.
51.30% server_bin [kernel.kallsyms] [k] poke_int3_handler
|
|--46.45%--poke_int3_handler
| exc_int3
| asm_exc_int3
| |
| |--24.26%--cls_bpf_classify
| | tcf_classify
| | __dev_queue_xmit
| | ip6_finish_output2
| | ip6_output
| | ip6_xmit
| | inet6_csk_xmit
| | __tcp_transmit_skb
Fix this by replacing bp_desc.refs with a per-cpu bp_refs.
Before the patch, on a host with 240 cores (480 threads):
$ sysctl -wq kernel.bpf_stats_enabled=0
text_poke_bp_batch(nr_entries=164) : Took 2655300 usec
$ bpftool prog | grep run_time_ns
...
105: sched_cls name hn_egress tag 699fc5eea64144e3 gpl run_time_ns
3009063719 run_cnt 82757845 : average cost is 36 nsec per call
After this patch:
$ sysctl -wq kernel.bpf_stats_enabled=0
text_poke_bp_batch(nr_entries=164) : Took 702 usec
$ bpftool prog | grep run_time_ns
...
105: sched_cls name hn_egress tag 699fc5eea64144e3 gpl run_time_ns
1928223019 run_cnt 67682728 : average cost is 28 nsec per call
Ie. text-patching performance improved 3700x: from 2.65 seconds
to 0.0007 seconds.
Since the atomic_cond_read_acquire(refs, !VAL) spin-loop was not triggered
even once in my tests, add an unlikely() annotation, because this appears
to be the common case.
[ mingo: Improved the changelog some more. ]
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20250411054105.2341982-2-mingo@kernel.org
|
|
|
|
The "real" linux/types.h UAPI header gracefully degrades to a NOOP when
included from assembly code.
Mirror this behaviour in the tools/ variant.
Test for __ASSEMBLER__ over __ASSEMBLY__ as the former is provided by the
toolchain automatically.
Reported-by: Mark Brown <broonie@kernel.org>
Closes: https://lore.kernel.org/lkml/af553c62-ca2f-4956-932c-dd6e3a126f58@sirena.org.uk/
Fixes: c9fbaa879508 ("selftests: vDSO: parse_vdso: Use UAPI headers instead of libc headers")
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Link: https://patch.msgid.link/20250321-uapi-consistency-v1-1-439070118dc0@linutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Pull turbostat updates from Len Brown:
- support up to 8192 processors
- add cpuidle governor debug telemetry, disabled by default
- update default output to exclude cpuidle invocation counts
- bug fixes
* tag 'turbostat-2025.05.06' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
tools/power turbostat: v2025.05.06
tools/power turbostat: disable "cpuidle" invocation counters, by default
tools/power turbostat: re-factor sysfs code
tools/power turbostat: Restore GFX sysfs fflush() call
tools/power turbostat: Document GNR UncMHz domain convention
tools/power turbostat: report CoreThr per measurement interval
tools/power turbostat: Increase CPU_SUBSET_MAXCPUS to 8192
tools/power turbostat: Add idle governor statistics reporting
tools/power turbostat: Fix names matching
tools/power turbostat: Allow Zero return value for some RAPL registers
tools/power turbostat: Clustered Uncore MHz counters should honor show/hide options
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire
Pull soundwire fix from Vinod Koul:
- add missing config symbol CONFIG_SND_HDA_EXT_CORE required for asoc
driver CONFIG_SND_SOF_SOF_HDA_SDW_BPT
* tag 'soundwire-6.15-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
ASoC: SOF: Intel: Let SND_SOF_SOF_HDA_SDW_BPT select SND_HDA_EXT_CORE
|
|
Support up to 8192 processors
Add cpuidle governor debug telemetry, disabled by default
Update default output to exclude cpuidle invocation counts
Bug fixes
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Create "pct_idle" counter group, the sofware notion of residency
so it can now be singled out, independent of other counter groups.
Create "cpuidle" group, the cpuidle invocation counts.
Disable "cpuidle", by default.
Create "swidle" = "cpuidle" + "pct_idle".
Undocument "sysfs", the old name for "swidle", but keep it working
for backwards compatibilty.
Create "hwidle", all the HW idle counters
Modify "idle", enabled by default
"idle" = "hwidle" + "pct_idle" (and now excludes "cpuidle")
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf event fix from Ingo Molnar:
"Fix a perf events time accounting bug"
* tag 'perf-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Fix child_total_time_enabled accounting bug at task exit
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
- Fix a nonsensical Kconfig combination
- Remove an unnecessary rseq-notification
* tag 'sched-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rseq: Eliminate useless task_work on execve
sched/isolation: Make CONFIG_CPU_ISOLATION depend on CONFIG_SMP
|
|
... and don't error out so hard on missing module descriptions.
Before commit 6c6c1fc09de3 ("modpost: require a MODULE_DESCRIPTION()")
we used to warn about missing module descriptions, but only when
building with extra warnigns (ie 'W=1').
After that commit the warning became an unconditional hard error.
And it turns out not all modules have been converted despite the claims
to the contrary. As reported by Damian Tometzki, the slub KUnit test
didn't have a module description, and apparently nobody ever really
noticed.
The reason nobody noticed seems to be that the slub KUnit tests get
disabled by SLUB_TINY, which also ends up disabling a lot of other code,
both in tests and in slub itself. And so anybody doing full build tests
didn't actually see this failre.
So let's disable SLUB_TINY for build-only tests, since it clearly ends
up limiting build coverage. Also turn the missing module descriptions
error back into a warning, but let's keep it around for non-'W=1'
builds.
Reported-by: Damian Tometzki <damian@riscv-rocks.de>
Link: https://lore.kernel.org/all/01070196099fd059-e8463438-7b1b-4ec8-816d-173874be9966-000000@eu-central-1.amazonses.com/
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Fixes: 6c6c1fc09de3 ("modpost: require a MODULE_DESCRIPTION()")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Probe cpuidle "sysfs" residency and counts separately,
since soon we will make one disabled on, and the
other disabled off.
Clarify that some BIC (build-in-counters) are actually "groups".
since we're about to re-name some of those groups.
no functional change.
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Do fflush() to discard the buffered data, before each read of the
graphics sysfs knobs.
Fixes: ba99a4fc8c24 ("tools/power turbostat: Remove unnecessary fflush() call")
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Document that on Intel Granite Rapids Systems,
Uncore domains 0-2 are CPU domains, and
uncore domains 3-4 are IO domains.
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
The CoreThr column displays total thermal throttling events
since boot time.
Change it to report events during the measurement interval.
This is more useful for showing a user the current conditions.
Total events since boot time are still available to the user via
/sys/devices/system/cpu/cpu*/thermal_throttle/*
Document CoreThr on turbostat.8
Fixes: eae97e053fe30 ("turbostat: Support thermal throttle count print")
Reported-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: Chen Yu <yu.c.chen@intel.com>
|
|
On systems with >= 1024 cpus (in my case 1152), turbostat fails with the error output:
"turbostat: /sys/fs/cgroup/cpuset.cpus.effective: cpu str malformat 0-1151"
A similar error appears with the use of turbostat --cpu when the inputted cpu
range contains a cpu number >= 1024:
# turbostat -c 1100-1151
"--cpu 1100-1151" malformed
...
Both errors are caused by parse_cpu_str() reaching its limit of CPU_SUBSET_MAXCPUS.
It's a good idea to limit the maximum cpu number being parsed, but 1024 is too low.
For a small increase in compute and allocated memory, increasing CPU_SUBSET_MAXCPUS
brings support for parsing cpu numbers >= 1024.
Increase CPU_SUBSET_MAXCPUS to 8192, a common setting for CONFIG_NR_CPUS on x86_64.
Signed-off-by: Justin Ernst <justin.ernst@hpe.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer cleanups from Thomas Gleixner:
"A set of final cleanups for the timer subsystem:
- Convert all del_timer[_sync]() instances over to the new
timer_delete[_sync]() API and remove the legacy wrappers.
Conversion was done with coccinelle plus some manual fixups as
coccinelle chokes on scoped_guard().
- The final cleanup of the hrtimer_init() to hrtimer_setup()
conversion.
This has been delayed to the end of the merge window, so that all
patches which have been merged through other trees are in mainline
and all new users are catched.
Doing this right before rc1 ensures that new code which is merged post
rc1 is not introducing new instances of the original functionality"
* tag 'timers-cleanups-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tracing/timers: Rename the hrtimer_init event to hrtimer_setup
hrtimers: Rename debug_init_on_stack() to debug_setup_on_stack()
hrtimers: Rename debug_init() to debug_setup()
hrtimers: Rename __hrtimer_init_sleeper() to __hrtimer_setup_sleeper()
hrtimers: Remove unnecessary NULL check in hrtimer_start_range_ns()
hrtimers: Make callback function pointer private
hrtimers: Merge __hrtimer_init() into __hrtimer_setup()
hrtimers: Switch to use __htimer_setup()
hrtimers: Delete hrtimer_init()
treewide: Convert new and leftover hrtimer_init() users
treewide: Switch/rename to timer_delete[_sync]()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull more irq updates from Thomas Gleixner:
"A set of updates for the interrupt subsystem:
- A treewide cleanup for the irq_domain code, which makes the naming
consistent and gets rid of the original oddity of naming domains
'host'.
This is a trivial mechanical change and is done late to ensure that
all instances have been catched and new code merged post rc1 wont
reintroduce new instances.
- A trivial consistency fix in the migration code
The recent introduction of irq_force_complete_move() in the core
code, causes a problem for the nostalgia crowd who maintains ia64
out of tree.
The code assumes that hierarchical interrupt domains are enabled
and dereferences irq_data::parent_data unconditionally. That works
in mainline because both architectures which enable that code have
hierarchical domains enabled. Though it breaks the ia64 build,
which enables the functionality, but does not have hierarchical
domains.
While it's not really a problem for mainline today, this
unconditional dereference is inconsistent and trivially fixable by
using the existing helper function irqd_get_parent_data(), which
has the appropriate #ifdeffery in place"
* tag 'irq-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq/migration: Use irqd_get_parent_data() in irq_force_complete_move()
irqdomain: Stop using 'host' for domain
irqdomain: Rename irq_get_default_host() to irq_get_default_domain()
irqdomain: Rename irq_set_default_host() to irq_set_default_domain()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Thomas Gleixner:
"A revert to fix a adjtimex() regression:
The recent change to prevent that time goes backwards for the coarse
time getters due to immediate multiplier adjustments via adjtimex(),
changed the way how the timekeeping core treats that.
That change result in a regression on the adjtimex() side, which is
user space visible:
1) The forwarding of the base time moves the update out of the
original period and establishes a new one. That's changing the
behaviour of the [PF]LL control, which user space expects to be
applied periodically.
2) The clearing of the accumulated NTP error due to #1, changes the
behaviour as well.
An attempt to delay the multiplier/frequency update to the next tick
did not solve the problem as userspace expects that the multiplier or
frequency updates are in effect, when the syscall returns.
There is a different solution for the coarse time problem available,
so revert the offending commit to restore the existing adjtimex()
behaviour"
* tag 'timers-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Revert "timekeeping: Fix possible inconsistencies in _COARSE clockids"
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux
Pull sh updates from John Paul Adrian Glaubitz:
"One important fix and one small configuration update.
The first patch by Artur Rojek fixes an issue with the J2 firmware
loader not being able to find the location of the device tree blob due
to insufficient alignment of the .bss section which rendered J2 boards
unbootable.
The second patch by Johan Korsnes updates the defconfigs on sh to drop
the CONFIG_NET_CLS_TCINDEX configuration option which became obsolete
after 8c710f75256b ("net/sched: Retire tcindex classifier").
Summary:
- sh: defconfig: Drop obsolete CONFIG_NET_CLS_TCINDEX
- sh: Align .bss section padding to 8-byte boundary"
* tag 'sh-for-v6.15-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux:
sh: defconfig: Drop obsolete CONFIG_NET_CLS_TCINDEX
sh: Align .bss section padding to 8-byte boundary
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Improve performance in gendwarfksyms
- Remove deprecated EXTRA_*FLAGS and KBUILD_ENABLE_EXTRA_GCC_CHECKS
- Support CONFIG_HEADERS_INSTALL for ARCH=um
- Use more relative paths to sources files for better reproducibility
- Support the loong64 Debian architecture
- Add Kbuild bash completion
- Introduce intermediate vmlinux.unstripped for architectures that need
static relocations to be stripped from the final vmlinux
- Fix versioning in Debian packages for -rc releases
- Treat missing MODULE_DESCRIPTION() as an error
- Convert Nios2 Makefiles to use the generic rule for built-in DTB
- Add debuginfo support to the RPM package
* tag 'kbuild-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (40 commits)
kbuild: rpm-pkg: build a debuginfo RPM
kconfig: merge_config: use an empty file as initfile
nios2: migrate to the generic rule for built-in DTB
rust: kbuild: skip `--remap-path-prefix` for `rustdoc`
kbuild: pacman-pkg: hardcode module installation path
kbuild: deb-pkg: don't set KBUILD_BUILD_VERSION unconditionally
modpost: require a MODULE_DESCRIPTION()
kbuild: make all file references relative to source root
x86: drop unnecessary prefix map configuration
kbuild: deb-pkg: add comment about future removal of KDEB_COMPRESS
kbuild: Add a help message for "headers"
kbuild: deb-pkg: remove "version" variable in mkdebian
kbuild: deb-pkg: fix versioning for -rc releases
Documentation/kbuild: Fix indentation in modules.rst example
x86: Get rid of Makefile.postlink
kbuild: Create intermediate vmlinux build with relocations preserved
kbuild: Introduce Kconfig symbol for linking vmlinux with relocations
kbuild: link-vmlinux.sh: Make output file name configurable
kbuild: do not generate .tmp_vmlinux*.map when CONFIG_VMLINUX_MAP=y
Revert "kheaders: Ignore silly-rename files"
...
|
|
Pull drm fixes from Dave Airlie:
"Weekly fixes, mostly from the end of last week, this week was very
quiet, maybe you scared everyone away. It's mostly amdgpu, and xe,
with some i915, adp and bridge bits, since I think this is overly
quiet I'd expect rc2 to be a bit more lively.
bridge:
- tda998x: Select CONFIG_DRM_KMS_HELPER
amdgpu:
- Guard against potential division by 0 in fan code
- Zero RPM support for SMU 14.0.2
- Properly handle SI and CIK support being disabled
- PSR fixes
- DML2 fixes
- DP Link training fix
- Vblank fixes
- RAS fixes
- Partitioning fix
- SDMA fix
- SMU 13.0.x fixes
- Rom fetching fix
- MES fixes
- Queue reset fix
xe:
- Fix NULL pointer dereference on error path
- Add missing HW workaround for BMG
- Fix survivability mode not triggering
- Fix build warning when DRM_FBDEV_EMULATION is not set
i915:
- Bounds check for scalers in DSC prefill latency computation
- Fix build by adding a missing include
adp:
- Fix error handling in plane setup"
# -----BEGIN PGP SIGNATURE-----
* tag 'drm-next-2025-04-05' of https://gitlab.freedesktop.org/drm/kernel: (34 commits)
drm/i2c: tda998x: select CONFIG_DRM_KMS_HELPER
drm/amdgpu/gfx12: fix num_mec
drm/amdgpu/gfx11: fix num_mec
drm/amd/pm: Add gpu_metrics_v1_8
drm/amdgpu: Prefer shadow rom when available
drm/amd/pm: Update smu metrics table for smu_v13_0_6
drm/amd/pm: Remove host limit metrics support
Remove unnecessary firmware version check for gc v9_4_2
drm/amdgpu: stop unmapping MQD for kernel queues v3
Revert "drm/amdgpu/sdma_v4_4_2: update VM flush implementation for SDMA"
drm/amdgpu: Parse all deferred errors with UMC aca handle
drm/amdgpu: Update ta ras block
drm/amdgpu: Add NPS2 to DPX compatible mode
drm/amdgpu: Use correct gfx deferred error count
drm/amd/display: Actually do immediate vblank disable
drm/amd/display: prevent hang on link training fail
Revert "drm/amd/display: dml2 soc dscclk use DPM table clk setting"
drm/amd/display: Increase vblank offdelay for PSR panels
drm/amd: Handle being compiled without SI or CIK support better
drm/amd/pm: Add zero RPM enabled OD setting support for SMU14.0.2
...
|
|
The rpm-pkg make target currently suffers from a few issues related to
debuginfo:
1. debuginfo for things built into the kernel (vmlinux) is not available
in any RPM produced by make rpm-pkg. This makes using tools like
systemtap against a make rpm-pkg kernel impossible.
2. debug source for the kernel is not available. This means that
commands like 'disas /s' in gdb, which display source intermixed with
assembly, can only print file names/line numbers which then must be
painstakingly resolved to actual source in a separate editor.
3. debuginfo for modules is available, but it remains bundled with the
.ko files that contain module code, in the main kernel RPM. This is a
waste of space for users who do not need to debug the kernel (i.e.
most users).
Address all of these issues by additionally building a debuginfo RPM
when the kernel configuration allows for it, in line with standard
patterns followed by RPM distributors. With these changes:
1. systemtap now works (when these changes are backported to 6.11, since
systemtap lags a bit behind in compatibility), as verified by the
following simple test script:
# stap -e 'probe kernel.function("do_sys_open").call { printf("%s\n", $$parms); }'
dfd=0xffffffffffffff9c filename=0x7fe18800b160 flags=0x88800 mode=0x0
...
2. disas /s works correctly in gdb, with source and disassembly
interspersed:
# gdb vmlinux --batch -ex 'disas /s blk_op_str'
Dump of assembler code for function blk_op_str:
block/blk-core.c:
125 {
0xffffffff814c8740 <+0>: endbr64
127
128 if (op < ARRAY_SIZE(blk_op_name) && blk_op_name[op])
0xffffffff814c8744 <+4>: mov $0xffffffff824a7378,%rax
0xffffffff814c874b <+11>: cmp $0x23,%edi
0xffffffff814c874e <+14>: ja 0xffffffff814c8768 <blk_op_str+40>
0xffffffff814c8750 <+16>: mov %edi,%edi
126 const char *op_str = "UNKNOWN";
0xffffffff814c8752 <+18>: mov $0xffffffff824a7378,%rdx
127
128 if (op < ARRAY_SIZE(blk_op_name) && blk_op_name[op])
0xffffffff814c8759 <+25>: mov -0x7dfa0160(,%rdi,8),%rax
126 const char *op_str = "UNKNOWN";
0xffffffff814c8761 <+33>: test %rax,%rax
0xffffffff814c8764 <+36>: cmove %rdx,%rax
129 op_str = blk_op_name[op];
130
131 return op_str;
132 }
0xffffffff814c8768 <+40>: jmp 0xffffffff81d01360 <__x86_return_thunk>
End of assembler dump.
3. The size of the main kernel package goes down substantially,
especially if many modules are built (quite typical). Here is a
comparison of installed size of the kernel package (configured with
allmodconfig, dwarf4 debuginfo, and module compression turned off)
before and after this patch:
# rpm -qi kernel-6.13* | grep -E '^(Version|Size)'
Version : 6.13.0postpatch+
Size : 1382874089
Version : 6.13.0prepatch+
Size : 17870795887
This is a ~92% size reduction.
Note that a debuginfo package can only be produced if the following
configs are set:
- CONFIG_DEBUG_INFO=y
- CONFIG_MODULE_COMPRESS=n
- CONFIG_DEBUG_INFO_SPLIT=n
The first of these is obvious - we can't produce debuginfo if the build
does not generate it. The second two requirements can in principle be
removed, but doing so is difficult with the current approach, which uses
a generic rpmbuild script find-debuginfo.sh that processes all packaged
executables. If we want to remove those requirements the best path
forward is likely to add some debuginfo extraction/installation logic to
the modules_install target (controllable by flags). That way, it's
easier to operate on modules before they're compressed, and the logic
can be reused by all packaging targets.
Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
The scripts/kconfig/merge_config.sh script requires an existing
$INITFILE (or the $1 argument) as a base file for merging Kconfig
fragments. However, an empty $INITFILE can serve as an initial starting
point, later referenced by the KCONFIG_ALLCONFIG Makefile variable
if -m is not used. This variable can point to any configuration file
containing preset config symbols (the merged output) as stated in
Documentation/kbuild/kconfig.rst. When -m is used $INITFILE will
contain just the merge output requiring the user to run make (i.e.
KCONFIG_ALLCONFIG=<$INITFILE> make <allnoconfig/alldefconfig> or make
olddefconfig).
Instead of failing when `$INITFILE` is missing, create an empty file and
use it as the starting point for merges.
Signed-off-by: Daniel Gomez <da.gomez@samsung.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Commit 654102df2ac2 ("kbuild: add generic support for built-in boot
DTBs") introduced generic support for built-in DTBs.
Select GENERIC_BUILTIN_DTB when built-in DTB support is enabled.
To keep consistency across architectures, this commit also renames
CONFIG_NIOS2_DTB_SOURCE_BOOL to CONFIG_BUILTIN_DTB, and
CONFIG_NIOS2_DTB_SOURCE to CONFIG_BUILTIN_DTB_NAME.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
This option was removed from Kconfig in 8c710f75256b ("net/sched:
Retire tcindex classifier") but from the defconfigs.
Fixes: 8c710f75256b ("net/sched: Retire tcindex classifier")
Signed-off-by: Johan Korsnes <johan.korsnes@gmail.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
|
|
J2-based devices expect to find a device tree blob at the end of the
.bss section. As of a77725a9a3c5 ("scripts/dtc: Update to upstream
version v1.6.1-19-g0a3a9d3449c8"), libfdt enforces 8-byte alignment
for the DTB, causing J2 devices to fail early in sh_fdt_init().
As the J2 loader firmware calculates the DTB location based on the kernel
image .bss section size rather than the __bss_stop symbol offset, the
required alignment can't be enforced with BSS_SECTION(0, PAGE_SIZE, 8).
To fix this, inline a modified version of the above macro which grows
.bss by the required size. While this change affects all existing SH
boards, it should be benign on platforms which don't need this alignment.
Signed-off-by: Artur Rojek <contact@artur-rojek.eu>
Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Tested-by: Rob Landley <rob@landley.net>
Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:
- a brand new driver for touchpads and touchbars in newer Apple devices
- support for Berlin-A series in goodix-berlin touchscreen driver
- improvements to matrix_keypad driver to better handle GPIOs toggling
- assorted small cleanups in other input drivers
* tag 'input-for-v6.15-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: goodix_berlin - add support for Berlin-A series
dt-bindings: input: goodix,gt9916: Document gt9897 compatible
dt-bindings: input: matrix_keypad - add wakeup-source property
dt-bindings: input: matrix_keypad - add missing property
Input: pm8941-pwrkey - fix dev_dbg() output in pm8941_pwrkey_irq()
Input: synaptics - hide unused smbus_pnp_ids[] array
Input: apple_z2 - fix potential confusion in Kconfig
Input: matrix_keypad - use fsleep for delays after activating columns
Input: matrix_keypad - add settle time after enabling all columns
dt-bindings: input: matrix_keypad: add settle time after enabling all columns
dt-bindings: input: matrix_keypad: convert to YAML
dt-bindings: input: Correct indentation and style in DTS example
MAINTAINERS: Add entries for Apple Z2 touchscreen driver
Input: apple_z2 - add a driver for Apple Z2 touchscreens
dt-bindings: input: touchscreen: Add Z2 controller
Input: Switch to use hrtimer_setup()
Input: drop vb2_ops_wait_prepare/finish
|