Age Commit message Author
2025-04-11x86/alternatives: Assert that smp_text_poke_int3_handler() can only ever ↵Ingo Molnar
handle 'tp_vec[]' based requests Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Juergen Gross <jgross@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250411054105.2341982-26-mingo@kernel.org
2025-04-11x86/alternatives: Simplify smp_text_poke_single() by using tp_vec and ↵Ingo Molnar
existing APIs Instead of constructing a vector on-stack, just use the already available batch-patching vector - which should always be empty at this point. This will allow subsequent simplifications. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Juergen Gross <jgross@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250411054105.2341982-25-mingo@kernel.org
2025-04-11x86/alternatives: Remove the 'addr == NULL means forced-flush' hack from ↵Ingo Molnar
smp_text_poke_batch_finish()/smp_text_poke_batch_flush()/text_poke_addr_ordered() There's this weird hack used by smp_text_poke_batch_finish() to indicate a 'forced flush': smp_text_poke_batch_flush(NULL); Just open-code the vector-flush in a straightforward fashion: smp_text_poke_batch_process(tp_vec, tp_vec_nr); tp_vec_nr = 0; And get rid of !addr hack from text_poke_addr_ordered(). Leave a WARN_ON_ONCE(), just in case some external code learned to rely on this behavior. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Juergen Gross <jgross@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250411054105.2341982-24-mingo@kernel.org
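The change above replaces a "NULL address means flush" convention with an explicit flush. A minimal userspace sketch of that idea follows. All names here (`batch_add()`, `batch_finish()`, `batch_process()`, `BATCH_MAX`) are hypothetical stand-ins and not the kernel's implementation; the point is only that the finish path drains the vector directly instead of passing a magic NULL through the add/flush path.

```c
#include <stddef.h>
#include <stdint.h>

#define BATCH_MAX 4			/* hypothetical capacity */

static uintptr_t batch[BATCH_MAX];	/* pending addresses, kept sorted */
static size_t batch_nr;			/* number of pending entries */
static size_t processed_total;		/* counts entries handed to batch_process() */

/* Stand-in for the real patching step: here it only counts entries. */
static void batch_process(const uintptr_t *vec, size_t nr)
{
	(void)vec;
	processed_total += nr;
}

/* Explicit, open-coded drain: no special address value is needed. */
static void batch_drain(void)
{
	if (batch_nr) {
		batch_process(batch, batch_nr);
		batch_nr = 0;
	}
}

/* Queue one address; drain first if the vector is full or ordering would break. */
static void batch_add(uintptr_t addr)
{
	if (batch_nr == BATCH_MAX ||
	    (batch_nr > 0 && addr < batch[batch_nr - 1]))
		batch_drain();
	batch[batch_nr++] = addr;
}

/* Finish the batch by draining whatever is still pending. */
static void batch_finish(void)
{
	batch_drain();
}
```

With this shape, a caller queues addresses with `batch_add()` and ends with `batch_finish()`; no caller ever needs to pass a sentinel address.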
2025-04-11x86/alternatives: Use non-inverted logic instead of 'tp_order_fail()'Ingo Molnar
tp_order_fail() uses inverted logic: it returns true in case something is false, which is only a plus at the IOCCC. Instead rename it to regular parity as 'text_poke_addr_ordered()', and adjust the code accordingly. Also add a comment explaining how the address ordering should be understood. No change in functionality intended. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Juergen Gross <jgross@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250411054105.2341982-23-mingo@kernel.org
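To illustrate what "regular parity" means here, the following is a minimal sketch of a non-inverted ordering predicate. The function name `addr_ordered()` and its signature are hypothetical, and the choice to accept an address equal to the last queued one is an assumption made for this example; the kernel's exact comparison may differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in: returns true when 'addr' can be appended to the
 * sorted vector 'vec' of 'nr' entries without breaking ascending order.
 * Assumption: an address equal to the last entry counts as ordered.
 */
static bool addr_ordered(const uintptr_t *vec, size_t nr, uintptr_t addr)
{
	if (nr == 0)
		return true;	/* an empty vector is trivially ordered */

	return addr >= vec[nr - 1];
}
```

A caller then reads naturally as `if (!addr_ordered(vec, nr, addr)) flush();` rather than testing a predicate that returns true on failure.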
2025-04-11x86/alternatives: Add text_mutex assert to smp_text_poke_batch_flush()Ingo Molnar
It's possible to escape the text_mutex-held assert in smp_text_poke_batch_process() if the caller uses a properly batched and sorted series of patch requests, so add an explicit lockdep_assert_held() to make sure it's held by all callers. All text_poke_int3_*() APIs will call either smp_text_poke_batch_process() or smp_text_poke_batch_flush() internally. The text_mutex must be held, because tp_vec and tp_vec_nr et al are all globals, and the INT3 patching machinery itself relies on external serialization. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Juergen Gross <jgross@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250411054105.2341982-22-mingo@kernel.org
2025-04-11x86/alternatives: Rename 'int3_desc' to 'int3_vec'Ingo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Juergen Gross <jgross@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250411054105.2341982-21-mingo@kernel.org
2025-04-11x86/alternatives: Rename 'struct text_poke_loc' to 'struct smp_text_poke_loc'Ingo Molnar
Make it clear that this structure is part of the INT3 based SMP patching facility, not the regular text_poke*() MM-switch based facility. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Juergen Gross <jgross@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250411054105.2341982-19-mingo@kernel.org
2025-04-11x86/alternatives: Rename 'text_poke_loc_init()' to 'text_poke_int3_loc_init()'Ingo Molnar
This name is actively confusing as well, because the simple text_poke*() APIs use MM-switching based code patching, while text_poke_loc_init() is part of the INT3 based text_poke_int3_*() machinery that is an additional layer of functionality on top of regular text_poke*() functionality. Rename it to text_poke_int3_loc_init() to make it clear which layer it belongs to. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Juergen Gross <jgross@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250411054105.2341982-18-mingo@kernel.org
2025-04-11x86/alternatives: Rename 'text_poke_queue()' to 'smp_text_poke_batch_add()'Ingo Molnar
This name is actively confusing as well, because the simple text_poke*() APIs use MM-switching based code patching, while text_poke_queue() is part of the INT3 based text_poke_int3_*() machinery that is an additional layer of functionality on top of regular text_poke*() functionality. Rename it to smp_text_poke_batch_add() to make it clear which layer it belongs to. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Juergen Gross <jgross@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250411054105.2341982-17-mingo@kernel.org
2025-04-11x86/alternatives: Rename 'text_poke_finish()' to 'smp_text_poke_batch_finish()'Ingo Molnar
This name is actively confusing as well, because the simple text_poke*() APIs use MM-switching based code patching, while text_poke_finish() is part of the INT3 based text_poke_int3_*() machinery that is an additional layer of functionality on top of regular text_poke*() functionality. Rename it to smp_text_poke_batch_finish() to make it clear which layer it belongs to. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Juergen Gross <jgross@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250411054105.2341982-16-mingo@kernel.org
2025-04-11x86/alternatives: Rename 'text_poke_flush()' to 'smp_text_poke_batch_flush()'Ingo Molnar
This name is actually actively confusing, because the simple text_poke*() APIs use MM-switching based code patching, while text_poke_flush() is part of the INT3 based text_poke_int3_*() machinery that is an additional layer of functionality on top of regular text_poke*() functionality. Rename it to smp_text_poke_batch_flush() to make it clear which layer it belongs to. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Juergen Gross <jgross@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250411054105.2341982-15-mingo@kernel.org
2025-04-11x86/alternatives: Remove the confusing, inaccurate & unnecessary ↵Ingo Molnar
'temp_mm_state_t' abstraction So the temp_mm_state_t abstraction used by use_temporary_mm() and unuse_temporary_mm() is super confusing: - The whole machinery is about temporarily switching to the text_poke_mm utility MM that got allocated during bootup for text-patching purposes alone: temp_mm_state_t prev; /* * Loading the temporary mm behaves as a compiler barrier, which * guarantees that the PTE will be set at the time memcpy() is done. */ prev = use_temporary_mm(text_poke_mm); - Yet the value that gets saved in the temp_mm_state_t variable is not the temporary MM ... but the previous MM... - Ie. we temporarily put the non-temporary MM into a variable that has the temp_mm_state_t type. This makes no sense whatsoever. - The confusion continues in unuse_temporary_mm(): static inline void unuse_temporary_mm(temp_mm_state_t prev_state) Here we unuse an MM that is ... not the temporary MM, but the previous MM. :-/ Fix up all this confusion by removing the unnecessary layer of abstraction and using a bog-standard 'struct mm_struct *prev_mm' variable to save the MM to. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Juergen Gross <jgross@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250411054105.2341982-14-mingo@kernel.org
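The resulting pattern can be sketched in userspace as below. `struct mm_sketch` and `current_mm` are hypothetical stand-ins for the kernel's `struct mm_struct` and the currently loaded MM; the real functions also deal with details this sketch leaves out. It only shows the naming point of the commit: the saved value is the previous MM, so it is held in a plain `prev_mm` pointer.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct mm_struct. */
struct mm_sketch {
	const char *name;
};

/* Hypothetical stand-in for "the MM currently loaded on this CPU". */
static struct mm_sketch *current_mm;

/* Switch to temp_mm and return the previously active MM for later restore. */
static struct mm_sketch *use_temporary_mm(struct mm_sketch *temp_mm)
{
	struct mm_sketch *prev_mm = current_mm;

	current_mm = temp_mm;
	return prev_mm;
}

/* Switch back to the MM that was active before use_temporary_mm(). */
static void unuse_temporary_mm(struct mm_sketch *prev_mm)
{
	current_mm = prev_mm;
}
```

A caller writes `prev_mm = use_temporary_mm(text_poke_mm); ... unuse_temporary_mm(prev_mm);`, and the variable name matches what it holds.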
2025-04-11x86/alternatives: Update comments in int3_emulate_push()Ingo Molnar
The idtentry macro in entry_64.S hasn't had a create_gap option for 5 years - update the comment. (Also clean up the entire comment block while at it.) Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Juergen Gross <jgross@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250411054105.2341982-13-mingo@kernel.org
2025-04-11x86/alternatives: Remove duplicate 'text_poke_early()' prototypeIngo Molnar
It's declared in <asm/text-patching.h> already. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Juergen Gross <jgross@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250411054105.2341982-12-mingo@kernel.org
2025-04-11x86/alternatives: Rename 'bp_desc' to 'int3_desc'Ingo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Juergen Gross <jgross@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250411054105.2341982-11-mingo@kernel.org
2025-04-11x86/alternatives: Rename 'poking_addr' to 'text_poke_mm_addr'Ingo Molnar
Put it into the text_poke_* namespace of <asm/text-patching.h>. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Juergen Gross <jgross@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250411054105.2341982-10-mingo@kernel.org
2025-04-11x86/alternatives: Rename 'poking_mm' to 'text_poke_mm'Ingo Molnar
Put it into the text_poke_* namespace of <asm/text-patching.h>. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Juergen Gross <jgross@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250411054105.2341982-9-mingo@kernel.org
2025-04-11x86/alternatives: Rename 'poke_int3_handler()' to 'smp_text_poke_int3_handler()'Ingo Molnar
All related functions in this subsystem already have a text_poke_int3_ prefix - add it to the trap handler as well. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Juergen Gross <jgross@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250411054105.2341982-8-mingo@kernel.org
2025-04-11x86/alternatives: Rename 'text_poke_bp()' to 'smp_text_poke_single()'Ingo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Juergen Gross <jgross@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250411054105.2341982-7-mingo@kernel.org
2025-04-11x86/alternatives: Rename 'text_poke_bp_batch()' to ↵Ingo Molnar
'smp_text_poke_batch_process()' Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Juergen Gross <jgross@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250411054105.2341982-6-mingo@kernel.org
2025-04-11x86/alternatives: Rename 'bp_refs' to 'text_poke_array_refs'Ingo Molnar
Make it clear that these reference counts lock access to text_poke_array. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Juergen Gross <jgross@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250411054105.2341982-5-mingo@kernel.org
2025-04-11x86/alternatives: Rename 'struct bp_patching_desc' to 'struct ↵Ingo Molnar
text_poke_int3_vec' Follow the INT3 text-poking nomenclature, and also adopt the 'vector' name for the entire object, instead of the rather opaque 'descriptor' naming. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Juergen Gross <jgross@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250411054105.2341982-4-mingo@kernel.org
2025-04-11x86/alternatives: Document the text_poke_bp_batch() synchronization rules a ↵Peter Zijlstra
bit more Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Juergen Gross <jgross@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20250411054105.2341982-3-mingo@kernel.org
2025-04-11x86/alternatives: Improve code-patching scalability by removing false ↵Eric Dumazet
sharing in poke_int3_handler() eBPF programs can be run 50,000,000 times per second on busy servers. Whenever /proc/sys/kernel/bpf_stats_enabled is turned off, hundreds of calls sites are patched from text_poke_bp_batch() and we see a huge loss of performance due to false sharing on bp_desc.refs lasting up to three seconds. 51.30% server_bin [kernel.kallsyms] [k] poke_int3_handler | |--46.45%--poke_int3_handler | exc_int3 | asm_exc_int3 | | | |--24.26%--cls_bpf_classify | | tcf_classify | | __dev_queue_xmit | | ip6_finish_output2 | | ip6_output | | ip6_xmit | | inet6_csk_xmit | | __tcp_transmit_skb Fix this by replacing bp_desc.refs with a per-cpu bp_refs. Before the patch, on a host with 240 cores (480 threads): $ sysctl -wq kernel.bpf_stats_enabled=0 text_poke_bp_batch(nr_entries=164) : Took 2655300 usec $ bpftool prog | grep run_time_ns ... 105: sched_cls name hn_egress tag 699fc5eea64144e3 gpl run_time_ns 3009063719 run_cnt 82757845 : average cost is 36 nsec per call After this patch: $ sysctl -wq kernel.bpf_stats_enabled=0 text_poke_bp_batch(nr_entries=164) : Took 702 usec $ bpftool prog | grep run_time_ns ... 105: sched_cls name hn_egress tag 699fc5eea64144e3 gpl run_time_ns 1928223019 run_cnt 67682728 : average cost is 28 nsec per call Ie. text-patching performance improved 3700x: from 2.65 seconds to 0.0007 seconds. Since the atomic_cond_read_acquire(refs, !VAL) spin-loop was not triggered even once in my tests, add an unlikely() annotation, because this appears to be the common case. [ mingo: Improved the changelog some more. ] Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Juergen Gross <jgross@suse.com> Cc: "H . 
Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Kees Cook <keescook@chromium.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20250411054105.2341982-2-mingo@kernel.org
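The false-sharing fix described above can be sketched with C11 atomics: one reference counter per CPU, each padded to its own cache line, instead of a single shared counter that every CPU's INT3 handler increments. This is a userspace illustration only; `NR_CPUS_SKETCH`, `CACHELINE_SKETCH`, the 64-byte line size and all function names are assumptions, and the kernel uses its own per-CPU and atomic primitives.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS_SKETCH   8	/* hypothetical CPU count */
#define CACHELINE_SKETCH 64	/* assumed cache line size in bytes */

/* One counter per CPU, aligned so no two counters share a cache line. */
struct percpu_ref_sketch {
	_Alignas(CACHELINE_SKETCH) atomic_int refs;
};

static struct percpu_ref_sketch cpu_refs[NR_CPUS_SKETCH];

/* Patcher side: arm every CPU's counter before patching starts. */
static void refs_enable_all(void)
{
	for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		atomic_store(&cpu_refs[cpu].refs, 1);
}

/* Handler side: take a reference on this CPU's counter unless it is zero. */
static bool ref_try_get(int cpu)
{
	int old = atomic_load(&cpu_refs[cpu].refs);

	while (old != 0) {
		/* On failure 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&cpu_refs[cpu].refs, &old, old + 1))
			return true;
	}
	return false;
}

/* Handler side: drop the reference taken by ref_try_get(). */
static void ref_put(int cpu)
{
	atomic_fetch_sub(&cpu_refs[cpu].refs, 1);
}

/*
 * Patcher side: drop the initial reference on every CPU and report how many
 * CPUs still have a handler in flight (the rare case the patcher must wait on).
 */
static int refs_disable_all(void)
{
	int busy = 0;

	for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
		if (atomic_fetch_sub(&cpu_refs[cpu].refs, 1) != 1)
			busy++;
	}
	return busy;
}
```

Because each handler only touches its own CPU's cache line, concurrent INT3 traps no longer bounce a shared line between cores. As a check on the quoted figures: 2655300 usec / 702 usec is roughly 3782, consistent with the "3700x" claim.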
2025-04-06Linux 6.15-rc1v6.15-rc1Linus Torvalds
2025-04-06tools/include: make uapi/linux/types.h usable from assemblyThomas Weißschuh
The "real" linux/types.h UAPI header gracefully degrades to a NOOP when included from assembly code. Mirror this behaviour in the tools/ variant. Test for __ASSEMBLER__ over __ASSEMBLY__ as the former is provided by the toolchain automatically. Reported-by: Mark Brown <broonie@kernel.org> Closes: https://lore.kernel.org/lkml/af553c62-ca2f-4956-932c-dd6e3a126f58@sirena.org.uk/ Fixes: c9fbaa879508 ("selftests: vDSO: parse_vdso: Use UAPI headers instead of libc headers") Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Link: https://patch.msgid.link/20250321-uapi-consistency-v1-1-439070118dc0@linutronix.de Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
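The "degrade to a no-op from assembly" behaviour can be sketched as a header fragment guarded by `__ASSEMBLER__`, which GCC and Clang define automatically when preprocessing assembly sources. The names `sketch_u32` and `SKETCH_FLAG` are hypothetical and only illustrate the pattern; they are not taken from the real header.

```c
/* Header-style fragment: C-only declarations are hidden from the assembler. */
#ifndef __ASSEMBLER__
#include <stdint.h>

typedef uint32_t sketch_u32;	/* hypothetical type, invisible to .S files */
#endif /* __ASSEMBLER__ */

/* Plain preprocessor constants remain usable from both C and assembly. */
#define SKETCH_FLAG 0x1
```

Included from a `.S` file, everything inside the guard is skipped, so the assembler never sees a `typedef` it cannot parse.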
2025-04-06Merge tag 'turbostat-2025.05.06' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux Pull turbostat updates from Len Brown: - support up to 8192 processors - add cpuidle governor debug telemetry, disabled by default - update default output to exclude cpuidle invocation counts - bug fixes * tag 'turbostat-2025.05.06' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: tools/power turbostat: v2025.05.06 tools/power turbostat: disable "cpuidle" invocation counters, by default tools/power turbostat: re-factor sysfs code tools/power turbostat: Restore GFX sysfs fflush() call tools/power turbostat: Document GNR UncMHz domain convention tools/power turbostat: report CoreThr per measurement interval tools/power turbostat: Increase CPU_SUBSET_MAXCPUS to 8192 tools/power turbostat: Add idle governor statistics reporting tools/power turbostat: Fix names matching tools/power turbostat: Allow Zero return value for some RAPL registers tools/power turbostat: Clustered Uncore MHz counters should honor show/hide options
2025-04-06Merge tag 'soundwire-6.15-rc1-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire Pull soundwire fix from Vinod Koul: - add missing config symbol CONFIG_SND_HDA_EXT_CORE required for asoc driver CONFIG_SND_SOF_SOF_HDA_SDW_BPT * tag 'soundwire-6.15-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire: ASoC: SOF: Intel: Let SND_SOF_SOF_HDA_SDW_BPT select SND_HDA_EXT_CORE
2025-04-06tools/power turbostat: v2025.05.06Len Brown
Support up to 8192 processors Add cpuidle governor debug telemetry, disabled by default Update default output to exclude cpuidle invocation counts Bug fixes Signed-off-by: Len Brown <len.brown@intel.com>
2025-04-06tools/power turbostat: disable "cpuidle" invocation counters, by defaultLen Brown
Create "pct_idle" counter group, the software notion of residency, so it can now be singled out, independent of other counter groups. Create "cpuidle" group, the cpuidle invocation counts. Disable "cpuidle", by default. Create "swidle" = "cpuidle" + "pct_idle". Undocument "sysfs", the old name for "swidle", but keep it working for backwards compatibility. Create "hwidle", all the HW idle counters. Modify "idle", enabled by default: "idle" = "hwidle" + "pct_idle" (and now excludes "cpuidle"). Signed-off-by: Len Brown <len.brown@intel.com>
2025-04-06Merge tag 'perf-urgent-2025-04-06' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf event fix from Ingo Molnar: "Fix a perf events time accounting bug" * tag 'perf-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/core: Fix child_total_time_enabled accounting bug at task exit
2025-04-06Merge tag 'sched-urgent-2025-04-06' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: - Fix a nonsensical Kconfig combination - Remove an unnecessary rseq-notification * tag 'sched-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: rseq: Eliminate useless task_work on execve sched/isolation: Make CONFIG_CPU_ISOLATION depend on CONFIG_SMP
2025-04-06Disable SLUB_TINY for build testingLinus Torvalds
... and don't error out so hard on missing module descriptions. Before commit 6c6c1fc09de3 ("modpost: require a MODULE_DESCRIPTION()") we used to warn about missing module descriptions, but only when building with extra warnings (ie 'W=1'). After that commit the warning became an unconditional hard error. And it turns out not all modules have been converted despite the claims to the contrary. As reported by Damian Tometzki, the slub KUnit test didn't have a module description, and apparently nobody ever really noticed. The reason nobody noticed seems to be that the slub KUnit tests get disabled by SLUB_TINY, which also ends up disabling a lot of other code, both in tests and in slub itself. And so anybody doing full build tests didn't actually see this failure. So let's disable SLUB_TINY for build-only tests, since it clearly ends up limiting build coverage. Also turn the missing module descriptions error back into a warning, but let's keep it around for non-'W=1' builds. Reported-by: Damian Tometzki <damian@riscv-rocks.de> Link: https://lore.kernel.org/all/01070196099fd059-e8463438-7b1b-4ec8-816d-173874be9966-000000@eu-central-1.amazonses.com/ Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Jeff Johnson <jeff.johnson@oss.qualcomm.com> Fixes: 6c6c1fc09de3 ("modpost: require a MODULE_DESCRIPTION()") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-04-06tools/power turbostat: re-factor sysfs codeLen Brown
Probe cpuidle "sysfs" residency and counts separately, since soon we will make one enabled by default and the other disabled by default. Clarify that some BIC (built-in counters) are actually "groups", since we're about to rename some of those groups. No functional change. Signed-off-by: Len Brown <len.brown@intel.com>
2025-04-06tools/power turbostat: Restore GFX sysfs fflush() callZhang Rui
Do fflush() to discard the buffered data, before each read of the graphics sysfs knobs. Fixes: ba99a4fc8c24 ("tools/power turbostat: Remove unnecessary fflush() call") Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2025-04-06tools/power turbostat: Document GNR UncMHz domain conventionLen Brown
Document that on Intel Granite Rapids Systems, Uncore domains 0-2 are CPU domains, and uncore domains 3-4 are IO domains. Signed-off-by: Len Brown <len.brown@intel.com>
2025-04-06tools/power turbostat: report CoreThr per measurement intervalLen Brown
The CoreThr column displays total thermal throttling events since boot time. Change it to report events during the measurement interval. This is more useful for showing a user the current conditions. Total events since boot time are still available to the user via /sys/devices/system/cpu/cpu*/thermal_throttle/* Document CoreThr on turbostat.8 Fixes: eae97e053fe30 ("turbostat: Support thermal throttle count print") Reported-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Cc: Chen Yu <yu.c.chen@intel.com>
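Reporting a since-boot counter per measurement interval comes down to subtracting the previous sample from the current one. A minimal sketch follows; the function name is hypothetical and this is not turbostat's actual code.

```c
#include <stdint.h>

/*
 * Hypothetical helper: given two samples of a monotonically increasing
 * since-boot counter, return the number of events in the interval between
 * them. Unsigned subtraction stays correct across a single wraparound.
 */
static uint64_t throttle_events_in_interval(uint64_t prev_total,
					    uint64_t cur_total)
{
	return cur_total - prev_total;
}
```

The caller keeps the last sample around, prints the delta for each interval, and then stores the current sample as the new baseline.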
2025-04-06tools/power turbostat: Increase CPU_SUBSET_MAXCPUS to 8192Justin Ernst
On systems with >= 1024 cpus (in my case 1152), turbostat fails with the error output: "turbostat: /sys/fs/cgroup/cpuset.cpus.effective: cpu str malformat 0-1151" A similar error appears with the use of turbostat --cpu when the inputted cpu range contains a cpu number >= 1024: # turbostat -c 1100-1151 "--cpu 1100-1151" malformed ... Both errors are caused by parse_cpu_str() reaching its limit of CPU_SUBSET_MAXCPUS. It's a good idea to limit the maximum cpu number being parsed, but 1024 is too low. For a small increase in compute and allocated memory, increasing CPU_SUBSET_MAXCPUS brings support for parsing cpu numbers >= 1024. Increase CPU_SUBSET_MAXCPUS to 8192, a common setting for CONFIG_NR_CPUS on x86_64. Signed-off-by: Justin Ernst <justin.ernst@hpe.com> Signed-off-by: Len Brown <len.brown@intel.com>
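The failure mode described above can be reproduced with a small range parser that rejects any CPU number at or above a fixed limit. The sketch below is a simplified stand-in and not turbostat's `parse_cpu_str()`: the name `parse_cpu_list()`, its signature and the accepted grammar (comma-separated numbers and `lo-hi` ranges) are assumptions for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAXCPUS 8192	/* mirrors the new limit named in the commit */

/*
 * Parse a list such as "0-3,8,10-11" into set[] (one byte per CPU).
 * Returns false on malformed input or when a CPU number is >= limit.
 * 'set' must have room for at least 'limit' bytes.
 */
static bool parse_cpu_list(const char *s, unsigned char *set, size_t limit)
{
	memset(set, 0, limit);

	while (*s) {
		char *end;
		unsigned long lo, hi;

		if (*s < '0' || *s > '9')
			return false;
		lo = strtoul(s, &end, 10);
		hi = lo;

		if (*end == '-') {		/* range "lo-hi" */
			s = end + 1;
			if (*s < '0' || *s > '9')
				return false;
			hi = strtoul(s, &end, 10);
		}

		if (hi < lo || hi >= limit)	/* this check is what trips at 1024 */
			return false;

		for (unsigned long cpu = lo; cpu <= hi; cpu++)
			set[cpu] = 1;

		if (*end == ',') {
			end++;
			if (*end == '\0')	/* trailing comma */
				return false;
		} else if (*end != '\0') {
			return false;
		}
		s = end;
	}
	return true;
}
```

With a limit of 1024 the string `1100-1151` is rejected, matching the reported error; with `SKETCH_MAXCPUS` it parses.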
2025-04-06Merge tag 'timers-cleanups-2025-04-06' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer cleanups from Thomas Gleixner: "A set of final cleanups for the timer subsystem: - Convert all del_timer[_sync]() instances over to the new timer_delete[_sync]() API and remove the legacy wrappers. Conversion was done with coccinelle plus some manual fixups as coccinelle chokes on scoped_guard(). - The final cleanup of the hrtimer_init() to hrtimer_setup() conversion. This has been delayed to the end of the merge window, so that all patches which have been merged through other trees are in mainline and all new users are caught. Doing this right before rc1 ensures that new code which is merged post rc1 is not introducing new instances of the original functionality" * tag 'timers-cleanups-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tracing/timers: Rename the hrtimer_init event to hrtimer_setup hrtimers: Rename debug_init_on_stack() to debug_setup_on_stack() hrtimers: Rename debug_init() to debug_setup() hrtimers: Rename __hrtimer_init_sleeper() to __hrtimer_setup_sleeper() hrtimers: Remove unnecessary NULL check in hrtimer_start_range_ns() hrtimers: Make callback function pointer private hrtimers: Merge __hrtimer_init() into __hrtimer_setup() hrtimers: Switch to use __htimer_setup() hrtimers: Delete hrtimer_init() treewide: Convert new and leftover hrtimer_init() users treewide: Switch/rename to timer_delete[_sync]()
2025-04-06Merge tag 'irq-urgent-2025-04-06' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull more irq updates from Thomas Gleixner: "A set of updates for the interrupt subsystem: - A treewide cleanup for the irq_domain code, which makes the naming consistent and gets rid of the original oddity of naming domains 'host'. This is a trivial mechanical change and is done late to ensure that all instances have been caught and new code merged post rc1 won't reintroduce new instances. - A trivial consistency fix in the migration code The recent introduction of irq_force_complete_move() in the core code, causes a problem for the nostalgia crowd who maintains ia64 out of tree. The code assumes that hierarchical interrupt domains are enabled and dereferences irq_data::parent_data unconditionally. That works in mainline because both architectures which enable that code have hierarchical domains enabled. Though it breaks the ia64 build, which enables the functionality, but does not have hierarchical domains. While it's not really a problem for mainline today, this unconditional dereference is inconsistent and trivially fixable by using the existing helper function irqd_get_parent_data(), which has the appropriate #ifdeffery in place" * tag 'irq-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq/migration: Use irqd_get_parent_data() in irq_force_complete_move() irqdomain: Stop using 'host' for domain irqdomain: Rename irq_get_default_host() to irq_get_default_domain() irqdomain: Rename irq_set_default_host() to irq_set_default_domain()
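The "helper with the appropriate #ifdeffery" pattern mentioned in the consistency fix can be sketched as follows. The struct, the `SKETCH_HIERARCHY` macro and the helper name are hypothetical stand-ins chosen for this example, not the kernel's definitions; the point is that callers never touch a field that may not exist in the current configuration.

```c
#include <stddef.h>

/* Hypothetical stand-in for irq_data; the parent pointer is config-dependent. */
struct irq_data_sketch {
	int irq;
#ifdef SKETCH_HIERARCHY
	struct irq_data_sketch *parent_data;
#endif
};

/*
 * Accessor that compiles in both configurations: it returns the parent when
 * hierarchy support is built in and NULL otherwise.
 */
static struct irq_data_sketch *get_parent_data(struct irq_data_sketch *d)
{
#ifdef SKETCH_HIERARCHY
	return d->parent_data;
#else
	(void)d;
	return NULL;
#endif
}
```

Code that dereferences `d->parent_data` directly fails to build without the hierarchy option, while code that goes through the accessor builds either way and only needs a NULL check.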
2025-04-06Merge tag 'timers-urgent-2025-04-06' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Thomas Gleixner: "A revert to fix an adjtimex() regression: The recent change to prevent time from going backwards for the coarse time getters due to immediate multiplier adjustments via adjtimex(), changed how the timekeeping core treats that. That change results in a regression on the adjtimex() side, which is user space visible: 1) The forwarding of the base time moves the update out of the original period and establishes a new one. That's changing the behaviour of the [PF]LL control, which user space expects to be applied periodically. 2) The clearing of the accumulated NTP error due to #1, changes the behaviour as well. An attempt to delay the multiplier/frequency update to the next tick did not solve the problem as userspace expects that the multiplier or frequency updates are in effect, when the syscall returns. There is a different solution for the coarse time problem available, so revert the offending commit to restore the existing adjtimex() behaviour" * tag 'timers-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Revert "timekeeping: Fix possible inconsistencies in _COARSE clockids"
2025-04-06Merge tag 'sh-for-v6.15-tag1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux Pull sh updates from John Paul Adrian Glaubitz: "One important fix and one small configuration update. The first patch by Artur Rojek fixes an issue with the J2 firmware loader not being able to find the location of the device tree blob due to insufficient alignment of the .bss section which rendered J2 boards unbootable. The second patch by Johan Korsnes updates the defconfigs on sh to drop the CONFIG_NET_CLS_TCINDEX configuration option which became obsolete after 8c710f75256b ("net/sched: Retire tcindex classifier"). Summary: - sh: defconfig: Drop obsolete CONFIG_NET_CLS_TCINDEX - sh: Align .bss section padding to 8-byte boundary" * tag 'sh-for-v6.15-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux: sh: defconfig: Drop obsolete CONFIG_NET_CLS_TCINDEX sh: Align .bss section padding to 8-byte boundary
2025-04-05Merge tag 'kbuild-v6.15' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Improve performance in gendwarfksyms - Remove deprecated EXTRA_*FLAGS and KBUILD_ENABLE_EXTRA_GCC_CHECKS - Support CONFIG_HEADERS_INSTALL for ARCH=um - Use more relative paths to sources files for better reproducibility - Support the loong64 Debian architecture - Add Kbuild bash completion - Introduce intermediate vmlinux.unstripped for architectures that need static relocations to be stripped from the final vmlinux - Fix versioning in Debian packages for -rc releases - Treat missing MODULE_DESCRIPTION() as an error - Convert Nios2 Makefiles to use the generic rule for built-in DTB - Add debuginfo support to the RPM package * tag 'kbuild-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (40 commits) kbuild: rpm-pkg: build a debuginfo RPM kconfig: merge_config: use an empty file as initfile nios2: migrate to the generic rule for built-in DTB rust: kbuild: skip `--remap-path-prefix` for `rustdoc` kbuild: pacman-pkg: hardcode module installation path kbuild: deb-pkg: don't set KBUILD_BUILD_VERSION unconditionally modpost: require a MODULE_DESCRIPTION() kbuild: make all file references relative to source root x86: drop unnecessary prefix map configuration kbuild: deb-pkg: add comment about future removal of KDEB_COMPRESS kbuild: Add a help message for "headers" kbuild: deb-pkg: remove "version" variable in mkdebian kbuild: deb-pkg: fix versioning for -rc releases Documentation/kbuild: Fix indentation in modules.rst example x86: Get rid of Makefile.postlink kbuild: Create intermediate vmlinux build with relocations preserved kbuild: Introduce Kconfig symbol for linking vmlinux with relocations kbuild: link-vmlinux.sh: Make output file name configurable kbuild: do not generate .tmp_vmlinux*.map when CONFIG_VMLINUX_MAP=y Revert "kheaders: Ignore silly-rename files" ...
2025-04-05Merge tag 'drm-next-2025-04-05' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds
Pull drm fixes from Dave Airlie: "Weekly fixes, mostly from the end of last week, this week was very quiet, maybe you scared everyone away. It's mostly amdgpu, and xe, with some i915, adp and bridge bits, since I think this is overly quiet I'd expect rc2 to be a bit more lively. bridge: - tda998x: Select CONFIG_DRM_KMS_HELPER amdgpu: - Guard against potential division by 0 in fan code - Zero RPM support for SMU 14.0.2 - Properly handle SI and CIK support being disabled - PSR fixes - DML2 fixes - DP Link training fix - Vblank fixes - RAS fixes - Partitioning fix - SDMA fix - SMU 13.0.x fixes - Rom fetching fix - MES fixes - Queue reset fix xe: - Fix NULL pointer dereference on error path - Add missing HW workaround for BMG - Fix survivability mode not triggering - Fix build warning when DRM_FBDEV_EMULATION is not set i915: - Bounds check for scalers in DSC prefill latency computation - Fix build by adding a missing include adp: - Fix error handling in plane setup" * tag 'drm-next-2025-04-05' of https://gitlab.freedesktop.org/drm/kernel: (34 commits) drm/i2c: tda998x: select CONFIG_DRM_KMS_HELPER drm/amdgpu/gfx12: fix num_mec drm/amdgpu/gfx11: fix num_mec drm/amd/pm: Add gpu_metrics_v1_8 drm/amdgpu: Prefer shadow rom when available drm/amd/pm: Update smu metrics table for smu_v13_0_6 drm/amd/pm: Remove host limit metrics support Remove unnecessary firmware version check for gc v9_4_2 drm/amdgpu: stop unmapping MQD for kernel queues v3 Revert "drm/amdgpu/sdma_v4_4_2: update VM flush implementation for SDMA" drm/amdgpu: Parse all deferred errors with UMC aca handle drm/amdgpu: Update ta ras block drm/amdgpu: Add NPS2 to DPX compatible mode drm/amdgpu: Use correct gfx deferred error count drm/amd/display: Actually do immediate vblank disable drm/amd/display: prevent hang on link training fail Revert "drm/amd/display: dml2 soc dscclk use DPM table clk setting" drm/amd/display: Increase vblank offdelay for PSR panels drm/amd: Handle being compiled without SI or CIK support better drm/amd/pm: Add zero RPM enabled OD setting support for SMU14.0.2 ...
2025-04-06kbuild: rpm-pkg: build a debuginfo RPMUday Shankar
The rpm-pkg make target currently suffers from a few issues related to debuginfo:

1. debuginfo for things built into the kernel (vmlinux) is not available in any RPM produced by make rpm-pkg. This makes using tools like systemtap against a make rpm-pkg kernel impossible.
2. debug source for the kernel is not available. This means that commands like 'disas /s' in gdb, which display source intermixed with assembly, can only print file names/line numbers which then must be painstakingly resolved to actual source in a separate editor.
3. debuginfo for modules is available, but it remains bundled with the .ko files that contain module code, in the main kernel RPM. This is a waste of space for users who do not need to debug the kernel (i.e. most users).

Address all of these issues by additionally building a debuginfo RPM when the kernel configuration allows for it, in line with standard patterns followed by RPM distributors. With these changes:

1. systemtap now works (when these changes are backported to 6.11, since systemtap lags a bit behind in compatibility), as verified by the following simple test script:

   # stap -e 'probe kernel.function("do_sys_open").call { printf("%s\n", $$parms); }'
   dfd=0xffffffffffffff9c filename=0x7fe18800b160 flags=0x88800 mode=0x0
   ...

2. disas /s works correctly in gdb, with source and disassembly interspersed:

   # gdb vmlinux --batch -ex 'disas /s blk_op_str'
   Dump of assembler code for function blk_op_str:
   block/blk-core.c:
   125     {
      0xffffffff814c8740 <+0>:     endbr64

   127
   128             if (op < ARRAY_SIZE(blk_op_name) && blk_op_name[op])
      0xffffffff814c8744 <+4>:     mov    $0xffffffff824a7378,%rax
      0xffffffff814c874b <+11>:    cmp    $0x23,%edi
      0xffffffff814c874e <+14>:    ja     0xffffffff814c8768 <blk_op_str+40>
      0xffffffff814c8750 <+16>:    mov    %edi,%edi

   126             const char *op_str = "UNKNOWN";
      0xffffffff814c8752 <+18>:    mov    $0xffffffff824a7378,%rdx

   127
   128             if (op < ARRAY_SIZE(blk_op_name) && blk_op_name[op])
      0xffffffff814c8759 <+25>:    mov    -0x7dfa0160(,%rdi,8),%rax

   126             const char *op_str = "UNKNOWN";
      0xffffffff814c8761 <+33>:    test   %rax,%rax
      0xffffffff814c8764 <+36>:    cmove  %rdx,%rax

   129                     op_str = blk_op_name[op];
   130
   131             return op_str;
   132     }
      0xffffffff814c8768 <+40>:    jmp    0xffffffff81d01360 <__x86_return_thunk>

   End of assembler dump.

3. The size of the main kernel package goes down substantially, especially if many modules are built (quite typical). Here is a comparison of installed size of the kernel package (configured with allmodconfig, dwarf4 debuginfo, and module compression turned off) before and after this patch:

   # rpm -qi kernel-6.13* | grep -E '^(Version|Size)'
   Version     : 6.13.0postpatch+
   Size        : 1382874089
   Version     : 6.13.0prepatch+
   Size        : 17870795887

   This is a ~92% size reduction.

Note that a debuginfo package can only be produced if the following configs are set:

- CONFIG_DEBUG_INFO=y
- CONFIG_MODULE_COMPRESS=n
- CONFIG_DEBUG_INFO_SPLIT=n

The first of these is obvious - we can't produce debuginfo if the build does not generate it. The second two requirements can in principle be removed, but doing so is difficult with the current approach, which uses a generic rpmbuild script find-debuginfo.sh that processes all packaged executables. If we want to remove those requirements the best path forward is likely to add some debuginfo extraction/installation logic to the modules_install target (controllable by flags). That way, it's easier to operate on modules before they're compressed, and the logic can be reused by all packaging targets.

Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
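The three config prerequisites and the quoted size figures can be checked mechanically. The following is a minimal Python sketch, not the logic used by the kernel's packaging scripts: the helper names (parse_config, can_build_debuginfo, size_reduction_percent) are hypothetical, and it assumes a .config in the usual "CONFIG_X=y" / "# CONFIG_X is not set" form, where "=n" in the commit message means the symbol is not enabled.

```python
import re

# Prerequisites as listed in the commit message: True means the symbol
# must be enabled (=y), False means it must not be enabled.
REQUIRED = {
    "CONFIG_DEBUG_INFO": True,
    "CONFIG_MODULE_COMPRESS": False,
    "CONFIG_DEBUG_INFO_SPLIT": False,
}


def parse_config(text):
    """Return {symbol: value} for each 'CONFIG_X=value' line.

    Symbols that are commented out ('# CONFIG_X is not set') or missing
    are simply absent from the result.
    """
    symbols = {}
    for line in text.splitlines():
        match = re.match(r"^(CONFIG_\w+)=(.*)$", line.strip())
        if match:
            symbols[match.group(1)] = match.group(2)
    return symbols


def can_build_debuginfo(text):
    """Hypothetical check: do all three prerequisites hold for this .config?"""
    symbols = parse_config(text)
    for name, want_enabled in REQUIRED.items():
        enabled = symbols.get(name) == "y"
        if enabled != want_enabled:
            return False
    return True


def size_reduction_percent(before, after):
    """Percentage by which 'after' is smaller than 'before'."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    sample = (
        "CONFIG_DEBUG_INFO=y\n"
        "# CONFIG_MODULE_COMPRESS is not set\n"
        "# CONFIG_DEBUG_INFO_SPLIT is not set\n"
    )
    print(can_build_debuginfo(sample))
    # Installed sizes quoted in the commit message (prepatch, postpatch).
    print(round(size_reduction_percent(17870795887, 1382874089), 1))
```

With the sizes from the rpm -qi output, the computed reduction is a little over 92%, consistent with the "~92%" stated in the message.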
2025-04-06kconfig: merge_config: use an empty file as initfileDaniel Gomez
The scripts/kconfig/merge_config.sh script requires an existing $INITFILE (or the $1 argument) as a base file for merging Kconfig fragments. However, an empty $INITFILE can serve as an initial starting point, later referenced by the KCONFIG_ALLCONFIG Makefile variable if -m is not used. This variable can point to any configuration file containing preset config symbols (the merged output) as stated in Documentation/kbuild/kconfig.rst. When -m is used $INITFILE will contain just the merge output requiring the user to run make (i.e. KCONFIG_ALLCONFIG=<$INITFILE> make <allnoconfig/alldefconfig> or make olddefconfig). Instead of failing when `$INITFILE` is missing, create an empty file and use it as the starting point for merges. Signed-off-by: Daniel Gomez <da.gomez@samsung.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
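The behavioral change (create an empty base file rather than fail) can be illustrated with a small Python model. This is a sketch only and not a port of merge_config.sh: the function names are hypothetical, and the merge rule is simplified to "the last definition of a symbol wins", with no dependency resolution or warnings.

```python
import os


def ensure_initfile(path):
    """Create an empty base file if it is missing, instead of failing."""
    if not os.path.exists(path):
        open(path, "w").close()
    return path


def symbol_of(line):
    """Return the config symbol a line defines, or None for other lines."""
    line = line.strip()
    if line.startswith("CONFIG_") and "=" in line:
        return line.split("=", 1)[0]
    suffix = " is not set"
    if line.startswith("# CONFIG_") and line.endswith(suffix):
        return line[2:-len(suffix)]
    return None


def merge_fragments(initfile, fragments):
    """Simplified merge: later fragments override earlier definitions.

    The result is written back to initfile and also returned as a dict
    mapping each symbol to its final line.
    """
    ensure_initfile(initfile)
    merged = {}
    for path in [initfile, *fragments]:
        with open(path) as handle:
            for line in handle:
                symbol = symbol_of(line)
                if symbol is not None:
                    merged[symbol] = line.strip()
    with open(initfile, "w") as handle:
        for line in merged.values():
            handle.write(line + "\n")
    return merged
```

Calling merge_fragments() with a path that does not exist yet behaves the same as calling it with an empty base file, which is the point of the change described above.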
2025-04-06nios2: migrate to the generic rule for built-in DTBMasahiro Yamada
Commit 654102df2ac2 ("kbuild: add generic support for built-in boot DTBs") introduced generic support for built-in DTBs. Select GENERIC_BUILTIN_DTB when built-in DTB support is enabled. To keep consistency across architectures, this commit also renames CONFIG_NIOS2_DTB_SOURCE_BOOL to CONFIG_BUILTIN_DTB, and CONFIG_NIOS2_DTB_SOURCE to CONFIG_BUILTIN_DTB_NAME. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2025-04-05sh: defconfig: Drop obsolete CONFIG_NET_CLS_TCINDEXJohan Korsnes
This option was removed from Kconfig in 8c710f75256b ("net/sched: Retire tcindex classifier") but not from the defconfigs. Fixes: 8c710f75256b ("net/sched: Retire tcindex classifier") Signed-off-by: Johan Korsnes <johan.korsnes@gmail.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
2025-04-05sh: Align .bss section padding to 8-byte boundaryArtur Rojek
J2-based devices expect to find a device tree blob at the end of the .bss section. As of a77725a9a3c5 ("scripts/dtc: Update to upstream version v1.6.1-19-g0a3a9d3449c8"), libfdt enforces 8-byte alignment for the DTB, causing J2 devices to fail early in sh_fdt_init(). As the J2 loader firmware calculates the DTB location based on the kernel image .bss section size rather than the __bss_stop symbol offset, the required alignment can't be enforced with BSS_SECTION(0, PAGE_SIZE, 8). To fix this, inline a modified version of the above macro which grows .bss by the required size. While this change affects all existing SH boards, it should be benign on platforms which don't need this alignment. Signed-off-by: Artur Rojek <contact@artur-rojek.eu> Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Tested-by: Rob Landley <rob@landley.net> Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
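The arithmetic behind "grows .bss by the required size" can be sketched in Python. This only illustrates rounding a section size up to an 8-byte multiple. It is not the linker-script change itself, the function names are hypothetical, and it assumes the section's start address is already 8-byte aligned so that an aligned size implies an aligned end.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def bss_padding(bss_size, alignment=8):
    """Bytes to add so that the section size is a multiple of alignment."""
    return align_up(bss_size, alignment) - bss_size


if __name__ == "__main__":
    # A size of 0x1234 is 4 bytes short of the next 8-byte multiple.
    print(hex(align_up(0x1234, 8)), bss_padding(0x1234))
```

Because the loader described above derives the DTB location from the section size, padding the size itself is what matters here, rather than only aligning the __bss_stop symbol.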
2025-04-05Merge tag 'input-for-v6.15-rc0' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

Pull input updates from Dmitry Torokhov:

 - a brand new driver for touchpads and touchbars in newer Apple devices
 - support for Berlin-A series in goodix-berlin touchscreen driver
 - improvements to matrix_keypad driver to better handle GPIOs toggling
 - assorted small cleanups in other input drivers

* tag 'input-for-v6.15-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: goodix_berlin - add support for Berlin-A series
  dt-bindings: input: goodix,gt9916: Document gt9897 compatible
  dt-bindings: input: matrix_keypad - add wakeup-source property
  dt-bindings: input: matrix_keypad - add missing property
  Input: pm8941-pwrkey - fix dev_dbg() output in pm8941_pwrkey_irq()
  Input: synaptics - hide unused smbus_pnp_ids[] array
  Input: apple_z2 - fix potential confusion in Kconfig
  Input: matrix_keypad - use fsleep for delays after activating columns
  Input: matrix_keypad - add settle time after enabling all columns
  dt-bindings: input: matrix_keypad: add settle time after enabling all columns
  dt-bindings: input: matrix_keypad: convert to YAML
  dt-bindings: input: Correct indentation and style in DTS example
  MAINTAINERS: Add entries for Apple Z2 touchscreen driver
  Input: apple_z2 - add a driver for Apple Z2 touchscreens
  dt-bindings: input: touchscreen: Add Z2 controller
  Input: Switch to use hrtimer_setup()
  Input: drop vb2_ops_wait_prepare/finish