Age | Commit message (Collapse) | Author |
|
If a global function doesn't return, and its prototype has the
__noreturn attribute, its weak counterpart must also not return so that
it matches the prototype and meets call site expectations.
To properly follow the compiled control flow at the call sites, change
the global_noreturns check to include both global and weak functions.
On the other hand, if a weak function isn't in global_noreturns, assume
the prototype doesn't have __noreturn. Even if the weak function
doesn't return, call sites treat it like a returnable function.
Fixes the following warning:
kernel/sched/build_policy.o: warning: objtool: do_idle() falls through to next function play_idle_precise()
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lore.kernel.org/r/ede3460d63f4a65d282c86f1175bd2662c2286ba.1681342859.git.jpoimboe@kernel.org
|
|
In preparation for improving objtool's handling of weak noreturn
functions, mark nmi_panic_self_stop() __noreturn.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/316fc6dfab5a8c4e024c7185484a1ee5fb0afb79.1681342859.git.jpoimboe@kernel.org
|
|
In preparation for improving objtool's handling of weak noreturn
functions, mark panic_smp_self_stop() __noreturn.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/92d76ab5c8bf660f04fdcd3da1084519212de248.1681342859.git.jpoimboe@kernel.org
|
|
In preparation for marking panic_smp_self_stop() __noreturn across the
kernel, first mark the arm64 implementation of cpu_park_loop() and
related functions __noreturn.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/55787d3193ea3e295ccbb097abfab0a10ae49d45.1681342859.git.jpoimboe@kernel.org
|
|
Now that start_kernel() is __noreturn, mark its chain of callers
__noreturn.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/c2525f96b88be98ee027ee0291d58003036d4120.1681342859.git.jpoimboe@kernel.org
|
|
Now that arch_call_rest_init() is __noreturn, mark its caller
start_kernel() __noreturn.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/7069acf026a195f26a88061227fba5a3b0337b9a.1681342859.git.jpoimboe@kernel.org
|
|
In preparation for improving objtool's handling of weak noreturn
functions, mark start_kernel(), arch_call_rest_init(), and rest_init()
__noreturn.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/r/7194ed8a989a85b98d92e62df660f4a90435a723.1681342859.git.jpoimboe@kernel.org
|
|
Allow unwinding from prefix code by copying the CFI from the starting
instruction of the corresponding function. Even when the NOPs are
replaced, they're still stack-invariant instructions so the same ORC
entry can be reused everywhere.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/bc3344e51f3e87102f1301a0be0f72a7689ea4a4.1681331135.git.jpoimboe@kernel.org
|
|
CFI typed functions are failing to get padded properly for
CONFIG_CALL_PADDING.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/721f0da48d2a49fe907225711b8b76a2b787f9a8.1681331135.git.jpoimboe@kernel.org
|
|
Simplify the prefix code by moving it after
validate_reachable_instructions().
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/d7f31ac2de462d0cd7b1db01b7ecb525c057c8f6.1681331135.git.jpoimboe@kernel.org
|
|
annotate_call_site() already sets 'insn->dead_end' for calls to dead end
functions.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/5d603a301e9a8b1036b61503385907e154867ace.1681325924.git.jpoimboe@kernel.org
|
|
Add [sec_]for_each_sym() and use them.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/59023e5886ab125aa30702e633be7732b1acaa7e.1681325924.git.jpoimboe@kernel.org
|
|
It's easier to use and also gives easy access to the instruction's
containing function, which is useful for printing that function's
symbol. It will also be useful in the future for rate-limiting and
disassembly of warned functions.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/2eaa3155c90fba683d8723599f279c46025b75f3.1681325924.git.jpoimboe@kernel.org
|
|
Allow specifying multiple functions on the cmdline. Note this removes
the secret EXTRA_ARGS feature.
While at it, spread out the awk to make it more readable.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/0bf5f4f5978660985037b24c6db49b114374eb4d.1681325924.git.jpoimboe@kernel.org
|
|
With KCSAN enabled, even empty inline stubs can be out-of-lined.
Force the context_tracking_guest_exit() stub inline.
Fixes the following warnings:
vmlinux.o: warning: objtool: vmx_vcpu_enter_exit+0x1be: call to context_tracking_guest_exit() leaves .noinstr.text section
vmlinux.o: warning: objtool: svm_vcpu_enter_exit+0x85: call to context_tracking_guest_exit() leaves .noinstr.text section
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/dc93f45abdec90c171108b4b590b7fff5790963c.1681320026.git.jpoimboe@kernel.org
|
|
If a function has a large stack frame, the stackleak plugin adds a call
to stackleak_track_stack() after the prologue.
This function may be called in uaccess-enabled code. Add it to the
uaccess safe list.
Fixes the following warning:
vmlinux.o: warning: objtool: kasan_report+0x12: call to stackleak_track_stack() with UACCESS enabled
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/42e9b487ef89e9b237fd5220ad1c7cf1a2ad7eb8.1681320562.git.jpoimboe@kernel.org
|
|
Commit 468af56a7bba ("objtool: Support addition to set CFA base") was
added as a preparatory patch for arm64 support, but that support never
came. It triggers a false positive warning on x86, so just revert it
for now.
Fixes the following warning:
vmlinux.o: warning: objtool: cdce925_regmap_i2c_write+0xdb: stack state mismatch: cfa1=4+120 cfa2=5+40
Fixes: 468af56a7bba ("objtool: Support addition to set CFA base")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/oe-kbuild-all/202304080538.j5G6h1AB-lkp@intel.com/
|
|
Fixes the following warning:
vmlinux.o: warning: objtool: check_stackleak_irqoff+0x2b6: call to _printk() leaves .noinstr.text section
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/ee5209f53aa0a62aea58be18f2b78b17606779a6.1681320026.git.jpoimboe@kernel.org
|
|
With KCSAN enabled, end_of_stack() can get out-of-lined. Force it
inline.
Fixes the following warnings:
vmlinux.o: warning: objtool: check_stackleak_irqoff+0x2b: call to end_of_stack() leaves .noinstr.text section
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/cc1b4d73d3a428a00d206242a68fdf99a934ca7b.1681320026.git.jpoimboe@kernel.org
|
|
Mark reported that the ORC unwinder incorrectly marks an unwind as
reliable when the unwind terminates prematurely in the dark corners of
return_to_handler() due to lack of information about the next frame.
The problem is UNWIND_HINT_EMPTY is used in two different situations:
1) The end of the kernel stack unwind before hitting user entry, boot
code, or fork entry
2) A blind spot in ORC coverage where the unwinder has to bail due to
lack of information about the next frame
The ORC unwinder has no way to tell the difference between the two.
When it encounters an undefined stack state with 'end=1', it blindly
marks the stack reliable, which can break the livepatch consistency
model.
Fix it by splitting UNWIND_HINT_EMPTY into UNWIND_HINT_UNDEFINED and
UNWIND_HINT_END_OF_STACK.
Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/fd6212c8b450d3564b855e1cb48404d6277b4d9f.1677683419.git.jpoimboe@kernel.org
|
|
The ENTRY unwind hint type is serving double duty as both an empty
unwind hint and an unret validation annotation.
Unret validation is unrelated to unwinding. Separate it out into its own
annotation.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/ff7448d492ea21b86d8a90264b105fbd0d751077.1677683419.git.jpoimboe@kernel.org
|
|
Unwind hints and ORC entry types are two distinct things. Separate them
out more explicitly.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/cc879d38fff8a43f8f7beb2fd56e35a5a384d7cd.1677683419.git.jpoimboe@kernel.org
|
|
The most important argument is 'type', make that one first.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/d994f8c29376c5618c75698df28fc03b52d3a868.1677683419.git.jpoimboe@kernel.org
|
|
They produce the needed relocations while using half the space.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/bed05c64e28200220c9b1754a2f3ce71f73076ea.1677683419.git.jpoimboe@kernel.org
|
|
Reduce the amount of header sync churn by splitting the shared objtool.h
types into a new file.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/dec622720851210ceafa12d4f4c5f9e73c832152.1677683419.git.jpoimboe@kernel.org
|
|
Include <linux/cpu.h> to make sure arch_cpu_idle_dead() matches its
prototype going forward.
Inspired-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/20230214083857.50163-1-philmd@linaro.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
|
Before commit 076cbf5d2163 ("x86/xen: don't let xen_pv_play_dead()
return"), in Xen, when a previously offlined CPU was brought back
online, it unexpectedly resumed execution where it left off in the
middle of the idle loop.
There were some hacks to make that work, but the behavior was surprising
as do_idle() doesn't expect an offlined CPU to return from the dead (in
arch_cpu_idle_dead()).
Now that Xen has been fixed, and the arch-specific implementations of
arch_cpu_idle_dead() also don't return, give it a __noreturn attribute.
This will cause the compiler to complain if an arch-specific
implementation might return. It also improves code generation for both
caller and callee.
Also fixes the following warning:
vmlinux.o: warning: objtool: do_idle+0x25f: unreachable instruction
Reported-by: Paul E. McKenney <paulmck@kernel.org>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/r/60d527353da8c99d4cf13b6473131d46719ed16d.1676358308.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
|
arch_cpu_idle_dead() should never return. Make it so.
Link: https://lore.kernel.org/r/cf5ad95eef50f7704bb30e7770c59bfe23372af7.1676358308.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
|
cpu_die() doesn't return. Annotate it as such. By extension this also
makes arch_cpu_idle_dead() noreturn.
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Link: https://lore.kernel.org/r/ad801544cab7c26a0f3bbf7cfefb67303f4cd866.1676358308.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
|
cpu_die() doesn't return. Make that more explicit with a BUG().
BUG() is preferable to unreachable() because BUG() is a more explicit
failure mode and avoids undefined behavior like falling off the edge of
the function into whatever code happens to be next.
Link: https://lore.kernel.org/r/cca346b5c87693499e630291d78fb0bf12c24290.1676358308.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
|
play_dead() doesn't return. Annotate it as such. By extension this
also makes arch_cpu_idle_dead() noreturn.
Link: https://lore.kernel.org/r/f3a069e6869c51ccfdda656b76882363bc9fcfa4.1676358308.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
|
After commit 076cbf5d2163 ("x86/xen: don't let xen_pv_play_dead()
return"), play_dead() never returns. Make that more explicit with a
BUG().
BUG() is preferable to unreachable() because BUG() is a more explicit
failure mode and avoids undefined behavior like falling off the edge of
the function into whatever code happens to be next.
Link: https://lore.kernel.org/r/11e6ac1cf10f92967882926e3ac16287b50642f2.1676358308.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
|
cpu_play_dead() doesn't return. Annotate it as such. By extension this
also makes arch_cpu_idle_dead() noreturn.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/847fdb53cc7124bb7c94e3e104e443a29be85184.1676358308.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
|
Include <linux/cpu.h> to make sure arch_cpu_idle_dead() matches its
prototype going forward.
Link: https://lore.kernel.org/r/3d9661e97828fb464a48d4becf18f12604831903.1676358308.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
|
play_dead() doesn't return. Annotate it as such. By extension this
also makes arch_cpu_idle_dead() noreturn.
Link: https://lore.kernel.org/r/03549a74fad9f73576d57e6fc0b5102322f9cff4.1676358308.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
|
play_dead() doesn't return. Make that more explicit with a BUG().
BUG() is preferable to unreachable() because BUG() is a more explicit
failure mode and avoids undefined behavior like falling off the edge of
the function into whatever code happens to be next.
Link: https://lore.kernel.org/r/d0c3ff5349adfe8fd227acc236ae2c278a05eb4c.1676358308.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
|
start_secondary_resume() doesn't return. Annotate it as such. By
extension this also makes arch_cpu_idle_dead() noreturn.
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Link: https://lore.kernel.org/r/b6b2141f832d8cd8ade65f190d04b011cda5f9bb.1676358308.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
|
play_dead() doesn't return. Annotate it as such. By extension this
also makes arch_cpu_idle_dead() noreturn.
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/2897b51a9b8beb5b594fe66fb1d3a479ddd2a0e2.1676358308.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
|
play_dead() doesn't return. Make that more explicit with a BUG().
BUG() is preferable to unreachable() because BUG() is a more explicit
failure mode and avoids undefined behavior like falling off the edge of
the function into whatever code happens to be next.
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/b195e4da190bb06b7d4af15d66ce6129e2347630.1676358308.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
|
Include <asm/smp.h> to make sure play_dead() matches its prototype going
forward.
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lkml.kernel.org/r/20230216184249.ogaqsaykottpxtcb@treble
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
|
play_dead() doesn't return. Annotate it as such. By extension this
also makes arch_cpu_idle_dead() noreturn.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/4da55acfdec8a9132c4e21ffb7edb1f846841193.1676358308.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
|
play_dead() doesn't return. Make that more explicit with a BUG().
BUG() is preferable to unreachable() because BUG() is a more explicit
failure mode and avoids undefined behavior like falling off the edge of
the function into whatever code happens to be next.
Link: https://lore.kernel.org/r/21245d687ffeda34dbcf04961a2df3724f04f7c8.1676358308.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
|
play_dead() doesn't return. Annotate it as such. By extension this
also makes arch_cpu_idle_dead() noreturn.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/7575bb38417bd8bcb5be980443f99cab29319342.1676358308.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
|
arch_cpu_idle_dead() doesn't return. Make that more explicit with a
BUG().
BUG() is preferable to unreachable() because BUG() is a more explicit
failure mode and avoids undefined behavior like falling off the edge of
the function into whatever code happens to be next.
Acked-by: Guo Ren <guoren@kernel.org>
Link: https://lore.kernel.org/r/1e9ecc3d248e82973e80bc336fc9f97e3ba2708d.1676358308.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
|
cpu_die() doesn't return. Annotate it as such. By extension this also
makes arch_cpu_idle_dead() noreturn.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lkml.kernel.org/r/20230216184157.4hup6y6mmspr2kll@treble
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
|
arch_cpu_idle_dead() doesn't return. Make that visible to the compiler
with an unreachable() code annotation.
Link: https://lkml.kernel.org/r/20230216183851.s5bnvniomq44rytu@treble
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
|
arch_cpu_idle_dead() doesn't return. Make that more explicit with a
BUG().
BUG() is preferable to unreachable() because BUG() is a more explicit
failure mode and avoids undefined behavior like falling off the edge of
the function into whatever code happens to be next.
Link: https://lore.kernel.org/r/19ffef09a175fecb783abcd44d6bcfeade2857eb.1676358308.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
|
Include <linux/cpu.h> to make sure arch_cpu_idle_dead() matches its
prototype going forward.
Link: https://lore.kernel.org/r/b0405c2ac5686303b6026e1ac27cfd769b21a7d0.1676358308.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
|
|
|
Commit aa47a7c215e7 ("lib/cpumask: deprecate nr_cpumask_bits") resulted
in the cpumask operations potentially becoming hugely less efficient,
because suddenly the cpumask was always considered to be variable-sized.
The optimization was then later added back in a limited form by commit
6f9c07be9d02 ("lib/cpumask: add FORCE_NR_CPUS config option"), but that
FORCE_NR_CPUS option is not useful in a generic kernel and more of a
special case for embedded situations with fixed hardware.
Instead, just re-introduce the optimization, with some changes.
Instead of depending on CPUMASK_OFFSTACK being false, and then always
using the full constant cpumask width, this introduces three different
cpumask "sizes":
- the exact size (nr_cpumask_bits) remains identical to nr_cpu_ids.
This is used for situations where we should use the exact size.
- the "small" size (small_cpumask_bits) is the NR_CPUS constant if it
fits in a single word and the bitmap operations thus end up able
to trigger the "small_const_nbits()" optimizations.
This is used for the operations that have optimized single-word
cases that get inlined, notably the bit find and scanning functions.
- the "large" size (large_cpumask_bits) is the NR_CPUS constant if it
is an sufficiently small constant that makes simple "copy" and
"clear" operations more efficient.
This is arbitrarily set at four words or less.
As a an example of this situation, without this fixed size optimization,
cpumask_clear() will generate code like
movl nr_cpu_ids(%rip), %edx
addq $63, %rdx
shrq $3, %rdx
andl $-8, %edx
callq memset@PLT
on x86-64, because it would calculate the "exact" number of longwords
that need to be cleared.
In contrast, with this patch, using a MAX_CPU of 64 (which is quite a
reasonable value to use), the above becomes a single
movq $0,cpumask
instruction instead, because instead of caring to figure out exactly how
many CPU's the system has, it just knows that the cpumask will be a
single word and can just clear it all.
Note that this does end up tightening the rules a bit from the original
version in another way: operations that set bits in the cpumask are now
limited to the actual nr_cpu_ids limit, whereas we used to do the
nr_cpumask_bits thing almost everywhere in the cpumask code.
But if you just clear bits, or scan for bits, we can use the simpler
compile-time constants.
In the process, remove 'cpumask_complement()' and 'for_each_cpu_not()'
which were not useful, and which fundamentally have to be limited to
'nr_cpu_ids'. Better remove them now than have somebody introduce use
of them later.
Of course, on x86-64 with MAXSMP there is no sane small compile-time
constant for the cpumask sizes, and we end up using the actual CPU bits,
and will generate the above kind of horrors regardless. Please don't
use MAXSMP unless you really expect to have machines with thousands of
cores.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|