summaryrefslogtreecommitdiff
path: root/arch/x86/include/asm/alternative.h
AgeCommit message (Collapse)Author
2024-12-02objtool: Collect more annotations in objtool.hPeter Zijlstra
Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org> Link: https://lore.kernel.org/r/20241128094311.786598147@infradead.org
2024-12-02objtool: Convert ANNOTATE_IGNORE_ALTERNATIVE to ANNOTATEPeter Zijlstra
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org> Link: https://lore.kernel.org/r/20241128094311.465691316@infradead.org
2024-11-07x86/module: prepare module loading for ROX allocations of textMike Rapoport (Microsoft)
When module text memory will be allocated with ROX permissions, the memory at the actual address where the module will live will contain invalid instructions and there will be a writable copy that contains the actual module code. Update relocations and alternatives patching to deal with it. [rppt@kernel.org: fix writable address in cfi_rewrite_endbr()] Link: https://lkml.kernel.org/r/ZysRwR29Ji8CcbXc@kernel.org Link: https://lkml.kernel.org/r/20241023162711.2579610-7-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Tested-by: kdevops <kdevops@lists.linux.dev> Tested-by: Nathan Chancellor <nathan@kernel.org> Cc: Andreas Larsson <andreas@gaisler.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov (AMD) <bp@alien8.de> Cc: Brian Cain <bcain@quicinc.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guo Ren <guoren@kernel.org> Cc: Helge Deller <deller@gmx.de> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Song Liu <song@kernel.org> Cc: Stafford Horne <shorne@gmail.com> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Uladzislau Rezki (Sony) <urezki@gmail.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-01x86/alternatives, kvm: Fix a couple of CALLs without a frame pointerBorislav Petkov (AMD)
objtool complains: arch/x86/kvm/kvm.o: warning: objtool: .altinstr_replacement+0xc5: call without frame pointer save/setup vmlinux.o: warning: objtool: .altinstr_replacement+0x2eb: call without frame pointer save/setup Make sure %rSP is an output operand to the respective asm() statements. The test_cc() hunk and ALT_OUTPUT_SP() courtesy of peterz. Also from him add some helpful debugging info to the documentation. Now on to the explanations: tl;dr: The alternatives macros are pretty fragile. If I do ALT_OUTPUT_SP(output) in order to be able to package in a %rsp reference for objtool so that a stack frame gets properly generated, the inline asm input operand with positional argument 0 in clear_page(): "0" (page) gets "renumbered" due to the added : "+r" (current_stack_pointer), "=D" (page) and then gcc says: ./arch/x86/include/asm/page_64.h:53:9: error: inconsistent operand constraints in an ‘asm’ The fix is to use an explicit "D" constraint which points to a singleton register class (gcc terminology) which ends up doing what is expected here: the page pointer - input and output - should be in the same %rdi register. Other register classes have more than one register in them - example: "r" and "=r" or "A": ‘A’ The ‘a’ and ‘d’ registers. This class is used for instructions that return double word results in the ‘ax:dx’ register pair. Single word values will be allocated either in ‘ax’ or ‘dx’. so using "D" and "=D" just works in this particular case. And yes, one would say, sure, why don't you do "+D" but then: : "+r" (current_stack_pointer), "+D" (page) : [old] "i" (clear_page_orig), [new1] "i" (clear_page_rep), [new2] "i" (clear_page_erms), : "cc", "memory", "rax", "rcx") now find the Waldo^Wcomma which throws a wrench into all this. Because that silly macro has an "input..." consume-all last macro arg and in it, one is supposed to supply input *and* clobbers, leading to silly syntax snafus. Yap, they need to be cleaned up, one fine day... Closes: https://lore.kernel.org/oe-kbuild-all/202406141648.jO9qNGLa-lkp@intel.com/ Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Sean Christopherson <seanjc@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20240625112056.GDZnqoGDXgYuWBDUwu@fat_crate.local
2024-06-11x86/alternative: Replace the old macrosBorislav Petkov (AMD)
Now that the new macros have been gradually put in place, replace the old ones. Leave the new label numbers starting at 7xx as a hint that the new nested alternatives are being used now. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20240607111701.8366-15-bp@kernel.org
2024-06-11x86/alternative: Convert the asm ALTERNATIVE_3() macroBorislav Petkov (AMD)
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20240607111701.8366-14-bp@kernel.org
2024-06-11x86/alternative: Convert the asm ALTERNATIVE_2() macroBorislav Petkov (AMD)
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20240607111701.8366-13-bp@kernel.org
2024-06-11x86/alternative: Convert the asm ALTERNATIVE() macroBorislav Petkov (AMD)
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20240607111701.8366-12-bp@kernel.org
2024-06-11x86/alternative: Convert ALTERNATIVE_3()Borislav Petkov (AMD)
Zap the hack of using an ALTERNATIVE_3() internal label, as suggested by bgerst: https://lore.kernel.org/r/CAMzpN2i4oJ-Dv0qO46Fd-DxNv5z9=x%2BvO%2B8g=47NiiAf8QEJYA@mail.gmail.com in favor of a label local to this macro only, as it should be done. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20240607111701.8366-11-bp@kernel.org
2024-06-11x86/alternative: Convert ALTERNATIVE_TERNARY()Borislav Petkov (AMD)
The C macro. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20240607111701.8366-10-bp@kernel.org
2024-06-11x86/alternative: Convert alternative_call_2()Borislav Petkov (AMD)
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20240607111701.8366-9-bp@kernel.org
2024-06-11x86/alternative: Convert alternative_call()Borislav Petkov (AMD)
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20240607111701.8366-8-bp@kernel.org
2024-06-11x86/alternative: Convert alternative_io()Borislav Petkov (AMD)
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20240607111701.8366-7-bp@kernel.org
2024-06-11x86/alternative: Convert alternative_input()Borislav Petkov (AMD)
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20240607111701.8366-6-bp@kernel.org
2024-06-11x86/alternative: Convert alternative_2()Borislav Petkov (AMD)
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20240607111701.8366-5-bp@kernel.org
2024-06-11x86/alternative: Convert alternative()Borislav Petkov (AMD)
Split conversion deliberately into minimal pieces to ease bisection because debugging alternatives is a nightmare. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20240607111701.8366-4-bp@kernel.org
2024-06-11x86/alternatives: Add nested alternatives macrosPeter Zijlstra
Instead of making increasingly complicated ALTERNATIVE_n() implementations, use a nested alternative expression. The only difference between: ALTERNATIVE_2(oldinst, newinst1, flag1, newinst2, flag2) and ALTERNATIVE(ALTERNATIVE(oldinst, newinst1, flag1), newinst2, flag2) is that the outer alternative can add additional padding when the inner alternative is the shorter one, which then results in alt_instr::instrlen being inconsistent. However, this is easily remedied since the alt_instr entries will be consecutive and it is trivial to compute the max(alt_instr::instrlen) at runtime while patching. Specifically, after this the ALTERNATIVE_2 macro, after CPP expansion (and manual layout), looks like this: .macro ALTERNATIVE_2 oldinstr, newinstr1, ft_flags1, newinstr2, ft_flags2 740: 740: \oldinstr ; 741: .skip -(((744f-743f)-(741b-740b)) > 0) * ((744f-743f)-(741b-740b)),0x90 ; 742: .pushsection .altinstructions,"a" ; altinstr_entry 740b,743f,\ft_flags1,742b-740b,744f-743f ; .popsection ; .pushsection .altinstr_replacement,"ax" ; 743: \newinstr1 ; 744: .popsection ; ; 741: .skip -(((744f-743f)-(741b-740b)) > 0) * ((744f-743f)-(741b-740b)),0x90 ; 742: .pushsection .altinstructions,"a" ; altinstr_entry 740b,743f,\ft_flags2,742b-740b,744f-743f ; .popsection ; .pushsection .altinstr_replacement,"ax" ; 743: \newinstr2 ; 744: .popsection ; .endm The only label that is ambiguous is 740, however they all reference the same spot, so that doesn't matter. NOTE: obviously only @oldinstr may be an alternative; making @newinstr an alternative would mean patching .altinstr_replacement which very likely isn't what is intended, also the labels will be confused in that case. [ bp: Debug an issue where it would match the wrong two insns and and consider them nested due to the same signed offsets in the .alternative section and use instr_va() to compare the full virtual addresses instead. - Use new labels to denote that the new, nested alternatives are being used when staring at preprocessed output. - Use the %c constraint everywhere instead of %P and document the difference for future reference. ] Signed-off-by: Peter Zijlstra <peterz@infradead.org> Co-developed-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20230628104952.GA2439977@hirez.programming.kicks-ass.net
2024-06-11x86/alternative: Zap alternative_ternary()Borislav Petkov (AMD)
Unused. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20240607111701.8366-2-bp@kernel.org
2024-05-14Merge tag 'x86_alternatives_for_v6.10_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm alternatives updates from Borislav Petkov: - Switch the in-place instruction patching which lead to at least one weird bug with 32-bit guests, seeing stale instruction bytes, to one working on a buffer, like the rest of the alternatives code does - Add a long overdue check to the X86_FEATURE flag modifying functions to warn when former get changed in a non-compatible way after alternatives have been patched because those changes will be already wrong - Other cleanups * tag 'x86_alternatives_for_v6.10_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/alternatives: Remove alternative_input_2() x86/alternatives: Sort local vars in apply_alternatives() x86/alternatives: Optimize optimize_nops() x86/alternatives: Get rid of __optimize_nops() x86/alternatives: Use a temporary buffer when optimizing NOPs x86/alternatives: Catch late X86_FEATURE modifiers
2024-05-13Merge tag 'x86-asm-2024-05-13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm updates from Ingo Molnar: - Clean up & fix asm() operand modifiers & constraints - Misc cleanups * tag 'x86-asm-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/alternatives: Remove a superfluous newline in _static_cpu_has() x86/asm/64: Clean up memset16(), memset32(), memset64() assembly constraints in <asm/string_64.h> x86/asm: Use "m" operand constraint in WRUSSQ asm template x86/asm: Use %a instead of %P operand modifier in asm templates x86/asm: Use %c/%n instead of %P operand modifier in asm templates x86/asm: Remove %P operand modifier from altinstr asm templates
2024-05-06x86/alternatives: Remove alternative_input_2()Borislav Petkov (AMD)
It is unused. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240506122848.20326-1-bp@kernel.org
2024-04-01x86/bpf: Fix IP for relocating call depth accountingJoan Bruguera Micó
The commit: 59bec00ace28 ("x86/percpu: Introduce %rip-relative addressing to PER_CPU_VAR()") made PER_CPU_VAR() to use rip-relative addressing, hence INCREMENT_CALL_DEPTH macro and skl_call_thunk_template got rip-relative asm code inside of it. A follow up commit: 17bce3b2ae2d ("x86/callthunks: Handle %rip-relative relocations in call thunk template") changed x86_call_depth_emit_accounting() to use apply_relocation(), but mistakenly assumed that the code is being patched in-place (where the destination of the relocation matches the address of the code), using *pprog as the destination ip. This is not true for the call depth accounting, emitted by the BPF JIT, so the calculated address was wrong, JIT-ed BPF progs on kernels with call depth tracking got broken and usually caused a page fault. Pass the destination IP when the BPF JIT emits call depth accounting. Fixes: 17bce3b2ae2d ("x86/callthunks: Handle %rip-relative relocations in call thunk template") Signed-off-by: Joan Bruguera Micó <joanbrugueram@gmail.com> Reviewed-by: Uros Bizjak <ubizjak@gmail.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20240401185821.224068-3-ubizjak@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-03-19x86/asm: Use %c/%n instead of %P operand modifier in asm templatesUros Bizjak
The "P" asm operand modifier is a x86 target-specific modifier. When used with a constant, the "P" modifier emits "cst" instead of "$cst". This property is currently used to emit the bare constant without all syntax-specific prefixes. The generic "c" resp. "n" operand modifier should be used instead. No functional changes intended. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lore.kernel.org/r/20240319104418.284519-3-ubizjak@gmail.com
2023-12-10x86/paravirt: Switch mixed paravirt/alternative calls to alternativesJuergen Gross
Instead of stacking alternative and paravirt patching, use the new ALT_FLAG_CALL flag to switch those mixed calls to pure alternative handling. Eliminate the need to be careful regarding the sequence of alternative and paravirt patching. [ bp: Touch up commit message. ] Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20231210062138.2417-5-jgross@suse.com
2023-12-10x86/alternative: Add indirect call patchingJuergen Gross
In order to prepare replacing of paravirt patching with alternative patching, add the capability to replace an indirect call with a direct one. This is done via a new flag ALT_FLAG_CALL as the target of the CALL instruction needs to be evaluated using the value of the location addressed by the indirect call. For convenience, add a macro for a default CALL instruction. In case it is being used without the new flag being set, it will result in a BUG() when being executed. As in most cases, the feature used will be X86_FEATURE_ALWAYS so add another macro ALT_CALL_ALWAYS usable for the flags parameter of the ALTERNATIVE macros. For a complete replacement, handle the special cases of calling a nop function and an indirect call of NULL the same way as paravirt does. [ bp: Massage commit message, fixup the debug output and clarify flow more. ] Co-developed-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231210062138.2417-4-jgross@suse.com
2023-12-10x86/paravirt: Move some functions and defines to alternative.cJuergen Gross
As a preparation for replacing paravirt patching completely by alternative patching, move some backend functions and #defines to the alternatives code and header. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231129133332.31043-3-jgross@suse.com
2023-09-22x86/speculation, objtool: Use absolute relocations for annotationsFangrui Song
.discard.retpoline_safe sections do not have the SHF_ALLOC flag. These sections referencing text sections' STT_SECTION symbols with PC-relative relocations like R_386_PC32 [0] is conceptually not suitable. Newer LLD will report warnings for REL relocations even for relocatable links [1]: ld.lld: warning: vmlinux.a(drivers/i2c/busses/i2c-i801.o):(.discard.retpoline_safe+0x120): has non-ABS relocation R_386_PC32 against symbol '' Switch to absolute relocations instead, which indicate link-time addresses. In a relocatable link, these addresses are also output section offsets, used by checks in tools/objtool/check.c. When linking vmlinux, these .discard.* sections will be discarded, therefore it is not a problem that R_X86_64_32 cannot represent a kernel address. Alternatively, we could set the SHF_ALLOC flag for .discard.* sections, but I think non-SHF_ALLOC for sections to be discarded makes more sense. Note: if we decide to never support REL architectures (e.g. arm, i386), we can utilize R_*_NONE relocations (.reloc ., BFD_RELOC_NONE, sym), making .discard.* sections zero-sized. That said, the section content waste is 4 bytes per entry, much smaller than sizeof(Elf{32,64}_Rel). [0] commit 1c0c1faf5692 ("objtool: Use relative pointers for annotations") [1] https://github.com/ClangBuiltLinux/linux/issues/1937 Signed-off-by: Fangrui Song <maskray@google.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20230920001728.1439947-1-maskray@google.com
2023-07-10x86/alternative: Rename apply_ibt_endbr()Peter Zijlstra
The current name doesn't reflect what it does very well. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lkml.kernel.org/r/20230622144321.427441595%40infradead.org
2023-06-07Revert "x86/orc: Make it callthunk aware"Josh Poimboeuf
Commit 396e0b8e09e8 ("x86/orc: Make it callthunk aware") attempted to deal with the fact that function prefix code didn't have ORC coverage. However, it didn't work as advertised. Use of the "null" ORC entry just caused affected unwinds to end early. The root cause has now been fixed with commit 5743654f5e2e ("objtool: Generate ORC data for __pfx code"). Revert most of commit 396e0b8e09e8 ("x86/orc: Make it callthunk aware"). The is_callthunk() function remains as it's now used by other code. Link: https://lore.kernel.org/r/a05b916ef941da872cbece1ab3593eceabd05a79.1684245404.git.jpoimboe@kernel.org Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-04-18x86/alternatives: Do not use integer constant suffixes in inline asmWilly Tarreau
The usage of the BIT() macro in inline asm code was introduced in 6.3 by the commit in the Fixes tag. However, this macro uses "1UL" for integer constant suffixes in its shift operation, while gas before 2.28 does not support the "L" suffix after a number, and gas before 2.27 does not support the "U" suffix, resulting in build errors such as the following with such versions: ./arch/x86/include/asm/uaccess_64.h:124: Error: found 'L', expected: ')' ./arch/x86/include/asm/uaccess_64.h:124: Error: junk at end of line, first unrecognized character is `L' However, the currently minimal binutils version the kernel supports is 2.25. There's a single use of this macro here, revert to (1 << 0) that works with such older binutils. As an additional info, the binutils PRs which add support for those suffixes are: https://sourceware.org/bugzilla/show_bug.cgi?id=19910 https://sourceware.org/bugzilla/show_bug.cgi?id=20732 [ bp: Massage and extend commit message. ] Fixes: 5d1dd961e743 ("x86/alternatives: Add alt_instr.flags") Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Jingbo Xu <jefflexu@linux.alibaba.com> Link: https://lore.kernel.org/lkml/a9aae568-3046-306c-bd71-92c1fc8eeddc@linux.alibaba.com/
2023-01-05x86/alternatives: Add alt_instr.flagsBorislav Petkov (AMD)
Add a struct alt_instr.flags field which will contain different flags controlling alternatives patching behavior. The initial idea was to be able to specify it as a separate macro parameter but that would mean touching all possible invocations of the alternatives macros and thus a lot of churn. What is more, as PeterZ suggested, being able to say ALT_NOT(feature) is very readable and explains exactly what is meant. So make the feature field a u32 where the patching flags are the upper u16 part of the dword quantity while the lower u16 word is the feature. The highest feature number currently is 0x26a (i.e., word 19) so there is plenty of space. If that becomes insufficient, the field can be extended to u64 which will then make struct alt_instr of the nice size of 16 bytes (14 bytes currently). There should be no functional changes resulting from this. Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/Y6RCoJEtxxZWwotd@zn.tnic
2022-11-01x86/ibt: Implement FineIBTPeter Zijlstra
Implement an alternative CFI scheme that merges both the fine-grained nature of kCFI but also takes full advantage of the coarse grained hardware CFI as provided by IBT. To contrast: kCFI is a pure software CFI scheme and relies on being able to read text -- specifically the instruction *before* the target symbol, and does the hash validation *before* doing the call (otherwise control flow is compromised already). FineIBT is a software and hardware hybrid scheme; by ensuring every branch target starts with a hash validation it is possible to place the hash validation after the branch. This has several advantages: o the (hash) load is avoided; no memop; no RX requirement. o IBT WAIT-FOR-ENDBR state is a speculation stop; by placing the hash validation in the immediate instruction after the branch target there is a minimal speculation window and the whole is a viable defence against SpectreBHB. o Kees feels obliged to mention it is slightly more vulnerable when the attacker can write code. Obviously this patch relies on kCFI, but additionally it also relies on the padding from the call-depth-tracking patches. It uses this padding to place the hash-validation while the call-sites are re-written to modify the indirect target to be 16 bytes in front of the original target, thus hitting this new preamble. Notably, there is no hardware that needs call-depth-tracking (Skylake) and supports IBT (Tigerlake and onwards). Suggested-by: Joao Moreira (Intel) <joao@overdrivepizza.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20221027092842.634714496@infradead.org
2022-10-17x86/bpf: Emit call depth accounting if requiredThomas Gleixner
Ensure that calls in BPF jitted programs are emitting call depth accounting when enabled to keep the call/return balanced. The return thunk jump is already injected due to the earlier retbleed mitigations. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220915111148.615413406@infradead.org
2022-10-17x86/orc: Make it callthunk awarePeter Zijlstra
Callthunks addresses on the stack would confuse the ORC unwinder. Handle them correctly and tell ORC to proceed further down the stack. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220915111148.511637628@infradead.org
2022-10-17static_call: Add call depth tracking supportPeter Zijlstra
When indirect calls are switched to direct calls then it has to be ensured that the call target is not the function, but the call thunk when call depth tracking is enabled. But static calls are available before call thunks have been set up. Ensure a second run through the static call patching code after call thunks have been created. When call thunks are not enabled this has no side effects. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220915111148.306100465@infradead.org
2022-10-17x86/asm: Provide ALTERNATIVE_3Peter Zijlstra
Fairly straight forward adaptation/extention of ALTERNATIVE_2. Required for call depth tracking. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220915111147.787711192@infradead.org
2022-10-17x86/modules: Add call patchingThomas Gleixner
As for the builtins create call thunks and patch the call sites to call the thunk on Intel SKL CPUs for retbleed mitigation. Note, that module init functions are ignored for sake of simplicity because loading modules is not something which is done in high frequent loops and the attacker has not really a handle on when this happens in order to launch a matching attack. The depth tracking will still work for calls into the builtins and because the call is not accounted it will underflow faster and overstuff, but that's mitigated by the saturating counter and the side effect is only temporary. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220915111147.575673066@infradead.org
2022-10-17x86/callthunks: Add call patching for call depth trackingThomas Gleixner
Mitigating the Intel SKL RSB underflow issue in software requires to track the call depth. That is every CALL and every RET need to be intercepted and additional code injected. The existing retbleed mitigations already include means of redirecting RET to __x86_return_thunk; this can be re-purposed and RET can be redirected to another function doing RET accounting. CALL accounting will use the function padding introduced in prior patches. For each CALL instruction, the destination symbol's padding is rewritten to do the accounting and the CALL instruction is adjusted to call into the padding. This ensures only affected CPUs pay the overhead of this accounting. Unaffected CPUs will leave the padding unused and have their 'JMP __x86_return_thunk' replaced with an actual 'RET' instruction. Objtool has been modified to supply a .call_sites section that lists all the 'CALL' instructions. Additionally the paravirt instruction sites are iterated since they will have been patched from an indirect call to direct calls (or direct instructions in which case it'll be ignored). Module handling and the actual thunk code for SKL will be added in subsequent steps. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220915111147.470877038@infradead.org
2022-06-27x86: Undo return-thunk damagePeter Zijlstra
Introduce X86_FEATURE_RETHUNK for those afflicted with needing this. [ bp: Do only INT3 padding - simpler. ] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de>
2022-03-15x86/alternative: Use .ibt_endbr_seal to seal indirect callsPeter Zijlstra
Objtool's --ibt option generates .ibt_endbr_seal which lists superfluous ENDBR instructions. That is those instructions for which the function is never indirectly called. Overwrite these ENDBR instructions with a NOP4 such that these function can never be indirect called, reducing the number of viable ENDBR targets in the kernel. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20220308154319.822545231@infradead.org
2021-10-28x86/alternative: Implement .retpoline_sites supportPeter Zijlstra
Rewrite retpoline thunk call sites to be indirect calls for spectre_v2=off. This ensures spectre_v2=off is as near to a RETPOLINE=n build as possible. This is the replacement for objtool writing alternative entries to ensure the same and achieves feature-parity with the previous approach. One noteworthy feature is that it relies on the thunks to be in machine order to compute the register index. Specifically, this does not yet address the Jcc __x86_indirect_thunk_* calls generated by clang, a future patch will add this. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120310.232495794@infradead.org
2021-04-02x86/alternatives: Optimize optimize_nops()Peter Zijlstra
Currently, optimize_nops() scans to see if the alternative starts with NOPs. However, the emit pattern is: 141: \oldinstr 142: .skip (len-(142b-141b)), 0x90 That is, when 'oldinstr' is short, the tail is padded with NOPs. This case never gets optimized. Rewrite optimize_nops() to replace any trailing string of NOPs inside the alternative to larger NOPs. Also run it irrespective of patching, replacing NOPs in both the original and replaced code. A direct consequence is that 'padlen' becomes superfluous, so remove it. [ bp: - Adjust commit message - remove a stale comment about needing to pad - add a comment in optimize_nops() - exit early if the NOP verif. loop catches a mismatch - function should not not add NOPs in that case - fix the "optimized NOPs" offsets output ] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lkml.kernel.org/r/20210326151259.442992235@infradead.org
2021-03-11x86/alternative: Support ALTERNATIVE_TERNARYJuergen Gross
Add ALTERNATIVE_TERNARY support for replacing an initial instruction with either of two instructions depending on a feature: ALTERNATIVE_TERNARY "default_instr", FEATURE_NR, "feature_on_instr", "feature_off_instr" which will start with "default_instr" and at patch time will, depending on FEATURE_NR being set or not, patch that with either "feature_on_instr" or "feature_off_instr". [ bp: Add comment ontop. ] Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210311142319.4723-7-jgross@suse.com
2021-03-11x86/alternative: Support not-featureJuergen Gross
Add support for alternative patching for the case a feature is not present on the current CPU. For users of ALTERNATIVE() and friends, an inverted feature is specified by applying the ALT_NOT() macro to it, e.g.: ALTERNATIVE(old, new, ALT_NOT(feature)); Committer note: The decision to encode the NOT-bit in the feature bit itself is because a future change which would make objtool generate such alternative calls, would keep the code in objtool itself fairly simple. Also, this allows for the alternative macros to support the NOT feature without having to change them. Finally, the u16 cpuid member encoding the X86_FEATURE_ flags is not an ABI so if more bits are needed, cpuid itself can be enlarged or a flags field can be added to struct alt_instr after having considered the size growth in either cases. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210311142319.4723-6-jgross@suse.com
2021-03-11x86/alternative: Merge include filesJuergen Gross
Merge arch/x86/include/asm/alternative-asm.h into arch/x86/include/asm/alternative.h in order to make it easier to use common definitions later. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210311142319.4723-2-jgross@suse.com
2021-03-09x86/alternative: Drop unused feature parameter from ALTINSTR_REPLACEMENT()Juergen Gross
The macro ALTINSTR_REPLACEMENT() doesn't make use of the feature parameter, so drop it. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210309134813.23912-4-jgross@suse.com
2019-09-15x86: alternative.h: use asm_inline for all alternative variantsRasmus Villemoes
Most, if not all, uses of the alternative* family just provide one or two instructions in .text, but the string literal can be quite large, causing gcc to overestimate the size of the generated code. That in turn affects its decisions about inlining of the function containing the alternative() asm statement. New enough versions of gcc allow one to overrule the estimated size by using "asm inline" instead of just "asm". So replace asm by the helper asm_inline, which for older gccs just expands to asm. Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
2019-04-03x86/nospec, objtool: Introduce ANNOTATE_IGNORE_ALTERNATIVEPeter Zijlstra
To facillitate other usage of ignoring alternatives; rename ANNOTATE_NOSPEC_IGNORE to ANNOTATE_IGNORE_ALTERNATIVE. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-01-16x86/alternatives: Add an ALTERNATIVE_3() macroBorislav Petkov
Similar to ALTERNATIVE_2(), ALTERNATIVE_3() selects between 3 possible variants. Will be used for adding RDTSCP to the rdtsc_ordered() alternatives. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: X86 ML <x86@kernel.org> Link: https://lkml.kernel.org/r/20181211222326.14581-4-bp@alien8.de
2019-01-16x86/alternatives: Add macro commentsBorislav Petkov
... so that when one stares at the .s output, one can find her way around the resulting asm magic. With it, ALTERNATIVE looks like this now: # ALT: oldnstr 661: ... 662: # ALT: padding .skip ... 663: .pushsection .altinstructions,"a" ... .popsection .pushsection .altinstr_replacement, "ax" # ALT: replacement 1 6641: ... 6651: .popsection Merge __OLDINSTR() into OLDINSTR(), while at it. No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: X86 ML <x86@kernel.org> Link: https://lkml.kernel.org/r/20181211222326.14581-2-bp@alien8.de