summaryrefslogtreecommitdiff
path: root/arch/x86/boot
AgeCommit message (Collapse)Author
2022-12-07Merge tag 'v6.1-rc8' into efi/nextArd Biesheuvel
Linux 6.1-rc8
2022-12-05efi: Put Linux specific magic number in the DOS headerArd Biesheuvel
GRUB currently relies on the magic number in the image header of ARM and arm64 EFI kernel images to decide whether or not the image in question is a bootable kernel. However, the purpose of the magic number is to identify the image as one that implements the bare metal boot protocol, and so GRUB, which only does EFI boot, is limited unnecessarily to booting images that could potentially be booted in a non-EFI manner as well. This is problematic for the new zboot decompressor image format, as it can only boot in EFI mode, and must therefore not use the bare metal boot magic number in its header. For this reason, the strict magic number was dropped from GRUB, to permit essentially any kind of EFI executable to be booted via the 'linux' command, blurring the line between the linux loader and the chainloader. So let's use the same field in the DOS header that RISC-V and arm64 already use for their 'bare metal' magic numbers to store a 'generic Linux kernel' magic number, which can be used to identify bootable kernel images in PE format which don't necessarily implement a bare metal boot protocol in the same binary. Note that, in the context of EFI, the MS-DOS header is only described in terms of the fields that it shares with the hybrid PE/COFF image format, (i.e., the MS-DOS EXE magic number at offset #0 and the PE header offset at byte offset #0x3c). Since we aim for compatibility with EFI only, and not with MS-DOS or MS-Windows, we can use the remaining space in the MS-DOS header however we want. Let's set the generic magic number for x86 images as well: existing bootloaders already have their own methods to identify x86 Linux images that can be booted in a non-EFI manner, and having the magic number in place there will ease any future transitions in loader implementations to merge the x86 and non-x86 EFI boot paths. Note that 32-bit ARM already uses the same location in the header for a different purpose, but the ARM support is already widely implemented and the EFI zboot decompressor is not available on ARM anyway, so we just disregard it here. Acked-by: Leif Lindholm <quic_llindhol@quicinc.com> Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-29x86/boot: Remove x86_32 PIC using %ebx workaroundUros Bizjak
The currently supported minimum gcc version is 5.1. Before that, the PIC register, when generating Position Independent Code, was considered "fixed" in the sense that it wasn't in the set of registers available to the compiler's register allocator. Which, on x86-32, is already a very small set. What is more, the register allocator was unable to satisfy extended asm "=b" constraints. (Yes, PIC code uses %ebx on 32-bit as the base reg.) With gcc 5.1: "Reuse of the PIC hard register, instead of using a fixed register, was implemented on x86/x86-64 targets. This improves generated PIC code performance as more hard registers can be used. Shared libraries can significantly benefit from this optimization. Currently it is switched on only for x86/x86-64 targets. As RA infrastructure is already implemented for PIC register reuse, other targets might follow this in the future." (from: https://gcc.gnu.org/gcc-5/changes.html) which basically means that the register allocator has a higher degree of freedom when handling %ebx, including reloading it with the correct value before a PIC access. Furthermore: arch/x86/Makefile: # Never want PIC in a 32-bit kernel, prevent breakage with GCC built # with nonstandard options KBUILD_CFLAGS += -fno-pic $ gcc -Wp,-MMD,arch/x86/boot/.cpuflags.o.d ... -fno-pic ... -D__KBUILD_MODNAME=kmod_cpuflags -c -o arch/x86/boot/cpuflags.o arch/x86/boot/cpuflags.c so the 32-bit workaround in cpuid_count() is fixing exactly nothing because 32-bit configs don't even allow PIC builds. As to 64-bit builds: they're done using -mcmodel=kernel which produces RIP-relative addressing for PIC builds and thus does not apply here either. So get rid of the thing and make cpuid_count() nice and simple. There should be no functional changes resulting from this. [ bp: Expand commit message. ] Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221104124546.196077-1-ubizjak@gmail.com
2022-11-24x86/efi: Make the deprecated EFI handover protocol optionalArd Biesheuvel
The EFI handover protocol permits a bootloader to invoke the kernel as a EFI PE/COFF application, while passing a bootparams struct as a third argument to the entrypoint function call. This has no basis in the UEFI specification, and there are better ways to pass additional data to a UEFI application (UEFI configuration tables, UEFI variables, UEFI protocols) than going around the StartImage() boot service and jumping to a fixed offset in the loaded image, just to call a different function that takes a third parameter. The reason for handling struct bootparams in the bootloader was that the EFI stub could only load initrd images from the EFI system partition, and so passing it via struct bootparams was needed for loaders like GRUB, which pass the initrd in memory, and may load it from anywhere, including from the network. Another motivation was EFI mixed mode, which could not use the initrd loader in the EFI stub at all due to 32/64 bit incompatibilities (which will be fixed shortly [0]), and could not invoke the ordinary PE/COFF entry point either, for the same reasons. Given that loaders such as GRUB already carried the bootparams handling in order to implement non-EFI boot, retaining that code and just passing bootparams to the EFI stub was a reasonable choice (although defining an alternate entrypoint could have been avoided.) However, the GRUB side changes never made it upstream, and are only shipped by some of the distros in their downstream versions. In the meantime, EFI support has been added to other Linux architecture ports, as well as to U-boot and systemd, including arch-agnostic methods for passing initrd images in memory [1], and for doing mixed mode boot [2], none of them requiring anything like the EFI handover protocol. So given that only out-of-tree distro GRUB relies on this, let's permit it to be omitted from the build, in preparation for retiring it completely at a later date. (Note that systemd-boot does have an implementation as well, but only uses it as a fallback for booting images that do not implement the LoadFile2 based initrd loading method, i.e., v5.8 or older) [0] https://lore.kernel.org/all/20220927085842.2860715-1-ardb@kernel.org/ [1] ec93fc371f01 ("efi/libstub: Add support for loading the initrd from a device path") [2] 97aa276579b2 ("efi/x86: Add true mixed mode entry point into .compat section") Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221122161017.2426828-18-ardb@kernel.org
2022-11-24x86/boot/compressed: Only build mem_encrypt.S if AMD_MEM_ENCRYPT=yArd Biesheuvel
Avoid building the mem_encrypt.o object if memory encryption support is not enabled to begin with. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221122161017.2426828-17-ardb@kernel.org
2022-11-24x86/boot/compressed: Adhere to calling convention in get_sev_encryption_bit()Ard Biesheuvel
Make get_sev_encryption_bit() follow the ordinary i386 calling convention, and only call it if CONFIG_AMD_MEM_ENCRYPT is actually enabled. This clarifies the calling code, and makes it more maintainable. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221122161017.2426828-16-ardb@kernel.org
2022-11-24x86/boot/compressed: Move startup32_check_sev_cbit() out of head_64.SArd Biesheuvel
Now that the startup32_check_sev_cbit() routine can execute from anywhere and behaves like an ordinary function, it can be moved where it belongs. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221122161017.2426828-15-ardb@kernel.org
2022-11-24x86/boot/compressed: Move startup32_check_sev_cbit() into .textArd Biesheuvel
Move startup32_check_sev_cbit() into the .text section and turn it into an ordinary function using the ordinary 32-bit calling convention, instead of saving/restoring the registers that are known to be live at the only call site. This improves maintainability, and makes it possible to move this function out of head_64.S and into a separate compilation unit that is specific to memory encryption. Note that this requires the call site to be moved before the mixed mode check, as %eax will be live otherwise. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221122161017.2426828-14-ardb@kernel.org
2022-11-24x86/boot/compressed: Move startup32_load_idt() out of head_64.SArd Biesheuvel
Now that startup32_load_idt() has been refactored into an ordinary callable function, move it into mem-encrypt.S where it belongs. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221122161017.2426828-13-ardb@kernel.org
2022-11-24x86/boot/compressed: Move startup32_load_idt() into .text sectionArd Biesheuvel
Convert startup32_load_idt() into an ordinary function and move it into the .text section. This involves turning the rva() immediates into ones derived from a local label, and preserving/restoring the %ebp and %ebx as per the calling convention. Also move the #ifdef to the only existing call site. This makes it clear that the function call does nothing if support for memory encryption is not compiled in. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221122161017.2426828-12-ardb@kernel.org
2022-11-24x86/boot/compressed: Pull global variable reference into startup32_load_idt()Ard Biesheuvel
In preparation for moving startup32_load_idt() out of head_64.S and turning it into an ordinary function using the ordinary 32-bit calling convention, pull the global variable reference to boot32_idt up into startup32_load_idt() so that startup32_set_idt_entry() does not need to discover its own runtime physical address, which will no longer be correlated with startup_32 once this code is moved into .text. While at it, give startup32_set_idt_entry() static linkage. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221122161017.2426828-11-ardb@kernel.org
2022-11-24x86/boot/compressed: Avoid touching ECX in startup32_set_idt_entry()Ard Biesheuvel
Avoid touching register %ecx in startup32_set_idt_entry(), by folding the MOV, SHL and ORL instructions into a single ORL which no longer requires a temp register. This permits ECX to be used as a function argument in a subsequent patch. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221122161017.2426828-10-ardb@kernel.org
2022-11-24x86/boot/compressed: Simplify IDT/GDT preserve/restore in the EFI thunkArd Biesheuvel
Tweak the asm and remove some redundant instructions. While at it, fix the associated comment for style and correctness. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221122161017.2426828-9-ardb@kernel.org
2022-11-24x86/boot/compressed, efi: Merge multiple definitions of image_offset into oneArd Biesheuvel
There is no need for head_32.S and head_64.S both declaring a copy of the global 'image_offset' variable, so drop those and make the extern C declaration the definition. When image_offset is moved to the .c file, it needs to be placed particularly in the .data section because it lands by default in the .bss section which is cleared too late, in .Lrelocated, before the first access to it and thus garbage gets read, leading to SEV guests exploding in early boot. This happens only when the SEV guest kernel is loaded through grub. If supplied with qemu's -kernel command line option, that memory is always cleared upfront by qemu and all is fine there. [ bp: Expand commit message with SEV aspect. ] Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221122161017.2426828-8-ardb@kernel.org
2022-11-24kbuild: fix "cat: .version: No such file or directory"Masahiro Yamada
Since commit 2df8220cc511 ("kbuild: build init/built-in.a just once"), the .version file is not touched at all when KBUILD_BUILD_VERSION is given. If KBUILD_BUILD_VERSION is specified and the .version file is missing (for example right after 'make mrproper'), "No such file or director" is shown. Even if the .version exists, it is irrelevant to the version of the current build. $ make -j$(nproc) KBUILD_BUILD_VERSION=100 mrproper defconfig all [ snip ] BUILD arch/x86/boot/bzImage cat: .version: No such file or directory Kernel: arch/x86/boot/bzImage is ready (#) Show KBUILD_BUILD_VERSION if it is given. Fixes: 2df8220cc511 ("kbuild: build init/built-in.a just once") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2022-11-22x86/boot/compressed: Move efi32_pe_entry() out of head_64.SArd Biesheuvel
Move the implementation of efi32_pe_entry() into efi-mixed.S, which is a more suitable location that only gets built if EFI mixed mode is actually enabled. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221122161017.2426828-7-ardb@kernel.org
2022-11-22x86/boot/compressed: Move efi32_entry out of head_64.SArd Biesheuvel
Move the efi32_entry() routine out of head_64.S and into efi-mixed.S, which reduces clutter in the complicated startup routines. It also permits linkage of some symbols used by code to be made local. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221122161017.2426828-6-ardb@kernel.org
2022-11-22x86/boot/compressed: Move efi32_pe_entry into .text sectionArd Biesheuvel
Move efi32_pe_entry() into the .text section, so that it can be moved out of head_64.S and into a separate compilation unit in a subsequent patch. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221122161017.2426828-5-ardb@kernel.org
2022-11-22x86/boot/compressed: Move bootargs parsing out of 32-bit startup codeArd Biesheuvel
Move the logic that chooses between the different EFI entrypoints out of the 32-bit boot path, and into a 64-bit helper that can perform the same task much more cleanly. While at it, document the mixed mode boot flow in a code comment. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221122161017.2426828-4-ardb@kernel.org
2022-11-22x86/boot/compressed: Move 32-bit entrypoint code into .text sectionArd Biesheuvel
Move the code that stores the arguments passed to the EFI entrypoint into the .text section, so that it can be moved into a separate compilation unit in a subsequent patch. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221122161017.2426828-3-ardb@kernel.org
2022-11-22x86/boot/compressed: Rename efi_thunk_64.S to efi-mixed.SArd Biesheuvel
In preparation for moving the mixed mode specific code out of head_64.S, rename the existing file to clarify that it contains more than just the mixed mode thunk. While at it, clean up the Makefile rules that add it to the build. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221122161017.2426828-2-ardb@kernel.org
2022-11-19x86/kaslr: Fix process_mem_region()'s return valueJiapeng Chong
Fix the following coccicheck warning: ./arch/x86/boot/compressed/kaslr.c:670:8-9: WARNING: return of 0/1 in function 'process_mem_region' with return type bool. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20220421202556.129799-1-jiapeng.chong@linux.alibaba.com
2022-11-18efi: libstub: Permit mixed mode return types other than efi_status_tArd Biesheuvel
Rework the EFI stub macro wrappers around protocol method calls and other indirect calls in order to allow return types other than efi_status_t. This means the widening should be conditional on whether or not the return type is efi_status_t, and should be omitted otherwise. Also, switch to _Generic() to implement the type based compile time conditionals, which is more concise, and distinguishes between efi_status_t and u64 properly. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-02x86/boot: Repair kernel-doc for boot_kstrtoul()Lukas Bulwahn
Adjust the kernel-doc comment to have the proper function name: boot_kstrtoul(). [ bp: Massage commit message. ] Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221031094835.15923-1-lukas.bulwahn@gmail.com
2022-11-01kbuild: upgrade the orphan section warning to an error if CONFIG_WERROR is setXin Li
Andrew Cooper suggested upgrading the orphan section warning to a hard link error. However Nathan Chancellor said outright turning the warning into an error with no escape hatch might be too aggressive, as we have had these warnings triggered by new compiler generated sections, and suggested turning orphan sections into an error only if CONFIG_WERROR is set. Kees Cook echoed and emphasized that the mandate from Linus is that we should avoid breaking builds. It wrecks bisection, it causes problems across compiler versions, etc. Thus upgrade the orphan section warning to a hard link error only if CONFIG_WERROR is set. Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> Suggested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Xin Li <xin3.li@intel.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20221025073023.16137-2-xin3.li@intel.com
2022-10-17arch: Introduce CONFIG_FUNCTION_ALIGNMENTPeter Zijlstra
Generic function-alignment infrastructure. Architectures can select FUNCTION_ALIGNMENT_xxB symbols; the FUNCTION_ALIGNMENT symbol is then set to the largest such selected size, 0 otherwise. From this the -falign-functions compiler argument and __ALIGN macro are set. This incorporates the DEBUG_FORCE_FUNCTION_ALIGN_64B knob and future alignment requirements for x86_64 (later in this series) into a single place. NOTE: also removes the 0x90 filler byte from the generic __ALIGN primitive, that value makes no sense outside of x86. NOTE: .balign 0 reverts to a no-op. Requested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220915111143.719248727@infradead.org
2022-10-10Merge tag 'mm-stable-2022-10-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in linux-next for a couple of months without, to my knowledge, any negative reports (or any positive ones, come to that). - Also the Maple Tree from Liam Howlett. An overlapping range-based tree for vmas. It it apparently slightly more efficient in its own right, but is mainly targeted at enabling work to reduce mmap_lock contention. Liam has identified a number of other tree users in the kernel which could be beneficially onverted to mapletrees. Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat at [1]. This has yet to be addressed due to Liam's unfortunately timed vacation. He is now back and we'll get this fixed up. - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses clang-generated instrumentation to detect used-unintialized bugs down to the single bit level. KMSAN keeps finding bugs. New ones, as well as the legacy ones. - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of memory into THPs. - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to support file/shmem-backed pages. - userfaultfd updates from Axel Rasmussen - zsmalloc cleanups from Alexey Romanov - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and memory-failure - Huang Ying adds enhancements to NUMA balancing memory tiering mode's page promotion, with a new way of detecting hot pages. - memcg updates from Shakeel Butt: charging optimizations and reduced memory consumption. - memcg cleanups from Kairui Song. - memcg fixes and cleanups from Johannes Weiner. - Vishal Moola provides more folio conversions - Zhang Yi removed ll_rw_block() :( - migration enhancements from Peter Xu - migration error-path bugfixes from Huang Ying - Aneesh Kumar added ability for a device driver to alter the memory tiering promotion paths. For optimizations by PMEM drivers, DRM drivers, etc. - vma merging improvements from Jakub Matěn. - NUMA hinting cleanups from David Hildenbrand. - xu xin added aditional userspace visibility into KSM merging activity. - THP & KSM code consolidation from Qi Zheng. - more folio work from Matthew Wilcox. - KASAN updates from Andrey Konovalov. - DAMON cleanups from Kaixu Xia. - DAMON work from SeongJae Park: fixes, cleanups. - hugetlb sysfs cleanups from Muchun Song. - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core. Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1] * tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits) hugetlb: allocate vma lock for all sharable vmas hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer hugetlb: fix vma lock handling during split vma and range unmapping mglru: mm/vmscan.c: fix imprecise comments mm/mglru: don't sync disk for each aging cycle mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol mm: memcontrol: use do_memsw_account() in a few more places mm: memcontrol: deprecate swapaccounting=0 mode mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled mm/secretmem: remove reduntant return value mm/hugetlb: add available_huge_pages() func mm: remove unused inline functions from include/linux/mm_inline.h selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd selftests/vm: add thp collapse shmem testing selftests/vm: add thp collapse file and tmpfs testing selftests/vm: modularize thp collapse memory operations selftests/vm: dedup THP helpers mm/khugepaged: add tracepoint to hpage_collapse_scan_file() mm/madvise: add file and shmem support to MADV_COLLAPSE ...
2022-10-10Merge tag 'kbuild-v6.1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Remove potentially incomplete targets when Kbuid is interrupted by SIGINT etc in case GNU Make may miss to do that when stderr is piped to another program. - Rewrite the single target build so it works more correctly. - Fix rpm-pkg builds with V=1. - List top-level subdirectories in ./Kbuild. - Ignore auto-generated __kstrtab_* and __kstrtabns_* symbols in kallsyms. - Avoid two different modules in lib/zstd/ having shared code, which potentially causes building the common code as build-in and modular back-and-forth. - Unify two modpost invocations to optimize the build process. - Remove head-y syntax in favor of linker scripts for placing particular sections in the head of vmlinux. - Bump the minimal GNU Make version to 3.82. - Clean up misc Makefiles and scripts. * tag 'kbuild-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (41 commits) docs: bump minimal GNU Make version to 3.82 ia64: simplify esi object addition in Makefile Revert "kbuild: Check if linker supports the -X option" kbuild: rebuild .vmlinux.export.o when its prerequisite is updated kbuild: move modules.builtin(.modinfo) rules to Makefile.vmlinux_o zstd: Fixing mixed module-builtin objects kallsyms: ignore __kstrtab_* and __kstrtabns_* symbols kallsyms: take the input file instead of reading stdin kallsyms: drop duplicated ignore patterns from kallsyms.c kbuild: reuse mksysmap output for kallsyms mksysmap: update comment about __crc_* kbuild: remove head-y syntax kbuild: use obj-y instead extra-y for objects placed at the head kbuild: hide error checker logs for V=1 builds kbuild: re-run modpost when it is updated kbuild: unify two modpost invocations kbuild: move vmlinux.o rule to the top Makefile kbuild: move .vmlinux.objs rule to Makefile.modpost kbuild: list sub-directories in ./Kbuild Makefile.compiler: replace cc-ifversion with compiler-specific macros ...
2022-10-04Merge tag 'x86_cleanups_for_v6.1_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Borislav Petkov: - The usual round of smaller fixes and cleanups all over the tree * tag 'x86_cleanups_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu: Include the header of init_ia32_feat_ctl()'s prototype x86/uaccess: Improve __try_cmpxchg64_user_asm() for x86_32 x86: Fix various duplicate-word comment typos x86/boot: Remove superfluous type casting from arch/x86/boot/bitops.h
2022-10-03x86: kmsan: disable instrumentation of unsupported codeAlexander Potapenko
Instrumenting some files with KMSAN will result in kernel being unable to link, boot or crashing at runtime for various reasons (e.g. infinite recursion caused by instrumentation hooks calling instrumented code again). Completely omit KMSAN instrumentation in the following places: - arch/x86/boot and arch/x86/realmode/rm, as KMSAN doesn't work for i386; - arch/x86/entry/vdso, which isn't linked with KMSAN runtime; - three files in arch/x86/kernel - boot problems; - arch/x86/mm/cpu_entry_area.c - recursion. Link: https://lkml.kernel.org/r/20220915150417.722975-33-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <keescook@chromium.org> Cc: Marco Elver <elver@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-29kbuild: build init/built-in.a just onceMasahiro Yamada
Kbuild builds init/built-in.a twice; first during the ordinary directory descending, second from scripts/link-vmlinux.sh. We do this because UTS_VERSION contains the build version and the timestamp. We cannot update it during the normal directory traversal since we do not yet know if we need to update vmlinux. UTS_VERSION is temporarily calculated, but omitted from the update check. Otherwise, vmlinux would be rebuilt every time. When Kbuild results in running link-vmlinux.sh, it increments the version number in the .version file and takes the timestamp at that time to really fix UTS_VERSION. However, updating the same file twice is a footgun. To avoid nasty timestamp issues, all build artifacts that depend on init/built-in.a are atomically generated in link-vmlinux.sh, where some of them do not need rebuilding. To fix this issue, this commit changes as follows: [1] Split UTS_VERSION out to include/generated/utsversion.h from include/generated/compile.h include/generated/utsversion.h is generated just before the vmlinux link. It is generated under include/generated/ because some decompressors (s390, x86) use UTS_VERSION. [2] Split init_uts_ns and linux_banner out to init/version-timestamp.c from init/version.c init_uts_ns and linux_banner contain UTS_VERSION. During the ordinary directory descending, they are compiled with __weak and used to determine if vmlinux needs relinking. Just before the vmlinux link, they are compiled without __weak to embed the real version and timestamp. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-08-24x86/boot: Don't propagate uninitialized boot_params->cc_blob_addressMichael Roth
In some cases, bootloaders will leave boot_params->cc_blob_address uninitialized rather than zeroing it out. This field is only meant to be set by the boot/compressed kernel in order to pass information to the uncompressed kernel when SEV-SNP support is enabled. Therefore, there are no cases where the bootloader-provided values should be treated as anything other than garbage. Otherwise, the uncompressed kernel may attempt to access this bogus address, leading to a crash during early boot. Normally, sanitize_boot_params() would be used to clear out such fields but that happens too late: sev_enable() may have already initialized it to a valid value that should not be zeroed out. Instead, have sev_enable() zero it out unconditionally beforehand. Also ensure this happens for !CONFIG_AMD_MEM_ENCRYPT as well by also including this handling in the sev_enable() stub function. [ bp: Massage commit message and comments. ] Fixes: b190a043c49a ("x86/sev: Add SEV-SNP feature detection/setup") Reported-by: Jeremi Piotrowski <jpiotrowski@linux.microsoft.com> Reported-by: watnuss@gmx.de Signed-off-by: Michael Roth <michael.roth@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: stable@vger.kernel.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=216387 Link: https://lore.kernel.org/r/20220823160734.89036-1-michael.roth@amd.com
2022-08-15x86/boot: Remove superfluous type casting from arch/x86/boot/bitops.hLi kunyu
'const void *' will auto-type-convert to just about any other const pointer type, no need to force it. [ mingo: Rewrote the changelog. ] Signed-off-by: Li kunyu <kunyu@nfschina.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20220725042358.3377-1-kunyu@nfschina.com
2022-08-10x86: link vdso and boot with -z noexecstack --no-warn-rwx-segmentsNick Desaulniers
Users of GNU ld (BFD) from binutils 2.39+ will observe multiple instances of a new warning when linking kernels in the form: ld: warning: arch/x86/boot/pmjump.o: missing .note.GNU-stack section implies executable stack ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker ld: warning: arch/x86/boot/compressed/vmlinux has a LOAD segment with RWX permissions Generally, we would like to avoid the stack being executable. Because there could be a need for the stack to be executable, assembler sources have to opt-in to this security feature via explicit creation of the .note.GNU-stack feature (which compilers create by default) or command line flag --noexecstack. Or we can simply tell the linker the production of such sections is irrelevant and to link the stack as --noexecstack. LLVM's LLD linker defaults to -z noexecstack, so this flag isn't strictly necessary when linking with LLD, only BFD, but it doesn't hurt to be explicit here for all linkers IMO. --no-warn-rwx-segments is currently BFD specific and only available in the current latest release, so it's wrapped in an ld-option check. While the kernel makes extensive usage of ELF sections, it doesn't use permissions from ELF segments. Link: https://lore.kernel.org/linux-block/3af4127a-f453-4cf7-f133-a181cce06f73@kernel.dk/ Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=ba951afb99912da01a6e8434126b8fac7aa75107 Link: https://github.com/llvm/llvm-project/issues/57009 Reported-and-tested-by: Jens Axboe <axboe@kernel.dk> Suggested-by: Fangrui Song <maskray@google.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-06x86/compressed/64: Add identity mappings for setup_data entriesMichael Roth
The decompressed kernel initially relies on the identity map set up by the boot/compressed kernel for accessing things like boot_params. With the recent introduction of SEV-SNP support, the decompressed kernel also needs to access the setup_data entries pointed to by boot_params->hdr.setup_data. This can lead to a crash in the kexec kernel during early boot due to these entries not currently being included in the initial identity map, see thread at Link below. Include mappings for the setup_data entries in the initial identity map. [ bp: Massage commit message and use a helper var for better readability. ] Fixes: b190a043c49a ("x86/sev: Add SEV-SNP feature detection/setup") Reported-by: Jun'ichi Nomura <junichi.nomura@nec.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/TYCPR01MB694815CD815E98945F63C99183B49@TYCPR01MB6948.jpnprd01.prod.outlook.com
2022-06-03Merge tag 'efi-next-for-v5.19-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull more EFI updates from Ard Biesheuvel: "Follow-up tweaks for EFI changes - they mostly address issues introduced this merge window, except for Heinrich's patch: - fix new DXE service invocations for mixed mode - use correct Kconfig symbol when setting PE header flag - clean up the drivers/firmware/efi Kconfig dependencies so that features that depend on CONFIG_EFI are hidden from the UI when the symbol is not enabled. Also included is a RISC-V bugfix from Heinrich to avoid read-write mappings of read-only firmware regions in the EFI page tables" * tag 'efi-next-for-v5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efi: clean up Kconfig dependencies on CONFIG_EFI efi/x86: libstub: Make DXE calls mixed mode safe efi: x86: Fix config name for setting the NX-compatibility flag in the PE header riscv: read-only pages should not be writable
2022-06-01efi: x86: Fix config name for setting the NX-compatibility flag in the PE headerLukas Bulwahn
Commit 21b68da7bf4a ("efi: x86: Set the NX-compatibility flag in the PE header") intends to set the compatibility flag, i.e., IMAGE_DLL_CHARACTERISTICS_NX_COMPAT, but this ifdef is actually dead as the CONFIG_DXE_MEM_ATTRIBUTES Kconfig option does not exist. The config is actually called EFI_DXE_MEM_ATTRIBUTES. Adjust the ifdef to use the intended config name. The issue was identified with ./scripts/checkkconfigsymbols.py. Fixes: 21b68da7bf4a ("efi: x86: Set the NX-compatibility flag in the PE header") Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Link: https://lore.kernel.org/r/20220601115043.7678-1-lukas.bulwahn@gmail.com Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-05-26Merge tag 'kbuild-v5.19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Add HOSTPKG_CONFIG env variable to allow users to override pkg-config - Support W=e as a shorthand for KCFLAGS=-Werror - Fix CONFIG_IKHEADERS build to support toybox cpio - Add scripts/dummy-tools/pahole to ease distro packagers' life - Suppress false-positive warnings from checksyscalls.sh for W=2 build - Factor out the common code of arch/*/boot/install.sh into scripts/install.sh - Support 'kernel-install' tool in scripts/prune-kernel - Refactor module-versioning to link the symbol versions at the final link of vmlinux and modules - Remove CONFIG_MODULE_REL_CRCS because module-versioning now works in an arch-agnostic way - Refactor modpost, Makefiles * tag 'kbuild-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (56 commits) genksyms: adjust the output format to modpost kbuild: stop merging *.symversions kbuild: link symbol CRCs at final link, removing CONFIG_MODULE_REL_CRCS modpost: extract symbol versions from *.cmd files modpost: add sym_find_with_module() helper modpost: change the license of EXPORT_SYMBOL to bool type modpost: remove left-over cross_compile declaration kbuild: record symbol versions in *.cmd files kbuild: generate a list of objects in vmlinux modpost: move *.mod.c generation to write_mod_c_files() modpost: merge add_{intree_flag,retpoline,staging_flag} to add_header scripts/prune-kernel: Use kernel-install if available kbuild: factor out the common installation code into scripts/install.sh modpost: split new_symbol() to symbol allocation and hash table addition modpost: make sym_add_exported() always allocate a new symbol modpost: make multiple export error modpost: dump Module.symvers in the same order of modules.order modpost: traverse the namespace_list in order modpost: use doubly linked list for dump_lists modpost: traverse unresolved symbols in order ...
2022-05-25Merge tag 'drm-next-2022-05-25' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm updates from Dave Airlie: "Intel have enabled DG2 on certain SKUs for laptops, AMD has started some new GPU support, msm has user allocated VA controls dma-buf: - add dma_resv_replace_fences - add dma_resv_get_singleton - make dma_excl_fence private core: - EDID parser refactorings - switch drivers to drm_mode_copy/duplicate - DRM managed mutex initialization display-helper: - put HDMI, SCDC, HDCP, DSC and DP into new module gem: - rework fence handling ttm: - rework bulk move handling - add common debugfs for resource managers - convert to kvcalloc format helpers: - support monochrome formats - RGB888, RGB565 to XRGB8888 conversions fbdev: - cfb/sys_imageblit fixes - pagelist corruption fix - create offb platform device - deferred io improvements sysfb: - Kconfig rework - support for VESA mode selection bridge: - conversions to devm_drm_of_get_bridge - conversions to panel_bridge - analogix_dp - autosuspend support - it66121 - audio support - tc358767 - DSI to DPI support - icn6211 - PLL/I2C fixes, DT property - adv7611 - enable DRM_BRIDGE_OP_HPD - anx7625 - fill ELD if no monitor - dw_hdmi - add audio support - lontium LT9211 support, i.MXMP LDB - it6505: Kconfig fix, DPCD set power fix - adv7511 - CEC support for ADV7535 panel: - ltk035c5444t, B133UAN01, NV3052C panel support - DataImage FG040346DSSWBG04 support - st7735r - DT bindings fix - ssd130x - fixes i915: - DG2 laptop PCI-IDs ("motherboard down") - Initial RPL-P PCI IDs - compute engine ABI - DG2 Tile4 support - DG2 CCS clear color compression support - DG2 render/media compression formats support - ATS-M platform info - RPL-S PCI IDs added - Bump ADL-P DMC version to v2.16 - Support static DRRS - Support multiple eDP/LVDS native mode refresh rates - DP HDR support for HSW+ - Lots of display refactoring + fixes - GuC hwconfig support and query - sysfs support for multi-tile - fdinfo per-client gpu utilisation - add geometry subslices query - fix prime mmap with LMEM - fix vm open count and remove vma refcounts - contiguous allocation fixes - steered register write support - small PCI BAR enablement - GuC error capture support - sunset igpu legacy mmap support for newer devices - GuC version 70.1.1 support amdgpu: - Initial SoC21 support - SMU 13.x enablement - SMU 13.0.4 support - ttm_eu cleanups - USB-C, GPUVM updates - TMZ fixes for RV - RAS support for VCN - PM sysfs code cleanup - DC FP rework - extend CG/PG flags to 64-bit - SI dpm lockdep fix - runtime PM fixes amdkfd: - RAS/SVM fixes - TLB flush fixes - CRIU GWS support - ignore bogus MEC signals more efficiently msm: - Fourcc modifier for tiled but not compressed layouts - Support for userspace allocated IOVA (GPU virtual address) - DPU: DSC (Display Stream Compression) support - DP: eDP support - DP: conversion to use drm_bridge and drm_bridge_connector - Merge DPU1 and MDP5 MDSS driver - DPU: writeback support nouveau: - make some structures static - make some variables static - switch to drm_gem_plane_helper_prepare_fb radeon: - misc fixes/cleanups mxsfb: - rework crtc mode setting - LCDIF CRC support etnaviv: - fencing improvements - fix address space collisions - cleanup MMU reference handling gma500: - GEM/GTT improvements - connector handling fixes komeda: - switch to plane reset helper mediatek: - MIPI DSI improvements omapdrm: - GEM improvements qxl: - aarch64 support vc4: - add a CL submission tracepoint - HDMI YUV support - HDMI/clock improvements - drop is_hdmi caching virtio: - remove restriction of non-zero blob types vmwgfx: - support for cursormob and cursorbypass 4 - fence improvements tidss: - reset DISPC on startup solomon: - SPI support - DT improvements sun4i: - allwinner D1 support - drop is_hdmi caching imx: - use swap() instead of open-coding - use devm_platform_ioremap_resource - remove redunant initializations ast: - Displayport support rockchip: - Refactor IOMMU initialisation - make some structures static - replace drm_detect_hdmi_monitor with drm_display_info.is_hdmi - support swapped YUV formats, - clock improvements - rk3568 support - VOP2 support mediatek: - MT8186 support tegra: - debugabillity improvements" * tag 'drm-next-2022-05-25' of git://anongit.freedesktop.org/drm/drm: (1740 commits) drm/i915/dsi: fix VBT send packet port selection for ICL+ drm/i915/uc: Fix undefined behavior due to shift overflowing the constant drm/i915/reg: fix undefined behavior due to shift overflowing the constant drm/i915/gt: Fix use of static in macro mismatch drm/i915/audio: fix audio code enable/disable pipe logging drm/i915: Fix CFI violation with show_dynamic_id() drm/i915: Fix 'mixing different enum types' warnings in intel_display_power.c drm/i915/gt: Fix build error without CONFIG_PM drm/msm/dpu: handle pm_runtime_get_sync() errors in bind path drm/msm/dpu: add DRM_MODE_ROTATE_180 back to supported rotations drm/msm: don't free the IRQ if it was not requested drm/msm/dpu: limit writeback modes according to max_linewidth drm/amd: Don't reset dGPUs if the system is going to s2idle drm/amdgpu: Unmap legacy queue when MES is enabled drm: msm: fix possible memory leak in mdp5_crtc_cursor_set() drm/msm: Fix fb plane offset calculation drm/msm/a6xx: Fix refcount leak in a6xx_gpu_init drm/msm/dsi: don't powerup at modeset time for parade-ps8640 drm/rockchip: Change register space names in vop2 dt-bindings: display: rockchip: make reg-names mandatory for VOP2 ...
2022-05-23Merge tag 'x86_build_for_v5.19_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 build updates from Borislav Petkov: - Add a "make x86_debug.config" target which enables a bunch of useful config debug options when trying to debug an issue - A gcc-12 build warnings fix * tag 'x86_build_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot: Wrap literal addresses in absolute_pointer() x86/configs: Add x86 debugging Kconfig fragment plus docs
2022-05-23Merge tag 'x86_tdx_for_v5.19_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull Intel TDX support from Borislav Petkov: "Intel Trust Domain Extensions (TDX) support. This is the Intel version of a confidential computing solution called Trust Domain Extensions (TDX). This series adds support to run the kernel as part of a TDX guest. It provides similar guest protections to AMD's SEV-SNP like guest memory and register state encryption, memory integrity protection and a lot more. Design-wise, it differs from AMD's solution considerably: it uses a software module which runs in a special CPU mode called (Secure Arbitration Mode) SEAM. As the name suggests, this module serves as sort of an arbiter which the confidential guest calls for services it needs during its lifetime. Just like AMD's SNP set, this series reworks and streamlines certain parts of x86 arch code so that this feature can be properly accomodated" * tag 'x86_tdx_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits) x86/tdx: Fix RETs in TDX asm x86/tdx: Annotate a noreturn function x86/mm: Fix spacing within memory encryption features message x86/kaslr: Fix build warning in KASLR code in boot stub Documentation/x86: Document TDX kernel architecture ACPICA: Avoid cache flush inside virtual machines x86/tdx/ioapic: Add shared bit for IOAPIC base address x86/mm: Make DMA memory shared for TD guest x86/mm/cpa: Add support for TDX shared memory x86/tdx: Make pages shared in ioremap() x86/topology: Disable CPU online/offline control for TDX guests x86/boot: Avoid #VE during boot for TDX platforms x86/boot: Set CR0.NE early and keep it set during the boot x86/acpi/x86/boot: Add multiprocessor wake-up support x86/boot: Add a trampoline for booting APs via firmware handoff x86/tdx: Wire up KVM hypercalls x86/tdx: Port I/O: Add early boot support x86/tdx: Port I/O: Add runtime hypercalls x86/boot: Port I/O: Add decompression-time support for TDX x86/boot: Port I/O: Allow to hook up alternative helpers ...
2022-05-23Merge tag 'x86_sev_for_v5.19_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull AMD SEV-SNP support from Borislav Petkov: "The third AMD confidential computing feature called Secure Nested Paging. Add to confidential guests the necessary memory integrity protection against malicious hypervisor-based attacks like data replay, memory remapping and others, thus achieving a stronger isolation from the hypervisor. At the core of the functionality is a new structure called a reverse map table (RMP) with which the guest has a say in which pages get assigned to it and gets notified when a page which it owns, gets accessed/modified under the covers so that the guest can take an appropriate action. In addition, add support for the whole machinery needed to launch a SNP guest, details of which is properly explained in each patch. And last but not least, the series refactors and improves parts of the previous SEV support so that the new code is accomodated properly and not just bolted on" * tag 'x86_sev_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits) x86/entry: Fixup objtool/ibt validation x86/sev: Mark the code returning to user space as syscall gap x86/sev: Annotate stack change in the #VC handler x86/sev: Remove duplicated assignment to variable info x86/sev: Fix address space sparse warning x86/sev: Get the AP jump table address from secrets page x86/sev: Add missing __init annotations to SEV init routines virt: sevguest: Rename the sevguest dir and files to sev-guest virt: sevguest: Change driver name to reflect generic SEV support x86/boot: Put globals that are accessed early into the .data section x86/boot: Add an efi.h header for the decompressor virt: sevguest: Fix bool function returning negative value virt: sevguest: Fix return value check in alloc_shared_pages() x86/sev-es: Replace open-coded hlt-loop with sev_es_terminate() virt: sevguest: Add documentation for SEV-SNP CPUID Enforcement virt: sevguest: Add support to get extended report virt: sevguest: Add support to derive key virt: Add SEV-SNP guest driver x86/sev: Register SEV-SNP guest request platform device x86/sev: Provide support for SNP guest request NAEs ...
2022-05-19x86/boot: Wrap literal addresses in absolute_pointer()Kees Cook
GCC 11 (incorrectly[1]) assumes that literal values cast to (void *) should be treated like a NULL pointer with an offset, and raises diagnostics when doing bounds checking under -Warray-bounds. GCC 12 got "smarter" about finding these: In function 'rdfs8', inlined from 'vga_recalc_vertical' at /srv/code/arch/x86/boot/video-mode.c:124:29, inlined from 'set_mode' at /srv/code/arch/x86/boot/video-mode.c:163:3: /srv/code/arch/x86/boot/boot.h:114:9: warning: array subscript 0 is outside array bounds of 'u8[0]' {aka 'unsigned char[]'} [-Warray-bounds] 114 | asm volatile("movb %%fs:%1,%0" : "=q" (v) : "m" (*(u8 *)addr)); | ^~~ This has been solved in other places[2] already by using the recently added absolute_pointer() macro. Do the same here. [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99578 [2] https://lore.kernel.org/all/20210912160149.2227137-1-linux@roeck-us.net/ Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20220227195918.705219-1-keescook@chromium.org
2022-05-11kbuild: factor out the common installation code into scripts/install.shMasahiro Yamada
Many architectures have similar install.sh scripts. The first half is really generic; it verifies that the kernel image and System.map exist, then executes ~/bin/${INSTALLKERNEL} or /sbin/${INSTALLKERNEL} if available. The second half is kind of arch-specific; it copies the kernel image and System.map to the destination, but the code is slightly different. Factor out the generic part into scripts/install.sh. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nicolas Schier <n.schier@avm.de>
2022-05-03efi: x86: Set the NX-compatibility flag in the PE headerPeter Jones
Following Baskov Evgeniy's "Handle UEFI NX-restricted page tables" patches, it's safe to set this compatibility flag to let loaders know they don't need to make special accommodations for kernel to load if pre-boot NX is enabled. Signed-off-by: Peter Jones <pjones@redhat.com> Link: https://lore.kernel.org/all/20220329184743.798513-1-pjones@redhat.com/ Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-04-20x86/boot: Put globals that are accessed early into the .data sectionMichael Roth
The helpers in arch/x86/boot/compressed/efi.c might be used during early boot to access the EFI system/config tables, and in some cases these EFI helpers might attempt to print debug/error messages, before console_init() has been called. __putstr() checks some variables to avoid printing anything before the console has been initialized, but this isn't enough since those variables live in .bss, which may not have been cleared yet. This can lead to a triple-fault occurring, primarily when booting in legacy/CSM mode (where EFI helpers will attempt to print some debug messages). Fix this by declaring these globals in .data section instead so there is no dependency on .bss being cleared before accessing them. Fixes: c01fce9cef849 ("x86/compressed: Add SEV-SNP feature detection/setup") Reported-by: Borislav Petkov <bp@suse.de> Suggested-by: Thomas Lendacky <Thomas.Lendacky@amd.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20220420152613.145077-1-michael.roth@amd.com
2022-04-17x86/boot: Add an efi.h header for the decompressorBorislav Petkov
Copy the needed symbols only and remove the kernel proper includes. No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/YlCKWhMJEMUgJmjF@zn.tnic
2022-04-07x86/boot: Avoid #VE during boot for TDX platformsSean Christopherson
There are a few MSRs and control register bits that the kernel normally needs to modify during boot. But, TDX disallows modification of these registers to help provide consistent security guarantees. Fortunately, TDX ensures that these are all in the correct state before the kernel loads, which means the kernel does not need to modify them. The conditions to avoid are: * Any writes to the EFER MSR * Clearing CR4.MCE This theoretically makes the guest boot more fragile. If, for instance, EFER was set up incorrectly and a WRMSR was performed, it will trigger early exception panic or a triple fault, if it's before early exceptions are set up. However, this is likely to trip up the guest BIOS long before control reaches the kernel. In any case, these kinds of problems are unlikely to occur in production environments, and developers have good debug tools to fix them quickly. Change the common boot code to work on TDX and non-TDX systems. This should have no functional effect on non-TDX systems. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20220405232939.73860-24-kirill.shutemov@linux.intel.com
2022-04-07x86/boot: Set CR0.NE early and keep it set during the bootKirill A. Shutemov
TDX guest requires CR0.NE to be set. Clearing the bit triggers #GP(0). If CR0.NE is 0, the MS-DOS compatibility mode for handling floating-point exceptions is selected. In this mode, the software exception handler for floating-point exceptions is invoked externally using the processor’s FERR#, INTR, and IGNNE# pins. Using FERR# and IGNNE# to handle floating-point exception is deprecated. CR0.NE=0 also limits newer processors to operate with one logical processor active. Kernel uses CR0_STATE constant to initialize CR0. It has NE bit set. But during early boot kernel has more ad-hoc approach to setting bit in the register. During some of this ad-hoc manipulation, CR0.NE is cleared. This causes a #GP in TDX guests and makes it die in early boot. Make CR0 initialization consistent, deriving the initial value of CR0 from CR0_STATE. Since CR0_STATE always has CR0.NE=1, this ensures that CR0.NE is never 0 and avoids the #GP. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20220405232939.73860-23-kirill.shutemov@linux.intel.com
2022-04-07x86/boot: Port I/O: Add decompression-time support for TDXKirill A. Shutemov
Port I/O instructions trigger #VE in the TDX environment. In response to the exception, kernel emulates these instructions using hypercalls. But during early boot, on the decompression stage, it is cumbersome to deal with #VE. It is cleaner to go to hypercalls directly, bypassing #VE handling. Hook up TDX-specific port I/O helpers if booting in TDX environment. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/20220405232939.73860-17-kirill.shutemov@linux.intel.com