summaryrefslogtreecommitdiff
path: root/drivers/firmware/efi/libstub/x86-stub.c
AgeCommit message (Collapse)Author
2025-02-21x86/efistub: Merge PE and handover entrypointsArd Biesheuvel
The difference between the PE and handover entrypoints in the EFI stub is that the former allocates a struct boot_params whereas the latter expects one from the caller. Currently, these are two completely separate entrypoints, duplicating some logic and both relying of efi_exit() to return straight back to the firmware on an error. Simplify this by making the PE entrypoint call the handover entrypoint with NULL as the argument for the struct boot_params parameter. This makes the code easier to follow, and removes the need to support two different calling conventions in the mixed mode asm code. While at it, move the assignment of boot_params_ptr into the function that actually calls into the legacy decompressor, which is where its value is required. Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-01-14efi/libstub: Use __free() helper for pool deallocationsArd Biesheuvel
Annotate some local buffer allocations as __free(efi_pool) and simplify the associated error handling accordingly. This removes a couple of gotos and simplifies the code. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-01-14efi/libstub: Use cleanup helpers for freeing copies of the memory mapArd Biesheuvel
The EFI stub may obtain the memory map from the firmware numerous times, and this involves doing a EFI pool allocation first, which needs to be freed after use. Streamline this using a cleanup helper, which makes the code easier to follow. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-01-14efi/libstub: Simplify PCI I/O handle buffer traversalArd Biesheuvel
Use LocateHandleBuffer() and a __free() cleanup helper to simplify the PCI I/O handle buffer traversal code. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-01-14efi/libstub: Simplify GOP handling codeArd Biesheuvel
Use the LocateHandleBuffer() API and a __free() function to simplify the logic that allocates a handle buffer to iterate over all GOP protocols in the EFI database. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-01-14efi/libstub: Use C99-style for loop to traverse handle bufferArd Biesheuvel
Tweak the for_each_efi_handle() macro in order to avoid the need on the part of the caller to provide a loop counter variable. Also move efi_get_handle_num() to the callers, so that each occurrence can be replaced with the actual number returned by the simplified LocateHandleBuffer API. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-01-14x86/efistub: Drop long obsolete UGA supportArd Biesheuvel
UGA is the EFI graphical output protocol that preceded GOP, and has been long obsolete. Drop support for it from the x86 implementation of the EFI stub - other architectures never bothered to implement it (save for ia64) Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-10-15efi/libstub: remove unnecessary cmd_line_len from efi_convert_cmdline()Jonathan Marek
efi_convert_cmdline() always sets cmdline_size to at least 1 on success, so the "cmdline_size > 0" does nothing and can be removed (the intent was to avoid parsing an empty string, but there is nothing wrong with parsing an empty string, it is only making boot negligibly slower). Then the cmd_line_len argument to efi_convert_cmdline can be removed because there is nothing left using it. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-07-18x86/efistub: Revert to heap allocated boot_params for PE entrypointArd Biesheuvel
This is a partial revert of commit 8117961d98f ("x86/efi: Disregard setup header of loaded image") which triggers boot issues on older Dell laptops. As it turns out, switching back to a heap allocation for the struct boot_params constructed by the EFI stub works around this, even though it is unclear why. Cc: Christian Heusel <christian@heusel.eu> Reported-by: <mavrix#kernel@simplelogin.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-07-12efi: Rename efi_early_memdesc_ptr() to efi_memdesc_ptr()Kees Cook
The "early" part of the helper's name isn't accurate[1]. Drop it in preparation for adding a new (not early) usage. Suggested-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/lkml/CAMj1kXEyDjH0uu3Z4eBesV3PEnKGi5ArXXMp7R-hn8HdRytiPg@mail.gmail.com [1] Signed-off-by: Kees Cook <kees@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-07-08x86/efistub: Drop redundant clearing of BSSArd Biesheuvel
As it turns out, clearing the BSS was not the right fix for the issue that was ultimately fixed by commit decd347c2a75 ("x86/efistub: Reinstate soft limit for initrd loading"), and given that the Windows EFI loader becomes very unhappy when entered with garbage in BSS, this is one thing that x86 PC EFI implementations can be expected to get right. So drop it from the pure PE entrypoint. The handover protocol entrypoint still needs this - it is used by the flaky distro bootloaders that barely implement PE/COFF at all. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-07-08x86/efistub: Avoid returning EFI_SUCCESS on errorArd Biesheuvel
The fail label is only used in a situation where the previous EFI API call succeeded, and so status will be set to EFI_SUCCESS. Fix this, by dropping the goto entirely, and call efi_exit() with the correct error code. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-07-08x86/efistub: Call Apple set_os protocol on dual GPU Intel MacsAditya Garg
0c18184de990 ("platform/x86: apple-gmux: support MMIO gmux on T2 Macs") brought support for T2 Macs in apple-gmux. But in order to use dual GPU, the integrated GPU has to be enabled. On such dual GPU EFI Macs, the EFI stub needs to report that it is booting macOS in order to prevent the firmware from disabling the iGPU. This patch is also applicable for some non T2 Intel Macs. Based on this patch for GRUB by Andreas Heider <andreas@heider.io>: https://lists.gnu.org/archive/html/grub-devel/2013-12/msg00442.html Credits also goto Kerem Karabay <kekrby@gmail.com> for helping porting the patch to the Linux kernel. Cc: Orlando Chamberlain <orlandoch.dev@gmail.com> Signed-off-by: Aditya Garg <gargaditya08@live.com> [ardb: limit scope using list of DMI matches provided by Lukas and Orlando] Reviewed-by: Lukas Wunner <lukas@wunner.de> Tested-by: Aditya Garg <gargaditya08@live.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-07-02x86/efi: Drop support for fake EFI memory mapsArd Biesheuvel
Between kexec and confidential VM support, handling the EFI memory maps correctly on x86 is already proving to be rather difficult (as opposed to other EFI architectures which manage to never modify the EFI memory map to begin with) EFI fake memory map support is essentially a development hack (for testing new support for the 'special purpose' and 'more reliable' EFI memory attributes) that leaked into production code. The regions marked in this manner are not actually recognized as such by the firmware itself or the EFI stub (and never have), and marking memory as 'more reliable' seems rather futile if the underlying memory is just ordinary RAM. Marking memory as 'special purpose' in this way is also dubious, but may be in use in production code nonetheless. However, the same should be achievable by using the memmap= command line option with the ! operator. EFI fake memmap support is not enabled by any of the major distros (Debian, Fedora, SUSE, Ubuntu) and does not exist on other architectures, so let's drop support for it. Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-05-21Merge tag 'efi-fixes-for-v6.10-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI fix from Ard Biesheuvel: - Followup fix for the EFI boot sequence refactor, which may result in physical KASLR putting the kernel in a region which is being used for a special purpose via a command line argument. * tag 'efi-fixes-for-v6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: x86/efistub: Omit physical KASLR when memory reservations exist
2024-05-17x86/efistub: Omit physical KASLR when memory reservations existArd Biesheuvel
The legacy decompressor has elaborate logic to ensure that the randomized physical placement of the decompressed kernel image does not conflict with any memory reservations, including ones specified on the command line using mem=, memmap=, efi_fake_mem= or hugepages=, which are taken into account by the kernel proper at a later stage. When booting in EFI mode, it is the firmware's job to ensure that the chosen range does not conflict with any memory reservations that it knows about, and this is trivially achieved by using the firmware's memory allocation APIs. That leaves reservations specified on the command line, though, which the firmware knows nothing about, as these regions have no other special significance to the platform. Since commit a1b87d54f4e4 ("x86/efistub: Avoid legacy decompressor when doing EFI boot") these reservations are not taken into account when randomizing the physical placement, which may result in conflicts where the memory cannot be reserved by the kernel proper because its own executable image resides there. To avoid having to duplicate or reuse the existing complicated logic, disable physical KASLR entirely when such overrides are specified. These are mostly diagnostic tools or niche features, and physical KASLR (as opposed to virtual KASLR, which is much more important as it affects the memory addresses observed by code executing in the kernel) is something we can live without. Closes: https://lkml.kernel.org/r/FA5F6719-8824-4B04-803E-82990E65E627%40akamai.com Reported-by: Ben Chaney <bchaney@akamai.com> Fixes: a1b87d54f4e4 ("x86/efistub: Avoid legacy decompressor when doing EFI boot") Cc: <stable@vger.kernel.org> # v6.1+ Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-03-28x86/efistub: Reinstate soft limit for initrd loadingArd Biesheuvel
Commit 8117961d98fb2 ("x86/efi: Disregard setup header of loaded image") dropped the memcopy of the image's setup header into the boot_params struct provided to the core kernel, on the basis that EFI boot does not need it and should rely only on a single protocol to interface with the boot chain. It is also a prerequisite for being able to increase the section alignment to 4k, which is needed to enable memory protections when running in the boot services. So only the setup_header fields that matter to the core kernel are populated explicitly, and everything else is ignored. One thing was overlooked, though: the initrd_addr_max field in the setup_header is not used by the core kernel, but it is used by the EFI stub itself when it loads the initrd, where its default value of INT_MAX is used as the soft limit for memory allocation. This means that, in the old situation, the initrd was virtually always loaded in the lower 2G of memory, but now, due to initrd_addr_max being 0x0, the initrd may end up anywhere in memory. This should not be an issue principle, as most systems can deal with this fine. However, it does appear to tickle some problems in older UEFI implementations, where the memory ends up being corrupted, resulting in errors when unpacking the initramfs. So set the initrd_addr_max field to INT_MAX like it was before. Fixes: 8117961d98fb2 ("x86/efi: Disregard setup header of loaded image") Reported-by: Radek Podgorny <radek@podgorny.cz> Closes: https://lore.kernel.org/all/a99a831a-8ad5-4cb0-bff9-be637311f771@podgorny.cz Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-03-24Merge tag 'efi-fixes-for-v6.9-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI fixes from Ard Biesheuvel: - Fix logic that is supposed to prevent placement of the kernel image below LOAD_PHYSICAL_ADDR - Use the firmware stack in the EFI stub when running in mixed mode - Clear BSS only once when using mixed mode - Check efi.get_variable() function pointer for NULL before trying to call it * tag 'efi-fixes-for-v6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efi: fix panic in kdump kernel x86/efistub: Don't clear BSS twice in mixed mode x86/efistub: Call mixed mode boot services on the firmware's stack efi/libstub: fix efi_random_alloc() to allocate memory at alloc_min or higher address
2024-03-24x86/efistub: Don't clear BSS twice in mixed modeArd Biesheuvel
Clearing BSS should only be done once, at the very beginning. efi_pe_entry() is the entrypoint from the firmware, which may not clear BSS and so it is done explicitly. However, efi_pe_entry() is also used as an entrypoint by the mixed mode startup code, in which case BSS will already have been cleared, and doing it again at this point will corrupt global variables holding the firmware's GDT/IDT and segment selectors. So make the memset() conditional on whether the EFI stub is running in native mode. Fixes: b3810c5a2cc4a666 ("x86/efistub: Clear decompressor BSS in native EFI entrypoint") Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-03-17Merge tag 'efi-fixes-for-v6.9-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI fix from Ard Biesheuvel: "This fixes an oversight on my part in the recent EFI stub rework for x86, which is needed to get Linux/x86 distro builds signed again for secure boot by Microsoft. For this reason, most of this work is being backported to v6.1, which is therefore also affected by this regression. - Explicitly wipe BSS in the native EFI entrypoint, so that globals shared with the legacy decompressor are zero-initialized correctly" * tag 'efi-fixes-for-v6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: x86/efistub: Clear decompressor BSS in native EFI entrypoint
2024-03-15x86/efistub: Clear decompressor BSS in native EFI entrypointArd Biesheuvel
The EFI stub on x86 no longer invokes the decompressor as a subsequent boot stage, but calls into the decompression code directly while running in the context of the EFI boot services. This means that when using the native EFI entrypoint (as opposed to the EFI handover protocol, which clears BSS explicitly), the firmware PE image loader is being relied upon to ensure that BSS is zeroed before the EFI stub is entered from the firmware. As Radek's report proves, this is a bad idea. Not all loaders do this correctly, which means some global variables that should be statically initialized to 0x0 may have junk in them. So clear BSS explicitly when entering via efi_pe_entry(). Note that zeroing BSS from C code is not generally safe, but in this case, the following assignment and dereference of a global pointer variable ensures that the memset() cannot be deferred or reordered. Cc: <stable@kernel.org> # v6.1+ Reported-by: Radek Podgorny <radek@podgorny.cz> Closes: https://lore.kernel.org/all/a99a831a-8ad5-4cb0-bff9-be637311f771@podgorny.cz Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-03-13Merge tag 'efi-next-for-v6.9' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI updates from Ard Biesheuvel: - Measure initrd and command line using the CC protocol if the ordinary TCG2 protocol is not implemented, typically on TDX confidential VMs - Avoid creating mappings that are both writable and executable while running in the EFI boot services. This is a prerequisite for getting the x86 shim loader signed by MicroSoft again, which allows the distros to install on x86 PCs that ship with EFI secure boot enabled. - API update for struct platform_driver::remove() * tag 'efi-next-for-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: virt: efi_secret: Convert to platform remove callback returning void x86/efistub: Remap kernel text read-only before dropping NX attribute efi/libstub: Add get_event_log() support for CC platforms efi/libstub: Measure into CC protocol if TCG2 protocol is absent efi/libstub: Add Confidential Computing (CC) measurement typedefs efi/tpm: Use symbolic GUID name from spec for final events table efi/libstub: Use TPM event typedefs from the TCG PC Client spec
2024-03-09x86/efistub: Remap kernel text read-only before dropping NX attributeArd Biesheuvel
Currently, the EFI stub invokes the EFI memory attributes protocol to strip any NX restrictions from the entire loaded kernel, resulting in all code and data being mapped read-write-execute. The point of the EFI memory attributes protocol is to remove the need for all memory allocations to be mapped with both write and execute permissions by default, and make it the OS loader's responsibility to transition data mappings to code mappings where appropriate. Even though the UEFI specification does not appear to leave room for denying memory attribute changes based on security policy, let's be cautious and avoid relying on the ability to create read-write-execute mappings. This is trivially achievable, given that the amount of kernel code executing via the firmware's 1:1 mapping is rather small and limited to the .head.text region. So let's drop the NX restrictions only on that subregion, but not before remapping it as read-only first. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-03-09efi/libstub: Add get_event_log() support for CC platformsKuppuswamy Sathyanarayanan
To allow event log info access after boot, EFI boot stub extracts the event log information and installs it in an EFI configuration table. Currently, EFI boot stub only supports installation of event log only for TPM 1.2 and TPM 2.0 protocols. Extend the same support for CC protocol. Since CC platform also uses TCG2 format, reuse TPM2 support code as much as possible. Link: https://uefi.org/specs/UEFI/2.10/38_Confidential_Computing.html#efi-cc-measurement-protocol [1] Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Link: https://lkml.kernel.org/r/0229a87e-fb19-4dad-99fc-4afd7ed4099a%40collabora.com [ardb: Split out final events table handling to avoid version confusion] Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-03-04x86/boot: Move mem_encrypt= parsing to the decompressorArd Biesheuvel
The early SME/SEV code parses the command line very early, in order to decide whether or not memory encryption should be enabled, which needs to occur even before the initial page tables are created. This is problematic for a number of reasons: - this early code runs from the 1:1 mapping provided by the decompressor or firmware, which uses a different translation than the one assumed by the linker, and so the code needs to be built in a special way; - parsing external input while the entire kernel image is still mapped writable is a bad idea in general, and really does not belong in security minded code; - the current code ignores the built-in command line entirely (although this appears to be the case for the entire decompressor) Given that the decompressor/EFI stub is an intrinsic part of the x86 bootable kernel image, move the command line parsing there and out of the core kernel. This removes the need to build lib/cmdline.o in a special way, or to use RIP-relative LEA instructions in inline asm blocks. This involves a new xloadflag in the setup header to indicate that mem_encrypt=on appeared on the kernel command line. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lore.kernel.org/r/20240227151907.387873-17-ardb+git@google.com
2024-01-30x86/efistub: Avoid placing the kernel below LOAD_PHYSICAL_ADDRArd Biesheuvel
The EFI stub's kernel placement logic randomizes the physical placement of the kernel by taking all available memory into account, and picking a region at random, based on a random seed. When KASLR is disabled, this seed is set to 0x0, and this results in the lowest available region of memory to be selected for loading the kernel, even if this is below LOAD_PHYSICAL_ADDR. Some of this memory is typically reserved for the GFP_DMA region, to accommodate masters that can only access the first 16 MiB of system memory. Even if such devices are rare these days, we may still end up with a warning in the kernel log, as reported by Tom: swapper/0: page allocation failure: order:10, mode:0xcc1(GFP_KERNEL|GFP_DMA), nodemask=(null),cpuset=/,mems_allowed=0 Fix this by tweaking the random allocation logic to accept a low bound on the placement, and set it to LOAD_PHYSICAL_ADDR. Fixes: a1b87d54f4e4 ("x86/efistub: Avoid legacy decompressor when doing EFI boot") Reported-by: Tom Englund <tomenglund26@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218404 Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-01-26x86/efistub: Give up if memory attribute protocol returns an errorArd Biesheuvel
The recently introduced EFI memory attributes protocol should be used if it exists to ensure that the memory allocation created for the kernel permits execution. This is needed for compatibility with tightened requirements related to Windows logo certification for x86 PCs. Currently, we simply strip the execute protect (XP) attribute from the entire range, but this might be rejected under some firmware security policies, and so in a subsequent patch, this will be changed to only strip XP from the executable region that runs early, and make it read-only (RO) as well. In order to catch any issues early, ensure that the memory attribute protocol works as intended, and give up if it produces spurious errors. Note that the DXE services based fallback was always based on best effort, so don't propagate any errors returned by that API. Fixes: a1b87d54f4e4 ("x86/efistub: Avoid legacy decompressor when doing EFI boot") Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-01-02efi/x86: Fix the missing KASLR_FLAG bit in boot_params->hdr.loadflagsYuntao Wang
When KASLR is enabled, the KASLR_FLAG bit in boot_params->hdr.loadflags should be set to 1 to propagate KASLR status from compressed kernel to kernel, just as the choose_random_location() function does. Currently, when the kernel is booted via the EFI stub, the KASLR_FLAG bit in boot_params->hdr.loadflags is not set, even though it should be. This causes some functions, such as kernel_randomize_memory(), not to execute as expected. Fix it. Fixes: a1b87d54f4e4 ("x86/efistub: Avoid legacy decompressor when doing EFI boot") Signed-off-by: Yuntao Wang <ytcoode@gmail.com> [ardb: drop 'else' branch clearing KASLR_FLAG] Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-12-11efi/x86: Avoid physical KASLR on older Dell systemsArd Biesheuvel
River reports boot hangs with v6.6 and v6.7, and the bisect points to commit a1b87d54f4e4 ("x86/efistub: Avoid legacy decompressor when doing EFI boot") which moves the memory allocation and kernel decompression from the legacy decompressor (which executes *after* ExitBootServices()) to the EFI stub, using boot services for allocating the memory. The memory allocation succeeds but the subsequent call to decompress_kernel() never returns, resulting in a failed boot and a hanging system. As it turns out, this issue only occurs when physical address randomization (KASLR) is enabled, and given that this is a feature we can live without (virtual KASLR is much more important), let's disable the physical part of KASLR when booting on AMI UEFI firmware claiming to implement revision v2.0 of the specification (which was released in 2006), as this is the version these systems advertise. Fixes: a1b87d54f4e4 ("x86/efistub: Avoid legacy decompressor when doing EFI boot") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218173 Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-10-30Merge tag 'x86-boot-2023-10-28' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 boot updates from Ingo Molnar: - Rework PE header generation, primarily to generate a modern, 4k aligned kernel image view with narrower W^X permissions. - Further refine init-lifetime annotations - Misc cleanups & fixes * tag 'x86-boot-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) x86/boot: efistub: Assign global boot_params variable x86/boot: Rename conflicting 'boot_params' pointer to 'boot_params_ptr' x86/head/64: Move the __head definition to <asm/init.h> x86/head/64: Add missing __head annotation to startup_64_load_idt() x86/head/64: Mark 'startup_gdt[]' and 'startup_gdt_descr' as __initdata x86/boot: Harmonize the style of array-type parameter for fixup_pointer() calls x86/boot: Fix incorrect startup_gdt_descr.size x86/boot: Compile boot code with -std=gnu11 too x86/boot: Increase section and file alignment to 4k/512 x86/boot: Split off PE/COFF .data section x86/boot: Drop PE/COFF .reloc section x86/boot: Construct PE/COFF .text section from assembler x86/boot: Derive file size from _edata symbol x86/boot: Define setup size in linker script x86/boot: Set EFI handover offset directly in header asm x86/boot: Grab kernel_info offset from zoffset header directly x86/boot: Drop references to startup_64 x86/boot: Drop redundant code setting the root device x86/boot: Omit compression buffer from PE/COFF image memory footprint x86/boot: Remove the 'bugger off' message ...
2023-10-18x86/boot: efistub: Assign global boot_params variableArd Biesheuvel
Now that the x86 EFI stub calls into some APIs exposed by the decompressor (e.g., kaslr_get_random_long()), it is necessary to ensure that the global boot_params variable is set correctly before doing so. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: linux-kernel@vger.kernel.org
2023-10-17x86/boot: efistub: Assign global boot_params variableArd Biesheuvel
Now that the x86 EFI stub calls into some APIs exposed by the decompressor (e.g., kaslr_get_random_long()), it is necessary to ensure that the global boot_params variable is set correctly before doing so. Note that the decompressor and the kernel proper carry conflicting declarations for the global variable 'boot_params' so refer to it via an alias to work around this. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-10-13x86/efistub: Don't try to print after ExitBootService()Nikolay Borisov
setup_e820() is executed after UEFI's ExitBootService has been called. This causes the firmware to throw an exception because the Console IO protocol is supposed to work only during boot service environment. As per UEFI 2.9, section 12.1: "This protocol is used to handle input and output of text-based information intended for the system user during the operation of code in the boot services environment." So drop the diagnostic warning from this function. We might add back a warning that is issued later when initializing the kernel itself. Signed-off-by: Nikolay Borisov <nik.borisov@suse.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-09-15x86/efi: Disregard setup header of loaded imageArd Biesheuvel
The native EFI entrypoint does not take a struct boot_params from the loader, but instead, it constructs one from scratch, using the setup header data placed at the start of the image. This setup header is placed in a way that permits legacy loaders to manipulate the contents (i.e., to pass the kernel command line or the address and size of an initial ramdisk), but EFI boot does not use it in that way - it only copies the contents that were placed there at build time, but EFI loaders will not (and should not) manipulate the setup header to configure the boot. (Commit 63bf28ceb3ebbe76 "efi: x86: Wipe setup_data on pure EFI boot" deals with some of the fallout of using setup_data in a way that breaks EFI boot.) Given that none of the non-zero values that are copied from the setup header into the EFI stub's struct boot_params are relevant to the boot now that the EFI stub no longer enters via the legacy decompressor, the copy can be omitted altogether. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20230912090051.4014114-19-ardb@google.com
2023-08-28Merge tag 'efi-next-for-v6.6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI updates from Ard Biesheuvel: "This primarily covers some cleanup work on the EFI runtime wrappers, which are shared between all EFI architectures except Itanium, and which provide some level of isolation to prevent faults occurring in the firmware code (which runs at the same privilege level as the kernel) from bringing down the system. Beyond that, there is a fix that did not make it into v6.5, and some doc fixes and dead code cleanup. - one bugfix for x86 mixed mode that did not make it into v6.5 - first pass of cleanup for the EFI runtime wrappers - some cosmetic touchups" * tag 'efi-next-for-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: x86/efistub: Fix PCI ROM preservation in mixed mode efi/runtime-wrappers: Clean up white space and add __init annotation acpi/prmt: Use EFI runtime sandbox to invoke PRM handlers efi/runtime-wrappers: Don't duplicate setup/teardown code efi/runtime-wrappers: Remove duplicated macro for service returning void efi/runtime-wrapper: Move workqueue manipulation out of line efi/runtime-wrappers: Use type safe encapsulation of call arguments efi/riscv: Move EFI runtime call setup/teardown helpers out of line efi/arm64: Move EFI runtime call setup/teardown helpers out of line efi/riscv: libstub: Fix comment about absolute relocation efi: memmap: Remove kernel-doc warnings efi: Remove unused extern declaration efi_lookup_mapped_addr()
2023-08-24x86/efistub: Fix PCI ROM preservation in mixed modeMikel Rychliski
preserve_pci_rom_image() was accessing the romsize field in efi_pci_io_protocol_t directly instead of using the efi_table_attr() helper. This prevents the ROM image from being saved correctly during a mixed mode boot. Fixes: 2c3625cb9fa2 ("efi/x86: Fold __setup_efi_pci32() and __setup_efi_pci64() into one function") Signed-off-by: Mikel Rychliski <mikel@mikelr.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-08-07x86/efistub: Avoid legacy decompressor when doing EFI bootArd Biesheuvel
The bare metal decompressor code was never really intended to run in a hosted environment such as the EFI boot services, and does a few things that are becoming problematic in the context of EFI boot now that the logo requirements are getting tighter: EFI executables will no longer be allowed to consist of a single executable section that is mapped with read, write and execute permissions if they are intended for use in a context where Secure Boot is enabled (and where Microsoft's set of certificates is used, i.e., every x86 PC built to run Windows). To avoid stepping on reserved memory before having inspected the E820 tables, and to ensure the correct placement when running a kernel build that is non-relocatable, the bare metal decompressor moves its own executable image to the end of the allocation that was reserved for it, in order to perform the decompression in place. This means the region in question requires both write and execute permissions, which either need to be given upfront (which EFI will no longer permit), or need to be applied on demand using the existing page fault handling framework. However, the physical placement of the kernel is usually randomized anyway, and even if it isn't, a dedicated decompression output buffer can be allocated anywhere in memory using EFI APIs when still running in the boot services, given that EFI support already implies a relocatable kernel. This means that decompression in place is never necessary, nor is moving the compressed image from one end to the other. Since EFI already maps all of memory 1:1, it is also unnecessary to create new page tables or handle page faults when decompressing the kernel. That means there is also no need to replace the special exception handlers for SEV. Generally, there is little need to do any of the things that the decompressor does beyond - initialize SEV encryption, if needed, - perform the 4/5 level paging switch, if needed, - decompress the kernel - relocate the kernel So do all of this from the EFI stub code, and avoid the bare metal decompressor altogether. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20230807162720.545787-24-ardb@kernel.org
2023-08-07x86/efistub: Perform SNP feature test while running in the firmwareArd Biesheuvel
Before refactoring the EFI stub boot flow to avoid the legacy bare metal decompressor, duplicate the SNP feature check in the EFI stub before handing over to the kernel proper. The SNP feature check can be performed while running under the EFI boot services, which means it can force the boot to fail gracefully and return an error to the bootloader if the loaded kernel does not implement support for all the features that the hypervisor enabled. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20230807162720.545787-23-ardb@kernel.org
2023-08-07x86/efistub: Prefer EFI memory attributes protocol over DXE servicesArd Biesheuvel
Currently, the EFI stub relies on DXE services in some cases to clear non-execute restrictions from page allocations that need to be executable. This is dodgy, because DXE services are not specified by UEFI but by PI, and they are not intended for consumption by OS loaders. However, no alternative existed at the time. Now, there is a new UEFI protocol that should be used instead, so if it exists, prefer it over the DXE services calls. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20230807162720.545787-18-ardb@kernel.org
2023-08-07x86/efistub: Perform 4/5 level paging switch from the stubArd Biesheuvel
In preparation for updating the EFI stub boot flow to avoid the bare metal decompressor code altogether, implement the support code for switching between 4 and 5 levels of paging before jumping to the kernel proper. Reuse the newly refactored trampoline that the bare metal decompressor uses, but relies on EFI APIs to allocate 32-bit addressable memory and remap it with the appropriate permissions. Given that the bare metal decompressor will no longer call into the trampoline if the number of paging levels is already set correctly, it is no longer needed to remove NX restrictions from the memory range where this trampoline may end up. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Link: https://lore.kernel.org/r/20230807162720.545787-17-ardb@kernel.org
2023-08-07x86/efistub: Clear BSS in EFI handover protocol entrypointArd Biesheuvel
The so-called EFI handover protocol is value-add from the distros that permits a loader to simply copy a PE kernel image into memory and call an alternative entrypoint that is described by an embedded boot_params structure. Most implementations of this protocol do not bother to check the PE header for minimum alignment, section placement, etc, and therefore also don't clear the image's BSS, or even allocate enough memory for it. Allocating more memory on the fly is rather difficult, but at least clear the BSS region explicitly when entering in this manner, so that the EFI stub code does not get confused by global variables that were not zero-initialized correctly. When booting in mixed mode, this BSS clearing must occur before any global state is created, so clear it in the 32-bit asm entry point. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20230807162720.545787-7-ardb@kernel.org
2023-08-07x86/efistub: Simplify and clean up handover entry codeArd Biesheuvel
Now that the EFI entry code in assembler is only used by the optional and deprecated EFI handover protocol, and given that the EFI stub C code no longer returns to it, most of it can simply be dropped. While at it, clarify the symbol naming, by merging efi_main() and efi_stub_entry(), making the latter the shared entry point for all different boot modes that enter via the EFI stub. The efi32_stub_entry() and efi64_stub_entry() names are referenced explicitly by the tooling that populates the setup header, so these must be retained, but can be emitted as aliases of efi_stub_entry() where appropriate. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20230807162720.545787-5-ardb@kernel.org
2023-08-07x86/efistub: Branch straight to kernel entry point from C codeArd Biesheuvel
Instead of returning to the calling code in assembler that does nothing more than perform an indirect call with the boot_params pointer in register ESI/RSI, perform the jump directly from the EFI stub C code. This will allow the asm entrypoint code to be dropped entirely in subsequent patches. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20230807162720.545787-4-ardb@kernel.org
2023-06-06x86/efi: Safely enable unaccepted memory in UEFIDionna Glaze
The UEFI v2.9 specification includes a new memory type to be used in environments where the OS must accept memory that is provided from its host. Before the introduction of this memory type, all memory was accepted eagerly in the firmware. In order for the firmware to safely stop accepting memory on the OS's behalf, the OS must affirmatively indicate support to the firmware. This is only a problem for AMD SEV-SNP, since Linux has had support for it since 5.19. The other technology that can make use of unaccepted memory, Intel TDX, does not yet have Linux support, so it can strictly require unaccepted memory support as a dependency of CONFIG_TDX and not require communication with the firmware. Enabling unaccepted memory requires calling a 0-argument enablement protocol before ExitBootServices. This call is only made if the kernel is compiled with UNACCEPTED_MEMORY=y This protocol will be removed after the end of life of the first LTS that includes it, in order to give firmware implementations an expiration date for it. When the protocol is removed, firmware will strictly infer that a SEV-SNP VM is running an OS that supports the unaccepted memory type. At the earliest convenience, when unaccepted memory support is added to Linux, SEV-SNP may take strict dependence in it. After the firmware removes support for the protocol, this should be reverted. [tl: address some checkscript warnings] Signed-off-by: Dionna Glaze <dionnaglaze@google.com> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/0d5f3d9a20b5cf361945b7ab1263c36586a78a42.1686063086.git.thomas.lendacky@amd.com
2023-06-06efi/libstub: Implement support for unaccepted memoryKirill A. Shutemov
UEFI Specification version 2.9 introduces the concept of memory acceptance: Some Virtual Machine platforms, such as Intel TDX or AMD SEV-SNP, requiring memory to be accepted before it can be used by the guest. Accepting happens via a protocol specific for the Virtual Machine platform. Accepting memory is costly and it makes VMM allocate memory for the accepted guest physical address range. It's better to postpone memory acceptance until memory is needed. It lowers boot time and reduces memory overhead. The kernel needs to know what memory has been accepted. Firmware communicates this information via memory map: a new memory type -- EFI_UNACCEPTED_MEMORY -- indicates such memory. Range-based tracking works fine for firmware, but it gets bulky for the kernel: e820 (or whatever the arch uses) has to be modified on every page acceptance. It leads to table fragmentation and there's a limited number of entries in the e820 table. Another option is to mark such memory as usable in e820 and track if the range has been accepted in a bitmap. One bit in the bitmap represents a naturally aligned power-2-sized region of address space -- unit. For x86, unit size is 2MiB: 4k of the bitmap is enough to track 64GiB or physical address space. In the worst-case scenario -- a huge hole in the middle of the address space -- It needs 256MiB to handle 4PiB of the address space. Any unaccepted memory that is not aligned to unit_size gets accepted upfront. The bitmap is allocated and constructed in the EFI stub and passed down to the kernel via EFI configuration table. allocate_e820() allocates the bitmap if unaccepted memory is present, according to the size of unaccepted region. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20230606142637.5171-4-kirill.shutemov@linux.intel.com
2023-06-06efi/x86: Get full memory map in allocate_e820()Kirill A. Shutemov
Currently allocate_e820() is only interested in the size of map and size of memory descriptor to determine how many e820 entries the kernel needs. UEFI Specification version 2.9 introduces a new memory type -- unaccepted memory. To track unaccepted memory, the kernel needs to allocate a bitmap. The size of the bitmap is dependent on the maximum physical address present in the system. A full memory map is required to find the maximum address. Modify allocate_e820() to get a full memory map. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20230606142637.5171-3-kirill.shutemov@linux.intel.com
2022-11-24x86/boot/compressed, efi: Merge multiple definitions of image_offset into oneArd Biesheuvel
There is no need for head_32.S and head_64.S both declaring a copy of the global 'image_offset' variable, so drop those and make the extern C declaration the definition. When image_offset is moved to the .c file, it needs to be placed particularly in the .data section because it lands by default in the .bss section which is cleared too late, in .Lrelocated, before the first access to it and thus garbage gets read, leading to SEV guests exploding in early boot. This happens only when the SEV guest kernel is loaded through grub. If supplied with qemu's -kernel command line option, that memory is always cleared upfront by qemu and all is fine there. [ bp: Expand commit message with SEV aspect. ] Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221122161017.2426828-8-ardb@kernel.org
2022-10-21efi: libstub: Give efi_main() asmlinkage qualificationArd Biesheuvel
To stop the bots from sending sparse warnings to me and the list about efi_main() not having a prototype, decorate it with asmlinkage so that it is clear that it is called from assembly, and therefore needs to remain external, even if it is never declared in a header file. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-10-09Merge tag 'efi-next-for-v6.1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI updates from Ard Biesheuvel: "A bit more going on than usual in the EFI subsystem. The main driver for this has been the introduction of the LoonArch architecture last cycle, which inspired some cleanup and refactoring of the EFI code. Another driver for EFI changes this cycle and in the future is confidential compute. The LoongArch architecture does not use either struct bootparams or DT natively [yet], and so passing information between the EFI stub and the core kernel using either of those is undesirable. And in general, overloading DT has been a source of issues on arm64, so using DT for this on new architectures is a to avoid for the time being (even if we might converge on something DT based for non-x86 architectures in the future). For this reason, in addition to the patch that enables EFI boot for LoongArch, there are a number of refactoring patches applied on top of which separate the DT bits from the generic EFI stub bits. These changes are on a separate topich branch that has been shared with the LoongArch maintainers, who will include it in their pull request as well. This is not ideal, but the best way to manage the conflicts without stalling LoongArch for another cycle. Another development inspired by LoongArch is the newly added support for EFI based decompressors. Instead of adding yet another arch-specific incarnation of this pattern for LoongArch, we are introducing an EFI app based on the existing EFI libstub infrastructure that encapulates the decompression code we use on other architectures, but in a way that is fully generic. This has been developed and tested in collaboration with distro and systemd folks, who are eager to start using this for systemd-boot and also for arm64 secure boot on Fedora. Note that the EFI zimage files this introduces can also be decompressed by non-EFI bootloaders if needed, as the image header describes the location of the payload inside the image, and the type of compression that was used. (Note that Fedora's arm64 GRUB is buggy [0] so you'll need a recent version or switch to systemd-boot in order to use this.) Finally, we are adding TPM measurement of the kernel command line provided by EFI. There is an oversight in the TCG spec which results in a blind spot for command line arguments passed to loaded images, which means that either the loader or the stub needs to take the measurement. Given the combinatorial explosion I am anticipating when it comes to firmware/bootloader stacks and firmware based attestation protocols (SEV-SNP, TDX, DICE, DRTM), it is good to set a baseline now when it comes to EFI measured boot, which is that the kernel measures the initrd and command line. Intermediate loaders can measure additional assets if needed, but with the baseline in place, we can deploy measured boot in a meaningful way even if you boot into Linux straight from the EFI firmware. Summary: - implement EFI boot support for LoongArch - implement generic EFI compressed boot support for arm64, RISC-V and LoongArch, none of which implement a decompressor today - measure the kernel command line into the TPM if measured boot is in effect - refactor the EFI stub code in order to isolate DT dependencies for architectures other than x86 - avoid calling SetVirtualAddressMap() on arm64 if the configured size of the VA space guarantees that doing so is unnecessary - move some ARM specific code out of the generic EFI source files - unmap kernel code from the x86 mixed mode 1:1 page tables" * tag 'efi-next-for-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: (24 commits) efi/arm64: libstub: avoid SetVirtualAddressMap() when possible efi: zboot: create MemoryMapped() device path for the parent if needed efi: libstub: fix up the last remaining open coded boot service call efi/arm: libstub: move ARM specific code out of generic routines efi/libstub: measure EFI LoadOptions efi/libstub: refactor the initrd measuring functions efi/loongarch: libstub: remove dependency on flattened DT efi: libstub: install boot-time memory map as config table efi: libstub: remove DT dependency from generic stub efi: libstub: unify initrd loading between architectures efi: libstub: remove pointless goto kludge efi: libstub: simplify efi_get_memory_map() and struct efi_boot_memmap efi: libstub: avoid efi_get_memory_map() for allocating the virt map efi: libstub: drop pointless get_memory_map() call efi: libstub: fix type confusion for load_options_size arm64: efi: enable generic EFI compressed boot loongarch: efi: enable generic EFI compressed boot riscv: efi: enable generic EFI compressed boot efi/libstub: implement generic EFI zboot efi/libstub: move efi_system_table global var into separate object ...
2022-09-27efi: libstub: unify initrd loading between architecturesArd Biesheuvel
Use a EFI configuration table to pass the initrd to the core kernel, instead of per-arch methods. This cleans up the code considerably, and should make it easier for architectures to get rid of their reliance on DT for doing EFI boot in the future. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>