linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2022-12-01	efi: stub: use random seed from EFI variable	Jason A. Donenfeld
	EFI has a rather unique benefit that it has access to some limited non-volatile storage, where the kernel can store a random seed. Read that seed in EFISTUB and concatenate it with other seeds we wind up passing onward to the kernel in the configuration table. This is complementary to the current other two sources - previous bootloaders, and the EFI RNG protocol. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> [ardb: check for non-NULL RNG protocol pointer, call GetVariable() without buffer first to obtain the size] Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-24	x86/boot/compressed, efi: Merge multiple definitions of image_offset into one	Ard Biesheuvel
	There is no need for head_32.S and head_64.S both declaring a copy of the global 'image_offset' variable, so drop those and make the extern C declaration the definition. When image_offset is moved to the .c file, it needs to be placed particularly in the .data section because it lands by default in the .bss section which is cleared too late, in .Lrelocated, before the first access to it and thus garbage gets read, leading to SEV guests exploding in early boot. This happens only when the SEV guest kernel is loaded through grub. If supplied with qemu's -kernel command line option, that memory is always cleared upfront by qemu and all is fine there. [ bp: Expand commit message with SEV aspect. ] Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221122161017.2426828-8-ardb@kernel.org
2022-11-18	efi: random: combine bootloader provided RNG seed with RNG protocol output	Ard Biesheuvel
	Instead of blindly creating the EFI random seed configuration table if the RNG protocol is implemented and works, check whether such a EFI configuration table was provided by an earlier boot stage and if so, concatenate the existing and the new seeds, leaving it up to the core code to mix it in and credit it the way it sees fit. This can be used for, e.g., systemd-boot, to pass an additional seed to Linux in a way that can be consumed by the kernel very early. In that case, the following definitions should be used to pass the seed to the EFI stub: struct linux_efi_random_seed { u32 size; // of the 'seed' array in bytes u8 seed[]; }; The memory for the struct must be allocated as EFI_ACPI_RECLAIM_MEMORY pool memory, and the address of the struct in memory should be installed as a EFI configuration table using the following GUID: LINUX_EFI_RANDOM_SEED_TABLE_GUID 1ce1e5bc-7ceb-42f2-81e5-8aadf180f57b Note that doing so is safe even on kernels that were built without this patch applied, but the seed will simply be overwritten with a seed derived from the EFI RNG protocol, if available. The recommended seed size is 32 bytes, and seeds larger than 512 bytes are considered corrupted and ignored entirely. In order to preserve forward secrecy, seeds from previous bootloaders are memzero'd out, and in order to preserve memory, those older seeds are also freed from memory. Freeing from memory without first memzeroing is not safe to do, as it's possible that nothing else will ever overwrite those pages used by EFI. Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com> [ardb: incorporate Jason's followup changes to extend the maximum seed size on the consumer end, memzero() it and drop a needless printk] Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-18	efi: libstub: fix efi_load_initrd_dev_path() kernel-doc comment	Jialin Zhang
	commit f4dc7fffa987 ("efi: libstub: unify initrd loading between architectures") merge the first and the second parameters into a struct without updating the kernel-doc. Let's fix it. Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-18	efi: libstub: Add mixed mode support to command line initrd loader	Ard Biesheuvel
	Now that we have support for calling protocols that need additional marshalling for mixed mode, wire up the initrd command line loader. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-18	efi: libstub: Permit mixed mode return types other than efi_status_t	Ard Biesheuvel
	Rework the EFI stub macro wrappers around protocol method calls and other indirect calls in order to allow return types other than efi_status_t. This means the widening should be conditional on whether or not the return type is efi_status_t, and should be omitted otherwise. Also, switch to _Generic() to implement the type based compile time conditionals, which is more concise, and distinguishes between efi_status_t and u64 properly. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-18	efi: libstub: Implement devicepath support for initrd commandline loader	Ard Biesheuvel
	Currently, the initrd= command line option to the EFI stub only supports loading files that reside on the same volume as the loaded image, which is not workable for loaders like GRUB that don't even implement the volume abstraction (EFI_SIMPLE_FILE_SYSTEM_PROTOCOL), and load the kernel from an anonymous buffer in memory. For this reason, another method was devised that relies on the LoadFile2 protocol. However, the command line loader is rather useful when using the UEFI shell or other generic loaders that have no awareness of Linux specific protocols so let's make it a bit more flexible, by permitting textual device paths to be provided to initrd= as well, provided that they refer to a file hosted on a EFI_SIMPLE_FILE_SYSTEM_PROTOCOL volume. E.g., initrd=PciRoot(0x0)/Pci(0x3,0x0)/HD(1,MBR,0xBE1AFDFA,0x3F,0xFBFC1)/rootfs.cpio.gz Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-18	efi: libstub: use EFI_LOADER_CODE region when moving the kernel in memory	Ard Biesheuvel
	The EFI spec is not very clear about which permissions are being given when allocating pages of a certain type. However, it is quite obvious that EFI_LOADER_CODE is more likely to permit execution than EFI_LOADER_DATA, which becomes relevant once we permit booting the kernel proper with the firmware's 1:1 mapping still active. Ostensibly, recent systems such as the Surface Pro X grant executable permissions to EFI_LOADER_CODE regions but not EFI_LOADER_DATA regions. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-18	Merge tag 'efi-zboot-direct-for-v6.2' into efi/next	Ard Biesheuvel

2022-11-10	arm64: efi: Force the use of SetVirtualAddressMap() on Altra machines	Ard Biesheuvel
	Ampere Altra machines are reported to misbehave when the SetTime() EFI runtime service is called after ExitBootServices() but before calling SetVirtualAddressMap(). Given that the latter is horrid, pointless and explicitly documented as optional by the EFI spec, we no longer invoke it at boot if the configured size of the VA space guarantees that the EFI runtime memory regions can remain mapped 1:1 like they are at boot time. On Ampere Altra machines, this results in SetTime() calls issued by the rtc-efi driver triggering synchronous exceptions during boot. We can now recover from those without bringing down the system entirely, due to commit 23715a26c8d81291 ("arm64: efi: Recover from synchronous exceptions occurring in firmware"). However, it would be better to avoid the issue entirely, given that the firmware appears to remain in a funny state after this. So attempt to identify these machines based on the 'family' field in the type #1 SMBIOS record, and call SetVirtualAddressMap() unconditionally in that case. Tested-by: Alexandru Elisei <alexandru.elisei@gmail.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-09	arm64: unwind: add asynchronous unwind tables to kernel and modules	Ard Biesheuvel
	Enable asynchronous unwind table generation for both the core kernel as well as modules, and emit the resulting .eh_frame sections as init code so we can use the unwind directives for code patching at boot or module load time. This will be used by dynamic shadow call stack support, which will rely on code patching rather than compiler codegen to emit the shadow call stack push and pop instructions. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Tested-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/r/20221027155908.1940624-2-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-11-09	efi: libstub: Merge zboot decompressor with the ordinary stub	Ard Biesheuvel
	Even though our EFI zboot decompressor is pedantically spec compliant and idiomatic for EFI image loaders, calling LoadImage() and StartImage() for the nested image is a bit of a burden. Not only does it create workflow issues for the distros (as both the inner and outer PE/COFF images need to be signed for secure boot), it also copies the image around in memory numerous times: - first, the image is decompressed into a buffer; - the buffer is consumed by LoadImage(), which copies the sections into a newly allocated memory region to hold the executable image; - once the EFI stub is invoked by StartImage(), it will also move the image in memory in case of KASLR, mirrored memory or if the image must execute from a certain a priori defined address. There are only two EFI spec compliant ways to load code into memory and execute it: - use LoadImage() and StartImage(), - call ExitBootServices() and take ownership of the entire system, after which anything goes. Given that the EFI zboot decompressor always invokes the EFI stub, and given that both are built from the same set of objects, let's merge the two, so that we can avoid LoadImage()/StartImage but still load our image into memory without breaking the above rules. This also means we can decompress the image directly into its final location, which could be randomized or meet other platform specific constraints that LoadImage() does not know how to adhere to. It also means that, even if the encapsulated image still has the EFI stub incorporated as well, it does not need to be signed for secure boot when wrapping it in the EFI zboot decompressor. In the future, we might decide to retire the EFI stub attached to the decompressed image, but for the time being, they can happily coexist. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-09	efi/loongarch: libstub: Split off kernel image relocation for builtin stub	Ard Biesheuvel
	The LoongArch build of the EFI stub is part of the core kernel image, and therefore accesses section markers directly when it needs to figure out the size of the various section. The zboot decompressor does not have access to those symbols, but doesn't really need that either. So let's move handle_kernel_image() into a separate file (or rather, move everything else into a separate file) so that the zboot build does not pull in unused code that links to symbols that it does not define. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-09	efi/loongarch: Don't jump to kernel entry via the old image	Ard Biesheuvel
	Currently, the EFI entry code for LoongArch is set up to copy the executable image to the preferred offset, but instead of branching directly into that image, it branches to the local copy of kernel_entry, and relies on the logic in that function to switch to the link time address instead. This is a bit sloppy, and not something we can support once we merge the EFI decompressor with the EFI stub. So let's clean this up a bit, by adding a helper that computes the offset of kernel_entry from the start of the image, and simply adding the result to VMLINUX_LOAD_ADDRESS. And considering that we cannot execute from anywhere else anyway, let's avoid efi_relocate_kernel() and just allocate the pages instead. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-09	efi/arm64: libstub: Split off kernel image relocation for builtin stub	Ard Biesheuvel
	The arm64 build of the EFI stub is part of the core kernel image, and therefore accesses section markers directly when it needs to figure out the size of the various section. The zboot decompressor does not have access to those symbols, but doesn't really need that either. So let's move handle_kernel_image() into a separate file (or rather, move everything else into a separate file) so that the zboot build does not pull in unused code that links to symbols that it does not define. While at it, introduce a helper routine that the generic zboot loader will need to invoke after decompressing the image but before invoking it, to ensure that the I-side view of memory is consistent. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-09	efi/riscv: libstub: Split off kernel image relocation for builtin stub	Ard Biesheuvel
	The RISC-V build of the EFI stub is part of the core kernel image, and therefore accesses section markers directly when it needs to figure out the size of the various section. The zboot decompressor does not have access to those symbols, but doesn't really need that either. So let's move handle_kernel_image() into a separate file (or rather, move everything else into a separate file) so that the zboot build does not pull in unused code that links to symbols that it does not define. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-09	efi: libstub: Factor out min alignment and preferred kernel load address	Ard Biesheuvel
	Factor out the expressions that describe the preferred placement of the loaded image as well as the minimum alignment so we can reuse them in the decompressor. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-09	efi: libstub: Add image code and data size to the zimage metadata	Ard Biesheuvel
	In order to be able to switch from LoadImage() [which treats the supplied PE/COFF image as file input only, and reconstructs the memory image based on the section descriptors] to a mode where we allocate the memory directly, and invoke the image in place, we need to now how much memory to allocate beyond the end of the image. So copy this information from the payload's PE/COFF header to the end of the compressed version of the payload, so that the decompressor app can access it before performing the decompression itself. We'll also need to size of the code region once we switch arm64 to jumping to the kernel proper with MMU and caches enabled, so let's capture that information as well. Note that SizeOfCode does not account for the header, so we need SizeOfHeaders as well. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-09	efi: libstub: Factor out EFI stub entrypoint into separate file	Ard Biesheuvel
	In preparation for allowing the EFI zboot decompressor to reuse most of the EFI stub machinery, factor out the actual EFI PE/COFF entrypoint into a separate file. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-09	efi: libstub: Provide local implementations of strrchr() and memchr()	Ard Biesheuvel
	Clone the implementations of strrchr() and memchr() in lib/string.c so we can use them in the standalone zboot decompressor app. These routines are used by the FDT handling code. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-09	efi: libstub: Move screen_info handling to common code	Ard Biesheuvel
	Currently, arm64, RISC-V and LoongArch rely on the fact that struct screen_info can be accessed directly, due to the fact that the EFI stub and the core kernel are part of the same image. This will change after a future patch, so let's ensure that the screen_info handling is able to deal with this, by adopting the arm32 approach of passing it as a configuration table. While at it, switch to ACPI reclaim memory to hold the screen_info data, which is more appropriate for this kind of allocation. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-09	efi: libstub: Enable efi_printk() in zboot decompressor	Ard Biesheuvel
	Split the efi_printk() routine into its own source file, and provide local implementations of strlen() and strnlen() so that the standalone zboot app can efi_err and efi_info etc. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-09	efi: libstub: Clone memcmp() into the stub	Ard Biesheuvel
	We will no longer be able to call into the kernel image once we merge the decompressor with the EFI stub, so we need our own implementation of memcmp(). Let's add the one from lib/string.c and simplify it. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-09	efi: libstub: Use local strncmp() implementation unconditionally	Ard Biesheuvel
	In preparation for moving the EFI stub functionality into the zboot decompressor, switch to the stub's implementation of strncmp() unconditionally. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-09	arm64: efi: Move efi-entry.S into the libstub source directory	Ard Biesheuvel
	We will be sharing efi-entry.S with the zboot decompressor build, which does not link against vmlinux directly. So move it into the libstub source directory so we can include in the libstub static library. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com>
2022-11-09	arm64: efi: Move dcache cleaning of loaded image out of efi_enter_kernel()	Ard Biesheuvel
	The efi_enter_kernel() routine will be shared between the existing EFI stub and the zboot decompressor, and the version of dcache_clean_to_poc() that the core kernel exports to the stub will not be available in the latter case. So move the handling into the .c file which will remain part of the stub build that integrates directly with the kernel proper. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com>
2022-11-09	efi: libstub: Deduplicate ftrace command line argument filtering	Ard Biesheuvel
	No need for the same pattern to be used four times for each architecture individually if we can just apply it once later. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-09	efi: libstub: Drop handling of EFI properties table	Ard Biesheuvel
	The EFI properties table was a short lived experiment that never saw the light of day on non-x86 (if at all) so let's drop the handling of it. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-09	efi: libstub: Drop randomization of runtime memory map	Ard Biesheuvel
	Randomizing the UEFI runtime memory map requires the use of the SetVirtualAddressMap() EFI boot service, which we prefer to avoid. So let's drop randomization, which was already problematic in combination with hibernation, which means that distro kernels never enabled it in the first place. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-10-24	efi: random: Use 'ACPI reclaim' memory for random seed	Ard Biesheuvel
	EFI runtime services data is guaranteed to be preserved by the OS, making it a suitable candidate for the EFI random seed table, which may be passed to kexec kernels as well (after refreshing the seed), and so we need to ensure that the memory is preserved without support from the OS itself. However, runtime services data is intended for allocations that are relevant to the implementations of the runtime services themselves, and so they are unmapped from the kernel linear map, and mapped into the EFI page tables that are active while runtime service invocations are in progress. None of this is needed for the RNG seed. So let's switch to EFI 'ACPI reclaim' memory: in spite of the name, there is nothing exclusively ACPI about it, it is simply a type of allocation that carries firmware provided data which may or may not be relevant to the OS, and it is left up to the OS to decide whether to reclaim it after having consumed its contents. Given that in Linux, we never reclaim these allocations, it is a good choice for the EFI RNG seed, as the allocation is guaranteed to survive kexec reboots. One additional reason for changing this now is to align it with the upcoming recommendation for EFI bootloader provided RNG seeds, which must not use EFI runtime services code/data allocations. Cc: <stable@vger.kernel.org> # v4.14+ Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
2022-10-21	efi: runtime: Don't assume virtual mappings are missing if VA == PA == 0	Ard Biesheuvel
	The generic EFI stub can be instructed to avoid SetVirtualAddressMap(), and simply run with the firmware's 1:1 mapping. In this case, it populates the virtual address fields of the runtime regions in the memory map with the physical address of each region, so that the mapping code has to be none the wiser. Only if SetVirtualAddressMap() fails, the virtual addresses are wiped and the kernel code knows that the regions cannot be mapped. However, wiping amounts to setting it to zero, and if a runtime region happens to live at physical address 0, its valid 1:1 mapped virtual address could be mistaken for a wiped field, resulting on loss of access to the EFI services at runtime. So let's only assume that VA == 0 means 'no runtime services' if the region in question does not live at PA 0x0. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-10-21	efi: libstub: Fix incorrect payload size in zboot header	Ard Biesheuvel
	The linker script symbol definition that captures the size of the compressed payload inside the zboot decompressor (which is exposed via the image header) refers to '.' for the end of the region, which does not give the correct result as the expression is not placed at the end of the payload. So use the symbol name explicitly. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-10-21	efi: libstub: Give efi_main() asmlinkage qualification	Ard Biesheuvel
	To stop the bots from sending sparse warnings to me and the list about efi_main() not having a prototype, decorate it with asmlinkage so that it is clear that it is called from assembly, and therefore needs to remain external, even if it is never declared in a header file. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-10-21	efi: libstub: Remove zboot signing from build options	Ard Biesheuvel
	The zboot decompressor series introduced a feature to sign the PE/COFF kernel image for secure boot as part of the kernel build. This was necessary because there are actually two images that need to be signed: the kernel with the EFI stub attached, and the decompressor application. This is a bit of a burden, because it means that the images must be signed on the the same system that performs the build, and this is not realistic for distros. During the next cycle, we will introduce changes to the zboot code so that the inner image no longer needs to be signed. This means that the outer PE/COFF image can be handled as usual, and be signed later in the release process. Let's remove the associated Kconfig options now so that they don't end up in a LTS release while already being deprecated. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-10-10	Merge tag 'mm-stable-2022-10-08' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in linux-next for a couple of months without, to my knowledge, any negative reports (or any positive ones, come to that). - Also the Maple Tree from Liam Howlett. An overlapping range-based tree for vmas. It it apparently slightly more efficient in its own right, but is mainly targeted at enabling work to reduce mmap_lock contention. Liam has identified a number of other tree users in the kernel which could be beneficially onverted to mapletrees. Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat at [1]. This has yet to be addressed due to Liam's unfortunately timed vacation. He is now back and we'll get this fixed up. - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses clang-generated instrumentation to detect used-unintialized bugs down to the single bit level. KMSAN keeps finding bugs. New ones, as well as the legacy ones. - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of memory into THPs. - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to support file/shmem-backed pages. - userfaultfd updates from Axel Rasmussen - zsmalloc cleanups from Alexey Romanov - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and memory-failure - Huang Ying adds enhancements to NUMA balancing memory tiering mode's page promotion, with a new way of detecting hot pages. - memcg updates from Shakeel Butt: charging optimizations and reduced memory consumption. - memcg cleanups from Kairui Song. - memcg fixes and cleanups from Johannes Weiner. - Vishal Moola provides more folio conversions - Zhang Yi removed ll_rw_block() :( - migration enhancements from Peter Xu - migration error-path bugfixes from Huang Ying - Aneesh Kumar added ability for a device driver to alter the memory tiering promotion paths. For optimizations by PMEM drivers, DRM drivers, etc. - vma merging improvements from Jakub Matěn. - NUMA hinting cleanups from David Hildenbrand. - xu xin added aditional userspace visibility into KSM merging activity. - THP & KSM code consolidation from Qi Zheng. - more folio work from Matthew Wilcox. - KASAN updates from Andrey Konovalov. - DAMON cleanups from Kaixu Xia. - DAMON work from SeongJae Park: fixes, cleanups. - hugetlb sysfs cleanups from Muchun Song. - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core. Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1] * tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits) hugetlb: allocate vma lock for all sharable vmas hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer hugetlb: fix vma lock handling during split vma and range unmapping mglru: mm/vmscan.c: fix imprecise comments mm/mglru: don't sync disk for each aging cycle mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol mm: memcontrol: use do_memsw_account() in a few more places mm: memcontrol: deprecate swapaccounting=0 mode mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled mm/secretmem: remove reduntant return value mm/hugetlb: add available_huge_pages() func mm: remove unused inline functions from include/linux/mm_inline.h selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd selftests/vm: add thp collapse shmem testing selftests/vm: add thp collapse file and tmpfs testing selftests/vm: modularize thp collapse memory operations selftests/vm: dedup THP helpers mm/khugepaged: add tracepoint to hpage_collapse_scan_file() mm/madvise: add file and shmem support to MADV_COLLAPSE ...
2022-10-09	Merge tag 'efi-next-for-v6.1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI updates from Ard Biesheuvel: "A bit more going on than usual in the EFI subsystem. The main driver for this has been the introduction of the LoonArch architecture last cycle, which inspired some cleanup and refactoring of the EFI code. Another driver for EFI changes this cycle and in the future is confidential compute. The LoongArch architecture does not use either struct bootparams or DT natively [yet], and so passing information between the EFI stub and the core kernel using either of those is undesirable. And in general, overloading DT has been a source of issues on arm64, so using DT for this on new architectures is a to avoid for the time being (even if we might converge on something DT based for non-x86 architectures in the future). For this reason, in addition to the patch that enables EFI boot for LoongArch, there are a number of refactoring patches applied on top of which separate the DT bits from the generic EFI stub bits. These changes are on a separate topich branch that has been shared with the LoongArch maintainers, who will include it in their pull request as well. This is not ideal, but the best way to manage the conflicts without stalling LoongArch for another cycle. Another development inspired by LoongArch is the newly added support for EFI based decompressors. Instead of adding yet another arch-specific incarnation of this pattern for LoongArch, we are introducing an EFI app based on the existing EFI libstub infrastructure that encapulates the decompression code we use on other architectures, but in a way that is fully generic. This has been developed and tested in collaboration with distro and systemd folks, who are eager to start using this for systemd-boot and also for arm64 secure boot on Fedora. Note that the EFI zimage files this introduces can also be decompressed by non-EFI bootloaders if needed, as the image header describes the location of the payload inside the image, and the type of compression that was used. (Note that Fedora's arm64 GRUB is buggy [0] so you'll need a recent version or switch to systemd-boot in order to use this.) Finally, we are adding TPM measurement of the kernel command line provided by EFI. There is an oversight in the TCG spec which results in a blind spot for command line arguments passed to loaded images, which means that either the loader or the stub needs to take the measurement. Given the combinatorial explosion I am anticipating when it comes to firmware/bootloader stacks and firmware based attestation protocols (SEV-SNP, TDX, DICE, DRTM), it is good to set a baseline now when it comes to EFI measured boot, which is that the kernel measures the initrd and command line. Intermediate loaders can measure additional assets if needed, but with the baseline in place, we can deploy measured boot in a meaningful way even if you boot into Linux straight from the EFI firmware. Summary: - implement EFI boot support for LoongArch - implement generic EFI compressed boot support for arm64, RISC-V and LoongArch, none of which implement a decompressor today - measure the kernel command line into the TPM if measured boot is in effect - refactor the EFI stub code in order to isolate DT dependencies for architectures other than x86 - avoid calling SetVirtualAddressMap() on arm64 if the configured size of the VA space guarantees that doing so is unnecessary - move some ARM specific code out of the generic EFI source files - unmap kernel code from the x86 mixed mode 1:1 page tables" * tag 'efi-next-for-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: (24 commits) efi/arm64: libstub: avoid SetVirtualAddressMap() when possible efi: zboot: create MemoryMapped() device path for the parent if needed efi: libstub: fix up the last remaining open coded boot service call efi/arm: libstub: move ARM specific code out of generic routines efi/libstub: measure EFI LoadOptions efi/libstub: refactor the initrd measuring functions efi/loongarch: libstub: remove dependency on flattened DT efi: libstub: install boot-time memory map as config table efi: libstub: remove DT dependency from generic stub efi: libstub: unify initrd loading between architectures efi: libstub: remove pointless goto kludge efi: libstub: simplify efi_get_memory_map() and struct efi_boot_memmap efi: libstub: avoid efi_get_memory_map() for allocating the virt map efi: libstub: drop pointless get_memory_map() call efi: libstub: fix type confusion for load_options_size arm64: efi: enable generic EFI compressed boot loongarch: efi: enable generic EFI compressed boot riscv: efi: enable generic EFI compressed boot efi/libstub: implement generic EFI zboot efi/libstub: move efi_system_table global var into separate object ...
2022-10-06	Merge tag 'arm64-upstream' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: - arm64 perf: DDR PMU driver for Alibaba's T-Head Yitian 710 SoC, SVE vector granule register added to the user regs together with SVE perf extensions documentation. - SVE updates: add HWCAP for SVE EBF16, update the SVE ABI documentation to match the actual kernel behaviour (zeroing the registers on syscall rather than "zeroed or preserved" previously). - More conversions to automatic system registers generation. - vDSO: use self-synchronising virtual counter access in gettimeofday() if the architecture supports it. - arm64 stacktrace cleanups and improvements. - arm64 atomics improvements: always inline assembly, remove LL/SC trampolines. - Improve the reporting of EL1 exceptions: rework BTI and FPAC exception handling, better EL1 undefs reporting. - Cortex-A510 erratum 2658417: remove BF16 support due to incorrect result. - arm64 defconfig updates: build CoreSight as a module, enable options necessary for docker, memory hotplug/hotremove, enable all PMUs provided by Arm. - arm64 ptrace() support for TPIDR2_EL0 (register provided with the SME extensions). - arm64 ftraces updates/fixes: fix module PLTs with mcount, remove unused function. - kselftest updates for arm64: simple HWCAP validation, FP stress test improvements, validation of ZA regs in signal handlers, include larger SVE and SME vector lengths in signal tests, various cleanups. - arm64 alternatives (code patching) improvements to robustness and consistency: replace cpucap static branches with equivalent alternatives, associate callback alternatives with a cpucap. - Miscellaneous updates: optimise kprobe performance of patching single-step slots, simplify uaccess_mask_ptr(), move MTE registers initialisation to C, support huge vmalloc() mappings, run softirqs on the per-CPU IRQ stack, compat (arm32) misalignment fixups for multiword accesses. * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (126 commits) arm64: alternatives: Use vdso/bits.h instead of linux/bits.h arm64/kprobe: Optimize the performance of patching single-step slot arm64: defconfig: Add Coresight as module kselftest/arm64: Handle EINTR while reading data from children kselftest/arm64: Flag fp-stress as exiting when we begin finishing up kselftest/arm64: Don't repeat termination handler for fp-stress ARM64: reloc_test: add __init/__exit annotations to module init/exit funcs arm64/mm: fold check for KFENCE into can_set_direct_map() arm64: ftrace: fix module PLTs with mcount arm64: module: Remove unused plt_entry_is_initialized() arm64: module: Make plt_equals_entry() static arm64: fix the build with binutils 2.27 kselftest/arm64: Don't enable v8.5 for MTE selftest builds arm64: uaccess: simplify uaccess_mask_ptr() arm64: asm/perf_regs.h: Avoid C++-style comment in UAPI header kselftest/arm64: Fix typo in hwcap check arm64: mte: move register initialization to C arm64: mm: handle ARM64_KERNEL_USES_PMD_MAPS in vmemmap_populate() arm64: dma: Drop cache invalidation from arch_dma_prep_coherent() arm64/sve: Add Perf extensions documentation ...
2022-10-03	Merge tag 'kcfi-v6.1-rc1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull kcfi updates from Kees Cook: "This replaces the prior support for Clang's standard Control Flow Integrity (CFI) instrumentation, which has required a lot of special conditions (e.g. LTO) and work-arounds. The new implementation ("Kernel CFI") is specific to C, directly designed for the Linux kernel, and takes advantage of architectural features like x86's IBT. This series retains arm64 support and adds x86 support. GCC support is expected in the future[1], and additional "generic" architectural support is expected soon[2]. Summary: - treewide: Remove old CFI support details - arm64: Replace Clang CFI support with Clang KCFI support - x86: Introduce Clang KCFI support" Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107048 [1] Link: https://github.com/samitolvanen/llvm-project/commits/kcfi_generic [2] * tag 'kcfi-v6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (22 commits) x86: Add support for CONFIG_CFI_CLANG x86/purgatory: Disable CFI x86: Add types to indirectly called assembly functions x86/tools/relocs: Ignore __kcfi_typeid_ relocations kallsyms: Drop CONFIG_CFI_CLANG workarounds objtool: Disable CFI warnings objtool: Preserve special st_shndx indexes in elf_update_symbol treewide: Drop __cficanonical treewide: Drop WARN_ON_FUNCTION_MISMATCH treewide: Drop function_nocfi init: Drop __nocfi from __init arm64: Drop unneeded __nocfi attributes arm64: Add CFI error handling arm64: Add types to indirect called assembly functions psci: Fix the function type for psci_initcall_t lkdtm: Emit an indirect call for CFI tests cfi: Add type helper macros cfi: Switch to -fsanitize=kcfi cfi: Drop __CFI_ADDRESSABLE cfi: Remove CONFIG_CFI_CLANG_SHADOW ...
2022-10-03	kmsan: disable instrumentation of unsupported common kernel code	Alexander Potapenko
	EFI stub cannot be linked with KMSAN runtime, so we disable instrumentation for it. Instrumenting kcov, stackdepot or lockdep leads to infinite recursion caused by instrumentation hooks calling instrumented code again. Link: https://lkml.kernel.org/r/20220915150417.722975-13-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Reviewed-by: Marco Elver <elver@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <keescook@chromium.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-27	efi/arm64: libstub: avoid SetVirtualAddressMap() when possible	Ard Biesheuvel
	EFI's SetVirtualAddressMap() runtime service is a horrid hack that we'd like to avoid using, if possible. For 64-bit architectures such as arm64, the user and kernel mappings are entirely disjoint, and given that we use the user region for mapping the UEFI runtime regions when running under the OS, we don't rely on SetVirtualAddressMap() in the conventional way, i.e., to permit kernel mappings of the OS to coexist with kernel region mappings of the firmware regions. This means that, in principle, we should be able to avoid SetVirtualAddressMap() altogether, and simply use the 1:1 mapping that UEFI uses at boot time. (Note that omitting SetVirtualAddressMap() is explicitly permitted by the UEFI spec). However, there is a corner case on arm64, which, if configured for 3-level paging (or 2-level paging when using 64k pages), may not be able to cover the entire range of firmware mappings (which might contain both memory and MMIO peripheral mappings). So let's avoid SetVirtualAddressMap() on arm64, but only if the VA space is guaranteed to be of sufficient size. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-09-27	efi: zboot: create MemoryMapped() device path for the parent if needed	Ard Biesheuvel
	LoadImage() is supposed to install an instance of the protocol EFI_LOADED_IMAGE_DEVICE_PATH_PROTOCOL onto the loaded image's handle so that the program can figure out where it was loaded from. The reference implementation even does this (with a NULL protocol pointer) if the call to LoadImage() used the source buffer and size arguments, and passed NULL for the image device path. Hand rolled implementations of LoadImage may behave differently, though, and so it is better to tolerate situations where the protocol is missing. And actually, concatenating an Offset() node to a NULL device path (as we do currently) is not great either. So in cases where the protocol is absent, or when it points to NULL, construct a MemoryMapped() device node as the base node that describes the parent image's footprint in memory. Cc: Daan De Meyer <daandemeyer@fb.com> Cc: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-09-27	efi: libstub: fix up the last remaining open coded boot service call	Ard Biesheuvel
	We use a macro efi_bs_call() to call boot services, which is more concise, and on x86, it encapsulates the mixed mode handling. This code does not run in mixed mode, but let's switch to the macro for general tidiness. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-09-27	efi/libstub: measure EFI LoadOptions	Ilias Apalodimas
	The EFI TCG spec, in §10.2.6 "Measuring UEFI Variables and UEFI GPT Data", only reasons about the load options passed to a loaded image in the context of boot options booted directly from the BDS, which are measured into PCR #5 along with the rest of the Boot#### EFI variable. However, the UEFI spec mentions the following in the documentation of the LoadImage() boot service and the EFI_LOADED_IMAGE protocol: The caller may fill in the image’s "load options" data, or add additional protocol support to the handle before passing control to the newly loaded image by calling EFI_BOOT_SERVICES.StartImage(). The typical boot sequence for Linux EFI systems is to load GRUB via a boot option from the BDS, which [hopefully] calls LoadImage to load the kernel image, passing the kernel command line via the mechanism described above. This means that we cannot rely on the firmware implementing TCG measured boot to ensure that the kernel command line gets measured before the image is started, so the EFI stub will have to take care of this itself. Given that PCR #5 has an official use in the TCG measured boot spec, let's avoid it in this case. Instead, add a measurement in PCR #9 (which we already use for our initrd) and extend it with the LoadOptions measurements Co-developed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-09-27	efi/libstub: refactor the initrd measuring functions	Ilias Apalodimas
	Currently, from the efi-stub, we are only measuring the loaded initrd, using the TCG2 measured boot protocols. A following patch is introducing measurements of additional components, such as the kernel command line. On top of that, we will shortly have to support other types of measured boot that don't expose the TCG2 protocols. So let's prepare for that, by rejigging the efi_measure_initrd() routine into something that we should be able to reuse for measuring other assets, and which can be extended later to support other measured boot protocols. Co-developed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-09-27	Merge tag 'efi-loongarch-for-v6.1-2' into HEAD	Ard Biesheuvel
	Second shared stable tag between EFI and LoongArch trees This is necessary because the EFI libstub refactoring patches are mostly directed at enabling LoongArch to wire up generic EFI boot support without being forced to consume DT properties that conflict with information that EFI also provides, e.g., memory map and reservations, etc.
2022-09-27	efi/loongarch: libstub: remove dependency on flattened DT	Ard Biesheuvel
	LoongArch does not use FDT or DT natively [yet], and the only reason it currently uses it is so that it can reuse the existing EFI stub code. Overloading the DT with data passed between the EFI stub and the core kernel has been a source of problems: there is the overlap between information provided by EFI which DT can also provide (initrd base/size, command line, memory descriptions), requiring us to reason about which is which and what to prioritize. It has also resulted in ABI leaks, i.e., internal ABI being promoted to external ABI inadvertently because the bootloader can set the EFI stub's DT properties as well (e.g., "kaslr-seed"). This has become especially problematic with boot environments that want to pretend that EFI boot is being done (to access ACPI and SMBIOS tables, for instance) but have no ability to execute the EFI stub, and so the environment that the EFI stub creates is emulated [poorly, in some cases]. Another downside of treating DT like this is that the DT binary that the kernel receives is different from the one created by the firmware, which is undesirable in the context of secure and measured boot. Given that LoongArch support in Linux is brand new, we can avoid these pitfalls, and treat the DT strictly as a hardware description, and use a separate handover method between the EFI stub and the kernel. Now that initrd loading and passing the EFI memory map have been refactored into pure EFI routines that use EFI configuration tables, the only thing we need to pass directly is the kernel command line (even if we could pass this via a config table as well, it is used extremely early, so passing it directly is preferred in this case.) Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Huacai Chen <chenhuacai@loongson.cn>
2022-09-27	efi: libstub: install boot-time memory map as config table	Ard Biesheuvel
	Expose the EFI boot time memory map to the kernel via a configuration table. This is arch agnostic and enables future changes that remove the dependency on DT on architectures that don't otherwise rely on it. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-09-27	efi: libstub: remove DT dependency from generic stub	Ard Biesheuvel
	Refactor the generic EFI stub entry code so that all the dependencies on device tree are abstracted and hidden behind a generic efi_boot_kernel() routine that can also be implemented in other ways. This allows users of the generic stub to avoid using FDT for passing information to the core kernel. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-09-27	efi: libstub: unify initrd loading between architectures	Ard Biesheuvel
	Use a EFI configuration table to pass the initrd to the core kernel, instead of per-arch methods. This cleans up the code considerably, and should make it easier for architectures to get rid of their reliance on DT for doing EFI boot in the future. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-09-26	efi: libstub: remove pointless goto kludge	Ard Biesheuvel
	Remove some goto cruft that serves no purpose and obfuscates the code. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>