summaryrefslogtreecommitdiff
path: root/arch/arm64/kernel/vmlinux.lds.S
AgeCommit message (Collapse)Author
2020-11-17arm64: omit [_text, _stext) from permanent kernel mappingArd Biesheuvel
In a previous patch, we increased the size of the EFI PE/COFF header to 64 KB, which resulted in the _stext symbol to appear at a fixed offset of 64 KB into the image. Since 64 KB is also the largest page size we support, this completely removes the need to map the first 64 KB of the kernel image, given that it only contains the arm64 Image header and the EFI header, neither of which we ever access again after booting the kernel. More importantly, we should avoid an executable mapping of non-executable and not entirely predictable data, to deal with the unlikely event that we inadvertently emitted something that looks like an opcode that could be used as a gadget for speculative execution. So let's limit the kernel mapping of .text to the [_stext, _etext) region, which matches the view of generic code (such as kallsyms) when it reasons about the boundaries of the kernel's .text section. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201117124729.12642-2-ardb@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-10-28arm64: vmlinux.lds: account for spurious empty .igot.plt sectionsArd Biesheuvel
Now that we started making the linker warn about orphan sections (input sections that are not explicitly consumed by an output section), some configurations produce the following warning: aarch64-linux-gnu-ld: warning: orphan section `.igot.plt' from `arch/arm64/kernel/head.o' being placed in section `.igot.plt' It could be any file that triggers this - head.o is simply the first input file in the link - and the resulting .igot.plt section never actually appears in vmlinux as it turns out to be empty. So let's add .igot.plt to our collection of input sections to disregard unless they are empty. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Cc: Jessica Yu <jeyu@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Link: https://lore.kernel.org/r/20201028133332.5571-1-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2020-10-23Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "For x86, there is a new alternative and (in the future) more scalable implementation of extended page tables that does not need a reverse map from guest physical addresses to host physical addresses. For now it is disabled by default because it is still lacking a few of the existing MMU's bells and whistles. However it is a very solid piece of work and it is already available for people to hammer on it. Other updates: ARM: - New page table code for both hypervisor and guest stage-2 - Introduction of a new EL2-private host context - Allow EL2 to have its own private per-CPU variables - Support of PMU event filtering - Complete rework of the Spectre mitigation PPC: - Fix for running nested guests with in-kernel IRQ chip - Fix race condition causing occasional host hard lockup - Minor cleanups and bugfixes x86: - allow trapping unknown MSRs to userspace - allow userspace to force #GP on specific MSRs - INVPCID support on AMD - nested AMD cleanup, on demand allocation of nested SVM state - hide PV MSRs and hypercalls for features not enabled in CPUID - new test for MSR_IA32_TSC writes from host and guest - cleanups: MMU, CPUID, shared MSRs - LAPIC latency optimizations ad bugfixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (232 commits) kvm: x86/mmu: NX largepage recovery for TDP MMU kvm: x86/mmu: Don't clear write flooding count for direct roots kvm: x86/mmu: Support MMIO in the TDP MMU kvm: x86/mmu: Support write protection for nesting in tdp MMU kvm: x86/mmu: Support disabling dirty logging for the tdp MMU kvm: x86/mmu: Support dirty logging for the TDP MMU kvm: x86/mmu: Support changed pte notifier in tdp MMU kvm: x86/mmu: Add access tracking for tdp_mmu kvm: x86/mmu: Support invalidate range MMU notifier for TDP MMU kvm: x86/mmu: Allocate struct kvm_mmu_pages for all pages in TDP MMU kvm: x86/mmu: Add TDP MMU PF handler kvm: x86/mmu: Remove disallowed_hugepage_adjust shadow_walk_iterator arg kvm: x86/mmu: Support zapping SPTEs in the TDP MMU KVM: Cache as_id in kvm_memory_slot kvm: x86/mmu: Add functions to handle changed TDP SPTEs kvm: x86/mmu: Allocate and free TDP MMU roots kvm: x86/mmu: Init / Uninit the TDP MMU kvm: x86/mmu: Introduce tdp_iter KVM: mmu: extract spte.h and spte.c KVM: mmu: Separate updating a PTE from kvm_set_pte_rmapp ...
2020-10-12Merge tag 'core-build-2020-10-12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull orphan section checking from Ingo Molnar: "Orphan link sections were a long-standing source of obscure bugs, because the heuristics that various linkers & compilers use to handle them (include these bits into the output image vs discarding them silently) are both highly idiosyncratic and also version dependent. Instead of this historically problematic mess, this tree by Kees Cook (et al) adds build time asserts and build time warnings if there's any orphan section in the kernel or if a section is not sized as expected. And because we relied on so many silent assumptions in this area, fix a metric ton of dependencies and some outright bugs related to this, before we can finally enable the checks on the x86, ARM and ARM64 platforms" * tag 'core-build-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits) x86/boot/compressed: Warn on orphan section placement x86/build: Warn on orphan section placement arm/boot: Warn on orphan section placement arm/build: Warn on orphan section placement arm64/build: Warn on orphan section placement x86/boot/compressed: Add missing debugging sections to output x86/boot/compressed: Remove, discard, or assert for unwanted sections x86/boot/compressed: Reorganize zero-size section asserts x86/build: Add asserts for unwanted sections x86/build: Enforce an empty .got.plt section x86/asm: Avoid generating unused kprobe sections arm/boot: Handle all sections explicitly arm/build: Assert for unwanted sections arm/build: Add missing sections arm/build: Explicitly keep .ARM.attributes sections arm/build: Refactor linker script headers arm64/build: Assert for unwanted sections arm64/build: Add missing DWARF sections arm64/build: Use common DISCARDS in linker script arm64/build: Remove .eh_frame* sections due to unwind tables ...
2020-09-30kvm: arm64: Set up hyp percpu data for nVHEDavid Brazdil
Add hyp percpu section to linker script and rename the corresponding ELF sections of hyp/nvhe object files. This moves all nVHE-specific percpu variables to the new hyp percpu section. Allocate sufficient amount of memory for all percpu hyp regions at global KVM init time and create corresponding hyp mappings. The base addresses of hyp percpu regions are kept in a dynamically allocated array in the kernel. Add NULL checks in PMU event-reset code as it may run before KVM memory is initialized. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200922204910.7265-10-dbrazdil@google.com
2020-09-30kvm: arm64: Only define __kvm_ex_table for CONFIG_KVMDavid Brazdil
Minor cleanup that only creates __kvm_ex_table ELF section and related symbols if CONFIG_KVM is enabled. Also useful as more hyp-specific sections will be added. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200922204910.7265-4-dbrazdil@google.com
2020-09-30kvm: arm64: Move nVHE hyp namespace macros to hyp_image.hDavid Brazdil
Minor cleanup to move all macros related to prefixing nVHE hyp section and symbol names into one place: hyp_image.h. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200922204910.7265-3-dbrazdil@google.com
2020-09-07arm64: get rid of TEXT_OFFSETArd Biesheuvel
TEXT_OFFSET serves no purpose, and for this reason, it was redefined as 0x0 in the v5.8 timeframe. Since this does not appear to have caused any issues that require us to revisit that decision, let's get rid of the macro entirely, along with any references to it. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20200825135440.11288-1-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2020-09-01arm64/build: Assert for unwanted sectionsKees Cook
In preparation for warning on orphan sections, discard unwanted non-zero-sized generated sections, and enforce other expected-to-be-zero-sized sections (since discarding them might hide problems with them suddenly gaining unexpected entries). Suggested-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200821194310.3089815-14-keescook@chromium.org
2020-09-01arm64/build: Add missing DWARF sectionsKees Cook
Explicitly include DWARF sections when they're present in the build. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200821194310.3089815-13-keescook@chromium.org
2020-09-01arm64/build: Use common DISCARDS in linker scriptKees Cook
Use the common DISCARDS rule for the linker script in an effort to regularize the linker script to prepare for warning on orphaned sections. Additionally clean up left-over no-op macros. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200821194310.3089815-12-keescook@chromium.org
2020-09-01arm64/build: Remove .eh_frame* sections due to unwind tablesKees Cook
Avoid .eh_frame* section generation by making sure both CFLAGS and AFLAGS contain -fno-asychronous-unwind-tables and -fno-unwind-tables. With all sources of .eh_frame now removed from the build, drop this DISCARD so we can be alerted in the future if it returns unexpectedly once orphan section warnings have been enabled. Suggested-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200821194310.3089815-11-keescook@chromium.org
2020-09-01vmlinux.lds.h: Split ELF_DETAILS from STABS_DEBUGKees Cook
The .comment section doesn't belong in STABS_DEBUG. Split it out into a new macro named ELF_DETAILS. This will gain other non-debug sections that need to be accounted for when linking with --orphan-handling=warn. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: linux-arch@vger.kernel.org Link: https://lore.kernel.org/r/20200821194310.3089815-5-keescook@chromium.org
2020-08-28KVM: arm64: Add kvm_extable for vaxorcism codeJames Morse
KVM has a one instruction window where it will allow an SError exception to be consumed by the hypervisor without treating it as a hypervisor bug. This is used to consume asynchronous external abort that were caused by the guest. As we are about to add another location that survives unexpected exceptions, generalise this code to make it behave like the host's extable. KVM's version has to be mapped to EL2 to be accessible on nVHE systems. The SError vaxorcism code is a one instruction window, so has two entries in the extable. Because the KVM code is copied for VHE and nVHE, we end up with four entries, half of which correspond with code that isn't mapped. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-08-03Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 and cross-arch updates from Catalin Marinas: "Here's a slightly wider-spread set of updates for 5.9. Going outside the usual arch/arm64/ area is the removal of read_barrier_depends() series from Will and the MSI/IOMMU ID translation series from Lorenzo. The notable arm64 updates include ARMv8.4 TLBI range operations and translation level hint, time namespace support, and perf. Summary: - Removal of the tremendously unpopular read_barrier_depends() barrier, which is a NOP on all architectures apart from Alpha, in favour of allowing architectures to override READ_ONCE() and do whatever dance they need to do to ensure address dependencies provide LOAD -> LOAD/STORE ordering. This work also offers a potential solution if compilers are shown to convert LOAD -> LOAD address dependencies into control dependencies (e.g. under LTO), as weakly ordered architectures will effectively be able to upgrade READ_ONCE() to smp_load_acquire(). The latter case is not used yet, but will be discussed further at LPC. - Make the MSI/IOMMU input/output ID translation PCI agnostic, augment the MSI/IOMMU ACPI/OF ID mapping APIs to accept an input ID bus-specific parameter and apply the resulting changes to the device ID space provided by the Freescale FSL bus. - arm64 support for TLBI range operations and translation table level hints (part of the ARMv8.4 architecture version). - Time namespace support for arm64. - Export the virtual and physical address sizes in vmcoreinfo for makedumpfile and crash utilities. - CPU feature handling cleanups and checks for programmer errors (overlapping bit-fields). - ACPI updates for arm64: disallow AML accesses to EFI code regions and kernel memory. - perf updates for arm64. - Miscellaneous fixes and cleanups, most notably PLT counting optimisation for module loading, recordmcount fix to ignore relocations other than R_AARCH64_CALL26, CMA areas reserved for gigantic pages on 16K and 64K configurations. - Trivial typos, duplicate words" Link: http://lkml.kernel.org/r/20200710165203.31284-1-will@kernel.org Link: http://lkml.kernel.org/r/20200619082013.13661-1-lorenzo.pieralisi@arm.com * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (82 commits) arm64: use IRQ_STACK_SIZE instead of THREAD_SIZE for irq stack arm64/mm: save memory access in check_and_switch_context() fast switch path arm64: sigcontext.h: delete duplicated word arm64: ptrace.h: delete duplicated word arm64: pgtable-hwdef.h: delete duplicated words bus: fsl-mc: Add ACPI support for fsl-mc bus/fsl-mc: Refactor the MSI domain creation in the DPRC driver of/irq: Make of_msi_map_rid() PCI bus agnostic of/irq: make of_msi_map_get_device_domain() bus agnostic dt-bindings: arm: fsl: Add msi-map device-tree binding for fsl-mc bus of/device: Add input id to of_dma_configure() of/iommu: Make of_map_rid() PCI agnostic ACPI/IORT: Add an input ID to acpi_dma_configure() ACPI/IORT: Remove useless PCI bus walk ACPI/IORT: Make iort_msi_map_rid() PCI agnostic ACPI/IORT: Make iort_get_device_domain IRQ domain agnostic ACPI/IORT: Make iort_match_node_callback walk the ACPI namespace for NC arm64: enable time namespace support arm64/vdso: Restrict splitting VVAR VMA arm64/vdso: Handle faults on timens page ...
2020-07-21arm64: Reduce the number of header files pulled into vmlinux.lds.SWill Deacon
Although vmlinux.lds.S smells like an assembly file and is compiled with __ASSEMBLY__ defined, it's actually just fed to the preprocessor to create our linker script. This means that any assembly macros defined by headers that it includes will result in a helpful link error: | aarch64-linux-gnu-ld:./arch/arm64/kernel/vmlinux.lds:1: syntax error In preparation for an arm64-private asm/rwonce.h implementation, which will end up pulling assembly macros into linux/compiler.h, reduce the number of headers we include directly and transitively in vmlinux.lds.S Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-07-02arm64/alternatives: use subsections for replacement sequencesArd Biesheuvel
When building very large kernels, the logic that emits replacement sequences for alternatives fails when relative branches are present in the code that is emitted into the .altinstr_replacement section and patched in at the original site and fixed up. The reason is that the linker will insert veneers if relative branches go out of range, and due to the relative distance of the .altinstr_replacement from the .text section where its branch targets usually live, veneers may be emitted at the end of the .altinstr_replacement section, with the relative branches in the sequence pointed at the veneers instead of the actual target. The alternatives patching logic will attempt to fix up the branch to point to its original target, which will be the veneer in this case, but given that the patch site is likely to be far away as well, it will be out of range and so patching will fail. There are other cases where these veneers are problematic, e.g., when the target of the branch is in .text while the patch site is in .init.text, in which case putting the replacement sequence inside .text may not help either. So let's use subsections to emit the replacement code as closely as possible to the patch site, to ensure that veneers are only likely to be emitted if they are required at the patch site as well, in which case they will be in range for the replacement sequence both before and after it is transported to the patch site. This will prevent alternative sequences in non-init code from being released from memory after boot, but this is tolerable given that the entire section is only 512 KB on an allyesconfig build (which weighs in at 500+ MB for the entire Image). Also, note that modules today carry the replacement sequences in non-init sections as well, and any of those that target init code will be emitted into init sections after this change. This fixes an early crash when booting an allyesconfig kernel on a system where any of the alternatives sequences containing relative branches are activated at boot (e.g., ARM64_HAS_PAN on TX2) Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Andre Przywara <andre.przywara@arm.com> Cc: Dave P Martin <dave.martin@arm.com> Link: https://lore.kernel.org/r/20200630081921.13443-1-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2020-06-09mm: reorder includes after introduction of linux/pgtable.hMike Rapoport
The replacement of <asm/pgrable.h> with <linux/pgtable.h> made the include of the latter in the middle of asm includes. Fix this up with the aid of the below script and manual adjustments here and there. import sys import re if len(sys.argv) is not 3: print "USAGE: %s <file> <header>" % (sys.argv[0]) sys.exit(1) hdr_to_move="#include <linux/%s>" % sys.argv[2] moved = False in_hdrs = False with open(sys.argv[1], "r") as f: lines = f.readlines() for _line in lines: line = _line.rstrip(' ') if line == hdr_to_move: continue if line.startswith("#include <linux/"): in_hdrs = True elif not moved and in_hdrs: moved = True print hdr_to_move print line Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-4-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09mm: introduce include/linux/pgtable.hMike Rapoport
The include/linux/pgtable.h is going to be the home of generic page table manipulation functions. Start with moving asm-generic/pgtable.h to include/linux/pgtable.h and make the latter include asm/pgtable.h. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-3-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-28efi/libstub/arm64: align PE/COFF sections to segment alignmentArd Biesheuvel
The arm64 kernel's segment alignment is fixed at 64 KB for any page size, and relocatable kernels are able to fix up any misalignment of the kernel image with respect to the 2 MB section alignment that is mandated by the arm64 boot protocol. Let's increase the PE/COFF section alignment to the same value, so that kernels loaded by the UEFI PE/COFF loader are guaranteed to end up at an address that doesn't require any reallocation to be done if the kernel is relocatable. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20200413155521.24698-6-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2020-04-28arm64/kernel: Fix range on invalidating dcache for boot page tablesGavin Shan
Prior to commit 8eb7e28d4c642c31 ("arm64/mm: move runtime pgds to rodata"), idmap_pgd_dir, tramp_pg_dir, reserved_ttbr0, swapper_pg_dir, and init_pg_dir were contiguous at the end of the kernel image. The maintenance at the end of __create_page_tables assumed these were contiguous, and affected everything from the start of idmap_pg_dir to the end of init_pg_dir. That commit moved all but init_pg_dir into the .rodata section, with other data placed between idmap_pg_dir and init_pg_dir, but did not update the maintenance. Hence the maintenance is performed on much more data than necessary (but as the bootloader previously made this clean to the PoC there is no functional problem). As we only alter idmap_pg_dir, and init_pg_dir, we only need to perform maintenance for these. As the other dirs are in .rodata, the bootloader will have initialised them as expected and cleaned them to the PoC. The kernel will initialize them as necessary after enabling the MMU. This patch reworks the maintenance to only cover the idmap_pg_dir and init_pg_dir to avoid this unnecessary work. Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20200427235700.112220-1-gshan@redhat.com Signed-off-by: Will Deacon <will@kernel.org>
2020-04-28arm64/kernel: vmlinux.lds: drop redundant discard/keep macrosArd Biesheuvel
ARM_EXIT_KEEP and ARM_EXIT_DISCARD are always defined in the same way, so we don't really need them in the first place. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20200416132730.25290-1-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2020-04-28arm64: rename stext to primary_entryArd Biesheuvel
For historical reasons, the primary entry routine living somewhere in the inittext section is called stext(), which is confusing, given that there is also a section marker called _stext which lives at a fixed offset in the image (either 64 or 4096 bytes, depending on whether CONFIG_EFI is enabled) Let's rename stext to primary_entry(), which is a better description and reflects the secondary_entry() routine that already exists for SMP boot. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20200326171423.3080-1-ardb@kernel.org Reviwed-by: Mark Brown <broonie@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
2019-12-06Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - ZONE_DMA32 initialisation fix when memblocks fall entirely within the first GB (used by ZONE_DMA in 5.5 for Raspberry Pi 4). - Couple of ftrace fixes following the FTRACE_WITH_REGS patchset. - access_ok() fix for the Tagged Address ABI when called from from a kernel thread (asynchronous I/O): the kthread does not have the TIF flags of the mm owner, so untag the user address unconditionally. - KVM compute_layout() called before the alternatives code patching. - Minor clean-ups. * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: entry: refine comment of stack overflow check arm64: ftrace: fix ifdeffery arm64: KVM: Invoke compute_layout() before alternatives are applied arm64: Validate tagged addresses in access_ok() called from kernel threads arm64: mm: Fix column alignment for UXN in kernel_page_tables arm64: insn: consistently handle exit text arm64: mm: Fix initialisation of DMA zones on non-NUMA systems
2019-12-04arm64: insn: consistently handle exit textMark Rutland
A kernel built with KASAN && FTRACE_WITH_REGS && !MODULES, produces a boot-time splat in the bowels of ftrace: | [ 0.000000] ftrace: allocating 32281 entries in 127 pages | [ 0.000000] ------------[ cut here ]------------ | [ 0.000000] WARNING: CPU: 0 PID: 0 at kernel/trace/ftrace.c:2019 ftrace_bug+0x27c/0x328 | [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.4.0-rc3-00008-g7f08ae53a7e3 #13 | [ 0.000000] Hardware name: linux,dummy-virt (DT) | [ 0.000000] pstate: 60000085 (nZCv daIf -PAN -UAO) | [ 0.000000] pc : ftrace_bug+0x27c/0x328 | [ 0.000000] lr : ftrace_init+0x640/0x6cc | [ 0.000000] sp : ffffa000120e7e00 | [ 0.000000] x29: ffffa000120e7e00 x28: ffff00006ac01b10 | [ 0.000000] x27: ffff00006ac898c0 x26: dfffa00000000000 | [ 0.000000] x25: ffffa000120ef290 x24: ffffa0001216df40 | [ 0.000000] x23: 000000000000018d x22: ffffa0001244c700 | [ 0.000000] x21: ffffa00011bf393c x20: ffff00006ac898c0 | [ 0.000000] x19: 00000000ffffffff x18: 0000000000001584 | [ 0.000000] x17: 0000000000001540 x16: 0000000000000007 | [ 0.000000] x15: 0000000000000000 x14: ffffa00010432770 | [ 0.000000] x13: ffff940002483519 x12: 1ffff40002483518 | [ 0.000000] x11: 1ffff40002483518 x10: ffff940002483518 | [ 0.000000] x9 : dfffa00000000000 x8 : 0000000000000001 | [ 0.000000] x7 : ffff940002483519 x6 : ffffa0001241a8c0 | [ 0.000000] x5 : ffff940002483519 x4 : ffff940002483519 | [ 0.000000] x3 : ffffa00011780870 x2 : 0000000000000001 | [ 0.000000] x1 : 1fffe0000d591318 x0 : 0000000000000000 | [ 0.000000] Call trace: | [ 0.000000] ftrace_bug+0x27c/0x328 | [ 0.000000] ftrace_init+0x640/0x6cc | [ 0.000000] start_kernel+0x27c/0x654 | [ 0.000000] random: get_random_bytes called from print_oops_end_marker+0x30/0x60 with crng_init=0 | [ 0.000000] ---[ end trace 0000000000000000 ]--- | [ 0.000000] ftrace faulted on writing | [ 0.000000] [<ffffa00011bf393c>] _GLOBAL__sub_D_65535_0___tracepoint_initcall_level+0x4/0x28 | [ 0.000000] Initializing ftrace call sites | [ 0.000000] ftrace record flags: 0 | [ 0.000000] (0) | [ 0.000000] expected tramp: ffffa000100b3344 This is due to an unfortunate combination of several factors. Building with KASAN results in the compiler generating anonymous functions to register/unregister global variables against the shadow memory. These functions are placed in .text.startup/.text.exit, and given mangled names like _GLOBAL__sub_{I,D}_65535_0_$OTHER_SYMBOL. The kernel linker script places these in .init.text and .exit.text respectively, which are both discarded at runtime as part of initmem. Building with FTRACE_WITH_REGS uses -fpatchable-function-entry=2, which also instruments KASAN's anonymous functions. When these are discarded with the rest of initmem, ftrace removes dangling references to these call sites. Building without MODULES implicitly disables STRICT_MODULE_RWX, and causes arm64's patch_map() function to treat any !core_kernel_text() symbol as something that can be modified in-place. As core_kernel_text() is only true for .text and .init.text, with the latter depending on system_state < SYSTEM_RUNNING, we'll treat .exit.text as something that can be patched in-place. However, .exit.text is mapped read-only. Hence in this configuration the ftrace init code blows up while trying to patch one of the functions generated by KASAN. We could try to filter out the call sites in .exit.text rather than initializing them, but this would be inconsistent with how we handle .init.text, and requires hooking into core bits of ftrace. The behaviour of patch_map() is also inconsistent today, so instead let's clean that up and have it consistently handle .exit.text. This patch teaches patch_map() to handle .exit.text at init time, preventing the boot-time splat above. The flow of patch_map() is reworked to make the logic clearer and minimize redundant conditionality. Fixes: 3b23e4991fb66f6d ("arm64: implement ftrace with regs") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Amit Daniel Kachhap <amit.kachhap@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Torsten Duwe <duwe@suse.de> Cc: Will Deacon <will@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-11-26Merge branch 'x86-asm-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm updates from Ingo Molnar: "The main changes in this cycle were: - Cross-arch changes to move the linker sections for NOTES and EXCEPTION_TABLE into the RO_DATA area, where they belong on most architectures. (Kees Cook) - Switch the x86 linker fill byte from x90 (NOP) to 0xcc (INT3), to trap jumps into the middle of those padding areas instead of sliding execution. (Kees Cook) - A thorough cleanup of symbol definitions within x86 assembler code. The rather randomly named macros got streamlined around a (hopefully) straightforward naming scheme: SYM_START(name, linkage, align...) SYM_END(name, sym_type) SYM_FUNC_START(name) SYM_FUNC_END(name) SYM_CODE_START(name) SYM_CODE_END(name) SYM_DATA_START(name) SYM_DATA_END(name) etc - with about three times of these basic primitives with some label, local symbol or attribute variant, expressed via postfixes. No change in functionality intended. (Jiri Slaby) - Misc other changes, cleanups and smaller fixes" * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (67 commits) x86/entry/64: Remove pointless jump in paranoid_exit x86/entry/32: Remove unused resume_userspace label x86/build/vdso: Remove meaningless CFLAGS_REMOVE_*.o m68k: Convert missed RODATA to RO_DATA x86/vmlinux: Use INT3 instead of NOP for linker fill bytes x86/mm: Report actual image regions in /proc/iomem x86/mm: Report which part of kernel image is freed x86/mm: Remove redundant address-of operators on addresses xtensa: Move EXCEPTION_TABLE to RO_DATA segment powerpc: Move EXCEPTION_TABLE to RO_DATA segment parisc: Move EXCEPTION_TABLE to RO_DATA segment microblaze: Move EXCEPTION_TABLE to RO_DATA segment ia64: Move EXCEPTION_TABLE to RO_DATA segment h8300: Move EXCEPTION_TABLE to RO_DATA segment c6x: Move EXCEPTION_TABLE to RO_DATA segment arm64: Move EXCEPTION_TABLE to RO_DATA segment alpha: Move EXCEPTION_TABLE to RO_DATA segment x86/vmlinux: Move EXCEPTION_TABLE to RO_DATA segment x86/vmlinux: Actually use _etext for the end of the text segment vmlinux.lds.h: Allow EXCEPTION_TABLE to live in RO_DATA ...
2019-11-04arm64: Move EXCEPTION_TABLE to RO_DATA segmentKees Cook
Since the EXCEPTION_TABLE is read-only, collapse it into RO_DATA. Also removes the redundant ALIGN, which is already present at the end of the RO_DATA macro. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Will Deacon <will@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux-alpha@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: linux-ia64@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-s390@vger.kernel.org Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Peter Collingbourne <pcc@google.com> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Cc: x86-ml <x86@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: https://lkml.kernel.org/r/20191029211351.13243-19-keescook@chromium.org
2019-11-04vmlinux.lds.h: Replace RW_DATA_SECTION with RW_DATAKees Cook
Rename RW_DATA_SECTION to RW_DATA. (Calling this a "section" is a lie, since it's multiple sections and section flags cannot be applied to the macro.) Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # s390 Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: linux-alpha@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: linux-ia64@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Cc: Will Deacon <will@kernel.org> Cc: x86-ml <x86@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: https://lkml.kernel.org/r/20191029211351.13243-14-keescook@chromium.org
2019-11-04vmlinux.lds.h: Move NOTES into RO_DATAKees Cook
The .notes section should be non-executable read-only data. As such, move it to the RO_DATA macro instead of being per-architecture defined. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # s390 Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: linux-alpha@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: linux-ia64@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Cc: Will Deacon <will@kernel.org> Cc: x86-ml <x86@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: https://lkml.kernel.org/r/20191029211351.13243-11-keescook@chromium.org
2019-10-28arm64: remove __exception annotationsJames Morse
Since commit 732674980139 ("arm64: unwind: reference pt_regs via embedded stack frame") arm64 has not used the __exception annotation to dump the pt_regs during stack tracing. in_exception_text() has no callers. This annotation is only used to blacklist kprobes, it means the same as __kprobes. Section annotations like this require the functions to be grouped together between the start/end markers, and placed according to the linker script. For kprobes we also have NOKPROBE_SYMBOL() which logs the symbol address in a section that kprobes parses and blacklists at boot. Using NOKPROBE_SYMBOL() instead lets kprobes publish the list of blacklisted symbols, and saves us from having an arm64 specific spelling of __kprobes. do_debug_exception() already has a NOKPROBE_SYMBOL() annotation. Signed-off-by: James Morse <james.morse@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-08-14arm64/efi: Move variable assignments after SECTIONSKees Cook
It seems that LLVM's linker does not correctly handle variable assignments involving section positions that are updated during the SECTIONS parsing. Commit aa69fb62bea1 ("arm64/efi: Mark __efistub_stext_offset as an absolute symbol explicitly") ran into this too, but found a different workaround. However, this was not enough, as other variables were also miscalculated which manifested as boot failures under UEFI where __efistub__end was not taking the correct _end value (they should be the same): $ ld.lld -EL -maarch64elf --no-undefined -X -shared \ -Bsymbolic -z notext -z norelro --no-apply-dynamic-relocs \ -o vmlinux.lld -T poc.lds --whole-archive vmlinux.o && \ readelf -Ws vmlinux.lld | egrep '\b(__efistub_|)_end\b' 368272: ffff000002218000 0 NOTYPE LOCAL HIDDEN 38 __efistub__end 368322: ffff000012318000 0 NOTYPE GLOBAL DEFAULT 38 _end $ aarch64-linux-gnu-ld.bfd -EL -maarch64elf --no-undefined -X -shared \ -Bsymbolic -z notext -z norelro --no-apply-dynamic-relocs \ -o vmlinux.bfd -T poc.lds --whole-archive vmlinux.o && \ readelf -Ws vmlinux.bfd | egrep '\b(__efistub_|)_end\b' 338124: ffff000012318000 0 NOTYPE LOCAL DEFAULT ABS __efistub__end 383812: ffff000012318000 0 NOTYPE GLOBAL DEFAULT 15325 _end To work around this, all of the __efistub_-prefixed variable assignments need to be moved after the linker script's SECTIONS entry. As it turns out, this also solves the problem fixed in commit aa69fb62bea1, so those changes are reverted here. Link: https://github.com/ClangBuiltLinux/linux/issues/634 Link: https://bugs.llvm.org/show_bug.cgi?id=42990 Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Will Deacon <will@kernel.org>
2019-08-05arm64: Add support for relocating the kernel with RELR relocationsPeter Collingbourne
RELR is a relocation packing format for relative relocations. The format is described in a generic-abi proposal: https://groups.google.com/d/topic/generic-abi/bX460iggiKg/discussion The LLD linker can be instructed to pack relocations in the RELR format by passing the flag --pack-dyn-relocs=relr. This patch adds a new config option, CONFIG_RELR. Enabling this option instructs the linker to pack vmlinux's relative relocations in the RELR format, and causes the kernel to apply the relocations at startup along with the RELA relocations. RELA relocations still need to be applied because the linker will emit RELA relative relocations if they are unrepresentable in the RELR format (i.e. address not a multiple of 2). Enabling CONFIG_RELR reduces the size of a defconfig kernel image with CONFIG_RANDOMIZE_BASE by 3.5MB/16% uncompressed, or 550KB/5% compressed (lz4). Signed-off-by: Peter Collingbourne <pcc@google.com> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Will Deacon <will@kernel.org>
2018-12-04arm64: relocatable: fix inconsistencies in linker script and optionsArd Biesheuvel
readelf complains about the section layout of vmlinux when building with CONFIG_RELOCATABLE=y (for KASLR): readelf: Warning: [21]: Link field (0) should index a symtab section. readelf: Warning: [21]: Info field (0) should index a relocatable section. Also, it seems that our use of '-pie -shared' is contradictory, and thus ambiguous. In general, the way KASLR is wired up at the moment is highly tailored to how ld.bfd happens to implement (and conflate) PIE executables and shared libraries, so given the current effort to support other toolchains, let's fix some of these issues as well. - Drop the -pie linker argument and just leave -shared. In ld.bfd, the differences between them are unclear (except for the ELF type of the produced image [0]) but lld chokes on seeing both at the same time. - Rename the .rela output section to .rela.dyn, as is customary for shared libraries and PIE executables, so that it is not misidentified by readelf as a static relocation section (producing the warnings above). - Pass the -z notext and -z norelro options to explicitly instruct the linker to permit text relocations, and to omit the RELRO program header (which requires a certain section layout that we don't adhere to in the kernel). These are the defaults for current versions of ld.bfd. - Discard .eh_frame and .gnu.hash sections to avoid them from being emitted between .head.text and .text, screwing up the section layout. These changes only affect the ELF image, and produce the same binary image. [0] b9dce7f1ba01 ("arm64: kernel: force ET_DYN ELF type for ...") Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Smith <peter.smith@linaro.org> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-10-24Merge branch 'next-general' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "In this patchset, there are a couple of minor updates, as well as some reworking of the LSM initialization code from Kees Cook (these prepare the way for ordered stackable LSMs, but are a valuable cleanup on their own)" * 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: LSM: Don't ignore initialization failures LSM: Provide init debugging infrastructure LSM: Record LSM name in struct lsm_info LSM: Convert security_initcall() into DEFINE_LSM() vmlinux.lds.h: Move LSM_TABLE into INIT_DATA LSM: Convert from initcall to struct lsm_info LSM: Remove initcall tracing LSM: Rename .security_initcall section to .lsm_info vmlinux.lds.h: Avoid copy/paste of security_init section LSM: Correctly announce start of LSM initialization security: fix LSM description location keys: Fix the use of the C++ keyword "private" in uapi/linux/keyctl.h seccomp: remove unnecessary unlikely() security: tomoyo: Fix obsolete function security/capabilities: remove check for -EINVAL
2018-10-10vmlinux.lds.h: Move LSM_TABLE into INIT_DATAKees Cook
Since the struct lsm_info table is not an initcall, we can just move it into INIT_DATA like all the other tables. Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Casey Schaufler <casey@schaufler-ca.com> Reviewed-by: John Johansen <john.johansen@canonical.com> Reviewed-by: James Morris <james.morris@microsoft.com> Signed-off-by: James Morris <james.morris@microsoft.com>
2018-09-25arm64/mm: move runtime pgds to rodataJun Yao
Now that deliberate writes to swapper_pg_dir are made via the fixmap, we can defend against errant writes by moving it into the rodata section. Since tramp_pg_dir and reserved_ttbr0 must be at a fixed offset from swapper_pg_dir, and are not modified at runtime, these are also moved into the rodata section. Likewise, idmap_pg_dir is not modified at runtime, and is moved into rodata. Signed-off-by: Jun Yao <yaojun8558363@gmail.com> Reviewed-by: James Morse <james.morse@arm.com> [Mark: simplify linker script, commit message] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-09-25arm64/mm: Separate boot-time page tables from swapper_pg_dirJun Yao
Since the address of swapper_pg_dir is fixed for a given kernel image, it is an attractive target for manipulation via an arbitrary write. To mitigate this we'd like to make it read-only by moving it into the rodata section. We require that swapper_pg_dir is at a fixed offset from tramp_pg_dir and reserved_ttbr0, so these will also need to move into rodata. However, swapper_pg_dir is allocated along with some transient page tables used for boot which we do not want to move into rodata. As a step towards this, this patch separates the boot-time page tables into a new init_pg_dir, and reduces swapper_pg_dir to the single page it needs to be. This allows us to retain the relationship between swapper_pg_dir, tramp_pg_dir, and swapper_pg_dir, while cleanly separating these from the boot-time page tables. The init_pg_dir holds all of the pgd/pud/pmd/pte levels needed during boot, and all of these levels will be freed when we switch to the swapper_pg_dir, which is initialized by the existing code in paging_init(). Since we start off on the init_pg_dir, we no longer need to allocate a transient page table in paging_init() in order to ensure that swapper_pg_dir isn't live while we initialize it. There should be no functional change as a result of this patch. Signed-off-by: Jun Yao <yaojun8558363@gmail.com> Reviewed-by: James Morse <james.morse@arm.com> [Mark: place init_pg_dir after BSS, fold mm changes, commit message] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-05-15arm64: remove no-op macro VMLINUX_SYMBOL()Masahiro Yamada
VMLINUX_SYMBOL() is no-op unless CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX is defined. It has ever been selected only by BLACKFIN and METAG. VMLINUX_SYMBOL() is unneeded for ARM64-specific code. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-01-14arm64: Extend early page table code to allow for larger kernelsSteve Capper
Currently the early assembler page table code assumes that precisely 1xpgd, 1xpud, 1xpmd are sufficient to represent the early kernel text mappings. Unfortunately this is rarely the case when running with a 16KB granule, and we also run into limits with 4KB granule when building much larger kernels. This patch re-writes the early page table logic to compute indices of mappings for each level of page table, and if multiple indices are required, the next-level page table is scaled up accordingly. Also the required size of the swapper_pg_dir is computed at link time to cover the mapping [KIMAGE_ADDR + VOFFSET, _end]. When KASLR is enabled, an extra page is set aside for each level that may require extra entries at runtime. Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Steve Capper <steve.capper@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-01-14arm64: entry: Move the trampoline to be before PANSteve Capper
The trampoline page tables are positioned after the early page tables in the kernel linker script. As we are about to change the early page table logic to resolve the swapper size at link time as opposed to compile time, the SWAPPER_DIR_SIZE variable (currently used to locate the trampline) will be rendered unsuitable for low level assembler. This patch solves this issue by moving the trampoline before the PAN page tables. The offset to the trampoline from ttbr1 can then be expressed by: PAGE_SIZE + RESERVED_TTBR0_SIZE, which is available to the entry assembler. Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Steve Capper <steve.capper@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-01-14arm64: Re-order reserved_ttbr0 in linker scriptSteve Capper
Currently one resolves the location of the reserved_ttbr0 for PAN by taking a positive offset from swapper_pg_dir. In a future patch we wish to extend the swapper s.t. its size is determined at link time rather than comile time, rendering SWAPPER_DIR_SIZE unsuitable for such a low level calculation. In this patch we re-arrange the order of the linker script s.t. instead one computes reserved_ttbr0 by subtracting RESERVED_TTBR0_SIZE from swapper_pg_dir. Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Steve Capper <steve.capper@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-12-11arm64: kaslr: Put kernel vectors address in separate data pageWill Deacon
The literal pool entry for identifying the vectors base is the only piece of information in the trampoline page that identifies the true location of the kernel. This patch moves it into a page-aligned region of the .rodata section and maps this adjacent to the trampoline text via an additional fixmap entry, which protects against any accidental leakage of the trampoline contents. Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Laura Abbott <labbott@redhat.com> Tested-by: Shanker Donthineni <shankerd@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-12-11arm64: entry: Add exception trampoline page for exceptions from EL0Will Deacon
To allow unmapping of the kernel whilst running at EL0, we need to point the exception vectors at an entry trampoline that can map/unmap the kernel on entry/exit respectively. This patch adds the trampoline page, although it is not yet plugged into the vector table and is therefore unused. Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Laura Abbott <labbott@redhat.com> Tested-by: Shanker Donthineni <shankerd@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-15arm64: add basic VMAP_STACK supportMark Rutland
This patch enables arm64 to be built with vmap'd task and IRQ stacks. As vmap'd stacks are mapped at page granularity, stacks must be a multiple of PAGE_SIZE. This means that a 64K page kernel must use stacks of at least 64K in size. To minimize the increase in Image size, IRQ stacks are dynamically allocated at boot time, rather than embedding the boot CPU's IRQ stack in the kernel image. This patch was co-authored by Ard Biesheuvel and Mark Rutland. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Tested-by: Laura Abbott <labbott@redhat.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com>
2017-08-15arm64: move SEGMENT_ALIGN to <asm/memory.h>Mark Rutland
Currently we define SEGMENT_ALIGN directly in our vmlinux.lds.S. This is unfortunate, as the EFI stub currently open-codes the same number, and in future we'll want to fiddle with this. This patch moves the definition to our <asm/memory.h>, where it can be used by both vmlinux.lds.S and the EFI stub code. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Tested-by: Laura Abbott <labbott@redhat.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com>
2017-04-04arm64: efi: split Image code and data into separate PE/COFF sectionsArd Biesheuvel
To prevent unintended modifications to the kernel text (malicious or otherwise) while running the EFI stub, describe the kernel image as two separate sections: a .text section with read-execute permissions, covering .text, .rodata and .init.text, and a .data section with read-write permissions, covering .init.data, .data and .bss. This relies on the firmware to actually take the section permission flags into account, but this is something that is currently being implemented in EDK2, which means we will likely start seeing it in the wild between one and two years from now. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-03-23arm64: mmu: apply strict permissions to .init.text and .init.dataArd Biesheuvel
To avoid having mappings that are writable and executable at the same time, split the init region into a .init.text region that is mapped read-only, and a .init.data region that is mapped non-executable. This is possible now that the alternative patching occurs via the linear mapping, and the linear alias of the init region is always mapped writable (but never executable). Since the alternatives descriptions themselves are read-only data, move those into the .init.text region. Reviewed-by: Laura Abbott <labbott@redhat.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-11-21arm64: Introduce uaccess_{disable,enable} functionality based on TTBR0_EL1Catalin Marinas
This patch adds the uaccess macros/functions to disable access to user space by setting TTBR0_EL1 to a reserved zeroed page. Since the value written to TTBR0_EL1 must be a physical address, for simplicity this patch introduces a reserved_ttbr0 page at a constant offset from swapper_pg_dir. The uaccess_disable code uses the ttbr1_el1 value adjusted by the reserved_ttbr0 offset. Enabling access to user is done by restoring TTBR0_EL1 with the value from the struct thread_info ttbr0 variable. Interrupts must be disabled during the uaccess_ttbr0_enable code to ensure the atomicity of the thread_info.ttbr0 read and TTBR0_EL1 write. This patch also moves the get_thread_info asm macro from entry.S to assembler.h for reuse in the uaccess_ttbr0_* macros. Cc: Will Deacon <will.deacon@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Kees Cook <keescook@chromium.org> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-10-07nmi_backtrace: generate one-line reports for idle cpusChris Metcalf
When doing an nmi backtrace of many cores, most of which are idle, the output is a little overwhelming and very uninformative. Suppress messages for cpus that are idling when they are interrupted and just emit one line, "NMI backtrace for N skipped: idling at pc 0xNNN". We do this by grouping all the cpuidle code together into a new .cpuidle.text section, and then checking the address of the interrupted PC to see if it lies within that section. This commit suitably tags x86 and tile idle routines, and only adds in the minimal framework for other architectures. Link: http://lkml.kernel.org/r/1472487169-14923-5-git-send-email-cmetcalf@mellanox.com Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Daniel Thompson <daniel.thompson@linaro.org> [arm] Tested-by: Petr Mladek <pmladek@suse.com> Cc: Aaron Tomlin <atomlin@redhat.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Russell King <linux@arm.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>