summaryrefslogtreecommitdiff
path: root/arch/arm64/kernel/vmlinux.lds.S
AgeCommit message (Collapse)Author
2020-06-09mm: introduce include/linux/pgtable.hMike Rapoport
The include/linux/pgtable.h is going to be the home of generic page table manipulation functions. Start with moving asm-generic/pgtable.h to include/linux/pgtable.h and make the latter include asm/pgtable.h. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-3-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-28efi/libstub/arm64: align PE/COFF sections to segment alignmentArd Biesheuvel
The arm64 kernel's segment alignment is fixed at 64 KB for any page size, and relocatable kernels are able to fix up any misalignment of the kernel image with respect to the 2 MB section alignment that is mandated by the arm64 boot protocol. Let's increase the PE/COFF section alignment to the same value, so that kernels loaded by the UEFI PE/COFF loader are guaranteed to end up at an address that doesn't require any reallocation to be done if the kernel is relocatable. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20200413155521.24698-6-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2020-04-28arm64/kernel: Fix range on invalidating dcache for boot page tablesGavin Shan
Prior to commit 8eb7e28d4c642c31 ("arm64/mm: move runtime pgds to rodata"), idmap_pgd_dir, tramp_pg_dir, reserved_ttbr0, swapper_pg_dir, and init_pg_dir were contiguous at the end of the kernel image. The maintenance at the end of __create_page_tables assumed these were contiguous, and affected everything from the start of idmap_pg_dir to the end of init_pg_dir. That commit moved all but init_pg_dir into the .rodata section, with other data placed between idmap_pg_dir and init_pg_dir, but did not update the maintenance. Hence the maintenance is performed on much more data than necessary (but as the bootloader previously made this clean to the PoC there is no functional problem). As we only alter idmap_pg_dir, and init_pg_dir, we only need to perform maintenance for these. As the other dirs are in .rodata, the bootloader will have initialised them as expected and cleaned them to the PoC. The kernel will initialize them as necessary after enabling the MMU. This patch reworks the maintenance to only cover the idmap_pg_dir and init_pg_dir to avoid this unnecessary work. Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20200427235700.112220-1-gshan@redhat.com Signed-off-by: Will Deacon <will@kernel.org>
2020-04-28arm64/kernel: vmlinux.lds: drop redundant discard/keep macrosArd Biesheuvel
ARM_EXIT_KEEP and ARM_EXIT_DISCARD are always defined in the same way, so we don't really need them in the first place. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20200416132730.25290-1-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2020-04-28arm64: rename stext to primary_entryArd Biesheuvel
For historical reasons, the primary entry routine living somewhere in the inittext section is called stext(), which is confusing, given that there is also a section marker called _stext which lives at a fixed offset in the image (either 64 or 4096 bytes, depending on whether CONFIG_EFI is enabled) Let's rename stext to primary_entry(), which is a better description and reflects the secondary_entry() routine that already exists for SMP boot. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20200326171423.3080-1-ardb@kernel.org Reviwed-by: Mark Brown <broonie@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
2019-12-06Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - ZONE_DMA32 initialisation fix when memblocks fall entirely within the first GB (used by ZONE_DMA in 5.5 for Raspberry Pi 4). - Couple of ftrace fixes following the FTRACE_WITH_REGS patchset. - access_ok() fix for the Tagged Address ABI when called from from a kernel thread (asynchronous I/O): the kthread does not have the TIF flags of the mm owner, so untag the user address unconditionally. - KVM compute_layout() called before the alternatives code patching. - Minor clean-ups. * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: entry: refine comment of stack overflow check arm64: ftrace: fix ifdeffery arm64: KVM: Invoke compute_layout() before alternatives are applied arm64: Validate tagged addresses in access_ok() called from kernel threads arm64: mm: Fix column alignment for UXN in kernel_page_tables arm64: insn: consistently handle exit text arm64: mm: Fix initialisation of DMA zones on non-NUMA systems
2019-12-04arm64: insn: consistently handle exit textMark Rutland
A kernel built with KASAN && FTRACE_WITH_REGS && !MODULES, produces a boot-time splat in the bowels of ftrace: | [ 0.000000] ftrace: allocating 32281 entries in 127 pages | [ 0.000000] ------------[ cut here ]------------ | [ 0.000000] WARNING: CPU: 0 PID: 0 at kernel/trace/ftrace.c:2019 ftrace_bug+0x27c/0x328 | [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.4.0-rc3-00008-g7f08ae53a7e3 #13 | [ 0.000000] Hardware name: linux,dummy-virt (DT) | [ 0.000000] pstate: 60000085 (nZCv daIf -PAN -UAO) | [ 0.000000] pc : ftrace_bug+0x27c/0x328 | [ 0.000000] lr : ftrace_init+0x640/0x6cc | [ 0.000000] sp : ffffa000120e7e00 | [ 0.000000] x29: ffffa000120e7e00 x28: ffff00006ac01b10 | [ 0.000000] x27: ffff00006ac898c0 x26: dfffa00000000000 | [ 0.000000] x25: ffffa000120ef290 x24: ffffa0001216df40 | [ 0.000000] x23: 000000000000018d x22: ffffa0001244c700 | [ 0.000000] x21: ffffa00011bf393c x20: ffff00006ac898c0 | [ 0.000000] x19: 00000000ffffffff x18: 0000000000001584 | [ 0.000000] x17: 0000000000001540 x16: 0000000000000007 | [ 0.000000] x15: 0000000000000000 x14: ffffa00010432770 | [ 0.000000] x13: ffff940002483519 x12: 1ffff40002483518 | [ 0.000000] x11: 1ffff40002483518 x10: ffff940002483518 | [ 0.000000] x9 : dfffa00000000000 x8 : 0000000000000001 | [ 0.000000] x7 : ffff940002483519 x6 : ffffa0001241a8c0 | [ 0.000000] x5 : ffff940002483519 x4 : ffff940002483519 | [ 0.000000] x3 : ffffa00011780870 x2 : 0000000000000001 | [ 0.000000] x1 : 1fffe0000d591318 x0 : 0000000000000000 | [ 0.000000] Call trace: | [ 0.000000] ftrace_bug+0x27c/0x328 | [ 0.000000] ftrace_init+0x640/0x6cc | [ 0.000000] start_kernel+0x27c/0x654 | [ 0.000000] random: get_random_bytes called from print_oops_end_marker+0x30/0x60 with crng_init=0 | [ 0.000000] ---[ end trace 0000000000000000 ]--- | [ 0.000000] ftrace faulted on writing | [ 0.000000] [<ffffa00011bf393c>] _GLOBAL__sub_D_65535_0___tracepoint_initcall_level+0x4/0x28 | [ 0.000000] Initializing ftrace call sites | [ 0.000000] ftrace record flags: 0 | [ 0.000000] (0) | [ 0.000000] expected tramp: ffffa000100b3344 This is due to an unfortunate combination of several factors. Building with KASAN results in the compiler generating anonymous functions to register/unregister global variables against the shadow memory. These functions are placed in .text.startup/.text.exit, and given mangled names like _GLOBAL__sub_{I,D}_65535_0_$OTHER_SYMBOL. The kernel linker script places these in .init.text and .exit.text respectively, which are both discarded at runtime as part of initmem. Building with FTRACE_WITH_REGS uses -fpatchable-function-entry=2, which also instruments KASAN's anonymous functions. When these are discarded with the rest of initmem, ftrace removes dangling references to these call sites. Building without MODULES implicitly disables STRICT_MODULE_RWX, and causes arm64's patch_map() function to treat any !core_kernel_text() symbol as something that can be modified in-place. As core_kernel_text() is only true for .text and .init.text, with the latter depending on system_state < SYSTEM_RUNNING, we'll treat .exit.text as something that can be patched in-place. However, .exit.text is mapped read-only. Hence in this configuration the ftrace init code blows up while trying to patch one of the functions generated by KASAN. We could try to filter out the call sites in .exit.text rather than initializing them, but this would be inconsistent with how we handle .init.text, and requires hooking into core bits of ftrace. The behaviour of patch_map() is also inconsistent today, so instead let's clean that up and have it consistently handle .exit.text. This patch teaches patch_map() to handle .exit.text at init time, preventing the boot-time splat above. The flow of patch_map() is reworked to make the logic clearer and minimize redundant conditionality. Fixes: 3b23e4991fb66f6d ("arm64: implement ftrace with regs") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Amit Daniel Kachhap <amit.kachhap@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Torsten Duwe <duwe@suse.de> Cc: Will Deacon <will@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-11-26Merge branch 'x86-asm-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm updates from Ingo Molnar: "The main changes in this cycle were: - Cross-arch changes to move the linker sections for NOTES and EXCEPTION_TABLE into the RO_DATA area, where they belong on most architectures. (Kees Cook) - Switch the x86 linker fill byte from x90 (NOP) to 0xcc (INT3), to trap jumps into the middle of those padding areas instead of sliding execution. (Kees Cook) - A thorough cleanup of symbol definitions within x86 assembler code. The rather randomly named macros got streamlined around a (hopefully) straightforward naming scheme: SYM_START(name, linkage, align...) SYM_END(name, sym_type) SYM_FUNC_START(name) SYM_FUNC_END(name) SYM_CODE_START(name) SYM_CODE_END(name) SYM_DATA_START(name) SYM_DATA_END(name) etc - with about three times of these basic primitives with some label, local symbol or attribute variant, expressed via postfixes. No change in functionality intended. (Jiri Slaby) - Misc other changes, cleanups and smaller fixes" * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (67 commits) x86/entry/64: Remove pointless jump in paranoid_exit x86/entry/32: Remove unused resume_userspace label x86/build/vdso: Remove meaningless CFLAGS_REMOVE_*.o m68k: Convert missed RODATA to RO_DATA x86/vmlinux: Use INT3 instead of NOP for linker fill bytes x86/mm: Report actual image regions in /proc/iomem x86/mm: Report which part of kernel image is freed x86/mm: Remove redundant address-of operators on addresses xtensa: Move EXCEPTION_TABLE to RO_DATA segment powerpc: Move EXCEPTION_TABLE to RO_DATA segment parisc: Move EXCEPTION_TABLE to RO_DATA segment microblaze: Move EXCEPTION_TABLE to RO_DATA segment ia64: Move EXCEPTION_TABLE to RO_DATA segment h8300: Move EXCEPTION_TABLE to RO_DATA segment c6x: Move EXCEPTION_TABLE to RO_DATA segment arm64: Move EXCEPTION_TABLE to RO_DATA segment alpha: Move EXCEPTION_TABLE to RO_DATA segment x86/vmlinux: Move EXCEPTION_TABLE to RO_DATA segment x86/vmlinux: Actually use _etext for the end of the text segment vmlinux.lds.h: Allow EXCEPTION_TABLE to live in RO_DATA ...
2019-11-04arm64: Move EXCEPTION_TABLE to RO_DATA segmentKees Cook
Since the EXCEPTION_TABLE is read-only, collapse it into RO_DATA. Also removes the redundant ALIGN, which is already present at the end of the RO_DATA macro. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Will Deacon <will@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux-alpha@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: linux-ia64@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-s390@vger.kernel.org Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Peter Collingbourne <pcc@google.com> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Cc: x86-ml <x86@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: https://lkml.kernel.org/r/20191029211351.13243-19-keescook@chromium.org
2019-11-04vmlinux.lds.h: Replace RW_DATA_SECTION with RW_DATAKees Cook
Rename RW_DATA_SECTION to RW_DATA. (Calling this a "section" is a lie, since it's multiple sections and section flags cannot be applied to the macro.) Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # s390 Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: linux-alpha@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: linux-ia64@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Cc: Will Deacon <will@kernel.org> Cc: x86-ml <x86@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: https://lkml.kernel.org/r/20191029211351.13243-14-keescook@chromium.org
2019-11-04vmlinux.lds.h: Move NOTES into RO_DATAKees Cook
The .notes section should be non-executable read-only data. As such, move it to the RO_DATA macro instead of being per-architecture defined. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # s390 Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: linux-alpha@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: linux-ia64@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Cc: Will Deacon <will@kernel.org> Cc: x86-ml <x86@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: https://lkml.kernel.org/r/20191029211351.13243-11-keescook@chromium.org
2019-10-28arm64: remove __exception annotationsJames Morse
Since commit 732674980139 ("arm64: unwind: reference pt_regs via embedded stack frame") arm64 has not used the __exception annotation to dump the pt_regs during stack tracing. in_exception_text() has no callers. This annotation is only used to blacklist kprobes, it means the same as __kprobes. Section annotations like this require the functions to be grouped together between the start/end markers, and placed according to the linker script. For kprobes we also have NOKPROBE_SYMBOL() which logs the symbol address in a section that kprobes parses and blacklists at boot. Using NOKPROBE_SYMBOL() instead lets kprobes publish the list of blacklisted symbols, and saves us from having an arm64 specific spelling of __kprobes. do_debug_exception() already has a NOKPROBE_SYMBOL() annotation. Signed-off-by: James Morse <james.morse@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-08-14arm64/efi: Move variable assignments after SECTIONSKees Cook
It seems that LLVM's linker does not correctly handle variable assignments involving section positions that are updated during the SECTIONS parsing. Commit aa69fb62bea1 ("arm64/efi: Mark __efistub_stext_offset as an absolute symbol explicitly") ran into this too, but found a different workaround. However, this was not enough, as other variables were also miscalculated which manifested as boot failures under UEFI where __efistub__end was not taking the correct _end value (they should be the same): $ ld.lld -EL -maarch64elf --no-undefined -X -shared \ -Bsymbolic -z notext -z norelro --no-apply-dynamic-relocs \ -o vmlinux.lld -T poc.lds --whole-archive vmlinux.o && \ readelf -Ws vmlinux.lld | egrep '\b(__efistub_|)_end\b' 368272: ffff000002218000 0 NOTYPE LOCAL HIDDEN 38 __efistub__end 368322: ffff000012318000 0 NOTYPE GLOBAL DEFAULT 38 _end $ aarch64-linux-gnu-ld.bfd -EL -maarch64elf --no-undefined -X -shared \ -Bsymbolic -z notext -z norelro --no-apply-dynamic-relocs \ -o vmlinux.bfd -T poc.lds --whole-archive vmlinux.o && \ readelf -Ws vmlinux.bfd | egrep '\b(__efistub_|)_end\b' 338124: ffff000012318000 0 NOTYPE LOCAL DEFAULT ABS __efistub__end 383812: ffff000012318000 0 NOTYPE GLOBAL DEFAULT 15325 _end To work around this, all of the __efistub_-prefixed variable assignments need to be moved after the linker script's SECTIONS entry. As it turns out, this also solves the problem fixed in commit aa69fb62bea1, so those changes are reverted here. Link: https://github.com/ClangBuiltLinux/linux/issues/634 Link: https://bugs.llvm.org/show_bug.cgi?id=42990 Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Will Deacon <will@kernel.org>
2019-08-05arm64: Add support for relocating the kernel with RELR relocationsPeter Collingbourne
RELR is a relocation packing format for relative relocations. The format is described in a generic-abi proposal: https://groups.google.com/d/topic/generic-abi/bX460iggiKg/discussion The LLD linker can be instructed to pack relocations in the RELR format by passing the flag --pack-dyn-relocs=relr. This patch adds a new config option, CONFIG_RELR. Enabling this option instructs the linker to pack vmlinux's relative relocations in the RELR format, and causes the kernel to apply the relocations at startup along with the RELA relocations. RELA relocations still need to be applied because the linker will emit RELA relative relocations if they are unrepresentable in the RELR format (i.e. address not a multiple of 2). Enabling CONFIG_RELR reduces the size of a defconfig kernel image with CONFIG_RANDOMIZE_BASE by 3.5MB/16% uncompressed, or 550KB/5% compressed (lz4). Signed-off-by: Peter Collingbourne <pcc@google.com> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Will Deacon <will@kernel.org>
2018-12-04arm64: relocatable: fix inconsistencies in linker script and optionsArd Biesheuvel
readelf complains about the section layout of vmlinux when building with CONFIG_RELOCATABLE=y (for KASLR): readelf: Warning: [21]: Link field (0) should index a symtab section. readelf: Warning: [21]: Info field (0) should index a relocatable section. Also, it seems that our use of '-pie -shared' is contradictory, and thus ambiguous. In general, the way KASLR is wired up at the moment is highly tailored to how ld.bfd happens to implement (and conflate) PIE executables and shared libraries, so given the current effort to support other toolchains, let's fix some of these issues as well. - Drop the -pie linker argument and just leave -shared. In ld.bfd, the differences between them are unclear (except for the ELF type of the produced image [0]) but lld chokes on seeing both at the same time. - Rename the .rela output section to .rela.dyn, as is customary for shared libraries and PIE executables, so that it is not misidentified by readelf as a static relocation section (producing the warnings above). - Pass the -z notext and -z norelro options to explicitly instruct the linker to permit text relocations, and to omit the RELRO program header (which requires a certain section layout that we don't adhere to in the kernel). These are the defaults for current versions of ld.bfd. - Discard .eh_frame and .gnu.hash sections to avoid them from being emitted between .head.text and .text, screwing up the section layout. These changes only affect the ELF image, and produce the same binary image. [0] b9dce7f1ba01 ("arm64: kernel: force ET_DYN ELF type for ...") Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Smith <peter.smith@linaro.org> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-10-24Merge branch 'next-general' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "In this patchset, there are a couple of minor updates, as well as some reworking of the LSM initialization code from Kees Cook (these prepare the way for ordered stackable LSMs, but are a valuable cleanup on their own)" * 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: LSM: Don't ignore initialization failures LSM: Provide init debugging infrastructure LSM: Record LSM name in struct lsm_info LSM: Convert security_initcall() into DEFINE_LSM() vmlinux.lds.h: Move LSM_TABLE into INIT_DATA LSM: Convert from initcall to struct lsm_info LSM: Remove initcall tracing LSM: Rename .security_initcall section to .lsm_info vmlinux.lds.h: Avoid copy/paste of security_init section LSM: Correctly announce start of LSM initialization security: fix LSM description location keys: Fix the use of the C++ keyword "private" in uapi/linux/keyctl.h seccomp: remove unnecessary unlikely() security: tomoyo: Fix obsolete function security/capabilities: remove check for -EINVAL
2018-10-10vmlinux.lds.h: Move LSM_TABLE into INIT_DATAKees Cook
Since the struct lsm_info table is not an initcall, we can just move it into INIT_DATA like all the other tables. Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Casey Schaufler <casey@schaufler-ca.com> Reviewed-by: John Johansen <john.johansen@canonical.com> Reviewed-by: James Morris <james.morris@microsoft.com> Signed-off-by: James Morris <james.morris@microsoft.com>
2018-09-25arm64/mm: move runtime pgds to rodataJun Yao
Now that deliberate writes to swapper_pg_dir are made via the fixmap, we can defend against errant writes by moving it into the rodata section. Since tramp_pg_dir and reserved_ttbr0 must be at a fixed offset from swapper_pg_dir, and are not modified at runtime, these are also moved into the rodata section. Likewise, idmap_pg_dir is not modified at runtime, and is moved into rodata. Signed-off-by: Jun Yao <yaojun8558363@gmail.com> Reviewed-by: James Morse <james.morse@arm.com> [Mark: simplify linker script, commit message] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-09-25arm64/mm: Separate boot-time page tables from swapper_pg_dirJun Yao
Since the address of swapper_pg_dir is fixed for a given kernel image, it is an attractive target for manipulation via an arbitrary write. To mitigate this we'd like to make it read-only by moving it into the rodata section. We require that swapper_pg_dir is at a fixed offset from tramp_pg_dir and reserved_ttbr0, so these will also need to move into rodata. However, swapper_pg_dir is allocated along with some transient page tables used for boot which we do not want to move into rodata. As a step towards this, this patch separates the boot-time page tables into a new init_pg_dir, and reduces swapper_pg_dir to the single page it needs to be. This allows us to retain the relationship between swapper_pg_dir, tramp_pg_dir, and swapper_pg_dir, while cleanly separating these from the boot-time page tables. The init_pg_dir holds all of the pgd/pud/pmd/pte levels needed during boot, and all of these levels will be freed when we switch to the swapper_pg_dir, which is initialized by the existing code in paging_init(). Since we start off on the init_pg_dir, we no longer need to allocate a transient page table in paging_init() in order to ensure that swapper_pg_dir isn't live while we initialize it. There should be no functional change as a result of this patch. Signed-off-by: Jun Yao <yaojun8558363@gmail.com> Reviewed-by: James Morse <james.morse@arm.com> [Mark: place init_pg_dir after BSS, fold mm changes, commit message] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-05-15arm64: remove no-op macro VMLINUX_SYMBOL()Masahiro Yamada
VMLINUX_SYMBOL() is no-op unless CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX is defined. It has ever been selected only by BLACKFIN and METAG. VMLINUX_SYMBOL() is unneeded for ARM64-specific code. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-01-14arm64: Extend early page table code to allow for larger kernelsSteve Capper
Currently the early assembler page table code assumes that precisely 1xpgd, 1xpud, 1xpmd are sufficient to represent the early kernel text mappings. Unfortunately this is rarely the case when running with a 16KB granule, and we also run into limits with 4KB granule when building much larger kernels. This patch re-writes the early page table logic to compute indices of mappings for each level of page table, and if multiple indices are required, the next-level page table is scaled up accordingly. Also the required size of the swapper_pg_dir is computed at link time to cover the mapping [KIMAGE_ADDR + VOFFSET, _end]. When KASLR is enabled, an extra page is set aside for each level that may require extra entries at runtime. Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Steve Capper <steve.capper@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-01-14arm64: entry: Move the trampoline to be before PANSteve Capper
The trampoline page tables are positioned after the early page tables in the kernel linker script. As we are about to change the early page table logic to resolve the swapper size at link time as opposed to compile time, the SWAPPER_DIR_SIZE variable (currently used to locate the trampline) will be rendered unsuitable for low level assembler. This patch solves this issue by moving the trampoline before the PAN page tables. The offset to the trampoline from ttbr1 can then be expressed by: PAGE_SIZE + RESERVED_TTBR0_SIZE, which is available to the entry assembler. Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Steve Capper <steve.capper@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-01-14arm64: Re-order reserved_ttbr0 in linker scriptSteve Capper
Currently one resolves the location of the reserved_ttbr0 for PAN by taking a positive offset from swapper_pg_dir. In a future patch we wish to extend the swapper s.t. its size is determined at link time rather than comile time, rendering SWAPPER_DIR_SIZE unsuitable for such a low level calculation. In this patch we re-arrange the order of the linker script s.t. instead one computes reserved_ttbr0 by subtracting RESERVED_TTBR0_SIZE from swapper_pg_dir. Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Steve Capper <steve.capper@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-12-11arm64: kaslr: Put kernel vectors address in separate data pageWill Deacon
The literal pool entry for identifying the vectors base is the only piece of information in the trampoline page that identifies the true location of the kernel. This patch moves it into a page-aligned region of the .rodata section and maps this adjacent to the trampoline text via an additional fixmap entry, which protects against any accidental leakage of the trampoline contents. Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Laura Abbott <labbott@redhat.com> Tested-by: Shanker Donthineni <shankerd@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-12-11arm64: entry: Add exception trampoline page for exceptions from EL0Will Deacon
To allow unmapping of the kernel whilst running at EL0, we need to point the exception vectors at an entry trampoline that can map/unmap the kernel on entry/exit respectively. This patch adds the trampoline page, although it is not yet plugged into the vector table and is therefore unused. Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Laura Abbott <labbott@redhat.com> Tested-by: Shanker Donthineni <shankerd@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-15arm64: add basic VMAP_STACK supportMark Rutland
This patch enables arm64 to be built with vmap'd task and IRQ stacks. As vmap'd stacks are mapped at page granularity, stacks must be a multiple of PAGE_SIZE. This means that a 64K page kernel must use stacks of at least 64K in size. To minimize the increase in Image size, IRQ stacks are dynamically allocated at boot time, rather than embedding the boot CPU's IRQ stack in the kernel image. This patch was co-authored by Ard Biesheuvel and Mark Rutland. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Tested-by: Laura Abbott <labbott@redhat.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com>
2017-08-15arm64: move SEGMENT_ALIGN to <asm/memory.h>Mark Rutland
Currently we define SEGMENT_ALIGN directly in our vmlinux.lds.S. This is unfortunate, as the EFI stub currently open-codes the same number, and in future we'll want to fiddle with this. This patch moves the definition to our <asm/memory.h>, where it can be used by both vmlinux.lds.S and the EFI stub code. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Tested-by: Laura Abbott <labbott@redhat.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com>
2017-04-04arm64: efi: split Image code and data into separate PE/COFF sectionsArd Biesheuvel
To prevent unintended modifications to the kernel text (malicious or otherwise) while running the EFI stub, describe the kernel image as two separate sections: a .text section with read-execute permissions, covering .text, .rodata and .init.text, and a .data section with read-write permissions, covering .init.data, .data and .bss. This relies on the firmware to actually take the section permission flags into account, but this is something that is currently being implemented in EDK2, which means we will likely start seeing it in the wild between one and two years from now. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-03-23arm64: mmu: apply strict permissions to .init.text and .init.dataArd Biesheuvel
To avoid having mappings that are writable and executable at the same time, split the init region into a .init.text region that is mapped read-only, and a .init.data region that is mapped non-executable. This is possible now that the alternative patching occurs via the linear mapping, and the linear alias of the init region is always mapped writable (but never executable). Since the alternatives descriptions themselves are read-only data, move those into the .init.text region. Reviewed-by: Laura Abbott <labbott@redhat.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-11-21arm64: Introduce uaccess_{disable,enable} functionality based on TTBR0_EL1Catalin Marinas
This patch adds the uaccess macros/functions to disable access to user space by setting TTBR0_EL1 to a reserved zeroed page. Since the value written to TTBR0_EL1 must be a physical address, for simplicity this patch introduces a reserved_ttbr0 page at a constant offset from swapper_pg_dir. The uaccess_disable code uses the ttbr1_el1 value adjusted by the reserved_ttbr0 offset. Enabling access to user is done by restoring TTBR0_EL1 with the value from the struct thread_info ttbr0 variable. Interrupts must be disabled during the uaccess_ttbr0_enable code to ensure the atomicity of the thread_info.ttbr0 read and TTBR0_EL1 write. This patch also moves the get_thread_info asm macro from entry.S to assembler.h for reuse in the uaccess_ttbr0_* macros. Cc: Will Deacon <will.deacon@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Kees Cook <keescook@chromium.org> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-10-07nmi_backtrace: generate one-line reports for idle cpusChris Metcalf
When doing an nmi backtrace of many cores, most of which are idle, the output is a little overwhelming and very uninformative. Suppress messages for cpus that are idling when they are interrupted and just emit one line, "NMI backtrace for N skipped: idling at pc 0xNNN". We do this by grouping all the cpuidle code together into a new .cpuidle.text section, and then checking the address of the interrupted PC to see if it lies within that section. This commit suitably tags x86 and tile idle routines, and only adds in the minimal framework for other architectures. Link: http://lkml.kernel.org/r/1472487169-14923-5-git-send-email-cmetcalf@mellanox.com Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Daniel Thompson <daniel.thompson@linaro.org> [arm] Tested-by: Petr Mladek <pmladek@suse.com> Cc: Aaron Tomlin <atomlin@redhat.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Russell King <linux@arm.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-25arm64: vmlinux.ld: Add mmuoff data sections and move mmuoff text into idmapJames Morse
Resume from hibernate needs to clean any text executed by the kernel with the MMU off to the PoC. Collect these functions together into the .idmap.text section as all this code is tightly coupled and also needs the same cleaning after resume. Data is more complicated, secondary_holding_pen_release is written with the MMU on, clean and invalidated, then read with the MMU off. In contrast __boot_cpu_mode is written with the MMU off, the corresponding cache line is invalidated, so when we read it with the MMU on we don't get stale data. These cache maintenance operations conflict with each other if the values are within a Cache Writeback Granule (CWG) of each other. Collect the data into two sections .mmuoff.data.read and .mmuoff.data.write, the linker script ensures mmuoff.data.write section is aligned to the architectural maximum CWG of 2KB. Signed-off-by: James Morse <james.morse@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-07-29arm64: relocatable: suppress R_AARCH64_ABS64 relocations in vmlinuxArd Biesheuvel
The linker routines that we rely on to produce a relocatable PIE binary treat it as a shared ELF object in some ways, i.e., it emits symbol based R_AARCH64_ABS64 relocations into the final binary since doing so would be appropriate when linking a shared library that is subject to symbol preemption. (This means that an executable can override certain symbols that are exported by a shared library it is linked with, and that the shared library *must* update all its internal references as well, and point them to the version provided by the executable.) Symbol preemption does not occur for OS hosted PIE executables, let alone for vmlinux, and so we would prefer to get rid of these symbol based relocations. This would allow us to simplify the relocation routines, and to strip the .dynsym, .dynstr and .hash sections from the binary. (Note that these are tiny, and are placed in the .init segment, but they clutter up the vmlinux binary.) Note that these R_AARCH64_ABS64 relocations are only emitted for absolute references to symbols defined in the linker script, all other relocatable quantities are covered by anonymous R_AARCH64_RELATIVE relocations that simply list the offsets to all 64-bit values in the binary that need to be fixed up based on the offset between the link time and run time addresses. Fortunately, GNU ld has a -Bsymbolic option, which is intended for shared libraries to allow them to ignore symbol preemption, and unconditionally bind all internal symbol references to its own definitions. So set it for our PIE binary as well, and get rid of the asoociated sections and the relocation code that processes them. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> [will: fixed conflict with __dynsym_offset linker script entry] Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-07-29arm64: vmlinux.lds: make __rela_offset and __dynsym_offset ABSOLUTEArd Biesheuvel
Due to the untyped KIMAGE_VADDR constant, the linker may not notice that the __rela_offset and __dynsym_offset expressions are absolute values (i.e., are not subject to relocation). This does not matter for KASLR, but it does confuse kallsyms in relative mode, since it uses the lowest non-absolute symbol address as the anchor point, and expects all other symbol addresses to be within 4 GB of it. Fix this by qualifying these expressions as ABSOLUTE() explicitly. Fixes: 0cd3defe0af4 ("arm64: kernel: perform relocation processing from ID map") Cc: <stable@vger.kernel.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-07-21Merge branch 'for-next/kprobes' into for-next/coreCatalin Marinas
* kprobes: arm64: kprobes: Add KASAN instrumentation around stack accesses arm64: kprobes: Cleanup jprobe_return arm64: kprobes: Fix overflow when saving stack arm64: kprobes: WARN if attempting to step with PSTATE.D=1 kprobes: Add arm64 case in kprobe example module arm64: Add kernel return probes support (kretprobes) arm64: Add trampoline code for kretprobes arm64: kprobes instruction simulation support arm64: Treat all entry code as non-kprobe-able arm64: Blacklist non-kprobe-able symbol arm64: Kprobes with single stepping support arm64: add conditional instruction simulation support arm64: Add more test functions to insn.c arm64: Add HAVE_REGS_AND_STACK_ACCESS_API feature
2016-07-19arm64: Treat all entry code as non-kprobe-ablePratyush Anand
Entry symbols are not kprobe safe. So blacklist them for kprobing. Signed-off-by: Pratyush Anand <panand@redhat.com> Signed-off-by: David A. Long <dave.long@linaro.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> [catalin.marinas@arm.com: Do not include syscall wrappers in .entry.text] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-07-19arm64: Kprobes with single stepping supportSandeepa Prabhu
Add support for basic kernel probes(kprobes) and jump probes (jprobes) for ARM64. Kprobes utilizes software breakpoint and single step debug exceptions supported on ARM v8. A software breakpoint is placed at the probe address to trap the kernel execution into the kprobe handler. ARM v8 supports enabling single stepping before the break exception return (ERET), with next PC in exception return address (ELR_EL1). The kprobe handler prepares an executable memory slot for out-of-line execution with a copy of the original instruction being probed, and enables single stepping. The PC is set to the out-of-line slot address before the ERET. With this scheme, the instruction is executed with the exact same register context except for the PC (and DAIF) registers. Debug mask (PSTATE.D) is enabled only when single stepping a recursive kprobe, e.g.: during kprobes reenter so that probed instruction can be single stepped within the kprobe handler -exception- context. The recursion depth of kprobe is always 2, i.e. upon probe re-entry, any further re-entry is prevented by not calling handlers and the case counted as a missed kprobe). Single stepping from the x-o-l slot has a drawback for PC-relative accesses like branching and symbolic literals access as the offset from the new PC (slot address) may not be ensured to fit in the immediate value of the opcode. Such instructions need simulation, so reject probing them. Instructions generating exceptions or cpu mode change are rejected for probing. Exclusive load/store instructions are rejected too. Additionally, the code is checked to see if it is inside an exclusive load/store sequence (code from Pratyush). System instructions are mostly enabled for stepping, except MSR/MRS accesses to "DAIF" flags in PSTATE, which are not safe for probing. This also changes arch/arm64/include/asm/ptrace.h to use include/asm-generic/ptrace.h. Thanks to Steve Capper and Pratyush Anand for several suggested Changes. Signed-off-by: Sandeepa Prabhu <sandeepa.s.prabhu@gmail.com> Signed-off-by: David A. Long <dave.long@linaro.org> Signed-off-by: Pratyush Anand <panand@redhat.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-06-27arm64: mm: fix location of _etextArd Biesheuvel
As Kees Cook notes in the ARM counterpart of this patch [0]: The _etext position is defined to be the end of the kernel text code, and should not include any part of the data segments. This interferes with things that might check memory ranges and expect executable code up to _etext. In particular, Kees is referring to the HARDENED_USERCOPY patch set [1], which rejects attempts to call copy_to_user() on kernel ranges containing executable code, but does allow access to the .rodata segment. Regardless of whether one may or may not agree with the distinction, it makes sense for _etext to have the same meaning across architectures. So let's put _etext where it belongs, between .text and .rodata, and fix up existing references to use __init_begin instead, which unlike _end_rodata includes the exception and notes sections as well. The _etext references in kaslr.c are left untouched, since its references to [_stext, _etext) are meant to capture potential jump instruction targets, and so disregarding .rodata is actually an improvement here. [0] http://article.gmane.org/gmane.linux.kernel/2245084 [1] http://thread.gmane.org/gmane.linux.kernel.hardened.devel/2502 Reported-by: Kees Cook <keescook@chromium.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-04-28arm64: kernel: Add support for hibernate/suspend-to-diskJames Morse
Add support for hibernate/suspend-to-disk. Suspend borrows code from cpu_suspend() to write cpu state onto the stack, before calling swsusp_save() to save the memory image. Restore creates a set of temporary page tables, covering only the linear map, copies the restore code to a 'safe' page, then uses the copy to restore the memory image. The copied code executes in the lower half of the address space, and once complete, restores the original kernel's page tables. It then calls into cpu_resume(), and follows the normal cpu_suspend() path back into the suspend code. To restore a kernel using KASLR, the address of the page tables, and cpu_resume() are stored in the hibernate arch-header and the el2 vectors are pivotted via the 'safe' page in low memory. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Kevin Hilman <khilman@baylibre.com> # Tested on Juno R2 Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-04-26arm64: kernel: perform relocation processing from ID mapArd Biesheuvel
Refactor the relocation processing so that the code executes from the ID map while accessing the relocation tables via the virtual mapping. This way, we can use literals containing virtual addresses as before, instead of having to use convoluted absolute expressions. For symmetry with the secondary code path, the relocation code and the subsequent jump to the virtual entry point are implemented in a function called __primary_switch(), and __mmap_switched() is renamed to __primary_switched(). Also, the call sequence in stext() is aligned with the one in secondary_startup(), by replacing the awkward 'adr_l lr' and 'b cpu_setup' sequence with a simple branch and link. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-04-14arm64: simplify kernel segment mapping granularityArd Biesheuvel
The mapping of the kernel consist of four segments, each of which is mapped with different permission attributes and/or lifetimes. To optimize the TLB and translation table footprint, we define various opaque constants in the linker script that resolve to different aligment values depending on the page size and whether CONFIG_DEBUG_ALIGN_RODATA is set. Considering that - a 4 KB granule kernel benefits from a 64 KB segment alignment (due to the fact that it allows the use of the contiguous bit), - the minimum alignment of the .data segment is THREAD_SIZE already, not PAGE_SIZE (i.e., we already have padding between _data and the start of the .data payload in many cases), - 2 MB is a suitable alignment value on all granule sizes, either for mapping directly (level 2 on 4 KB), or via the contiguous bit (level 3 on 16 KB and 64 KB), - anything beyond 2 MB exceeds the minimum alignment mandated by the boot protocol, and can only be mapped efficiently if the physical alignment happens to be the same, we can simplify this by standardizing on 64 KB (or 2 MB) explicitly, i.e., regardless of granule size, all segments are aligned either to 64 KB, or to 2 MB if CONFIG_DEBUG_ALIGN_RODATA=y. This also means we can drop the Kconfig dependency of CONFIG_DEBUG_ALIGN_RODATA on CONFIG_ARM64_4K_PAGES. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-04-14arm64: cover the .head.text section in the .text segment mappingArd Biesheuvel
Keeping .head.text out of the .text mapping buys us very little: its actual payload is only 4 KB, most of which is padding, but the page alignment may add up to 2 MB (in case of CONFIG_DEBUG_ALIGN_RODATA=y) of additional padding to the uncompressed kernel Image. Also, on 4 KB granule kernels, the 4 KB misalignment of .text forces us to map the adjacent 56 KB of code without the PTE_CONT attribute, and since this region contains things like the vector table and the GIC interrupt handling entry point, this region is likely to benefit from the reduced TLB pressure that results from PTE_CONT mappings. So remove the alignment between the .head.text and .text sections, and use the [_text, _etext) rather than the [_stext, _etext) interval for mapping the .text segment. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-03-25arch, ftrace: for KASAN put hard/soft IRQ entries into separate sectionsAlexander Potapenko
KASAN needs to know whether the allocation happens in an IRQ handler. This lets us strip everything below the IRQ entry point to reduce the number of unique stack traces needed to be stored. Move the definition of __irq_entry to <linux/interrupt.h> so that the users don't need to pull in <linux/ftrace.h>. Also introduce the __softirq_entry macro which is similar to __irq_entry, but puts the corresponding functions to the .softirqentry.text section. Signed-off-by: Alexander Potapenko <glider@google.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Andrey Konovalov <adech.fo@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Konstantin Serebryany <kcc@google.com> Cc: Dmitry Chernenkov <dmitryc@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-20Merge branch 'efi-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI updates from Ingo Molnar: "The main changes are: - Use separate EFI page tables when executing EFI firmware code. This isolates the EFI context from the rest of the kernel, which has security and general robustness advantages. (Matt Fleming) - Run regular UEFI firmware with interrupts enabled. This is already the status quo under other OSs. (Ard Biesheuvel) - Various x86 EFI enhancements, such as the use of non-executable attributes for EFI memory mappings. (Sai Praneeth Prakhya) - Various arm64 UEFI enhancements. (Ard Biesheuvel) - ... various fixes and cleanups. The separate EFI page tables feature got delayed twice already, because it's an intrusive change and we didn't feel confident about it - third time's the charm we hope!" * 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits) x86/mm/pat: Fix boot crash when 1GB pages are not supported by the CPU x86/efi: Only map kernel text for EFI mixed mode x86/efi: Map EFI_MEMORY_{XP,RO} memory region bits to EFI page tables x86/mm/pat: Don't implicitly allow _PAGE_RW in kernel_map_pages_in_pgd() efi/arm*: Perform hardware compatibility check efi/arm64: Check for h/w support before booting a >4 KB granular kernel efi/arm: Check for LPAE support before booting a LPAE kernel efi/arm-init: Use read-only early mappings efi/efistub: Prevent __init annotations from being used arm64/vmlinux.lds.S: Handle .init.rodata.xxx and .init.bss sections efi/arm64: Drop __init annotation from handle_kernel_image() x86/mm/pat: Use _PAGE_GLOBAL bit for EFI page table mappings efi/runtime-wrappers: Run UEFI Runtime Services with interrupts enabled efi: Reformat GUID tables to follow the format in UEFI spec efi: Add Persistent Memory type name efi: Add NV memory attribute x86/efi: Show actual ending addresses in efi_print_memmap x86/efi/bgrt: Don't ignore the BGRT if the 'valid' bit is 0 efivars: Use to_efivar_entry efi: Runtime-wrapper: Get rid of the rtc_lock spinlock ...
2016-02-26arm64: mm: Mark .rodata as ROJeremy Linton
Currently the .rodata section is actually still executable when DEBUG_RODATA is enabled. This changes that so the .rodata is actually read only, no execute. It also adds the .rodata section to the mem_init banner. Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Mark Rutland <mark.rutland@arm.com> [catalin.marinas@arm.com: added vm_struct vmlinux_rodata in map_kernel()] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-24arm64: add support for building vmlinux as a relocatable PIE binaryArd Biesheuvel
This implements CONFIG_RELOCATABLE, which links the final vmlinux image with a dynamic relocation section, allowing the early boot code to perform a relocation to a different virtual address at runtime. This is a prerequisite for KASLR (CONFIG_RANDOMIZE_BASE). Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-22arm64/vmlinux.lds.S: Handle .init.rodata.xxx and .init.bss sectionsArd Biesheuvel
The EFI stub is typically built into the decompressor (x86, ARM) so none of its symbols are annotated as __init. However, on arm64, the stub is linked into the kernel proper, and the code is __init annotated at the section level by prepending all names of SHF_ALLOC sections with '.init'. This results in section names like .init.rodata.str1.8 (for string literals) and .init.bss (which is tiny), both of which can be moved into the .init.data output section. Tested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1455712566-16727-6-git-send-email-matt@codeblueprint.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-18arm64: introduce KIMAGE_VADDR as the virtual base of the kernel regionArd Biesheuvel
This introduces the preprocessor symbol KIMAGE_VADDR which will serve as the symbolic virtual base of the kernel region, i.e., the kernel's virtual offset will be KIMAGE_VADDR + TEXT_OFFSET. For now, we define it as being equal to PAGE_OFFSET, but in the future, it will be moved below it once we move the kernel virtual mapping out of the linear mapping. Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-16arm64: ensure _stext and _etext are page-alignedMark Rutland
Currently we have separate ALIGN_DEBUG_RO{,_MIN} directives to align _etext and __init_begin. While we ensure that __init_begin is page-aligned, we do not provide the same guarantee for _etext. This is not problematic currently as the alignment of __init_begin is sufficient to prevent issues when we modify permissions. Subsequent patches will assume page alignment of segments of the kernel we wish to map with different permissions. To ensure this, move _etext after the ALIGN_DEBUG_RO_MIN for the init section. This renders the prior ALIGN_DEBUG_RO irrelevant, and hence it is removed. Likewise, upgrade to ALIGN_DEBUG_RO_MIN(PAGE_SIZE) for _stext. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Jeremy Linton <jeremy.linton@arm.com> Cc: Laura Abbott <labbott@fedoraproject.org> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>