linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2021-11-08	arm64: pgtable: make __pte_to_phys/__phys_to_pte_val inline functions	Arnd Bergmann
	gcc warns about undefined behavior the vmalloc code when building with CONFIG_ARM64_PA_BITS_52, when the 'idx++' in the argument to __phys_to_pte_val() is evaluated twice: mm/vmalloc.c: In function 'vmap_pfn_apply': mm/vmalloc.c:2800:58: error: operation on 'data->idx' may be undefined [-Werror=sequence-point] 2800 \| pte = pte_mkspecial(pfn_pte(data->pfns[data->idx++], data->prot)); \| ~~~~~~~~~^~ arch/arm64/include/asm/pgtable-types.h:25:37: note: in definition of macro '__pte' 25 \| #define __pte(x) ((pte_t) { (x) } ) \| ^ arch/arm64/include/asm/pgtable.h:80:15: note: in expansion of macro '__phys_to_pte_val' 80 \| __pte(__phys_to_pte_val((phys_addr_t)(pfn) << PAGE_SHIFT) \| pgprot_val(prot)) \| ^~~~~~~~~~~~~~~~~ mm/vmalloc.c:2800:30: note: in expansion of macro 'pfn_pte' 2800 \| pte = pte_mkspecial(pfn_pte(data->pfns[data->idx++], data->prot)); \| ^~~~~~~ I have no idea why this never showed up earlier, but the safest workaround appears to be changing those macros into inline functions so the arguments get evaluated only once. Cc: Matthew Wilcox <willy@infradead.org> Fixes: 75387b92635e ("arm64: handle 52-bit physical addresses in page table entries") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20211105075414.2553155-1-arnd@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2021-11-08	arm64: Track no early_pgtable_alloc() for kmemleak	Qian Cai
	After switched page size from 64KB to 4KB on several arm64 servers here, kmemleak starts to run out of early memory pool due to a huge number of those early_pgtable_alloc() calls: kmemleak_alloc_phys() memblock_alloc_range_nid() memblock_phys_alloc_range() early_pgtable_alloc() init_pmd() alloc_init_pud() __create_pgd_mapping() __map_memblock() paging_init() setup_arch() start_kernel() Increased the default value of DEBUG_KMEMLEAK_MEM_POOL_SIZE by 4 times won't be enough for a server with 200GB+ memory. There isn't much interesting to check memory leaks for those early page tables and those early memory mappings should not reference to other memory. Hence, no kmemleak false positives, and we can safely skip tracking those early allocations from kmemleak like we did in the commit fed84c785270 ("mm/memblock.c: skip kmemleak for kasan_init()") without needing to introduce complications to automatically scale the value depends on the runtime memory size etc. After the patch, the default value of DEBUG_KMEMLEAK_MEM_POOL_SIZE becomes sufficient again. Signed-off-by: Qian Cai <quic_qiancai@quicinc.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Link: https://lore.kernel.org/r/20211105150509.7826-1-quic_qiancai@quicinc.com Signed-off-by: Will Deacon <will@kernel.org>
2021-11-08	arm64: mte: change PR_MTE_TCF_NONE back into an unsigned long	Peter Collingbourne
	This constant was previously an unsigned long, but was changed into an int in commit 433c38f40f6a ("arm64: mte: change ASYNC and SYNC TCF settings into bitfields"). This ended up causing spurious unsigned-signed comparison warnings in expressions such as: (x & PR_MTE_TCF_MASK) != PR_MTE_TCF_NONE Therefore, change it back into an unsigned long to silence these warnings. Link: https://linux-review.googlesource.com/id/I07a72310db30227a5b7d789d0b817d78b657c639 Signed-off-by: Peter Collingbourne <pcc@google.com> Link: https://lore.kernel.org/r/20211105230829.2254790-1-pcc@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-11-08	arm64: vdso: remove -nostdlib compiler flag	Masahiro Yamada
	The -nostdlib option requests the compiler to not use the standard system startup files or libraries when linking. It is effective only when $(CC) is used as a linker driver. Since commit 691efbedc60d ("arm64: vdso: use $(LD) instead of $(CC) to link VDSO"), $(LD) is directly used, hence -nostdlib is unneeded. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20211107161802.323125-1-masahiroy@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2021-11-08	arm64: arm64_ftr_reg->name may not be a human-readable string	Reiji Watanabe
	The id argument of ARM64_FTR_REG_OVERRIDE() is used for two purposes: one as the system register encoding (used for the sys_id field of __ftr_reg_entry), and the other as the register name (stringified and used for the name field of arm64_ftr_reg), which is debug information. The id argument is supposed to be a macro that indicates an encoding of the register (eg. SYS_ID_AA64PFR0_EL1, etc). ARM64_FTR_REG(), which also has the same id argument, uses ARM64_FTR_REG_OVERRIDE() and passes the id to the macro. Since the id argument is completely macro-expanded before it is substituted into a macro body of ARM64_FTR_REG_OVERRIDE(), the stringified id in the body of ARM64_FTR_REG_OVERRIDE is not a human-readable register name, but a string of numeric bitwise operations. Fix this so that human-readable register names are available as debug information. Fixes: 8f266a5d878a ("arm64: cpufeature: Add global feature override facility") Signed-off-by: Reiji Watanabe <reijiw@google.com> Reviewed-by: Oliver Upton <oupton@google.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211101045421.2215822-1-reijiw@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-29	Merge branch 'for-next/fixes' into for-next/core	Will Deacon
	Merge for-next/fixes to resolve conflicts in arm64_hugetlb_cma_reserve(). * for-next/fixes: acpi/arm64: fix next_platform_timer() section mismatch error arm64/hugetlb: fix CMA gigantic page order for non-4K PAGE_SIZE
2021-10-29	Merge branch 'for-next/vdso' into for-next/core	Will Deacon
	* for-next/vdso: arm64: vdso32: require CROSS_COMPILE_COMPAT for gcc+bfd arm64: vdso32: suppress error message for 'make mrproper' arm64: vdso32: drop test for -march=armv8-a arm64: vdso32: drop the test for dmb ishld
2021-10-29	Merge branch 'for-next/trbe-errata' into for-next/core	Will Deacon
	* for-next/trbe-errata: arm64: errata: Add detection for TRBE write to out-of-range arm64: errata: Add workaround for TSB flush failures arm64: errata: Add detection for TRBE overwrite in FILL mode arm64: Add Neoverse-N2, Cortex-A710 CPU part definition
2021-10-29	Merge branch 'for-next/sve' into for-next/core	Will Deacon
	* for-next/sve: arm64/sve: Fix warnings when SVE is disabled arm64/sve: Add stub for sve_max_virtualisable_vl() arm64/sve: Track vector lengths for tasks in an array arm64/sve: Explicitly load vector length when restoring SVE state arm64/sve: Put system wide vector length information into structs arm64/sve: Use accessor functions for vector lengths in thread_struct arm64/sve: Rename find_supported_vector_length() arm64/sve: Make access to FFR optional arm64/sve: Make sve_state_size() static arm64/sve: Remove sve_load_from_fpsimd_state() arm64/fp: Reindent fpsimd_save()
2021-10-29	Merge branch 'for-next/scs' into for-next/core	Will Deacon
	* for-next/scs: scs: Release kasan vmalloc poison in scs_free process
2021-10-29	Merge branch 'for-next/pfn-valid' into for-next/core	Will Deacon
	* for-next/pfn-valid: arm64/mm: drop HAVE_ARCH_PFN_VALID dma-mapping: remove bogus test for pfn_valid from dma_map_resource
2021-10-29	Merge branch 'for-next/perf' into for-next/core	Will Deacon
	* for-next/perf: drivers/perf: Improve build test coverage drivers/perf: thunderx2_pmu: Change data in size tx2_uncore_event_update() drivers/perf: hisi: Fix PA PMU counter offset
2021-10-29	Merge branch 'for-next/mte' into for-next/core	Will Deacon
	* for-next/mte: kasan: Extend KASAN mode kernel parameter arm64: mte: Add asymmetric mode support arm64: mte: CPU feature detection for Asymm MTE arm64: mte: Bitfield definitions for Asymm MTE kasan: Remove duplicate of kasan_flag_async arm64: kasan: mte: move GCR_EL1 switch to task switch when KASAN disabled
2021-10-29	Merge branch 'for-next/mm' into for-next/core	Will Deacon
	* for-next/mm: arm64: mm: update max_pfn after memory hotplug arm64/mm: Add pud_sect_supported() arm64: mm: Drop pointless call to set_max_mapnr()
2021-10-29	Merge branch 'for-next/misc' into for-next/core	Will Deacon
	* for-next/misc: arm64: Select POSIX_CPU_TIMERS_TASK_WORK arm64: Document boot requirements for FEAT_SME_FA64 arm64: ftrace: use function_nocfi for _mcount as well arm64: asm: setup.h: export common variables arm64/traps: Avoid unnecessary kernel/user pointer conversion
2021-10-29	Merge branch 'for-next/kselftest' into for-next/core	Will Deacon
	* for-next/kselftest: selftests: arm64: Factor out utility functions for assembly FP tests selftests: arm64: Add coverage of ptrace flags for SVE VL inheritance selftests: arm64: Verify that all possible vector lengths are handled selftests: arm64: Fix and enable test for setting current VL in vec-syscfg selftests: arm64: Remove bogus error check on writing to files selftests: arm64: Fix printf() format mismatch in vec-syscfg selftests: arm64: Move FPSIMD in SVE ptrace test into a function selftests: arm64: More comprehensively test the SVE ptrace interface selftests: arm64: Verify interoperation of SVE and FPSIMD register sets selftests: arm64: Clarify output when verifying SVE register set selftests: arm64: Document what the SVE ptrace test is doing selftests: arm64: Remove extraneous register setting code selftests: arm64: Don't log child creation as a test in SVE ptrace test selftests: arm64: Use a define for the number of SVE ptrace tests to be run
2021-10-29	Merge branch 'for-next/kexec' into for-next/core	Will Deacon
	* for-next/kexec: arm64: trans_pgd: remove trans_pgd_map_page() arm64: kexec: remove cpu-reset.h arm64: kexec: remove the pre-kexec PoC maintenance arm64: kexec: keep MMU enabled during kexec relocation arm64: kexec: install a copy of the linear-map arm64: kexec: use ld script for relocation function arm64: kexec: relocate in EL1 mode arm64: kexec: configure EL2 vectors for kexec arm64: kexec: pass kimage as the only argument to relocation function arm64: kexec: Use dcache ops macros instead of open-coding arm64: kexec: skip relocation code for inplace kexec arm64: kexec: flush image and lists during kexec load time arm64: hibernate: abstract ttrb0 setup function arm64: trans_pgd: hibernate: Add trans_pgd_copy_el2_vectors arm64: kernel: add helper for booted at EL2 and not VHE
2021-10-29	Merge branch 'for-next/extable' into for-next/core	Will Deacon
	* for-next/extable: arm64: vmlinux.lds.S: remove `.fixup` section arm64: extable: add load_unaligned_zeropad() handler arm64: extable: add a dedicated uaccess handler arm64: extable: add `type` and `data` fields arm64: extable: use `ex` for `exception_table_entry` arm64: extable: make fixup_exception() return bool arm64: extable: consolidate definitions arm64: gpr-num: support W registers arm64: factor out GPR numbering helpers arm64: kvm: use kvm_exception_table_entry arm64: lib: __arch_copy_to_user(): fold fixups into body arm64: lib: __arch_copy_from_user(): fold fixups into body arm64: lib: __arch_clear_user(): fold fixups into body
2021-10-29	Merge branch 'for-next/8.6-timers' into for-next/core	Will Deacon
	* for-next/8.6-timers: arm64: Add HWCAP for self-synchronising virtual counter arm64: Add handling of CNTVCTSS traps arm64: Add CNT{P,V}CTSS_EL0 alternatives to cnt{p,v}ct_el0 arm64: Add a capability for FEAT_ECV clocksource/drivers/arch_arm_timer: Move workaround synchronisation around clocksource/drivers/arm_arch_timer: Fix masking for high freq counters clocksource/drivers/arm_arch_timer: Drop unnecessary ISB on CVAL programming clocksource/drivers/arm_arch_timer: Remove any trace of the TVAL programming interface clocksource/drivers/arm_arch_timer: Work around broken CVAL implementations clocksource/drivers/arm_arch_timer: Advertise 56bit timer to the core code clocksource/drivers/arm_arch_timer: Move MMIO timer programming over to CVAL clocksource/drivers/arm_arch_timer: Fix MMIO base address vs callback ordering issue clocksource/drivers/arm_arch_timer: Move drop _tval from erratum function names clocksource/drivers/arm_arch_timer: Move system register timer programming over to CVAL clocksource/drivers/arm_arch_timer: Extend write side of timer register accessors to u64 clocksource/drivers/arm_arch_timer: Drop CNT*_TVAL read accessors clocksource/arm_arch_timer: Add build-time guards for unhandled register accesses
2021-10-28	arm64: Select POSIX_CPU_TIMERS_TASK_WORK	Nicolas Saenz Julienne
	With 6caa5812e2d1 ("KVM: arm64: Use generic KVM xfer to guest work function") all arm64 exit paths are properly equipped to handle the POSIX timers' task work. Deferring timer callbacks to thread context, not only limits the amount of time spent in hard interrupt context, but is a safer implementation[1], and will allow PREEMPT_RT setups to use KVM[2]. So let's enable POSIX_CPU_TIMERS_TASK_WORK on arm64. [1] https://lore.kernel.org/all/20200716201923.228696399@linutronix.de/ [2] https://lore.kernel.org/linux-rt-users/87v92bdnlx.ffs@tglx/ Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211018144713.873464-1-nsaenzju@redhat.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-28	arm64: Document boot requirements for FEAT_SME_FA64	Mark Brown
	The EAC1 release of the SME specification adds the FA64 feature which requires enablement at higher ELs before lower ELs can use it. Document what we require from higher ELs in our boot requirements. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20211026111802.12853-1-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2021-10-26	arm64/sve: Fix warnings when SVE is disabled	Mark Brown
	In configurations where SVE is disabled we define but never reference the functions for retrieving the default vector length, causing warnings. Fix this by move the ifdef up, marking get_default_vl() inline since it is referenced from code guarded by an IS_ENABLED() check, and do the same for the other accessors for consistency. Reported-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20211022141635.2360415-3-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2021-10-26	arm64/sve: Add stub for sve_max_virtualisable_vl()	Mark Brown
	Fixes build problems for configurations with KVM enabled but SVE disabled. Reported-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20211022141635.2360415-2-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21	arm64: errata: Add detection for TRBE write to out-of-range	Suzuki K Poulose
	Arm Neoverse-N2 and Cortex-A710 cores are affected by an erratum where the trbe, under some circumstances, might write upto 64bytes to an address after the Limit as programmed by the TRBLIMITR_EL1.LIMIT. This might - - Corrupt a page in the ring buffer, which may corrupt trace from a previous session, consumed by userspace. - Hit the guard page at the end of the vmalloc area and raise a fault. To keep the handling simpler, we always leave the last page from the range, which TRBE is allowed to write. This can be achieved by ensuring that we always have more than a PAGE worth space in the range, while calculating the LIMIT for TRBE. And then the LIMIT pointer can be adjusted to leave the PAGE (TRBLIMITR.LIMIT -= PAGE_SIZE), out of the TRBE range while enabling it. This makes sure that the TRBE will only write to an area within its allowed limit (i.e, [head-head+size]) and we do not have to handle address faults within the driver. Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20211019163153.3692640-5-suzuki.poulose@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21	arm64: errata: Add workaround for TSB flush failures	Suzuki K Poulose
	Arm Neoverse-N2 (#2067961) and Cortex-A710 (#2054223) suffers from errata, where a TSB (trace synchronization barrier) fails to flush the trace data completely, when executed from a trace prohibited region. In Linux we always execute it after we have moved the PE to trace prohibited region. So, we can apply the workaround every time a TSB is executed. The work around is to issue two TSB consecutively. NOTE: This errata is defined as LOCAL_CPU_ERRATUM, implying that a late CPU could be blocked from booting if it is the first CPU that requires the workaround. This is because we do not allow setting a cpu_hwcaps after the SMP boot. The other alternative is to use "this_cpu_has_cap()" instead of the faster system wide check, which may be a bit of an overhead, given we may have to do this in nvhe KVM host before a guest entry. Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Marc Zyngier <maz@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20211019163153.3692640-4-suzuki.poulose@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21	arm64: errata: Add detection for TRBE overwrite in FILL mode	Suzuki K Poulose
	Arm Neoverse-N2 and the Cortex-A710 cores are affected by a CPU erratum where the TRBE will overwrite the trace buffer in FILL mode. The TRBE doesn't stop (as expected in FILL mode) when it reaches the limit and wraps to the base to continue writing upto 3 cache lines. This will overwrite any trace that was written previously. Add the Neoverse-N2 erratum(#2139208) and Cortex-A710 erratum (#2119858) to the detection logic. This will be used by the TRBE driver in later patches to work around the issue. The detection has been kept with the core arm64 errata framework list to make sure : - We don't duplicate the framework in TRBE driver - The errata detection is advertised like the rest of the CPU errata. Note that the Kconfig entries are not fully active until the TRBE driver implements the work around. Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> cc: Leo Yan <leo.yan@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20211019163153.3692640-3-suzuki.poulose@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21	arm64: Add Neoverse-N2, Cortex-A710 CPU part definition	Suzuki K Poulose
	Add the CPU Partnumbers for the new Arm designs. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20211019163153.3692640-2-suzuki.poulose@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21	selftests: arm64: Factor out utility functions for assembly FP tests	Mark Brown
	The various floating point test programs written in assembly have a bunch of helper functions and macros which are cut'n'pasted between them. Factor them out into a separate source file which is linked into all of them. We don't include memcmp() since it isn't as generic as it should be and directly branches to report an error in the programs. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20211019181851.3341232-1-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21	arm64: vmlinux.lds.S: remove `.fixup` section	Mark Rutland
	We no longer place anything into a `.fixup` section, so we no longer need to place those sections into the `.text` section in the main kernel Image. Remove the use of `.fixup`. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20211019160219.5202-14-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21	arm64: extable: add load_unaligned_zeropad() handler	Mark Rutland
	For inline assembly, we place exception fixups out-of-line in the `.fixup` section such that these are out of the way of the fast path. This has a few drawbacks: * Since the fixup code is anonymous, backtraces will symbolize fixups as offsets from the nearest prior symbol, currently `__entry_tramp_text_end`. This is confusing, and painful to debug without access to the relevant vmlinux. * Since the exception handler adjusts the PC to execute the fixup, and the fixup uses a direct branch back into the function it fixes, backtraces of fixups miss the original function. This is confusing, and violates requirements for RELIABLE_STACKTRACE (and therefore LIVEPATCH). * Inline assembly and associated fixups are generated from templates, and we have many copies of logically identical fixups which only differ in which specific registers are written to and which address is branched to at the end of the fixup. This is potentially wasteful of I-cache resources, and makes it hard to add additional logic to fixups without significant bloat. * In the case of load_unaligned_zeropad(), the logic in the fixup requires a temporary register that we must allocate even in the fast-path where it will not be used. This patch address all four concerns for load_unaligned_zeropad() fixups by adding a dedicated exception handler which performs the fixup logic in exception context and subsequent returns back after the faulting instruction. For the moment, the fixup logic is identical to the old assembly fixup logic, but in future we could enhance this by taking the ESR and FAR into account to constrain the faults we try to fix up, or to specialize fixups for MTE tag check faults. Other than backtracing, there should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20211019160219.5202-13-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21	arm64: extable: add a dedicated uaccess handler	Mark Rutland
	For inline assembly, we place exception fixups out-of-line in the `.fixup` section such that these are out of the way of the fast path. This has a few drawbacks: * Since the fixup code is anonymous, backtraces will symbolize fixups as offsets from the nearest prior symbol, currently `__entry_tramp_text_end`. This is confusing, and painful to debug without access to the relevant vmlinux. * Since the exception handler adjusts the PC to execute the fixup, and the fixup uses a direct branch back into the function it fixes, backtraces of fixups miss the original function. This is confusing, and violates requirements for RELIABLE_STACKTRACE (and therefore LIVEPATCH). * Inline assembly and associated fixups are generated from templates, and we have many copies of logically identical fixups which only differ in which specific registers are written to and which address is branched to at the end of the fixup. This is potentially wasteful of I-cache resources, and makes it hard to add additional logic to fixups without significant bloat. This patch address all three concerns for inline uaccess fixups by adding a dedicated exception handler which updates registers in exception context and subsequent returns back into the function which faulted, removing the need for fixups specialized to each faulting instruction. Other than backtracing, there should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20211019160219.5202-12-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21	arm64: extable: add `type` and `data` fields	Mark Rutland
	Subsequent patches will add specialized handlers for fixups, in addition to the simple PC fixup and BPF handlers we have today. In preparation, this patch adds a new `type` field to struct exception_table_entry, and uses this to distinguish the fixup and BPF cases. A `data` field is also added so that subsequent patches can associate data specific to each exception site (e.g. register numbers). Handlers are named ex_handler_() for consistency, following the exmaple of x86. At the same time, get_ex_fixup() is split out into a helper so that it can be used by other ex_handler_() functions ins subsequent patches. This patch will increase the size of the exception tables, which will be remedied by subsequent patches removing redundant fixup code. There should be no functional change as a result of this patch. Since each entry is now 12 bytes in size, we must reduce the alignment of each entry from `.align 3` (i.e. 8 bytes) to `.align 2` (i.e. 4 bytes), which is the natrual alignment of the `insn` and `fixup` fields. The current 8-byte alignment is a holdover from when the `insn` and `fixup` fields was 8 bytes, and while not harmful has not been necessary since commit: 6c94f27ac847ff8e ("arm64: switch to relative exception tables") Similarly, RO_EXCEPTION_TABLE_ALIGN is dropped to 4 bytes. Concurrently with this patch, x86's exception table entry format is being updated (similarly to a 12-byte format, with 32-bytes of absolute data). Once both have been merged it should be possible to unify the sorttable logic for the two. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: James Morse <james.morse@arm.com> Cc: Jean-Philippe Brucker <jean-philippe@linaro.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20211019160219.5202-11-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21	arm64: extable: use `ex` for `exception_table_entry`	Mark Rutland
	Subsequent patches will extend `struct exception_table_entry` with more fields, and the distinction between the entry and its `fixup` field will become more important. For clarity, let's consistently use `ex` to refer to refer to an entire entry. In subsequent patches we'll use `fixup` to refer to the fixup field specifically. This matches the naming convention used today in arch/arm64/net/bpf_jit_comp.c. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20211019160219.5202-10-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21	arm64: extable: make fixup_exception() return bool	Mark Rutland
	The return values of fixup_exception() and arm64_bpf_fixup_exception() represent a boolean condition rather than an error code, so for clarity it would be better to return `bool` rather than `int`. This patch adjusts the code accordingly. While we're modifying the prototype, we also remove the unnecessary `extern` keyword, so that this won't look out of place when we make subsequent additions to the header. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: James Morse <james.morse@arm.com> Cc: Jean-Philippe Brucker <jean-philippe@linaro.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20211019160219.5202-9-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21	arm64: extable: consolidate definitions	Mark Rutland
	In subsequent patches we'll alter the structure and usage of struct exception_table_entry. For inline assembly, we create these using the `_ASM_EXTABLE()` CPP macro defined in <asm/uaccess.h>, and for plain assembly code we use the `_asm_extable()` GAS macro defined in <asm/assembler.h>, which are largely identical save for different escaping and stringification requirements. This patch moves the common definitions to a new <asm/asm-extable.h> header, so that it's easier to keep the two in-sync, and to remove the implication that these are only used for uaccess helpers (as e.g. load_unaligned_zeropad() is only used on kernel memory, and depends upon `_ASM_EXTABLE()`. At the same time, a few minor modifications are made for clarity and in preparation for subsequent patches: * The structure creation is factored out into an `__ASM_EXTABLE_RAW()` macro. This will make it easier to support different fixup variants in subsequent patches without needing to update all users of `_ASM_EXTABLE()`, and makes it easier to see tha the CPP and GAS variants of the macros are structurally identical. For the CPP macro, the stringification of fields is left to the wrapper macro, `_ASM_EXTABLE()`, as in subsequent patches it will be necessary to stringify fields in wrapper macros to safely concatenate strings which cannot be token-pasted together in CPP. * The fields of the structure are created separately on their own lines. This will make it easier to add/remove/modify individual fields clearly. * Additional parentheses are added around the use of macro arguments in field definitions to avoid any potential problems with evaluation due to operator precedence, and to make errors upon misuse clearer. * USER() is moved into <asm/asm-uaccess.h>, as it is not required by all assembly code, and is already refered to by comments in that file. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20211019160219.5202-8-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21	arm64: gpr-num: support W registers	Mark Rutland
	In subsequent patches we'll want to map W registers to their register numbers. Update gpr-num.h so that we can do this. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20211019160219.5202-7-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21	arm64: factor out GPR numbering helpers	Mark Rutland
	In <asm/sysreg.h> we have macros to convert the names of general purpose registers (GPRs) into integer constants, which we use to manually build the encoding for `MRS` and `MSR` instructions where we can't rely on the assembler to do so for us. In subsequent patches we'll need to map the same GPR names to integer constants so that we can use this to build metadata for exception fixups. So that the we can use the mappings elsewhere, factor out the definitions into a new <asm/gpr-num.h> header, renaming the definitions to align with this "GPR num" naming for clarity. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20211019160219.5202-6-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21	arm64: kvm: use kvm_exception_table_entry	Mark Rutland
	In subsequent patches we'll alter `struct exception_table_entry`, adding fields that are not needed for KVM exception fixups. In preparation for this, migrate KVM to its own `struct kvm_exception_table_entry`, which is identical to the current format of `struct exception_table_entry`. Comments are updated accordingly. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211019160219.5202-5-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21	arm64: lib: __arch_copy_to_user(): fold fixups into body	Mark Rutland
	Like other functions, __arch_copy_to_user() places its exception fixups in the `.fixup` section without any clear association with __arch_copy_to_user() itself. If we backtrace the fixup code, it will be symbolized as an offset from the nearest prior symbol, which happens to be `__entry_tramp_text_end`. Further, since the PC adjustment for the fixup is akin to a direct branch rather than a function call, __arch_copy_to_user() itself will be missing from the backtrace. This is confusing and hinders debugging. In general this pattern will also be problematic for CONFIG_LIVEPATCH, since fixups often return to their associated function, but this isn't accurately captured in the stacktrace. To solve these issues for assembly functions, we must move fixups into the body of the functions themselves, after the usual fast-path returns. This patch does so for __arch_copy_to_user(). Inline assembly will be dealt with in subsequent patches. Other than the improved backtracing, there should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20211019160219.5202-4-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21	arm64: lib: __arch_copy_from_user(): fold fixups into body	Mark Rutland
	Like other functions, __arch_copy_from_user() places its exception fixups in the `.fixup` section without any clear association with __arch_copy_from_user() itself. If we backtrace the fixup code, it will be symbolized as an offset from the nearest prior symbol, which happens to be `__entry_tramp_text_end`. Further, since the PC adjustment for the fixup is akin to a direct branch rather than a function call, __arch_copy_from_user() itself will be missing from the backtrace. This is confusing and hinders debugging. In general this pattern will also be problematic for CONFIG_LIVEPATCH, since fixups often return to their associated function, but this isn't accurately captured in the stacktrace. To solve these issues for assembly functions, we must move fixups into the body of the functions themselves, after the usual fast-path returns. This patch does so for __arch_copy_from_user(). Inline assembly will be dealt with in subsequent patches. Other than the improved backtracing, there should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20211019160219.5202-3-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21	arm64: lib: __arch_clear_user(): fold fixups into body	Mark Rutland
	Like other functions, __arch_clear_user() places its exception fixups in the `.fixup` section without any clear association with __arch_clear_user() itself. If we backtrace the fixup code, it will be symbolized as an offset from the nearest prior symbol, which happens to be `__entry_tramp_text_end`. Further, since the PC adjustment for the fixup is akin to a direct branch rather than a function call, __arch_clear_user() itself will be missing from the backtrace. This is confusing and hinders debugging. In general this pattern will also be problematic for CONFIG_LIVEPATCH, since fixups often return to their associated function, but this isn't accurately captured in the stacktrace. To solve these issues for assembly functions, we must move fixups into the body of the functions themselves, after the usual fast-path returns. This patch does so for __arch_clear_user(). Inline assembly will be dealt with in subsequent patches. Other than the improved backtracing, there should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20211019160219.5202-2-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21	arm64: vdso32: require CROSS_COMPILE_COMPAT for gcc+bfd	Nick Desaulniers
	Similar to commit 231ad7f409f1 ("Makefile: infer --target from ARCH for CC=clang") There really is no point in setting --target based on $CROSS_COMPILE_COMPAT for clang when the integrated assembler is being used, since commit ef94340583ee ("arm64: vdso32: drop -no-integrated-as flag"). Allows COMPAT_VDSO to be selected without setting $CROSS_COMPILE_COMPAT when using clang and lld together. Before: $ ARCH=arm64 CROSS_COMPILE_COMPAT=arm-linux-gnueabi- make -j72 LLVM=1 defconfig $ grep CONFIG_COMPAT_VDSO .config CONFIG_COMPAT_VDSO=y $ ARCH=arm64 make -j72 LLVM=1 defconfig $ grep CONFIG_COMPAT_VDSO .config $ After: $ ARCH=arm64 CROSS_COMPILE_COMPAT=arm-linux-gnueabi- make -j72 LLVM=1 defconfig $ grep CONFIG_COMPAT_VDSO .config CONFIG_COMPAT_VDSO=y $ ARCH=arm64 make -j72 LLVM=1 defconfig $ grep CONFIG_COMPAT_VDSO .config CONFIG_COMPAT_VDSO=y Reviewed-by: Nathan Chancellor <nathan@kernel.org> Suggested-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Link: https://lore.kernel.org/r/20211019223646.1146945-5-ndesaulniers@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21	arm64: vdso32: suppress error message for 'make mrproper'	Nick Desaulniers
	When running the following command without arm-linux-gnueabi-gcc in one's $PATH, the following warning is observed: $ ARCH=arm64 CROSS_COMPILE_COMPAT=arm-linux-gnueabi- make -j72 LLVM=1 mrproper make[1]: arm-linux-gnueabi-gcc: No such file or directory This is because KCONFIG is not run for mrproper, so CONFIG_CC_IS_CLANG is not set, and we end up eagerly evaluating various variables that try to invoke CC_COMPAT. This is a similar problem to what was observed in commit dc960bfeedb0 ("h8300: suppress error messages for 'make clean'") Reported-by: Lucas Henneman <henneman@google.com> Suggested-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20211019223646.1146945-4-ndesaulniers@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21	arm64: vdso32: drop test for -march=armv8-a	Nick Desaulniers
	As Arnd points out: gcc-4.8 already supported -march=armv8, and we require gcc-5.1 now, so both this #if/#else construct and the corresponding "cc32-option,-march=armv8-a" check should be obsolete now. Link: https://lore.kernel.org/lkml/CAK8P3a3UBEJ0Py2ycz=rHfgog8g3mCOeQOwO0Gmp-iz6Uxkapg@mail.gmail.com/ Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20211019223646.1146945-3-ndesaulniers@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21	arm64: vdso32: drop the test for dmb ishld	Nick Desaulniers
	Binutils added support for this instruction in commit e797f7e0b2bedc9328d4a9a0ebc63ca7a2dbbebc which shipped in 2.24 (just missing the 2.23 release) but was cherry-picked into 2.23 in commit 27a50d6755bae906bc73b4ec1a8b448467f0bea1. Thanks to Christian and Simon for helping me with the patch archaeology. According to Documentation/process/changes.rst, the minimum supported version of binutils is 2.23. Since all supported versions of GAS support this instruction, drop the assembler invocation, preprocessor flags/guards, and the cross assembler macro that's now unused. This also avoids a recursive self reference in a follow up cleanup patch. Cc: Christian Biesinger <cbiesinger@google.com> Cc: Simon Marchi <simon.marchi@polymtl.ca> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20211019223646.1146945-2-ndesaulniers@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21	arm64/sve: Track vector lengths for tasks in an array	Mark Brown
	As for SVE we will track a per task SME vector length for tasks. Convert the existing storage for the vector length into an array and update fpsimd_flush_task() to initialise this in a function. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20211019172247.3045838-10-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21	arm64/sve: Explicitly load vector length when restoring SVE state	Mark Brown
	Currently when restoring the SVE state we supply the SVE vector length as an argument to sve_load_state() and the underlying macros. This becomes inconvenient with the addition of SME since we may need to restore any combination of SVE and SME vector lengths, and we already separately restore the vector length in the KVM code. We don't need to know the vector length during the actual register load since the SME load instructions can index into the data array for us. Refactor the interface so we explicitly set the vector length separately to restoring the SVE registers in preparation for adding SME support, no functional change should be involved. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20211019172247.3045838-9-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21	arm64/sve: Put system wide vector length information into structs	Mark Brown
	With the introduction of SME we will have a second vector length in the system, enumerated and configured in a very similar fashion to the existing SVE vector length. While there are a few differences in how things are handled this is a relatively small portion of the overall code so in order to avoid code duplication we factor out We create two structs, one vl_info for the static hardware properties and one vl_config for the runtime configuration, with an array instantiated for each and update all the users to reference these. Some accessor functions are provided where helpful for readability, and the write to set the vector length is put into a function since the system register being updated needs to be chosen at compile time. This is a mostly mechanical replacement, further work will be required to actually make things generic, ensuring that we handle those places where there are differences properly. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20211019172247.3045838-8-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21	arm64/sve: Use accessor functions for vector lengths in thread_struct	Mark Brown
	In a system with SME there are parallel vector length controls for SVE and SME vectors which function in much the same way so it is desirable to share the code for handling them as much as possible. In order to prepare for doing this add a layer of accessor functions for the various VL related operations on tasks. Since almost all current interactions are actually via task->thread rather than directly with the thread_info the accessors use that. Accessors are provided for both generic and SVE specific usage, the generic accessors should be used for cases where register state is being manipulated since the registers are shared between streaming and regular SVE so we know that when SME support is implemented we will always have to be in the appropriate mode already and hence can generalise now. Since we are using task_struct and we don't want to cause widespread inclusion of sched.h the acessors are all out of line, it is hoped that none of the uses are in a sufficiently critical path for this to be an issue. Those that are most likely to present an issue are in the same translation unit so hopefully the compiler may be able to inline anyway. This is purely adding the layer of abstraction, additional work will be needed to support tasks using SME. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20211019172247.3045838-7-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21	arm64/sve: Rename find_supported_vector_length()	Mark Brown
	The function has SVE specific checks in it and it will be more trouble to add conditional code for SME than it is to simply rename it to be SVE specific. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20211019172247.3045838-6-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>