Age | Commit message (Collapse) | Author |
|
gcc warns about undefined behavior the vmalloc code when building
with CONFIG_ARM64_PA_BITS_52, when the 'idx++' in the argument to
__phys_to_pte_val() is evaluated twice:
mm/vmalloc.c: In function 'vmap_pfn_apply':
mm/vmalloc.c:2800:58: error: operation on 'data->idx' may be undefined [-Werror=sequence-point]
2800 | *pte = pte_mkspecial(pfn_pte(data->pfns[data->idx++], data->prot));
| ~~~~~~~~~^~
arch/arm64/include/asm/pgtable-types.h:25:37: note: in definition of macro '__pte'
25 | #define __pte(x) ((pte_t) { (x) } )
| ^
arch/arm64/include/asm/pgtable.h:80:15: note: in expansion of macro '__phys_to_pte_val'
80 | __pte(__phys_to_pte_val((phys_addr_t)(pfn) << PAGE_SHIFT) | pgprot_val(prot))
| ^~~~~~~~~~~~~~~~~
mm/vmalloc.c:2800:30: note: in expansion of macro 'pfn_pte'
2800 | *pte = pte_mkspecial(pfn_pte(data->pfns[data->idx++], data->prot));
| ^~~~~~~
I have no idea why this never showed up earlier, but the safest
workaround appears to be changing those macros into inline functions
so the arguments get evaluated only once.
Cc: Matthew Wilcox <willy@infradead.org>
Fixes: 75387b92635e ("arm64: handle 52-bit physical addresses in page table entries")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20211105075414.2553155-1-arnd@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
After switched page size from 64KB to 4KB on several arm64 servers here,
kmemleak starts to run out of early memory pool due to a huge number of
those early_pgtable_alloc() calls:
kmemleak_alloc_phys()
memblock_alloc_range_nid()
memblock_phys_alloc_range()
early_pgtable_alloc()
init_pmd()
alloc_init_pud()
__create_pgd_mapping()
__map_memblock()
paging_init()
setup_arch()
start_kernel()
Increased the default value of DEBUG_KMEMLEAK_MEM_POOL_SIZE by 4 times
won't be enough for a server with 200GB+ memory. There isn't much
interesting to check memory leaks for those early page tables and those
early memory mappings should not reference to other memory. Hence, no
kmemleak false positives, and we can safely skip tracking those early
allocations from kmemleak like we did in the commit fed84c785270
("mm/memblock.c: skip kmemleak for kasan_init()") without needing to
introduce complications to automatically scale the value depends on the
runtime memory size etc. After the patch, the default value of
DEBUG_KMEMLEAK_MEM_POOL_SIZE becomes sufficient again.
Signed-off-by: Qian Cai <quic_qiancai@quicinc.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Link: https://lore.kernel.org/r/20211105150509.7826-1-quic_qiancai@quicinc.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
This constant was previously an unsigned long, but was changed
into an int in commit 433c38f40f6a ("arm64: mte: change ASYNC and
SYNC TCF settings into bitfields"). This ended up causing spurious
unsigned-signed comparison warnings in expressions such as:
(x & PR_MTE_TCF_MASK) != PR_MTE_TCF_NONE
Therefore, change it back into an unsigned long to silence these
warnings.
Link: https://linux-review.googlesource.com/id/I07a72310db30227a5b7d789d0b817d78b657c639
Signed-off-by: Peter Collingbourne <pcc@google.com>
Link: https://lore.kernel.org/r/20211105230829.2254790-1-pcc@google.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The -nostdlib option requests the compiler to not use the standard
system startup files or libraries when linking. It is effective only
when $(CC) is used as a linker driver.
Since commit 691efbedc60d ("arm64: vdso: use $(LD) instead of $(CC)
to link VDSO"), $(LD) is directly used, hence -nostdlib is unneeded.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20211107161802.323125-1-masahiroy@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The id argument of ARM64_FTR_REG_OVERRIDE() is used for two purposes:
one as the system register encoding (used for the sys_id field of
__ftr_reg_entry), and the other as the register name (stringified
and used for the name field of arm64_ftr_reg), which is debug
information. The id argument is supposed to be a macro that
indicates an encoding of the register (eg. SYS_ID_AA64PFR0_EL1, etc).
ARM64_FTR_REG(), which also has the same id argument,
uses ARM64_FTR_REG_OVERRIDE() and passes the id to the macro.
Since the id argument is completely macro-expanded before it is
substituted into a macro body of ARM64_FTR_REG_OVERRIDE(),
the stringified id in the body of ARM64_FTR_REG_OVERRIDE is not
a human-readable register name, but a string of numeric bitwise
operations.
Fix this so that human-readable register names are available as
debug information.
Fixes: 8f266a5d878a ("arm64: cpufeature: Add global feature override facility")
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211101045421.2215822-1-reijiw@google.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Merge for-next/fixes to resolve conflicts in arm64_hugetlb_cma_reserve().
* for-next/fixes:
acpi/arm64: fix next_platform_timer() section mismatch error
arm64/hugetlb: fix CMA gigantic page order for non-4K PAGE_SIZE
|
|
* for-next/vdso:
arm64: vdso32: require CROSS_COMPILE_COMPAT for gcc+bfd
arm64: vdso32: suppress error message for 'make mrproper'
arm64: vdso32: drop test for -march=armv8-a
arm64: vdso32: drop the test for dmb ishld
|
|
* for-next/trbe-errata:
arm64: errata: Add detection for TRBE write to out-of-range
arm64: errata: Add workaround for TSB flush failures
arm64: errata: Add detection for TRBE overwrite in FILL mode
arm64: Add Neoverse-N2, Cortex-A710 CPU part definition
|
|
* for-next/sve:
arm64/sve: Fix warnings when SVE is disabled
arm64/sve: Add stub for sve_max_virtualisable_vl()
arm64/sve: Track vector lengths for tasks in an array
arm64/sve: Explicitly load vector length when restoring SVE state
arm64/sve: Put system wide vector length information into structs
arm64/sve: Use accessor functions for vector lengths in thread_struct
arm64/sve: Rename find_supported_vector_length()
arm64/sve: Make access to FFR optional
arm64/sve: Make sve_state_size() static
arm64/sve: Remove sve_load_from_fpsimd_state()
arm64/fp: Reindent fpsimd_save()
|
|
* for-next/scs:
scs: Release kasan vmalloc poison in scs_free process
|
|
* for-next/pfn-valid:
arm64/mm: drop HAVE_ARCH_PFN_VALID
dma-mapping: remove bogus test for pfn_valid from dma_map_resource
|
|
* for-next/perf:
drivers/perf: Improve build test coverage
drivers/perf: thunderx2_pmu: Change data in size tx2_uncore_event_update()
drivers/perf: hisi: Fix PA PMU counter offset
|
|
* for-next/mte:
kasan: Extend KASAN mode kernel parameter
arm64: mte: Add asymmetric mode support
arm64: mte: CPU feature detection for Asymm MTE
arm64: mte: Bitfield definitions for Asymm MTE
kasan: Remove duplicate of kasan_flag_async
arm64: kasan: mte: move GCR_EL1 switch to task switch when KASAN disabled
|
|
* for-next/mm:
arm64: mm: update max_pfn after memory hotplug
arm64/mm: Add pud_sect_supported()
arm64: mm: Drop pointless call to set_max_mapnr()
|
|
* for-next/misc:
arm64: Select POSIX_CPU_TIMERS_TASK_WORK
arm64: Document boot requirements for FEAT_SME_FA64
arm64: ftrace: use function_nocfi for _mcount as well
arm64: asm: setup.h: export common variables
arm64/traps: Avoid unnecessary kernel/user pointer conversion
|
|
* for-next/kselftest:
selftests: arm64: Factor out utility functions for assembly FP tests
selftests: arm64: Add coverage of ptrace flags for SVE VL inheritance
selftests: arm64: Verify that all possible vector lengths are handled
selftests: arm64: Fix and enable test for setting current VL in vec-syscfg
selftests: arm64: Remove bogus error check on writing to files
selftests: arm64: Fix printf() format mismatch in vec-syscfg
selftests: arm64: Move FPSIMD in SVE ptrace test into a function
selftests: arm64: More comprehensively test the SVE ptrace interface
selftests: arm64: Verify interoperation of SVE and FPSIMD register sets
selftests: arm64: Clarify output when verifying SVE register set
selftests: arm64: Document what the SVE ptrace test is doing
selftests: arm64: Remove extraneous register setting code
selftests: arm64: Don't log child creation as a test in SVE ptrace test
selftests: arm64: Use a define for the number of SVE ptrace tests to be run
|
|
* for-next/kexec:
arm64: trans_pgd: remove trans_pgd_map_page()
arm64: kexec: remove cpu-reset.h
arm64: kexec: remove the pre-kexec PoC maintenance
arm64: kexec: keep MMU enabled during kexec relocation
arm64: kexec: install a copy of the linear-map
arm64: kexec: use ld script for relocation function
arm64: kexec: relocate in EL1 mode
arm64: kexec: configure EL2 vectors for kexec
arm64: kexec: pass kimage as the only argument to relocation function
arm64: kexec: Use dcache ops macros instead of open-coding
arm64: kexec: skip relocation code for inplace kexec
arm64: kexec: flush image and lists during kexec load time
arm64: hibernate: abstract ttrb0 setup function
arm64: trans_pgd: hibernate: Add trans_pgd_copy_el2_vectors
arm64: kernel: add helper for booted at EL2 and not VHE
|
|
* for-next/extable:
arm64: vmlinux.lds.S: remove `.fixup` section
arm64: extable: add load_unaligned_zeropad() handler
arm64: extable: add a dedicated uaccess handler
arm64: extable: add `type` and `data` fields
arm64: extable: use `ex` for `exception_table_entry`
arm64: extable: make fixup_exception() return bool
arm64: extable: consolidate definitions
arm64: gpr-num: support W registers
arm64: factor out GPR numbering helpers
arm64: kvm: use kvm_exception_table_entry
arm64: lib: __arch_copy_to_user(): fold fixups into body
arm64: lib: __arch_copy_from_user(): fold fixups into body
arm64: lib: __arch_clear_user(): fold fixups into body
|
|
* for-next/8.6-timers:
arm64: Add HWCAP for self-synchronising virtual counter
arm64: Add handling of CNTVCTSS traps
arm64: Add CNT{P,V}CTSS_EL0 alternatives to cnt{p,v}ct_el0
arm64: Add a capability for FEAT_ECV
clocksource/drivers/arch_arm_timer: Move workaround synchronisation around
clocksource/drivers/arm_arch_timer: Fix masking for high freq counters
clocksource/drivers/arm_arch_timer: Drop unnecessary ISB on CVAL programming
clocksource/drivers/arm_arch_timer: Remove any trace of the TVAL programming interface
clocksource/drivers/arm_arch_timer: Work around broken CVAL implementations
clocksource/drivers/arm_arch_timer: Advertise 56bit timer to the core code
clocksource/drivers/arm_arch_timer: Move MMIO timer programming over to CVAL
clocksource/drivers/arm_arch_timer: Fix MMIO base address vs callback ordering issue
clocksource/drivers/arm_arch_timer: Move drop _tval from erratum function names
clocksource/drivers/arm_arch_timer: Move system register timer programming over to CVAL
clocksource/drivers/arm_arch_timer: Extend write side of timer register accessors to u64
clocksource/drivers/arm_arch_timer: Drop CNT*_TVAL read accessors
clocksource/arm_arch_timer: Add build-time guards for unhandled register accesses
|
|
With 6caa5812e2d1 ("KVM: arm64: Use generic KVM xfer to guest work
function") all arm64 exit paths are properly equipped to handle the
POSIX timers' task work.
Deferring timer callbacks to thread context, not only limits the amount
of time spent in hard interrupt context, but is a safer
implementation[1], and will allow PREEMPT_RT setups to use KVM[2].
So let's enable POSIX_CPU_TIMERS_TASK_WORK on arm64.
[1] https://lore.kernel.org/all/20200716201923.228696399@linutronix.de/
[2] https://lore.kernel.org/linux-rt-users/87v92bdnlx.ffs@tglx/
Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211018144713.873464-1-nsaenzju@redhat.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The EAC1 release of the SME specification adds the FA64 feature which
requires enablement at higher ELs before lower ELs can use it. Document
what we require from higher ELs in our boot requirements.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211026111802.12853-1-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
In configurations where SVE is disabled we define but never reference the
functions for retrieving the default vector length, causing warnings. Fix
this by move the ifdef up, marking get_default_vl() inline since it is
referenced from code guarded by an IS_ENABLED() check, and do the same for
the other accessors for consistency.
Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211022141635.2360415-3-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Fixes build problems for configurations with KVM enabled but SVE disabled.
Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211022141635.2360415-2-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Arm Neoverse-N2 and Cortex-A710 cores are affected by an erratum where
the trbe, under some circumstances, might write upto 64bytes to an
address after the Limit as programmed by the TRBLIMITR_EL1.LIMIT.
This might -
- Corrupt a page in the ring buffer, which may corrupt trace from a
previous session, consumed by userspace.
- Hit the guard page at the end of the vmalloc area and raise a fault.
To keep the handling simpler, we always leave the last page from the
range, which TRBE is allowed to write. This can be achieved by ensuring
that we always have more than a PAGE worth space in the range, while
calculating the LIMIT for TRBE. And then the LIMIT pointer can be
adjusted to leave the PAGE (TRBLIMITR.LIMIT -= PAGE_SIZE), out of the
TRBE range while enabling it. This makes sure that the TRBE will only
write to an area within its allowed limit (i.e, [head-head+size]) and
we do not have to handle address faults within the driver.
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20211019163153.3692640-5-suzuki.poulose@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Arm Neoverse-N2 (#2067961) and Cortex-A710 (#2054223) suffers
from errata, where a TSB (trace synchronization barrier)
fails to flush the trace data completely, when executed from
a trace prohibited region. In Linux we always execute it
after we have moved the PE to trace prohibited region. So,
we can apply the workaround every time a TSB is executed.
The work around is to issue two TSB consecutively.
NOTE: This errata is defined as LOCAL_CPU_ERRATUM, implying
that a late CPU could be blocked from booting if it is the
first CPU that requires the workaround. This is because we
do not allow setting a cpu_hwcaps after the SMP boot. The
other alternative is to use "this_cpu_has_cap()" instead
of the faster system wide check, which may be a bit of an
overhead, given we may have to do this in nvhe KVM host
before a guest entry.
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20211019163153.3692640-4-suzuki.poulose@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Arm Neoverse-N2 and the Cortex-A710 cores are affected
by a CPU erratum where the TRBE will overwrite the trace buffer
in FILL mode. The TRBE doesn't stop (as expected in FILL mode)
when it reaches the limit and wraps to the base to continue
writing upto 3 cache lines. This will overwrite any trace that
was written previously.
Add the Neoverse-N2 erratum(#2139208) and Cortex-A710 erratum
(#2119858) to the detection logic.
This will be used by the TRBE driver in later patches to work
around the issue. The detection has been kept with the core
arm64 errata framework list to make sure :
- We don't duplicate the framework in TRBE driver
- The errata detection is advertised like the rest
of the CPU errata.
Note that the Kconfig entries are not fully active until the
TRBE driver implements the work around.
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
cc: Leo Yan <leo.yan@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20211019163153.3692640-3-suzuki.poulose@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Add the CPU Partnumbers for the new Arm designs.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20211019163153.3692640-2-suzuki.poulose@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The various floating point test programs written in assembly have a bunch
of helper functions and macros which are cut'n'pasted between them. Factor
them out into a separate source file which is linked into all of them.
We don't include memcmp() since it isn't as generic as it should be and
directly branches to report an error in the programs.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211019181851.3341232-1-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
We no longer place anything into a `.fixup` section, so we no longer
need to place those sections into the `.text` section in the main kernel
Image.
Remove the use of `.fixup`.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20211019160219.5202-14-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
For inline assembly, we place exception fixups out-of-line in the
`.fixup` section such that these are out of the way of the fast path.
This has a few drawbacks:
* Since the fixup code is anonymous, backtraces will symbolize fixups as
offsets from the nearest prior symbol, currently
`__entry_tramp_text_end`. This is confusing, and painful to debug
without access to the relevant vmlinux.
* Since the exception handler adjusts the PC to execute the fixup, and
the fixup uses a direct branch back into the function it fixes,
backtraces of fixups miss the original function. This is confusing,
and violates requirements for RELIABLE_STACKTRACE (and therefore
LIVEPATCH).
* Inline assembly and associated fixups are generated from templates,
and we have many copies of logically identical fixups which only
differ in which specific registers are written to and which address is
branched to at the end of the fixup. This is potentially wasteful of
I-cache resources, and makes it hard to add additional logic to fixups
without significant bloat.
* In the case of load_unaligned_zeropad(), the logic in the fixup
requires a temporary register that we must allocate even in the
fast-path where it will not be used.
This patch address all four concerns for load_unaligned_zeropad() fixups
by adding a dedicated exception handler which performs the fixup logic
in exception context and subsequent returns back after the faulting
instruction. For the moment, the fixup logic is identical to the old
assembly fixup logic, but in future we could enhance this by taking the
ESR and FAR into account to constrain the faults we try to fix up, or to
specialize fixups for MTE tag check faults.
Other than backtracing, there should be no functional change as a result
of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20211019160219.5202-13-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
For inline assembly, we place exception fixups out-of-line in the
`.fixup` section such that these are out of the way of the fast path.
This has a few drawbacks:
* Since the fixup code is anonymous, backtraces will symbolize fixups as
offsets from the nearest prior symbol, currently
`__entry_tramp_text_end`. This is confusing, and painful to debug
without access to the relevant vmlinux.
* Since the exception handler adjusts the PC to execute the fixup, and
the fixup uses a direct branch back into the function it fixes,
backtraces of fixups miss the original function. This is confusing,
and violates requirements for RELIABLE_STACKTRACE (and therefore
LIVEPATCH).
* Inline assembly and associated fixups are generated from templates,
and we have many copies of logically identical fixups which only
differ in which specific registers are written to and which address is
branched to at the end of the fixup. This is potentially wasteful of
I-cache resources, and makes it hard to add additional logic to fixups
without significant bloat.
This patch address all three concerns for inline uaccess fixups by
adding a dedicated exception handler which updates registers in
exception context and subsequent returns back into the function which
faulted, removing the need for fixups specialized to each faulting
instruction.
Other than backtracing, there should be no functional change as a result
of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20211019160219.5202-12-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Subsequent patches will add specialized handlers for fixups, in addition
to the simple PC fixup and BPF handlers we have today. In preparation,
this patch adds a new `type` field to struct exception_table_entry, and
uses this to distinguish the fixup and BPF cases. A `data` field is also
added so that subsequent patches can associate data specific to each
exception site (e.g. register numbers).
Handlers are named ex_handler_*() for consistency, following the exmaple
of x86. At the same time, get_ex_fixup() is split out into a helper so
that it can be used by other ex_handler_*() functions ins subsequent
patches.
This patch will increase the size of the exception tables, which will be
remedied by subsequent patches removing redundant fixup code. There
should be no functional change as a result of this patch.
Since each entry is now 12 bytes in size, we must reduce the alignment
of each entry from `.align 3` (i.e. 8 bytes) to `.align 2` (i.e. 4
bytes), which is the natrual alignment of the `insn` and `fixup` fields.
The current 8-byte alignment is a holdover from when the `insn` and
`fixup` fields was 8 bytes, and while not harmful has not been necessary
since commit:
6c94f27ac847ff8e ("arm64: switch to relative exception tables")
Similarly, RO_EXCEPTION_TABLE_ALIGN is dropped to 4 bytes.
Concurrently with this patch, x86's exception table entry format is
being updated (similarly to a 12-byte format, with 32-bytes of absolute
data). Once both have been merged it should be possible to unify the
sorttable logic for the two.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: James Morse <james.morse@arm.com>
Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20211019160219.5202-11-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Subsequent patches will extend `struct exception_table_entry` with more
fields, and the distinction between the entry and its `fixup` field will
become more important.
For clarity, let's consistently use `ex` to refer to refer to an entire
entry. In subsequent patches we'll use `fixup` to refer to the fixup
field specifically. This matches the naming convention used today in
arch/arm64/net/bpf_jit_comp.c.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20211019160219.5202-10-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The return values of fixup_exception() and arm64_bpf_fixup_exception()
represent a boolean condition rather than an error code, so for clarity
it would be better to return `bool` rather than `int`.
This patch adjusts the code accordingly. While we're modifying the
prototype, we also remove the unnecessary `extern` keyword, so that this
won't look out of place when we make subsequent additions to the header.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: James Morse <james.morse@arm.com>
Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20211019160219.5202-9-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
In subsequent patches we'll alter the structure and usage of struct
exception_table_entry. For inline assembly, we create these using the
`_ASM_EXTABLE()` CPP macro defined in <asm/uaccess.h>, and for plain
assembly code we use the `_asm_extable()` GAS macro defined in
<asm/assembler.h>, which are largely identical save for different
escaping and stringification requirements.
This patch moves the common definitions to a new <asm/asm-extable.h>
header, so that it's easier to keep the two in-sync, and to remove the
implication that these are only used for uaccess helpers (as e.g.
load_unaligned_zeropad() is only used on kernel memory, and depends upon
`_ASM_EXTABLE()`.
At the same time, a few minor modifications are made for clarity and in
preparation for subsequent patches:
* The structure creation is factored out into an `__ASM_EXTABLE_RAW()`
macro. This will make it easier to support different fixup variants in
subsequent patches without needing to update all users of
`_ASM_EXTABLE()`, and makes it easier to see tha the CPP and GAS
variants of the macros are structurally identical.
For the CPP macro, the stringification of fields is left to the
wrapper macro, `_ASM_EXTABLE()`, as in subsequent patches it will be
necessary to stringify fields in wrapper macros to safely concatenate
strings which cannot be token-pasted together in CPP.
* The fields of the structure are created separately on their own lines.
This will make it easier to add/remove/modify individual fields
clearly.
* Additional parentheses are added around the use of macro arguments in
field definitions to avoid any potential problems with evaluation due
to operator precedence, and to make errors upon misuse clearer.
* USER() is moved into <asm/asm-uaccess.h>, as it is not required by all
assembly code, and is already refered to by comments in that file.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20211019160219.5202-8-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
In subsequent patches we'll want to map W registers to their register
numbers. Update gpr-num.h so that we can do this.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20211019160219.5202-7-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
In <asm/sysreg.h> we have macros to convert the names of general purpose
registers (GPRs) into integer constants, which we use to manually build
the encoding for `MRS` and `MSR` instructions where we can't rely on the
assembler to do so for us.
In subsequent patches we'll need to map the same GPR names to integer
constants so that we can use this to build metadata for exception
fixups.
So that the we can use the mappings elsewhere, factor out the
definitions into a new <asm/gpr-num.h> header, renaming the definitions
to align with this "GPR num" naming for clarity.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20211019160219.5202-6-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
In subsequent patches we'll alter `struct exception_table_entry`, adding
fields that are not needed for KVM exception fixups.
In preparation for this, migrate KVM to its own `struct
kvm_exception_table_entry`, which is identical to the current format of
`struct exception_table_entry`. Comments are updated accordingly.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211019160219.5202-5-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Like other functions, __arch_copy_to_user() places its exception fixups
in the `.fixup` section without any clear association with
__arch_copy_to_user() itself. If we backtrace the fixup code, it will be
symbolized as an offset from the nearest prior symbol, which happens to
be `__entry_tramp_text_end`. Further, since the PC adjustment for the
fixup is akin to a direct branch rather than a function call,
__arch_copy_to_user() itself will be missing from the backtrace.
This is confusing and hinders debugging. In general this pattern will
also be problematic for CONFIG_LIVEPATCH, since fixups often return to
their associated function, but this isn't accurately captured in the
stacktrace.
To solve these issues for assembly functions, we must move fixups into
the body of the functions themselves, after the usual fast-path returns.
This patch does so for __arch_copy_to_user().
Inline assembly will be dealt with in subsequent patches.
Other than the improved backtracing, there should be no functional
change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20211019160219.5202-4-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Like other functions, __arch_copy_from_user() places its exception
fixups in the `.fixup` section without any clear association with
__arch_copy_from_user() itself. If we backtrace the fixup code, it will
be symbolized as an offset from the nearest prior symbol, which happens
to be `__entry_tramp_text_end`. Further, since the PC adjustment for the
fixup is akin to a direct branch rather than a function call,
__arch_copy_from_user() itself will be missing from the backtrace.
This is confusing and hinders debugging. In general this pattern will
also be problematic for CONFIG_LIVEPATCH, since fixups often return to
their associated function, but this isn't accurately captured in the
stacktrace.
To solve these issues for assembly functions, we must move fixups into
the body of the functions themselves, after the usual fast-path returns.
This patch does so for __arch_copy_from_user().
Inline assembly will be dealt with in subsequent patches.
Other than the improved backtracing, there should be no functional
change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20211019160219.5202-3-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Like other functions, __arch_clear_user() places its exception fixups in
the `.fixup` section without any clear association with
__arch_clear_user() itself. If we backtrace the fixup code, it will be
symbolized as an offset from the nearest prior symbol, which happens to
be `__entry_tramp_text_end`. Further, since the PC adjustment for the
fixup is akin to a direct branch rather than a function call,
__arch_clear_user() itself will be missing from the backtrace.
This is confusing and hinders debugging. In general this pattern will
also be problematic for CONFIG_LIVEPATCH, since fixups often return to
their associated function, but this isn't accurately captured in the
stacktrace.
To solve these issues for assembly functions, we must move fixups into
the body of the functions themselves, after the usual fast-path returns.
This patch does so for __arch_clear_user().
Inline assembly will be dealt with in subsequent patches.
Other than the improved backtracing, there should be no functional
change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20211019160219.5202-2-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Similar to
commit 231ad7f409f1 ("Makefile: infer --target from ARCH for CC=clang")
There really is no point in setting --target based on
$CROSS_COMPILE_COMPAT for clang when the integrated assembler is being
used, since
commit ef94340583ee ("arm64: vdso32: drop -no-integrated-as flag").
Allows COMPAT_VDSO to be selected without setting $CROSS_COMPILE_COMPAT
when using clang and lld together.
Before:
$ ARCH=arm64 CROSS_COMPILE_COMPAT=arm-linux-gnueabi- make -j72 LLVM=1 defconfig
$ grep CONFIG_COMPAT_VDSO .config
CONFIG_COMPAT_VDSO=y
$ ARCH=arm64 make -j72 LLVM=1 defconfig
$ grep CONFIG_COMPAT_VDSO .config
$
After:
$ ARCH=arm64 CROSS_COMPILE_COMPAT=arm-linux-gnueabi- make -j72 LLVM=1 defconfig
$ grep CONFIG_COMPAT_VDSO .config
CONFIG_COMPAT_VDSO=y
$ ARCH=arm64 make -j72 LLVM=1 defconfig
$ grep CONFIG_COMPAT_VDSO .config
CONFIG_COMPAT_VDSO=y
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Suggested-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Link: https://lore.kernel.org/r/20211019223646.1146945-5-ndesaulniers@google.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
When running the following command without arm-linux-gnueabi-gcc in
one's $PATH, the following warning is observed:
$ ARCH=arm64 CROSS_COMPILE_COMPAT=arm-linux-gnueabi- make -j72 LLVM=1 mrproper
make[1]: arm-linux-gnueabi-gcc: No such file or directory
This is because KCONFIG is not run for mrproper, so CONFIG_CC_IS_CLANG
is not set, and we end up eagerly evaluating various variables that try
to invoke CC_COMPAT.
This is a similar problem to what was observed in
commit dc960bfeedb0 ("h8300: suppress error messages for 'make clean'")
Reported-by: Lucas Henneman <henneman@google.com>
Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20211019223646.1146945-4-ndesaulniers@google.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
As Arnd points out:
gcc-4.8 already supported -march=armv8, and we require gcc-5.1 now, so
both this #if/#else construct and the corresponding
"cc32-option,-march=armv8-a" check should be obsolete now.
Link: https://lore.kernel.org/lkml/CAK8P3a3UBEJ0Py2ycz=rHfgog8g3mCOeQOwO0Gmp-iz6Uxkapg@mail.gmail.com/
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20211019223646.1146945-3-ndesaulniers@google.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Binutils added support for this instruction in commit
e797f7e0b2bedc9328d4a9a0ebc63ca7a2dbbebc which shipped in 2.24 (just
missing the 2.23 release) but was cherry-picked into 2.23 in commit
27a50d6755bae906bc73b4ec1a8b448467f0bea1. Thanks to Christian and Simon
for helping me with the patch archaeology.
According to Documentation/process/changes.rst, the minimum supported
version of binutils is 2.23. Since all supported versions of GAS support
this instruction, drop the assembler invocation, preprocessor
flags/guards, and the cross assembler macro that's now unused.
This also avoids a recursive self reference in a follow up cleanup
patch.
Cc: Christian Biesinger <cbiesinger@google.com>
Cc: Simon Marchi <simon.marchi@polymtl.ca>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20211019223646.1146945-2-ndesaulniers@google.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
As for SVE we will track a per task SME vector length for tasks. Convert
the existing storage for the vector length into an array and update
fpsimd_flush_task() to initialise this in a function.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211019172247.3045838-10-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Currently when restoring the SVE state we supply the SVE vector length
as an argument to sve_load_state() and the underlying macros. This becomes
inconvenient with the addition of SME since we may need to restore any
combination of SVE and SME vector lengths, and we already separately
restore the vector length in the KVM code. We don't need to know the vector
length during the actual register load since the SME load instructions can
index into the data array for us.
Refactor the interface so we explicitly set the vector length separately
to restoring the SVE registers in preparation for adding SME support, no
functional change should be involved.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211019172247.3045838-9-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
With the introduction of SME we will have a second vector length in the
system, enumerated and configured in a very similar fashion to the
existing SVE vector length. While there are a few differences in how
things are handled this is a relatively small portion of the overall
code so in order to avoid code duplication we factor out
We create two structs, one vl_info for the static hardware properties
and one vl_config for the runtime configuration, with an array
instantiated for each and update all the users to reference these. Some
accessor functions are provided where helpful for readability, and the
write to set the vector length is put into a function since the system
register being updated needs to be chosen at compile time.
This is a mostly mechanical replacement, further work will be required
to actually make things generic, ensuring that we handle those places
where there are differences properly.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211019172247.3045838-8-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
In a system with SME there are parallel vector length controls for SVE and
SME vectors which function in much the same way so it is desirable to
share the code for handling them as much as possible. In order to prepare
for doing this add a layer of accessor functions for the various VL related
operations on tasks.
Since almost all current interactions are actually via task->thread rather
than directly with the thread_info the accessors use that. Accessors are
provided for both generic and SVE specific usage, the generic accessors
should be used for cases where register state is being manipulated since
the registers are shared between streaming and regular SVE so we know that
when SME support is implemented we will always have to be in the appropriate
mode already and hence can generalise now.
Since we are using task_struct and we don't want to cause widespread
inclusion of sched.h the acessors are all out of line, it is hoped that
none of the uses are in a sufficiently critical path for this to be an
issue. Those that are most likely to present an issue are in the same
translation unit so hopefully the compiler may be able to inline anyway.
This is purely adding the layer of abstraction, additional work will be
needed to support tasks using SME.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211019172247.3045838-7-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The function has SVE specific checks in it and it will be more trouble
to add conditional code for SME than it is to simply rename it to be SVE
specific.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211019172247.3045838-6-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|