summaryrefslogtreecommitdiff
path: root/Documentation/arm64
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2023-02-21 15:27:48 -0800
committerLinus Torvalds <torvalds@linux-foundation.org>2023-02-21 15:27:48 -0800
commit8bf1a529cd664c8e5268381f1e24fe67aa611dd3 (patch)
treec5cec84941923778e4e2ec5a5d65e2fc8ea71b58 /Documentation/arm64
parentb327dfe05258e09c8db6e1e091c2e6d84dd426a6 (diff)
parentd54170812ef1c80e0fa3ed3e554a0bbfc2920d9d (diff)
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas: - Support for arm64 SME 2 and 2.1. SME2 introduces a new 512-bit architectural register (ZT0, for the look-up table feature) that Linux needs to save/restore - Include TPIDR2 in the signal context and add the corresponding kselftests - Perf updates: Arm SPEv1.2 support, HiSilicon uncore PMU updates, ACPI support to the Marvell DDR and TAD PMU drivers, reset DTM_PMU_CONFIG (ARM CMN) at probe time - Support for DYNAMIC_FTRACE_WITH_CALL_OPS on arm64 - Permit EFI boot with MMU and caches on. Instead of cleaning the entire loaded kernel image to the PoC and disabling the MMU and caches before branching to the kernel bare metal entry point, leave the MMU and caches enabled and rely on EFI's cacheable 1:1 mapping of all of system RAM to populate the initial page tables - Expose the AArch32 (compat) ELF_HWCAP features to user in an arm64 kernel (the arm32 kernel only defines the values) - Harden the arm64 shadow call stack pointer handling: stash the shadow stack pointer in the task struct on interrupt, load it directly from this structure - Signal handling cleanups to remove redundant validation of size information and avoid reading the same data from userspace twice - Refactor the hwcap macros to make use of the automatically generated ID registers. It should make new hwcaps writing less error prone - Further arm64 sysreg conversion and some fixes - arm64 kselftest fixes and improvements - Pointer authentication cleanups: don't sign leaf functions, unify asm-arch manipulation - Pseudo-NMI code generation optimisations - Minor fixes for SME and TPIDR2 handling - Miscellaneous updates: ARCH_FORCE_MAX_ORDER is now selectable, replace strtobool() to kstrtobool() in the cpufeature.c code, apply dynamic shadow call stack in two passes, intercept pfn changes in set_pte_at() without the required break-before-make sequence, attempt to dump all instructions on unhandled kernel faults * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (130 commits) arm64: fix .idmap.text assertion for large kernels kselftest/arm64: Don't require FA64 for streaming SVE+ZA tests kselftest/arm64: Copy whole EXTRA context arm64: kprobes: Drop ID map text from kprobes blacklist perf: arm_spe: Print the version of SPE detected perf: arm_spe: Add support for SPEv1.2 inverted event filtering perf: Add perf_event_attr::config3 arm64/sme: Fix __finalise_el2 SMEver check drivers/perf: fsl_imx8_ddr_perf: Remove set-but-not-used variable arm64/signal: Only read new data when parsing the ZT context arm64/signal: Only read new data when parsing the ZA context arm64/signal: Only read new data when parsing the SVE context arm64/signal: Avoid rereading context frame sizes arm64/signal: Make interface for restore_fpsimd_context() consistent arm64/signal: Remove redundant size validation from parse_user_sigframe() arm64/signal: Don't redundantly verify FPSIMD magic arm64/cpufeature: Use helper macros to specify hwcaps arm64/cpufeature: Always use symbolic name for feature value in hwcaps arm64/sysreg: Initial unsigned annotations for ID registers arm64/sysreg: Initial annotation of signed ID registers ...
Diffstat (limited to 'Documentation/arm64')
-rw-r--r--Documentation/arm64/booting.rst12
-rw-r--r--Documentation/arm64/elf_hwcaps.rst20
-rw-r--r--Documentation/arm64/sme.rst55
-rw-r--r--Documentation/arm64/sve.rst4
4 files changed, 78 insertions, 13 deletions
diff --git a/Documentation/arm64/booting.rst b/Documentation/arm64/booting.rst
index 96fe10ec6c24..ffeccdd6bdac 100644
--- a/Documentation/arm64/booting.rst
+++ b/Documentation/arm64/booting.rst
@@ -223,7 +223,7 @@ Before jumping into the kernel, the following conditions must be met:
For systems with a GICv3 interrupt controller to be used in v3 mode:
- If EL3 is present:
- - ICC_SRE_EL3.Enable (bit 3) must be initialiased to 0b1.
+ - ICC_SRE_EL3.Enable (bit 3) must be initialised to 0b1.
- ICC_SRE_EL3.SRE (bit 0) must be initialised to 0b1.
- ICC_CTLR_EL3.PMHE (bit 6) must be set to the same value across
all CPUs the kernel is executing on, and must stay constant
@@ -369,6 +369,16 @@ Before jumping into the kernel, the following conditions must be met:
- HCR_EL2.ATA (bit 56) must be initialised to 0b1.
+ For CPUs with the Scalable Matrix Extension version 2 (FEAT_SME2):
+
+ - If EL3 is present:
+
+ - SMCR_EL3.EZT0 (bit 30) must be initialised to 0b1.
+
+ - If the kernel is entered at EL1 and EL2 is present:
+
+ - SMCR_EL2.EZT0 (bit 30) must be initialised to 0b1.
+
The requirements described above for CPU mode, caches, MMUs, architected
timers, coherency and system registers apply to all CPUs. All CPUs must
enter the kernel in the same exception level. Where the values documented
diff --git a/Documentation/arm64/elf_hwcaps.rst b/Documentation/arm64/elf_hwcaps.rst
index 6fed84f935df..83e57e4d38e2 100644
--- a/Documentation/arm64/elf_hwcaps.rst
+++ b/Documentation/arm64/elf_hwcaps.rst
@@ -14,7 +14,7 @@ Some hardware or software features are only available on some CPU
implementations, and/or with certain kernel configurations, but have no
architected discovery mechanism available to userspace code at EL0. The
kernel exposes the presence of these features to userspace through a set
-of flags called hwcaps, exposed in the auxilliary vector.
+of flags called hwcaps, exposed in the auxiliary vector.
Userspace software can test for features by acquiring the AT_HWCAP or
AT_HWCAP2 entry of the auxiliary vector, and testing whether the relevant
@@ -284,6 +284,24 @@ HWCAP2_RPRFM
HWCAP2_SVE2P1
Functionality implied by ID_AA64ZFR0_EL1.SVEver == 0b0010.
+HWCAP2_SME2
+ Functionality implied by ID_AA64SMFR0_EL1.SMEver == 0b0001.
+
+HWCAP2_SME2P1
+ Functionality implied by ID_AA64SMFR0_EL1.SMEver == 0b0010.
+
+HWCAP2_SMEI16I32
+ Functionality implied by ID_AA64SMFR0_EL1.I16I32 == 0b0101
+
+HWCAP2_SMEBI32I32
+ Functionality implied by ID_AA64SMFR0_EL1.BI32I32 == 0b1
+
+HWCAP2_SMEB16B16
+ Functionality implied by ID_AA64SMFR0_EL1.B16B16 == 0b1
+
+HWCAP2_SMEF16F16
+ Functionality implied by ID_AA64SMFR0_EL1.F16F16 == 0b1
+
4. Unused AT_HWCAP bits
-----------------------
diff --git a/Documentation/arm64/sme.rst b/Documentation/arm64/sme.rst
index 16d2db4c2e2e..1c43ea12eb4f 100644
--- a/Documentation/arm64/sme.rst
+++ b/Documentation/arm64/sme.rst
@@ -18,14 +18,19 @@ model features for SME is included in Appendix A.
1. General
-----------
-* PSTATE.SM, PSTATE.ZA, the streaming mode vector length, the ZA
- register state and TPIDR2_EL0 are tracked per thread.
+* PSTATE.SM, PSTATE.ZA, the streaming mode vector length, the ZA and (when
+ present) ZTn register state and TPIDR2_EL0 are tracked per thread.
* The presence of SME is reported to userspace via HWCAP2_SME in the aux vector
AT_HWCAP2 entry. Presence of this flag implies the presence of the SME
instructions and registers, and the Linux-specific system interfaces
described in this document. SME is reported in /proc/cpuinfo as "sme".
+* The presence of SME2 is reported to userspace via HWCAP2_SME2 in the
+ aux vector AT_HWCAP2 entry. Presence of this flag implies the presence of
+ the SME2 instructions and ZT0, and the Linux-specific system interfaces
+ described in this document. SME2 is reported in /proc/cpuinfo as "sme2".
+
* Support for the execution of SME instructions in userspace can also be
detected by reading the CPU ID register ID_AA64PFR1_EL1 using an MRS
instruction, and checking that the value of the SME field is nonzero. [3]
@@ -44,6 +49,7 @@ model features for SME is included in Appendix A.
HWCAP2_SME_B16F32
HWCAP2_SME_F32F32
HWCAP2_SME_FA64
+ HWCAP2_SME2
This list may be extended over time as the SME architecture evolves.
@@ -52,8 +58,8 @@ model features for SME is included in Appendix A.
cpu-feature-registers.txt for details.
* Debuggers should restrict themselves to interacting with the target via the
- NT_ARM_SVE, NT_ARM_SSVE and NT_ARM_ZA regsets. The recommended way
- of detecting support for these regsets is to connect to a target process
+ NT_ARM_SVE, NT_ARM_SSVE, NT_ARM_ZA and NT_ARM_ZT regsets. The recommended
+ way of detecting support for these regsets is to connect to a target process
first and then attempt a
ptrace(PTRACE_GETREGSET, pid, NT_ARM_<regset>, &iov).
@@ -89,13 +95,13 @@ be zeroed.
-------------------------
* On syscall PSTATE.ZA is preserved, if PSTATE.ZA==1 then the contents of the
- ZA matrix are preserved.
+ ZA matrix and ZTn (if present) are preserved.
* On syscall PSTATE.SM will be cleared and the SVE registers will be handled
as per the standard SVE ABI.
-* Neither the SVE registers nor ZA are used to pass arguments to or receive
- results from any syscall.
+* None of the SVE registers, ZA or ZTn are used to pass arguments to
+ or receive results from any syscall.
* On process creation (eg, clone()) the newly created process will have
PSTATE.SM cleared.
@@ -111,6 +117,9 @@ be zeroed.
* Signal handlers are invoked with streaming mode and ZA disabled.
+* A new signal frame record TPIDR2_MAGIC is added formatted as a struct
+ tpidr2_context to allow access to TPIDR2_EL0 from signal handlers.
+
* A new signal frame record za_context encodes the ZA register contents on
signal delivery. [1]
@@ -134,6 +143,14 @@ be zeroed.
__reserved[] referencing this space. za_context is then written in the
extra space. Refer to [1] for further details about this mechanism.
+* If ZTn is supported and PSTATE.ZA==1 then a signal frame record for ZTn will
+ be generated.
+
+* The signal record for ZTn has magic ZT_MAGIC (0x5a544e01) and consists of a
+ standard signal frame header followed by a struct zt_context specifying
+ the number of ZTn registers supported by the system, then zt_context.nregs
+ blocks of 64 bytes of data per register.
+
5. Signal return
-----------------
@@ -151,6 +168,9 @@ When returning from a signal handler:
the signal frame does not match the current vector length, the signal return
attempt is treated as illegal, resulting in a forced SIGSEGV.
+* If ZTn is not supported or PSTATE.ZA==0 then it is illegal to have a
+ signal frame record for ZTn, resulting in a forced SIGSEGV.
+
6. prctl extensions
--------------------
@@ -214,8 +234,8 @@ prctl(PR_SME_SET_VL, unsigned long arg)
vector length that will be applied at the next execve() by the calling
thread.
- * Changing the vector length causes all of ZA, P0..P15, FFR and all bits of
- Z0..Z31 except for Z0 bits [127:0] .. Z31 bits [127:0] to become
+ * Changing the vector length causes all of ZA, ZTn, P0..P15, FFR and all
+ bits of Z0..Z31 except for Z0 bits [127:0] .. Z31 bits [127:0] to become
unspecified, including both streaming and non-streaming SVE state.
Calling PR_SME_SET_VL with vl equal to the thread's current vector
length, or calling PR_SME_SET_VL with the PR_SVE_SET_VL_ONEXEC flag,
@@ -317,6 +337,15 @@ The regset data starts with struct user_za_header, containing:
* The effect of writing a partial, incomplete payload is unspecified.
+* A new regset NT_ARM_ZT is defined for access to ZTn state via
+ PTRACE_GETREGSET and PTRACE_SETREGSET.
+
+* The NT_ARM_ZT regset consists of a single 512 bit register.
+
+* When PSTATE.ZA==0 reads of NT_ARM_ZT will report all bits of ZTn as 0.
+
+* Writes to NT_ARM_ZT will set PSTATE.ZA to 1.
+
8. ELF coredump extensions
---------------------------
@@ -331,6 +360,11 @@ The regset data starts with struct user_za_header, containing:
been read if a PTRACE_GETREGSET of NT_ARM_ZA were executed for each thread
when the coredump was generated.
+* A NT_ARM_ZT note will be added to each coredump for each thread of the
+ dumped process. The contents will be equivalent to the data that would have
+ been read if a PTRACE_GETREGSET of NT_ARM_ZT were executed for each thread
+ when the coredump was generated.
+
* The NT_ARM_TLS note will be extended to two registers, the second register
will contain TPIDR2_EL0 on systems that support SME and will be read as
zero with writes ignored otherwise.
@@ -406,6 +440,9 @@ In A64 state, SME adds the following:
For best system performance it is strongly encouraged for software to enable
ZA only when it is actively being used.
+* A new ZT0 register is introduced when SME2 is present. This is a 512 bit
+ register which is accessible when PSTATE.ZA is set, as ZA itself is.
+
* Two new 1 bit fields in PSTATE which may be controlled via the SMSTART and
SMSTOP instructions or by access to the SVCR system register:
diff --git a/Documentation/arm64/sve.rst b/Documentation/arm64/sve.rst
index c7a356bf4e8f..1b90a30382ac 100644
--- a/Documentation/arm64/sve.rst
+++ b/Documentation/arm64/sve.rst
@@ -175,7 +175,7 @@ the SVE instruction set architecture.
When returning from a signal handler:
* If there is no sve_context record in the signal frame, or if the record is
- present but contains no register data as desribed in the previous section,
+ present but contains no register data as described in the previous section,
then the SVE registers/bits become non-live and take unspecified values.
* If sve_context is present in the signal frame and contains full register
@@ -223,7 +223,7 @@ prctl(PR_SVE_SET_VL, unsigned long arg)
Defer the requested vector length change until the next execve()
performed by this thread.
- The effect is equivalent to implicit exceution of the following
+ The effect is equivalent to implicit execution of the following
call immediately after the next execve() (if any) by the thread:
prctl(PR_SVE_SET_VL, arg & ~PR_SVE_SET_VL_ONEXEC)