Diffstat (limited to 'Documentation/core-api/protection-keys.rst')
| -rw-r--r-- | Documentation/core-api/protection-keys.rst | 81 |
1 files changed, 51 insertions, 30 deletions
diff --git a/Documentation/core-api/protection-keys.rst b/Documentation/core-api/protection-keys.rst
index 49d9833af871..7eb7c6023e09 100644
--- a/Documentation/core-api/protection-keys.rst
+++ b/Documentation/core-api/protection-keys.rst
@@ -4,30 +4,48 @@
 Memory Protection Keys
 ======================
 
-Memory Protection Keys for Userspace (PKU aka PKEYs) is a feature
-which is found on Intel's Skylake "Scalable Processor" Server CPUs.
-It will be avalable in future non-server parts.
-
-For anyone wishing to test or use this feature, it is available in
-Amazon's EC2 C5 instances and is known to work there using an Ubuntu
-17.04 image.
-
-Memory Protection Keys provides a mechanism for enforcing page-based
-protections, but without requiring modification of the page tables
-when an application changes protection domains. It works by
-dedicating 4 previously ignored bits in each page table entry to a
-"protection key", giving 16 possible keys.
-
-There is also a new user-accessible register (PKRU) with two separate
-bits (Access Disable and Write Disable) for each key. Being a CPU
-register, PKRU is inherently thread-local, potentially giving each
+Memory Protection Keys provide a mechanism for enforcing page-based
+protections, but without requiring modification of the page tables when an
+application changes protection domains.
+
+Pkeys Userspace (PKU) is a feature which can be found on:
+        * Intel server CPUs, Skylake and later
+        * Intel client CPUs, Tiger Lake (11th Gen Core) and later
+        * Future AMD CPUs
+        * arm64 CPUs implementing the Permission Overlay Extension (FEAT_S1POE)
+
+x86_64
+======
+Pkeys work by dedicating 4 previously Reserved bits in each page table entry to
+a "protection key", giving 16 possible keys.
+
+Protections for each key are defined with a per-CPU user-accessible register
+(PKRU). Each of these is a 32-bit register storing two bits (Access Disable
+and Write Disable) for each of 16 keys.
+
+Being a CPU register, PKRU is inherently thread-local, potentially giving each
 thread a different set of protections from every other thread.
 
-There are two new instructions (RDPKRU/WRPKRU) for reading and writing
-to the new register. The feature is only available in 64-bit mode,
-even though there is theoretically space in the PAE PTEs. These
-permissions are enforced on data access only and have no effect on
-instruction fetches.
+There are two instructions (RDPKRU/WRPKRU) for reading and writing to the
+register. The feature is only available in 64-bit mode, even though there is
+theoretically space in the PAE PTEs. These permissions are enforced on data
+access only and have no effect on instruction fetches.
+
+arm64
+=====
+
+Pkeys use 3 bits in each page table entry, to encode a "protection key index",
+giving 8 possible keys.
+
+Protections for each key are defined with a per-CPU user-writable system
+register (POR_EL0). This is a 64-bit register encoding read, write and execute
+overlay permissions for each protection key index.
+
+Being a CPU register, POR_EL0 is inherently thread-local, potentially giving
+each thread a different set of protections from every other thread.
+
+Unlike x86_64, the protection key permissions also apply to instruction
+fetches.
 
 Syscalls
 ========
@@ -39,11 +57,10 @@ There are 3 system calls which directly interact with pkeys::
 	int pkey_mprotect(unsigned long start, size_t len,
 			  unsigned long prot, int pkey);
 
-Before a pkey can be used, it must first be allocated with
-pkey_alloc(). An application calls the WRPKRU instruction
-directly in order to change access permissions to memory covered
-with a key. In this example WRPKRU is wrapped by a C function
-called pkey_set().
+Before a pkey can be used, it must first be allocated with pkey_alloc(). An
+application writes to the architecture specific CPU register directly in order
+to change access permissions to memory covered with a key. In this example
+this is wrapped by a C function called pkey_set().
 ::
 
 	int real_prot = PROT_READ|PROT_WRITE;
@@ -65,9 +82,9 @@ is no longer in use::
 	munmap(ptr, PAGE_SIZE);
 	pkey_free(pkey);
 
-.. note:: pkey_set() is a wrapper for the RDPKRU and WRPKRU instructions.
-          An example implementation can be found in
-          tools/testing/selftests/x86/protection_keys.c.
+.. note:: pkey_set() is a wrapper around writing to the CPU register.
+          Example implementations can be found in
+          tools/testing/selftests/mm/pkey-{arm64,powerpc,x86}.h
 
 Behavior
 ========
@@ -97,3 +114,7 @@ with a read()::
 The kernel will send a SIGSEGV in both cases, but si_code will be set
 to SEGV_PKERR when violating protection keys versus SEGV_ACCERR when
 the plain mprotect() permissions are violated.
+
+Note that kernel accesses from a kthread (such as io_uring) will use a default
+value for the protection key register and so will not be consistent with
+userspace's value of the register or mprotect().
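
The x86_64 text in this patch describes PKRU as a 32-bit register holding two bits (Access Disable and Write Disable) for each of 16 keys. As a rough illustration of that layout, and not the selftest's actual pkey_set(), the bit arithmetic such a wrapper needs might be sketched as below. The helper name and the SKETCH_* macros are hypothetical and chosen for this note only; the two rights values are assumed to match PKEY_DISABLE_ACCESS (0x1) and PKEY_DISABLE_WRITE (0x2) from the Linux uapi headers, with key N assumed to occupy bits 2N and 2N+1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; values assumed equal to PKEY_DISABLE_ACCESS/WRITE. */
#define SKETCH_PKEY_DISABLE_ACCESS 0x1u
#define SKETCH_PKEY_DISABLE_WRITE  0x2u
#define SKETCH_PKRU_BITS_PER_KEY   2
#define SKETCH_X86_NR_PKEYS        16

/*
 * Hypothetical helper: return a copy of 'pkru' in which the two bits
 * belonging to 'pkey' are replaced by 'rights'. It only computes the new
 * register value; it does not execute RDPKRU or WRPKRU.
 */
static uint32_t sketch_pkru_update(uint32_t pkru, int pkey, uint32_t rights)
{
	/* Only 16 keys exist on x86_64, and only two rights bits per key. */
	assert(pkey >= 0 && pkey < SKETCH_X86_NR_PKEYS);
	assert((rights & ~(SKETCH_PKEY_DISABLE_ACCESS |
			   SKETCH_PKEY_DISABLE_WRITE)) == 0);

	/* Key N is assumed to own bits 2N (access) and 2N+1 (write). */
	unsigned int shift = (unsigned int)pkey * SKETCH_PKRU_BITS_PER_KEY;
	uint32_t mask = 0x3u << shift;

	/* Clear the old bits for this key, then install the new ones. */
	return (pkru & ~mask) | (rights << shift);
}
```

A real wrapper would read the current value with RDPKRU, pass it through a computation like this one, and write the result back with WRPKRU. That part is omitted here because it only runs on hardware with PKU enabled. The arm64 register described in the patch (POR_EL0) uses a different per-key encoding, so this sketch does not carry over to it.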
