diff options
Diffstat (limited to 'Documentation/virt/kvm/x86')
-rw-r--r-- | Documentation/virt/kvm/x86/amd-memory-encryption.rst | 169 | ||||
-rw-r--r-- | Documentation/virt/kvm/x86/errata.rst | 30 | ||||
-rw-r--r-- | Documentation/virt/kvm/x86/index.rst | 1 | ||||
-rw-r--r-- | Documentation/virt/kvm/x86/intel-tdx.rst | 255 |
4 files changed, 451 insertions, 4 deletions
diff --git a/Documentation/virt/kvm/x86/amd-memory-encryption.rst b/Documentation/virt/kvm/x86/amd-memory-encryption.rst index 84335d119ff1..1ddb6a86ce7f 100644 --- a/Documentation/virt/kvm/x86/amd-memory-encryption.rst +++ b/Documentation/virt/kvm/x86/amd-memory-encryption.rst @@ -76,15 +76,56 @@ are defined in ``<linux/psp-dev.h>``. KVM implements the following commands to support common lifecycle events of SEV guests, such as launching, running, snapshotting, migrating and decommissioning. -1. KVM_SEV_INIT ---------------- +1. KVM_SEV_INIT2 +---------------- -The KVM_SEV_INIT command is used by the hypervisor to initialize the SEV platform +The KVM_SEV_INIT2 command is used by the hypervisor to initialize the SEV platform context. In a typical workflow, this command should be the first command issued. +For this command to be accepted, either KVM_X86_SEV_VM or KVM_X86_SEV_ES_VM +must have been passed to the KVM_CREATE_VM ioctl. A virtual machine created +with those machine types in turn cannot be run until KVM_SEV_INIT2 is invoked. + +Parameters: struct kvm_sev_init (in) Returns: 0 on success, -negative on error +:: + + struct kvm_sev_init { + __u64 vmsa_features; /* initial value of features field in VMSA */ + __u32 flags; /* must be 0 */ + __u16 ghcb_version; /* maximum guest GHCB version allowed */ + __u16 pad1; + __u32 pad2[8]; + }; + +It is an error if the hypervisor does not support any of the bits that +are set in ``flags`` or ``vmsa_features``. ``vmsa_features`` must be +0 for SEV virtual machines, as they do not have a VMSA. + +``ghcb_version`` must be 0 for SEV virtual machines, as they do not issue GHCB +requests. If ``ghcb_version`` is 0 for any other guest type, then the maximum +allowed guest GHCB protocol will default to version 2. + +This command replaces the deprecated KVM_SEV_INIT and KVM_SEV_ES_INIT commands. +The commands did not have any parameters (the ```data``` field was unused) and +only work for the KVM_X86_DEFAULT_VM machine type (0). + +They behave as if: + +* the VM type is KVM_X86_SEV_VM for KVM_SEV_INIT, or KVM_X86_SEV_ES_VM for + KVM_SEV_ES_INIT + +* the ``flags`` and ``vmsa_features`` fields of ``struct kvm_sev_init`` are + set to zero, and ``ghcb_version`` is set to 0 for KVM_SEV_INIT and 1 for + KVM_SEV_ES_INIT. + +If the ``KVM_X86_SEV_VMSA_FEATURES`` attribute does not exist, the hypervisor only +supports KVM_SEV_INIT and KVM_SEV_ES_INIT. In that case, note that KVM_SEV_ES_INIT +might set the debug swap VMSA feature (bit 5) depending on the value of the +``debug_swap`` parameter of ``kvm-amd.ko``. + 2. KVM_SEV_LAUNCH_START ----------------------- @@ -425,6 +466,124 @@ issued by the hypervisor to make the guest ready for execution. Returns: 0 on success, -negative on error +18. KVM_SEV_SNP_LAUNCH_START +---------------------------- + +The KVM_SNP_LAUNCH_START command is used for creating the memory encryption +context for the SEV-SNP guest. It must be called prior to issuing +KVM_SEV_SNP_LAUNCH_UPDATE or KVM_SEV_SNP_LAUNCH_FINISH; + +Parameters (in): struct kvm_sev_snp_launch_start + +Returns: 0 on success, -negative on error + +:: + + struct kvm_sev_snp_launch_start { + __u64 policy; /* Guest policy to use. */ + __u8 gosvw[16]; /* Guest OS visible workarounds. */ + __u16 flags; /* Must be zero. */ + __u8 pad0[6]; + __u64 pad1[4]; + }; + +See SNP_LAUNCH_START in the SEV-SNP specification [snp-fw-abi]_ for further +details on the input parameters in ``struct kvm_sev_snp_launch_start``. + +19. KVM_SEV_SNP_LAUNCH_UPDATE +----------------------------- + +The KVM_SEV_SNP_LAUNCH_UPDATE command is used for loading userspace-provided +data into a guest GPA range, measuring the contents into the SNP guest context +created by KVM_SEV_SNP_LAUNCH_START, and then encrypting/validating that GPA +range so that it will be immediately readable using the encryption key +associated with the guest context once it is booted, after which point it can +attest the measurement associated with its context before unlocking any +secrets. + +It is required that the GPA ranges initialized by this command have had the +KVM_MEMORY_ATTRIBUTE_PRIVATE attribute set in advance. See the documentation +for KVM_SET_MEMORY_ATTRIBUTES for more details on this aspect. + +Upon success, this command is not guaranteed to have processed the entire +range requested. Instead, the ``gfn_start``, ``uaddr``, and ``len`` fields of +``struct kvm_sev_snp_launch_update`` will be updated to correspond to the +remaining range that has yet to be processed. The caller should continue +calling this command until those fields indicate the entire range has been +processed, e.g. ``len`` is 0, ``gfn_start`` is equal to the last GFN in the +range plus 1, and ``uaddr`` is the last byte of the userspace-provided source +buffer address plus 1. In the case where ``type`` is KVM_SEV_SNP_PAGE_TYPE_ZERO, +``uaddr`` will be ignored completely. + +Parameters (in): struct kvm_sev_snp_launch_update + +Returns: 0 on success, < 0 on error, -EAGAIN if caller should retry + +:: + + struct kvm_sev_snp_launch_update { + __u64 gfn_start; /* Guest page number to load/encrypt data into. */ + __u64 uaddr; /* Userspace address of data to be loaded/encrypted. */ + __u64 len; /* 4k-aligned length in bytes to copy into guest memory.*/ + __u8 type; /* The type of the guest pages being initialized. */ + __u8 pad0; + __u16 flags; /* Must be zero. */ + __u32 pad1; + __u64 pad2[4]; + + }; + +where the allowed values for page_type are #define'd as:: + + KVM_SEV_SNP_PAGE_TYPE_NORMAL + KVM_SEV_SNP_PAGE_TYPE_ZERO + KVM_SEV_SNP_PAGE_TYPE_UNMEASURED + KVM_SEV_SNP_PAGE_TYPE_SECRETS + KVM_SEV_SNP_PAGE_TYPE_CPUID + +See the SEV-SNP spec [snp-fw-abi]_ for further details on how each page type is +used/measured. + +20. KVM_SEV_SNP_LAUNCH_FINISH +----------------------------- + +After completion of the SNP guest launch flow, the KVM_SEV_SNP_LAUNCH_FINISH +command can be issued to make the guest ready for execution. + +Parameters (in): struct kvm_sev_snp_launch_finish + +Returns: 0 on success, -negative on error + +:: + + struct kvm_sev_snp_launch_finish { + __u64 id_block_uaddr; + __u64 id_auth_uaddr; + __u8 id_block_en; + __u8 auth_key_en; + __u8 vcek_disabled; + __u8 host_data[32]; + __u8 pad0[3]; + __u16 flags; /* Must be zero */ + __u64 pad1[4]; + }; + + +See SNP_LAUNCH_FINISH in the SEV-SNP specification [snp-fw-abi]_ for further +details on the input parameters in ``struct kvm_sev_snp_launch_finish``. + +Device attribute API +==================== + +Attributes of the SEV implementation can be retrieved through the +``KVM_HAS_DEVICE_ATTR`` and ``KVM_GET_DEVICE_ATTR`` ioctls on the ``/dev/kvm`` +device node, using group ``KVM_X86_GRP_SEV``. + +Currently only one attribute is implemented: + +* ``KVM_X86_SEV_VMSA_FEATURES``: return the set of all bits that + are accepted in the ``vmsa_features`` of ``KVM_SEV_INIT2``. + Firmware Management =================== @@ -444,9 +603,11 @@ References ========== -See [white-paper]_, [api-spec]_, [amd-apm]_ and [kvm-forum]_ for more info. +See [white-paper]_, [api-spec]_, [amd-apm]_, [kvm-forum]_, and [snp-fw-abi]_ +for more info. .. [white-paper] https://developer.amd.com/wordpress/media/2013/12/AMD_Memory_Encryption_Whitepaper_v7-Public.pdf .. [api-spec] https://support.amd.com/TechDocs/55766_SEV-KM_API_Specification.pdf .. [amd-apm] https://support.amd.com/TechDocs/24593.pdf (section 15.34) .. [kvm-forum] https://www.linux-kvm.org/images/7/74/02x08A-Thomas_Lendacky-AMDs_Virtualizatoin_Memory_Encryption_Technology.pdf +.. [snp-fw-abi] https://www.amd.com/system/files/TechDocs/56860.pdf diff --git a/Documentation/virt/kvm/x86/errata.rst b/Documentation/virt/kvm/x86/errata.rst index 49a05f24747b..37c79362a48f 100644 --- a/Documentation/virt/kvm/x86/errata.rst +++ b/Documentation/virt/kvm/x86/errata.rst @@ -33,6 +33,18 @@ Note however that any software (e.g ``WIN87EM.DLL``) expecting these features to be present likely predates these CPUID feature bits, and therefore doesn't know to check for them anyway. +``KVM_SET_VCPU_EVENTS`` issue +----------------------------- + +Invalid KVM_SET_VCPU_EVENTS input with respect to error codes *may* result in +failed VM-Entry on Intel CPUs. Pre-CET Intel CPUs require that exception +injection through the VMCS correctly set the "error code valid" flag, e.g. +require the flag be set when injecting a #GP, clear when injecting a #UD, +clear when injecting a soft exception, etc. Intel CPUs that enumerate +IA32_VMX_BASIC[56] as '1' relax VMX's consistency checks, and AMD CPUs have no +restrictions whatsoever. KVM_SET_VCPU_EVENTS doesn't sanity check the vector +versus "has_error_code", i.e. KVM's ABI follows AMD behavior. + Nested virtualization features ------------------------------ @@ -48,3 +60,21 @@ have the same physical APIC ID, KVM will deliver events targeting that APIC ID only to the vCPU with the lowest vCPU ID. If KVM_X2APIC_API_USE_32BIT_IDS is not enabled, KVM follows x86 architecture when processing interrupts (all vCPUs matching the target APIC ID receive the interrupt). + +MTRRs +----- +KVM does not virtualize guest MTRR memory types. KVM emulates accesses to MTRR +MSRs, i.e. {RD,WR}MSR in the guest will behave as expected, but KVM does not +honor guest MTRRs when determining the effective memory type, and instead +treats all of guest memory as having Writeback (WB) MTRRs. + +CR0.CD +------ +KVM does not virtualize CR0.CD on Intel CPUs. Similar to MTRR MSRs, KVM +emulates CR0.CD accesses so that loads and stores from/to CR0 behave as +expected, but setting CR0.CD=1 has no impact on the cachaeability of guest +memory. + +Note, this erratum does not affect AMD CPUs, which fully virtualize CR0.CD in +hardware, i.e. put the CPU caches into "no fill" mode when CR0.CD=1, even when +running in the guest.
\ No newline at end of file diff --git a/Documentation/virt/kvm/x86/index.rst b/Documentation/virt/kvm/x86/index.rst index 9ece6b8dc817..851e99174762 100644 --- a/Documentation/virt/kvm/x86/index.rst +++ b/Documentation/virt/kvm/x86/index.rst @@ -11,6 +11,7 @@ KVM for x86 systems cpuid errata hypercalls + intel-tdx mmu msr nested-vmx diff --git a/Documentation/virt/kvm/x86/intel-tdx.rst b/Documentation/virt/kvm/x86/intel-tdx.rst new file mode 100644 index 000000000000..76bdd95334d6 --- /dev/null +++ b/Documentation/virt/kvm/x86/intel-tdx.rst @@ -0,0 +1,255 @@ +.. SPDX-License-Identifier: GPL-2.0 + +=================================== +Intel Trust Domain Extensions (TDX) +=================================== + +Overview +======== +Intel's Trust Domain Extensions (TDX) protect confidential guest VMs from the +host and physical attacks. A CPU-attested software module called 'the TDX +module' runs inside a new CPU isolated range to provide the functionalities to +manage and run protected VMs, a.k.a, TDX guests or TDs. + +Please refer to [1] for the whitepaper, specifications and other resources. + +This documentation describes TDX-specific KVM ABIs. The TDX module needs to be +initialized before it can be used by KVM to run any TDX guests. The host +core-kernel provides the support of initializing the TDX module, which is +described in the Documentation/arch/x86/tdx.rst. + +API description +=============== + +KVM_MEMORY_ENCRYPT_OP +--------------------- +:Type: vm ioctl, vcpu ioctl + +For TDX operations, KVM_MEMORY_ENCRYPT_OP is re-purposed to be generic +ioctl with TDX specific sub-ioctl() commands. + +:: + + /* Trust Domain Extensions sub-ioctl() commands. */ + enum kvm_tdx_cmd_id { + KVM_TDX_CAPABILITIES = 0, + KVM_TDX_INIT_VM, + KVM_TDX_INIT_VCPU, + KVM_TDX_INIT_MEM_REGION, + KVM_TDX_FINALIZE_VM, + KVM_TDX_GET_CPUID, + + KVM_TDX_CMD_NR_MAX, + }; + + struct kvm_tdx_cmd { + /* enum kvm_tdx_cmd_id */ + __u32 id; + /* flags for sub-command. If sub-command doesn't use this, set zero. */ + __u32 flags; + /* + * data for each sub-command. An immediate or a pointer to the actual + * data in process virtual address. If sub-command doesn't use it, + * set zero. + */ + __u64 data; + /* + * Auxiliary error code. The sub-command may return TDX SEAMCALL + * status code in addition to -Exxx. + */ + __u64 hw_error; + }; + +KVM_TDX_CAPABILITIES +-------------------- +:Type: vm ioctl +:Returns: 0 on success, <0 on error + +Return the TDX capabilities that current KVM supports with the specific TDX +module loaded in the system. It reports what features/capabilities are allowed +to be configured to the TDX guest. + +- id: KVM_TDX_CAPABILITIES +- flags: must be 0 +- data: pointer to struct kvm_tdx_capabilities +- hw_error: must be 0 + +:: + + struct kvm_tdx_capabilities { + __u64 supported_attrs; + __u64 supported_xfam; + __u64 reserved[254]; + + /* Configurable CPUID bits for userspace */ + struct kvm_cpuid2 cpuid; + }; + + +KVM_TDX_INIT_VM +--------------- +:Type: vm ioctl +:Returns: 0 on success, <0 on error + +Perform TDX specific VM initialization. This needs to be called after +KVM_CREATE_VM and before creating any VCPUs. + +- id: KVM_TDX_INIT_VM +- flags: must be 0 +- data: pointer to struct kvm_tdx_init_vm +- hw_error: must be 0 + +:: + + struct kvm_tdx_init_vm { + __u64 attributes; + __u64 xfam; + __u64 mrconfigid[6]; /* sha384 digest */ + __u64 mrowner[6]; /* sha384 digest */ + __u64 mrownerconfig[6]; /* sha384 digest */ + + /* The total space for TD_PARAMS before the CPUIDs is 256 bytes */ + __u64 reserved[12]; + + /* + * Call KVM_TDX_INIT_VM before vcpu creation, thus before + * KVM_SET_CPUID2. + * This configuration supersedes KVM_SET_CPUID2s for VCPUs because the + * TDX module directly virtualizes those CPUIDs without VMM. The user + * space VMM, e.g. qemu, should make KVM_SET_CPUID2 consistent with + * those values. If it doesn't, KVM may have wrong idea of vCPUIDs of + * the guest, and KVM may wrongly emulate CPUIDs or MSRs that the TDX + * module doesn't virtualize. + */ + struct kvm_cpuid2 cpuid; + }; + + +KVM_TDX_INIT_VCPU +----------------- +:Type: vcpu ioctl +:Returns: 0 on success, <0 on error + +Perform TDX specific VCPU initialization. + +- id: KVM_TDX_INIT_VCPU +- flags: must be 0 +- data: initial value of the guest TD VCPU RCX +- hw_error: must be 0 + +KVM_TDX_INIT_MEM_REGION +----------------------- +:Type: vcpu ioctl +:Returns: 0 on success, <0 on error + +Initialize @nr_pages TDX guest private memory starting from @gpa with userspace +provided data from @source_addr. + +Note, before calling this sub command, memory attribute of the range +[gpa, gpa + nr_pages] needs to be private. Userspace can use +KVM_SET_MEMORY_ATTRIBUTES to set the attribute. + +If KVM_TDX_MEASURE_MEMORY_REGION flag is specified, it also extends measurement. + +- id: KVM_TDX_INIT_MEM_REGION +- flags: currently only KVM_TDX_MEASURE_MEMORY_REGION is defined +- data: pointer to struct kvm_tdx_init_mem_region +- hw_error: must be 0 + +:: + + #define KVM_TDX_MEASURE_MEMORY_REGION (1UL << 0) + + struct kvm_tdx_init_mem_region { + __u64 source_addr; + __u64 gpa; + __u64 nr_pages; + }; + + +KVM_TDX_FINALIZE_VM +------------------- +:Type: vm ioctl +:Returns: 0 on success, <0 on error + +Complete measurement of the initial TD contents and mark it ready to run. + +- id: KVM_TDX_FINALIZE_VM +- flags: must be 0 +- data: must be 0 +- hw_error: must be 0 + + +KVM_TDX_GET_CPUID +----------------- +:Type: vcpu ioctl +:Returns: 0 on success, <0 on error + +Get the CPUID values that the TDX module virtualizes for the TD guest. +When it returns -E2BIG, the user space should allocate a larger buffer and +retry. The minimum buffer size is updated in the nent field of the +struct kvm_cpuid2. + +- id: KVM_TDX_GET_CPUID +- flags: must be 0 +- data: pointer to struct kvm_cpuid2 (in/out) +- hw_error: must be 0 (out) + +:: + + struct kvm_cpuid2 { + __u32 nent; + __u32 padding; + struct kvm_cpuid_entry2 entries[0]; + }; + + struct kvm_cpuid_entry2 { + __u32 function; + __u32 index; + __u32 flags; + __u32 eax; + __u32 ebx; + __u32 ecx; + __u32 edx; + __u32 padding[3]; + }; + +KVM TDX creation flow +===================== +In addition to the standard KVM flow, new TDX ioctls need to be called. The +control flow is as follows: + +#. Check system wide capability + + * KVM_CAP_VM_TYPES: Check if VM type is supported and if KVM_X86_TDX_VM + is supported. + +#. Create VM + + * KVM_CREATE_VM + * KVM_TDX_CAPABILITIES: Query TDX capabilities for creating TDX guests. + * KVM_CHECK_EXTENSION(KVM_CAP_MAX_VCPUS): Query maximum VCPUs the TD can + support at VM level (TDX has its own limitation on this). + * KVM_SET_TSC_KHZ: Configure TD's TSC frequency if a different TSC frequency + than host is desired. This is Optional. + * KVM_TDX_INIT_VM: Pass TDX specific VM parameters. + +#. Create VCPU + + * KVM_CREATE_VCPU + * KVM_TDX_INIT_VCPU: Pass TDX specific VCPU parameters. + * KVM_SET_CPUID2: Configure TD's CPUIDs. + * KVM_SET_MSRS: Configure TD's MSRs. + +#. Initialize initial guest memory + + * Prepare content of initial guest memory. + * KVM_TDX_INIT_MEM_REGION: Add initial guest memory. + * KVM_TDX_FINALIZE_VM: Finalize the measurement of the TDX guest. + +#. Run VCPU + +References +========== + +https://www.intel.com/content/www/us/en/developer/tools/trust-domain-extensions/documentation.html |