diff options
Diffstat (limited to 'Documentation/virt/kvm/api.rst')
-rw-r--r-- | Documentation/virt/kvm/api.rst | 113 |
1 files changed, 109 insertions, 4 deletions
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index 1bd2d42e6424..6aa40ee05a4a 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -2006,7 +2006,14 @@ frequency is KHz. If the KVM_CAP_VM_TSC_CONTROL capability is advertised, this can also be used as a vm ioctl to set the initial tsc frequency of subsequently -created vCPUs. +created vCPUs. Note, the vm ioctl is only allowed prior to creating vCPUs. + +For TSC protected Confidential Computing (CoCo) VMs where TSC frequency +is configured once at VM scope and remains unchanged during VM's +lifetime, the vm ioctl should be used to configure the TSC frequency +and the vcpu ioctl is not supported. + +Example of such CoCo VMs: TDX guests. 4.56 KVM_GET_TSC_KHZ -------------------- @@ -6645,7 +6652,8 @@ to the byte array. .. note:: For KVM_EXIT_IO, KVM_EXIT_MMIO, KVM_EXIT_OSI, KVM_EXIT_PAPR, KVM_EXIT_XEN, - KVM_EXIT_EPR, KVM_EXIT_X86_RDMSR and KVM_EXIT_X86_WRMSR the corresponding + KVM_EXIT_EPR, KVM_EXIT_HYPERCALL, KVM_EXIT_TDX, + KVM_EXIT_X86_RDMSR and KVM_EXIT_X86_WRMSR the corresponding operations are complete (and guest state is consistent) only after userspace has re-entered the kernel with KVM_RUN. The kernel side will first finish incomplete operations and then check for pending signals. @@ -7176,6 +7184,69 @@ The valid value for 'flags' is: :: + /* KVM_EXIT_TDX */ + struct { + __u64 flags; + __u64 nr; + union { + struct { + u64 ret; + u64 data[5]; + } unknown; + struct { + u64 ret; + u64 gpa; + u64 size; + } get_quote; + struct { + u64 ret; + u64 leaf; + u64 r11, r12, r13, r14; + } get_tdvmcall_info; + struct { + u64 ret; + u64 vector; + } setup_event_notify; + }; + } tdx; + +Process a TDVMCALL from the guest. KVM forwards select TDVMCALL based +on the Guest-Hypervisor Communication Interface (GHCI) specification; +KVM bridges these requests to the userspace VMM with minimal changes, +placing the inputs in the union and copying them back to the guest +on re-entry. + +Flags are currently always zero, whereas ``nr`` contains the TDVMCALL +number from register R11. The remaining field of the union provide the +inputs and outputs of the TDVMCALL. Currently the following values of +``nr`` are defined: + + * ``TDVMCALL_GET_QUOTE``: the guest has requested to generate a TD-Quote + signed by a service hosting TD-Quoting Enclave operating on the host. + Parameters and return value are in the ``get_quote`` field of the union. + The ``gpa`` field and ``size`` specify the guest physical address + (without the shared bit set) and the size of a shared-memory buffer, in + which the TDX guest passes a TD Report. The ``ret`` field represents + the return value of the GetQuote request. When the request has been + queued successfully, the TDX guest can poll the status field in the + shared-memory area to check whether the Quote generation is completed or + not. When completed, the generated Quote is returned via the same buffer. + + * ``TDVMCALL_GET_TD_VM_CALL_INFO``: the guest has requested the support + status of TDVMCALLs. The output values for the given leaf should be + placed in fields from ``r11`` to ``r14`` of the ``get_tdvmcall_info`` + field of the union. + + * ``TDVMCALL_SETUP_EVENT_NOTIFY_INTERRUPT``: the guest has requested to + set up a notification interrupt for vector ``vector``. + +KVM may add support for more values in the future that may cause a userspace +exit, even without calls to ``KVM_ENABLE_CAP`` or similar. In this case, +it will enter with output fields already valid; in the common case, the +``unknown.ret`` field of the union will be ``TDVMCALL_STATUS_SUBFUNC_UNSUPPORTED``. +Userspace need not do anything if it does not wish to support a TDVMCALL. +:: + /* Fix the size of the union. */ char padding[256]; }; @@ -7780,6 +7851,7 @@ Valid bits in args[0] are:: #define KVM_X86_DISABLE_EXITS_HLT (1 << 1) #define KVM_X86_DISABLE_EXITS_PAUSE (1 << 2) #define KVM_X86_DISABLE_EXITS_CSTATE (1 << 3) + #define KVM_X86_DISABLE_EXITS_APERFMPERF (1 << 4) Enabling this capability on a VM provides userspace with a way to no longer intercept some instructions for improved latency in some @@ -7790,6 +7862,28 @@ all such vmexits. Do not enable KVM_FEATURE_PV_UNHALT if you disable HLT exits. +Virtualizing the ``IA32_APERF`` and ``IA32_MPERF`` MSRs requires more +than just disabling APERF/MPERF exits. While both Intel and AMD +document strict usage conditions for these MSRs--emphasizing that only +the ratio of their deltas over a time interval (T0 to T1) is +architecturally defined--simply passing through the MSRs can still +produce an incorrect ratio. + +This erroneous ratio can occur if, between T0 and T1: + +1. The vCPU thread migrates between logical processors. +2. Live migration or suspend/resume operations take place. +3. Another task shares the vCPU's logical processor. +4. C-states lower than C0 are emulated (e.g., via HLT interception). +5. The guest TSC frequency doesn't match the host TSC frequency. + +Due to these complexities, KVM does not automatically associate this +passthrough capability with the guest CPUID bit, +``CPUID.6:ECX.APERFMPERF[bit 0]``. Userspace VMMs that deem this +mechanism adequate for virtualizing the ``IA32_APERF`` and +``IA32_MPERF`` MSRs must set the guest CPUID bit explicitly. + + 7.14 KVM_CAP_S390_HPAGE_1M -------------------------- @@ -8316,7 +8410,7 @@ core crystal clock frequency, if a non-zero CPUID 0x15 is exposed to the guest. 7.36 KVM_CAP_DIRTY_LOG_RING/KVM_CAP_DIRTY_LOG_RING_ACQ_REL ---------------------------------------------------------- -:Architectures: x86, arm64 +:Architectures: x86, arm64, riscv :Type: vm :Parameters: args[0] - size of the dirty log ring @@ -8528,7 +8622,7 @@ ENOSYS for the others. When enabled, KVM will exit to userspace with KVM_EXIT_SYSTEM_EVENT of type KVM_SYSTEM_EVENT_SUSPEND to process the guest suspend request. -7.37 KVM_CAP_ARM_WRITABLE_IMP_ID_REGS +7.42 KVM_CAP_ARM_WRITABLE_IMP_ID_REGS ------------------------------------- :Architectures: arm64 @@ -8557,6 +8651,17 @@ given VM. When this capability is enabled, KVM resets the VCPU when setting MP_STATE_INIT_RECEIVED through IOCTL. The original MP_STATE is preserved. +7.43 KVM_CAP_ARM_CACHEABLE_PFNMAP_SUPPORTED +------------------------------------------- + +:Architectures: arm64 +:Target: VM +:Parameters: None + +This capability indicate to the userspace whether a PFNMAP memory region +can be safely mapped as cacheable. This relies on the presence of +force write back (FWB) feature support on the hardware. + 8. Other capabilities. ====================== |