summaryrefslogtreecommitdiff
path: root/Documentation/virt/kvm/api.rst
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/virt/kvm/api.rst')
-rw-r--r--Documentation/virt/kvm/api.rst84
1 files changed, 66 insertions, 18 deletions
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 9abf93ee5f65..6aa40ee05a4a 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -2006,7 +2006,14 @@ frequency is KHz.
If the KVM_CAP_VM_TSC_CONTROL capability is advertised, this can also
be used as a vm ioctl to set the initial tsc frequency of subsequently
-created vCPUs.
+created vCPUs. Note, the vm ioctl is only allowed prior to creating vCPUs.
+
+For TSC protected Confidential Computing (CoCo) VMs where TSC frequency
+is configured once at VM scope and remains unchanged during VM's
+lifetime, the vm ioctl should be used to configure the TSC frequency
+and the vcpu ioctl is not supported.
+
+Example of such CoCo VMs: TDX guests.
4.56 KVM_GET_TSC_KHZ
--------------------
@@ -7196,6 +7203,10 @@ The valid value for 'flags' is:
u64 leaf;
u64 r11, r12, r13, r14;
} get_tdvmcall_info;
+ struct {
+ u64 ret;
+ u64 vector;
+ } setup_event_notify;
};
} tdx;
@@ -7210,21 +7221,24 @@ number from register R11. The remaining field of the union provide the
inputs and outputs of the TDVMCALL. Currently the following values of
``nr`` are defined:
-* ``TDVMCALL_GET_QUOTE``: the guest has requested to generate a TD-Quote
-signed by a service hosting TD-Quoting Enclave operating on the host.
-Parameters and return value are in the ``get_quote`` field of the union.
-The ``gpa`` field and ``size`` specify the guest physical address
-(without the shared bit set) and the size of a shared-memory buffer, in
-which the TDX guest passes a TD Report. The ``ret`` field represents
-the return value of the GetQuote request. When the request has been
-queued successfully, the TDX guest can poll the status field in the
-shared-memory area to check whether the Quote generation is completed or
-not. When completed, the generated Quote is returned via the same buffer.
-
-* ``TDVMCALL_GET_TD_VM_CALL_INFO``: the guest has requested the support
-status of TDVMCALLs. The output values for the given leaf should be
-placed in fields from ``r11`` to ``r14`` of the ``get_tdvmcall_info``
-field of the union.
+ * ``TDVMCALL_GET_QUOTE``: the guest has requested to generate a TD-Quote
+ signed by a service hosting TD-Quoting Enclave operating on the host.
+ Parameters and return value are in the ``get_quote`` field of the union.
+ The ``gpa`` field and ``size`` specify the guest physical address
+ (without the shared bit set) and the size of a shared-memory buffer, in
+ which the TDX guest passes a TD Report. The ``ret`` field represents
+ the return value of the GetQuote request. When the request has been
+ queued successfully, the TDX guest can poll the status field in the
+ shared-memory area to check whether the Quote generation is completed or
+ not. When completed, the generated Quote is returned via the same buffer.
+
+ * ``TDVMCALL_GET_TD_VM_CALL_INFO``: the guest has requested the support
+ status of TDVMCALLs. The output values for the given leaf should be
+ placed in fields from ``r11`` to ``r14`` of the ``get_tdvmcall_info``
+ field of the union.
+
+ * ``TDVMCALL_SETUP_EVENT_NOTIFY_INTERRUPT``: the guest has requested to
+ set up a notification interrupt for vector ``vector``.
KVM may add support for more values in the future that may cause a userspace
exit, even without calls to ``KVM_ENABLE_CAP`` or similar. In this case,
@@ -7837,6 +7851,7 @@ Valid bits in args[0] are::
#define KVM_X86_DISABLE_EXITS_HLT (1 << 1)
#define KVM_X86_DISABLE_EXITS_PAUSE (1 << 2)
#define KVM_X86_DISABLE_EXITS_CSTATE (1 << 3)
+ #define KVM_X86_DISABLE_EXITS_APERFMPERF (1 << 4)
Enabling this capability on a VM provides userspace with a way to no
longer intercept some instructions for improved latency in some
@@ -7847,6 +7862,28 @@ all such vmexits.
Do not enable KVM_FEATURE_PV_UNHALT if you disable HLT exits.
+Virtualizing the ``IA32_APERF`` and ``IA32_MPERF`` MSRs requires more
+than just disabling APERF/MPERF exits. While both Intel and AMD
+document strict usage conditions for these MSRs--emphasizing that only
+the ratio of their deltas over a time interval (T0 to T1) is
+architecturally defined--simply passing through the MSRs can still
+produce an incorrect ratio.
+
+This erroneous ratio can occur if, between T0 and T1:
+
+1. The vCPU thread migrates between logical processors.
+2. Live migration or suspend/resume operations take place.
+3. Another task shares the vCPU's logical processor.
+4. C-states lower than C0 are emulated (e.g., via HLT interception).
+5. The guest TSC frequency doesn't match the host TSC frequency.
+
+Due to these complexities, KVM does not automatically associate this
+passthrough capability with the guest CPUID bit,
+``CPUID.6:ECX.APERFMPERF[bit 0]``. Userspace VMMs that deem this
+mechanism adequate for virtualizing the ``IA32_APERF`` and
+``IA32_MPERF`` MSRs must set the guest CPUID bit explicitly.
+
+
7.14 KVM_CAP_S390_HPAGE_1M
--------------------------
@@ -8373,7 +8410,7 @@ core crystal clock frequency, if a non-zero CPUID 0x15 is exposed to the guest.
7.36 KVM_CAP_DIRTY_LOG_RING/KVM_CAP_DIRTY_LOG_RING_ACQ_REL
----------------------------------------------------------
-:Architectures: x86, arm64
+:Architectures: x86, arm64, riscv
:Type: vm
:Parameters: args[0] - size of the dirty log ring
@@ -8585,7 +8622,7 @@ ENOSYS for the others.
When enabled, KVM will exit to userspace with KVM_EXIT_SYSTEM_EVENT of
type KVM_SYSTEM_EVENT_SUSPEND to process the guest suspend request.
-7.37 KVM_CAP_ARM_WRITABLE_IMP_ID_REGS
+7.42 KVM_CAP_ARM_WRITABLE_IMP_ID_REGS
-------------------------------------
:Architectures: arm64
@@ -8614,6 +8651,17 @@ given VM.
When this capability is enabled, KVM resets the VCPU when setting
MP_STATE_INIT_RECEIVED through IOCTL. The original MP_STATE is preserved.
+7.43 KVM_CAP_ARM_CACHEABLE_PFNMAP_SUPPORTED
+-------------------------------------------
+
+:Architectures: arm64
+:Target: VM
+:Parameters: None
+
+This capability indicate to the userspace whether a PFNMAP memory region
+can be safely mapped as cacheable. This relies on the presence of
+force write back (FWB) feature support on the hardware.
+
8. Other capabilities.
======================