Diffstat (limited to 'Documentation/virt')
-rw-r--r-- Documentation/virt/kvm/api.rst | 51
-rw-r--r-- Documentation/virt/kvm/devices/arm-vgic-v3.rst | 77
-rw-r--r-- Documentation/virt/kvm/review-checklist.rst | 95
3 files changed, 206 insertions, 17 deletions
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 43ed57e048a8..6aa40ee05a4a 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -2006,7 +2006,14 @@ frequency is KHz.
If the KVM_CAP_VM_TSC_CONTROL capability is advertised, this can also
be used as a vm ioctl to set the initial tsc frequency of subsequently
-created vCPUs.
+created vCPUs. Note, the vm ioctl is only allowed prior to creating vCPUs.
+
+For TSC-protected Confidential Computing (CoCo) VMs, where the TSC frequency
+is configured once at VM scope and remains unchanged during the VM's
+lifetime, the vm ioctl should be used to configure the TSC frequency
+and the vcpu ioctl is not supported.
+
+Example of such CoCo VMs: TDX guests.
4.56 KVM_GET_TSC_KHZ
--------------------
@@ -7230,8 +7237,8 @@ inputs and outputs of the TDVMCALL. Currently the following values of
placed in fields from ``r11`` to ``r14`` of the ``get_tdvmcall_info``
field of the union.
-* ``TDVMCALL_SETUP_EVENT_NOTIFY_INTERRUPT``: the guest has requested to
-set up a notification interrupt for vector ``vector``.
+ * ``TDVMCALL_SETUP_EVENT_NOTIFY_INTERRUPT``: the guest has requested to
+ set up a notification interrupt for vector ``vector``.
KVM may add support for more values in the future that may cause a userspace
exit, even without calls to ``KVM_ENABLE_CAP`` or similar. In this case,
@@ -7844,6 +7851,7 @@ Valid bits in args[0] are::
#define KVM_X86_DISABLE_EXITS_HLT (1 << 1)
#define KVM_X86_DISABLE_EXITS_PAUSE (1 << 2)
#define KVM_X86_DISABLE_EXITS_CSTATE (1 << 3)
+ #define KVM_X86_DISABLE_EXITS_APERFMPERF (1 << 4)
Enabling this capability on a VM provides userspace with a way to no
longer intercept some instructions for improved latency in some
@@ -7854,6 +7862,28 @@ all such vmexits.
Do not enable KVM_FEATURE_PV_UNHALT if you disable HLT exits.
+Virtualizing the ``IA32_APERF`` and ``IA32_MPERF`` MSRs requires more
+than just disabling APERF/MPERF exits. While both Intel and AMD
+document strict usage conditions for these MSRs--emphasizing that only
+the ratio of their deltas over a time interval (T0 to T1) is
+architecturally defined--simply passing through the MSRs can still
+produce an incorrect ratio.
+
+This erroneous ratio can occur if, between T0 and T1:
+
+1. The vCPU thread migrates between logical processors.
+2. Live migration or suspend/resume operations take place.
+3. Another task shares the vCPU's logical processor.
+4. C-states lower than C0 are emulated (e.g., via HLT interception).
+5. The guest TSC frequency doesn't match the host TSC frequency.
+
+Due to these complexities, KVM does not automatically associate this
+passthrough capability with the guest CPUID bit,
+``CPUID.6:ECX.APERFMPERF[bit 0]``. Userspace VMMs that deem this
+mechanism adequate for virtualizing the ``IA32_APERF`` and
+``IA32_MPERF`` MSRs must set the guest CPUID bit explicitly.
+
+
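The ratio discussed above can be sketched in C as follows. This is only an illustration and not part of the patch: ``struct aperfmperf_sample`` and ``aperfmperf_ratio_x1024()`` are hypothetical names, and the sketch assumes that both samples were taken on the same logical processor with none of the listed events occurring in between. If that assumption is violated, the function still returns a number, but the number is meaningless.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for one back-to-back read of both MSRs. */
struct aperfmperf_sample {
	uint64_t aperf;	/* value read from IA32_APERF */
	uint64_t mperf;	/* value read from IA32_MPERF */
};

/*
 * Return (delta APERF / delta MPERF) scaled by 1024, so that 1024 means
 * "running at the reference frequency".  Only this ratio of deltas is
 * architecturally defined; the raw counter values are not.
 *
 * Assumptions of this sketch: t0 and t1 come from the same logical
 * processor, and the APERF delta is below 2^54 so that the scaling
 * multiplication cannot overflow.
 */
static uint64_t aperfmperf_ratio_x1024(struct aperfmperf_sample t0,
				       struct aperfmperf_sample t1)
{
	/* Unsigned subtraction stays correct across a single counter wrap. */
	uint64_t delta_aperf = t1.aperf - t0.aperf;
	uint64_t delta_mperf = t1.mperf - t0.mperf;

	assert(delta_aperf < (UINT64_C(1) << 54));

	/* No reference cycles elapsed: the ratio is undefined, report 0. */
	if (delta_mperf == 0)
		return 0;

	return (delta_aperf * 1024) / delta_mperf;
}
```

A guest that reads the counters before and after a vCPU migration is effectively feeding samples from two different logical processors into a computation like this one, which is why passthrough alone does not guarantee a correct result.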
7.14 KVM_CAP_S390_HPAGE_1M
--------------------------
@@ -8380,7 +8410,7 @@ core crystal clock frequency, if a non-zero CPUID 0x15 is exposed to the guest.
7.36 KVM_CAP_DIRTY_LOG_RING/KVM_CAP_DIRTY_LOG_RING_ACQ_REL
----------------------------------------------------------
-:Architectures: x86, arm64
+:Architectures: x86, arm64, riscv
:Type: vm
:Parameters: args[0] - size of the dirty log ring
@@ -8592,7 +8622,7 @@ ENOSYS for the others.
When enabled, KVM will exit to userspace with KVM_EXIT_SYSTEM_EVENT of
type KVM_SYSTEM_EVENT_SUSPEND to process the guest suspend request.
-7.37 KVM_CAP_ARM_WRITABLE_IMP_ID_REGS
+7.42 KVM_CAP_ARM_WRITABLE_IMP_ID_REGS
-------------------------------------
:Architectures: arm64
@@ -8621,6 +8651,17 @@ given VM.
When this capability is enabled, KVM resets the VCPU when setting
MP_STATE_INIT_RECEIVED through IOCTL. The original MP_STATE is preserved.
+7.43 KVM_CAP_ARM_CACHEABLE_PFNMAP_SUPPORTED
+-------------------------------------------
+
+:Architectures: arm64
+:Target: VM
+:Parameters: None
+
+This capability indicates to userspace whether a PFNMAP memory region
+can be safely mapped as cacheable. This relies on the hardware
+supporting the force write-back (FWB) feature.
+
8. Other capabilities.
======================
diff --git a/Documentation/virt/kvm/devices/arm-vgic-v3.rst b/Documentation/virt/kvm/devices/arm-vgic-v3.rst
index e860498b1e35..ff02102f7141 100644
--- a/Documentation/virt/kvm/devices/arm-vgic-v3.rst
+++ b/Documentation/virt/kvm/devices/arm-vgic-v3.rst
@@ -78,6 +78,8 @@ Groups:
-ENXIO The group or attribute is unknown/unsupported for this device
or hardware support is missing.
-EFAULT Invalid user pointer for attr->addr.
+ -EBUSY Attempt to write a register that is read-only after
+ initialization
======= =============================================================
@@ -120,6 +122,12 @@ Groups:
Note that distributor fields are not banked, but return the same value
regardless of the mpidr used to access the register.
+ Userspace is allowed to write the following register fields prior to
+ initialization of the VGIC:
+
+ * GICD_IIDR.Revision
+ * GICD_TYPER2.nASSGIcap
+
GICD_IIDR.Revision is updated when the KVM implementation is changed in a
way directly observable by the guest or userspace. Userspace should read
GICD_IIDR from KVM and write back the read value to confirm its expected
@@ -128,6 +136,12 @@ Groups:
behavior.
+ GICD_TYPER2.nASSGIcap allows userspace to control the support of SGIs
+ without an active state. At VGIC creation the field resets to the
+ maximum capability of the system. Userspace is expected to read the field
+ to determine the supported value(s) before writing to the field.
+
+
The GICD_STATUSR and GICR_STATUSR registers are architecturally defined such
that a write of a clear bit has no effect, whereas a write with a set bit
clears that value. To allow userspace to freely set the values of these two
@@ -202,16 +216,69 @@ Groups:
KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS accesses the CPU interface registers for the
CPU specified by the mpidr field.
- CPU interface registers access is not implemented for AArch32 mode.
- Error -ENXIO is returned when accessed in AArch32 mode.
+ The available registers are:
+
+ =============== ====================================================
+ ICC_PMR_EL1
+ ICC_BPR0_EL1
+ ICC_AP0R0_EL1
+ ICC_AP0R1_EL1 when the host implements at least 6 bits of priority
+ ICC_AP0R2_EL1 when the host implements 7 bits of priority
+ ICC_AP0R3_EL1 when the host implements 7 bits of priority
+ ICC_AP1R0_EL1
+ ICC_AP1R1_EL1 when the host implements at least 6 bits of priority
+ ICC_AP1R2_EL1 when the host implements 7 bits of priority
+ ICC_AP1R3_EL1 when the host implements 7 bits of priority
+ ICC_BPR1_EL1
+ ICC_CTLR_EL1
+ ICC_SRE_EL1
+ ICC_IGRPEN0_EL1
+ ICC_IGRPEN1_EL1
+ =============== ====================================================
+
+ When EL2 is available for the guest, these registers are also available:
+
+ ============= ====================================================
+ ICH_AP0R0_EL2
+ ICH_AP0R1_EL2 when the host implements at least 6 bits of priority
+ ICH_AP0R2_EL2 when the host implements 7 bits of priority
+ ICH_AP0R3_EL2 when the host implements 7 bits of priority
+ ICH_AP1R0_EL2
+ ICH_AP1R1_EL2 when the host implements at least 6 bits of priority
+ ICH_AP1R2_EL2 when the host implements 7 bits of priority
+ ICH_AP1R3_EL2 when the host implements 7 bits of priority
+ ICH_HCR_EL2
+ ICC_SRE_EL2
+ ICH_VTR_EL2
+ ICH_VMCR_EL2
+ ICH_LR0_EL2
+ ICH_LR1_EL2
+ ICH_LR2_EL2
+ ICH_LR3_EL2
+ ICH_LR4_EL2
+ ICH_LR5_EL2
+ ICH_LR6_EL2
+ ICH_LR7_EL2
+ ICH_LR8_EL2
+ ICH_LR9_EL2
+ ICH_LR10_EL2
+ ICH_LR11_EL2
+ ICH_LR12_EL2
+ ICH_LR13_EL2
+ ICH_LR14_EL2
+ ICH_LR15_EL2
+ ============= ====================================================
+
+ CPU interface registers are only described using the AArch64
+ encoding.
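The dependency between implemented priority bits and the active priority registers in the tables above can be summarized in a small C sketch. It is an illustration only, not KVM code: ``vgic_nr_apr_regs()`` and ``vgic_apr_available()`` are hypothetical helper names, and the mapping is taken solely from the "when the host implements ..." annotations in the tables.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Number of AP0R<n>/AP1R<n> registers per group that the tables list as
 * available for a host implementing pri_bits bits of priority:
 *   R0      always
 *   R1      with at least 6 bits
 *   R2, R3  with 7 bits
 */
static unsigned int vgic_nr_apr_regs(unsigned int pri_bits)
{
	if (pri_bits >= 7)
		return 4;
	if (pri_bits >= 6)
		return 2;
	return 1;
}

/* True if register index n (0..3) is listed as available for pri_bits. */
static bool vgic_apr_available(unsigned int n, unsigned int pri_bits)
{
	assert(n < 4);	/* only R0..R3 appear in the tables */
	return n < vgic_nr_apr_regs(pri_bits);
}
```

Under this reading, a userspace save/restore loop would access only the registers for which ``vgic_apr_available()`` returns true and would expect ``-ENXIO`` for the others.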
Errors:
- ======= =====================================================
- -ENXIO Getting or setting this register is not yet supported
+ ======= =================================================
+ -ENXIO Getting or setting this register is not supported
-EBUSY VCPU is running
-EINVAL Invalid mpidr or register value supplied
- ======= =====================================================
+ ======= =================================================
KVM_DEV_ARM_VGIC_GRP_NR_IRQS
diff --git a/Documentation/virt/kvm/review-checklist.rst b/Documentation/virt/kvm/review-checklist.rst
index dc01aea4057b..debac54e14e7 100644
--- a/Documentation/virt/kvm/review-checklist.rst
+++ b/Documentation/virt/kvm/review-checklist.rst
@@ -7,7 +7,7 @@ Review checklist for kvm patches
1. The patch must follow Documentation/process/coding-style.rst and
Documentation/process/submitting-patches.rst.
-2. Patches should be against kvm.git master branch.
+2. Patches should be against kvm.git master or next branches.
3. If the patch introduces or modifies a new userspace API:
- the API must be documented in Documentation/virt/kvm/api.rst
@@ -18,10 +18,10 @@ Review checklist for kvm patches
5. New features must default to off (userspace should explicitly request them).
Performance improvements can and should default to on.
-6. New cpu features should be exposed via KVM_GET_SUPPORTED_CPUID2
+6. New cpu features should be exposed via KVM_GET_SUPPORTED_CPUID2,
+ or its equivalent for non-x86 architectures
-7. Emulator changes should be accompanied by unit tests for qemu-kvm.git
- kvm/test directory.
+7. The feature should be testable (see below).
8. Changes should be vendor neutral when possible. Changes to common code
are better than duplicating changes to vendor code.
@@ -36,6 +36,87 @@ Review checklist for kvm patches
11. New guest visible features must either be documented in a hardware manual
or be accompanied by documentation.
-12. Features must be robust against reset and kexec - for example, shared
- host/guest memory must be unshared to prevent the host from writing to
- guest memory that the guest has not reserved for this purpose.
+Testing of KVM code
+-------------------
+
+All features contributed to KVM, and in many cases bugfixes too, should be
+accompanied by some kind of tests and/or enablement in open source guests
+and VMMs. KVM is covered by multiple test suites:
+
+*Selftests*
+ These are low level tests that allow granular testing of kernel APIs.
+ This includes API failure scenarios, invoking APIs after specific
+ guest instructions, and testing multiple calls to ``KVM_CREATE_VM``
+ within a single test. They are included in the kernel tree at
+ ``tools/testing/selftests/kvm``.
+
+``kvm-unit-tests``
+ A collection of small guests that test CPU and emulated device features
+ from a guest's perspective. They run under QEMU or ``kvmtool``, and
+ are generally not KVM-specific: they can be run with any accelerator
+ that QEMU supports or even on bare metal, making it possible to compare
+ behavior across hypervisors and processor families.
+
+Functional test suites
+ Various sets of functional tests exist, such as QEMU's ``tests/functional``
+ suite and `avocado-vt <https://avocado-vt.readthedocs.io/en/latest/>`__.
+ These typically involve running a full operating system in a virtual
+ machine.
+
+The best testing approach depends on the feature's complexity and
+operation. Here are some examples and guidelines:
+
+New instructions (no new registers or APIs)
+ The corresponding CPU features (if applicable) should be made available
+ in QEMU. If the instructions require emulation support or other code in
+ KVM, it is worth adding coverage to ``kvm-unit-tests`` or selftests;
+ the latter can be a better choice if the instructions relate to an API
+ that already has good selftest coverage.
+
+New hardware features (new registers, no new APIs)
+ These should be tested via ``kvm-unit-tests``; this more or less implies
+ supporting them in QEMU and/or ``kvmtool``. In some cases selftests
+ can be used instead, similar to the previous case, or specifically to
+ test corner cases in guest state save/restore.
+
+Bug fixes and performance improvements
+ These usually do not introduce new APIs, but it's worth sharing
+ any benchmarks and tests that will validate your contribution,
+ ideally in the form of regression tests. Tests and benchmarks
+ can be included in either ``kvm-unit-tests`` or selftests, depending
+ on the specifics of your change. Selftests are especially useful for
+ regression tests because they are included directly in Linux's tree.
+
+Large scale internal changes
+ While it's difficult to provide a single policy, you should ensure that
+ the changed code is covered by either ``kvm-unit-tests`` or selftests.
+ In some cases the affected code is run for any guest and functional
+ tests suffice. Explain your testing process in the cover letter,
+ as that can help identify gaps in existing test suites.
+
+New APIs
+ It is important to demonstrate your use case. This can be as simple as
+ explaining that the feature is already in use on bare metal, or it can be
+ a proof-of-concept implementation in userspace. The latter need not be
+ open source, though that is of course preferable for easier testing.
+ Selftests should test corner cases of the APIs, and should also cover
+ basic host and guest operation if no open source VMM uses the feature.
+
+Bigger features, usually spanning host and guest
+ These should be supported by Linux guests, with limited exceptions for
+ Hyper-V features that are testable on Windows guests. It is strongly
+ suggested that the feature be usable with an open source host VMM, such
+ as at least one of QEMU or crosvm, and guest firmware. Selftests should
+ test at least API error cases. Guest operation can be covered by
+ either selftests or ``kvm-unit-tests`` (this is especially important for
+ paravirtualized and Windows-only features). Strong selftest coverage
+ can also be a replacement for implementation in an open source VMM,
+ but this is generally not recommended.
+
+Following the above suggestions for testing in selftests and
+``kvm-unit-tests`` will make it easier for the maintainers to review
+and accept your code. In fact, even before you contribute your changes
+upstream it will make it easier for you to develop for KVM.
+
+Of course, the KVM maintainers reserve the right to require more tests,
+though they may also waive the requirement from time to time.