From 5ec3289b31ab9bb209be59cee360aac4b03f320a Mon Sep 17 00:00:00 2001 From: David Woodhouse Date: Fri, 18 Nov 2022 14:32:38 +0000 Subject: KVM: x86/xen: Compatibility fixes for shared runstate area MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The guest runstate area can be arbitrarily byte-aligned. In fact, even when a sane 32-bit guest aligns the overall structure nicely, the 64-bit fields in the structure end up being unaligned due to the fact that the 32-bit ABI only aligns them to 32 bits. So setting the ->state_entry_time field to something|XEN_RUNSTATE_UPDATE is buggy, because if it's unaligned then we can't update the whole field atomically; the low bytes might be observable before the _UPDATE bit is. Xen actually updates the *byte* containing that top bit, on its own. KVM should do the same. In addition, we cannot assume that the runstate area fits within a single page. One option might be to make the gfn_to_pfn cache cope with regions that cross a page — but getting a contiguous virtual kernel mapping of a discontiguous set of IOMEM pages is a distinctly non-trivial exercise, and it seems this is the *only* current use case for the GPC which would benefit from it. An earlier version of the runstate code did use a gfn_to_hva cache for this purpose, but it still had the single-page restriction because it used the uhva directly — because it needs to be able to do so atomically when the vCPU is being scheduled out, so it used pagefault_disable() around the accesses and didn't just use kvm_write_guest_cached() which has a fallback path. So... use a pair of GPCs for the first and potential second page covering the runstate area. We can get away with locking both at once because nothing else takes more than one GPC lock at a time so we can invent a trivial ordering rule. The common case where it's all in the same page is kept as a fast path, but in both cases, the actual guest structure (compat or not) is built up from the fields in @vx, following preset pointers to the state and times fields. The only difference is whether those pointers point to the kernel stack (in the split case) or to guest memory directly via the GPC. The fast path is also fixed to use a byte access for the XEN_RUNSTATE_UPDATE bit, then the only real difference is the dual memcpy. Finally, Xen also does write the runstate area immediately when it's configured. Flip the kvm_xen_update_runstate() and …_guest() functions and call the latter directly when the runstate area is set. This means that other ioctls which modify the runstate also write it immediately to the guest when they do so, which is also intended. Update the xen_shinfo_test to exercise the pathological case where the XEN_RUNSTATE_UPDATE flag in the top byte of the state_entry_time is actually in a different page to the rest of the 64-bit word. Signed-off-by: David Woodhouse Signed-off-by: Paolo Bonzini --- tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) (limited to 'tools/testing/selftests/kvm') diff --git a/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c b/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c index 2a5727188c8d..7f39815f1772 100644 --- a/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c +++ b/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c @@ -26,17 +26,17 @@ #define SHINFO_REGION_GPA 0xc0000000ULL #define SHINFO_REGION_SLOT 10 -#define DUMMY_REGION_GPA (SHINFO_REGION_GPA + (2 * PAGE_SIZE)) +#define DUMMY_REGION_GPA (SHINFO_REGION_GPA + (3 * PAGE_SIZE)) #define DUMMY_REGION_SLOT 11 #define SHINFO_ADDR (SHINFO_REGION_GPA) -#define PVTIME_ADDR (SHINFO_REGION_GPA + PAGE_SIZE) -#define RUNSTATE_ADDR (SHINFO_REGION_GPA + PAGE_SIZE + 0x20) #define VCPU_INFO_ADDR (SHINFO_REGION_GPA + 0x40) +#define PVTIME_ADDR (SHINFO_REGION_GPA + PAGE_SIZE) +#define RUNSTATE_ADDR (SHINFO_REGION_GPA + PAGE_SIZE + PAGE_SIZE - 15) #define SHINFO_VADDR (SHINFO_REGION_GVA) -#define RUNSTATE_VADDR (SHINFO_REGION_GVA + PAGE_SIZE + 0x20) #define VCPU_INFO_VADDR (SHINFO_REGION_GVA + 0x40) +#define RUNSTATE_VADDR (SHINFO_REGION_GVA + PAGE_SIZE + PAGE_SIZE - 15) #define EVTCHN_VECTOR 0x10 @@ -449,8 +449,8 @@ int main(int argc, char *argv[]) /* Map a region for the shared_info page */ vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS, - SHINFO_REGION_GPA, SHINFO_REGION_SLOT, 2, 0); - virt_map(vm, SHINFO_REGION_GVA, SHINFO_REGION_GPA, 2); + SHINFO_REGION_GPA, SHINFO_REGION_SLOT, 3, 0); + virt_map(vm, SHINFO_REGION_GVA, SHINFO_REGION_GPA, 3); struct shared_info *shinfo = addr_gpa2hva(vm, SHINFO_VADDR); -- cgit From d8ba8ba4c801b794f47852a6f1821ea48f83b5d1 Mon Sep 17 00:00:00 2001 From: David Woodhouse Date: Sun, 27 Nov 2022 12:22:10 +0000 Subject: KVM: x86/xen: Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured Closer inspection of the Xen code shows that we aren't supposed to be using the XEN_RUNSTATE_UPDATE flag unconditionally. It should be explicitly enabled by guests through the HYPERVISOR_vm_assist hypercall. If we randomly set the top bit of ->state_entry_time for a guest that hasn't asked for it and doesn't expect it, that could make the runtimes fail to add up and confuse the guest. Without the flag it's perfectly safe for a vCPU to read its own vcpu_runstate_info; just not for one vCPU to read *another's*. I briefly pondered adding a word for the whole set of VMASST_TYPE_* flags but the only one we care about for HVM guests is this, so it seemed a bit pointless. Signed-off-by: David Woodhouse Message-Id: <20221127122210.248427-3-dwmw2@infradead.org> Signed-off-by: Paolo Bonzini --- tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) (limited to 'tools/testing/selftests/kvm') diff --git a/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c b/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c index 7f39815f1772..c9b0110d73f3 100644 --- a/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c +++ b/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c @@ -440,6 +440,7 @@ int main(int argc, char *argv[]) TEST_REQUIRE(xen_caps & KVM_XEN_HVM_CONFIG_SHARED_INFO); bool do_runstate_tests = !!(xen_caps & KVM_XEN_HVM_CONFIG_RUNSTATE); + bool do_runstate_flag = !!(xen_caps & KVM_XEN_HVM_CONFIG_RUNSTATE_UPDATE_FLAG); bool do_eventfd_tests = !!(xen_caps & KVM_XEN_HVM_CONFIG_EVTCHN_2LEVEL); bool do_evtchn_tests = do_eventfd_tests && !!(xen_caps & KVM_XEN_HVM_CONFIG_EVTCHN_SEND); @@ -475,6 +476,19 @@ int main(int argc, char *argv[]) }; vm_ioctl(vm, KVM_XEN_HVM_SET_ATTR, &lm); + if (do_runstate_flag) { + struct kvm_xen_hvm_attr ruf = { + .type = KVM_XEN_ATTR_TYPE_RUNSTATE_UPDATE_FLAG, + .u.runstate_update_flag = 1, + }; + vm_ioctl(vm, KVM_XEN_HVM_SET_ATTR, &ruf); + + ruf.u.runstate_update_flag = 0; + vm_ioctl(vm, KVM_XEN_HVM_GET_ATTR, &ruf); + TEST_ASSERT(ruf.u.runstate_update_flag == 1, + "Failed to read back RUNSTATE_UPDATE_FLAG attr"); + } + struct kvm_xen_hvm_attr ha = { .type = KVM_XEN_ATTR_TYPE_SHARED_INFO, .u.shared_info.gfn = SHINFO_REGION_GPA / PAGE_SIZE, -- cgit From 8acc35186ed63436bfaf60051c8bb53f344dcbfc Mon Sep 17 00:00:00 2001 From: David Woodhouse Date: Sat, 19 Nov 2022 09:27:46 +0000 Subject: KVM: x86/xen: Add runstate tests for 32-bit mode and crossing page boundary Torture test the cases where the runstate crosses a page boundary, and and especially the case where it's configured in 32-bit mode and doesn't, but then switching to 64-bit mode makes it go onto the second page. To simplify this, make the KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADJUST ioctl also update the guest runstate area. It already did so if the actual runstate changed, as a side-effect of kvm_xen_update_runstate(). So doing it in the plain adjustment case is making it more consistent, as well as giving us a nice way to trigger the update without actually running the vCPU again and changing the values. Signed-off-by: David Woodhouse Reviewed-by: Paul Durrant Signed-off-by: Paolo Bonzini --- .../testing/selftests/kvm/x86_64/xen_shinfo_test.c | 115 +++++++++++++++++---- 1 file changed, 95 insertions(+), 20 deletions(-) (limited to 'tools/testing/selftests/kvm') diff --git a/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c b/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c index c9b0110d73f3..721f6a693799 100644 --- a/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c +++ b/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c @@ -88,14 +88,20 @@ struct pvclock_wall_clock { } __attribute__((__packed__)); struct vcpu_runstate_info { - uint32_t state; - uint64_t state_entry_time; - uint64_t time[4]; + uint32_t state; + uint64_t state_entry_time; + uint64_t time[5]; /* Extra field for overrun check */ }; +struct compat_vcpu_runstate_info { + uint32_t state; + uint64_t state_entry_time; + uint64_t time[5]; +} __attribute__((__packed__));; + struct arch_vcpu_info { - unsigned long cr2; - unsigned long pad; /* sizeof(vcpu_info_t) == 64 */ + unsigned long cr2; + unsigned long pad; /* sizeof(vcpu_info_t) == 64 */ }; struct vcpu_info { @@ -1013,22 +1019,91 @@ int main(int argc, char *argv[]) runstate_names[i], rs->time[i]); } } - TEST_ASSERT(rs->state == rst.u.runstate.state, "Runstate mismatch"); - TEST_ASSERT(rs->state_entry_time == rst.u.runstate.state_entry_time, - "State entry time mismatch"); - TEST_ASSERT(rs->time[RUNSTATE_running] == rst.u.runstate.time_running, - "Running time mismatch"); - TEST_ASSERT(rs->time[RUNSTATE_runnable] == rst.u.runstate.time_runnable, - "Runnable time mismatch"); - TEST_ASSERT(rs->time[RUNSTATE_blocked] == rst.u.runstate.time_blocked, - "Blocked time mismatch"); - TEST_ASSERT(rs->time[RUNSTATE_offline] == rst.u.runstate.time_offline, - "Offline time mismatch"); - - TEST_ASSERT(rs->state_entry_time == rs->time[0] + - rs->time[1] + rs->time[2] + rs->time[3], - "runstate times don't add up"); + + /* + * Exercise runstate info at all points across the page boundary, in + * 32-bit and 64-bit mode. In particular, test the case where it is + * configured in 32-bit mode and then switched to 64-bit mode while + * active, which takes it onto the second page. + */ + unsigned long runstate_addr; + struct compat_vcpu_runstate_info *crs; + for (runstate_addr = SHINFO_REGION_GPA + PAGE_SIZE + PAGE_SIZE - sizeof(*rs) - 4; + runstate_addr < SHINFO_REGION_GPA + PAGE_SIZE + PAGE_SIZE + 4; runstate_addr++) { + + rs = addr_gpa2hva(vm, runstate_addr); + crs = (void *)rs; + + memset(rs, 0xa5, sizeof(*rs)); + + /* Set to compatibility mode */ + lm.u.long_mode = 0; + vm_ioctl(vm, KVM_XEN_HVM_SET_ATTR, &lm); + + /* Set runstate to new address (kernel will write it) */ + struct kvm_xen_vcpu_attr st = { + .type = KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADDR, + .u.gpa = runstate_addr, + }; + vcpu_ioctl(vcpu, KVM_XEN_VCPU_SET_ATTR, &st); + + if (verbose) + printf("Compatibility runstate at %08lx\n", runstate_addr); + + TEST_ASSERT(crs->state == rst.u.runstate.state, "Runstate mismatch"); + TEST_ASSERT(crs->state_entry_time == rst.u.runstate.state_entry_time, + "State entry time mismatch"); + TEST_ASSERT(crs->time[RUNSTATE_running] == rst.u.runstate.time_running, + "Running time mismatch"); + TEST_ASSERT(crs->time[RUNSTATE_runnable] == rst.u.runstate.time_runnable, + "Runnable time mismatch"); + TEST_ASSERT(crs->time[RUNSTATE_blocked] == rst.u.runstate.time_blocked, + "Blocked time mismatch"); + TEST_ASSERT(crs->time[RUNSTATE_offline] == rst.u.runstate.time_offline, + "Offline time mismatch"); + TEST_ASSERT(crs->time[RUNSTATE_offline + 1] == 0xa5a5a5a5a5a5a5a5ULL, + "Structure overrun"); + TEST_ASSERT(crs->state_entry_time == crs->time[0] + + crs->time[1] + crs->time[2] + crs->time[3], + "runstate times don't add up"); + + + /* Now switch to 64-bit mode */ + lm.u.long_mode = 1; + vm_ioctl(vm, KVM_XEN_HVM_SET_ATTR, &lm); + + memset(rs, 0xa5, sizeof(*rs)); + + /* Don't change the address, just trigger a write */ + struct kvm_xen_vcpu_attr adj = { + .type = KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADJUST, + .u.runstate.state = (uint64_t)-1 + }; + vcpu_ioctl(vcpu, KVM_XEN_VCPU_SET_ATTR, &adj); + + if (verbose) + printf("64-bit runstate at %08lx\n", runstate_addr); + + TEST_ASSERT(rs->state == rst.u.runstate.state, "Runstate mismatch"); + TEST_ASSERT(rs->state_entry_time == rst.u.runstate.state_entry_time, + "State entry time mismatch"); + TEST_ASSERT(rs->time[RUNSTATE_running] == rst.u.runstate.time_running, + "Running time mismatch"); + TEST_ASSERT(rs->time[RUNSTATE_runnable] == rst.u.runstate.time_runnable, + "Runnable time mismatch"); + TEST_ASSERT(rs->time[RUNSTATE_blocked] == rst.u.runstate.time_blocked, + "Blocked time mismatch"); + TEST_ASSERT(rs->time[RUNSTATE_offline] == rst.u.runstate.time_offline, + "Offline time mismatch"); + TEST_ASSERT(rs->time[RUNSTATE_offline + 1] == 0xa5a5a5a5a5a5a5a5ULL, + "Structure overrun"); + + TEST_ASSERT(rs->state_entry_time == rs->time[0] + + rs->time[1] + rs->time[2] + rs->time[3], + "runstate times don't add up"); + } } + kvm_vm_free(vm); return 0; } -- cgit From b80732fdc9b235046687a2999ed198fa55fde901 Mon Sep 17 00:00:00 2001 From: Sean Christopherson Date: Tue, 7 Jun 2022 23:23:53 +0000 Subject: KVM: selftests: Verify userspace can stuff IA32_FEATURE_CONTROL at will Verify the KVM allows userspace to set all supported bits in the IA32_FEATURE_CONTROL MSR irrespective of the current guest CPUID, and that all unsupported bits are rejected. Throw the testcase into vmx_msrs_test even though it's not technically a VMX MSR; it's close enough, and the most frequently feature controlled by the MSR is VMX. Signed-off-by: Sean Christopherson Link: https://lore.kernel.org/r/20220607232353.3375324-4-seanjc@google.com --- .../selftests/kvm/include/x86_64/processor.h | 2 + tools/testing/selftests/kvm/x86_64/vmx_msrs_test.c | 47 ++++++++++++++++++++++ 2 files changed, 49 insertions(+) (limited to 'tools/testing/selftests/kvm') diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h index 5d310abe6c3f..ac4590abb44d 100644 --- a/tools/testing/selftests/kvm/include/x86_64/processor.h +++ b/tools/testing/selftests/kvm/include/x86_64/processor.h @@ -102,6 +102,7 @@ struct kvm_x86_cpu_feature { #define X86_FEATURE_XMM2 KVM_X86_CPU_FEATURE(0x1, 0, EDX, 26) #define X86_FEATURE_FSGSBASE KVM_X86_CPU_FEATURE(0x7, 0, EBX, 0) #define X86_FEATURE_TSC_ADJUST KVM_X86_CPU_FEATURE(0x7, 0, EBX, 1) +#define X86_FEATURE_SGX KVM_X86_CPU_FEATURE(0x7, 0, EBX, 2) #define X86_FEATURE_HLE KVM_X86_CPU_FEATURE(0x7, 0, EBX, 4) #define X86_FEATURE_SMEP KVM_X86_CPU_FEATURE(0x7, 0, EBX, 7) #define X86_FEATURE_INVPCID KVM_X86_CPU_FEATURE(0x7, 0, EBX, 10) @@ -115,6 +116,7 @@ struct kvm_x86_cpu_feature { #define X86_FEATURE_PKU KVM_X86_CPU_FEATURE(0x7, 0, ECX, 3) #define X86_FEATURE_LA57 KVM_X86_CPU_FEATURE(0x7, 0, ECX, 16) #define X86_FEATURE_RDPID KVM_X86_CPU_FEATURE(0x7, 0, ECX, 22) +#define X86_FEATURE_SGX_LC KVM_X86_CPU_FEATURE(0x7, 0, ECX, 30) #define X86_FEATURE_SHSTK KVM_X86_CPU_FEATURE(0x7, 0, ECX, 7) #define X86_FEATURE_IBT KVM_X86_CPU_FEATURE(0x7, 0, EDX, 20) #define X86_FEATURE_AMX_TILE KVM_X86_CPU_FEATURE(0x7, 0, EDX, 24) diff --git a/tools/testing/selftests/kvm/x86_64/vmx_msrs_test.c b/tools/testing/selftests/kvm/x86_64/vmx_msrs_test.c index 322d561b4260..90720b6205f4 100644 --- a/tools/testing/selftests/kvm/x86_64/vmx_msrs_test.c +++ b/tools/testing/selftests/kvm/x86_64/vmx_msrs_test.c @@ -67,6 +67,52 @@ static void vmx_save_restore_msrs_test(struct kvm_vcpu *vcpu) vmx_fixed1_msr_test(vcpu, MSR_IA32_VMX_VMFUNC, -1ull); } +static void __ia32_feature_control_msr_test(struct kvm_vcpu *vcpu, + uint64_t msr_bit, + struct kvm_x86_cpu_feature feature) +{ + uint64_t val; + + vcpu_clear_cpuid_feature(vcpu, feature); + + val = vcpu_get_msr(vcpu, MSR_IA32_FEAT_CTL); + vcpu_set_msr(vcpu, MSR_IA32_FEAT_CTL, val | msr_bit | FEAT_CTL_LOCKED); + vcpu_set_msr(vcpu, MSR_IA32_FEAT_CTL, (val & ~msr_bit) | FEAT_CTL_LOCKED); + vcpu_set_msr(vcpu, MSR_IA32_FEAT_CTL, val | msr_bit | FEAT_CTL_LOCKED); + vcpu_set_msr(vcpu, MSR_IA32_FEAT_CTL, (val & ~msr_bit) | FEAT_CTL_LOCKED); + vcpu_set_msr(vcpu, MSR_IA32_FEAT_CTL, val); + + if (!kvm_cpu_has(feature)) + return; + + vcpu_set_cpuid_feature(vcpu, feature); +} + +static void ia32_feature_control_msr_test(struct kvm_vcpu *vcpu) +{ + uint64_t supported_bits = FEAT_CTL_LOCKED | + FEAT_CTL_VMX_ENABLED_INSIDE_SMX | + FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX | + FEAT_CTL_SGX_LC_ENABLED | + FEAT_CTL_SGX_ENABLED | + FEAT_CTL_LMCE_ENABLED; + int bit, r; + + __ia32_feature_control_msr_test(vcpu, FEAT_CTL_VMX_ENABLED_INSIDE_SMX, X86_FEATURE_SMX); + __ia32_feature_control_msr_test(vcpu, FEAT_CTL_VMX_ENABLED_INSIDE_SMX, X86_FEATURE_VMX); + __ia32_feature_control_msr_test(vcpu, FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX, X86_FEATURE_VMX); + __ia32_feature_control_msr_test(vcpu, FEAT_CTL_SGX_LC_ENABLED, X86_FEATURE_SGX_LC); + __ia32_feature_control_msr_test(vcpu, FEAT_CTL_SGX_LC_ENABLED, X86_FEATURE_SGX); + __ia32_feature_control_msr_test(vcpu, FEAT_CTL_SGX_ENABLED, X86_FEATURE_SGX); + __ia32_feature_control_msr_test(vcpu, FEAT_CTL_LMCE_ENABLED, X86_FEATURE_MCE); + + for_each_clear_bit(bit, &supported_bits, 64) { + r = _vcpu_set_msr(vcpu, MSR_IA32_FEAT_CTL, BIT(bit)); + TEST_ASSERT(r == 0, + "Setting reserved bit %d in IA32_FEATURE_CONTROL should fail", bit); + } +} + int main(void) { struct kvm_vcpu *vcpu; @@ -79,6 +125,7 @@ int main(void) vm = vm_create_with_one_vcpu(&vcpu, NULL); vmx_save_restore_msrs_test(vcpu); + ia32_feature_control_msr_test(vcpu); kvm_vm_free(vm); } -- cgit From a33004e844e4c60da86ecf8c249aab7179817fce Mon Sep 17 00:00:00 2001 From: Sean Christopherson Date: Tue, 29 Nov 2022 17:52:59 +0000 Subject: KVM: selftests: Fix inverted "warning" in access tracking perf test Warn if the number of idle pages is greater than or equal to 10% of the total number of pages, not if the percentage of idle pages is less than 10%. The original code asserted that less than 10% of pages were still idle, but the check got inverted when the assert was converted to a warning. Opportunistically clean up the warning; selftests are 64-bit only, there is no need to use "%PRIu64" instead of "%lu". Fixes: 6336a810db5c ("KVM: selftests: replace assertion with warning in access_tracking_perf_test") Reviewed-by: Emanuele Giuseppe Esposito Signed-off-by: Sean Christopherson Link: https://lore.kernel.org/r/20221129175300.4052283-2-seanjc@google.com --- tools/testing/selftests/kvm/access_tracking_perf_test.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) (limited to 'tools/testing/selftests/kvm') diff --git a/tools/testing/selftests/kvm/access_tracking_perf_test.c b/tools/testing/selftests/kvm/access_tracking_perf_test.c index 02d3587cab0a..d45ef319a68f 100644 --- a/tools/testing/selftests/kvm/access_tracking_perf_test.c +++ b/tools/testing/selftests/kvm/access_tracking_perf_test.c @@ -185,10 +185,9 @@ static void mark_vcpu_memory_idle(struct kvm_vm *vm, * happens, much more pages are cached there and guest won't see the * "idle" bit cleared. */ - if (still_idle < pages / 10) - printf("WARNING: vCPU%d: Too many pages still idle (%" PRIu64 - "out of %" PRIu64 "), this will affect performance results" - ".\n", + if (still_idle >= pages / 10) + printf("WARNING: vCPU%d: Too many pages still idle (%lu out of %lu), " + "this will affect performance results.\n", vcpu_idx, still_idle, pages); close(page_idle_fd); -- cgit From 8fcee0421386344b58fdc1cd5940219617037968 Mon Sep 17 00:00:00 2001 From: Sean Christopherson Date: Tue, 29 Nov 2022 17:53:00 +0000 Subject: KVM: selftests: Restore assert for non-nested VMs in access tracking test Restore the assert (on x86-64) that <10% of pages are still idle when NOT running as a nested VM in the access tracking test. The original assert was converted to a "warning" to avoid false failures when running the test in a VM, but the non-nested case does not suffer from the same "infinite TLB size" issue. Using the HYPERVISOR flag isn't infallible as VMMs aren't strictly required to enumerate the "feature" in CPUID, but practically speaking anyone that is running KVM selftests in VMs is going to be using a VMM and hypervisor that sets the HYPERVISOR flag. Cc: David Matlack Reviewed-by: Emanuele Giuseppe Esposito Signed-off-by: Sean Christopherson Link: https://lore.kernel.org/r/20221129175300.4052283-3-seanjc@google.com --- tools/testing/selftests/kvm/access_tracking_perf_test.c | 17 ++++++++++++----- tools/testing/selftests/kvm/include/x86_64/processor.h | 1 + 2 files changed, 13 insertions(+), 5 deletions(-) (limited to 'tools/testing/selftests/kvm') diff --git a/tools/testing/selftests/kvm/access_tracking_perf_test.c b/tools/testing/selftests/kvm/access_tracking_perf_test.c index d45ef319a68f..9f9503e40ca5 100644 --- a/tools/testing/selftests/kvm/access_tracking_perf_test.c +++ b/tools/testing/selftests/kvm/access_tracking_perf_test.c @@ -46,6 +46,7 @@ #include "test_util.h" #include "memstress.h" #include "guest_modes.h" +#include "processor.h" /* Global variable used to synchronize all of the vCPU threads. */ static int iteration; @@ -180,15 +181,21 @@ static void mark_vcpu_memory_idle(struct kvm_vm *vm, * access tracking but low enough as to not make the test too brittle * over time and across architectures. * - * Note that when run in nested virtualization, this check will trigger - * much more frequently because TLB size is unlimited and since no flush - * happens, much more pages are cached there and guest won't see the - * "idle" bit cleared. + * When running the guest as a nested VM, "warn" instead of asserting + * as the TLB size is effectively unlimited and the KVM doesn't + * explicitly flush the TLB when aging SPTEs. As a result, more pages + * are cached and the guest won't see the "idle" bit cleared. */ - if (still_idle >= pages / 10) + if (still_idle >= pages / 10) { +#ifdef __x86_64__ + TEST_ASSERT(this_cpu_has(X86_FEATURE_HYPERVISOR), + "vCPU%d: Too many pages still idle (%lu out of %lu)", + vcpu_idx, still_idle, pages); +#endif printf("WARNING: vCPU%d: Too many pages still idle (%lu out of %lu), " "this will affect performance results.\n", vcpu_idx, still_idle, pages); + } close(page_idle_fd); close(pagemap_fd); diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h index 5d310abe6c3f..22852bd32d7b 100644 --- a/tools/testing/selftests/kvm/include/x86_64/processor.h +++ b/tools/testing/selftests/kvm/include/x86_64/processor.h @@ -94,6 +94,7 @@ struct kvm_x86_cpu_feature { #define X86_FEATURE_XSAVE KVM_X86_CPU_FEATURE(0x1, 0, ECX, 26) #define X86_FEATURE_OSXSAVE KVM_X86_CPU_FEATURE(0x1, 0, ECX, 27) #define X86_FEATURE_RDRAND KVM_X86_CPU_FEATURE(0x1, 0, ECX, 30) +#define X86_FEATURE_HYPERVISOR KVM_X86_CPU_FEATURE(0x1, 0, ECX, 31) #define X86_FEATURE_PAE KVM_X86_CPU_FEATURE(0x1, 0, EDX, 6) #define X86_FEATURE_MCE KVM_X86_CPU_FEATURE(0x1, 0, EDX, 7) #define X86_FEATURE_APIC KVM_X86_CPU_FEATURE(0x1, 0, EDX, 9) -- cgit From 18eee7bfd18d6c9586dd224cf5b74258700fe815 Mon Sep 17 00:00:00 2001 From: Lei Wang Date: Mon, 28 Nov 2022 22:57:32 +0000 Subject: KVM: selftests: Move XFD CPUID checking out of __vm_xsave_require_permission() Move the kvm_cpu_has() check on X86_FEATURE_XFD out of the helper to enable off-by-default XSAVE-managed features and into the one test that currenty requires XFD (XFeature Disable) support. kvm_cpu_has() uses kvm_get_supported_cpuid() and thus caches KVM_GET_SUPPORTED_CPUID, and so using kvm_cpu_has() before ARCH_REQ_XCOMP_GUEST_PERM effectively results in the test caching stale values, e.g. subsequent checks on AMX_TILE will get false negatives. Although off-by-default features are nonsensical without XFD, checking for XFD virtualization prior to enabling such features isn't strictly required. Signed-off-by: Lei Wang Fixes: 7fbb653e01fd ("KVM: selftests: Check KVM's supported CPUID, not host CPUID, for XFD") Link: https://lore.kernel.org/r/20221125023839.315207-1-lei4.wang@intel.com [sean: add Fixes, reword changelog] Signed-off-by: Sean Christopherson Link: https://lore.kernel.org/r/20221128225735.3291648-2-seanjc@google.com --- tools/testing/selftests/kvm/lib/x86_64/processor.c | 2 -- tools/testing/selftests/kvm/x86_64/amx_test.c | 1 + 2 files changed, 1 insertion(+), 2 deletions(-) (limited to 'tools/testing/selftests/kvm') diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c index d532c20c74fd..aac7b32a794b 100644 --- a/tools/testing/selftests/kvm/lib/x86_64/processor.c +++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c @@ -563,8 +563,6 @@ void __vm_xsave_require_permission(int bit, const char *name) .addr = (unsigned long) &bitmask }; - TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_XFD)); - kvm_fd = open_kvm_dev_path_or_exit(); rc = __kvm_ioctl(kvm_fd, KVM_GET_DEVICE_ATTR, &attr); close(kvm_fd); diff --git a/tools/testing/selftests/kvm/x86_64/amx_test.c b/tools/testing/selftests/kvm/x86_64/amx_test.c index 21de6ae42086..1256c7faadd3 100644 --- a/tools/testing/selftests/kvm/x86_64/amx_test.c +++ b/tools/testing/selftests/kvm/x86_64/amx_test.c @@ -254,6 +254,7 @@ int main(int argc, char *argv[]) /* Create VM */ vm = vm_create_with_one_vcpu(&vcpu, guest_code); + TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_XFD)); TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_XSAVE)); TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_AMX_TILE)); TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_XTILECFG)); -- cgit From 2ceade1d363c934633a1788d0f98fc2332062b92 Mon Sep 17 00:00:00 2001 From: Sean Christopherson Date: Mon, 28 Nov 2022 22:57:33 +0000 Subject: KVM: selftests: Move __vm_xsave_require_permission() below CPUID helpers Move __vm_xsave_require_permission() below the CPUID helpers so that a future change can reference the cached result of KVM_GET_SUPPORTED_CPUID while keeping the definition of the variable close to its intended user, kvm_get_supported_cpuid(). No functional change intended. Signed-off-by: Sean Christopherson Link: https://lore.kernel.org/r/20221128225735.3291648-3-seanjc@google.com --- tools/testing/selftests/kvm/lib/x86_64/processor.c | 64 +++++++++++----------- 1 file changed, 32 insertions(+), 32 deletions(-) (limited to 'tools/testing/selftests/kvm') diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c index aac7b32a794b..23067465c035 100644 --- a/tools/testing/selftests/kvm/lib/x86_64/processor.c +++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c @@ -552,38 +552,6 @@ static void vcpu_setup(struct kvm_vm *vm, struct kvm_vcpu *vcpu) vcpu_sregs_set(vcpu, &sregs); } -void __vm_xsave_require_permission(int bit, const char *name) -{ - int kvm_fd; - u64 bitmask; - long rc; - struct kvm_device_attr attr = { - .group = 0, - .attr = KVM_X86_XCOMP_GUEST_SUPP, - .addr = (unsigned long) &bitmask - }; - - kvm_fd = open_kvm_dev_path_or_exit(); - rc = __kvm_ioctl(kvm_fd, KVM_GET_DEVICE_ATTR, &attr); - close(kvm_fd); - - if (rc == -1 && (errno == ENXIO || errno == EINVAL)) - __TEST_REQUIRE(0, "KVM_X86_XCOMP_GUEST_SUPP not supported"); - - TEST_ASSERT(rc == 0, "KVM_GET_DEVICE_ATTR(0, KVM_X86_XCOMP_GUEST_SUPP) error: %ld", rc); - - __TEST_REQUIRE(bitmask & (1ULL << bit), - "Required XSAVE feature '%s' not supported", name); - - TEST_REQUIRE(!syscall(SYS_arch_prctl, ARCH_REQ_XCOMP_GUEST_PERM, bit)); - - rc = syscall(SYS_arch_prctl, ARCH_GET_XCOMP_GUEST_PERM, &bitmask); - TEST_ASSERT(rc == 0, "prctl(ARCH_GET_XCOMP_GUEST_PERM) error: %ld", rc); - TEST_ASSERT(bitmask & (1ULL << bit), - "prctl(ARCH_REQ_XCOMP_GUEST_PERM) failure bitmask=0x%lx", - bitmask); -} - void kvm_arch_vm_post_create(struct kvm_vm *vm) { vm_create_irqchip(vm); @@ -705,6 +673,38 @@ uint64_t kvm_get_feature_msr(uint64_t msr_index) return buffer.entry.data; } +void __vm_xsave_require_permission(int bit, const char *name) +{ + int kvm_fd; + u64 bitmask; + long rc; + struct kvm_device_attr attr = { + .group = 0, + .attr = KVM_X86_XCOMP_GUEST_SUPP, + .addr = (unsigned long) &bitmask + }; + + kvm_fd = open_kvm_dev_path_or_exit(); + rc = __kvm_ioctl(kvm_fd, KVM_GET_DEVICE_ATTR, &attr); + close(kvm_fd); + + if (rc == -1 && (errno == ENXIO || errno == EINVAL)) + __TEST_REQUIRE(0, "KVM_X86_XCOMP_GUEST_SUPP not supported"); + + TEST_ASSERT(rc == 0, "KVM_GET_DEVICE_ATTR(0, KVM_X86_XCOMP_GUEST_SUPP) error: %ld", rc); + + __TEST_REQUIRE(bitmask & (1ULL << bit), + "Required XSAVE feature '%s' not supported", name); + + TEST_REQUIRE(!syscall(SYS_arch_prctl, ARCH_REQ_XCOMP_GUEST_PERM, bit)); + + rc = syscall(SYS_arch_prctl, ARCH_GET_XCOMP_GUEST_PERM, &bitmask); + TEST_ASSERT(rc == 0, "prctl(ARCH_GET_XCOMP_GUEST_PERM) error: %ld", rc); + TEST_ASSERT(bitmask & (1ULL << bit), + "prctl(ARCH_REQ_XCOMP_GUEST_PERM) failure bitmask=0x%lx", + bitmask); +} + void vcpu_init_cpuid(struct kvm_vcpu *vcpu, const struct kvm_cpuid2 *cpuid) { TEST_ASSERT(cpuid != vcpu->cpuid, "@cpuid can't be the vCPU's CPUID"); -- cgit From cd5f3d210095347e9d40a9a1d464f5ee0bb5d7f2 Mon Sep 17 00:00:00 2001 From: Sean Christopherson Date: Mon, 28 Nov 2022 22:57:34 +0000 Subject: KVM: selftests: Disallow "get supported CPUID" before REQ_XCOMP_GUEST_PERM Disallow using kvm_get_supported_cpuid() and thus caching KVM's supported CPUID info before enabling XSAVE-managed features that are off-by-default and must be enabled by ARCH_REQ_XCOMP_GUEST_PERM. Caching the supported CPUID before all XSAVE features are enabled can result in false negatives due to testing features that were cached before they were enabled. Signed-off-by: Sean Christopherson Link: https://lore.kernel.org/r/20221128225735.3291648-4-seanjc@google.com --- tools/testing/selftests/kvm/lib/x86_64/processor.c | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-) (limited to 'tools/testing/selftests/kvm') diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c index 23067465c035..1d3829e652e6 100644 --- a/tools/testing/selftests/kvm/lib/x86_64/processor.c +++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c @@ -601,21 +601,24 @@ void vcpu_arch_free(struct kvm_vcpu *vcpu) free(vcpu->cpuid); } +/* Do not use kvm_supported_cpuid directly except for validity checks. */ +static void *kvm_supported_cpuid; + const struct kvm_cpuid2 *kvm_get_supported_cpuid(void) { - static struct kvm_cpuid2 *cpuid; int kvm_fd; - if (cpuid) - return cpuid; + if (kvm_supported_cpuid) + return kvm_supported_cpuid; - cpuid = allocate_kvm_cpuid2(MAX_NR_CPUID_ENTRIES); + kvm_supported_cpuid = allocate_kvm_cpuid2(MAX_NR_CPUID_ENTRIES); kvm_fd = open_kvm_dev_path_or_exit(); - kvm_ioctl(kvm_fd, KVM_GET_SUPPORTED_CPUID, cpuid); + kvm_ioctl(kvm_fd, KVM_GET_SUPPORTED_CPUID, + (struct kvm_cpuid2 *)kvm_supported_cpuid); close(kvm_fd); - return cpuid; + return kvm_supported_cpuid; } static uint32_t __kvm_cpu_has(const struct kvm_cpuid2 *cpuid, @@ -684,6 +687,9 @@ void __vm_xsave_require_permission(int bit, const char *name) .addr = (unsigned long) &bitmask }; + TEST_ASSERT(!kvm_supported_cpuid, + "kvm_get_supported_cpuid() cannot be used before ARCH_REQ_XCOMP_GUEST_PERM"); + kvm_fd = open_kvm_dev_path_or_exit(); rc = __kvm_ioctl(kvm_fd, KVM_GET_DEVICE_ATTR, &attr); close(kvm_fd); -- cgit From 553d1652b8615b5ae3080bb1a561207aee87fa85 Mon Sep 17 00:00:00 2001 From: Sean Christopherson Date: Mon, 28 Nov 2022 22:57:35 +0000 Subject: KVM: selftests: Do kvm_cpu_has() checks before creating VM+vCPU Move the AMX test's kvm_cpu_has() checks before creating the VM+vCPU, there are no dependencies between the two operations. Opportunistically add a comment to call out that enabling off-by-default XSAVE-managed features must be done before KVM_GET_SUPPORTED_CPUID is cached. Signed-off-by: Sean Christopherson Link: https://lore.kernel.org/r/20221128225735.3291648-5-seanjc@google.com --- tools/testing/selftests/kvm/x86_64/amx_test.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) (limited to 'tools/testing/selftests/kvm') diff --git a/tools/testing/selftests/kvm/x86_64/amx_test.c b/tools/testing/selftests/kvm/x86_64/amx_test.c index 1256c7faadd3..bd72c6eb3b67 100644 --- a/tools/testing/selftests/kvm/x86_64/amx_test.c +++ b/tools/testing/selftests/kvm/x86_64/amx_test.c @@ -249,17 +249,21 @@ int main(int argc, char *argv[]) u32 amx_offset; int stage, ret; + /* + * Note, all off-by-default features must be enabled before anything + * caches KVM_GET_SUPPORTED_CPUID, e.g. before using kvm_cpu_has(). + */ vm_xsave_require_permission(XSTATE_XTILE_DATA_BIT); - /* Create VM */ - vm = vm_create_with_one_vcpu(&vcpu, guest_code); - TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_XFD)); TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_XSAVE)); TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_AMX_TILE)); TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_XTILECFG)); TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_XTILEDATA)); + /* Create VM */ + vm = vm_create_with_one_vcpu(&vcpu, guest_code); + TEST_ASSERT(kvm_cpu_has_p(X86_PROPERTY_XSTATE_MAX_SIZE), "KVM should enumerate max XSAVE size when XSAVE is supported"); xsave_restore_size = kvm_cpu_property(X86_PROPERTY_XSTATE_MAX_SIZE); -- cgit From 0c3265235fc17e78773025ed0ddc7c0324b6ed89 Mon Sep 17 00:00:00 2001 From: Sean Christopherson Date: Tue, 22 Nov 2022 01:33:09 +0000 Subject: KVM: selftests: Define and use a custom static assert in lib headers MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Define and use kvm_static_assert() in the common KVM selftests headers to provide deterministic behavior, and to allow creating static asserts without dummy messages. The kernel's static_assert() makes the message param optional, and on the surface, tools/include/linux/build_bug.h appears to follow suit. However, glibc may override static_assert() and redefine it as a direct alias of _Static_assert(), which makes the message parameter mandatory. This leads to non-deterministic behavior as KVM selftests code that utilizes static_assert() without a custom message may or not compile depending on the order of includes. E.g. recently added asserts in x86_64/processor.h fail on some systems with errors like In file included from lib/memstress.c:11:0: include/x86_64/processor.h: In function ‘this_cpu_has_p’: include/x86_64/processor.h:193:34: error: expected ‘,’ before ‘)’ token static_assert(low_bit < high_bit); \ ^ due to _Static_assert() expecting a comma before a message. The "message optional" version of static_assert() uses macro magic to strip away the comma when presented with empty an __VA_ARGS__ #ifndef static_assert #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr) #define __static_assert(expr, msg, ...) _Static_assert(expr, msg) #endif // static_assert and effectively generates "_Static_assert(expr, #expr)". The incompatible version of static_assert() gets defined by this snippet in /usr/include/assert.h: #if defined __USE_ISOC11 && !defined __cplusplus # undef static_assert # define static_assert _Static_assert #endif which yields "_Static_assert(expr)" and thus fails as above. KVM selftests don't actually care about using C11, but __USE_ISOC11 gets defined because of _GNU_SOURCE, which many tests do #define. _GNU_SOURCE triggers a massive pile of defines in /usr/include/features.h, including _ISOC11_SOURCE: /* If _GNU_SOURCE was defined by the user, turn on all the other features. */ #ifdef _GNU_SOURCE # undef _ISOC95_SOURCE # define _ISOC95_SOURCE 1 # undef _ISOC99_SOURCE # define _ISOC99_SOURCE 1 # undef _ISOC11_SOURCE # define _ISOC11_SOURCE 1 # undef _POSIX_SOURCE # define _POSIX_SOURCE 1 # undef _POSIX_C_SOURCE # define _POSIX_C_SOURCE 200809L # undef _XOPEN_SOURCE # define _XOPEN_SOURCE 700 # undef _XOPEN_SOURCE_EXTENDED # define _XOPEN_SOURCE_EXTENDED 1 # undef _LARGEFILE64_SOURCE # define _LARGEFILE64_SOURCE 1 # undef _DEFAULT_SOURCE # define _DEFAULT_SOURCE 1 # undef _ATFILE_SOURCE # define _ATFILE_SOURCE 1 #endif which further down in /usr/include/features.h leads to: /* This is to enable the ISO C11 extension. */ #if (defined _ISOC11_SOURCE \ || (defined __STDC_VERSION__ && __STDC_VERSION__ >= 201112L)) # define __USE_ISOC11 1 #endif To make matters worse, /usr/include/assert.h doesn't guard against multiple inclusion by turning itself into a nop, but instead #undefs a few macros and continues on. As a result, it's all but impossible to ensure the "message optional" version of static_assert() will actually be used, e.g. explicitly including assert.h and #undef'ing static_assert() doesn't work as a later inclusion of assert.h will again redefine its version. #ifdef _ASSERT_H # undef _ASSERT_H # undef assert # undef __ASSERT_VOID_CAST # ifdef __USE_GNU # undef assert_perror # endif #endif /* assert.h */ #define _ASSERT_H 1 #include Fixes: fcba483e8246 ("KVM: selftests: Sanity check input to ioctls() at build time") Fixes: ee3795536664 ("KVM: selftests: Refactor X86_FEATURE_* framework to prep for X86_PROPERTY_*") Fixes: 53a7dc0f215e ("KVM: selftests: Add X86_PROPERTY_* framework to retrieve CPUID values") Signed-off-by: Sean Christopherson Link: https://lore.kernel.org/r/20221122013309.1872347-1-seanjc@google.com --- .../testing/selftests/kvm/include/kvm_util_base.h | 14 +++++++++++++- .../selftests/kvm/include/x86_64/processor.h | 22 +++++++++++----------- 2 files changed, 24 insertions(+), 12 deletions(-) (limited to 'tools/testing/selftests/kvm') diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h index c7685c7038ff..9fa0d340f291 100644 --- a/tools/testing/selftests/kvm/include/kvm_util_base.h +++ b/tools/testing/selftests/kvm/include/kvm_util_base.h @@ -22,6 +22,18 @@ #include "sparsebit.h" +/* + * Provide a version of static_assert() that is guaranteed to have an optional + * message param. If _ISOC11_SOURCE is defined, glibc (/usr/include/assert.h) + * #undefs and #defines static_assert() as a direct alias to _Static_assert(), + * i.e. effectively makes the message mandatory. Many KVM selftests #define + * _GNU_SOURCE for various reasons, and _GNU_SOURCE implies _ISOC11_SOURCE. As + * a result, static_assert() behavior is non-deterministic and may or may not + * require a message depending on #include order. + */ +#define __kvm_static_assert(expr, msg, ...) _Static_assert(expr, msg) +#define kvm_static_assert(expr, ...) __kvm_static_assert(expr, ##__VA_ARGS__, #expr) + #define KVM_DEV_PATH "/dev/kvm" #define KVM_MAX_VCPUS 512 @@ -196,7 +208,7 @@ static inline bool kvm_has_cap(long cap) #define kvm_do_ioctl(fd, cmd, arg) \ ({ \ - static_assert(!_IOC_SIZE(cmd) || sizeof(*arg) == _IOC_SIZE(cmd), ""); \ + kvm_static_assert(!_IOC_SIZE(cmd) || sizeof(*arg) == _IOC_SIZE(cmd)); \ ioctl(fd, cmd, arg); \ }) diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h index 22852bd32d7b..411549ef4947 100644 --- a/tools/testing/selftests/kvm/include/x86_64/processor.h +++ b/tools/testing/selftests/kvm/include/x86_64/processor.h @@ -72,11 +72,11 @@ struct kvm_x86_cpu_feature { .bit = __bit, \ }; \ \ - static_assert((fn & 0xc0000000) == 0 || \ - (fn & 0xc0000000) == 0x40000000 || \ - (fn & 0xc0000000) == 0x80000000 || \ - (fn & 0xc0000000) == 0xc0000000); \ - static_assert(idx < BIT(sizeof(feature.index) * BITS_PER_BYTE)); \ + kvm_static_assert((fn & 0xc0000000) == 0 || \ + (fn & 0xc0000000) == 0x40000000 || \ + (fn & 0xc0000000) == 0x80000000 || \ + (fn & 0xc0000000) == 0xc0000000); \ + kvm_static_assert(idx < BIT(sizeof(feature.index) * BITS_PER_BYTE)); \ feature; \ }) @@ -191,12 +191,12 @@ struct kvm_x86_cpu_property { .hi_bit = high_bit, \ }; \ \ - static_assert(low_bit < high_bit); \ - static_assert((fn & 0xc0000000) == 0 || \ - (fn & 0xc0000000) == 0x40000000 || \ - (fn & 0xc0000000) == 0x80000000 || \ - (fn & 0xc0000000) == 0xc0000000); \ - static_assert(idx < BIT(sizeof(property.index) * BITS_PER_BYTE)); \ + kvm_static_assert(low_bit < high_bit); \ + kvm_static_assert((fn & 0xc0000000) == 0 || \ + (fn & 0xc0000000) == 0x40000000 || \ + (fn & 0xc0000000) == 0x80000000 || \ + (fn & 0xc0000000) == 0xc0000000); \ + kvm_static_assert(idx < BIT(sizeof(property.index) * BITS_PER_BYTE)); \ property; \ }) -- cgit From ef16b2dff4d1c71eb32b306d400d4c0f3a383ba7 Mon Sep 17 00:00:00 2001 From: Sean Christopherson Date: Sat, 19 Nov 2022 01:34:44 +0000 Subject: KVM: arm64: selftests: Enable single-step without a "full" ucall() Add a new ucall hook, GUEST_UCALL_NONE(), to allow tests to make ucalls without allocating a ucall struct, and use it to enable single-step in ARM's debug-exceptions test. Like the disable single-step path, the enabling path also needs to ensure that no exclusive access sequences are attempted after enabling single-step, as the exclusive monitor is cleared on ERET from the debug exception taken to EL2. The test currently "works" because clear_bit() isn't actually an atomic operation... yet. Signed-off-by: Sean Christopherson Message-Id: <20221119013450.2643007-4-seanjc@google.com> Signed-off-by: Paolo Bonzini --- .../selftests/kvm/aarch64/debug-exceptions.c | 21 +++++++++++---------- tools/testing/selftests/kvm/include/ucall_common.h | 8 ++++++++ 2 files changed, 19 insertions(+), 10 deletions(-) (limited to 'tools/testing/selftests/kvm') diff --git a/tools/testing/selftests/kvm/aarch64/debug-exceptions.c b/tools/testing/selftests/kvm/aarch64/debug-exceptions.c index d86c4e4d1c82..c62ec4d7f6a3 100644 --- a/tools/testing/selftests/kvm/aarch64/debug-exceptions.c +++ b/tools/testing/selftests/kvm/aarch64/debug-exceptions.c @@ -239,10 +239,6 @@ static void guest_svc_handler(struct ex_regs *regs) svc_addr = regs->pc; } -enum single_step_op { - SINGLE_STEP_ENABLE = 0, -}; - static void guest_code_ss(int test_cnt) { uint64_t i; @@ -253,8 +249,16 @@ static void guest_code_ss(int test_cnt) w_bvr = i << 2; w_wvr = i << 2; - /* Enable Single Step execution */ - GUEST_SYNC(SINGLE_STEP_ENABLE); + /* + * Enable Single Step execution. Note! This _must_ be a bare + * ucall as the ucall() path uses atomic operations to manage + * the ucall structures, and the built-in "atomics" are usually + * implemented via exclusive access instructions. The exlusive + * monitor is cleared on ERET, and so taking debug exceptions + * during a LDREX=>STREX sequence will prevent forward progress + * and hang the guest/test. + */ + GUEST_UCALL_NONE(); /* * The userspace will verify that the pc is as expected during @@ -356,12 +360,9 @@ void test_single_step_from_userspace(int test_cnt) break; } - TEST_ASSERT(cmd == UCALL_SYNC, + TEST_ASSERT(cmd == UCALL_NONE, "Unexpected ucall cmd 0x%lx", cmd); - TEST_ASSERT(uc.args[1] == SINGLE_STEP_ENABLE, - "Unexpected ucall action 0x%lx", uc.args[1]); - debug.control = KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_SINGLESTEP; ss_enable = true; diff --git a/tools/testing/selftests/kvm/include/ucall_common.h b/tools/testing/selftests/kvm/include/ucall_common.h index bdd373189a77..1a6aaef5ccae 100644 --- a/tools/testing/selftests/kvm/include/ucall_common.h +++ b/tools/testing/selftests/kvm/include/ucall_common.h @@ -35,6 +35,14 @@ void ucall(uint64_t cmd, int nargs, ...); uint64_t get_ucall(struct kvm_vcpu *vcpu, struct ucall *uc); void ucall_init(struct kvm_vm *vm, vm_paddr_t mmio_gpa); +/* + * Perform userspace call without any associated data. This bare call avoids + * allocating a ucall struct, which can be useful if the atomic operations in + * the full ucall() are problematic and/or unwanted. Note, this will come out + * as UCALL_NONE on the backend. + */ +#define GUEST_UCALL_NONE() ucall_arch_do_ucall((vm_vaddr_t)NULL) + #define GUEST_SYNC_ARGS(stage, arg1, arg2, arg3, arg4) \ ucall(UCALL_SYNC, 6, "hello", stage, arg1, arg2, arg3, arg4) #define GUEST_SYNC(stage) ucall(UCALL_SYNC, 2, "hello", stage) -- cgit From 03a0c819e71755398d59993b9adee203544617d5 Mon Sep 17 00:00:00 2001 From: Sean Christopherson Date: Sat, 19 Nov 2022 01:34:47 +0000 Subject: KVM: selftests: Use non-atomic clear/set bit helpers in KVM tests Use the dedicated non-atomic helpers for {clear,set}_bit() and their test variants, i.e. the double-underscore versions. Depsite being defined in atomic.h, and despite the kernel versions being atomic in the kernel, tools' {clear,set}_bit() helpers aren't actually atomic. Move to the double-underscore versions so that the versions that are expected to be atomic (for kernel developers) can be made atomic without affecting users that don't want atomic operations. Leave the usage in ucall_free() as-is, it's the one place in tools/ that actually wants/needs atomic behavior. Signed-off-by: Sean Christopherson Message-Id: <20221119013450.2643007-7-seanjc@google.com> Signed-off-by: Paolo Bonzini --- tools/testing/selftests/kvm/aarch64/arch_timer.c | 2 +- tools/testing/selftests/kvm/dirty_log_test.c | 34 +++++++++++----------- tools/testing/selftests/kvm/x86_64/hyperv_evmcs.c | 4 +-- .../testing/selftests/kvm/x86_64/hyperv_svm_test.c | 4 +-- 4 files changed, 22 insertions(+), 22 deletions(-) (limited to 'tools/testing/selftests/kvm') diff --git a/tools/testing/selftests/kvm/aarch64/arch_timer.c b/tools/testing/selftests/kvm/aarch64/arch_timer.c index f2a96779716a..26556a266021 100644 --- a/tools/testing/selftests/kvm/aarch64/arch_timer.c +++ b/tools/testing/selftests/kvm/aarch64/arch_timer.c @@ -222,7 +222,7 @@ static void *test_vcpu_run(void *arg) /* Currently, any exit from guest is an indication of completion */ pthread_mutex_lock(&vcpu_done_map_lock); - set_bit(vcpu_idx, vcpu_done_map); + __set_bit(vcpu_idx, vcpu_done_map); pthread_mutex_unlock(&vcpu_done_map_lock); switch (get_ucall(vcpu, &uc)) { diff --git a/tools/testing/selftests/kvm/dirty_log_test.c b/tools/testing/selftests/kvm/dirty_log_test.c index a38c4369fb8e..a75548865f6b 100644 --- a/tools/testing/selftests/kvm/dirty_log_test.c +++ b/tools/testing/selftests/kvm/dirty_log_test.c @@ -44,20 +44,20 @@ # define BITOP_LE_SWIZZLE ((BITS_PER_LONG-1) & ~0x7) # define test_bit_le(nr, addr) \ test_bit((nr) ^ BITOP_LE_SWIZZLE, addr) -# define set_bit_le(nr, addr) \ - set_bit((nr) ^ BITOP_LE_SWIZZLE, addr) -# define clear_bit_le(nr, addr) \ - clear_bit((nr) ^ BITOP_LE_SWIZZLE, addr) -# define test_and_set_bit_le(nr, addr) \ - test_and_set_bit((nr) ^ BITOP_LE_SWIZZLE, addr) -# define test_and_clear_bit_le(nr, addr) \ - test_and_clear_bit((nr) ^ BITOP_LE_SWIZZLE, addr) +# define __set_bit_le(nr, addr) \ + __set_bit((nr) ^ BITOP_LE_SWIZZLE, addr) +# define __clear_bit_le(nr, addr) \ + __clear_bit((nr) ^ BITOP_LE_SWIZZLE, addr) +# define __test_and_set_bit_le(nr, addr) \ + __test_and_set_bit((nr) ^ BITOP_LE_SWIZZLE, addr) +# define __test_and_clear_bit_le(nr, addr) \ + __test_and_clear_bit((nr) ^ BITOP_LE_SWIZZLE, addr) #else -# define test_bit_le test_bit -# define set_bit_le set_bit -# define clear_bit_le clear_bit -# define test_and_set_bit_le test_and_set_bit -# define test_and_clear_bit_le test_and_clear_bit +# define test_bit_le test_bit +# define __set_bit_le __set_bit +# define __clear_bit_le __clear_bit +# define __test_and_set_bit_le __test_and_set_bit +# define __test_and_clear_bit_le __test_and_clear_bit #endif #define TEST_DIRTY_RING_COUNT 65536 @@ -305,7 +305,7 @@ static uint32_t dirty_ring_collect_one(struct kvm_dirty_gfn *dirty_gfns, TEST_ASSERT(cur->offset < num_pages, "Offset overflow: " "0x%llx >= 0x%x", cur->offset, num_pages); //pr_info("fetch 0x%x page %llu\n", *fetch_index, cur->offset); - set_bit_le(cur->offset, bitmap); + __set_bit_le(cur->offset, bitmap); dirty_ring_last_page = cur->offset; dirty_gfn_set_collected(cur); (*fetch_index)++; @@ -560,7 +560,7 @@ static void vm_dirty_log_verify(enum vm_guest_mode mode, unsigned long *bmap) value_ptr = host_test_mem + page * host_page_size; /* If this is a special page that we were tracking... */ - if (test_and_clear_bit_le(page, host_bmap_track)) { + if (__test_and_clear_bit_le(page, host_bmap_track)) { host_track_next_count++; TEST_ASSERT(test_bit_le(page, bmap), "Page %"PRIu64" should have its dirty bit " @@ -568,7 +568,7 @@ static void vm_dirty_log_verify(enum vm_guest_mode mode, unsigned long *bmap) page); } - if (test_and_clear_bit_le(page, bmap)) { + if (__test_and_clear_bit_le(page, bmap)) { bool matched; host_dirty_count++; @@ -661,7 +661,7 @@ static void vm_dirty_log_verify(enum vm_guest_mode mode, unsigned long *bmap) * should report its dirtyness in the * next run */ - set_bit_le(page, host_bmap_track); + __set_bit_le(page, host_bmap_track); } } } diff --git a/tools/testing/selftests/kvm/x86_64/hyperv_evmcs.c b/tools/testing/selftests/kvm/x86_64/hyperv_evmcs.c index ba09d300c953..af29e5776d40 100644 --- a/tools/testing/selftests/kvm/x86_64/hyperv_evmcs.c +++ b/tools/testing/selftests/kvm/x86_64/hyperv_evmcs.c @@ -142,7 +142,7 @@ void guest_code(struct vmx_pages *vmx_pages, struct hyperv_test_pages *hv_pages, /* Intercept RDMSR 0xc0000100 */ vmwrite(CPU_BASED_VM_EXEC_CONTROL, vmreadz(CPU_BASED_VM_EXEC_CONTROL) | CPU_BASED_USE_MSR_BITMAPS); - set_bit(MSR_FS_BASE & 0x1fff, vmx_pages->msr + 0x400); + __set_bit(MSR_FS_BASE & 0x1fff, vmx_pages->msr + 0x400); GUEST_ASSERT(!vmresume()); GUEST_ASSERT(vmreadz(VM_EXIT_REASON) == EXIT_REASON_MSR_READ); current_evmcs->guest_rip += 2; /* rdmsr */ @@ -154,7 +154,7 @@ void guest_code(struct vmx_pages *vmx_pages, struct hyperv_test_pages *hv_pages, current_evmcs->guest_rip += 2; /* rdmsr */ /* Intercept RDMSR 0xc0000101 without telling KVM about it */ - set_bit(MSR_GS_BASE & 0x1fff, vmx_pages->msr + 0x400); + __set_bit(MSR_GS_BASE & 0x1fff, vmx_pages->msr + 0x400); /* Make sure HV_VMX_ENLIGHTENED_CLEAN_FIELD_MSR_BITMAP is set */ current_evmcs->hv_clean_fields |= HV_VMX_ENLIGHTENED_CLEAN_FIELD_MSR_BITMAP; GUEST_ASSERT(!vmresume()); diff --git a/tools/testing/selftests/kvm/x86_64/hyperv_svm_test.c b/tools/testing/selftests/kvm/x86_64/hyperv_svm_test.c index 3b3cc94ba8e4..68a7d354ea07 100644 --- a/tools/testing/selftests/kvm/x86_64/hyperv_svm_test.c +++ b/tools/testing/selftests/kvm/x86_64/hyperv_svm_test.c @@ -103,7 +103,7 @@ static void __attribute__((__flatten__)) guest_code(struct svm_test_data *svm, /* Intercept RDMSR 0xc0000100 */ vmcb->control.intercept |= 1ULL << INTERCEPT_MSR_PROT; - set_bit(2 * (MSR_FS_BASE & 0x1fff), svm->msr + 0x800); + __set_bit(2 * (MSR_FS_BASE & 0x1fff), svm->msr + 0x800); run_guest(vmcb, svm->vmcb_gpa); GUEST_ASSERT(vmcb->control.exit_code == SVM_EXIT_MSR); vmcb->save.rip += 2; /* rdmsr */ @@ -115,7 +115,7 @@ static void __attribute__((__flatten__)) guest_code(struct svm_test_data *svm, vmcb->save.rip += 2; /* rdmsr */ /* Intercept RDMSR 0xc0000101 without telling KVM about it */ - set_bit(2 * (MSR_GS_BASE & 0x1fff), svm->msr + 0x800); + __set_bit(2 * (MSR_GS_BASE & 0x1fff), svm->msr + 0x800); /* Make sure HV_VMX_ENLIGHTENED_CLEAN_FIELD_MSR_BITMAP is set */ vmcb->control.clean |= HV_VMCB_NESTED_ENLIGHTENMENTS; run_guest(vmcb, svm->vmcb_gpa); -- cgit From 36293352ff433061d45d52784983e44950c09ae3 Mon Sep 17 00:00:00 2001 From: Sean Christopherson Date: Sat, 19 Nov 2022 01:34:49 +0000 Subject: tools: Drop "atomic_" prefix from atomic test_and_set_bit() Drop the "atomic_" prefix from tools' atomic_test_and_set_bit() to match the kernel nomenclature where test_and_set_bit() is atomic, and __test_and_set_bit() provides the non-atomic variant. Signed-off-by: Sean Christopherson Message-Id: <20221119013450.2643007-9-seanjc@google.com> Signed-off-by: Paolo Bonzini --- tools/testing/selftests/kvm/lib/ucall_common.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'tools/testing/selftests/kvm') diff --git a/tools/testing/selftests/kvm/lib/ucall_common.c b/tools/testing/selftests/kvm/lib/ucall_common.c index fcae96461e46..820ce6c82829 100644 --- a/tools/testing/selftests/kvm/lib/ucall_common.c +++ b/tools/testing/selftests/kvm/lib/ucall_common.c @@ -44,7 +44,7 @@ static struct ucall *ucall_alloc(void) GUEST_ASSERT(ucall_pool); for (i = 0; i < KVM_MAX_VCPUS; ++i) { - if (!atomic_test_and_set_bit(i, ucall_pool->in_use)) { + if (!test_and_set_bit(i, ucall_pool->in_use)) { uc = &ucall_pool->ucalls[i]; memset(uc->args, 0, sizeof(uc->args)); return uc; -- cgit From 4bf46e35826d8dc4fc0a103dd0ccd94c072a4c6a Mon Sep 17 00:00:00 2001 From: Colin Ian King Date: Thu, 1 Dec 2022 09:13:54 +0000 Subject: KVM: selftests: Fix spelling mistake "probabalistic" -> "probabilistic" There is a spelling mistake in some help text. Fix it. Signed-off-by: Colin Ian King Message-Id: <20221201091354.1613652-1-colin.i.king@gmail.com> Signed-off-by: Paolo Bonzini --- tools/testing/selftests/kvm/dirty_log_perf_test.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'tools/testing/selftests/kvm') diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c index c33e89012ae6..e9d6d1aecf89 100644 --- a/tools/testing/selftests/kvm/dirty_log_perf_test.c +++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c @@ -398,7 +398,7 @@ static void help(char *name) printf(" -x: Split the memory region into this number of memslots.\n" " (default: 1)\n"); printf(" -w: specify the percentage of pages which should be written to\n" - " as an integer from 0-100 inclusive. This is probabalistic,\n" + " as an integer from 0-100 inclusive. This is probabilistic,\n" " so -w X means each page has an X%% chance of writing\n" " and a (100-X)%% chance of reading.\n" " (default: 100 i.e. all pages are written to.)\n"); -- cgit