summaryrefslogtreecommitdiff
path: root/arch/arm64/kvm/hyp/switch.c
AgeCommit message (Collapse)Author
2018-03-19KVM: arm64: Introduce separate VHE/non-VHE sysreg save/restore functionsChristoffer Dall
As we are about to handle system registers quite differently between VHE and non-VHE systems. In preparation for that, we need to split some of the handling functions between VHE and non-VHE functionality. For now, we simply copy the non-VHE functions, but we do change the use of static keys for VHE and non-VHE functionality now that we have separate functions. Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19KVM: arm64: Remove noop calls to timer save/restore from VHE switchChristoffer Dall
The VHE switch function calls __timer_enable_traps and __timer_disable_traps which don't do anything on VHE systems. Therefore, simply remove these calls from the VHE switch function and make the functions non-conditional as they are now only called from the non-VHE switch path. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19KVM: arm64: Don't deactivate VM on VHE systemsChristoffer Dall
There is no need to reset the VTTBR to zero when exiting the guest on VHE systems. VHE systems don't use stage 2 translations for the EL2&0 translation regime used by the host. Reviewed-by: Andrew Jones <drjones@redhat.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19KVM: arm64: Remove kern_hyp_va() use in VHE switch functionChristoffer Dall
VHE kernels run completely in EL2 and therefore don't have a notion of kernel and hyp addresses, they are all just kernel addresses. Therefore don't call kern_hyp_va() in the VHE switch function. Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19KVM: arm64: Introduce VHE-specific kvm_vcpu_runChristoffer Dall
So far this is mostly (see below) a copy of the legacy non-VHE switch function, but we will start reworking these functions in separate directions to work on VHE and non-VHE in the most optimal way in later patches. The only difference after this patch between the VHE and non-VHE run functions is that we omit the branch-predictor variant-2 hardening for QC Falkor CPUs, because this workaround is specific to a series of non-VHE ARMv8.0 CPUs. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19KVM: arm64: Factor out fault info population and gic workaroundsChristoffer Dall
The current world-switch function has functionality to detect a number of cases where we need to fixup some part of the exit condition and possibly run the guest again, before having restored the host state. This includes populating missing fault info, emulating GICv2 CPU interface accesses when mapped at unaligned addresses, and emulating the GICv3 CPU interface on systems that need it. As we are about to have an alternative switch function for VHE systems, but VHE systems still need the same early fixup logic, factor out this logic into a separate function that can be shared by both switch functions. No functional change. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19KVM: arm64: Improve debug register save/restore flowChristoffer Dall
Instead of having multiple calls from the world switch path to the debug logic, each figuring out if the dirty bit is set and if we should save/restore the debug registers, let's just provide two hooks to the debug save/restore functionality, one for switching to the guest context, and one for switching to the host context, and we get the benefit of only having to evaluate the dirty flag once on each path, plus we give the compiler some more room to inline some of this functionality. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19KVM: arm/arm64: Introduce vcpu_el1_is_32bitChristoffer Dall
We have numerous checks around that checks if the HCR_EL2 has the RW bit set to figure out if we're running an AArch64 or AArch32 VM. In some cases, directly checking the RW bit (given its unintuitive name), is a bit confusing, and that's not going to improve as we move logic around for the following patches that optimize KVM on AArch64 hosts with VHE. Therefore, introduce a helper, vcpu_el1_is_32bit, and replace existing direct checks of HCR_EL2.RW with the helper. Reviewed-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Julien Thierry <julien.thierry@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19KVM: arm/arm64: Get rid of vcpu->arch.irq_linesChristoffer Dall
We currently have a separate read-modify-write of the HCR_EL2 on entry to the guest for the sole purpose of setting the VF and VI bits, if set. Since this is most rarely the case (only when using userspace IRQ chip and interrupts are in flight), let's get rid of this operation and instead modify the bits in the vcpu->arch.hcr[_el2] directly when needed. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19KVM: arm64: Move HCR_INT_OVERRIDE to default HCR_EL2 guest flagShih-Wei Li
We always set the IMO and FMO bits in the HCR_EL2 when running the guest, regardless if we use the vgic or not. By moving these flags to HCR_GUEST_FLAGS we can avoid one of the extra save/restore operations of HCR_EL2 in the world switch code, and we can also soon get rid of the other one. This is safe, because even though the IMO and FMO bits control both taking the interrupts to EL2 and remapping ICC_*_EL1 to ICV_*_EL1 when executed at EL1, as long as we ensure that these bits are clear when running the EL1 host, we're OK, because we reset the HCR_EL2 to only have the HCR_RW bit set when returning to EL1 on non-VHE systems. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Shih-Wei Li <shihwei@cs.columbia.edu> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19KVM: arm64: Rework hyp_panic for VHE and non-VHEChristoffer Dall
VHE actually doesn't rely on clearing the VTTBR when returning to the host kernel, and that is the current key mechanism of hyp_panic to figure out how to attempt to return to a state good enough to print a panic statement. Therefore, we split the hyp_panic function into two functions, a VHE and a non-VHE, keeping the non-VHE version intact, but changing the VHE behavior. The vttbr_el2 check on VHE doesn't really make that much sense, because the only situation where we can get here on VHE is when the hypervisor assembly code actually called into hyp_panic, which only happens when VBAR_EL2 has been set to the KVM exception vectors. On VHE, we can always safely disable the traps and restore the host registers at this point, so we simply do that unconditionally and call into the panic function directly. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-19KVM: arm64: Avoid storing the vcpu pointer on the stackChristoffer Dall
We already have the percpu area for the host cpu state, which points to the VCPU, so there's no need to store the VCPU pointer on the stack on every context switch. We can be a little more clever and just use tpidr_el2 for the percpu offset and load the VCPU pointer from the host context. This has the benefit of being able to retrieve the host context even when our stack is corrupted, and it has a potential performance benefit because we trade a store plus a load for an mrs and a load on a round trip to the guest. This does require us to calculate the percpu offset without including the offset from the kernel mapping of the percpu array to the linear mapping of the array (which is what we store in tpidr_el1), because a PC-relative generated address in EL2 is already giving us the hyp alias of the linear mapping of a kernel address. We do this in __cpu_init_hyp_mode() by using kvm_ksym_ref(). The code that accesses ESR_EL2 was previously using an alternative to use the _EL1 accessor on VHE systems, but this was actually unnecessary as the _EL1 accessor aliases the ESR_EL2 register on VHE, and the _EL2 accessor does the same thing on both systems. Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-02-26arm64: KVM: Move CPU ID reg trap setup off the world switch pathDave Martin
The HCR_EL2.TID3 flag needs to be set when trapping guest access to the CPU ID registers is required. However, the decision about whether to set this bit does not need to be repeated at every switch to the guest. Instead, it's sufficient to make this decision once and record the outcome. This patch moves the decision to vcpu_reset_hcr() and records the choice made in vcpu->arch.hcr_el2. The world switch code can then load this directly when switching to the guest without the need for conditional logic on the critical path. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Suggested-by: Christoffer Dall <christoffer.dall@linaro.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2018-02-12arm64: Add missing Falkor part number for branch predictor hardeningShanker Donthineni
References to CPU part number MIDR_QCOM_FALKOR were dropped from the mailing list patch due to mainline/arm64 branch dependency. So this patch adds the missing part number. Fixes: ec82b567a74f ("arm64: Implement branch predictor hardening for Falkor") Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-02-10Merge tag 'kvm-4.16-1' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Radim Krčmář: "ARM: - icache invalidation optimizations, improving VM startup time - support for forwarded level-triggered interrupts, improving performance for timers and passthrough platform devices - a small fix for power-management notifiers, and some cosmetic changes PPC: - add MMIO emulation for vector loads and stores - allow HPT guests to run on a radix host on POWER9 v2.2 CPUs without requiring the complex thread synchronization of older CPU versions - improve the handling of escalation interrupts with the XIVE interrupt controller - support decrement register migration - various cleanups and bugfixes. s390: - Cornelia Huck passed maintainership to Janosch Frank - exitless interrupts for emulated devices - cleanup of cpuflag handling - kvm_stat counter improvements - VSIE improvements - mm cleanup x86: - hypervisor part of SEV - UMIP, RDPID, and MSR_SMI_COUNT emulation - paravirtualized TLB shootdown using the new KVM_VCPU_PREEMPTED bit - allow guests to see TOPOEXT, GFNI, VAES, VPCLMULQDQ, and more AVX512 features - show vcpu id in its anonymous inode name - many fixes and cleanups - per-VCPU MSR bitmaps (already merged through x86/pti branch) - stable KVM clock when nesting on Hyper-V (merged through x86/hyperv)" * tag 'kvm-4.16-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (197 commits) KVM: PPC: Book3S: Add MMIO emulation for VMX instructions KVM: PPC: Book3S HV: Branch inside feature section KVM: PPC: Book3S HV: Make HPT resizing work on POWER9 KVM: PPC: Book3S HV: Fix handling of secondary HPTEG in HPT resizing code KVM: PPC: Book3S PR: Fix broken select due to misspelling KVM: x86: don't forget vcpu_put() in kvm_arch_vcpu_ioctl_set_sregs() KVM: PPC: Book3S PR: Fix svcpu copying with preemption enabled KVM: PPC: Book3S HV: Drop locks before reading guest memory kvm: x86: remove efer_reload entry in kvm_vcpu_stat KVM: x86: AMD Processor Topology Information x86/kvm/vmx: do not use vm-exit instruction length for fast MMIO when running nested kvm: embed vcpu id to dentry of vcpu anon inode kvm: Map PFN-type memory regions as writable (if possible) x86/kvm: Make it compile on 32bit and with HYPYERVISOR_GUEST=n KVM: arm/arm64: Fixup userspace irqchip static key optimization KVM: arm/arm64: Fix userspace_irqchip_in_use counting KVM: arm/arm64: Fix incorrect timer_is_pending logic MAINTAINERS: update KVM/s390 maintainers MAINTAINERS: add Halil as additional vfio-ccw maintainer MAINTAINERS: add David as a reviewer for KVM/s390 ...
2018-02-06arm64: Kill PSCI_GET_VERSION as a variant-2 workaroundMarc Zyngier
Now that we've standardised on SMCCC v1.1 to perform the branch prediction invalidation, let's drop the previous band-aid. If vendors haven't updated their firmware to do SMCCC 1.1, they haven't updated PSCI either, so we don't loose anything. Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-02-06arm/arm64: KVM: Turn kvm_psci_version into a static inlineMarc Zyngier
We're about to need kvm_psci_version in HYP too. So let's turn it into a static inline, and pass the kvm structure as a second parameter (so that HYP can do a kern_hyp_va on it). Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-01-16KVM: arm64: Save ESR_EL2 on guest SErrorJames Morse
When we exit a guest due to an SError the vcpu fault info isn't updated with the ESR. Today this is only done for traps. The v8.2 RAS Extensions define ISS values for SError. Update the vcpu's fault_info with the ESR on SError so that handle_exit() can determine if this was a RAS SError and decode its severity. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-01-16KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2.James Morse
Prior to v8.2's RAS Extensions, the HCR_EL2.VSE 'virtual SError' feature generated an SError with an implementation defined ESR_EL1.ISS, because we had no mechanism to specify the ESR value. On Juno this generates an all-zero ESR, the most significant bit 'ISV' is clear indicating the remainder of the ISS field is invalid. With the RAS Extensions we have a mechanism to specify this value, and the most significant bit has a new meaning: 'IDS - Implementation Defined Syndrome'. An all-zero SError ESR now means: 'RAS error: Uncategorized' instead of 'no valid ISS'. Add KVM support for the VSESR_EL2 register to specify an ESR value when HCR_EL2.VSE generates a virtual SError. Change kvm_inject_vabt() to specify an implementation-defined value. We only need to restore the VSESR_EL2 value when HCR_EL2.VSE is set, KVM save/restores this bit during __{,de}activate_traps() and hardware clears the bit once the guest has consumed the virtual-SError. Future patches may add an API (or KVM CAP) to pend a virtual SError with a specified ESR. Cc: Dongjiu Geng <gengdongjiu@huawei.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-01-13KVM: arm64: Change hyp_panic()s dependency on tpidr_el2James Morse
Make tpidr_el2 a cpu-offset for per-cpu variables in the same way the host uses tpidr_el1. This lets tpidr_el{1,2} have the same value, and on VHE they can be the same register. KVM calls hyp_panic() when anything unexpected happens. This may occur while a guest owns the EL1 registers. KVM stashes the vcpu pointer in tpidr_el2, which it uses to find the host context in order to restore the host EL1 registers before parachuting into the host's panic(). The host context is a struct kvm_cpu_context allocated in the per-cpu area, and mapped to hyp. Given the per-cpu offset for this CPU, this is easy to find. Change hyp_panic() to take a pointer to the struct kvm_cpu_context. Wrap these calls with an asm function that retrieves the struct kvm_cpu_context from the host's per-cpu area. Copy the per-cpu offset from the hosts tpidr_el1 into tpidr_el2 during kvm init. (Later patches will make this unnecessary for VHE hosts) We print out the vcpu pointer as part of the panic message. Add a back reference to the 'running vcpu' in the host cpu context to preserve this. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-01-08arm64: Implement branch predictor hardening for FalkorShanker Donthineni
Falkor is susceptible to branch predictor aliasing and can theoretically be attacked by malicious code. This patch implements a mitigation for these attacks, preventing any malicious entries from affecting other victim contexts. Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org> [will: fix label name when !CONFIG_KVM and remove references to MIDR_FALKOR] Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-01-08arm64: KVM: Make PSCI_VERSION a fast pathMarc Zyngier
For those CPUs that require PSCI to perform a BP invalidation, going all the way to the PSCI code for not much is a waste of precious cycles. Let's terminate that call as early as possible. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-01-08arm64: KVM: Use per-CPU vector when BP hardening is enabledMarc Zyngier
Now that we have per-CPU vectors, let's plug then in the KVM/arm64 code. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-01-08KVM: arm/arm64: Detangle kvm_mmu.h from kvm_hyp.hMarc Zyngier
kvm_hyp.h has an odd dependency on kvm_mmu.h, which makes the opposite inclusion impossible. Let's start with breaking that useless dependency. Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2017-11-29kvm: arm64: handle single-step of hyp emulated mmio instructionsAlex Bennée
There is a fast-path of MMIO emulation inside hyp mode. The handling of single-step is broadly the same as kvm_arm_handle_step_debug() except we just setup ESR/HSR so handle_exit() does the correct thing as we exit. For the case of an emulated illegal access causing an SError we will exit via the ARM_EXCEPTION_EL1_SERROR path in handle_exit(). We behave as we would during a real SError and clear the DBG_SPSR_SS bit for the emulated instruction. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2017-11-16Merge tag 'kvm-4.15-1' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Radim Krčmář: "First batch of KVM changes for 4.15 Common: - Python 3 support in kvm_stat - Accounting of slabs to kmemcg ARM: - Optimized arch timer handling for KVM/ARM - Improvements to the VGIC ITS code and introduction of an ITS reset ioctl - Unification of the 32-bit fault injection logic - More exact external abort matching logic PPC: - Support for running hashed page table (HPT) MMU mode on a host that is using the radix MMU mode; single threaded mode on POWER 9 is added as a pre-requisite - Resolution of merge conflicts with the last second 4.14 HPT fixes - Fixes and cleanups s390: - Some initial preparation patches for exitless interrupts and crypto - New capability for AIS migration - Fixes x86: - Improved emulation of LAPIC timer mode changes, MCi_STATUS MSRs, and after-reset state - Refined dependencies for VMX features - Fixes for nested SMI injection - A lot of cleanups" * tag 'kvm-4.15-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (89 commits) KVM: s390: provide a capability for AIS state migration KVM: s390: clear_io_irq() requests are not expected for adapter interrupts KVM: s390: abstract conversion between isc and enum irq_types KVM: s390: vsie: use common code functions for pinning KVM: s390: SIE considerations for AP Queue virtualization KVM: s390: document memory ordering for kvm_s390_vcpu_wakeup KVM: PPC: Book3S HV: Cosmetic post-merge cleanups KVM: arm/arm64: fix the incompatible matching for external abort KVM: arm/arm64: Unify 32bit fault injection KVM: arm/arm64: vgic-its: Implement KVM_DEV_ARM_ITS_CTRL_RESET KVM: arm/arm64: Document KVM_DEV_ARM_ITS_CTRL_RESET KVM: arm/arm64: vgic-its: Free caches when GITS_BASER Valid bit is cleared KVM: arm/arm64: vgic-its: New helper functions to free the caches KVM: arm/arm64: vgic-its: Remove kvm_its_unmap_device arm/arm64: KVM: Load the timer state when enabling the timer KVM: arm/arm64: Rework kvm_timer_should_fire KVM: arm/arm64: Get rid of kvm_timer_flush_hwstate KVM: arm/arm64: Avoid phys timer emulation in vcpu entry/exit KVM: arm/arm64: Move phys_timer_emulate function KVM: arm/arm64: Use kvm_arm_timer_set/get_reg for guest register traps ...
2017-11-06KVM: arm/arm64: Move timer save/restore out of the hyp codeChristoffer Dall
As we are about to be lazy with saving and restoring the timer registers, we prepare by moving all possible timer configuration logic out of the hyp code. All virtual timer registers can be programmed from EL1 and since the arch timer is always a level triggered interrupt we can safely do this with interrupts disabled in the host kernel on the way to the guest without taking vtimer interrupts in the host kernel (yet). The downside is that the cntvoff register can only be programmed from hyp mode, so we jump into hyp mode and back to program it. This is also safe, because the host kernel doesn't use the virtual timer in the KVM code. It may add a little performance performance penalty, but only until following commits where we move this operation to vcpu load/put. Signed-off-by: Christoffer Dall <cdall@linaro.org> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
2017-11-03arm64/sve: KVM: Prevent guests from using SVEDave Martin
Until KVM has full SVE support, guests must not be allowed to execute SVE instructions. This patch enables the necessary traps, and also ensures that the traps are disabled again on exit from the guest so that the host can still use SVE if it wants to. On guest exit, high bits of the SVE Zn registers may have been clobbered as a side-effect the execution of FPSIMD instructions in the guest. The existing KVM host FPSIMD restore code is not sufficient to restore these bits, so this patch explicitly marks the CPU as not containing cached vector state for any task, thus forcing a reload on the next return to userspace. This is an interim measure, in advance of adding full SVE awareness to KVM. This marking of cached vector state in the CPU as invalid is done using __this_cpu_write(fpsimd_last_state, NULL) in fpsimd.c. Due to the repeated use of this rather obscure operation, it makes sense to factor it out as a separate helper with a clearer name. This patch factors it out as fpsimd_flush_cpu_state(), and ports all callers to use it. As a side effect of this refactoring, a this_cpu_write() in fpsimd_cpu_pm_notifier() is changed to __this_cpu_write(). This should be fine, since cpu_pm_enter() is supposed to be called only with interrupts disabled. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-03arm64: KVM: Hide unsupported AArch64 CPU features from guestsDave Martin
Currently, a guest kernel sees the true CPU feature registers (ID_*_EL1) when it reads them using MRS instructions. This means that the guest may observe features that are present in the hardware but the host doesn't understand or doesn't provide support for. A guest may legimitately try to use such a feature as per the architecture, but use of the feature may trap instead of working normally, triggering undef injection into the guest. This is not a problem for the host, but the guest may go wrong when running on newer hardware than the host knows about. This patch hides from guest VMs any AArch64-specific CPU features that the host doesn't support, by exposing to the guest the sanitised versions of the registers computed by the cpufeatures framework, instead of the true hardware registers. To achieve this, HCR_EL2.TID3 is now set for AArch64 guests, and emulation code is added to KVM to report the sanitised versions of the affected registers in response to MRS and register reads from userspace. The affected registers are removed from invariant_sys_regs[] (since the invariant_sys_regs handling is no longer quite correct for them) and added to sys_reg_desgs[], with appropriate access(), get_user() and set_user() methods. No runtime vcpu storage is allocated for the registers: instead, they are read on demand from the cpufeatures framework. This may need modification in the future if there is a need for userspace to customise the features visible to the guest. Attempts by userspace to write the registers are handled similarly to the current invariant_sys_regs handling: writes are permitted, but only if they don't attempt to change the value. This is sufficient to support VM snapshot/restore from userspace. Because of the additional registers, restoring a VM on an older kernel may not work unless userspace knows how to handle the extra VM registers exposed to the KVM user ABI by this patch. Under the principle of least damage, this patch makes no attempt to handle any of the other registers currently in invariant_sys_regs[], or to emulate registers for AArch32: however, these could be handled in a similar way in future, as necessary. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15KVM: arm64: vgic-v3: Add hook to handle guest GICv3 sysreg accesses at EL2Marc Zyngier
In order to start handling guest access to GICv3 system registers, let's add a hook that will get called when we trap a system register access. This is gated by a new static key (vgic_v3_cpuif_trap). Tested-by: Alexander Graf <agraf@suse.de> Acked-by: David Daney <david.daney@cavium.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-05-16KVM: arm64: Restore host physical timer access on hyp_panic()James Morse
When KVM panics, it hurridly restores the host context and parachutes into the host's panic() code. At some point panic() touches the physical timer/counter. Unless we are an arm64 system with VHE, this traps back to EL2. If we're lucky, we panic again. Add a __timer_save_state() call to KVMs hyp_panic() path, this saves the guest registers and disables the traps for the host. Fixes: 53fd5b6487e4 ("arm64: KVM: Add panic handling") Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-02-02arm64: KVM: Save/restore the host SPE state when entering/leaving a VMWill Deacon
The SPE buffer is virtually addressed, using the page tables of the CPU MMU. Unusually, this means that the EL0/1 page table may be live whilst we're executing at EL2 on non-VHE configurations. When VHE is in use, we can use the same property to profile the guest behind its back. This patch adds the relevant disabling and flushing code to KVM so that the host can make use of SPE without corrupting guest memory, and any attempts by a guest to use SPE will result in a trap. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Alex Bennée <alex.bennee@linaro.org> Cc: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-12-13Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: - struct thread_info moved off-stack (also touching include/linux/thread_info.h and include/linux/restart_block.h) - cpus_have_cap() reworked to avoid __builtin_constant_p() for static key use (also touching drivers/irqchip/irq-gic-v3.c) - uprobes support (currently only for native 64-bit tasks) - Emulation of kernel Privileged Access Never (PAN) using TTBR0_EL1 switching to a reserved page table - CPU capacity information passing via DT or sysfs (used by the scheduler) - support for systems without FP/SIMD (IOW, kernel avoids touching these registers; there is no soft-float ABI, nor kernel emulation for AArch64 FP/SIMD) - handling of hardware watchpoint with unaligned addresses, varied lengths and offsets from base - use of the page table contiguous hint for kernel mappings - hugetlb fixes for sizes involving the contiguous hint - remove unnecessary I-cache invalidation in flush_cache_range() - CNTHCTL_EL2 access fix for CPUs with VHE support (ARMv8.1) - boot-time checks for writable+executable kernel mappings - simplify asm/opcodes.h and avoid including the 32-bit ARM counterpart and make the arm64 kernel headers self-consistent (Xen headers patch merged separately) - Workaround for broken .inst support in certain binutils versions * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (60 commits) arm64: Disable PAN on uaccess_enable() arm64: Work around broken .inst when defective gas is detected arm64: Add detection code for broken .inst support in binutils arm64: Remove reference to asm/opcodes.h arm64: Get rid of asm/opcodes.h arm64: smp: Prevent raw_smp_processor_id() recursion arm64: head.S: Fix CNTHCTL_EL2 access on VHE system arm64: Remove I-cache invalidation from flush_cache_range() arm64: Enable HIBERNATION in defconfig arm64: Enable CONFIG_ARM64_SW_TTBR0_PAN arm64: xen: Enable user access before a privcmd hvc call arm64: Handle faults caused by inadvertent user access with PAN enabled arm64: Disable TTBR0_EL1 during normal kernel execution arm64: Introduce uaccess_{disable,enable} functionality based on TTBR0_EL1 arm64: Factor out TTBR0_EL1 post-update workaround into a specific asm macro arm64: Factor out PAN enabling/disabling into separate uaccess_* macros arm64: Update the synchronous external abort fault description selftests: arm64: add test for unaligned/inexact watchpoint handling arm64: Allow hw watchpoint of length 3,5,6 and 7 arm64: hw_breakpoint: Handle inexact watchpoint addresses ...
2016-12-09arm64: KVM: pmu: Reset PMSELR_EL0.SEL to a sane value before entering the guestMarc Zyngier
The ARMv8 architecture allows the cycle counter to be configured by setting PMSELR_EL0.SEL==0x1f and then accessing PMXEVTYPER_EL0, hence accessing PMCCFILTR_EL0. But it disallows the use of PMSELR_EL0.SEL==0x1f to access the cycle counter itself through PMXEVCNTR_EL0. Linux itself doesn't violate this rule, but we may end up with PMSELR_EL0.SEL being set to 0x1f when we enter a guest. If that guest accesses PMXEVCNTR_EL0, the access may UNDEF at EL1, despite the guest not having done anything wrong. In order to avoid this unfortunate course of events (haha!), let's sanitize PMSELR_EL0 on guest entry. This ensures that the guest won't explode unexpectedly. Cc: stable@vger.kernel.org #4.6+ Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-11-16arm64: Support systems without FP/ASIMDSuzuki K Poulose
The arm64 kernel assumes that FP/ASIMD units are always present and accesses the FP/ASIMD specific registers unconditionally. This could cause problems when they are absent. This patch adds the support for kernel handling systems without FP/ASIMD by skipping the register access within the kernel. For kvm, we trap the accesses to FP/ASIMD and inject an undefined instruction exception to the VM. The callers of the exported kernel_neon_begin_partial() should make sure that the FP/ASIMD is supported. Cc: Will Deacon <will.deacon@arm.com> Cc: Christoffer Dall <christoffer.dall@linaro.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> [catalin.marinas@arm.com: add comment on the ARM64_HAS_NO_FPSIMD conflict and the new location] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-09-22arm64: KVM: Use static keys for selecting the GIC backendVladimir Murzin
Currently GIC backend is selected via alternative framework and this is fine. We are going to introduce vgic-v3 to 32-bit world and there we don't have patching framework in hand, so we can either check support for GICv3 every time we need to choose which backend to use or try to optimise it by using static keys. The later looks quite promising because we can share logic involved in selecting GIC backend between architectures if both uses static keys. This patch moves arm64 from alternative to static keys framework for selecting GIC backend. For that we embed static key into vgic_global and enable the key during vgic initialisation based on what has already been exposed by the host GIC driver. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-09-08arm64: KVM: Inject a vSerror if detecting a bad GICV access at EL2Marc Zyngier
If, when proxying a GICV access at EL2, we detect that the guest is doing something silly, report an EL1 SError instead ofgnoring the access. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-09-08arm64: KVM: Handle async aborts delivered while at EL2Marc Zyngier
If EL1 generates an asynchronous abort and then traps into EL2 before the abort has been delivered, we may end-up with the abort firing at the worse possible place: on the host. In order to avoid this, it is necessary to take the abort at EL2, by clearing the PSTATE.A bit. In order to survive this abort, we do it at a point where we're in a known state with respect to the world switch, and handle the resulting exception, overloading the exit code in the process. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-09-08arm64: KVM: Preserve pending vSError in world switchMarc Zyngier
The HCR_EL2.VSE bit is used to signal an SError to a guest, and has the peculiar feature of getting cleared when the guest has taken the abort (this is the only bit that behaves as such in this register). This means that if we signal such an abort, we must leave it in the guest context until it disappears from HCR_EL2, and at which point it must be cleared from the context. This is achieved by reading back from HCR_EL2 until the guest takes the fault. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-09-08arm64: KVM: vgic-v2: Add the GICV emulation infrastructureMarc Zyngier
In order to efficiently perform the GICV access on behalf of the guest, we need to be able to avoid going back all the way to the host kernel. For this, we introduce a new hook in the world switch code, conveniently placed just after populating the fault info. At that point, we only have saved/restored the GP registers, and we can quickly perform all the required checks (data abort, translation fault, valid faulting syndrome, not an external abort, not a PTW). Coming back from the emulation code, we need to skip the emulated instruction. This involves an additional bit of save/restore in order to be able to access the guest's PC (and possibly CPSR if this is a 32bit guest). At this stage, no emulation code is provided. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-09-08KVM: arm/arm64: Get rid of exported aliases to static functionsChristoffer Dall
When rewriting the assembly code to C code, it was useful to have exported aliases or static functions so that we could keep the existing common C code unmodified and at the same time rewrite arm64 from assembly to C code, and later do the arm part. Now when both are done, we really don't need this level of indirection anymore, and it's time to save a few lines and brain cells. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-08-17arm64: Document workaround for Cortex-A72 erratum #853709Marc Zyngier
We already have a workaround for Cortex-A57 erratum #852523, but Cortex-A72 r0p0 to r0p2 do suffer from the same issue (known as erratum #853709). Let's document the fact that we already handle this. Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-08-02Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: - ARM: GICv3 ITS emulation and various fixes. Removal of the old VGIC implementation. - s390: support for trapping software breakpoints, nested virtualization (vSIE), the STHYI opcode, initial extensions for CPU model support. - MIPS: support for MIPS64 hosts (32-bit guests only) and lots of cleanups, preliminary to this and the upcoming support for hardware virtualization extensions. - x86: support for execute-only mappings in nested EPT; reduced vmexit latency for TSC deadline timer (by about 30%) on Intel hosts; support for more than 255 vCPUs. - PPC: bugfixes. * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (302 commits) KVM: PPC: Introduce KVM_CAP_PPC_HTM MIPS: Select HAVE_KVM for MIPS64_R{2,6} MIPS: KVM: Reset CP0_PageMask during host TLB flush MIPS: KVM: Fix ptr->int cast via KVM_GUEST_KSEGX() MIPS: KVM: Sign extend MFC0/RDHWR results MIPS: KVM: Fix 64-bit big endian dynamic translation MIPS: KVM: Fail if ebase doesn't fit in CP0_EBase MIPS: KVM: Use 64-bit CP0_EBase when appropriate MIPS: KVM: Set CP0_Status.KX on MIPS64 MIPS: KVM: Make entry code MIPS64 friendly MIPS: KVM: Use kmap instead of CKSEG0ADDR() MIPS: KVM: Use virt_to_phys() to get commpage PFN MIPS: Fix definition of KSEGX() for 64-bit KVM: VMX: Add VMCS to CPU's loaded VMCSs before VMPTRLD kvm: x86: nVMX: maintain internal copy of current VMCS KVM: PPC: Book3S HV: Save/restore TM state in H_CEDE KVM: PPC: Book3S HV: Pull out TM state save/restore into separate procedures KVM: arm64: vgic-its: Simplify MAPI error handling KVM: arm64: vgic-its: Make vgic_its_cmd_handle_mapi similar to other handlers KVM: arm64: vgic-its: Turn device_id validation into generic ID validation ...
2016-07-03arm64: KVM: Always reference __hyp_panic_string via its kernel VAMarc Zyngier
__hyp_panic_string is passed via the HYP panic code to the panic function, and is being "upgraded" to a kernel address, as it is referenced by the HYP code (in a PC-relative way). This is a bit silly, and we'd be better off obtaining the kernel address and not mess with it at all. This patch implements this with a tiny bit of asm glue, by forcing the string pointer to be read from the literal pool. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-06-21arm64/kvm: use ESR_ELx_EC to extract ECMark Rutland
Now that we have a helper to extract the EC from an ESR_ELx value, make use of this in the arm64 KVM code for simplicity and consistency. There should be no functional changes as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Dave P Martin <dave.martin@arm.com> Cc: Huang Shijie <shijie.huang@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: kvmarm@lists.cs.columbia.edu Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-29arm64: KVM: Add access handler for PMUSERENR registerShannon Zhao
This register resets as unknown in 64bit mode while it resets as zero in 32bit mode. Here we choose to reset it as zero for consistency. PMUSERENR_EL0 holds some bits which decide whether PMU registers can be accessed from EL0. Add some check helpers to handle the access from EL0. When these bits are zero, only reading PMUSERENR will trap to EL2 and writing PMUSERENR or reading/writing other PMU registers will trap to EL1 other than EL2 when HCR.TGE==0. To current KVM configuration (HCR.TGE==0) there is no way to get these traps. Here we write 0xf to physical PMUSERENR register on VM entry, so that it will trap PMU access from EL0 to EL2. Within the register access handler we check the real value of guest PMUSERENR register to decide whether this access is allowed. If not allowed, return false to inject UND to guest. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29arm64: KVM: Move kvm/hyp/hyp.h to include/asm/kvm_hyp.hMarc Zyngier
In order to be able to move code outside of kvm/hyp, we need to make the global hyp.h file accessible from a standard location. include/asm/kvm_hyp.h seems good enough. Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29arm64: KVM: Move most of the fault decoding to CMarc Zyngier
The fault decoding process (including computing the IPA in the case of a permission fault) would be much better done in C code, as we have a reasonable infrastructure to deal with the VHE/non-VHE differences. Let's move the whole thing to C, including the workaround for erratum 834220, and just patch the odd ESR_EL2 access remaining in hyp-entry.S. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29arm64: KVM: VHE: Add alternative panic handlingMarc Zyngier
As the kernel fully runs in HYP when VHE is enabled, we can directly branch to the kernel's panic() implementation, and not perform an exception return. Add the alternative code to deal with this. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-29arm64: KVM: VHE: Implement VHE activate/deactivate_trapsMarc Zyngier
Running the kernel in HYP mode requires the HCR_E2H bit to be set at all times, and the HCR_TGE bit to be set when running as a host (and cleared when running as a guest). At the same time, the vector must be set to the current role of the kernel (either host or hypervisor), and a couple of system registers differ between VHE and non-VHE. We implement these by using another set of alternate functions that get dynamically patched. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>