summaryrefslogtreecommitdiff
path: root/arch/arm64/kvm/hyp
AgeCommit message (Collapse)Author
2019-11-08Merge branches 'for-next/elf-hwcap-docs', 'for-next/smccc-conduit-cleanup', ↵Catalin Marinas
'for-next/zone-dma', 'for-next/relax-icc_pmr_el1-sync', 'for-next/double-page-fault', 'for-next/misc', 'for-next/kselftest-arm64-signal' and 'for-next/kaslr-diagnostics' into for-next/core * for-next/elf-hwcap-docs: : Update the arm64 ELF HWCAP documentation docs/arm64: cpu-feature-registers: Rewrite bitfields that don't follow [e, s] docs/arm64: cpu-feature-registers: Documents missing visible fields docs/arm64: elf_hwcaps: Document HWCAP_SB docs/arm64: elf_hwcaps: sort the HWCAP{, 2} documentation by ascending value * for-next/smccc-conduit-cleanup: : SMC calling convention conduit clean-up firmware: arm_sdei: use common SMCCC_CONDUIT_* firmware/psci: use common SMCCC_CONDUIT_* arm: spectre-v2: use arm_smccc_1_1_get_conduit() arm64: errata: use arm_smccc_1_1_get_conduit() arm/arm64: smccc/psci: add arm_smccc_1_1_get_conduit() * for-next/zone-dma: : Reintroduction of ZONE_DMA for Raspberry Pi 4 support arm64: mm: reserve CMA and crashkernel in ZONE_DMA32 dma/direct: turn ARCH_ZONE_DMA_BITS into a variable arm64: Make arm64_dma32_phys_limit static arm64: mm: Fix unused variable warning in zone_sizes_init mm: refresh ZONE_DMA and ZONE_DMA32 comments in 'enum zone_type' arm64: use both ZONE_DMA and ZONE_DMA32 arm64: rename variables used to calculate ZONE_DMA32's size arm64: mm: use arm64_dma_phys_limit instead of calling max_zone_dma_phys() * for-next/relax-icc_pmr_el1-sync: : Relax ICC_PMR_EL1 (GICv3) accesses when ICC_CTLR_EL1.PMHE is clear arm64: Document ICC_CTLR_EL3.PMHE setting requirements arm64: Relax ICC_PMR_EL1 accesses when ICC_CTLR_EL1.PMHE is clear * for-next/double-page-fault: : Avoid a double page fault in __copy_from_user_inatomic() if hw does not support auto Access Flag mm: fix double page fault on arm64 if PTE_AF is cleared x86/mm: implement arch_faults_on_old_pte() stub on x86 arm64: mm: implement arch_faults_on_old_pte() on arm64 arm64: cpufeature: introduce helper cpu_has_hw_af() * for-next/misc: : Various fixes and clean-ups arm64: kpti: Add NVIDIA's Carmel core to the KPTI whitelist arm64: mm: Remove MAX_USER_VA_BITS definition arm64: mm: simplify the page end calculation in __create_pgd_mapping() arm64: print additional fault message when executing non-exec memory arm64: psci: Reduce the waiting time for cpu_psci_cpu_kill() arm64: pgtable: Correct typo in comment arm64: docs: cpu-feature-registers: Document ID_AA64PFR1_EL1 arm64: cpufeature: Fix typos in comment arm64/mm: Poison initmem while freeing with free_reserved_area() arm64: use generic free_initrd_mem() arm64: simplify syscall wrapper ifdeffery * for-next/kselftest-arm64-signal: : arm64-specific kselftest support with signal-related test-cases kselftest: arm64: fake_sigreturn_misaligned_sp kselftest: arm64: fake_sigreturn_bad_size kselftest: arm64: fake_sigreturn_duplicated_fpsimd kselftest: arm64: fake_sigreturn_missing_fpsimd kselftest: arm64: fake_sigreturn_bad_size_for_magic0 kselftest: arm64: fake_sigreturn_bad_magic kselftest: arm64: add helper get_current_context kselftest: arm64: extend test_init functionalities kselftest: arm64: mangle_pstate_invalid_mode_el[123][ht] kselftest: arm64: mangle_pstate_invalid_daif_bits kselftest: arm64: mangle_pstate_invalid_compat_toggle and common utils kselftest: arm64: extend toplevel skeleton Makefile * for-next/kaslr-diagnostics: : Provide diagnostics on boot for KASLR arm64: kaslr: Check command line before looking for a seed arm64: kaslr: Announce KASLR status on boot
2019-10-28Merge branch 'kvm-arm64/erratum-1319367' of ↵Catalin Marinas
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into for-next/core Similarly to erratum 1165522 that affects Cortex-A76, A57 and A72 respectively suffer from errata 1319537 and 1319367, potentially resulting in TLB corruption if the CPU speculates an AT instruction while switching guests. The fix is slightly more involved since we don't have VHE to help us here, but the idea is the same: when switching a guest in, we must prevent any speculated AT from being able to parse the page tables until S2 is up and running. Only at this stage can we allow AT to take place. For this, we always restore the guest sysregs first, except for its SCTLR and TCR registers, which must be set with SCTLR.M=1 and TCR.EPD{0,1} = {1, 1}, effectively disabling the PTW and TLB allocation. Once S2 is setup, we restore the guest's SCTLR and TCR. Similar things must be done on TLB invalidation... * 'kvm-arm64/erratum-1319367' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms: arm64: Enable and document ARM errata 1319367 and 1319537 arm64: KVM: Prevent speculative S1 PTW when restoring vcpu context arm64: KVM: Disable EL1 PTW when invalidating S2 TLBs arm64: KVM: Reorder system register restoration and stage-2 activation arm64: Add ARM64_WORKAROUND_1319367 for all A57 and A72 versions
2019-10-26arm64: KVM: Prevent speculative S1 PTW when restoring vcpu contextMarc Zyngier
When handling erratum 1319367, we must ensure that the page table walker cannot parse the S1 page tables while the guest is in an inconsistent state. This is done as follows: On guest entry: - TCR_EL1.EPD{0,1} are set, ensuring that no PTW can occur - all system registers are restored, except for TCR_EL1 and SCTLR_EL1 - stage-2 is restored - SCTLR_EL1 and TCR_EL1 are restored On guest exit: - SCTLR_EL1.M and TCR_EL1.EPD{0,1} are set, ensuring that no PTW can occur - stage-2 is disabled - All host system registers are restored Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2019-10-26arm64: KVM: Disable EL1 PTW when invalidating S2 TLBsMarc Zyngier
When erratum 1319367 is being worked around, special care must be taken not to allow the page table walker to populate TLBs while we have the stage-2 translation enabled (which would otherwise result in a bizare mix of the host S1 and the guest S2). We enforce this by setting TCR_EL1.EPD{0,1} before restoring the S2 configuration, and clear the same bits after having disabled S2. Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2019-10-26arm64: KVM: Reorder system register restoration and stage-2 activationMarc Zyngier
In order to prepare for handling erratum 1319367, we need to make sure that all system registers (and most importantly the registers configuring the virtual memory) are set before we enable stage-2 translation. This results in a minor reorganisation of the load sequence, without any functional change. Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2019-10-15arm64: Relax ICC_PMR_EL1 accesses when ICC_CTLR_EL1.PMHE is clearMarc Zyngier
The GICv3 architecture specification is incredibly misleading when it comes to PMR and the requirement for a DSB. It turns out that this DSB is only required if the CPU interface sends an Upstream Control message to the redistributor in order to update the RD's view of PMR. This message is only sent when ICC_CTLR_EL1.PMHE is set, which isn't the case in Linux. It can still be set from EL3, so some special care is required. But the upshot is that in the (hopefuly large) majority of the cases, we can drop the DSB altogether. This relies on a new static key being set if the boot CPU has PMHE set. The drawback is that this static key has to be exported to modules. Cc: Will Deacon <will@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Julien Thierry <julien.thierry.kdev@gmail.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-10-08arm64: KVM: Trap VM ops when ARM64_WORKAROUND_CAVIUM_TX2_219_TVM is setMarc Zyngier
In order to workaround the TX2-219 erratum, it is necessary to trap TTBRx_EL1 accesses to EL2. This is done by setting HCR_EL2.TVM on guest entry, which has the side effect of trapping all the other VM-related sysregs as well. To minimize the overhead, a fast path is used so that we don't have to go all the way back to the main sysreg handling code, unless the rest of the hypervisor expects to see these accesses. Cc: <stable@vger.kernel.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-10-03Merge tag 'kvmarm-fixes-5.4-1' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm fixes for 5.4, take #1 - Remove the now obsolete hyp_alternate_select construct - Fix the TRACE_INCLUDE_PATH macro in the vgic code
2019-09-18Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "s390: - ioctl hardening - selftests ARM: - ITS translation cache - support for 512 vCPUs - various cleanups and bugfixes PPC: - various minor fixes and preparation x86: - bugfixes all over the place (posted interrupts, SVM, emulation corner cases, blocked INIT) - some IPI optimizations" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (75 commits) KVM: X86: Use IPI shorthands in kvm guest when support KVM: x86: Fix INIT signal handling in various CPU states KVM: VMX: Introduce exit reason for receiving INIT signal on guest-mode KVM: VMX: Stop the preemption timer during vCPU reset KVM: LAPIC: Micro optimize IPI latency kvm: Nested KVM MMUs need PAE root too KVM: x86: set ctxt->have_exception in x86_decode_insn() KVM: x86: always stop emulation on page fault KVM: nVMX: trace nested VM-Enter failures detected by H/W KVM: nVMX: add tracepoint for failed nested VM-Enter x86: KVM: svm: Fix a check in nested_svm_vmrun() KVM: x86: Return to userspace with internal error on unexpected exit reason KVM: x86: Add kvm_emulate_{rd,wr}msr() to consolidate VXM/SVM code KVM: x86: Refactor up kvm_{g,s}et_msr() to simplify callers doc: kvm: Fix return description of KVM_SET_MSRS KVM: X86: Tune PLE Window tracepoint KVM: VMX: Change ple_window type to unsigned int KVM: X86: Remove tailing newline for tracepoints KVM: X86: Trace vcpu_id for vmexit KVM: x86: Manually calculate reserved bits when loading PDPTRS ...
2019-09-16Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "Although there isn't tonnes of code in terms of line count, there are a fair few headline features which I've noted both in the tag and also in the merge commits when I pulled everything together. The part I'm most pleased with is that we had 35 contributors this time around, which feels like a big jump from the usual small group of core arm64 arch developers. Hopefully they all enjoyed it so much that they'll continue to contribute, but we'll see. It's probably worth highlighting that we've pulled in a branch from the risc-v folks which moves our CPU topology code out to where it can be shared with others. Summary: - 52-bit virtual addressing in the kernel - New ABI to allow tagged user pointers to be dereferenced by syscalls - Early RNG seeding by the bootloader - Improve robustness of SMP boot - Fix TLB invalidation in light of recent architectural clarifications - Support for i.MX8 DDR PMU - Remove direct LSE instruction patching in favour of static keys - Function error injection using kprobes - Support for the PPTT "thread" flag introduced by ACPI 6.3 - Move PSCI idle code into proper cpuidle driver - Relaxation of implicit I/O memory barriers - Build with RELR relocations when toolchain supports them - Numerous cleanups and non-critical fixes" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (114 commits) arm64: remove __iounmap arm64: atomics: Use K constraint when toolchain appears to support it arm64: atomics: Undefine internal macros after use arm64: lse: Make ARM64_LSE_ATOMICS depend on JUMP_LABEL arm64: asm: Kill 'asm/atomic_arch.h' arm64: lse: Remove unused 'alt_lse' assembly macro arm64: atomics: Remove atomic_ll_sc compilation unit arm64: avoid using hard-coded registers for LSE atomics arm64: atomics: avoid out-of-line ll/sc atomics arm64: Use correct ll/sc atomic constraints jump_label: Don't warn on __exit jump entries docs/perf: Add documentation for the i.MX8 DDR PMU perf/imx_ddr: Add support for AXI ID filtering arm64: kpti: ensure patched kernel text is fetched from PoU arm64: fix fixmap copy for 16K pages and 48-bit VA perf/smmuv3: Validate groups for global filtering perf/smmuv3: Validate group size arm64: Relax Documentation/arm64/tagged-pointers.rst arm64: kvm: Replace hardcoded '1' with SYS_PAR_EL1_F arm64: mm: Ignore spurious translation faults taken from the kernel ...
2019-09-09arm64: KVM: Replace hyp_alternate_select with has_vhe()Marc Zyngier
Given that the TLB invalidation path is pretty rarely used, there was never any advantage to using hyp_alternate_select() here. has_vhe(), being a glorified static key, is the right tool for the job. Off you go. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Andrew Jones <drjones@redhat.com>
2019-09-09arm64: KVM: Drop hyp_alternate_select for checking for ARM64_WORKAROUND_834220Marc Zyngier
There is no reason for using hyp_alternate_select when checking for ARM64_WORKAROUND_834220, as each of the capabilities is also backed by a static key. Just replace the KVM-specific construct with cpus_have_const_cap(ARM64_WORKAROUND_834220). Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Andrew Jones <drjones@redhat.com>
2019-08-27arm64: kvm: Replace hardcoded '1' with SYS_PAR_EL1_FWill Deacon
Now that we have a definition for the 'F' field of PAR_EL1, use that instead of coding the immediate directly. Acked-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-08-18arm64/kvm: Remove VMID rollover I-cache maintenanceMark Rutland
For VPIPT I-caches, we need I-cache maintenance on VMID rollover to avoid an ABA problem. Consider a single vCPU VM, with a pinned stage-2, running with an idmap VA->IPA and idmap IPA->PA. If we don't do maintenance on rollover: // VMID A Writes insn X to PA 0xF Invalidates PA 0xF (for VMID A) I$ contains [{A,F}->X] [VMID ROLLOVER] // VMID B Writes insn Y to PA 0xF Invalidates PA 0xF (for VMID B) I$ contains [{A,F}->X, {B,F}->Y] [VMID ROLLOVER] // VMID A I$ contains [{A,F}->X, {B,F}->Y] Unexpectedly hits stale I$ line {A,F}->X. However, for PIPT and VIPT I-caches, the VMID doesn't affect lookup or constrain maintenance. Given the VMID doesn't affect PIPT and VIPT I-caches, and given VMID rollover is independent of changes to stage-2 mappings, I-cache maintenance cannot be necessary on VMID rollover for PIPT or VIPT I-caches. This patch removes the maintenance on rollover for VIPT and PIPT I-caches. At the same time, the unnecessary colons are removed from the asm statement to make it more legible. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: James Morse <james.morse@arm.com> Cc: Julien Thierry <julien.thierry.kdev@gmail.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: kvmarm@lists.cs.columbia.edu Signed-off-by: Marc Zyngier <maz@kernel.org>
2019-07-29arm64: KVM: hyp: debug-sr: Mark expected switch fall-throughAnders Roxell
When fall-through warnings was enabled by default the following warnings was starting to show up: ../arch/arm64/kvm/hyp/debug-sr.c: In function ‘__debug_save_state’: ../arch/arm64/kvm/hyp/debug-sr.c:20:19: warning: this statement may fall through [-Wimplicit-fallthrough=] case 15: ptr[15] = read_debug(reg, 15); \ ../arch/arm64/kvm/hyp/debug-sr.c:113:2: note: in expansion of macro ‘save_debug’ save_debug(dbg->dbg_bcr, dbgbcr, brps); ^~~~~~~~~~ ../arch/arm64/kvm/hyp/debug-sr.c:21:2: note: here case 14: ptr[14] = read_debug(reg, 14); \ ^~~~ ../arch/arm64/kvm/hyp/debug-sr.c:113:2: note: in expansion of macro ‘save_debug’ save_debug(dbg->dbg_bcr, dbgbcr, brps); ^~~~~~~~~~ ../arch/arm64/kvm/hyp/debug-sr.c:21:19: warning: this statement may fall through [-Wimplicit-fallthrough=] case 14: ptr[14] = read_debug(reg, 14); \ ../arch/arm64/kvm/hyp/debug-sr.c:113:2: note: in expansion of macro ‘save_debug’ save_debug(dbg->dbg_bcr, dbgbcr, brps); ^~~~~~~~~~ ../arch/arm64/kvm/hyp/debug-sr.c:22:2: note: here case 13: ptr[13] = read_debug(reg, 13); \ ^~~~ ../arch/arm64/kvm/hyp/debug-sr.c:113:2: note: in expansion of macro ‘save_debug’ save_debug(dbg->dbg_bcr, dbgbcr, brps); ^~~~~~~~~~ Rework to add a 'Fall through' comment where the compiler warned about fall-through, hence silencing the warning. Fixes: d93512ef0f0e ("Makefile: Globally enable fall-through warning") Signed-off-by: Anders Roxell <anders.roxell@linaro.org> [maz: fixed commit message] Signed-off-by: Marc Zyngier <maz@kernel.org>
2019-07-12Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "ARM: - support for chained PMU counters in guests - improved SError handling - handle Neoverse N1 erratum #1349291 - allow side-channel mitigation status to be migrated - standardise most AArch64 system register accesses to msr_s/mrs_s - fix host MPIDR corruption on 32bit - selftests ckleanups x86: - PMU event {white,black}listing - ability for the guest to disable host-side interrupt polling - fixes for enlightened VMCS (Hyper-V pv nested virtualization), - new hypercall to yield to IPI target - support for passing cstate MSRs through to the guest - lots of cleanups and optimizations Generic: - Some txt->rST conversions for the documentation" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (128 commits) Documentation: virtual: Add toctree hooks Documentation: kvm: Convert cpuid.txt to .rst Documentation: virtual: Convert paravirt_ops.txt to .rst KVM: x86: Unconditionally enable irqs in guest context KVM: x86: PMU Event Filter kvm: x86: Fix -Wmissing-prototypes warnings KVM: Properly check if "page" is valid in kvm_vcpu_unmap KVM: arm/arm64: Initialise host's MPIDRs by reading the actual register KVM: LAPIC: Retry tune per-vCPU timer_advance_ns if adaptive tuning goes insane kvm: LAPIC: write down valid APIC registers KVM: arm64: Migrate _elx sysreg accessors to msr_s/mrs_s KVM: doc: Add API documentation on the KVM_REG_ARM_WORKAROUNDS register KVM: arm/arm64: Add save/restore support for firmware workaround state arm64: KVM: Propagate full Spectre v2 workaround state to KVM guests KVM: arm/arm64: Support chained PMU counters KVM: arm/arm64: Remove pmc->bitmask KVM: arm/arm64: Re-create event when setting counter value KVM: arm/arm64: Extract duplicated code to own function KVM: arm/arm64: Rename kvm_pmu_{enable/disable}_counter functions KVM: LAPIC: ARBPRI is a reserved register for x2APIC ...
2019-07-08Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: - arm64 support for syscall emulation via PTRACE_SYSEMU{,_SINGLESTEP} - Wire up VM_FLUSH_RESET_PERMS for arm64, allowing the core code to manage the permissions of executable vmalloc regions more strictly - Slight performance improvement by keeping softirqs enabled while touching the FPSIMD/SVE state (kernel_neon_begin/end) - Expose a couple of ARMv8.5 features to user (HWCAP): CondM (new XAFLAG and AXFLAG instructions for floating point comparison flags manipulation) and FRINT (rounding floating point numbers to integers) - Re-instate ARM64_PSEUDO_NMI support which was previously marked as BROKEN due to some bugs (now fixed) - Improve parking of stopped CPUs and implement an arm64-specific panic_smp_self_stop() to avoid warning on not being able to stop secondary CPUs during panic - perf: enable the ARM Statistical Profiling Extensions (SPE) on ACPI platforms - perf: DDR performance monitor support for iMX8QXP - cache_line_size() can now be set from DT or ACPI/PPTT if provided to cope with a system cache info not exposed via the CPUID registers - Avoid warning on hardware cache line size greater than ARCH_DMA_MINALIGN if the system is fully coherent - arm64 do_page_fault() and hugetlb cleanups - Refactor set_pte_at() to avoid redundant READ_ONCE(*ptep) - Ignore ACPI 5.1 FADTs reported as 5.0 (infer from the 'arm_boot_flags' introduced in 5.1) - CONFIG_RANDOMIZE_BASE now enabled in defconfig - Allow the selection of ARM64_MODULE_PLTS, currently only done via RANDOMIZE_BASE (and an erratum workaround), allowing modules to spill over into the vmalloc area - Make ZONE_DMA32 configurable * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (54 commits) perf: arm_spe: Enable ACPI/Platform automatic module loading arm_pmu: acpi: spe: Add initial MADT/SPE probing ACPI/PPTT: Add function to return ACPI 6.3 Identical tokens ACPI/PPTT: Modify node flag detection to find last IDENTICAL x86/entry: Simplify _TIF_SYSCALL_EMU handling arm64: rename dump_instr as dump_kernel_instr arm64/mm: Drop [PTE|PMD]_TYPE_FAULT arm64: Implement panic_smp_self_stop() arm64: Improve parking of stopped CPUs arm64: Expose FRINT capabilities to userspace arm64: Expose ARMv8.5 CondM capability to userspace arm64: defconfig: enable CONFIG_RANDOMIZE_BASE arm64: ARM64_MODULES_PLTS must depend on MODULES arm64: bpf: do not allocate executable memory arm64/kprobes: set VM_FLUSH_RESET_PERMS on kprobe instruction pages arm64/mm: wire up CONFIG_ARCH_HAS_SET_DIRECT_MAP arm64: module: create module allocations without exec permissions arm64: Allow user selection of ARM64_MODULE_PLTS acpi/arm64: ignore 5.1 FADTs that are reported as 5.0 arm64: Allow selecting Pseudo-NMI again ...
2019-07-05KVM: arm64: Migrate _elx sysreg accessors to msr_s/mrs_sDave Martin
Currently, the {read,write}_sysreg_el*() accessors for accessing particular ELs' sysregs in the presence of VHE rely on some local hacks and define their system register encodings in a way that is inconsistent with the core definitions in <asm/sysreg.h>. As a result, it is necessary to add duplicate definitions for any system register that already needs a definition in sysreg.h for other reasons. This is a bit of a maintenance headache, and the reasons for the _el*() accessors working the way they do is a bit historical. This patch gets rid of the shadow sysreg definitions in <asm/kvm_hyp.h>, converts the _el*() accessors to use the core __msr_s/__mrs_s interface, and converts all call sites to use the standard sysreg #define names (i.e., upper case, with SYS_ prefix). This patch will conflict heavily anyway, so the opportunity to clean up some bad whitespace in the context of the changes is taken. The change exposes a few system registers that have no sysreg.h definition, due to msr_s/mrs_s being used in place of msr/mrs: additions are made in order to fill in the gaps. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoffer Dall <christoffer.dall@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Link: https://www.spinics.net/lists/kvm-arm/msg31717.html [Rebased to v4.21-rc1] Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> [Rebased to v5.2-rc5, changelog updates] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-07-05KVM: arm64: Skip more of the SError vaxorcismJames Morse
During __guest_exit() we need to consume any SError left pending by the guest so it doesn't contaminate the host. With v8.2 we use the ESB-instruction. For systems without v8.2, we use dsb+isb and unmask SError. We do this on every guest exit. Use the same dsb+isr_el1 trick, this lets us know if an SError is pending after the dsb, allowing us to skip the isb and self-synchronising PSTATE write if its not. This means SError remains masked during KVM's world-switch, so any SError that occurs during this time is reported by the host, instead of causing a hyp-panic. As we're benchmarking this code lets polish the layout. If you give gcc likely()/unlikely() hints in an if() condition, it shuffles the generated assembly so that the likely case is immediately after the branch. Lets do the same here. Signed-off-by: James Morse <james.morse@arm.com> Changes since v2: * Added isb after the dsb to prevent an early read Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-07-05KVM: arm64: Re-mask SError after the one instruction windowJames Morse
KVM consumes any SError that were pending during guest exit with a dsb/isb and unmasking SError. It currently leaves SError unmasked for the rest of world-switch. This means any SError that occurs during this part of world-switch will cause a hyp-panic. We'd much prefer it to remain pending until we return to the host. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-07-05KVM: arm64: Defer guest entry when an asynchronous exception is pendingJames Morse
SError that occur during world-switch's entry to the guest will be accounted to the guest, as the exception is masked until we enter the guest... but we want to attribute the SError as precisely as possible. Reading DISR_EL1 before guest entry requires free registers, and using ESB+DISR_EL1 to consume and read back the ESR would leave KVM holding a host SError... We would rather leave the SError pending and let the host take it once we exit world-switch. To do this, we need to defer guest-entry if an SError is pending. Read the ISR to see if SError (or an IRQ) is pending. If so fake an exit. Place this check between __guest_enter()'s save of the host registers, and restore of the guest's. SError that occur between here and the eret into the guest must have affected the guest's registers, which we can naturally attribute to the guest. The dsb is needed to ensure any previous writes have been done before we read ISR_EL1. On systems without the v8.2 RAS extensions this doesn't give us anything as we can't contain errors, and the ESR bits to describe the severity are all implementation-defined. Replace this with a nop for these systems. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-07-05KVM: arm64: Consume pending SError as early as possibleJames Morse
On systems with v8.2 we switch the 'vaxorcism' of guest SError with an alternative sequence that uses the ESB-instruction, then reads DISR_EL1. This saves the unmasking and remasking of asynchronous exceptions. We do this after we've saved the guest registers and restored the host's. Any SError that becomes pending due to this will be accounted to the guest, when it actually occurred during host-execution. Move the ESB-instruction as early as possible. Any guest SError will become pending due to this ESB-instruction and then consumed to DISR_EL1 before the host touches anything. This lets us account for host/guest SError precisely on the guest exit exception boundary. Because the ESB-instruction now lands in the preamble section of the vectors, we need to add it to the unpatched indirect vectors too, and to any sequence that may be patched in over the top. The ESB-instruction always lives in the head of the vectors, to be before any memory write. Whereas the register-store always lives in the tail. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-07-05KVM: arm64: Make indirect vectors preamble behaviour symmetricJames Morse
The KVM indirect vectors support is a little complicated. Different CPUs may use different exception vectors for KVM that are generated at boot. Adding new instructions involves checking all the possible combinations do the right thing. To make changes here easier to review lets state what we expect of the preamble: 1. The first vector run, must always run the preamble. 2. Patching the head or tail of the vector shouldn't remove preamble instructions. Today, this is easy as we only have one instruction in the preamble. Change the unpatched tail of the indirect vector so that it always runs this, regardless of patching. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-07-05KVM: arm64: Abstract the size of the HYP vectors pre-ambleJames Morse
The EL2 vector hardening feature causes KVM to generate vectors for each type of CPU present in the system. The generated sequences already do some of the early guest-exit work (i.e. saving registers). To avoid duplication the generated vectors branch to the original vector just after the preamble. This size is hard coded. Adding new instructions to the HYP vector causes strange side effects, which are difficult to debug as the affected code is patched in at runtime. Add KVM_VECTOR_PREAMBLE to tell kvm_patch_vector_branch() how big the preamble is. The valid_vect macro can then validate this at build time. Reviewed-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-06-21arm64: Fix incorrect irqflag restore for priority maskingJulien Thierry
When using IRQ priority masking to disable interrupts, in order to deal with the PSR.I state, local_irq_save() would convert the I bit into a PMR value (GIC_PRIO_IRQOFF). This resulted in local_irq_restore() potentially modifying the value of PMR in undesired location due to the state of PSR.I upon flag saving [1]. In an attempt to solve this issue in a less hackish manner, introduce a bit (GIC_PRIO_IGNORE_PMR) for the PMR values that can represent whether PSR.I is being used to disable interrupts, in which case it takes precedence of the status of interrupt masking via PMR. GIC_PRIO_PSR_I_SET is chosen such that (<pmr_value> | GIC_PRIO_PSR_I_SET) does not mask more interrupts than <pmr_value> as some sections (e.g. arch_cpu_idle(), interrupt acknowledge path) requires PMR not to mask interrupts that could be signaled to the CPU when using only PSR.I. [1] https://www.spinics.net/lists/arm-kernel/msg716956.html Fixes: 4a503217ce37 ("arm64: irqflags: Use ICC_PMR_EL1 for interrupt masking") Cc: <stable@vger.kernel.org> # 5.1.x- Reported-by: Zenghui Yu <yuzenghui@huawei.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wei Li <liwei391@huawei.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Christoffer Dall <christoffer.dall@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Suzuki K Pouloze <suzuki.poulose@arm.com> Cc: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-19treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234Thomas Gleixner
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not see http www gnu org licenses extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 503 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Enrico Weigelt <info@metux.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190602204653.811534538@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-24KVM: arm/arm64: Move cc/it checks under hyp's Makefile to avoid instrumentationJames Morse
KVM has helpers to handle the condition codes of trapped aarch32 instructions. These are marked __hyp_text and used from HYP, but they aren't built by the 'hyp' Makefile, which has all the runes to avoid ASAN and KCOV instrumentation. Move this code to a new hyp/aarch32.c to avoid a hyp-panic when starting an aarch32 guest on a host built with the ASAN/KCOV debug options. Fixes: 021234ef3752f ("KVM: arm64: Make kvm_condition_valid32() accessible from EL2") Fixes: 8cebe750c4d9a ("arm64: KVM: Make kvm_skip_instr32 available to HYP") Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-05-24KVM: arm64: Move pmu hyp code under hyp's Makefile to avoid instrumentationJames Morse
KVM's pmu.c contains the __hyp_text needed to switch the pmu registers between host and guest. Because this isn't covered by the 'hyp' Makefile, it can be built with kasan and friends when these are enabled in Kconfig. When starting a guest, this results in: | Kernel panic - not syncing: HYP panic: | PS:a00003c9 PC:000083000028ada0 ESR:86000007 | FAR:000083000028ada0 HPFAR:0000000029df5300 PAR:0000000000000000 | VCPU:000000004e10b7d6 | CPU: 0 PID: 3088 Comm: qemu-system-aar Not tainted 5.2.0-rc1 #11026 | Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Plat | Call trace: | dump_backtrace+0x0/0x200 | show_stack+0x20/0x30 | dump_stack+0xec/0x158 | panic+0x1ec/0x420 | panic+0x0/0x420 | SMP: stopping secondary CPUs | Kernel Offset: disabled | CPU features: 0x002,25006082 | Memory Limit: none | ---[ end Kernel panic - not syncing: HYP panic: This is caused by functions in pmu.c calling the instrumented code, which isn't mapped to hyp. From objdump -r: | RELOCATION RECORDS FOR [.hyp.text]: | OFFSET TYPE VALUE | 0000000000000010 R_AARCH64_CALL26 __sanitizer_cov_trace_pc | 0000000000000018 R_AARCH64_CALL26 __asan_load4_noabort | 0000000000000024 R_AARCH64_CALL26 __asan_load4_noabort Move the affected code to a new file under 'hyp's Makefile. Fixes: 3d91befbb3a0 ("arm64: KVM: Enable !VHE support for :G/:H perf event modifiers") Cc: Andrew Murray <Andrew.Murray@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-04-24arm64: KVM: Enable !VHE support for :G/:H perf event modifiersAndrew Murray
Enable/disable event counters as appropriate when entering and exiting the guest to enable support for guest or host only event counting. For both VHE and non-VHE we switch the counters between host/guest at EL2. The PMU may be on when we change which counters are enabled however we avoid adding an isb as we instead rely on existing context synchronisation events: the eret to enter the guest (__guest_enter) and eret in kvm_call_hyp for __kvm_vcpu_run_nvhe on returning. Signed-off-by: Andrew Murray <andrew.murray@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-04-24KVM: arm/arm64: Context-switch ptrauth registersMark Rutland
When pointer authentication is supported, a guest may wish to use it. This patch adds the necessary KVM infrastructure for this to work, with a semi-lazy context switch of the pointer auth state. Pointer authentication feature is only enabled when VHE is built in the kernel and present in the CPU implementation so only VHE code paths are modified. When we schedule a vcpu, we disable guest usage of pointer authentication instructions and accesses to the keys. While these are disabled, we avoid context-switching the keys. When we trap the guest trying to use pointer authentication functionality, we change to eagerly context-switching the keys, and enable the feature. The next time the vcpu is scheduled out/in, we start again. However the host key save is optimized and implemented inside ptrauth instruction/register access trap. Pointer authentication consists of address authentication and generic authentication, and CPUs in a system might have varied support for either. Where support for either feature is not uniform, it is hidden from guests via ID register emulation, as a result of the cpufeature framework in the host. Unfortunately, address authentication and generic authentication cannot be trapped separately, as the architecture provides a single EL2 trap covering both. If we wish to expose one without the other, we cannot prevent a (badly-written) guest from intermittently using a feature which is not uniformly supported (when scheduled on a physical CPU which supports the relevant feature). Hence, this patch expects both type of authentication to be present in a cpu. This switch of key is done from guest enter/exit assembly as preparation for the upcoming in-kernel pointer authentication support. Hence, these key switching routines are not implemented in C code as they may cause pointer authentication key signing error in some situations. Signed-off-by: Mark Rutland <mark.rutland@arm.com> [Only VHE, key switch in full assembly, vcpu_has_ptrauth checks , save host key in ptrauth exception trap] Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Reviewed-by: Julien Thierry <julien.thierry@arm.com> Cc: Christoffer Dall <christoffer.dall@arm.com> Cc: kvmarm@lists.cs.columbia.edu [maz: various fixups] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-03-29KVM: arm64/sve: Context switch the SVE registersDave Martin
In order to give each vcpu its own view of the SVE registers, this patch adds context storage via a new sve_state pointer in struct vcpu_arch. An additional member sve_max_vl is also added for each vcpu, to determine the maximum vector length visible to the guest and thus the value to be configured in ZCR_EL2.LEN while the vcpu is active. This also determines the layout and size of the storage in sve_state, which is read and written by the same backend functions that are used for context-switching the SVE state for host tasks. On SVE-enabled vcpus, SVE access traps are now handled by switching in the vcpu's SVE context and disabling the trap before returning to the guest. On other vcpus, the trap is not handled and an exit back to the host occurs, where the handle_sve() fallback path reflects an undefined instruction exception back to the guest, consistently with the behaviour of non-SVE-capable hardware (as was done unconditionally prior to this patch). No SVE handling is added on non-VHE-only paths, since VHE is an architectural and Kconfig prerequisite of SVE. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Julien Thierry <julien.thierry@arm.com> Tested-by: zhang.lei <zhang.lei@jp.fujitsu.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-03-29KVM: arm64/sve: System register context switch and access supportDave Martin
This patch adds the necessary support for context switching ZCR_EL1 for each vcpu. ZCR_EL1 is trapped alongside the FPSIMD/SVE registers, so it makes sense for it to be handled as part of the guest FPSIMD/SVE context for context switch purposes instead of handling it as a general system register. This means that it can be switched in lazily at the appropriate time. No effort is made to track host context for this register, since SVE requires VHE: thus the hosts's value for this register lives permanently in ZCR_EL2 and does not alias the guest's value at any time. The Hyp switch and fpsimd context handling code is extended appropriately. Accessors are added in sys_regs.c to expose the SVE system registers and ID register fields. Because these need to be conditionally visible based on the guest configuration, they are implemented separately for now rather than by use of the generic system register helpers. This may be abstracted better later on when/if there are more features requiring this model. ID_AA64ZFR0_EL1 is RO-RAZ for MRS/MSR when SVE is disabled for the guest, but for compatibility with non-SVE aware KVM implementations the register should not be enumerated at all for KVM_GET_REG_LIST in this case. For consistency we also reject ioctl access to the register. This ensures that a non-SVE-enabled guest looks the same to userspace, irrespective of whether the kernel KVM implementation supports SVE. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Julien Thierry <julien.thierry@arm.com> Tested-by: zhang.lei <zhang.lei@jp.fujitsu.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-03-15Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "ARM: - some cleanups - direct physical timer assignment - cache sanitization for 32-bit guests s390: - interrupt cleanup - introduction of the Guest Information Block - preparation for processor subfunctions in cpu models PPC: - bug fixes and improvements, especially related to machine checks and protection keys x86: - many, many cleanups, including removing a bunch of MMU code for unnecessary optimizations - AVIC fixes Generic: - memcg accounting" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (147 commits) kvm: vmx: fix formatting of a comment KVM: doc: Document the life cycle of a VM and its resources MAINTAINERS: Add KVM selftests to existing KVM entry Revert "KVM/MMU: Flush tlb directly in the kvm_zap_gfn_range()" KVM: PPC: Book3S: Add count cache flush parameters to kvmppc_get_cpu_char() KVM: PPC: Fix compilation when KVM is not enabled KVM: Minor cleanups for kvm_main.c KVM: s390: add debug logging for cpu model subfunctions KVM: s390: implement subfunction processor calls arm64: KVM: Fix architecturally invalid reset value for FPEXC32_EL2 KVM: arm/arm64: Remove unused timer variable KVM: PPC: Book3S: Improve KVM reference counting KVM: PPC: Book3S HV: Fix build failure without IOMMU support Revert "KVM: Eliminate extra function calls in kvm_get_dirty_log_protect()" x86: kvmguest: use TSC clocksource if invariant TSC is exposed KVM: Never start grow vCPU halt_poll_ns from value below halt_poll_ns_grow_start KVM: Expose the initial start value in grow_halt_poll_ns() as a module parameter KVM: grow_halt_poll_ns() should never shrink vCPU halt_poll_ns KVM: x86/mmu: Consolidate kvm_mmu_zap_all() and kvm_mmu_zap_mmio_sptes() KVM: x86/mmu: WARN if zapping a MMIO spte results in zapping children ...
2019-03-10Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: - Pseudo NMI support for arm64 using GICv3 interrupt priorities - uaccess macros clean-up (unsafe user accessors also merged but reverted, waiting for objtool support on arm64) - ptrace regsets for Pointer Authentication (ARMv8.3) key management - inX() ordering w.r.t. delay() on arm64 and riscv (acks in place by the riscv maintainers) - arm64/perf updates: PMU bindings converted to json-schema, unused variable and misleading comment removed - arm64/debug fixes to ensure checking of the triggering exception level and to avoid the propagation of the UNKNOWN FAR value into the si_code for debug signals - Workaround for Fujitsu A64FX erratum 010001 - lib/raid6 ARM NEON optimisations - NR_CPUS now defaults to 256 on arm64 - Minor clean-ups (documentation/comments, Kconfig warning, unused asm-offsets, clang warnings) - MAINTAINERS update for list information to the ARM64 ACPI entry * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (54 commits) arm64: mmu: drop paging_init comments arm64: debug: Ensure debug handlers check triggering exception level arm64: debug: Don't propagate UNKNOWN FAR into si_code for debug signals Revert "arm64: uaccess: Implement unsafe accessors" arm64: avoid clang warning about self-assignment arm64: Kconfig.platforms: fix warning unmet direct dependencies lib/raid6: arm: optimize away a mask operation in NEON recovery routine lib/raid6: use vdupq_n_u8 to avoid endianness warnings arm64: io: Hook up __io_par() for inX() ordering riscv: io: Update __io_[p]ar() macros to take an argument asm-generic/io: Pass result of I/O accessor to __io_[p]ar() arm64: Add workaround for Fujitsu A64FX erratum 010001 arm64: Rename get_thread_info() arm64: Remove documentation about TIF_USEDFPU arm64: irqflags: Fix clang build warnings arm64: Enable the support of pseudo-NMIs arm64: Skip irqflags tracing for NMI in IRQs disabled context arm64: Skip preemption when exiting an NMI arm64: Handle serror in NMI context irqchip/gic-v3: Allow interrupts to be set as pseudo-NMI ...
2019-02-19arm/arm64: KVM: Statically configure the host's view of MPIDRMarc Zyngier
We currently eagerly save/restore MPIDR. It turns out to be slightly pointless: - On the host, this value is known as soon as we're scheduled on a physical CPU - In the guest, this value cannot change, as it is set by KVM (and this is a read-only register) The result of the above is that we can perfectly avoid the eager saving of MPIDR_EL1, and only keep the restore. We just have to setup the host contexts appropriately at boot time. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
2019-02-19arm64: KVM: Drop VHE-specific HYP call stubMarc Zyngier
We now call VHE code directly, without going through any central dispatching function. Let's drop that code. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
2019-02-07KVM: arm64: Forbid kprobing of the VHE world-switch codeJames Morse
On systems with VHE the kernel and KVM's world-switch code run at the same exception level. Code that is only used on a VHE system does not need to be annotated as __hyp_text as it can reside anywhere in the kernel text. __hyp_text was also used to prevent kprobes from patching breakpoint instructions into this region, as this code runs at a different exception level. While this is no longer true with VHE, KVM still switches VBAR_EL1, meaning a kprobe's breakpoint executed in the world-switch code will cause a hyp-panic. echo "p:weasel sysreg_save_guest_state_vhe" > /sys/kernel/debug/tracing/kprobe_events echo 1 > /sys/kernel/debug/tracing/events/kprobes/weasel/enable lkvm run -k /boot/Image --console serial -p "console=ttyS0 earlycon=uart,mmio,0x3f8" # lkvm run -k /boot/Image -m 384 -c 3 --name guest-1474 Info: Placing fdt at 0x8fe00000 - 0x8fffffff Info: virtio-mmio.devices=0x200@0x10000:36 Info: virtio-mmio.devices=0x200@0x10200:37 Info: virtio-mmio.devices=0x200@0x10400:38 [ 614.178186] Kernel panic - not syncing: HYP panic: [ 614.178186] PS:404003c9 PC:ffff0000100d70e0 ESR:f2000004 [ 614.178186] FAR:0000000080080000 HPFAR:0000000000800800 PAR:1d00007edbadc0de [ 614.178186] VCPU:00000000f8de32f1 [ 614.178383] CPU: 2 PID: 1482 Comm: kvm-vcpu-0 Not tainted 5.0.0-rc2 #10799 [ 614.178446] Call trace: [ 614.178480] dump_backtrace+0x0/0x148 [ 614.178567] show_stack+0x24/0x30 [ 614.178658] dump_stack+0x90/0xb4 [ 614.178710] panic+0x13c/0x2d8 [ 614.178793] hyp_panic+0xac/0xd8 [ 614.178880] kvm_vcpu_run_vhe+0x9c/0xe0 [ 614.178958] kvm_arch_vcpu_ioctl_run+0x454/0x798 [ 614.179038] kvm_vcpu_ioctl+0x360/0x898 [ 614.179087] do_vfs_ioctl+0xc4/0x858 [ 614.179174] ksys_ioctl+0x84/0xb8 [ 614.179261] __arm64_sys_ioctl+0x28/0x38 [ 614.179348] el0_svc_common+0x94/0x108 [ 614.179401] el0_svc_handler+0x38/0x78 [ 614.179487] el0_svc+0x8/0xc [ 614.179558] SMP: stopping secondary CPUs [ 614.179661] Kernel Offset: disabled [ 614.179695] CPU features: 0x003,2a80aa38 [ 614.179758] Memory Limit: none [ 614.179858] ---[ end Kernel panic - not syncing: HYP panic: [ 614.179858] PS:404003c9 PC:ffff0000100d70e0 ESR:f2000004 [ 614.179858] FAR:0000000080080000 HPFAR:0000000000800800 PAR:1d00007edbadc0de [ 614.179858] VCPU:00000000f8de32f1 ]--- Annotate the VHE world-switch functions that aren't marked __hyp_text using NOKPROBE_SYMBOL(). Signed-off-by: James Morse <james.morse@arm.com> Fixes: 3f5c90b890ac ("KVM: arm64: Introduce VHE-specific kvm_vcpu_run") Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-02-06arm64: kvm: Unmask PMR before entering guestJulien Thierry
Interrupts masked by ICC_PMR_EL1 will not be signaled to the CPU. This means that hypervisor will not receive masked interrupts while running a guest. We need to make sure that all maskable interrupts are masked from the time we call local_irq_disable() in the main run loop, and remain so until we call local_irq_enable() after returning from the guest, and we need to ensure that we see no interrupts at all (including pseudo-NMIs) in the middle of the VM world-switch, while at the same time we need to ensure we exit the guest when there are interrupts for the host. We can accomplish this with pseudo-NMIs enabled by: (1) local_irq_disable: set the priority mask (2) enter guest: set PSTATE.I (3) clear the priority mask (4) eret to guest (5) exit guest: set the priotiy mask clear PSTATE.I (and restore other host PSTATE bits) (6) local_irq_enable: clear the priority mask. Signed-off-by: Julien Thierry <julien.thierry@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Cc: Christoffer Dall <christoffer.dall@arm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: kvmarm@lists.cs.columbia.edu Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-12-26Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "ARM: - selftests improvements - large PUD support for HugeTLB - single-stepping fixes - improved tracing - various timer and vGIC fixes x86: - Processor Tracing virtualization - STIBP support - some correctness fixes - refactorings and splitting of vmx.c - use the Hyper-V range TLB flush hypercall - reduce order of vcpu struct - WBNOINVD support - do not use -ftrace for __noclone functions - nested guest support for PAUSE filtering on AMD - more Hyper-V enlightenments (direct mode for synthetic timers) PPC: - nested VFIO s390: - bugfixes only this time" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (171 commits) KVM: x86: Add CPUID support for new instruction WBNOINVD kvm: selftests: ucall: fix exit mmio address guessing Revert "compiler-gcc: disable -ftracer for __noclone functions" KVM: VMX: Move VM-Enter + VM-Exit handling to non-inline sub-routines KVM: VMX: Explicitly reference RCX as the vmx_vcpu pointer in asm blobs KVM: x86: Use jmp to invoke kvm_spurious_fault() from .fixup MAINTAINERS: Add arch/x86/kvm sub-directories to existing KVM/x86 entry KVM/x86: Use SVM assembly instruction mnemonics instead of .byte streams KVM/MMU: Flush tlb directly in the kvm_zap_gfn_range() KVM/MMU: Flush tlb directly in kvm_set_pte_rmapp() KVM/MMU: Move tlb flush in kvm_set_pte_rmapp() to kvm_mmu_notifier_change_pte() KVM: Make kvm_set_spte_hva() return int KVM: Replace old tlb flush function with new one to flush a specified range. KVM/MMU: Add tlb flush with range helper function KVM/VMX: Add hv tlb range flush support x86/hyper-v: Add HvFlushGuestAddressList hypercall support KVM: Add tlb_remote_flush_with_range callback in kvm_x86_ops KVM: x86: Disable Intel PT when VMXON in L1 guest KVM: x86: Set intercept for Intel PT MSRs read/write KVM: x86: Implement Intel PT MSRs read/write emulation ...
2018-12-18arm64: KVM: Consistently advance singlestep when emulating instructionsMark Rutland
When we emulate a guest instruction, we don't advance the hardware singlestep state machine, and thus the guest will receive a software step exception after a next instruction which is not emulated by the host. We bodge around this in an ad-hoc fashion. Sometimes we explicitly check whether userspace requested a single step, and fake a debug exception from within the kernel. Other times, we advance the HW singlestep state rely on the HW to generate the exception for us. Thus, the observed step behaviour differs for host and guest. Let's make this simpler and consistent by always advancing the HW singlestep state machine when we skip an instruction. Thus we can rely on the hardware to generate the singlestep exception for us, and never need to explicitly check for an active-pending step, nor do we need to fake a debug exception from the guest. Cc: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-12-13arm64/kvm: consistently handle host HCR_EL2 flagsMark Rutland
In KVM we define the configuration of HCR_EL2 for a VHE HOST in HCR_HOST_VHE_FLAGS, but we don't have a similar definition for the non-VHE host flags, and open-code HCR_RW. Further, in head.S we open-code the flags for VHE and non-VHE configurations. In future, we're going to want to configure more flags for the host, so lets add a HCR_HOST_NVHE_FLAGS defintion, and consistently use both HCR_HOST_VHE_FLAGS and HCR_HOST_NVHE_FLAGS in the kvm code and head.S. We now use mov_q to generate the HCR_EL2 value, as we use when configuring other registers in head.S. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: kvmarm@lists.cs.columbia.edu Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10Merge branch 'kvm/cortex-a76-erratum-1165522' into aarch64/for-next/coreWill Deacon
Pull in KVM workaround for A76 erratum #116522. Conflicts: arch/arm64/include/asm/cpucaps.h
2018-12-10arm64: KVM: Handle ARM erratum 1165522 in TLB invalidationMarc Zyngier
In order to avoid TLB corruption whilst invalidating TLBs on CPUs affected by erratum 1165522, we need to prevent S1 page tables from being usable. For this, we set the EL1 S1 MMU on, and also disable the page table walker (by setting the TCR_EL1.EPD* bits to 1). This ensures that once we switch to the EL1/EL0 translation regime, speculated AT instructions won't be able to parse the page tables. Acked-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10arm64: KVM: Add synchronization on translation regime change for erratum 1165522Marc Zyngier
In order to ensure that slipping HCR_EL2.TGE is done at the right time when switching translation regime, let insert the required ISBs that will be patched in when erratum 1165522 is detected. Take this opportunity to add the missing include of asm/alternative.h which was getting there by pure luck. Acked-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10arm64: KVM: Install stage-2 translation before enabling trapsMarc Zyngier
It is a bit odd that we only install stage-2 translation after having cleared HCR_EL2.TGE, which means that there is a window during which AT requests could fail as stage-2 is not configured yet. Let's move stage-2 configuration before we clear TGE, making the guest entry sequence clearer: we first configure all the guest stuff, then only switch to the guest translation regime. While we're at it, do the same thing for !VHE. It doesn't hurt, and keeps things symmetric. Acked-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10arm64: KVM: Make VHE Stage-2 TLB invalidation operations non-interruptibleMarc Zyngier
Contrary to the non-VHE version of the TLB invalidation helpers, the VHE code has interrupts enabled, meaning that we can take an interrupt in the middle of such a sequence, and start running something else with HCR_EL2.TGE cleared. That's really not a good idea. Take the heavy-handed option and disable interrupts in __tlb_switch_to_guest_vhe, restoring them in __tlb_switch_to_host_vhe. The latter also gain an ISB in order to make sure that TGE really has taken effect. Cc: stable@vger.kernel.org Acked-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06arm64: entry: Place an SB sequence following an ERET instructionWill Deacon
Some CPUs can speculate past an ERET instruction and potentially perform speculative accesses to memory before processing the exception return. Since the register state is often controlled by a lower privilege level at the point of an ERET, this could potentially be used as part of a side-channel attack. This patch emits an SB sequence after each ERET so that speculation is held up on exception return. Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-10-25Merge tag 'kvm-4.20-1' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Radim Krčmář: "ARM: - Improved guest IPA space support (32 to 52 bits) - RAS event delivery for 32bit - PMU fixes - Guest entry hardening - Various cleanups - Port of dirty_log_test selftest PPC: - Nested HV KVM support for radix guests on POWER9. The performance is much better than with PR KVM. Migration and arbitrary level of nesting is supported. - Disable nested HV-KVM on early POWER9 chips that need a particular hardware bug workaround - One VM per core mode to prevent potential data leaks - PCI pass-through optimization - merge ppc-kvm topic branch and kvm-ppc-fixes to get a better base s390: - Initial version of AP crypto virtualization via vfio-mdev - Improvement for vfio-ap - Set the host program identifier - Optimize page table locking x86: - Enable nested virtualization by default - Implement Hyper-V IPI hypercalls - Improve #PF and #DB handling - Allow guests to use Enlightened VMCS - Add migration selftests for VMCS and Enlightened VMCS - Allow coalesced PIO accesses - Add an option to perform nested VMCS host state consistency check through hardware - Automatic tuning of lapic_timer_advance_ns - Many fixes, minor improvements, and cleanups" * tag 'kvm-4.20-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (204 commits) KVM/nVMX: Do not validate that posted_intr_desc_addr is page aligned Revert "kvm: x86: optimize dr6 restore" KVM: PPC: Optimize clearing TCEs for sparse tables x86/kvm/nVMX: tweak shadow fields selftests/kvm: add missing executables to .gitignore KVM: arm64: Safety check PSTATE when entering guest and handle IL KVM: PPC: Book3S HV: Don't use streamlined entry path on early POWER9 chips arm/arm64: KVM: Enable 32 bits kvm vcpu events support arm/arm64: KVM: Rename function kvm_arch_dev_ioctl_check_extension() KVM: arm64: Fix caching of host MDCR_EL2 value KVM: VMX: enable nested virtualization by default KVM/x86: Use 32bit xor to clear registers in svm.c kvm: x86: Introduce KVM_CAP_EXCEPTION_PAYLOAD kvm: vmx: Defer setting of DR6 until #DB delivery kvm: x86: Defer setting of CR2 until #PF delivery kvm: x86: Add payload operands to kvm_multiple_exception kvm: x86: Add exception payload fields to kvm_vcpu_events kvm: x86: Add has_payload and payload to kvm_queued_exception KVM: Documentation: Fix omission in struct kvm_vcpu_events KVM: selftests: add Enlightened VMCS test ...
2018-10-22Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: "Apart from some new arm64 features and clean-ups, this also contains the core mmu_gather changes for tracking the levels of the page table being cleared and a minor update to the generic compat_sys_sigaltstack() introducing COMPAT_SIGMINSKSZ. Summary: - Core mmu_gather changes which allow tracking the levels of page-table being cleared together with the arm64 low-level flushing routines - Support for the new ARMv8.5 PSTATE.SSBS bit which can be used to mitigate Spectre-v4 dynamically without trapping to EL3 firmware - Introduce COMPAT_SIGMINSTKSZ for use in compat_sys_sigaltstack - Optimise emulation of MRS instructions to ID_* registers on ARMv8.4 - Support for Common Not Private (CnP) translations allowing threads of the same CPU to share the TLB entries - Accelerated crc32 routines - Move swapper_pg_dir to the rodata section - Trap WFI instruction executed in user space - ARM erratum 1188874 workaround (arch_timer) - Miscellaneous fixes and clean-ups" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (78 commits) arm64: KVM: Guests can skip __install_bp_hardening_cb()s HYP work arm64: cpufeature: Trap CTR_EL0 access only where it is necessary arm64: cpufeature: Fix handling of CTR_EL0.IDC field arm64: cpufeature: ctr: Fix cpu capability check for late CPUs Documentation/arm64: HugeTLB page implementation arm64: mm: Use __pa_symbol() for set_swapper_pgd() arm64: Add silicon-errata.txt entry for ARM erratum 1188873 Revert "arm64: uaccess: implement unsafe accessors" arm64: mm: Drop the unused cpu parameter MAINTAINERS: fix bad sdei paths arm64: mm: Use #ifdef for the __PAGETABLE_P?D_FOLDED defines arm64: Fix typo in a comment in arch/arm64/mm/kasan_init.c arm64: xen: Use existing helper to check interrupt status arm64: Use daifflag_restore after bp_hardening arm64: daifflags: Use irqflags functions for daifflags arm64: arch_timer: avoid unused function warning arm64: Trap WFI executed in userspace arm64: docs: Document SSBS HWCAP arm64: docs: Fix typos in ELF hwcaps arm64/kprobes: remove an extra semicolon in arch_prepare_kprobe ...
2018-10-19KVM: arm64: Safety check PSTATE when entering guest and handle ILChristoffer Dall
This commit adds a paranoid check when entering the guest to make sure we don't attempt running guest code in an equally or more privilged mode than the hypervisor. We also catch other accidental programming of the SPSR_EL2 which results in an illegal exception return and report this safely back to the user. Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>