summaryrefslogtreecommitdiff
path: root/arch/arm64/kvm
AgeCommit message (Collapse)Author
2021-06-30arm64: decouple check whether pfn is in linear map from pfn_valid()Mike Rapoport
The intended semantics of pfn_valid() is to verify whether there is a struct page for the pfn in question and nothing else. Yet, on arm64 it is used to distinguish memory areas that are mapped in the linear map vs those that require ioremap() to access them. Introduce a dedicated pfn_is_map_memory() wrapper for memblock_is_map_memory() to perform such check and use it where appropriate. Using a wrapper allows to avoid cyclic include dependencies. While here also update style of pfn_valid() so that both pfn_valid() and pfn_is_map_memory() declarations will be consistent. Link: https://lkml.kernel.org/r/20210511100550.28178-4-rppt@kernel.org Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-29arch/arm64/kvm: use vma_lookup() instead of find_vma_intersection()Liam Howlett
vma_lookup() finds the vma of a specific address with a cleaner interface and is more readable. Link: https://lkml.kernel.org/r/20210521174745.2219620-5-Liam.Howlett@Oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Davidlohr Bueso <dbueso@suse.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-27KVM: arm64: Prevent mixed-width VM creationMarc Zyngier
It looks like we have tolerated creating mixed-width VMs since... forever. However, that was never the intention, and we'd rather not have to support that pointless complexity. Forbid such a setup by making sure all the vcpus have the same register width. Reported-by: Steven Price <steven.price@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20210524170752.1549797-1-maz@kernel.org
2021-05-27KVM: arm64: Resolve all pending PC updates before immediate exitZenghui Yu
Commit 26778aaa134a ("KVM: arm64: Commit pending PC adjustemnts before returning to userspace") fixed the PC updating issue by forcing an explicit synchronisation of the exception state on vcpu exit to userspace. However, we forgot to take into account the case where immediate_exit is set by userspace and KVM_RUN will exit immediately. Fix it by resolving all pending PC updates before returning to userspace. Since __kvm_adjust_pc() relies on a loaded vcpu context, I moved the immediate_exit checking right after vcpu_load(). We will get some overhead if immediate_exit is true (which should hopefully be rare). Fixes: 26778aaa134a ("KVM: arm64: Commit pending PC adjustemnts before returning to userspace") Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210526141831.1662-1-yuzenghui@huawei.com Cc: stable@vger.kernel.org # 5.11
2021-05-15KVM: arm64: Fix debug register indexingMarc Zyngier
Commit 03fdfb2690099 ("KVM: arm64: Don't write junk to sysregs on reset") flipped the register number to 0 for all the debug registers in the sysreg table, hereby indicating that these registers live in a separate shadow structure. However, the author of this patch failed to realise that all the accessors are using that particular index instead of the register encoding, resulting in all the registers hitting index 0. Not quite a valid implementation of the architecture... Address the issue by fixing all the accessors to use the CRm field of the encoding, which contains the debug register index. Fixes: 03fdfb2690099 ("KVM: arm64: Don't write junk to sysregs on reset") Reported-by: Ricardo Koller <ricarkol@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org
2021-05-15KVM: arm64: Commit pending PC adjustemnts before returning to userspaceMarc Zyngier
KVM currently updates PC (and the corresponding exception state) using a two phase approach: first by setting a set of flags, then by converting these flags into a state update when the vcpu is about to enter the guest. However, this creates a disconnect with userspace if the vcpu thread returns there with any exception/PC flag set. In this case, the exposed context is wrong, as userspace doesn't have access to these flags (they aren't architectural). It also means that these flags are preserved across a reset, which isn't expected. To solve this problem, force an explicit synchronisation of the exception state on vcpu exit to userspace. As an optimisation for nVHE systems, only perform this when there is something pending. Reported-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Tested-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org # 5.11
2021-05-15KVM: arm64: Move __adjust_pc out of lineMarc Zyngier
In order to make it easy to call __adjust_pc() from the EL1 code (in the case of nVHE), rename it to __kvm_adjust_pc() and move it out of line. No expected functional change. Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Tested-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org # 5.11
2021-05-15KVM: arm64: Mark the host stage-2 memory pools staticQuentin Perret
The host stage-2 memory pools are not used outside of mem_protect.c, mark them static. Fixes: 1025c8c0c6ac ("KVM: arm64: Wrap the host with a stage 2") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210514085640.3917886-3-qperret@google.com
2021-05-15KVM: arm64: Mark pkvm_pgtable_mm_ops staticQuentin Perret
It is not used outside of setup.c, mark it static. Fixes:f320bc742bc2 ("KVM: arm64: Prepare the creation of s1 mappings at EL2") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210514085640.3917886-2-qperret@google.com
2021-05-15KVM: arm64: Fix boolreturn.cocci warningskernel test robot
arch/arm64/kvm/mmu.c:1114:9-10: WARNING: return of 0/1 in function 'kvm_age_gfn' with return type bool arch/arm64/kvm/mmu.c:1084:9-10: WARNING: return of 0/1 in function 'kvm_set_spte_gfn' with return type bool arch/arm64/kvm/mmu.c:1127:9-10: WARNING: return of 0/1 in function 'kvm_test_age_gfn' with return type bool arch/arm64/kvm/mmu.c:1070:9-10: WARNING: return of 0/1 in function 'kvm_unmap_gfn_range' with return type bool Return statements in functions returning bool should use true/false instead of 1/0. Generated by: scripts/coccinelle/misc/boolreturn.cocci Fixes: cd4c71835228 ("KVM: arm64: Convert to the gfn-based MMU notifier callbacks") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: kernel test robot <lkp@intel.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210426223357.GA45871@cd4295a34ed8
2021-05-01Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm updates from Paolo Bonzini: "This is a large update by KVM standards, including AMD PSP (Platform Security Processor, aka "AMD Secure Technology") and ARM CoreSight (debug and trace) changes. ARM: - CoreSight: Add support for ETE and TRBE - Stage-2 isolation for the host kernel when running in protected mode - Guest SVE support when running in nVHE mode - Force W^X hypervisor mappings in nVHE mode - ITS save/restore for guests using direct injection with GICv4.1 - nVHE panics now produce readable backtraces - Guest support for PTP using the ptp_kvm driver - Performance improvements in the S2 fault handler x86: - AMD PSP driver changes - Optimizations and cleanup of nested SVM code - AMD: Support for virtual SPEC_CTRL - Optimizations of the new MMU code: fast invalidation, zap under read lock, enable/disably dirty page logging under read lock - /dev/kvm API for AMD SEV live migration (guest API coming soon) - support SEV virtual machines sharing the same encryption context - support SGX in virtual machines - add a few more statistics - improved directed yield heuristics - Lots and lots of cleanups Generic: - Rework of MMU notifier interface, simplifying and optimizing the architecture-specific code - a handful of "Get rid of oprofile leftovers" patches - Some selftests improvements" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (379 commits) KVM: selftests: Speed up set_memory_region_test selftests: kvm: Fix the check of return value KVM: x86: Take advantage of kvm_arch_dy_has_pending_interrupt() KVM: SVM: Skip SEV cache flush if no ASIDs have been used KVM: SVM: Remove an unnecessary prototype declaration of sev_flush_asids() KVM: SVM: Drop redundant svm_sev_enabled() helper KVM: SVM: Move SEV VMCB tracking allocation to sev.c KVM: SVM: Explicitly check max SEV ASID during sev_hardware_setup() KVM: SVM: Unconditionally invoke sev_hardware_teardown() KVM: SVM: Enable SEV/SEV-ES functionality by default (when supported) KVM: SVM: Condition sev_enabled and sev_es_enabled on CONFIG_KVM_AMD_SEV=y KVM: SVM: Append "_enabled" to module-scoped SEV/SEV-ES control variables KVM: SEV: Mask CPUID[0x8000001F].eax according to supported features KVM: SVM: Move SEV module params/variables to sev.c KVM: SVM: Disable SEV/SEV-ES if NPT is disabled KVM: SVM: Free sev_asid_bitmap during init if SEV setup fails KVM: SVM: Zero out the VMCB array used to track SEV ASID association x86/sev: Drop redundant and potentially misleading 'sev_enabled' KVM: x86: Move reverse CPUID helpers to separate header file KVM: x86: Rename GPR accessors to make mode-aware variants the defaults ...
2021-04-27Merge tag 'cfi-v5.13-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull CFI on arm64 support from Kees Cook: "This builds on last cycle's LTO work, and allows the arm64 kernels to be built with Clang's Control Flow Integrity feature. This feature has happily lived in Android kernels for almost 3 years[1], so I'm excited to have it ready for upstream. The wide diffstat is mainly due to the treewide fixing of mismatched list_sort prototypes. Other things in core kernel are to address various CFI corner cases. The largest code portion is the CFI runtime implementation itself (which will be shared by all architectures implementing support for CFI). The arm64 pieces are Acked by arm64 maintainers rather than coming through the arm64 tree since carrying this tree over there was going to be awkward. CFI support for x86 is still under development, but is pretty close. There are a handful of corner cases on x86 that need some improvements to Clang and objtool, but otherwise works well. Summary: - Clean up list_sort prototypes (Sami Tolvanen) - Introduce CONFIG_CFI_CLANG for arm64 (Sami Tolvanen)" * tag 'cfi-v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: arm64: allow CONFIG_CFI_CLANG to be selected KVM: arm64: Disable CFI for nVHE arm64: ftrace: use function_nocfi for ftrace_call arm64: add __nocfi to __apply_alternatives arm64: add __nocfi to functions that jump to a physical address arm64: use function_nocfi with __pa_symbol arm64: implement function_nocfi psci: use function_nocfi for cpu_resume lkdtm: use function_nocfi treewide: Change list_sort to use const pointers bpf: disable CFI in dispatcher functions kallsyms: strip ThinLTO hashes from static functions kthread: use WARN_ON_FUNCTION_MISMATCH workqueue: use WARN_ON_FUNCTION_MISMATCH module: ensure __cfi_check alignment mm: add generic function_nocfi macro cfi: add __cficanonical add support for Clang CFI
2021-04-26Merge tag 'irq-core-2021-04-26' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "The usual updates from the irq departement: Core changes: - Provide IRQF_NO_AUTOEN as a flag for request*_irq() so drivers can be cleaned up which either use a seperate mechanism to prevent auto-enable at request time or have a racy mechanism which disables the interrupt right after request. - Get rid of the last usage of irq_create_identity_mapping() and remove the interface. - An overhaul of tasklet_disable(). Most usage sites of tasklet_disable() are in task context and usually in cleanup, teardown code pathes. tasklet_disable() spinwaits for a tasklet which is currently executed. That's not only a problem for PREEMPT_RT where this can lead to a live lock when the disabling task preempts the softirq thread. It's also problematic in context of virtualization when the vCPU which runs the tasklet is scheduled out and the disabling code has to spin wait until it's scheduled back in. There are a few code pathes which invoke tasklet_disable() from non-sleepable context. For these a new disable variant which still spinwaits is provided which allows to switch tasklet_disable() to a sleep wait mechanism. For the atomic use cases this does not solve the live lock issue on PREEMPT_RT. That is mitigated by blocking on the RT specific softirq lock. - The PREEMPT_RT specific implementation of softirq processing and local_bh_disable/enable(). On RT enabled kernels soft interrupt processing happens always in task context and all interrupt handlers, which are not explicitly marked to be invoked in hard interrupt context are forced into task context as well. This allows to protect against softirq processing with a per CPU lock, which in turn allows to make BH disabled regions preemptible. Most of the softirq handling code is still shared. The RT/non-RT specific differences are addressed with a set of inline functions which provide the context specific functionality. The local_bh_disable() / local_bh_enable() mechanism are obviously seperate. - The usual set of small improvements and cleanups Driver changes: - New drivers for Nuvoton WPCM450 and DT 79rc3243x interrupt controllers - Extended functionality for MStar, STM32 and SC7280 irq chips - Enhanced robustness for ARM GICv3/4.1 drivers - The usual set of cleanups and improvements all over the place" * tag 'irq-core-2021-04-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (53 commits) irqchip/xilinx: Expose Kconfig option for Zynq/ZynqMP irqchip/gic-v3: Do not enable irqs when handling spurious interrups dt-bindings: interrupt-controller: Add IDT 79RC3243x Interrupt Controller irqchip: Add support for IDT 79rc3243x interrupt controller irqdomain: Drop references to recusive irqdomain setup irqdomain: Get rid of irq_create_strict_mappings() irqchip/jcore-aic: Kill use of irq_create_strict_mappings() ARM: PXA: Kill use of irq_create_strict_mappings() irqchip/gic-v4.1: Disable vSGI upon (GIC CPUIF < v4.1) detection irqchip/tb10x: Use 'fallthrough' to eliminate a warning genirq: Reduce irqdebug cacheline bouncing kernel: Initialize cpumask before parsing irqchip/wpcm450: Drop COMPILE_TEST irqchip/irq-mst: Support polarity configuration irqchip: Add driver for WPCM450 interrupt controller dt-bindings: interrupt-controller: Add nuvoton, wpcm450-aic dt-bindings: qcom,pdc: Add compatible for sc7280 irqchip/stm32: Add usart instances exti direct event support irqchip/gic-v3: Fix OF_BAD_ADDR error handling irqchip/sifive-plic: Mark two global variables __ro_after_init ...
2021-04-23Merge tag 'kvmarm-5.13' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for Linux 5.13 New features: - Stage-2 isolation for the host kernel when running in protected mode - Guest SVE support when running in nVHE mode - Force W^X hypervisor mappings in nVHE mode - ITS save/restore for guests using direct injection with GICv4.1 - nVHE panics now produce readable backtraces - Guest support for PTP using the ptp_kvm driver - Performance improvements in the S2 fault handler - Alexandru is now a reviewer (not really a new feature...) Fixes: - Proper emulation of the GICR_TYPER register - Handle the complete set of relocation in the nVHE EL2 object - Get rid of the oprofile dependency in the PMU code (and of the oprofile body parts at the same time) - Debug and SPE fixes - Fix vcpu reset
2021-04-22Merge branch 'kvm-sev-cgroup' into HEADPaolo Bonzini
2021-04-22irqchip/gic-v4.1: Disable vSGI upon (GIC CPUIF < v4.1) detectionLorenzo Pieralisi
GIC CPU interfaces versions predating GIC v4.1 were not built to accommodate vINTID within the vSGI range; as reported in the GIC specifications (8.2 "Changes to the CPU interface"), it is CONSTRAINED UNPREDICTABLE to deliver a vSGI to a PE with ID_AA64PFR0_EL1.GIC < b0011. Check the GIC CPUIF version by reading the SYS_ID_AA64_PFR0_EL1. Disable vSGIs if a CPUIF version < 4.1 is detected to prevent using vSGIs on systems where they may misbehave. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Marc Zyngier <maz@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210317100719.3331-2-lorenzo.pieralisi@arm.com
2021-04-22Merge branch 'kvm-arm64/kill_oprofile_dependency' into kvmarm-master/nextMarc Zyngier
Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-04-22KVM: arm64: Divorce the perf code from oprofile helpersMarc Zyngier
KVM/arm64 is the sole user of perf_num_counters(), and really could do without it. Stop using the obsolete API by relying on the existing probing code. Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210414134409.1266357-2-maz@kernel.org
2021-04-17KVM: arm64: Convert to the gfn-based MMU notifier callbacksSean Christopherson
Move arm64 to the gfn-base MMU notifier APIs, which do the hva->gfn lookup in common code. No meaningful functional change intended, though the exact order of operations is slightly different since the memslot lookups occur before calling into arch code. Reviewed-by: Marc Zyngier <maz@kernel.org> Tested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210402005658.3024832-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-17KVM: constify kvm_arch_flush_remote_tlbs_memslotPaolo Bonzini
memslots are stored in RCU and there should be no need to change them. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-17KVM: aarch64: implement KVM_CAP_SET_GUEST_DEBUG2Maxim Levitsky
Move KVM_GUESTDBG_VALID_MASK to kvm_host.h and use it to return the value of this capability. Compile tested only. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210401135451.1004564-5-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-17KVM: Move arm64's MMU notifier trace events to generic codeSean Christopherson
Move arm64's MMU notifier trace events into common code in preparation for doing the hva->gfn lookup in common code. The alternative would be to trace the gfn instead of hva, but that's not obviously better and could also be done in common code. Tracing the notifiers is also quite handy for debug regardless of architecture. Remove a completely redundant tracepoint from PPC e500. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210326021957.1424875-10-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-13Merge branch 'kvm-arm64/vlpi-save-restore' into kvmarm-master/nextMarc Zyngier
Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-04-13Merge branch 'kvm-arm64/vgic-5.13' into kvmarm-master/nextMarc Zyngier
Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-04-13Merge branch 'kvm-arm64/ptp' into kvmarm-master/nextMarc Zyngier
Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-04-13Merge branch 'kvm-arm64/nvhe-wxn' into kvmarm-master/nextMarc Zyngier
Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-04-13Merge branch 'kvm-arm64/nvhe-panic-info' into kvmarm-master/nextMarc Zyngier
Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-04-13Merge branch 'kvm-arm64/misc-5.13' into kvmarm-master/nextMarc Zyngier
Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-04-13Merge branch 'kvm-arm64/memslot-fixes' into kvmarm-master/nextMarc Zyngier
Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-04-13Merge branch 'kvm-arm64/host-stage2' into kvmarm-master/nextMarc Zyngier
Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-04-13Merge branch 'kvm-arm64/debug-5.13' into kvmarm-master/nextMarc Zyngier
Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-04-13KVM: arm/arm64: Fix KVM_VGIC_V3_ADDR_TYPE_REDIST readEric Auger
When reading the base address of the a REDIST region through KVM_VGIC_V3_ADDR_TYPE_REDIST we expect the redistributor region list to be populated with a single element. However list_first_entry() expects the list to be non empty. Instead we should use list_first_entry_or_null which effectively returns NULL if the list is empty. Fixes: dbd9733ab674 ("KVM: arm/arm64: Replace the single rdist region by a list") Cc: <Stable@vger.kernel.org> # v4.18+ Signed-off-by: Eric Auger <eric.auger@redhat.com> Reported-by: Gavin Shan <gshan@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210412150034.29185-1-eric.auger@redhat.com
2021-04-11KVM: arm64: Don't advertise FEAT_SPE to guestsAlexandru Elisei
Even though KVM sets up MDCR_EL2 to trap accesses to the SPE buffer and sampling control registers and to inject an undefined exception, the presence of FEAT_SPE is still advertised in the ID_AA64DFR0_EL1 register, if the hardware supports it. Getting an undefined exception when accessing a register usually happens for a hardware feature which is not implemented, and indeed this is how PMU emulation is handled when the virtual machine has been created without the KVM_ARM_VCPU_PMU_V3 feature. Let's be consistent and never advertise FEAT_SPE, because KVM doesn't have support for emulating it yet. Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210409152154.198566-3-alexandru.elisei@arm.com
2021-04-11KVM: arm64: Don't print warning when trapping SPE registersAlexandru Elisei
KVM sets up MDCR_EL2 to trap accesses to the SPE buffer and sampling control registers and it relies on the fact that KVM injects an undefined exception for unknown registers. This mechanism of injecting undefined exceptions also prints a warning message for the host kernel; for example, when a guest tries to access PMSIDR_EL1: [ 2.691830] kvm [142]: Unsupported guest sys_reg access at: 80009e78 [800003c5] [ 2.691830] { Op0( 3), Op1( 0), CRn( 9), CRm( 9), Op2( 7), func_read }, This is unnecessary, because KVM has explicitly configured trapping of those registers and is well aware of their existence. Prevent the warning by adding the SPE registers to the list of registers that KVM emulates. The access function will inject the undefined exception. Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210409152154.198566-2-alexandru.elisei@arm.com
2021-04-09KVM: arm64: Fully zero the vcpu state on resetMarc Zyngier
On vcpu reset, we expect all the registers to be brought back to their initial state, which happens to be a bunch of zeroes. However, some recent commit broke this, and is now leaving a bunch of registers (such as the FP state) with whatever was left by the guest. My bad. Zero the reset of the state (32bit SPSRs and FPSIMD state). Cc: stable@vger.kernel.org Fixes: e47c2055c68e ("KVM: arm64: Make struct kvm_regs userspace-only") Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-04-08KVM: arm64: Disable CFI for nVHESami Tolvanen
Disable CFI for the nVHE code to avoid address space confusion. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210408182843.1754385-18-samitolvanen@google.com
2021-04-08treewide: Change list_sort to use const pointersSami Tolvanen
list_sort() internally casts the comparison function passed to it to a different type with constant struct list_head pointers, and uses this pointer to call the functions, which trips indirect call Control-Flow Integrity (CFI) checking. Instead of removing the consts, this change defines the list_cmp_func_t type and changes the comparison function types of all list_sort() callers to use const pointers, thus avoiding type mismatches. Suggested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kees Cook <keescook@chromium.org> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210408182843.1754385-10-samitolvanen@google.com
2021-04-07KVM: arm64: Initialize VCPU mdcr_el2 before loading itAlexandru Elisei
When a VCPU is created, the kvm_vcpu struct is initialized to zero in kvm_vm_ioctl_create_vcpu(). On VHE systems, the first time vcpu.arch.mdcr_el2 is loaded on hardware is in vcpu_load(), before it is set to a sensible value in kvm_arm_setup_debug() later in the run loop. The result is that KVM executes for a short time with MDCR_EL2 set to zero. This has several unintended consequences: * Setting MDCR_EL2.HPMN to 0 is constrained unpredictable according to ARM DDI 0487G.a, page D13-3820. The behavior specified by the architecture in this case is for the PE to behave as if MDCR_EL2.HPMN is set to a value less than or equal to PMCR_EL0.N, which means that an unknown number of counters are now disabled by MDCR_EL2.HPME, which is zero. * The host configuration for the other debug features controlled by MDCR_EL2 is temporarily lost. This has been harmless so far, as Linux doesn't use the other fields, but that might change in the future. Let's avoid both issues by initializing the VCPU's mdcr_el2 field in kvm_vcpu_vcpu_first_run_init(), thus making sure that the MDCR_EL2 register has a consistent value after each vcpu_load(). Fixes: d5a21bcc2995 ("KVM: arm64: Move common VHE/non-VHE trap config in separate functions") Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210407144857.199746-3-alexandru.elisei@arm.com
2021-04-07KVM: arm64: Add support for the KVM PTP serviceJianyong Wu
Implement the hypervisor side of the KVM PTP interface. The service offers wall time and cycle count from host to guest. The caller must specify whether they want the host's view of either the virtual or physical counter. Signed-off-by: Jianyong Wu <jianyong.wu@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201209060932.212364-7-jianyong.wu@arm.com
2021-04-07KVM: arm64: Don't retrieve memory slot again in page fault handlerGavin Shan
We needn't retrieve the memory slot again in user_mem_abort() because the corresponding memory slot has been passed from the caller. This would save some CPU cycles. For example, the time used to write 1GB memory, which is backed by 2MB hugetlb pages and write-protected, is dropped by 6.8% from 928ms to 864ms. Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210316041126.81860-4-gshan@redhat.com
2021-04-07KVM: arm64: Use find_vma_intersection()Gavin Shan
find_vma_intersection() has been existing to search the intersected vma. This uses the function where it's applicable, to simplify the code. Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210316041126.81860-3-gshan@redhat.com
2021-04-07KVM: arm64: Hide kvm_mmu_wp_memory_region()Gavin Shan
We needn't expose the function as it's only used by mmu.c since it was introduced by commit c64735554c0a ("KVM: arm: Add initial dirty page locking support"). Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210316041126.81860-2-gshan@redhat.com
2021-04-06arm64: KVM: Enable access to TRBE support for hostSuzuki K Poulose
For a nvhe host, the EL2 must allow the EL1&0 translation regime for TraceBuffer (MDCR_EL2.E2TB == 0b11). This must be saved/restored over a trip to the guest. Also, before entering the guest, we must flush any trace data if the TRBE was enabled. And we must prohibit the generation of trace while we are in EL1 by clearing the TRFCR_EL1. For vhe, the EL2 must prevent the EL1 access to the Trace Buffer. The MDCR_EL2 bit definitions for TRBE are available here : https://developer.arm.com/documentation/ddi0601/2020-12/AArch64-Registers/ Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210405164307.1720226-8-suzuki.poulose@arm.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
2021-04-06KVM: arm64: Move SPE availability check to VCPU loadSuzuki K Poulose
At the moment, we check the availability of SPE on the given CPU (i.e, SPE is implemented and is allowed at the host) during every guest entry. This can be optimized a bit by moving the check to vcpu_load time and recording the availability of the feature on the current CPU via a new flag. This will also be useful for adding the TRBE support. Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Alexandru Elisei <Alexandru.Elisei@arm.com> Cc: James Morse <james.morse@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210405164307.1720226-7-suzuki.poulose@arm.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
2021-04-06KVM: arm64: Handle access to TRFCR_EL1Suzuki K Poulose
Rather than falling to an "unhandled access", inject add an explicit "undefined access" for TRFCR_EL1 access from the guest. Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210405164307.1720226-6-suzuki.poulose@arm.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
2021-04-06KVM: arm64: vgic-v3: Expose GICR_TYPER.Last for userspaceEric Auger
Commit 23bde34771f1 ("KVM: arm64: vgic-v3: Drop the reporting of GICR_TYPER.Last for userspace") temporarily fixed a bug identified when attempting to access the GICR_TYPER register before the redistributor region setting, but dropped the support of the LAST bit. Emulating the GICR_TYPER.Last bit still makes sense for architecture compliance though. This patch restores its support (if the redistributor region was set) while keeping the code safe. We introduce a new helper, vgic_mmio_vcpu_rdist_is_last() which computes whether a redistributor is the highest one of a series of redistributor contributor pages. With this new implementation we do not need to have a uaccess read accessor anymore. Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210405163941.510258-9-eric.auger@redhat.com
2021-04-06kvm: arm64: vgic-v3: Introduce vgic_v3_free_redist_region()Eric Auger
To improve the readability, we introduce the new vgic_v3_free_redist_region helper and also rename vgic_v3_insert_redist_region into vgic_v3_alloc_redist_region Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210405163941.510258-8-eric.auger@redhat.com
2021-04-06KVM: arm64: Simplify argument passing to vgic_uaccess_[read|write]Eric Auger
vgic_uaccess() takes a struct vgic_io_device argument, converts it to a struct kvm_io_device and passes it to the read/write accessor functions, which convert it back to a struct vgic_io_device. Avoid the indirection by passing the struct vgic_io_device argument directly to vgic_uaccess_{read,write}. Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210405163941.510258-7-eric.auger@redhat.com
2021-04-06KVM: arm/arm64: vgic: Reset base address on kvm_vgic_dist_destroy()Eric Auger
On vgic_dist_destroy(), the addresses are not reset. However for kvm selftest purpose this would allow to continue the test execution even after a failure when running KVM_RUN. So let's reset the base addresses. Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210405163941.510258-5-eric.auger@redhat.com
2021-04-06KVM: arm64: vgic-v3: Fix error handling in vgic_v3_set_redist_base()Eric Auger
vgic_v3_insert_redist_region() may succeed while vgic_register_all_redist_iodevs fails. For example this happens while adding a redistributor region overlapping a dist region. The failure only is detected on vgic_register_all_redist_iodevs when vgic_v3_check_base() gets called in vgic_register_redist_iodev(). In such a case, remove the newly added redistributor region and free it. Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210405163941.510258-4-eric.auger@redhat.com