summaryrefslogtreecommitdiff
path: root/virt/kvm/arm/mmu.c
AgeCommit message (Collapse)Author
2020-05-16KVM: arm64: Move virt/kvm/arm to arch/arm64Marc Zyngier
Now that the 32bit KVM/arm host is a distant memory, let's move the whole of the KVM/arm64 code into the arm64 tree. As they said in the song: Welcome Home (Sanitarium). Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200513104034.74741-1-maz@kernel.org
2020-03-16KVM: Terminate memslot walks via used_slotsSean Christopherson
Refactor memslot handling to treat the number of used slots as the de facto size of the memslot array, e.g. return NULL from id_to_memslot() when an invalid index is provided instead of relying on npages==0 to detect an invalid memslot. Rework the sorting and walking of memslots in advance of dynamically sizing memslots to aid bisection and debug, e.g. with luck, a bug in the refactoring will bisect here and/or hit a WARN instead of randomly corrupting memory. Alternatively, a global null/invalid memslot could be returned, i.e. so callers of id_to_memslot() don't have to explicitly check for a NULL memslot, but that approach runs the risk of introducing difficult-to- debug issues, e.g. if the global null slot is modified. Constifying the return from id_to_memslot() to combat such issues is possible, but would require a massive refactoring of arch specific code and would still be susceptible to casting shenanigans. Add function comments to update_memslots() and search_memslots() to explicitly (and loudly) state how memslots are sorted. Opportunistically stuff @hva with a non-canonical value when deleting a private memslot on x86 to detect bogus usage of the freed slot. No functional change intended. Tested-by: Christoffer Dall <christoffer.dall@arm.com> Tested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: Simplify kvm_free_memslot() and all its descendentsSean Christopherson
Now that all callers of kvm_free_memslot() pass NULL for @dont, remove the param from the top-level routine and all arch's implementations. No functional change intended. Tested-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: Drop "const" attribute from old memslot in commit_memory_region()Sean Christopherson
Drop the "const" attribute from @old in kvm_arch_commit_memory_region() to allow arch specific code to free arch specific resources in the old memslot without having to cast away the attribute. Freeing resources in kvm_arch_commit_memory_region() paves the way for simplifying kvm_free_memslot() by eliminating the last usage of its @dont param. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: Drop kvm_arch_create_memslot()Sean Christopherson
Remove kvm_arch_create_memslot() now that all arch implementations are effectively nops. Removing kvm_arch_create_memslot() eliminates the possibility for arch specific code to allocate memory prior to setting a memslot, which sets the stage for simplifying kvm_free_memslot(). Cc: Janosch Frank <frankja@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-30Merge tag 'kvmarm-5.6' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm updates for Linux 5.6 - Fix MMIO sign extension - Fix HYP VA tagging on tag space exhaustion - Fix PSTATE/CPSR handling when generating exception - Fix MMU notifier's advertizing of young pages - Fix poisoned page handling - Fix PMU SW event handling - Fix TVAL register access - Fix AArch32 external abort injection - Fix ITS unmapped collection handling - Various cleanups
2020-01-27mm: thp: KVM: Explicitly check for THP when populating secondary MMUSean Christopherson
Add a helper, is_transparent_hugepage(), to explicitly check whether a compound page is a THP and use it when populating KVM's secondary MMU. The explicit check fixes a bug where a remapped compound page, e.g. for an XDP Rx socket, is mapped into a KVM guest and is mistaken for a THP, which results in KVM incorrectly creating a huge page in its secondary MMU. Fixes: 936a5fe6e6148 ("thp: kvm mmu transparent hugepage support") Reported-by: syzbot+c9d1fb51ac9d0d10c39d@syzkaller.appspotmail.com Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-23KVM: arm/arm64: Fix young bit from mmu notifierGavin Shan
kvm_test_age_hva() is called upon mmu_notifier_test_young(), but wrong address range has been passed to handle_hva_to_gpa(). With the wrong address range, no young bits will be checked in handle_hva_to_gpa(). It means zero is always returned from mmu_notifier_test_young(). This fixes the issue by passing correct address range to the underly function handle_hva_to_gpa(), so that the hardware young (access) bit will be visited. Fixes: 35307b9a5f7e ("arm/arm64: KVM: Implement Stage-2 page aging") Signed-off-by: Gavin Shan <gshan@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200121055659.19560-1-gshan@redhat.com
2020-01-23KVM: arm/arm64: Cleanup MMIO handlingMarc Zyngier
Our MMIO handling is a bit odd, in the sense that it uses an intermediate per-vcpu structure to store the various decoded information that describe the access. But the same information is readily available in the HSR/ESR_EL2 field, and we actually use this field to populate the structure. Let's simplify the whole thing by getting rid of the superfluous structure and save a (tiny) bit of space in the vcpu structure. [32bit fix courtesy of Olof Johansson <olof@lixom.net>] Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-01-19KVM: arm/arm64: Re-check VMA on detecting a poisoned pageJames Morse
When we check for a poisoned page, we use the VMA to tell userspace about the looming disaster. But we pass a pointer to this VMA after having released the mmap_sem, which isn't a good idea. Instead, stash the shift value that goes with this pfn while we are holding the mmap_sem. Reported-by: Marc Zyngier <maz@kernel.org> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Link: https://lore.kernel.org/r/20191211165651.7889-3-maz@kernel.org Link: https://lore.kernel.org/r/20191217123809.197392-1-james.morse@arm.com
2019-12-12KVM: arm/arm64: Properly handle faulting of device mappingsMarc Zyngier
A device mapping is normally always mapped at Stage-2, since there is very little gain in having it faulted in. Nonetheless, it is possible to end-up in a situation where the device mapping has been removed from Stage-2 (userspace munmaped the VFIO region, and the MMU notifier did its job), but present in a userspace mapping (userpace has mapped it back at the same address). In such a situation, the device mapping will be demand-paged as the guest performs memory accesses. This requires to be careful when dealing with mapping size, cache management, and to handle potential execution of a device mapping. Reported-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Alexandru Elisei <alexandru.elisei@arm.com> Reviewed-by: James Morse <james.morse@arm.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20191211165651.7889-2-maz@kernel.org
2019-12-06KVM: arm/arm64: Remove excessive permission check in ↵Jia He
kvm_arch_prepare_memory_region In kvm_arch_prepare_memory_region, arm kvm regards the memory region as writable if the flag has no KVM_MEM_READONLY, and the vm is readonly if !VM_WRITE. But there is common usage for setting kvm memory region as follows: e.g. qemu side (see the PROT_NONE flag) 1. mmap(NULL, size, PROT_NONE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0); memory_region_init_ram_ptr() 2. re mmap the above area with read/write authority. Such example is used in virtio-fs qemu codes which hasn't been upstreamed [1]. But seems we can't forbid this example. Without this patch, it will cause an EPERM during kvm_set_memory_region() and cause qemu boot crash. As told by Ard, "the underlying assumption is incorrect, i.e., that the value of vm_flags at this point in time defines how the VMA is used during its lifetime. There may be other cases where a VMA is created with VM_READ vm_flags that are changed to VM_READ|VM_WRITE later, and we are currently rejecting this use case as well." [1] https://gitlab.com/virtio-fs/qemu/blob/5a356e/hw/virtio/vhost-user-fs.c#L488 Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Jia He <justin.he@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Link: https://lore.kernel.org/r/20191206020802.196108-1-justin.he@arm.com
2019-07-12arm64: switch to generic version of pte allocationMike Rapoport
The PTE allocations in arm64 are identical to the generic ones modulo the GFP flags. Using the generic pte_alloc_one() functions ensures that the user page tables are allocated with __GFP_ACCOUNT set. The arm64 definition of PGALLOC_GFP is removed and replaced with GFP_PGTABLE_USER for p[gum]d_alloc_one() for the user page tables and GFP_PGTABLE_KERNEL for the kernel page tables. The KVM memory cache is now using GFP_PGTABLE_USER. The mappings created with create_pgd_mapping() are now using GFP_PGTABLE_KERNEL. The conversion to the generic version of pte_free_kernel() removes the NULL check for pte. The pte_free() version on arm64 is identical to the generic one and can be simply dropped. [cai@lca.pw: fix a bogus GFP flag in pgd_alloc()] Link: https://lore.kernel.org/r/1559656836-24940-1-git-send-email-cai@lca.pw/ [and fix it more] Link: https://lore.kernel.org/linux-mm/20190617151252.GF16810@rapoport-lnx/ Link: http://lkml.kernel.org/r/1557296232-15361-5-git-send-email-rppt@linux.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Guo Ren <ren_guo@c-sky.com> Cc: Helge Deller <deller@gmx.de> Cc: Ley Foon Tan <lftan@altera.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Paul Burton <paul.burton@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Sam Creasey <sammy@sammy.net> Cc: Vincent Chen <deanbo422@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-05treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 266Thomas Gleixner
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not write to the free software foundation 51 franklin street fifth floor boston ma 02110 1301 usa extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 67 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Richard Fontana <rfontana@redhat.com> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190529141333.953658117@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-06Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "Mostly just incremental improvements here: - Introduce AT_HWCAP2 for advertising CPU features to userspace - Expose SVE2 availability to userspace - Support for "data cache clean to point of deep persistence" (DC PODP) - Honour "mitigations=off" on the cmdline and advertise status via sysfs - CPU timer erratum workaround (Neoverse-N1 #1188873) - Introduce perf PMU driver for the SMMUv3 performance counters - Add config option to disable the kuser helpers page for AArch32 tasks - Futex modifications to ensure liveness under contention - Rework debug exception handling to seperate kernel and user handlers - Non-critical fixes and cleanup" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (92 commits) Documentation: Add ARM64 to kernel-parameters.rst arm64/speculation: Support 'mitigations=' cmdline option arm64: ssbs: Don't treat CPUs with SSBS as unaffected by SSB arm64: enable generic CPU vulnerabilites support arm64: add sysfs vulnerability show for speculative store bypass arm64: Fix size of __early_cpu_boot_status clocksource/arm_arch_timer: Use arch_timer_read_counter to access stable counters clocksource/arm_arch_timer: Remove use of workaround static key clocksource/arm_arch_timer: Drop use of static key in arch_timer_reg_read_stable clocksource/arm_arch_timer: Direcly assign set_next_event workaround arm64: Use arch_timer_read_counter instead of arch_counter_get_cntvct watchdog/sbsa: Use arch_timer_read_counter instead of arch_counter_get_cntvct ARM: vdso: Remove dependency with the arch_timer driver internals arm64: Apply ARM64_ERRATUM_1188873 to Neoverse-N1 arm64: Add part number for Neoverse N1 arm64: Make ARM64_ERRATUM_1188873 depend on COMPAT arm64: Restrict ARM64_ERRATUM_1188873 mitigation to AArch32 arm64: mm: Remove pte_unmap_nested() arm64: Fix compiler warning from pte_unmap() with -Wunused-but-set-variable arm64: compat: Reduce address limit for 64K pages ...
2019-04-25kvm: arm: Skip stage2 huge mappings for unaligned ipa backed by THPSuzuki K Poulose
With commit a80868f398554842b14, we no longer ensure that the THP page is properly aligned in the guest IPA. Skip the stage2 huge mapping for unaligned IPA backed by transparent hugepages. Fixes: a80868f398554842b14 ("KVM: arm/arm64: Enforce PTE mappings at stage2 when needed") Reported-by: Eric Auger <eric.auger@redhat.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Chirstoffer Dall <christoffer.dall@arm.com> Cc: Zenghui Yu <yuzenghui@huawei.com> Cc: Zheng Xiang <zhengxiang9@huawei.com> Cc: Andrew Murray <andrew.murray@arm.com> Cc: Eric Auger <eric.auger@redhat.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-04-09KVM: ARM: Remove pgtable page standard functions from stage-2 page tablesAnshuman Khandual
ARM64 standard pgtable functions are going to use pgtable_page_[ctor|dtor] or pgtable_pmd_page_[ctor|dtor] constructs. At present KVM guest stage-2 PUD|PMD|PTE level page tabe pages are allocated with __get_free_page() via mmu_memory_cache_alloc() but released with standard pud|pmd_free() or pte_free_kernel(). These will fail once they start calling into pgtable_ [pmd]_page_dtor() for pages which never originally went through respective constructor functions. Hence convert all stage-2 page table page release functions to call buddy directly while freeing pages. Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Acked-by: Yu Zhao <yuzhao@google.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-03-28KVM: arm/arm64: Comments cleanup in mmu.cZenghui Yu
Some comments in virt/kvm/arm/mmu.c are outdated. Update them to reflect the current state of the code. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> [maz: commit message tidy-up] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-03-20KVM: arm/arm64: Fix handling of stage2 huge mappingsSuzuki K Poulose
We rely on the mmu_notifier call backs to handle the split/merge of huge pages and thus we are guaranteed that, while creating a block mapping, either the entire block is unmapped at stage2 or it is missing permission. However, we miss a case where the block mapping is split for dirty logging case and then could later be made block mapping, if we cancel the dirty logging. This not only creates inconsistent TLB entries for the pages in the the block, but also leakes the table pages for PMD level. Handle this corner case for the huge mappings at stage2 by unmapping the non-huge mapping for the block. This could potentially release the upper level table. So we need to restart the table walk once we unmap the range. Fixes : ad361f093c1e31d ("KVM: ARM: Support hugetlbfs backed huge pages") Reported-by: Zheng Xiang <zhengxiang9@huawei.com> Cc: Zheng Xiang <zhengxiang9@huawei.com> Cc: Zenghui Yu <yuzenghui@huawei.com> Cc: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-03-19KVM: arm/arm64: Enforce PTE mappings at stage2 when neededSuzuki K Poulose
commit 6794ad5443a2118 ("KVM: arm/arm64: Fix unintended stage 2 PMD mappings") made the checks to skip huge mappings, stricter. However it introduced a bug where we still use huge mappings, ignoring the flag to use PTE mappings, by not reseting the vma_pagesize to PAGE_SIZE. Also, the checks do not cover the PUD huge pages, that was under review during the same period. This patch fixes both the issues. Fixes : 6794ad5443a2118 ("KVM: arm/arm64: Fix unintended stage 2 PMD mappings") Reported-by: Zenghui Yu <yuzenghui@huawei.com> Cc: Zenghui Yu <yuzenghui@huawei.com> Cc: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-03-15Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "ARM: - some cleanups - direct physical timer assignment - cache sanitization for 32-bit guests s390: - interrupt cleanup - introduction of the Guest Information Block - preparation for processor subfunctions in cpu models PPC: - bug fixes and improvements, especially related to machine checks and protection keys x86: - many, many cleanups, including removing a bunch of MMU code for unnecessary optimizations - AVIC fixes Generic: - memcg accounting" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (147 commits) kvm: vmx: fix formatting of a comment KVM: doc: Document the life cycle of a VM and its resources MAINTAINERS: Add KVM selftests to existing KVM entry Revert "KVM/MMU: Flush tlb directly in the kvm_zap_gfn_range()" KVM: PPC: Book3S: Add count cache flush parameters to kvmppc_get_cpu_char() KVM: PPC: Fix compilation when KVM is not enabled KVM: Minor cleanups for kvm_main.c KVM: s390: add debug logging for cpu model subfunctions KVM: s390: implement subfunction processor calls arm64: KVM: Fix architecturally invalid reset value for FPEXC32_EL2 KVM: arm/arm64: Remove unused timer variable KVM: PPC: Book3S: Improve KVM reference counting KVM: PPC: Book3S HV: Fix build failure without IOMMU support Revert "KVM: Eliminate extra function calls in kvm_get_dirty_log_protect()" x86: kvmguest: use TSC clocksource if invariant TSC is exposed KVM: Never start grow vCPU halt_poll_ns from value below halt_poll_ns_grow_start KVM: Expose the initial start value in grow_halt_poll_ns() as a module parameter KVM: grow_halt_poll_ns() should never shrink vCPU halt_poll_ns KVM: x86/mmu: Consolidate kvm_mmu_zap_all() and kvm_mmu_zap_mmio_sptes() KVM: x86/mmu: WARN if zapping a MMIO spte results in zapping children ...
2019-03-04Merge branch 'acpi-apei'Rafael J. Wysocki
* acpi-apei: (29 commits) efi: cper: Fix possible out-of-bounds access ACPI: APEI: Fix possible out-of-bounds access to BERT region MAINTAINERS: Add James Morse to the list of APEI reviewers ACPI / APEI: Add support for the SDEI GHES Notification type firmware: arm_sdei: Add ACPI GHES registration helper ACPI / APEI: Use separate fixmap pages for arm64 NMI-like notifications ACPI / APEI: Only use queued estatus entry during in_nmi_queue_one_entry() ACPI / APEI: Split ghes_read_estatus() to allow a peek at the CPER length ACPI / APEI: Make GHES estatus header validation more user friendly ACPI / APEI: Pass ghes and estatus separately to avoid a later copy ACPI / APEI: Let the notification helper specify the fixmap slot ACPI / APEI: Move locking to the notification helper arm64: KVM/mm: Move SEA handling behind a single 'claim' interface KVM: arm/arm64: Add kvm_ras.h to collect kvm specific RAS plumbing ACPI / APEI: Switch NOTIFY_SEA to use the estatus queue ACPI / APEI: Move NOTIFY_SEA between the estatus-queue and NOTIFY_NMI ACPI / APEI: Don't allow ghes_ack_error() to mask earlier errors ACPI / APEI: Generalise the estatus queue's notify code ACPI / APEI: Don't update struct ghes' flags in read/clear estatus ACPI / APEI: Remove spurious GHES_TO_CLEAR check ...
2019-02-22Merge tag 'kvmarm-for-v5.1' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-next KVM/arm updates for Linux v5.1 - A number of pre-nested code rework - Direct physical timer assignment on VHE systems - kvm_call_hyp type safety enforcement - Set/Way cache sanitisation for 32bit guests - Build system cleanups - A bunch of janitorial fixes
2019-02-20KVM: Call kvm_arch_memslots_updated() before updating memslotsSean Christopherson
kvm_arch_memslots_updated() is at this point in time an x86-specific hook for handling MMIO generation wraparound. x86 stashes 19 bits of the memslots generation number in its MMIO sptes in order to avoid full page fault walks for repeat faults on emulated MMIO addresses. Because only 19 bits are used, wrapping the MMIO generation number is possible, if unlikely. kvm_arch_memslots_updated() alerts x86 that the generation has changed so that it can invalidate all MMIO sptes in case the effective MMIO generation has wrapped so as to avoid using a stale spte, e.g. a (very) old spte that was created with generation==0. Given that the purpose of kvm_arch_memslots_updated() is to prevent consuming stale entries, it needs to be called before the new generation is propagated to memslots. Invalidating the MMIO sptes after updating memslots means that there is a window where a vCPU could dereference the new memslots generation, e.g. 0, and incorrectly reuse an old MMIO spte that was created with (pre-wrap) generation==0. Fixes: e59dbe09f8e6 ("KVM: Introduce kvm_arch_memslots_updated()") Cc: <stable@vger.kernel.org> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-20KVM: arm/arm64: Remove unused gpa_end variableShaokun Zhang
The 'gpa_end' local variable is never used and let's remove it. Cc: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-02-19KVM: arm/arm64: Move kvm_is_write_fault to header fileChristoffer Dall
Move this little function to the header files for arm/arm64 so other code can make use of it directly. Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-02-19KVM: arm/arm64: Factor out VMID into struct kvm_vmidChristoffer Dall
In preparation for nested virtualization where we are going to have more than a single VMID per VM, let's factor out the VMID data into a separate VMID data structure and change the VMID allocator to operate on this new structure instead of using a struct kvm. This also means that udate_vttbr now becomes update_vmid, and that the vttbr itself is generated on the fly based on the stage 2 page table base address and the vmid. We cache the physical address of the pgd when allocating the pgd to avoid doing the calculation on every entry to the guest and to avoid calling into potentially non-hyp-mapped code from hyp/EL2. If we wanted to merge the VMID allocator with the arm64 ASID allocator at some point in the future, it should actually become easier to do that after this patch. Note that to avoid mapping the kvm_vmid_bits variable into hyp, we simply forego the masking of the vmid value in kvm_get_vttbr and rely on update_vmid to always assign a valid vmid value (within the supported range). Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> [maz: minor cleanups] Reviewed-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-02-07KVM: arm/arm64: Add kvm_ras.h to collect kvm specific RAS plumbingJames Morse
To split up APEIs in_nmi() path, the caller needs to always be in_nmi(). KVM shouldn't have to know about this, pull the RAS plumbing out into a header file. Currently guest synchronous external aborts are claimed as RAS notifications by handle_guest_sea(), which is hidden in the arch codes mm/fault.c. 32bit gets a dummy declaration in system_misc.h. There is going to be more of this in the future if/when the kernel supports the SError-based firmware-first notification mechanism and/or kernel-first notifications for both synchronous external abort and SError. Each of these will come with some Kconfig symbols and a handful of header files. Create a header file for all this. This patch gives handle_guest_sea() a 'kvm_' prefix, and moves the declarations to kvm_ras.h as preparation for a future patch that moves the ACPI-specific RAS code out of mm/fault.c. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Punit Agrawal <punit.agrawal@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Tyler Baicar <tbaicar@codeaurora.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-02-07KVM: arm64: Relax the restriction on using stage2 PUD huge mappingSuzuki K Poulose
We restrict mapping the PUD huge pages in stage2 to only when the stage2 has 4 level page table, leaving the feature unused with the default IPA size. But we could use it even with a 3 level page table, i.e, when the PUD level is folded into PGD, just like the stage1. Relax the condition to allow using the PUD huge page mappings at stage2 when it is possible. Cc: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-01-04mm: treewide: remove unused address argument from pte_alloc functionsJoel Fernandes (Google)
Patch series "Add support for fast mremap". This series speeds up the mremap(2) syscall by copying page tables at the PMD level even for non-THP systems. There is concern that the extra 'address' argument that mremap passes to pte_alloc may do something subtle architecture related in the future that may make the scheme not work. Also we find that there is no point in passing the 'address' to pte_alloc since its unused. This patch therefore removes this argument tree-wide resulting in a nice negative diff as well. Also ensuring along the way that the enabled architectures do not do anything funky with the 'address' argument that goes unnoticed by the optimization. Build and boot tested on x86-64. Build tested on arm64. The config enablement patch for arm64 will be posted in the future after more testing. The changes were obtained by applying the following Coccinelle script. (thanks Julia for answering all Coccinelle questions!). Following fix ups were done manually: * Removal of address argument from pte_fragment_alloc * Removal of pte_alloc_one_fast definitions from m68k and microblaze. // Options: --include-headers --no-includes // Note: I split the 'identifier fn' line, so if you are manually // running it, please unsplit it so it runs for you. virtual patch @pte_alloc_func_def depends on patch exists@ identifier E2; identifier fn =~ "^(__pte_alloc|pte_alloc_one|pte_alloc|__pte_alloc_kernel|pte_alloc_one_kernel)$"; type T2; @@ fn(... - , T2 E2 ) { ... } @pte_alloc_func_proto_noarg depends on patch exists@ type T1, T2, T3, T4; identifier fn =~ "^(__pte_alloc|pte_alloc_one|pte_alloc|__pte_alloc_kernel|pte_alloc_one_kernel)$"; @@ ( - T3 fn(T1, T2); + T3 fn(T1); | - T3 fn(T1, T2, T4); + T3 fn(T1, T2); ) @pte_alloc_func_proto depends on patch exists@ identifier E1, E2, E4; type T1, T2, T3, T4; identifier fn =~ "^(__pte_alloc|pte_alloc_one|pte_alloc|__pte_alloc_kernel|pte_alloc_one_kernel)$"; @@ ( - T3 fn(T1 E1, T2 E2); + T3 fn(T1 E1); | - T3 fn(T1 E1, T2 E2, T4 E4); + T3 fn(T1 E1, T2 E2); ) @pte_alloc_func_call depends on patch exists@ expression E2; identifier fn =~ "^(__pte_alloc|pte_alloc_one|pte_alloc|__pte_alloc_kernel|pte_alloc_one_kernel)$"; @@ fn(... -, E2 ) @pte_alloc_macro depends on patch exists@ identifier fn =~ "^(__pte_alloc|pte_alloc_one|pte_alloc|__pte_alloc_kernel|pte_alloc_one_kernel)$"; identifier a, b, c; expression e; position p; @@ ( - #define fn(a, b, c) e + #define fn(a, b) e | - #define fn(a, b) e + #define fn(a) e ) Link: http://lkml.kernel.org/r/20181108181201.88826-2-joelaf@google.com Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Suggested-by: Kirill A. Shutemov <kirill@shutemov.name> Acked-by: Kirill A. Shutemov <kirill@shutemov.name> Cc: Michal Hocko <mhocko@kernel.org> Cc: Julia Lawall <Julia.Lawall@lip6.fr> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: William Kucharski <william.kucharski@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-21KVM: Make kvm_set_spte_hva() return intLan Tianyu
The patch is to make kvm_set_spte_hva() return int and caller can check return value to determine flush tlb or not. Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-12-19KVM: arm/arm64: Fix unintended stage 2 PMD mappingsChristoffer Dall
There are two things we need to take care of when we create block mappings in the stage 2 page tables: (1) The alignment within a PMD between the host address range and the guest IPA range must be the same, since otherwise we end up mapping pages with the wrong offset. (2) The head and tail of a memory slot may not cover a full block size, and we have to take care to not map those with block descriptors, since we could expose memory to the guest that the host did not intend to expose. So far, we have been taking care of (1), but not (2), and our commentary describing (1) was somewhat confusing. This commit attempts to factor out the checks of both into a common function, and if we don't pass the check, we won't attempt any PMD mappings for neither hugetlbfs nor THP. Note that we used to only check the alignment for THP, not for hugetlbfs, but as far as I can tell the check needs to be applied to both scenarios. Cc: Ralph Palutke <ralph.palutke@fau.de> Cc: Lukas Braun <koomi@moshbit.net> Reported-by: Lukas Braun <koomi@moshbit.net> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-12-18KVM: arm64: Clarify explanation of STAGE2_PGTABLE_LEVELSChristoffer Dall
In attempting to re-construct the logic for our stage 2 page table layout I found the reasoning in the comment explaining how we calculate the number of levels used for stage 2 page tables a bit backwards. This commit attempts to clarify the comment, to make it slightly easier to read without having the Arm ARM open on the right page. While we're at it, fixup a typo in a comment that was recently changed. Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-12-18KVM: arm64: Add support for creating PUD hugepages at stage 2Punit Agrawal
KVM only supports PMD hugepages at stage 2. Now that the various page handling routines are updated, extend the stage 2 fault handling to map in PUD hugepages. Addition of PUD hugepage support enables additional page sizes (e.g., 1G with 4K granule) which can be useful on cores that support mapping larger block sizes in the TLB entries. Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> [ Replace BUG() => WARN_ON(1) for arm32 PUD helpers ] Signed-off-by: Suzuki Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-12-18KVM: arm64: Update age handlers to support PUD hugepagesPunit Agrawal
In preparation for creating larger hugepages at Stage 2, add support to the age handling notifiers for PUD hugepages when encountered. Provide trivial helpers for arm32 to allow sharing code. Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> [ Replaced BUG() => WARN_ON(1) for arm32 PUD helpers ] Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-12-18KVM: arm64: Support handling access faults for PUD hugepagesPunit Agrawal
In preparation for creating larger hugepages at Stage 2, extend the access fault handling at Stage 2 to support PUD hugepages when encountered. Provide trivial helpers for arm32 to allow sharing of code. Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> [ Replaced BUG() => WARN_ON(1) in PUD helpers ] Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-12-18KVM: arm64: Support PUD hugepage in stage2_is_exec()Punit Agrawal
In preparation for creating PUD hugepages at stage 2, add support for detecting execute permissions on PUD page table entries. Faults due to lack of execute permissions on page table entries is used to perform i-cache invalidation on first execute. Provide trivial implementations of arm32 helpers to allow sharing of code. Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> [ Replaced BUG() => WARN_ON(1) in arm32 PUD helpers ] Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-12-18KVM: arm64: Support dirty page tracking for PUD hugepagesPunit Agrawal
In preparation for creating PUD hugepages at stage 2, add support for write protecting PUD hugepages when they are encountered. Write protecting guest tables is used to track dirty pages when migrating VMs. Also, provide trivial implementations of required kvm_s2pud_* helpers to allow sharing of code with arm32. Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> [ Replaced BUG() => WARN_ON() in arm32 pud helpers ] Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-12-18KVM: arm/arm64: Introduce helpers to manipulate page table entriesPunit Agrawal
Introduce helpers to abstract architectural handling of the conversion of pfn to page table entries and marking a PMD page table entry as a block entry. The helpers are introduced in preparation for supporting PUD hugepages at stage 2 - which are supported on arm64 but do not exist on arm. Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Acked-by: Christoffer Dall <christoffer.dall@arm.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-12-18KVM: arm/arm64: Re-factor setting the Stage 2 entry to exec on faultPunit Agrawal
Stage 2 fault handler marks a page as executable if it is handling an execution fault or if it was a permission fault in which case the executable bit needs to be preserved. The logic to decide if the page should be marked executable is duplicated for PMD and PTE entries. To avoid creating another copy when support for PUD hugepages is introduced refactor the code to share the checks needed to mark a page table entry as executable. Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-12-18KVM: arm/arm64: Share common code in user_mem_abort()Punit Agrawal
The code for operations such as marking the pfn as dirty, and dcache/icache maintenance during stage 2 fault handling is duplicated between normal pages and PMD hugepages. Instead of creating another copy of the operations when we introduce PUD hugepages, let's share them across the different pagesizes. Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-10-25Merge tag 'kvm-4.20-1' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Radim Krčmář: "ARM: - Improved guest IPA space support (32 to 52 bits) - RAS event delivery for 32bit - PMU fixes - Guest entry hardening - Various cleanups - Port of dirty_log_test selftest PPC: - Nested HV KVM support for radix guests on POWER9. The performance is much better than with PR KVM. Migration and arbitrary level of nesting is supported. - Disable nested HV-KVM on early POWER9 chips that need a particular hardware bug workaround - One VM per core mode to prevent potential data leaks - PCI pass-through optimization - merge ppc-kvm topic branch and kvm-ppc-fixes to get a better base s390: - Initial version of AP crypto virtualization via vfio-mdev - Improvement for vfio-ap - Set the host program identifier - Optimize page table locking x86: - Enable nested virtualization by default - Implement Hyper-V IPI hypercalls - Improve #PF and #DB handling - Allow guests to use Enlightened VMCS - Add migration selftests for VMCS and Enlightened VMCS - Allow coalesced PIO accesses - Add an option to perform nested VMCS host state consistency check through hardware - Automatic tuning of lapic_timer_advance_ns - Many fixes, minor improvements, and cleanups" * tag 'kvm-4.20-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (204 commits) KVM/nVMX: Do not validate that posted_intr_desc_addr is page aligned Revert "kvm: x86: optimize dr6 restore" KVM: PPC: Optimize clearing TCEs for sparse tables x86/kvm/nVMX: tweak shadow fields selftests/kvm: add missing executables to .gitignore KVM: arm64: Safety check PSTATE when entering guest and handle IL KVM: PPC: Book3S HV: Don't use streamlined entry path on early POWER9 chips arm/arm64: KVM: Enable 32 bits kvm vcpu events support arm/arm64: KVM: Rename function kvm_arch_dev_ioctl_check_extension() KVM: arm64: Fix caching of host MDCR_EL2 value KVM: VMX: enable nested virtualization by default KVM/x86: Use 32bit xor to clear registers in svm.c kvm: x86: Introduce KVM_CAP_EXCEPTION_PAYLOAD kvm: vmx: Defer setting of DR6 until #DB delivery kvm: x86: Defer setting of CR2 until #PF delivery kvm: x86: Add payload operands to kvm_multiple_exception kvm: x86: Add exception payload fields to kvm_vcpu_events kvm: x86: Add has_payload and payload to kvm_queued_exception KVM: Documentation: Fix omission in struct kvm_vcpu_events KVM: selftests: add Enlightened VMCS test ...
2018-10-24Merge branch 'siginfo-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull siginfo updates from Eric Biederman: "I have been slowly sorting out siginfo and this is the culmination of that work. The primary result is in several ways the signal infrastructure has been made less error prone. The code has been updated so that manually specifying SEND_SIG_FORCED is never necessary. The conversion to the new siginfo sending functions is now complete, which makes it difficult to send a signal without filling in the proper siginfo fields. At the tail end of the patchset comes the optimization of decreasing the size of struct siginfo in the kernel from 128 bytes to about 48 bytes on 64bit. The fundamental observation that enables this is by definition none of the known ways to use struct siginfo uses the extra bytes. This comes at the cost of a small user space observable difference. For the rare case of siginfo being injected into the kernel only what can be copied into kernel_siginfo is delivered to the destination, the rest of the bytes are set to 0. For cases where the signal and the si_code are known this is safe, because we know those bytes are not used. For cases where the signal and si_code combination is unknown the bits that won't fit into struct kernel_siginfo are tested to verify they are zero, and the send fails if they are not. I made an extensive search through userspace code and I could not find anything that would break because of the above change. If it turns out I did break something it will take just the revert of a single change to restore kernel_siginfo to the same size as userspace siginfo. Testing did reveal dependencies on preferring the signo passed to sigqueueinfo over si->signo, so bit the bullet and added the complexity necessary to handle that case. Testing also revealed bad things can happen if a negative signal number is passed into the system calls. Something no sane application will do but something a malicious program or a fuzzer might do. So I have fixed the code that performs the bounds checks to ensure negative signal numbers are handled" * 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (80 commits) signal: Guard against negative signal numbers in copy_siginfo_from_user32 signal: Guard against negative signal numbers in copy_siginfo_from_user signal: In sigqueueinfo prefer sig not si_signo signal: Use a smaller struct siginfo in the kernel signal: Distinguish between kernel_siginfo and siginfo signal: Introduce copy_siginfo_from_user and use it's return value signal: Remove the need for __ARCH_SI_PREABLE_SIZE and SI_PAD_SIZE signal: Fail sigqueueinfo if si_signo != sig signal/sparc: Move EMT_TAGOVF into the generic siginfo.h signal/unicore32: Use force_sig_fault where appropriate signal/unicore32: Generate siginfo in ucs32_notify_die signal/unicore32: Use send_sig_fault where appropriate signal/arc: Use force_sig_fault where appropriate signal/arc: Push siginfo generation into unhandled_exception signal/ia64: Use force_sig_fault where appropriate signal/ia64: Use the force_sig(SIGSEGV,...) in ia64_rt_sigreturn signal/ia64: Use the generic force_sigsegv in setup_frame signal/arm/kvm: Use send_sig_mceerr signal/arm: Use send_sig_fault where appropriate signal/arm: Use force_sig_fault where appropriate ...
2018-10-03KVM: arm/arm64: Ensure only THP is candidate for adjustmentPunit Agrawal
PageTransCompoundMap() returns true for hugetlbfs and THP hugepages. This behaviour incorrectly leads to stage 2 faults for unsupported hugepage sizes (e.g., 64K hugepage with 4K pages) to be treated as THP faults. Tighten the check to filter out hugetlbfs pages. This also leads to consistently mapping all unsupported hugepage sizes as PTE level entries at stage 2. Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Reviewed-by: Suzuki Poulose <suzuki.poulose@arm.com> Cc: Christoffer Dall <christoffer.dall@arm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: stable@vger.kernel.org # v4.13+ Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-10-01kvm: arm/arm64: Prepare for VM specific stage2 translationsSuzuki K Poulose
Right now the stage2 page table for a VM is hard coded, assuming an IPA of 40bits. As we are about to add support for per VM IPA, prepare the stage2 page table helpers to accept the kvm instance to make the right decision for the VM. No functional changes. Adds stage2_pgd_size(kvm) to replace S2_PGD_SIZE. Also, moves some of the definitions in arm32 to align with the arm64. Also drop the _AC() specifier constants wherever possible. Cc: Christoffer Dall <cdall@kernel.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-10-01kvm: arm/arm64: Remove spurious WARN_ONSuzuki K Poulose
On a 4-level page table pgd entry can be empty, unlike a 3-level page table. Remove the spurious WARN_ON() in stage_get_pud(). Acked-by: Christoffer Dall <cdall@kernel.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-10-01kvm: arm/arm64: Fix stage2_flush_memslot for 4 level page tableSuzuki K Poulose
So far we have only supported 3 level page table with fixed IPA of 40bits, where PUD is folded. With 4 level page tables, we need to check if the PUD entry is valid or not. Fix stage2_flush_memslot() to do this check, before walking down the table. Acked-by: Christoffer Dall <cdall@kernel.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-09-27signal/arm/kvm: Use send_sig_mceerrEric W. Biederman
This simplifies the code making it clearer what is going on, and making the siginfo generation easier to maintain. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-09-07KVM: Remove obsolete kvm_unmap_hva notifier backendMarc Zyngier
kvm_unmap_hva is long gone, and we only have kvm_unmap_hva_range to deal with. Drop the now obsolete code. Fixes: fb1522e099f0 ("KVM: update to new mmu_notifier semantic v2") Cc: James Hogan <jhogan@kernel.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
2018-09-07KVM: arm/arm64: Clean dcache to PoC when changing PTE due to CoWMarc Zyngier
When triggering a CoW, we unmap the RO page via an MMU notifier (invalidate_range_start), and then populate the new PTE using another one (change_pte). In the meantime, we'll have copied the old page into the new one. The problem is that the data for the new page is sitting in the cache, and should the guest have an uncached mapping to that page (or its MMU off), following accesses will bypass the cache. In a way, this is similar to what happens on a translation fault: We need to clean the page to the PoC before mapping it. So let's just do that. This fixes a KVM unit test regression observed on a HiSilicon platform, and subsequently reproduced on Seattle. Fixes: a9c0e12ebee5 ("KVM: arm/arm64: Only clean the dcache on translation fault") Cc: stable@vger.kernel.org # v4.16+ Reported-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>