summaryrefslogtreecommitdiff
path: root/arch/mips/kvm/mmu.c
AgeCommit message (Collapse)Author
2024-10-25KVM: MIPS: Use kvm_faultin_pfn() to map pfns into the guestSean Christopherson
Convert MIPS to kvm_faultin_pfn()+kvm_release_faultin_page(), which are new APIs to consolidate arch code and provide consistent behavior across all KVM architectures. Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-73-seanjc@google.com>
2024-10-25KVM: MIPS: Mark "struct page" pfns accessed prior to dropping mmu_lockSean Christopherson
Mark pages accessed before dropping mmu_lock when faulting in guest memory so that MIPS can convert to kvm_release_faultin_page() without tripping its lockdep assertion on mmu_lock being held. Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-72-seanjc@google.com>
2024-10-25KVM: MIPS: Mark "struct page" pfns accessed only in "slow" page fault pathSean Christopherson
Mark pages accessed only in the slow page fault path in order to remove an unnecessary user of kvm_pfn_to_refcounted_page(). Marking pages accessed in the primary MMU during KVM page fault handling isn't harmful, but it's largely pointless and likely a waste of a cycles since the primary MMU will call into KVM via mmu_notifiers when aging pages. I.e. KVM participates in a "pull" model, so there's no need to also "push" updates. Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-71-seanjc@google.com>
2024-10-25KVM: MIPS: Mark "struct page" pfns dirty only in "slow" page fault pathSean Christopherson
Mark pages/folios dirty only the slow page fault path, i.e. only when mmu_lock is held and the operation is mmu_notifier-protected, as marking a page/folio dirty after it has been written back can make some filesystems unhappy (backing KVM guests will such filesystem files is uncommon, and the race is minuscule, hence the lack of complaints). See the link below for details. Link: https://lore.kernel.org/all/cover.1683044162.git.lstoakes@gmail.com Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-70-seanjc@google.com>
2024-04-11KVM: delete .change_pte MMU notifier callbackPaolo Bonzini
The .change_pte() MMU notifier callback was intended as an optimization. The original point of it was that KSM could tell KVM to flip its secondary PTE to a new location without having to first zap it. At the time there was also an .invalidate_page() callback; both of them were *not* bracketed by calls to mmu_notifier_invalidate_range_{start,end}(), and .invalidate_page() also doubled as a fallback implementation of .change_pte(). Later on, however, both callbacks were changed to occur within an invalidate_range_start/end() block. In the case of .change_pte(), commit 6bdb913f0a70 ("mm: wrap calls to set_pte_at_notify with invalidate_range_start and invalidate_range_end", 2012-10-09) did so to remove the fallback from .invalidate_page() to .change_pte() and allow sleepable .invalidate_page() hooks. This however made KVM's usage of the .change_pte() callback completely moot, because KVM unmaps the sPTEs during .invalidate_range_start() and therefore .change_pte() has no hope of finding a sPTE to change. Drop the generic KVM code that dispatches to kvm_set_spte_gfn(), as well as all the architecture specific implementations. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Anup Patel <anup@brainfault.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Reviewed-by: Bibo Mao <maobibo@loongson.cn> Message-ID: <20240405115815.3226315-2-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-10-12KVM: MIPS: fix -Wunused-but-set-variable warningPaolo Bonzini
The variable is completely unused, remove it. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-08-17KVM: Wrap kvm_{gfn,hva}_range.pte in a per-action unionSean Christopherson
Wrap kvm_{gfn,hva}_range.pte in a union so that future notifier events can pass event specific information up and down the stack without needing to constantly expand and churn the APIs. Lockless aging of SPTEs will pass around a bitmap, and support for memory attributes will pass around the new attributes for the range. Add a "KVM_NO_ARG" placeholder to simplify handling events without an argument (creating a dummy union variable is midly annoying). Opportunstically drop explicit zero-initialization of the "pte" field, as omitting the field (now a union) has the same effect. Cc: Yu Zhao <yuzhao@google.com> Link: https://lore.kernel.org/all/CAOUHufagkd2Jk3_HrVoFFptRXM=hX2CV8f+M-dka-hJU4bP8kw@mail.gmail.com Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Acked-by: Yu Zhao <yuzhao@google.com> Link: https://lore.kernel.org/r/20230729004144.1054885-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2022-12-11MIPS&LoongArch&NIOS2: adjust prototypes of p?d_init()Feiyang Chen
Patch series "mm/sparse-vmemmap: Generalise helpers and enable for LoongArch", v14. This series is in order to enable sparse-vmemmap for LoongArch. But LoongArch cannot use generic helpers directly because MIPS&LoongArch need to call pgd_init()/pud_init()/pmd_init() when populating page tables. So we adjust the prototypes of p?d_init() to make generic helpers can call them, then enable sparse-vmemmap with generic helpers, and to be further, generalise vmemmap_populate_hugepages() for ARM64, X86 and LoongArch. This patch (of 4): We are preparing to add sparse vmemmap support to LoongArch. MIPS and LoongArch need to call pgd_init()/pud_init()/pmd_init() when populating page tables, so adjust their prototypes to make generic helpers can call them. NIOS2 declares pmd_init() but doesn't use, just remove it to avoid build errors. Link: https://lkml.kernel.org/r/20221027125253.3458989-1-chenhuacai@loongson.cn Link: https://lkml.kernel.org/r/20221027125253.3458989-2-chenhuacai@loongson.cn Signed-off-by: Feiyang Chen <chenfeiyang@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Will Deacon <will@kernel.org> Cc: Xuefeng Li <lixuefeng@loongson.cn> Cc: Xuerui Wang <kernel@xen0n.name> Cc: Min Zhou <zhoumin@loongson.cn> Cc: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-08-19KVM: Rename mmu_notifier_* to mmu_invalidate_*Chao Peng
The motivation of this renaming is to make these variables and related helper functions less mmu_notifier bound and can also be used for non mmu_notifier based page invalidation. mmu_invalidate_* was chosen to better describe the purpose of 'invalidating' a page that those variables are used for. - mmu_notifier_seq/range_start/range_end are renamed to mmu_invalidate_seq/range_start/range_end. - mmu_notifier_retry{_hva} helper functions are renamed to mmu_invalidate_retry{_hva}. - mmu_notifier_count is renamed to mmu_invalidate_in_progress to avoid confusion with mn_active_invalidate_count. - While here, also update kvm_inc/dec_notifier_count() to kvm_mmu_invalidate_begin/end() to match the change for mmu_notifier_count. No functional change intended. Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com> Message-Id: <20220816125322.1110439-3-chao.p.peng@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-07-17mips: rename PGD_ORDER to PGD_TABLE_ORDERMike Rapoport
This is the order of the page table allocation, not the order of a PGD. While at it remove unused defintion of _PGD_ORDER in asm-offsets. Link: https://lkml.kernel.org/r/20220703141203.147893-7-rppt@kernel.org Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Helge Deller <deller@gmx.de> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Xuerui Wang <kernel@xen0n.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2021-08-21MIPS: Return true/false (not 1/0) from bool functionsHuilong Deng
./arch/mips/kernel/uprobes.c:261:8-9: WARNING: return of 0/1 in function 'arch_uprobe_skip_sstep' with return type bool ./arch/mips/kernel/uprobes.c:78:10-11: WARNING: return of 0/1 in function 'is_trap_insn' with return type bool ./arch/mips/kvm/mmu.c:489:9-10: WARNING: return of 0/1 in function 'kvm_test_age_gfn' with return type bool ./arch/mips/kvm/mmu.c:445:8-9: WARNING: return of 0/1 in function 'kvm_unmap_gfn_range' with return type bool Signed-off-by: Huilong Deng <denghuilong@cdjrlc.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-05-01Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm updates from Paolo Bonzini: "This is a large update by KVM standards, including AMD PSP (Platform Security Processor, aka "AMD Secure Technology") and ARM CoreSight (debug and trace) changes. ARM: - CoreSight: Add support for ETE and TRBE - Stage-2 isolation for the host kernel when running in protected mode - Guest SVE support when running in nVHE mode - Force W^X hypervisor mappings in nVHE mode - ITS save/restore for guests using direct injection with GICv4.1 - nVHE panics now produce readable backtraces - Guest support for PTP using the ptp_kvm driver - Performance improvements in the S2 fault handler x86: - AMD PSP driver changes - Optimizations and cleanup of nested SVM code - AMD: Support for virtual SPEC_CTRL - Optimizations of the new MMU code: fast invalidation, zap under read lock, enable/disably dirty page logging under read lock - /dev/kvm API for AMD SEV live migration (guest API coming soon) - support SEV virtual machines sharing the same encryption context - support SGX in virtual machines - add a few more statistics - improved directed yield heuristics - Lots and lots of cleanups Generic: - Rework of MMU notifier interface, simplifying and optimizing the architecture-specific code - a handful of "Get rid of oprofile leftovers" patches - Some selftests improvements" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (379 commits) KVM: selftests: Speed up set_memory_region_test selftests: kvm: Fix the check of return value KVM: x86: Take advantage of kvm_arch_dy_has_pending_interrupt() KVM: SVM: Skip SEV cache flush if no ASIDs have been used KVM: SVM: Remove an unnecessary prototype declaration of sev_flush_asids() KVM: SVM: Drop redundant svm_sev_enabled() helper KVM: SVM: Move SEV VMCB tracking allocation to sev.c KVM: SVM: Explicitly check max SEV ASID during sev_hardware_setup() KVM: SVM: Unconditionally invoke sev_hardware_teardown() KVM: SVM: Enable SEV/SEV-ES functionality by default (when supported) KVM: SVM: Condition sev_enabled and sev_es_enabled on CONFIG_KVM_AMD_SEV=y KVM: SVM: Append "_enabled" to module-scoped SEV/SEV-ES control variables KVM: SEV: Mask CPUID[0x8000001F].eax according to supported features KVM: SVM: Move SEV module params/variables to sev.c KVM: SVM: Disable SEV/SEV-ES if NPT is disabled KVM: SVM: Free sev_asid_bitmap during init if SEV setup fails KVM: SVM: Zero out the VMCB array used to track SEV ASID association x86/sev: Drop redundant and potentially misleading 'sev_enabled' KVM: x86: Move reverse CPUID helpers to separate header file KVM: x86: Rename GPR accessors to make mode-aware variants the defaults ...
2021-04-17KVM: MIPS/MMU: Convert to the gfn-based MMU notifier callbacksSean Christopherson
Move MIPS to the gfn-based MMU notifier APIs, which do the hva->gfn lookup in common code, and whose code is nearly identical to MIPS' lookup. No meaningful functional change intended, though the exact order of operations is slightly different since the memslot lookups occur before calling into arch code. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210402005658.3024832-5-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-17KVM: MIPS: defer flush to generic MMU notifier codePaolo Bonzini
Return 1 from kvm_unmap_hva_range and kvm_set_spte_hva if a flush is needed, so that the generic code can coalesce the flushes. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-17KVM: MIPS: let generic code call prepare_flush_shadowPaolo Bonzini
Since all calls to kvm_flush_remote_tlbs must be preceded by kvm_mips_callbacks->prepare_flush_shadow, repurpose kvm_arch_flush_remote_tlb to invoke it. This makes it possible to use the TLB flushing mechanism provided by the generic MMU notifier code. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-17KVM: MIPS: rework flush_shadow_* callbacks into one that prepares the flushPaolo Bonzini
Both trap-and-emulate and VZ have a single implementation that covers both .flush_shadow_all and .flush_shadow_memslot, and both of them end with a call to kvm_flush_remote_tlbs. Unify the callbacks into one and extract the call to kvm_flush_remote_tlbs. The next patches will pull it further out of the the architecture-specific MMU notifier functions kvm_unmap_hva_range and kvm_set_spte_hva. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-10MIPS: Remove KVM_TE supportThomas Bogendoerfer
After removal of the guest part of KVM TE (trap and emulate), also remove the host part. Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2020-11-19MIPS: kvm: Use vm_get_page_prot to get protection bitsThomas Bogendoerfer
MIPS protection bits are setup during runtime so using defines like PAGE_SHARED ignores this runtime changes. Using vm_get_page_prot to get correct page protection fixes this. Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2020-08-21KVM: Pass MMU notifier range flags to kvm_unmap_hva_range()Will Deacon
The 'flags' field of 'struct mmu_notifier_range' is used to indicate whether invalidate_range_{start,end}() are permitted to block. In the case of kvm_mmu_notifier_invalidate_range_start(), this field is not forwarded on to the architecture-specific implementation of kvm_unmap_hva_range() and therefore the backend cannot sensibly decide whether or not to block. Add an extra 'flags' parameter to kvm_unmap_hva_range() so that architectures are aware as to whether or not they are permitted to block. Cc: <stable@vger.kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: James Morse <james.morse@arm.com> Signed-off-by: Will Deacon <will@kernel.org> Message-Id: <20200811102725.7121-2-will@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09KVM: MIPS: Use common KVM implementation of MMU memory cachesSean Christopherson
Move to the common MMU memory cache implementation now that the common code and MIPS's existing code are semantically compatible. No functional change intended. Suggested-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200703023545.8771-22-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09KVM: MIPS: Account pages used for GPA page tablesSean Christopherson
Use GFP_KERNEL_ACCOUNT instead of GFP_KERNEL when allocating pages for the the GPA page tables. The primary motivation for accounting the allocations is to align with the common KVM memory cache helpers in preparation for moving to the common implementation in a future patch. The actual accounting is a bonus side effect. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200703023545.8771-21-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09KVM: MIPS: Drop @max param from mmu_topup_memory_cache()Sean Christopherson
Replace the @max param in mmu_topup_memory_cache() and instead use ARRAY_SIZE() to terminate the loop to fill the cache. This removes a BUG_ON() and sets the stage for moving MIPS to the common memory cache implementation. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200703023545.8771-20-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-09mm: consolidate pte_index() and pte_offset_*() definitionsMike Rapoport
All architectures define pte_index() as (address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1) and all architectures define pte_offset_kernel() as an entry in the array of PTEs indexed by the pte_index(). For the most architectures the pte_offset_kernel() implementation relies on the availability of pmd_page_vaddr() that converts a PMD entry value to the virtual address of the page containing PTEs array. Let's move x86 definitions of the PTE accessors to the generic place in <linux/pgtable.h> and then simply drop the respective definitions from the other architectures. The architectures that didn't provide pmd_page_vaddr() are updated to have that defined. The generic implementation of pte_offset_kernel() can be overridden by an architecture and alpha makes use of this because it has special ordering requirements for its version of pte_offset_kernel(). [rppt@linux.ibm.com: v2] Link: http://lkml.kernel.org/r/20200514170327.31389-11-rppt@kernel.org [rppt@linux.ibm.com: update] Link: http://lkml.kernel.org/r/20200514170327.31389-12-rppt@kernel.org [rppt@linux.ibm.com: update] Link: http://lkml.kernel.org/r/20200514170327.31389-13-rppt@kernel.org [akpm@linux-foundation.org: fix x86 warning] [sfr@canb.auug.org.au: fix powerpc build] Link: http://lkml.kernel.org/r/20200607153443.GB738695@linux.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-10-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-22mips: add support for folded p4d page tablesMike Rapoport
Implement primitives necessary for the 4th level folding, add walks of p4d level where appropriate, replace 5leve-fixup.h with pgtable-nop4d.h and drop usage of __ARCH_USE_5LEVEL_HACK. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Paul Burton <paulburton@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <jhogan@kernel.org> Cc: linux-mips@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org Cc: Mike Rapoport <rppt@kernel.org>
2019-11-22mips: drop __pXd_offset() macros that duplicate pXd_index() onesMike Rapoport
The __pXd_offset() macros are identical to the pXd_index() macros and there is no point to keep both of them. All architectures define and use pXd_index() so let's keep only those to make mips consistent with the rest of the kernel. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Paul Burton <paulburton@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <jhogan@kernel.org> Cc: linux-mips@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org Cc: Mike Rapoport <rppt@kernel.org>
2018-12-21KVM: Make kvm_set_spte_hva() return intLan Tianyu
The patch is to make kvm_set_spte_hva() return int and caller can check return value to determine flush tlb or not. Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-09-07KVM: Remove obsolete kvm_unmap_hva notifier backendMarc Zyngier
kvm_unmap_hva is long gone, and we only have kvm_unmap_hva_range to deal with. Drop the now obsolete code. Fixes: fb1522e099f0 ("KVM: update to new mmu_notifier semantic v2") Cc: James Hogan <jhogan@kernel.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
2017-03-28KVM: MIPS: Implement VZ supportJames Hogan
Add the main support for the MIPS Virtualization ASE (A.K.A. VZ) to MIPS KVM. The bulk of this work is in vz.c, with various new state and definitions elsewhere. Enough is implemented to be able to run on a minimal VZ core. Further patches will fill out support for guest features which are optional or can be disabled. Signed-off-by: James Hogan <james.hogan@imgtec.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Cc: linux-doc@vger.kernel.org
2017-02-03KVM: MIPS/MMU: Implement KVM_CAP_SYNC_MMUJames Hogan
Implement the SYNC_MMU capability for KVM MIPS, allowing changes in the underlying user host virtual address (HVA) mappings to be promptly reflected in the corresponding guest physical address (GPA) mappings. This allows for several features to work with guest RAM which require mappings to be altered or protected, such as copy-on-write, KSM (Kernel Samepage Merging), idle page tracking, memory swapping, and guest memory ballooning. There are two main aspects of this change, described below. The KVM MMU notifier architecture callbacks are implemented so we can be notified of changes in the HVA mappings. These arrange for the guest physical address (GPA) page tables to be modified and possibly for derived mappings (GVA page tables and TLBs) to be flushed. - kvm_unmap_hva[_range]() - These deal with HVA mappings being removed, for example before a copy-on-write takes place, which requires the corresponding GPA page table mappings to be removed too. - kvm_set_spte_hva() - These update a GPA page table entry to match the new HVA entry, but must be careful to respect KVM specific configuration such as not dirtying a clean guest page which is dirty to the host, and write protecting writable pages in read only memslots (which will soon be supported). - kvm[_test]_age_hva() - These update GPA page table entries to be old (invalid) so that access can be tracked, making them young again. The GPA page fault handling (kvm_mips_map_page) is updated to use gfn_to_pfn_prot() (which may provide read-only pages), to handle asynchronous page table invalidation from MMU notifier callbacks, and to handle more cases in the fast path. - mmu_notifier_seq is used to detect asynchronous page table invalidations while we're holding a pfn from gfn_to_pfn_prot() outside of kvm->mmu_lock, retrying if invalidations have taken place, e.g. a COW or a KSM page merge. - The fast path (_kvm_mips_map_page_fast) now handles marking old pages as young / accessed, and disallowing dirtying of clean pages that aren't actually writable (e.g. shared pages that should COW, and read-only memory regions when they are enabled in a future patch). - Due to the use of MMU notifications we no longer need to keep the page references after we've updated the GPA page tables. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/MMU: Pass GPA PTE bits to mapped GVA PTEsJames Hogan
Propagate the GPA PTE protection bits on to the GVA PTEs on a mapped fault (except _PAGE_WRITE, and filtered by the guest TLB entry), rather than always overriding the protection. This allows dirty page tracking to work in mapped guest segments as a clear dirty bit in the GPA PTE will propagate to the GVA PTEs even when the guest TLB has the dirty bit set. Since the filtering of protection bits is now abstracted, if the buddy GVA PTE is also valid, we obtain the corresponding GPA PTE using a simple non-allocating walk and load that into the GVA PTE similarly (which may itself be invalid). Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/MMU: Pass GPA PTE bits to KSeg0 GVA PTEsJames Hogan
Propagate the GPA PTE protection bits on to the GVA PTEs on a KSeg0 fault (except _PAGE_WRITE), rather than always overriding the protection. This allows dirty page tracking to work in KSeg0 as a clear dirty bit in the GPA PTE will propagate to the GVA PTEs. This makes it simpler to use a single kvm_mips_map_page() to obtain both the main GPA PTE and its buddy (which may be invalid), which also allows memory regions to be fully accessible when they don't start and end on a 2*PAGE_SIZE boundary. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/MMU: Handle dirty logging on GPA faultsJames Hogan
Update kvm_mips_map_page() to handle logging of dirty guest physical pages. Upcoming patches will propagate the dirty bit to the GVA page tables. A fast path is added for handling protection bits that can be resolved without calling into KVM, currently just dirtying of clean pages being written to. The slow path marks the GPA page table entry writable only on writes, and at the same time marks the page dirty in the dirty page logging bitmask. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/MMU: Use generic dirty log & protect helperJames Hogan
MIPS hasn't up to this point properly supported dirty page logging, as pages in slots with dirty logging enabled aren't made clean, and tlbmod exceptions from writes to clean pages have been assumed to be due to guest TLB protection and unconditionally passed to the guest. Use the generic dirty logging helper kvm_get_dirty_log_protect() to properly implement kvm_vm_ioctl_get_dirty_log(), similar to how ARM does. This uses xchg to clear the dirty bits when reading them, rather than wiping them out afterwards with a memset, which would potentially wipe recently set bits that weren't caught by kvm_get_dirty_log(). It also makes the pages clean again using the kvm_arch_mmu_enable_log_dirty_pt_masked() architecture callback so that further writes after the shadow memslot is flushed will trigger tlbmod exceptions and dirty handling. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/MMU: Add GPA PT mkclean helperJames Hogan
Add a helper function to make a range of guest physical address (GPA) mappings in the GPA page table clean so that writes can be caught. This will be used in a few places to manage dirty page logging. Note that until the dirty bit is transferred from GPA page table entries to GVA page table entries in an upcoming patch this won't trigger a TLB modified exception on write. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/T&E: Treat unhandled guest KSeg0 as MMIOJames Hogan
Treat unhandled accesses to guest KSeg0 as MMIO, rather than only host KSeg0 addresses. This will allow read only memory regions (such as the Malta boot flash as emulated by QEMU) to have writes (before reads) treated as MMIO, and unallocated physical addresses to have all accesses treated as MMIO. The MMIO emulation uses the gva_to_gpa callback, so this is also updated for trap & emulate to handle guest KSeg0 addresses. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS: Pass type of fault down to kvm_mips_map_page()James Hogan
kvm_mips_map_page() will need to know whether the fault was due to a read or a write in order to support dirty page tracking, KVM_CAP_SYNC_MMU, and read only memory regions, so get that information passed down to it via new bool write_fault arguments to various functions. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/MMU: Use lockless GVA helpers for get_inst()James Hogan
Use the lockless GVA helpers to implement the reading of guest instructions for emulation. This will allow it to handle asynchronous TLB flushes when they are implemented. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/T&E: Add lockless GVA access helpersJames Hogan
Add helpers to allow for lockless direct access to the GVA space, by changing the VCPU mode to READING_SHADOW_PAGE_TABLES for the duration of the access. This allows asynchronous TLB flush requests in future patches to safely trigger either a TLB flush before the direct GVA space access, or a delay until the in-progress lockless direct access is complete. The kvm_trap_emul_gva_lockless_begin() and kvm_trap_emul_gva_lockless_end() helpers take care of guarding the direct GVA accesses, and kvm_trap_emul_gva_fault() tries to handle a uaccess fault resulting from a flush having taken place. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS: Update vcpu->mode and vcpu->cpuJames Hogan
Keep the vcpu->mode and vcpu->cpu variables up to date so that kvm_make_all_cpus_request() has a chance of functioning correctly. This will soon need to be used for kvm_flush_remote_tlbs(). We can easily update vcpu->cpu when the VCPU context is loaded or saved, which will happen when accessing guest context and when the guest is scheduled in and out. We need to be a little careful with vcpu->mode though, as we will in future be checking for outstanding VCPU requests, and this must be done after the value of IN_GUEST_MODE in vcpu->mode is visible to other CPUs. Otherwise the other CPU could fail to trigger an IPI to wait for completion dispite the VCPU request not being seen. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/MMU: Convert guest physical map to page tableJames Hogan
Current guest physical memory is mapped to host physical addresses using a single linear array (guest_pmap of length guest_pmap_npages). This was only really meant to be temporary, and isn't sparse, so its wasteful of memory. A small amount of RAM at GPA 0 and a small boot exception vector at GPA 0x1fc00000 cannot be represented without a full 128KiB guest_pmap allocation (MIPS32 with 16KiB pages), which is one reason why QEMU currently runs its boot code at the top of RAM instead of the usual boot exception vector address. Instead use the existing infrastructure for host virtual page table management to allocate a page table for guest physical memory too. This should be sufficient for now, assuming the size of physical memory doesn't exceed the size of virtual memory. It may need extending in future to handle XPA (eXtended Physical Addressing) in 32-bit guests, as supported by VZ guests on P5600. Some of this code is based loosely on Cavium's VZ KVM implementation. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS: Improve kvm_get_inst() error returnJames Hogan
Currently kvm_get_inst() returns KVM_INVALID_INST in the event of a fault reading the guest instruction. This has the rather arbitrary magic value 0xdeadbeef. This API isn't very robust, and in fact 0xdeadbeef is a valid MIPS64 instruction encoding, namely "ld t1,-16657(s5)". Therefore change the kvm_get_inst() API to return 0 or -EFAULT, and to return the instruction via a u32 *out argument. We can then drop the KVM_INVALID_INST definition entirely. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/MMU: Drop kvm_get_new_mmu_context()James Hogan
MIPS KVM uses its own variation of get_new_mmu_context() which takes an extra vcpu pointer (unused) and does exactly the same thing. Switch to just using get_new_mmu_context() directly and drop KVM's version of it as it doesn't really serve any purpose. The nearby declarations of kvm_mips_alloc_new_mmu_context(), kvm_mips_vcpu_load() and kvm_mips_vcpu_put() are also removed from kvm_host.h, as no definitions or users exist. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/TLB: Drop kvm_local_flush_tlb_all()James Hogan
Now that KVM no longer uses wired entries we can safely use local_flush_tlb_all() when we need to flush the entire TLB (on the start of a new ASID cycle). This doesn't flush wired entries, which allows other code to use them without KVM clobbering them all the time. It also is more up to date, knowing about the tlbinv architectural feature, flushing of micro TLB on cores where that is necessary (Loongson I believe), and knows to stop the HTW while doing so. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS: Use uaccess to read/modify guest instructionsJames Hogan
Now that we have GVA page tables, use standard user accesses with page faults disabled to read & modify guest instructions. This should be more robust (than the rather dodgy method of accessing guest mapped segments by just directly addressing them) and will also work with Enhanced Virtual Addressing (EVA) host kernel configurations where dedicated instructions are needed for accessing user mode memory. For simplicity and speed we do this regardless of the guest segment the address resides in, rather than handling guest KSeg0 specially with kmap_atomic() as before. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/MMU: Convert commpage fault handling to page tablesJames Hogan
Now that we have GVA page tables and an optimised TLB refill handler in place, convert the handling of commpage faults from the guest kernel to fill the GVA page table and invalidate the TLB entry, rather than filling the wired TLB entry directly. For simplicity we no longer use a wired entry for the commpage (refill should be much cheaper with the fast-path handler anyway). Since we don't need to manipulate the TLB directly any longer, move the function from tlb.c to mmu.c. This puts it closer to the similar functions handling KSeg0 and TLB mapped page faults from the guest. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/MMU: Convert TLB mapped faults to page tablesJames Hogan
Now that we have GVA page tables and an optimised TLB refill handler in place, convert the handling of page faults in TLB mapped segment from the guest to fill a single GVA page table entry and invalidate the TLB entry, rather than filling a TLB entry pair directly. Also remove the now unused kvm_mips_get_{kernel,user}_asid() functions in mmu.c and kvm_mips_host_tlb_write() in tlb.c. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/MMU: Convert KSeg0 faults to page tablesJames Hogan
Now that we have GVA page tables and an optimised TLB refill handler in place, convert the handling of KSeg0 page faults from the guest to fill the GVA page tables and invalidate the TLB entry, rather than filling a TLB entry directly. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/MMU: Invalidate stale GVA PTEs on TLBWJames Hogan
Implement invalidation of specific pairs of GVA page table entries in one or both of the GVA page tables. This is used when existing mappings are replaced in the guest TLB by emulated TLBWI/TLBWR instructions. Due to the sharing of page tables in the host kernel range, we should be careful not to allow host pages to be invalidated. Add a helper kvm_mips_walk_pgd() which can be used when walking of either GPA (future patches) or GVA page tables is needed, optionally with allocation of page tables along the way when they don't exist. GPA page table walking will need to be protected by the kvm->mmu_lock, so we also add a small MMU page cache in each KVM VCPU, like that found for other architectures but smaller. This allows enough pages to be pre-allocated to handle a single fault without holding the lock, allowing the helper to run with the lock held without having to handle allocation failures. Using the same mechanism for GVA allows the same code to be used, and allows it to use the same cache of allocated pages if the GPA walk didn't need to allocate any new tables. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/MMU: Invalidate GVA PTs on ASID changesJames Hogan
Implement invalidation of large ranges of virtual addresses from GVA page tables in response to a guest ASID change (immediately for guest kernel page table, lazily for guest user page table). We iterate through a range of page tables invalidating entries and freeing fully invalidated tables. To minimise overhead the exact ranges invalidated depends on the flags argument to kvm_mips_flush_gva_pt(), which also allows it to be used in future KVM_CAP_SYNC_MMU patches in response to GPA changes, which unlike guest TLB mapping changes affects guest KSeg0 mappings. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS: Remove duplicated ASIDs from vcpuJames Hogan
The kvm_vcpu_arch structure contains both mm_structs for allocating MMU contexts (primarily the ASID) but it also copies the resulting ASIDs into guest_{user,kernel}_asid[] arrays which are referenced from uasm generated code. This duplication doesn't seem to serve any purpose, and it gets in the way of generalising the ASID handling across guest kernel/user modes, so lets just extract the ASID straight out of the mm_struct on demand, and in fact there are convenient cpu_context() and cpu_asid() macros for doing so. To reduce the verbosity of this code we do also add kern_mm and user_mm local variables where the kernel and user mm_structs are used. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org