summaryrefslogtreecommitdiff
path: root/arch/mips/kvm
AgeCommit message (Collapse)Author
2017-03-02sched/headers: Prepare to move signal wakeup & sigpending methods from ↵Ingo Molnar
<linux/sched.h> into <linux/sched/signal.h> Fix up affected files that include this signal functionality via sched.h. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-17KVM: race-free exit from KVM_RUN without POSIX signalsPaolo Bonzini
The purpose of the KVM_SET_SIGNAL_MASK API is to let userspace "kick" a VCPU out of KVM_RUN through a POSIX signal. A signal is attached to a dummy signal handler; by blocking the signal outside KVM_RUN and unblocking it inside, this possible race is closed: VCPU thread service thread -------------------------------------------------------------- check flag set flag raise signal (signal handler does nothing) KVM_RUN However, one issue with KVM_SET_SIGNAL_MASK is that it has to take tsk->sighand->siglock on every KVM_RUN. This lock is often on a remote NUMA node, because it is on the node of a thread's creator. Taking this lock can be very expensive if there are many userspace exits (as is the case for SMP Windows VMs without Hyper-V reference time counter). As an alternative, we can put the flag directly in kvm_run so that KVM can see it: VCPU thread service thread -------------------------------------------------------------- raise signal signal handler set run->immediate_exit KVM_RUN check run->immediate_exit Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-03KVM: MIPS: Allow multiple VCPUs to be createdJames Hogan
Increase the maximum number of MIPS KVM VCPUs to 8, and implement the KVM_CAP_NR_VCPUS and KVM_CAP_MAX_CPUS capabilities which expose the recommended and maximum number of VCPUs to userland. The previous maximum of 1 didn't allow for any form of SMP guests. We calculate the values similarly to ARM, recommending as many VCPUs as there are CPUs online in the system. This will allow userland to know how many VCPUs it is possible to create. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/T&E: Expose read-only CP0_IntCtl registerJames Hogan
Expose the CP0_IntCtl register through the KVM register access API, which is a required register since MIPS32r2. It is currently read-only since the VS field isn't implemented due to lack of Config3.VInt or Config3.VEIC. It is implemented in trap_emul.c so that a VZ implementation can allow writes. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/T&E: Expose CP0_EntryLo0/1 registersJames Hogan
Expose the CP0_EntryLo0 and CP0_EntryLo1 registers through the KVM register access API. This is fairly straightforward for trap & emulate since we don't support the RI and XI bits. For the sake of future proofing (particularly for VZ) it is explicitly specified that the API always exposes the 64-bit version of these registers (i.e. with the RI and XI bits in bit positions 63 and 62 respectively), and they are implemented in trap_emul.c rather than mips.c to allow them to be implemented differently for VZ. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/T&E: Default to reset vectorJames Hogan
Set the default VCPU state closer to the architectural reset state, with PC pointing at the reset vector (uncached PA 0x1fc00000, which for KVM T&E is VA 0x5fc00000), and with CP0_Status.BEV and CP0_Status.ERL to 1. Although QEMU at least will overwrite this state, it makes sense to do this now that CP0_EBase is properly implemented to check BEV, and now that we support a sparse GPA layout potentially with a boot ROM at GPA 0x1fc00000. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/T&E: Implement CP0_EBase registerJames Hogan
The CP0_EBase register is a standard feature of MIPS32r2, so we should always have been implementing it properly. However the register value was ignored and wasn't exposed to userland. Fix the emulation of exceptions and interrupts to use the value stored in guest CP0_EBase, and fix the masks so that the top 3 bits (rather than the standard 2) are fixed, so that it is always in the guest KSeg0 segment. Also add CP0_EBASE to the KVM one_reg interface so it can be accessed by userland, also allowing the CPU number field to be written (which isn't permitted by the guest). Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/T&E: Move CP0 register access into T&EJames Hogan
Access to various CP0 registers via the KVM register access API needs to be implementation specific to allow restrictions to be made on changes, for example when VZ guest registers aren't present, so move them all into trap_emul.c in preparation for VZ. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS: Claim KVM_CAP_READONLY_MEM supportJames Hogan
Now that load/store faults due to read only memory regions are treated as MMIO accesses it is safe to claim support for read only memory regions (KVM_CAP_READONLY_MEM). Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/MMU: Implement KVM_CAP_SYNC_MMUJames Hogan
Implement the SYNC_MMU capability for KVM MIPS, allowing changes in the underlying user host virtual address (HVA) mappings to be promptly reflected in the corresponding guest physical address (GPA) mappings. This allows for several features to work with guest RAM which require mappings to be altered or protected, such as copy-on-write, KSM (Kernel Samepage Merging), idle page tracking, memory swapping, and guest memory ballooning. There are two main aspects of this change, described below. The KVM MMU notifier architecture callbacks are implemented so we can be notified of changes in the HVA mappings. These arrange for the guest physical address (GPA) page tables to be modified and possibly for derived mappings (GVA page tables and TLBs) to be flushed. - kvm_unmap_hva[_range]() - These deal with HVA mappings being removed, for example before a copy-on-write takes place, which requires the corresponding GPA page table mappings to be removed too. - kvm_set_spte_hva() - These update a GPA page table entry to match the new HVA entry, but must be careful to respect KVM specific configuration such as not dirtying a clean guest page which is dirty to the host, and write protecting writable pages in read only memslots (which will soon be supported). - kvm[_test]_age_hva() - These update GPA page table entries to be old (invalid) so that access can be tracked, making them young again. The GPA page fault handling (kvm_mips_map_page) is updated to use gfn_to_pfn_prot() (which may provide read-only pages), to handle asynchronous page table invalidation from MMU notifier callbacks, and to handle more cases in the fast path. - mmu_notifier_seq is used to detect asynchronous page table invalidations while we're holding a pfn from gfn_to_pfn_prot() outside of kvm->mmu_lock, retrying if invalidations have taken place, e.g. a COW or a KSM page merge. - The fast path (_kvm_mips_map_page_fast) now handles marking old pages as young / accessed, and disallowing dirtying of clean pages that aren't actually writable (e.g. shared pages that should COW, and read-only memory regions when they are enabled in a future patch). - Due to the use of MMU notifications we no longer need to keep the page references after we've updated the GPA page tables. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/MMU: Pass GPA PTE bits to mapped GVA PTEsJames Hogan
Propagate the GPA PTE protection bits on to the GVA PTEs on a mapped fault (except _PAGE_WRITE, and filtered by the guest TLB entry), rather than always overriding the protection. This allows dirty page tracking to work in mapped guest segments as a clear dirty bit in the GPA PTE will propagate to the GVA PTEs even when the guest TLB has the dirty bit set. Since the filtering of protection bits is now abstracted, if the buddy GVA PTE is also valid, we obtain the corresponding GPA PTE using a simple non-allocating walk and load that into the GVA PTE similarly (which may itself be invalid). Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/MMU: Pass GPA PTE bits to KSeg0 GVA PTEsJames Hogan
Propagate the GPA PTE protection bits on to the GVA PTEs on a KSeg0 fault (except _PAGE_WRITE), rather than always overriding the protection. This allows dirty page tracking to work in KSeg0 as a clear dirty bit in the GPA PTE will propagate to the GVA PTEs. This makes it simpler to use a single kvm_mips_map_page() to obtain both the main GPA PTE and its buddy (which may be invalid), which also allows memory regions to be fully accessible when they don't start and end on a 2*PAGE_SIZE boundary. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/MMU: Handle dirty logging on GPA faultsJames Hogan
Update kvm_mips_map_page() to handle logging of dirty guest physical pages. Upcoming patches will propagate the dirty bit to the GVA page tables. A fast path is added for handling protection bits that can be resolved without calling into KVM, currently just dirtying of clean pages being written to. The slow path marks the GPA page table entry writable only on writes, and at the same time marks the page dirty in the dirty page logging bitmask. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS: Clean & flush on dirty page logging enableJames Hogan
When an existing memory region has dirty page logging enabled, make the entire slot clean (read only) so that writes will immediately start logging dirty pages (once the dirty bit is transferred from GPA to GVA page tables in an upcoming patch). Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/MMU: Use generic dirty log & protect helperJames Hogan
MIPS hasn't up to this point properly supported dirty page logging, as pages in slots with dirty logging enabled aren't made clean, and tlbmod exceptions from writes to clean pages have been assumed to be due to guest TLB protection and unconditionally passed to the guest. Use the generic dirty logging helper kvm_get_dirty_log_protect() to properly implement kvm_vm_ioctl_get_dirty_log(), similar to how ARM does. This uses xchg to clear the dirty bits when reading them, rather than wiping them out afterwards with a memset, which would potentially wipe recently set bits that weren't caught by kvm_get_dirty_log(). It also makes the pages clean again using the kvm_arch_mmu_enable_log_dirty_pt_masked() architecture callback so that further writes after the shadow memslot is flushed will trigger tlbmod exceptions and dirty handling. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/MMU: Add GPA PT mkclean helperJames Hogan
Add a helper function to make a range of guest physical address (GPA) mappings in the GPA page table clean so that writes can be caught. This will be used in a few places to manage dirty page logging. Note that until the dirty bit is transferred from GPA page table entries to GVA page table entries in an upcoming patch this won't trigger a TLB modified exception on write. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/T&E: Handle read only GPA in TLB modJames Hogan
Rewrite TLB modified exception handling to handle read only GPA memory regions, instead of unconditionally passing the exception to the guest. If the guest TLB is not the cause of the exception we call into the normal TLB fault handling depending on the memory segment, which will soon attempt to remap the physical page to be writable (handling dirty page tracking or copy on write in the process). Failing that we fall back to treating it as MMIO, due to a read only memory region. Once the capability is enabled, this will allow read only memory regions (such as the Malta boot flash as emulated by QEMU) to have writes treated as MMIO, while still allowing reads to run untrapped. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/T&E: Treat unhandled guest KSeg0 as MMIOJames Hogan
Treat unhandled accesses to guest KSeg0 as MMIO, rather than only host KSeg0 addresses. This will allow read only memory regions (such as the Malta boot flash as emulated by QEMU) to have writes (before reads) treated as MMIO, and unallocated physical addresses to have all accesses treated as MMIO. The MMIO emulation uses the gva_to_gpa callback, so this is also updated for trap & emulate to handle guest KSeg0 addresses. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/T&E: Abstract bad access handlingJames Hogan
Abstract the handling of bad guest loads and stores which may need to trigger an MMIO, so that the same code can be used in a later patch for guest KSeg0 addresses (TLB exception handling) as well as for host KSeg1 addresses (existing address error exception and TLB exception handling). We now use kvm_mips_emulate_store() and kvm_mips_emulate_load() directly rather than the more generic kvm_mips_emulate_inst(), as there is no need to expose emulation of any other instructions. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS: Pass type of fault down to kvm_mips_map_page()James Hogan
kvm_mips_map_page() will need to know whether the fault was due to a read or a write in order to support dirty page tracking, KVM_CAP_SYNC_MMU, and read only memory regions, so get that information passed down to it via new bool write_fault arguments to various functions. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/T&E: Ignore user writes to CP0_Config7James Hogan
Ignore userland writes to CP0_Config7 rather than reporting an error, since we do allow reads of this register and it is claimed to exist in the ioctl API. This allows userland to blindly save and restore KVM registers without having to special case certain registers as not being writable, for example during live migration once dirty page logging is fixed. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS: Implement kvm_arch_flush_shadow_all/memslotJames Hogan
Implement the kvm_arch_flush_shadow_all() and kvm_arch_flush_shadow_memslot() KVM functions for MIPS to allow guest physical mappings to be safely changed. The general MIPS KVM code takes care of flushing of GPA page table entries. kvm_arch_flush_shadow_all() flushes the whole GPA page table, and is always called on the cleanup path so there is no need to acquire the kvm->mmu_lock. kvm_arch_flush_shadow_memslot() flushes only the range of mappings in the GPA page table corresponding to the slot being flushed, and happens when memory regions are moved or deleted. MIPS KVM implementation callbacks are added for handling the implementation specific flushing of mappings derived from the GPA page tables. These are implemented for trap_emul.c using kvm_flush_remote_tlbs() which should now be functional, and will flush the per-VCPU GVA page tables and ASIDS synchronously (before next entering guest mode or directly accessing GVA space). Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/Emulate: Use lockless GVA helpers for cache emulationJames Hogan
Use the lockless GVA helpers to implement the reading of guest instructions for emulation. This will allow it to handle asynchronous TLB flushes when they are implemented. This is a little more complicated than the other two cases (get_inst() and dynamic translation) due to the need to emulate the appropriate guest TLB exception when the address isn't present or isn't valid in the guest TLB. Since there are several protected cache ops that may need to be performed safely, this is abstracted by kvm_mips_guest_cache_op() which is passed a protected cache op function pointer and takes care of the lockless operation and fault handling / retry if the op should fail, taking advantage of the new errors which the protected cache ops can now return. This allows the existing advance fault handling which relied on host TLB lookups to be removed, along with the now unused kvm_mips_host_tlb_lookup(), Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/MMU: Use lockless GVA helpers for get_inst()James Hogan
Use the lockless GVA helpers to implement the reading of guest instructions for emulation. This will allow it to handle asynchronous TLB flushes when they are implemented. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/T&E: Use lockless GVA helpers for dyntransJames Hogan
Use the lockless GVA helpers to implement the dynamic translation of guest instructions. This will allow it to handle asynchronous TLB flushes when they are implemented. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/T&E: Add lockless GVA access helpersJames Hogan
Add helpers to allow for lockless direct access to the GVA space, by changing the VCPU mode to READING_SHADOW_PAGE_TABLES for the duration of the access. This allows asynchronous TLB flush requests in future patches to safely trigger either a TLB flush before the direct GVA space access, or a delay until the in-progress lockless direct access is complete. The kvm_trap_emul_gva_lockless_begin() and kvm_trap_emul_gva_lockless_end() helpers take care of guarding the direct GVA accesses, and kvm_trap_emul_gva_fault() tries to handle a uaccess fault resulting from a flush having taken place. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/T&E: Reduce stale ASID checksJames Hogan
The stale ASID checks taking place on VCPU load can be reduced: - Now that we check for a stale ASID on guest re-entry, there is no need to do so when loading the VCPU outside of guest context, since it will happen before entering the guest. Note that a lot of KVM VCPU ioctls will cause the VCPU to be loaded but guest context won't be entered. - There is no need to check for a stale kernel_mm ASID when the guest is in user mode and vice versa. In fact doing so can potentially be problematic since the user_mm ASID regeneration may trigger a new ASID cycle, which would cause the kern_mm ASID to become stale after it has been checked for staleness. Therefore only check the ASID for the mm corresponding to the current guest mode, and only if we're already in guest context. We drop some of the related kvm_debug() calls here too. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/T&E: Handle TLB invalidation requestsJames Hogan
Add handling of TLB invalidation requests before entering guest mode. This will allow asynchonous invalidation of the VCPU mappings when physical memory regions are altered. Should the CPU running the VCPU already be in guest mode an IPI will be sent to trigger a guest exit. The reload_asid path will be used in a future patch for when GVA is about to be directly accessed by KVM. In the process, the stale user ASID check in the re-entry path (for lazy user GVA flushing) is generalised to check the ASID for the current guest mode, in case a TLB invalidation request was handled. This has the side effect of making the ASID checks on vcpu_load too conservative, which will be addressed in a later patch. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS: Update vcpu->mode and vcpu->cpuJames Hogan
Keep the vcpu->mode and vcpu->cpu variables up to date so that kvm_make_all_cpus_request() has a chance of functioning correctly. This will soon need to be used for kvm_flush_remote_tlbs(). We can easily update vcpu->cpu when the VCPU context is loaded or saved, which will happen when accessing guest context and when the guest is scheduled in and out. We need to be a little careful with vcpu->mode though, as we will in future be checking for outstanding VCPU requests, and this must be done after the value of IN_GUEST_MODE in vcpu->mode is visible to other CPUs. Otherwise the other CPU could fail to trigger an IPI to wait for completion dispite the VCPU request not being seen. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/MMU: Convert guest physical map to page tableJames Hogan
Current guest physical memory is mapped to host physical addresses using a single linear array (guest_pmap of length guest_pmap_npages). This was only really meant to be temporary, and isn't sparse, so its wasteful of memory. A small amount of RAM at GPA 0 and a small boot exception vector at GPA 0x1fc00000 cannot be represented without a full 128KiB guest_pmap allocation (MIPS32 with 16KiB pages), which is one reason why QEMU currently runs its boot code at the top of RAM instead of the usual boot exception vector address. Instead use the existing infrastructure for host virtual page table management to allocate a page table for guest physical memory too. This should be sufficient for now, assuming the size of physical memory doesn't exceed the size of virtual memory. It may need extending in future to handle XPA (eXtended Physical Addressing) in 32-bit guests, as supported by VZ guests on P5600. Some of this code is based loosely on Cavium's VZ KVM implementation. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS: Use CP0_BadInstr[P] for emulationJames Hogan
When exiting from the guest, store the values of the CP0_BadInstr and CP0_BadInstrP registers if they exist, which contain the encodings of the instructions which caused the last synchronous exception. When the instruction is needed for emulation, kvm_get_badinstr() and kvm_get_badinstrp() are used instead of calling kvm_get_inst() directly, to decide whether to read the saved CP0_BadInstr/CP0_BadInstrP registers (if they exist), or read the instruction from memory (if not). The use of these registers should be more robust than using kvm_get_inst(), as it actually gives the instruction encoding seen by the hardware rather than relying on user accessors after the fact, which can be fooled by incoherent icache or a racing code modification. It will also work with VZ, where the guest virtual memory isn't directly accessible by the host with user accessors. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS: Improve kvm_get_inst() error returnJames Hogan
Currently kvm_get_inst() returns KVM_INVALID_INST in the event of a fault reading the guest instruction. This has the rather arbitrary magic value 0xdeadbeef. This API isn't very robust, and in fact 0xdeadbeef is a valid MIPS64 instruction encoding, namely "ld t1,-16657(s5)". Therefore change the kvm_get_inst() API to return 0 or -EFAULT, and to return the instruction via a u32 *out argument. We can then drop the KVM_INVALID_INST definition entirely. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/T&E: Don't treat code fetch faults as MMIOJames Hogan
In order to make use of the CP0_BadInstr & CP0_BadInstrP registers we need to be a bit more careful not to treat code fetch faults as MMIO, lest we hit an UNPREDICTABLE register value when we try to emulate the MMIO load instruction but there was no valid instruction word available to the hardware. Add a kvm_is_ifetch_fault() helper to try to figure out whether a load fault was due to a code fetch, and prevent MMIO instruction emulation in that case. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/MMU: Drop kvm_get_new_mmu_context()James Hogan
MIPS KVM uses its own variation of get_new_mmu_context() which takes an extra vcpu pointer (unused) and does exactly the same thing. Switch to just using get_new_mmu_context() directly and drop KVM's version of it as it doesn't really serve any purpose. The nearby declarations of kvm_mips_alloc_new_mmu_context(), kvm_mips_vcpu_load() and kvm_mips_vcpu_put() are also removed from kvm_host.h, as no definitions or users exist. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/Emulate: Drop redundant TLB flushes on exceptionsJames Hogan
When exceptions are injected into the MIPS KVM guest, the whole host TLB is flushed (except any entries in the guest KSeg0 range). This is certainly not mandated by the architecture when exceptions are taken (userland can't directly change TLB mappings anyway), and is a pretty heavyweight operation: - There may be hundreds of TLB entries especially when a 512 entry FTLB is present. These are walked and read and conditionally invalidated, so the TLBINV feature can't be used either. - It'll indiscriminately wipe out entries belonging to other memory spaces. A simple ASID regeneration would be much faster to perform, although it'd wipe out the guest KSeg0 mappings too. My suspicion is that this was simply to plaster over the fact that kvm_mips_host_tlb_inv() incorrectly only invalidated TLB entries in the ASID for guest usermode, and not the ASID for guest kernelmode. Now that the recent commit "KVM: MIPS/TLB: Flush host TLB entry in kernel ASID" fixes kvm_mips_host_tlb_inv() to flush TLB entries in the kernelmode ASID when the guest TLB changes, lets drop these calls and the otherwise unused kvm_mips_flush_host_tlb(). Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/TLB: Drop kvm_local_flush_tlb_all()James Hogan
Now that KVM no longer uses wired entries we can safely use local_flush_tlb_all() when we need to flush the entire TLB (on the start of a new ASID cycle). This doesn't flush wired entries, which allows other code to use them without KVM clobbering them all the time. It also is more up to date, knowing about the tlbinv architectural feature, flushing of micro TLB on cores where that is necessary (Loongson I believe), and knows to stop the HTW while doing so. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/Emulate: Fix CACHE emulation for EVA hostsJames Hogan
Use protected_writeback_dcache_line() instead of flush_dcache_line(), and protected_flush_icache_line() instead of flush_icache_line(), so that CACHEE (the EVA variant) is used on EVA host kernels. Without this, guest floating point branch delay slot emulation via a trampoline on the user stack fails on EVA host kernels due to failure of the icache sync, resulting in the break instruction getting skipped and execution from the stack. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS: Use uaccess to read/modify guest instructionsJames Hogan
Now that we have GVA page tables, use standard user accesses with page faults disabled to read & modify guest instructions. This should be more robust (than the rather dodgy method of accessing guest mapped segments by just directly addressing them) and will also work with Enhanced Virtual Addressing (EVA) host kernel configurations where dedicated instructions are needed for accessing user mode memory. For simplicity and speed we do this regardless of the guest segment the address resides in, rather than handling guest KSeg0 specially with kmap_atomic() as before. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS: Drop vm_init() callbackJames Hogan
Now that the commpage doesn't use wired TLB entries, the per-CPU vm_init() callback is the only work done by kvm_mips_init_vm_percpu(). The trap & emulate implementation doesn't actually need to do anything from vm_init(), and the future VZ implementation would be better served by a kvm_arch_hardware_enable callback anyway. Therefore drop the vm_init() callback entirely, allowing the kvm_mips_init_vm_percpu() function to also be dropped, along with the kvm_mips_instance atomic counter. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/MMU: Convert commpage fault handling to page tablesJames Hogan
Now that we have GVA page tables and an optimised TLB refill handler in place, convert the handling of commpage faults from the guest kernel to fill the GVA page table and invalidate the TLB entry, rather than filling the wired TLB entry directly. For simplicity we no longer use a wired entry for the commpage (refill should be much cheaper with the fast-path handler anyway). Since we don't need to manipulate the TLB directly any longer, move the function from tlb.c to mmu.c. This puts it closer to the similar functions handling KSeg0 and TLB mapped page faults from the guest. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/MMU: Convert TLB mapped faults to page tablesJames Hogan
Now that we have GVA page tables and an optimised TLB refill handler in place, convert the handling of page faults in TLB mapped segment from the guest to fill a single GVA page table entry and invalidate the TLB entry, rather than filling a TLB entry pair directly. Also remove the now unused kvm_mips_get_{kernel,user}_asid() functions in mmu.c and kvm_mips_host_tlb_write() in tlb.c. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/MMU: Convert KSeg0 faults to page tablesJames Hogan
Now that we have GVA page tables and an optimised TLB refill handler in place, convert the handling of KSeg0 page faults from the guest to fill the GVA page tables and invalidate the TLB entry, rather than filling a TLB entry directly. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/MMU: Invalidate stale GVA PTEs on TLBWJames Hogan
Implement invalidation of specific pairs of GVA page table entries in one or both of the GVA page tables. This is used when existing mappings are replaced in the guest TLB by emulated TLBWI/TLBWR instructions. Due to the sharing of page tables in the host kernel range, we should be careful not to allow host pages to be invalidated. Add a helper kvm_mips_walk_pgd() which can be used when walking of either GPA (future patches) or GVA page tables is needed, optionally with allocation of page tables along the way when they don't exist. GPA page table walking will need to be protected by the kvm->mmu_lock, so we also add a small MMU page cache in each KVM VCPU, like that found for other architectures but smaller. This allows enough pages to be pre-allocated to handle a single fault without holding the lock, allowing the helper to run with the lock held without having to handle allocation failures. Using the same mechanism for GVA allows the same code to be used, and allows it to use the same cache of allocated pages if the GPA walk didn't need to allocate any new tables. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/MMU: Invalidate GVA PTs on ASID changesJames Hogan
Implement invalidation of large ranges of virtual addresses from GVA page tables in response to a guest ASID change (immediately for guest kernel page table, lazily for guest user page table). We iterate through a range of page tables invalidating entries and freeing fully invalidated tables. To minimise overhead the exact ranges invalidated depends on the flags argument to kvm_mips_flush_gva_pt(), which also allows it to be used in future KVM_CAP_SYNC_MMU patches in response to GPA changes, which unlike guest TLB mapping changes affects guest KSeg0 mappings. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/TLB: Generalise host TLB invalidate to kernel ASIDJames Hogan
Refactor kvm_mips_host_tlb_inv() to also be able to invalidate any matching TLB entry in the kernel ASID rather than assuming only the TLB entries in the user ASID can change. Two new bool user/kernel arguments allow the caller to indicate whether the mapping should affect each of the ASIDs for guest user/kernel mode. - kvm_mips_invalidate_guest_tlb() (used by TLBWI/TLBWR emulation) can now invalidate any corresponding TLB entry in both the kernel ASID (guest kernel may have accessed any guest mapping), and the user ASID if the entry being replaced is in guest USeg (where guest user may also have accessed it). - The tlbmod fault handler (and the KSeg0 / TLB mapped / commpage fault handlers in later patches) can now invalidate the corresponding TLB entry in whichever ASID is currently active, since only a single page table will have been updated anyway. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/TLB: Fix off-by-one in TLB invalidateJames Hogan
kvm_mips_host_tlb_inv() uses the TLBP instruction to probe the host TLB for an entry matching the given guest virtual address, and determines whether a match was found based on whether CP0_Index > 0. This is technically incorrect as an index of 0 (with the high bit clear) is a perfectly valid TLB index. This is harmless at the moment due to the use of at least 1 wired TLB entry for the KVM commpage, however we will soon be ridding ourselves of that particular wired entry so lets fix the condition in case the entry needing invalidation does land at TLB index 0. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS: Add fast path TLB refill handlerJames Hogan
Use functions from the general MIPS TLB exception vector generation code (tlbex.c) to construct a fast path TLB refill handler similar to the general one, but cut down and capable of preserving K0 and K1. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS: Support NetLogic KScratch registersJames Hogan
tlbex.c uses the implementation dependent $22 CP0 register group on NetLogic cores, with the help of the c0_kscratch() helper. Allow these registers to be allocated by the KVM entry code too instead of assuming KScratch registers are all $31, which will also allow pgd_reg to be handled since it is allocated that way. We also drop the masking of kscratch_mask with 0xfc, as it is redundant for the standard KScratch registers (Config4.KScrExist won't have the low 2 bits set anyway), and apparently not necessary for NetLogic. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/T&E: Activate GVA page tables in guest contextJames Hogan
Activate the GVA page tables when in guest context. This will allow the normal Linux TLB refill handler to fill from it when guest memory is read, as well as preventing accidental reading from user memory. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org
2017-02-03KVM: MIPS/T&E: Allocate GVA -> HPA page tablesJames Hogan
Allocate GVA -> HPA page tables for guest kernel and guest user mode on each VCPU, to allow for fast path TLB refill handling to be added later. In the process kvm_arch_vcpu_init() needs updating to pass on any error from the vcpu_init() callback. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org