summaryrefslogtreecommitdiff
path: root/arch/s390/kvm/kvm-s390.c
AgeCommit message (Collapse)Author
2020-08-12mm/gup: remove task_struct pointer for all gup codePeter Xu
After the cleanup of page fault accounting, gup does not need to pass task_struct around any more. Remove that parameter in the whole gup stack. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Link: http://lkml.kernel.org/r/20200707225021.200906-26-peterx@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-03Merge tag 'kvm-s390-next-5.9-1' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-next-5.6 KVM: s390: Enhancement for 5.9 - implement diagnose 318
2020-07-10KVM: s390: clean up redundant 'kvm_run' parametersTianjia Zhang
In the current kvm version, 'kvm_run' has been included in the 'kvm_vcpu' structure. For historical reasons, many kvm-related function parameters retain the 'kvm_run' and 'kvm_vcpu' parameters at the same time. This patch does a unified cleanup of these remaining redundant parameters. Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200623131418.31473-2-tianjia.zhang@linux.alibaba.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-08KVM: async_pf: change kvm_setup_async_pf()/kvm_arch_setup_async_pf() return ↵Vitaly Kuznetsov
type to bool Unlike normal 'int' functions returning '0' on success, kvm_setup_async_pf()/ kvm_arch_setup_async_pf() return '1' when a job to handle page fault asynchronously was scheduled and '0' otherwise. To avoid the confusion change return type to 'bool'. No functional change intended. Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200615121334.91300-1-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-23s390/kvm: diagnose 0x318 sync and resetCollin Walling
DIAGNOSE 0x318 (diag318) sets information regarding the environment the VM is running in (Linux, z/VM, etc) and is observed via firmware/service events. This is a privileged s390x instruction that must be intercepted by SIE. Userspace handles the instruction as well as migration. Data is communicated via VCPU register synchronization. The Control Program Name Code (CPNC) is stored in the SIE block. The CPNC along with the Control Program Version Code (CPVC) are stored in the kvm_vcpu_arch struct. This data is reset on load normal and clear resets. Signed-off-by: Collin Walling <walling@linux.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200622154636.5499-3-walling@linux.ibm.com [borntraeger@de.ibm.com: fix sync_reg position] Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-06-12Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull more KVM updates from Paolo Bonzini: "The guest side of the asynchronous page fault work has been delayed to 5.9 in order to sync with Thomas's interrupt entry rework, but here's the rest of the KVM updates for this merge window. MIPS: - Loongson port PPC: - Fixes ARM: - Fixes x86: - KVM_SET_USER_MEMORY_REGION optimizations - Fixes - Selftest fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (62 commits) KVM: x86: do not pass poisoned hva to __kvm_set_memory_region KVM: selftests: fix sync_with_host() in smm_test KVM: async_pf: Inject 'page ready' event only if 'page not present' was previously injected KVM: async_pf: Cleanup kvm_setup_async_pf() kvm: i8254: remove redundant assignment to pointer s KVM: x86: respect singlestep when emulating instruction KVM: selftests: Don't probe KVM_CAP_HYPERV_ENLIGHTENED_VMCS when nested VMX is unsupported KVM: selftests: do not substitute SVM/VMX check with KVM_CAP_NESTED_STATE check KVM: nVMX: Consult only the "basic" exit reason when routing nested exit KVM: arm64: Move hyp_symbol_addr() to kvm_asm.h KVM: arm64: Synchronize sysreg state on injecting an AArch32 exception KVM: arm64: Make vcpu_cp1x() work on Big Endian hosts KVM: arm64: Remove host_cpu_context member from vcpu structure KVM: arm64: Stop sparse from moaning at __hyp_this_cpu_ptr KVM: arm64: Handle PtrAuth traps early KVM: x86: Unexport x86_fpu_cache and make it static KVM: selftests: Ignore KVM 5-level paging support for VM_MODE_PXXV48_4K KVM: arm64: Save the host's PtrAuth keys in non-preemptible context KVM: arm64: Stop save/restoring ACTLR_EL1 KVM: arm64: Add emulation for 32bit guests accessing ACTLR2 ...
2020-06-11KVM: async_pf: Inject 'page ready' event only if 'page not present' was ↵Vitaly Kuznetsov
previously injected 'Page not present' event may or may not get injected depending on guest's state. If the event wasn't injected, there is no need to inject the corresponding 'page ready' event as the guest may get confused. E.g. Linux thinks that the corresponding 'page not present' event wasn't delivered *yet* and allocates a 'dummy entry' for it. This entry is never freed. Note, 'wakeup all' events have no corresponding 'page not present' event and always get injected. s390 seems to always be able to inject 'page not present', the change is effectively a nop. Suggested-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200610175532.779793-2-vkuznets@redhat.com> Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=208081 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-09mmap locking API: use coccinelle to convert mmap_sem rwsem call sitesMichel Lespinasse
This change converts the existing mmap_sem rwsem calls to use the new mmap locking API instead. The change is generated using coccinelle with the following rule: // spatch --sp-file mmap_lock_api.cocci --in-place --include-headers --dir . @@ expression mm; @@ ( -init_rwsem +mmap_init_lock | -down_write +mmap_write_lock | -down_write_killable +mmap_write_lock_killable | -down_write_trylock +mmap_write_trylock | -up_write +mmap_write_unlock | -downgrade_write +mmap_write_downgrade | -down_read +mmap_read_lock | -down_read_killable +mmap_read_lock_killable | -down_read_trylock +mmap_read_trylock | -up_read +mmap_read_unlock ) -(&mm->mmap_sem) +(mm) Signed-off-by: Michel Lespinasse <walken@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: David Rientjes <rientjes@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ying Han <yinghan@google.com> Link: http://lkml.kernel.org/r/20200520052908.204642-5-walken@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09mm: reorder includes after introduction of linux/pgtable.hMike Rapoport
The replacement of <asm/pgrable.h> with <linux/pgtable.h> made the include of the latter in the middle of asm includes. Fix this up with the aid of the below script and manual adjustments here and there. import sys import re if len(sys.argv) is not 3: print "USAGE: %s <file> <header>" % (sys.argv[0]) sys.exit(1) hdr_to_move="#include <linux/%s>" % sys.argv[2] moved = False in_hdrs = False with open(sys.argv[1], "r") as f: lines = f.readlines() for _line in lines: line = _line.rstrip(' ') if line == hdr_to_move: continue if line.startswith("#include <linux/"): in_hdrs = True elif not moved and in_hdrs: moved = True print hdr_to_move print line Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-4-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09mm: introduce include/linux/pgtable.hMike Rapoport
The include/linux/pgtable.h is going to be the home of generic page table manipulation functions. Start with moving asm-generic/pgtable.h to include/linux/pgtable.h and make the latter include asm/pgtable.h. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-3-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08Merge tag 's390-5.8-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Vasily Gorbik: - Add support for multi-function devices in pci code. - Enable PF-VF linking for architectures using the pdev->no_vf_scan flag (currently just s390). - Add reipl from NVMe support. - Get rid of critical section cleanup in entry.S. - Refactor PNSO CHSC (perform network subchannel operation) in cio and qeth. - QDIO interrupts and error handling fixes and improvements, more refactoring changes. - Align ioremap() with generic code. - Accept requests without the prefetch bit set in vfio-ccw. - Enable path handling via two new regions in vfio-ccw. - Other small fixes and improvements all over the code. * tag 's390-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (52 commits) vfio-ccw: make vfio_ccw_regops variables declarations static vfio-ccw: Add trace for CRW event vfio-ccw: Wire up the CRW irq and CRW region vfio-ccw: Introduce a new CRW region vfio-ccw: Refactor IRQ handlers vfio-ccw: Introduce a new schib region vfio-ccw: Refactor the unregister of the async regions vfio-ccw: Register a chp_event callback for vfio-ccw vfio-ccw: Introduce new helper functions to free/destroy regions vfio-ccw: document possible errors vfio-ccw: Enable transparent CCW IPL from DASD s390/pci: Log new handle in clp_disable_fh() s390/cio, s390/qeth: cleanup PNSO CHSC s390/qdio: remove q->first_to_kick s390/qdio: fix up qdio_start_irq() kerneldoc s390: remove critical section cleanup from entry.S s390: add machine check SIGP s390/pci: ioremap() align with generic code s390/ap: introduce new ap function ap_get_qdev() Documentation/s390: Update / remove developerWorks web links ...
2020-06-01KVM: rename kvm_arch_can_inject_async_page_present() to ↵Vitaly Kuznetsov
kvm_arch_can_dequeue_async_page_present() An innocent reader of the following x86 KVM code: bool kvm_arch_can_inject_async_page_present(struct kvm_vcpu *vcpu) { if (!(vcpu->arch.apf.msr_val & KVM_ASYNC_PF_ENABLED)) return true; ... may get very confused: if APF mechanism is not enabled, why do we report that we 'can inject async page present'? In reality, upon injection kvm_arch_async_page_present() will check the same condition again and, in case APF is disabled, will just drop the item. This is fine as the guest which deliberately disabled APF doesn't expect to get any APF notifications. Rename kvm_arch_can_inject_async_page_present() to kvm_arch_can_dequeue_async_page_present() to make it clear what we are checking: if the item can be dequeued (meaning either injected or just dropped). On s390 kvm_arch_can_inject_async_page_present() always returns 'true' so the rename doesn't matter much. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200525144125.143875-4-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-28s390: remove critical section cleanup from entry.SSven Schnelle
The current code is rather complex and caused a lot of subtle and hard to debug bugs in the past. Simplify the code by calling the system_call handler with interrupts disabled, save machine state, and re-enable them later. This requires significant changes to the machine check handling code as well. When the machine check interrupt arrived while being in kernel mode the new code will signal pending machine checks with a SIGP external call. When userspace was interrupted, the handler will switch to the kernel stack and directly execute s390_handle_mcck(). Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-05-15kvm: add halt-polling cpu usage statsDavid Matlack
Two new stats for exposing halt-polling cpu usage: halt_poll_success_ns halt_poll_fail_ns Thus sum of these 2 stats is the total cpu time spent polling. "success" means the VCPU polled until a virtual interrupt was delivered. "fail" means the VCPU had to schedule out (either because the maximum poll time was reached or it needed to yield the CPU). To avoid touching every arch's kvm_vcpu_stat struct, only update and export halt-polling cpu usage stats if we're on x86. Exporting cpu usage as a u64 and in nanoseconds means we will overflow at ~500 years, which seems reasonably large. Signed-off-by: David Matlack <dmatlack@google.com> Signed-off-by: Jon Cargille <jcargill@google.com> Reviewed-by: Jim Mattson <jmattson@google.com> Message-Id: <20200508182240.68440-1-jcargill@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13Merge branch 'kvm-amd-fixes' into HEADPaolo Bonzini
2020-05-07KVM: X86: Declare KVM_CAP_SET_GUEST_DEBUG properlyPeter Xu
KVM_CAP_SET_GUEST_DEBUG should be supported for x86 however it's not declared as supported. My wild guess is that userspaces like QEMU are using "#ifdef KVM_CAP_SET_GUEST_DEBUG" to check for the capability instead, but that could be wrong because the compilation host may not be the runtime host. The userspace might still want to keep the old "#ifdef" though to not break the guest debug on old kernels. Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20200505154750.126300-1-peterx@redhat.com> [Do the same for PPC and s390. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-04-21KVM: Remove redundant argument to kvm_arch_vcpu_ioctl_runTianjia Zhang
In earlier versions of kvm, 'kvm_run' was an independent structure and was not included in the vcpu structure. At present, 'kvm_run' is already included in the vcpu structure, so the parameter 'kvm_run' is redundant. This patch simplifies the function definition, removes the extra 'kvm_run' parameter, and extracts it from the 'kvm_vcpu' structure if necessary. Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Message-Id: <20200416051057.26526-1-tianjia.zhang@linux.alibaba.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-04-21kvm_host: unify VM_STAT and VCPU_STAT definitions in a single placeEmanuele Giuseppe Esposito
The macros VM_STAT and VCPU_STAT are redundantly implemented in multiple files, each used by a different architecure to initialize the debugfs entries for statistics. Since they all have the same purpose, they can be unified in a single common definition in include/linux/kvm_host.h Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20200414155625.20559-1-eesposit@redhat.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-04-14KVM: s390: Return last valid slot if approx index is out-of-boundsSean Christopherson
Return the index of the last valid slot from gfn_to_memslot_approx() if its binary search loop yielded an out-of-bounds index. The index can be out-of-bounds if the specified gfn is less than the base of the lowest memslot (which is also the last valid memslot). Note, the sole caller, kvm_s390_get_cmma(), ensures used_slots is non-zero. Fixes: afdad61615cc3 ("KVM: s390: Fix storage attributes migration with memory slots") Cc: stable@vger.kernel.org # 4.19.x: 0774a964ef56: KVM: Fix out of range accesses to memslots Cc: stable@vger.kernel.org # 4.19.x Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200408064059.8957-3-sean.j.christopherson@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-04-02Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm updates from Paolo Bonzini: "ARM: - GICv4.1 support - 32bit host removal PPC: - secure (encrypted) using under the Protected Execution Framework ultravisor s390: - allow disabling GISA (hardware interrupt injection) and protected VMs/ultravisor support. x86: - New dirty bitmap flag that sets all bits in the bitmap when dirty page logging is enabled; this is faster because it doesn't require bulk modification of the page tables. - Initial work on making nested SVM event injection more similar to VMX, and less buggy. - Various cleanups to MMU code (though the big ones and related optimizations were delayed to 5.8). Instead of using cr3 in function names which occasionally means eptp, KVM too has standardized on "pgd". - A large refactoring of CPUID features, which now use an array that parallels the core x86_features. - Some removal of pointer chasing from kvm_x86_ops, which will also be switched to static calls as soon as they are available. - New Tigerlake CPUID features. - More bugfixes, optimizations and cleanups. Generic: - selftests: cleanups, new MMU notifier stress test, steal-time test - CSV output for kvm_stat" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (277 commits) x86/kvm: fix a missing-prototypes "vmread_error" KVM: x86: Fix BUILD_BUG() in __cpuid_entry_get_reg() w/ CONFIG_UBSAN=y KVM: VMX: Add a trampoline to fix VMREAD error handling KVM: SVM: Annotate svm_x86_ops as __initdata KVM: VMX: Annotate vmx_x86_ops as __initdata KVM: x86: Drop __exit from kvm_x86_ops' hardware_unsetup() KVM: x86: Copy kvm_x86_ops by value to eliminate layer of indirection KVM: x86: Set kvm_x86_ops only after ->hardware_setup() completes KVM: VMX: Configure runtime hooks using vmx_x86_ops KVM: VMX: Move hardware_setup() definition below vmx_x86_ops KVM: x86: Move init-only kvm_x86_ops to separate struct KVM: Pass kvm_init()'s opaque param to additional arch funcs s390/gmap: return proper error code on ksm unsharing KVM: selftests: Fix cosmetic copy-paste error in vm_mem_region_move() KVM: Fix out of range accesses to memslots KVM: X86: Micro-optimize IPI fastpath delay KVM: X86: Delay read msr data iff writes ICR MSR KVM: PPC: Book3S HV: Add a capability for enabling secure guests KVM: arm64: GICv4.1: Expose HW-based SGIs in debugfs KVM: arm64: GICv4.1: Allow non-trapping WFI when using HW SGIs ...
2020-03-31KVM: Pass kvm_init()'s opaque param to additional arch funcsSean Christopherson
Pass @opaque to kvm_arch_hardware_setup() and kvm_arch_check_processor_compat() to allow architecture specific code to reference @opaque without having to stash it away in a temporary global variable. This will enable x86 to separate its vendor specific callback ops, which are passed via @opaque, into "init" and "runtime" ops without having to stash away the "init" ops. No functional change intended. Reviewed-by: Cornelia Huck <cohuck@redhat.com> Tested-by: Cornelia Huck <cohuck@redhat.com> #s390 Acked-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200321202603.19355-2-sean.j.christopherson@intel.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-26Merge tag 'kvm-s390-next-5.7-2' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: cleanups for 5.7 - mark sie control block as 512 byte aligned - use fallthrough;
2020-03-26KVM: Fix out of range accesses to memslotsSean Christopherson
Reset the LRU slot if it becomes invalid when deleting a memslot to fix an out-of-bounds/use-after-free access when searching through memslots. Explicitly check for there being no used slots in search_memslots(), and in the caller of s390's approximation variant. Fixes: 36947254e5f9 ("KVM: Dynamically size memslot array based on number of used slots") Reported-by: Qian Cai <cai@lca.pw> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200320205546.2396-2-sean.j.christopherson@intel.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-23KVM: s390: Use fallthrough;Joe Perches
Convert the various uses of fallthrough comments to fallthrough; Done via script Link: https://lore.kernel.org/lkml/b56602fcf79f849e733e7b521bb0e17895d390fa.1582230379.git.joe@perches.com Signed-off-by: Joe Perches <joe@perches.com> Link: https://lore.kernel.org/r/d63c86429f3e5aa806aa3e185c97d213904924a5.1583896348.git.joe@perches.com [borntrager@de.ibm.com: Fix link to tool and subject] Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-03-16Merge tag 'kvm-s390-next-5.7-1' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: Features and Enhancements for 5.7 part1 1. Allow to disable gisa 2. protected virtual machines Protected VMs (PVM) are KVM VMs, where KVM can't access the VM's state like guest memory and guest registers anymore. Instead the PVMs are mostly managed by a new entity called Ultravisor (UV), which provides an API, so KVM and the PV can request management actions. PVMs are encrypted at rest and protected from hypervisor access while running. They switch from a normal operation into protected mode, so we can still use the standard boot process to load a encrypted blob and then move it into protected mode. Rebooting is only possible by passing through the unprotected/normal mode and switching to protected again. One mm related patch will go via Andrews mm tree ( mm/gup/writeback: add callbacks for inaccessible pages)
2020-03-16KVM: Ensure validity of memslot with respect to kvm_get_dirty_log()Sean Christopherson
Rework kvm_get_dirty_log() so that it "returns" the associated memslot on success. A future patch will rework memslot handling such that id_to_memslot() can return NULL, returning the memslot makes it more obvious that the validity of the memslot has been verified, i.e. precludes the need to add validity checks in the arch code that are technically unnecessary. To maintain ordering in s390, move the call to kvm_arch_sync_dirty_log() from s390's kvm_vm_ioctl_get_dirty_log() to the new kvm_get_dirty_log(). This is a nop for PPC, the only other arch that doesn't select KVM_GENERIC_DIRTYLOG_READ_PROTECT, as its sync_dirty_log() is empty. Ideally, moving the sync_dirty_log() call would be done in a separate patch, but it can't be done in a follow-on patch because that would temporarily break s390's ordering. Making the move in a preparatory patch would be functionally correct, but would create an odd scenario where the moved sync_dirty_log() would operate on a "different" memslot due to consuming the result of a different id_to_memslot(). The memslot couldn't actually be different as slots_lock is held, but the code is confusing enough as it is, i.e. moving sync_dirty_log() in this patch is the lesser of all evils. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: Provide common implementation for generic dirty log functionsSean Christopherson
Move the implementations of KVM_GET_DIRTY_LOG and KVM_CLEAR_DIRTY_LOG for CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT into common KVM code. The arch specific implemenations are extremely similar, differing only in whether the dirty log needs to be sync'd from hardware (x86) and how the TLBs are flushed. Add new arch hooks to handle sync and TLB flush; the sync will also be used for non-generic dirty log support in a future patch (s390). The ulterior motive for providing a common implementation is to eliminate the dependency between arch and common code with respect to the memslot referenced by the dirty log, i.e. to make it obvious in the code that the validity of the memslot is guaranteed, as a future patch will rework memslot handling such that id_to_memslot() can return NULL. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: Drop "const" attribute from old memslot in commit_memory_region()Sean Christopherson
Drop the "const" attribute from @old in kvm_arch_commit_memory_region() to allow arch specific code to free arch specific resources in the old memslot without having to cast away the attribute. Freeing resources in kvm_arch_commit_memory_region() paves the way for simplifying kvm_free_memslot() by eliminating the last usage of its @dont param. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16KVM: Drop kvm_arch_create_memslot()Sean Christopherson
Remove kvm_arch_create_memslot() now that all arch implementations are effectively nops. Removing kvm_arch_create_memslot() eliminates the possibility for arch specific code to allocate memory prior to setting a memslot, which sets the stage for simplifying kvm_free_memslot(). Cc: Janosch Frank <frankja@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-11KVM: s390: Also reset registers in sync regs for initial cpu resetChristian Borntraeger
When we do the initial CPU reset we must not only clear the registers in the internal data structures but also in kvm_run sync_regs. For modern userspace sync_regs is the only place that it looks at. Fixes: 7de3f1423ff9 ("KVM: s390: Add new reset vcpu API") Acked-by: David Hildenbrand <david@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-02-27KVM: s390: introduce module parameter kvm.use_gisaMichael Mueller
The boolean module parameter "kvm.use_gisa" controls if newly created guests will use the GISA facility if provided by the host system. The default is yes. # cat /sys/module/kvm/parameters/use_gisa Y The parameter can be changed on the fly. # echo N > /sys/module/kvm/parameters/use_gisa Already running guests are not affected by this change. The kvm s390 debug feature shows if a guest is running with GISA. # grep gisa /sys/kernel/debug/s390dbf/kvm-$pid/sprintf 00 01582725059:843303 3 - 08 00000000e119bc01 gisa 0x00000000c9ac2642 initialized 00 01582725059:903840 3 - 11 000000004391ee22 00[0000000000000000-0000000000000000]: AIV gisa format-1 enabled for cpu 000 ... 00 01582725059:916847 3 - 08 0000000094fff572 gisa 0x00000000c9ac2642 cleared In general, that value should not be changed as the GISA facility enhances interruption delivery performance. A reason to switch the GISA facility off might be a performance comparison run or debugging. Signed-off-by: Michael Mueller <mimu@linux.ibm.com> Link: https://lore.kernel.org/r/20200227091031.102993-1-mimu@linux.ibm.com Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-02-27KVM: s390: protvirt: introduce and enable KVM_CAP_S390_PROTECTEDChristian Borntraeger
Now that everything is in place, we can announce the feature. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com>
2020-02-27KVM: s390: protvirt: Add UV cpu reset callsJanosch Frank
For protected VMs, the VCPU resets are done by the Ultravisor, as KVM has no access to the VCPU registers. Note that the ultravisor will only accept a call for the exact reset that has been requested. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> [borntraeger@de.ibm.com: patch merging, splitting, fixing] Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-02-27KVM: s390: protvirt: do not inject interrupts after startChristian Borntraeger
As PSW restart is handled by the ultravisor (and we only get a start notification) we must re-check the PSW after a start before injecting interrupts. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com>
2020-02-27KVM: s390: protvirt: Mask PSW interrupt bits for interception 104 and 112Janosch Frank
We're not allowed to inject interrupts on intercepts that leave the guest state in an "in-between" state where the next SIE entry will do a continuation, namely secure instruction interception (104) and secure prefix interception (112). As our PSW is just a copy of the real one that will be replaced on the next exit, we can mask out the interrupt bits in the PSW to make sure that we do not inject anything. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> [borntraeger@de.ibm.com: patch merging, splitting, fixing] Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-02-27KVM: s390: protvirt: Support cmd 5 operation stateJanosch Frank
Code 5 for the set cpu state UV call tells the UV to load a PSW from the SE header (first IPL) or from guest location 0x0 (diag 308 subcode 0/1). Also it sets the cpu into operating state afterwards, so we can start it. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> [borntraeger@de.ibm.com: patch merging, splitting, fixing] Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-02-27KVM: s390: protvirt: Report CPU state to UltravisorJanosch Frank
VCPU states have to be reported to the ultravisor for SIGP interpretation, kdump, kexec and reboot. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> [borntraeger@de.ibm.com: patch merging, splitting, fixing] Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-02-27KVM: s390: protvirt: UV calls in support of diag308 0, 1Janosch Frank
diag 308 subcode 0 and 1 require several KVM and Ultravisor interactions. Specific to these "soft" reboots are * The "unshare all" UVC * The "prepare for reset" UVC Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> [borntraeger@de.ibm.com: patch merging, splitting, fixing] Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-02-27KVM: s390: protvirt: Only sync fmt4 registersJanosch Frank
A lot of the registers are controlled by the Ultravisor and never visible to KVM. Also some registers are overlayed, like gbea is with sidad, which might leak data to userspace. Hence we sync a minimal set of registers for both SIE formats and then check and sync format 2 registers if necessary. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> [borntraeger@de.ibm.com: patch merging, splitting, fixing] Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-02-27KVM: s390: protvirt: Do only reset registers that are accessibleJanosch Frank
For protected VMs the hypervisor can not access guest breaking event address, program parameter, bpbc and todpr. Do not reset those fields as the control block does not provide access to these fields. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> [borntraeger@de.ibm.com: patch merging, splitting, fixing] Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-02-27KVM: s390: protvirt: disallow one_regJanosch Frank
A lot of the registers are controlled by the Ultravisor and never visible to KVM. Some fields in the sie control block are overlayed, like gbea. As no known userspace uses the ONE_REG interface on s390 if sync regs are available, no functionality is lost if it is disabled for protected guests. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> [borntraeger@de.ibm.com: patch merging, splitting, fixing] Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-02-27KVM: S390: protvirt: Introduce instruction data area bounce bufferJanosch Frank
Now that we can't access guest memory anymore, we have a dedicated satellite block that's a bounce buffer for instruction data. We re-use the memop interface to copy the instruction data to / from userspace. This lets us re-use a lot of QEMU code which used that interface to make logical guest memory accesses which are not possible anymore in protected mode anyway. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> [borntraeger@de.ibm.com: patch merging, splitting, fixing] Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-02-27KVM: s390: protvirt: Add new gprs location handlingJanosch Frank
Guest registers for protected guests are stored at offset 0x380. We will copy those to the usual places. Long term we could refactor this or use register access functions. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> [borntraeger@de.ibm.com: patch merging, splitting, fixing] Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-02-27KVM: s390: protvirt: Add SCLP interrupt handlingChristian Borntraeger
The sclp interrupt is kind of special. The ultravisor polices that we do not inject an sclp interrupt with payload if no sccb is outstanding. On the other hand we have "asynchronous" event interrupts, e.g. for console input. We separate both variants into sclp interrupt and sclp event interrupt. The sclp interrupt is masked until a previous servc instruction has finished (sie exit 108). [frankja@linux.ibm.com: factoring out write_sclp] Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-02-27KVM: s390: protvirt: Secure memory is not mergeableJanosch Frank
KSM will not work on secure pages, because when the kernel reads a secure page, it will be encrypted and hence no two pages will look the same. Let's mark the guest pages as unmergeable when we transition to secure mode. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> [borntraeger@de.ibm.com: patch merging, splitting, fixing] Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-02-27KVM: s390: protvirt: Add initial vm and cpu lifecycle handlingJanosch Frank
This contains 3 main changes: 1. changes in SIE control block handling for secure guests 2. helper functions for create/destroy/unpack secure guests 3. KVM_S390_PV_COMMAND ioctl to allow userspace dealing with secure machines Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> [borntraeger@de.ibm.com: patch merging, splitting, fixing] Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-02-27KVM: s390: protvirt: Add UV debug traceJanosch Frank
Let's have some debug traces which stay around for longer than the guest. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> [borntraeger@de.ibm.com: patch merging, splitting, fixing] Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-02-05Merge tag 'kvm-s390-next-5.6-1' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: Fixes and cleanups for 5.6 - fix register corruption - ENOTSUPP/EOPNOTSUPP mixed - reset cleanups/fixes - selftests
2020-01-31KVM: s390: Add new reset vcpu APIJanosch Frank
The architecture states that we need to reset local IRQs for all CPU resets. Because the old reset interface did not support the normal CPU reset we never did that on a normal reset. Let's implement an interface for the missing normal and clear resets and reset all local IRQs, registers and control structures as stated in the architecture. Userspace might already reset the registers via the vcpu run struct, but as we need the interface for the interrupt clearing part anyway, we implement the resets fully and don't rely on userspace to reset the rest. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Link: https://lore.kernel.org/r/20200131100205.74720-4-frankja@linux.ibm.com Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-01-31KVM: s390: Cleanup initial cpu resetJanosch Frank
The code seems to be quite old and uses lots of unneeded spaces for alignment, which doesn't really help with readability. Let's: * Get rid of the extra spaces * Remove the ULs as they are not needed on 0s * Define constants for the CR 0 and 14 initial values * Use the sizeof of the gcr array to memset it to 0 Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Link: https://lore.kernel.org/r/20200131100205.74720-3-frankja@linux.ibm.com Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>