summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2015-10-13KVM: x86: clean up kvm_arch_vcpu_runnablePaolo Bonzini
Split the huge conditional in two functions. Fixes: 64d6067057d9658acb8675afcfba549abdb7fc16 Cc: stable@vger.kernel.org Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-13KVM: x86: map/unmap private slots in __x86_set_memory_regionPaolo Bonzini
Otherwise, two copies (one of them never populated and thus bogus) are allocated for the regular and SMM address spaces. This breaks SMM with EPT but without unrestricted guest support, because the SMM copy of the identity page map is all zeros. By moving the allocation to the caller we also remove the last vestiges of kernel-allocated memory regions (not accessible anymore in userspace since commit b74a07beed0e, "KVM: Remove kernel-allocated memory regions", 2010-06-21); that is a nice bonus. Reported-by: Alexandre DERUMIER <aderumier@odiso.com> Cc: stable@vger.kernel.org Fixes: 9da0e4d5ac969909f6b435ce28ea28135a9cbd69 Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-13KVM: x86: build kvm_userspace_memory_region in x86_set_memory_regionPaolo Bonzini
The next patch will make x86_set_memory_region fill the userspace_addr. Since the struct is not used untouched anymore, it makes sense to build it in x86_set_memory_region directly; it also simplifies the callers. Reported-by: Alexandre DERUMIER <aderumier@odiso.com> Cc: stable@vger.kernel.org Fixes: 9da0e4d5ac969909f6b435ce28ea28135a9cbd69 Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-13Merge tag 'kvm-s390-next-20151013' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: Fixes for 4.4 A bunch of fixes and optimizations for interrupt and time handling. No fix is important enough to qualify for 4.3 or stable.
2015-10-13KVM: s390: factor out reading of the guest TOD clockDavid Hildenbrand
Let's factor this out and always use get_tod_clock_fast() when reading the guest TOD. STORE CLOCK FAST does not do serialization and, therefore, might result in some fuzziness between different processors in a way that subsequent calls on different CPUs might have time stamps that are earlier. This semantics is fine though for all KVM use cases. To make it obvious that the new function has STORE CLOCK FAST semantics we name it kvm_s390_get_tod_clock_fast. With this patch, we only have a handful of places were we have to care about STP sync (using preempt_disable() logic). Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-10-13KVM: s390: factor out and fix setting of guest TOD clockDavid Hildenbrand
Let's move that whole logic into one function. We now always use unsigned values when calculating the epoch (to avoid over/underflow defined). Also, we always have to get all VCPUs out of SIE before doing the update to avoid running differing VCPUs with different TODs. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-10-13KVM: s390: switch to get_tod_clock() and fix STP sync racesDavid Hildenbrand
Nobody except early.c makes use of store_tod_clock() to handle the cc. So if we would get a cc != 0, we would be in more trouble. Let's replace all users with get_tod_clock(). Returning a cc on an ioctl sounded strange either way. We can now also easily move the get_tod_clock() call into the preempt_disable() section. This is in fact necessary to make the STP sync work as expected. Otherwise the host TOD could change and we would end up with a wrong epoch calculation. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-10-13KVM: s390: correctly handle injection of pgm irqs and per eventsDavid Hildenbrand
PER events can always co-exist with other program interrupts. For now, we always overwrite all program interrupt parameters when injecting any type of program interrupt. Let's handle that correctly by only overwriting the relevant portion of the program interrupt parameters. Therefore we can now inject PER events and ordinary program interrupts concurrently, resulting in no loss of program interrupts. This will especially by helpful when manually detecting PER events later - as both types might be triggered during one SIE exit. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-10-13KVM: s390: simplify in-kernel program irq injectionDavid Hildenbrand
The main reason to keep program injection in kernel separated until now was that we were able to do some checking, if really only the owning thread injects program interrupts (via waitqueue_active(li->wq)). This BUG_ON was never triggered and the chances of really hitting it, if another thread injected a program irq to another vcpu, were very small. Let's drop this check and turn kvm_s390_inject_program_int() and kvm_s390_inject_prog_irq() into simple inline functions that makes use of kvm_s390_inject_vcpu(). __must_check can be dropped as they are implicitely given by kvm_s390_inject_vcpu(), to avoid ugly long function prototypes. Reviewed-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-10-13KVM: s390: drop out early in kvm_s390_has_irq()David Hildenbrand
Let's get rid of the local variable and exit directly if we found any pending interrupt. This is not only faster, but also better readable. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-10-13KVM: s390: kvm_arch_vcpu_runnable already cares about timer interruptsDavid Hildenbrand
We can remove that double check. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-10-13KVM: s390: set interception requests for all floating irqsDavid Hildenbrand
No need to separate pending and floating irqs when setting interception requests. Let's do it for all equally. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-10-13KVM: s390: disabled wait cares about machine checks, not PERDavid Hildenbrand
We don't care about program event recording irqs (synchronous program irqs) but asynchronous irqs when checking for disabled wait. Machine checks were missing. Let's directly switch to the functions we have for that purpose instead of testing once again for magic bits. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-10-13KVM: s390: remove unused variable in __inject_vmChristian Borntraeger
the float int structure is no longer used in __inject_vm. Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-10-01iommu/vt-d: Add a command line parameter for VT-d posted-interruptsFeng Wu
Enable VT-d Posted-Interrtups and add a command line parameter for it. Signed-off-by: Feng Wu <feng.wu@intel.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Joerg Roedel <joro@8bytes.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01KVM: Update Posted-Interrupts Descriptor when vCPU is blockedFeng Wu
This patch updates the Posted-Interrupts Descriptor when vCPU is blocked. pre-block: - Add the vCPU to the blocked per-CPU list - Set 'NV' to POSTED_INTR_WAKEUP_VECTOR post-block: - Remove the vCPU from the per-CPU list Signed-off-by: Feng Wu <feng.wu@intel.com> [Concentrate invocation of pre/post-block hooks to vcpu_block. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01KVM: Update Posted-Interrupts Descriptor when vCPU is preemptedFeng Wu
This patch updates the Posted-Interrupts Descriptor when vCPU is preempted. sched out: - Set 'SN' to suppress furture non-urgent interrupts posted for the vCPU. sched in: - Clear 'SN' - Change NDST if vCPU is scheduled to a different CPU - Set 'NV' to POSTED_INTR_VECTOR Signed-off-by: Feng Wu <feng.wu@intel.com> [Include asm/cpu.h to fix !CONFIG_SMP compilation. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01KVM: x86: select IRQ_BYPASS_MANAGERFeng Wu
Select IRQ_BYPASS_MANAGER for x86 when CONFIG_KVM is set Signed-off-by: Feng Wu <feng.wu@intel.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01KVM: x86: Update IRTE for posted-interruptsFeng Wu
This patch adds the routine to update IRTE for posted-interrupts when guest changes the interrupt configuration. Signed-off-by: Feng Wu <feng.wu@intel.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> [Squashed in automatically generated patch from the build robot "KVM: x86: vcpu_to_pi_desc() can be static" - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01vfio: Register/unregister irq_bypass_producerFeng Wu
This patch adds the registration/unregistration of an irq_bypass_producer for MSI/MSIx on vfio pci devices. Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Feng Wu <feng.wu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01KVM: make kvm_set_msi_irq() publicFeng Wu
Make kvm_set_msi_irq() public, we can use this function outside. Signed-off-by: Feng Wu <feng.wu@intel.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01KVM: Define a new interface kvm_intr_is_single_vcpu()Feng Wu
This patch defines a new interface kvm_intr_is_single_vcpu(), which can returns whether the interrupt is for single-CPU or not. It is used by VT-d PI, since now we only support single-CPU interrupts, For lowest-priority interrupts, if user configures it via /proc/irq or uses irqbalance to make it single-CPU, we can use PI to deliver the interrupts to it. Full functionality of lowest-priority support will be added later. Signed-off-by: Feng Wu <feng.wu@intel.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01KVM: Add some helper functions for Posted-InterruptsFeng Wu
This patch adds some helper functions to manipulate the Posted-Interrupts Descriptor. Signed-off-by: Feng Wu <feng.wu@intel.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> [Make the new functions inline. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01KVM: Extend struct pi_desc for VT-d Posted-InterruptsFeng Wu
Extend struct pi_desc for VT-d Posted-Interrupts. Signed-off-by: Feng Wu <feng.wu@intel.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01KVM: Add an arch specific hooks in 'struct kvm_kernel_irqfd'Feng Wu
This patch adds an arch specific hooks 'arch_update' in 'struct kvm_kernel_irqfd'. On Intel side, it is used to update the IRTE when VT-d posted-interrupts is used. Signed-off-by: Feng Wu <feng.wu@intel.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01KVM: eventfd: add irq bypass consumer managementEric Auger
This patch adds the registration/unregistration of an irq_bypass_consumer on irqfd assignment/deassignment. Signed-off-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Feng Wu <feng.wu@intel.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01KVM: introduce kvm_arch functions for IRQ bypassEric Auger
This patch introduces - kvm_arch_irq_bypass_add_producer - kvm_arch_irq_bypass_del_producer - kvm_arch_irq_bypass_stop - kvm_arch_irq_bypass_start They make possible to specialize the KVM IRQ bypass consumer in case CONFIG_KVM_HAVE_IRQ_BYPASS is set. Signed-off-by: Eric Auger <eric.auger@linaro.org> [Add weak implementations of the callbacks. - Feng] Signed-off-by: Feng Wu <feng.wu@intel.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01KVM: create kvm_irqfd.hEric Auger
Move _irqfd_resampler and _irqfd struct declarations in a new public header: kvm_irqfd.h. They are respectively renamed into kvm_kernel_irqfd_resampler and kvm_kernel_irqfd. Those datatypes will be used by architecture specific code, in the context of IRQ bypass manager integration. Signed-off-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Feng Wu <feng.wu@intel.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01virt: Add virt directory to the top MakefileFeng Wu
We need to build files in virt/lib/, which are now used by KVM and VFIO, so add virt directory to the top Makefile. Signed-off-by: Feng Wu <feng.wu@intel.com> Acked-by: Michal Marek <mmarek@suse.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01virt: IRQ bypass managerAlex Williamson
When a physical I/O device is assigned to a virtual machine through facilities like VFIO and KVM, the interrupt for the device generally bounces through the host system before being injected into the VM. However, hardware technologies exist that often allow the host to be bypassed for some of these scenarios. Intel Posted Interrupts allow the specified physical edge interrupts to be directly injected into a guest when delivered to a physical processor while the vCPU is running. ARM IRQ Forwarding allows forwarded physical interrupts to be directly deactivated by the guest. The IRQ bypass manager here is meant to provide the shim to connect interrupt producers, generally the host physical device driver, with interrupt consumers, generally the hypervisor, in order to configure these bypass mechanism. To do this, we base the connection on a shared, opaque token. For KVM-VFIO this is expected to be an eventfd_ctx since this is the connection we already use to connect an eventfd to an irqfd on the in-kernel path. When a producer and consumer with matching tokens is found, callbacks via both registered participants allow the bypass facilities to be automatically enabled. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Eric Auger <eric.auger@linaro.org> Tested-by: Eric Auger <eric.auger@linaro.org> Tested-by: Feng Wu <feng.wu@intel.com> Signed-off-by: Feng Wu <feng.wu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01irq_remapping: move structs outside #ifdefPaolo Bonzini
This is friendlier to clients of the code, who are going to prepare vcpu_data structs unconditionally, even if CONFIG_IRQ_REMAP is not defined. Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01x86: kvmclock: abolish PVCLOCK_COUNTS_FROM_ZERORadim Krčmář
Newer KVM won't be exposing PVCLOCK_COUNTS_FROM_ZERO anymore. The purpose of that flags was to start counting system time from 0 when the KVM clock has been initialized. We can achieve the same by selecting one read as the initial point. A simple subtraction will work unless the KVM clock count overflows earlier (has smaller width) than scheduler's cycle count. We should be safe till x86_128. Because PVCLOCK_COUNTS_FROM_ZERO was enabled only on new hypervisors, setting sched clock as stable based on PVCLOCK_TSC_STABLE_BIT might regress on older ones. I presume we don't need to change kvm_clock_read instead of introducing kvm_sched_clock_read. A problem could arise in case sched_clock is expected to return the same value as get_cycles, but we should have merged those clocks in that case. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Acked-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01KVM: VMX: drop rdtscp_enabled fieldXiao Guangrong
Check cpuid bit instead of it Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01KVM: VMX: clean up bit operation on SECONDARY_VM_EXEC_CONTROLXiao Guangrong
Use vmcs_set_bits() and vmcs_clear_bits() to clean up the code Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01KVM: VMX: unify SECONDARY_VM_EXEC_CONTROL updateXiao Guangrong
Unify the update in vmx_cpuid_update() Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> [Rewrite to use vmcs_set_secondary_exec_control. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01KVM: VMX: align vmx->nested.nested_vmx_secondary_ctls_high to ↵Paolo Bonzini
vmx->rdtscp_enabled The SECONDARY_EXEC_RDTSCP must be available iff RDTSCP is enabled in the guest. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01KVM: VMX: simplify invpcid handling in vmx_cpuid_update()Xiao Guangrong
If vmx_invpcid_supported() is true, second execution control filed must be supported and SECONDARY_EXEC_ENABLE_INVPCID must have already been set in current vmcs by vmx_secondary_exec_control() If vmx_invpcid_supported() is false, no need to clear SECONDARY_EXEC_ENABLE_INVPCID Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01KVM: VMX: simplify rdtscp handling in vmx_cpuid_update()Xiao Guangrong
if vmx_rdtscp_supported() is true SECONDARY_EXEC_RDTSCP must have already been set in current vmcs by vmx_secondary_exec_control() Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01KVM: VMX: drop rdtscp_enabled check in prepare_vmcs02()Xiao Guangrong
SECONDARY_EXEC_RDTSCP set for L2 guest comes from vmcs12 Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01KVM: x86: add pcommit supportXiao Guangrong
Pass PCOMMIT CPU feature to guest to enable PCOMMIT instruction Currently we do not catch pcommit instruction for L1 guest and allow L1 to catch this instruction for L2 if, as required by the spec, L1 can enumerate the PCOMMIT instruction via CPUID: | IA32_VMX_PROCBASED_CTLS2[53] (which enumerates support for the | 1-setting of PCOMMIT exiting) is always the same as | CPUID.07H:EBX.PCOMMIT[bit 22]. Thus, software can set PCOMMIT exiting | to 1 if and only if the PCOMMIT instruction is enumerated via CPUID The spec can be found at https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01KVM: x86: allow guest to use cflushopt and clwbXiao Guangrong
Pass these CPU features to guest to enable them in guest They are needed by nvdimm drivers Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01KVM: vmx: disable posted interrupts if no local APICPaolo Bonzini
Uniprocessor 32-bit randconfigs can disable the local APIC, and posted interrupts require reserving a vector on the LAPIC, so they are incompatible. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01kvm/x86: Hyper-V HV_X64_MSR_VP_RUNTIME supportAndrey Smetanin
HV_X64_MSR_VP_RUNTIME msr used by guest to get "the time the virtual processor consumes running guest code, and the time the associated logical processor spends running hypervisor code on behalf of that guest." Calculation of this time is performed by task_cputime_adjusted() for vcpu task. Necessary to support loading of winhv.sys in guest, which in turn is required to support Windows VMBus. Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Gleb Natapov <gleb@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01kvm/x86: Hyper-V HV_X64_MSR_VP_INDEX export for QEMU.Andrey Smetanin
Insert Hyper-V HV_X64_MSR_VP_INDEX into msr's emulated list, so QEMU can set Hyper-V features cpuid HV_X64_MSR_VP_INDEX_AVAILABLE bit correctly. KVM emulation part is in place already. Necessary to support loading of winhv.sys in guest, which in turn is required to support Windows VMBus. Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Gleb Natapov <gleb@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01kvm/x86: Hyper-V HV_X64_MSR_RESET msrAndrey Smetanin
HV_X64_MSR_RESET msr is used by Hyper-V based Windows guest to reset guest VM by hypervisor. Necessary to support loading of winhv.sys in guest, which in turn is required to support Windows VMBus. Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Gleb Natapov <gleb@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01kvm: add capability for any-length ioeventfdsJason Wang
Cc: Gleb Natapov <gleb@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01kvm: add tracepoint for fast mmioJason Wang
Cc: Gleb Natapov <gleb@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01kvm: use kmalloc() instead of kzalloc() during iodev register/unregisterJason Wang
All fields of kvm_io_range were initialized or copied explicitly afterwards. So switch to use kmalloc(). Cc: Gleb Natapov <gleb@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01KVM: x86: Add support for local interrupt requests from userspaceSteve Rutherford
In order to enable userspace PIC support, the userspace PIC needs to be able to inject local interrupts even when the APICs are in the kernel. KVM_INTERRUPT now supports sending local interrupts to an APIC when APICs are in the kernel. The ready_for_interrupt_request flag is now only set when the CPU/APIC will immediately accept and inject an interrupt (i.e. APIC has not masked the PIC). When the PIC wishes to initiate an INTA cycle with, say, CPU0, it kicks CPU0 out of the guest, and renedezvous with CPU0 once it arrives in userspace. When the CPU/APIC unmasks the PIC, a KVM_EXIT_IRQ_WINDOW_OPEN is triggered, so that userspace has a chance to inject a PIC interrupt if it had been pending. Overall, this design can lead to a small number of spurious userspace renedezvous. In particular, whenever the PIC transistions from low to high while it is masked and whenever the PIC becomes unmasked while it is low. Note: this does not buffer more than one local interrupt in the kernel, so the VMM needs to enter the guest in order to complete interrupt injection before injecting an additional interrupt. Compiles for x86. Can pass the KVM Unit Tests. Signed-off-by: Steve Rutherford <srutherford@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01KVM: x86: Add EOI exit bitmap inferenceSteve Rutherford
In order to support a userspace IOAPIC interacting with an in kernel APIC, the EOI exit bitmaps need to be configurable. If the IOAPIC is in userspace (i.e. the irqchip has been split), the EOI exit bitmaps will be set whenever the GSI Routes are configured. In particular, for the low MSI routes are reservable for userspace IOAPICs. For these MSI routes, the EOI Exit bit corresponding to the destination vector of the route will be set for the destination VCPU. The intention is for the userspace IOAPICs to use the reservable MSI routes to inject interrupts into the guest. This is a slight abuse of the notion of an MSI Route, given that MSIs classically bypass the IOAPIC. It might be worthwhile to add an additional route type to improve clarity. Compile tested for Intel x86. Signed-off-by: Steve Rutherford <srutherford@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>