summaryrefslogtreecommitdiff
path: root/drivers/virt/acrn
AgeCommit message (Collapse)Author
2023-11-28eventfd: simplify eventfd_signal()Christian Brauner
Ever since the eventfd type was introduced back in 2007 in commit e1ad7468c77d ("signal/timer/event: eventfd core") the eventfd_signal() function only ever passed 1 as a value for @n. There's no point in keeping that additional argument. Link: https://lore.kernel.org/r/20231122-vfs-eventfd-signal-v2-2-bd549b14ce0c@kernel.org Acked-by: Xu Yilun <yilun.xu@intel.com> Acked-by: Andrew Donnellan <ajd@linux.ibm.com> # ocxl Acked-by: Eric Farman <farman@linux.ibm.com> # s390 Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-10-30Merge tag 'hardening-v6.7-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening updates from Kees Cook: "One of the more voluminous set of changes is for adding the new __counted_by annotation[1] to gain run-time bounds checking of dynamically sized arrays with UBSan. - Add LKDTM test for stuck CPUs (Mark Rutland) - Improve LKDTM selftest behavior under UBSan (Ricardo Cañuelo) - Refactor more 1-element arrays into flexible arrays (Gustavo A. R. Silva) - Analyze and replace strlcpy and strncpy uses (Justin Stitt, Azeem Shaikh) - Convert group_info.usage to refcount_t (Elena Reshetova) - Add __counted_by annotations (Kees Cook, Gustavo A. R. Silva) - Add Kconfig fragment for basic hardening options (Kees Cook, Lukas Bulwahn) - Fix randstruct GCC plugin performance mode to stay in groups (Kees Cook) - Fix strtomem() compile-time check for small sources (Kees Cook)" * tag 'hardening-v6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (56 commits) hwmon: (acpi_power_meter) replace open-coded kmemdup_nul reset: Annotate struct reset_control_array with __counted_by kexec: Annotate struct crash_mem with __counted_by virtio_console: Annotate struct port_buffer with __counted_by ima: Add __counted_by for struct modsig and use struct_size() MAINTAINERS: Include stackleak paths in hardening entry string: Adjust strtomem() logic to allow for smaller sources hardening: x86: drop reference to removed config AMD_IOMMU_V2 randstruct: Fix gcc-plugin performance mode to stay in group mailbox: zynqmp: Annotate struct zynqmp_ipi_pdata with __counted_by drivers: thermal: tsens: Annotate struct tsens_priv with __counted_by irqchip/imx-intmux: Annotate struct intmux_data with __counted_by KVM: Annotate struct kvm_irq_routing_table with __counted_by virt: acrn: Annotate struct vm_memory_region_batch with __counted_by hwmon: Annotate struct gsc_hwmon_platform_data with __counted_by sparc: Annotate struct cpuinfo_tree with __counted_by isdn: kcapi: replace deprecated strncpy with strscpy_pad isdn: replace deprecated strncpy with strscpy NFS/flexfiles: Annotate struct nfs4_ff_layout_segment with __counted_by nfs41: Annotate struct nfs4_file_layout_dsaddr with __counted_by ...
2023-10-10x86/cpu: Encapsulate topology information in cpuinfo_x86Thomas Gleixner
The topology related information is randomly scattered across cpuinfo_x86. Create a new structure cpuinfo_topo and move in a first step initial_apicid and apicid into it. Aside of being better readable this is in preparation for replacing the horribly fragile CPU topology evaluation code further down the road. Consolidate APIC ID fields to u32 as that represents the hardware type. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Tested-by: Michael Kelley <mikelley@microsoft.com> Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Zhang Rui <rui.zhang@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20230814085112.269787744@linutronix.de
2023-10-08virt: acrn: Annotate struct vm_memory_region_batch with __counted_byKees Cook
Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct vm_memory_region_batch. Additionally, since the element count member must be set before accessing the annotated flexible array member, move its initialization earlier. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Fei Li <fei1.li@intel.com> Reviewed-by: "Gustavo A. R. Silva" <gustavoars@kernel.org> Link: https://lore.kernel.org/r/20230922175102.work.020-kees@kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
2023-08-24minmax: add in_range() macroMatthew Wilcox (Oracle)
Patch series "New page table range API", v6. This patchset changes the API used by the MM to set up page table entries. The four APIs are: set_ptes(mm, addr, ptep, pte, nr) update_mmu_cache_range(vma, addr, ptep, nr) flush_dcache_folio(folio) flush_icache_pages(vma, page, nr) flush_dcache_folio() isn't technically new, but no architecture implemented it, so I've done that for them. The old APIs remain around but are mostly implemented by calling the new interfaces. The new APIs are based around setting up N page table entries at once. The N entries belong to the same PMD, the same folio and the same VMA, so ptep++ is a legitimate operation, and locking is taken care of for you. Some architectures can do a better job of it than just a loop, but I have hesitated to make too deep a change to architectures I don't understand well. One thing I have changed in every architecture is that PG_arch_1 is now a per-folio bit instead of a per-page bit when used for dcache clean/dirty tracking. This was something that would have to happen eventually, and it makes sense to do it now rather than iterate over every page involved in a cache flush and figure out if it needs to happen. The point of all this is better performance, and Fengwei Yin has measured improvement on x86. I suspect you'll see improvement on your architecture too. Try the new will-it-scale test mentioned here: https://lore.kernel.org/linux-mm/20230206140639.538867-5-fengwei.yin@intel.com/ You'll need to run it on an XFS filesystem and have CONFIG_TRANSPARENT_HUGEPAGE set. This patchset is the basis for much of the anonymous large folio work being done by Ryan, so it's received quite a lot of testing over the last few months. This patch (of 38): Determine if a value lies within a range more efficiently (subtraction + comparison vs two comparisons and an AND). It also has useful (under some circumstances) behaviour if the range exceeds the maximum value of the type. Convert all the conflicting definitions of in_range() within the kernel; some can use the generic definition while others need their own definition. Link: https://lkml.kernel.org/r/20230802151406.3735276-1-willy@infradead.org Link: https://lkml.kernel.org/r/20230802151406.3735276-2-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-05-08virt: acrn: Use alloc_ordered_workqueue() to create ordered workqueuesTejun Heo
BACKGROUND ========== When multiple work items are queued to a workqueue, their execution order doesn't match the queueing order. They may get executed in any order and simultaneously. When fully serialized execution - one by one in the queueing order - is needed, an ordered workqueue should be used which can be created with alloc_ordered_workqueue(). However, alloc_ordered_workqueue() was a later addition. Before it, an ordered workqueue could be obtained by creating an UNBOUND workqueue with @max_active==1. This originally was an implementation side-effect which was broken by 4c16bd327c74 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered"). Because there were users that depended on the ordered execution, 5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered") made workqueue allocation path to implicitly promote UNBOUND workqueues w/ @max_active==1 to ordered workqueues. While this has worked okay, overloading the UNBOUND allocation interface this way creates other issues. It's difficult to tell whether a given workqueue actually needs to be ordered and users that legitimately want a min concurrency level wq unexpectedly gets an ordered one instead. With planned UNBOUND workqueue updates to improve execution locality and more prevalence of chiplet designs which can benefit from such improvements, this isn't a state we wanna be in forever. This patch series audits all callsites that create an UNBOUND workqueue w/ @max_active==1 and converts them to alloc_ordered_workqueue() as necessary. WHAT TO LOOK FOR ================ The conversions are from alloc_workqueue(WQ_UNBOUND | flags, 1, args..) to alloc_ordered_workqueue(flags, args...) which don't cause any functional changes. If you know that fully ordered execution is not ncessary, please let me know. I'll drop the conversion and instead add a comment noting the fact to reduce confusion while conversion is in progress. If you aren't fully sure, it's completely fine to let the conversion through. The behavior will stay exactly the same and we can always reconsider later. As there are follow-up workqueue core changes, I'd really appreciate if the patch can be routed through the workqueue tree w/ your acks. Thanks. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Fei Li <fei1.li@intel.com>
2022-07-08virt: acrn: using for_each_set_bit to simplify the codeYang Yingliang
It's more cleanly to use for_each_set_bit() instead of opencoding it. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Fei Li <fei1.li@intel.com> Link: https://lore.kernel.org/r/20220704125044.2192381-1-yangyingliang@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-26virt: acrn: Prefer array_size and struct_size over open coded arithmeticLen Baker
As noted in the "Deprecated Interfaces, Language Features, Attributes, and Conventions" documentation [1], size calculations (especially multiplication) should not be performed in memory allocator (or similar) function arguments due to the risk of them overflowing. This could lead to values wrapping around and a smaller allocation being made than the caller was expecting. Using those allocations could lead to linear overflows of heap memory and other misbehaviors. So, use the array_size() helper to do the arithmetic instead of the argument "count * size" in the vzalloc() function. Also, take the opportunity to add a flexible array member of struct vm_memory_region_op to the vm_memory_region_batch structure. And then, change the code accordingly and use the struct_size() helper to do the arithmetic instead of the argument "size + size * count" in the kzalloc function. This code was detected with the help of Coccinelle and audited and fixed manually. [1] https://www.kernel.org/doc/html/latest/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments Acked-by: Fei Li <fei1.li@intel.com> Signed-off-by: Len Baker <len.baker@gmx.com> Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2022-03-18virt: acrn: fix a memory leak in acrn_dev_ioctl()Xiaolong Huang
The vm_param and cpu_regs need to be freed via kfree() before return -EINVAL error. Fixes: 9c5137aedd11 ("virt: acrn: Introduce VM management interfaces") Fixes: 2ad2aaee1bc9 ("virt: acrn: Introduce an ioctl to set vCPU registers state") Signed-off-by: Xiaolong Huang <butterflyhuangxx@gmail.com> Signed-off-by: Fei Li <fei1.li@intel.com> Link: https://lore.kernel.org/r/20220308092047.1008409-1-butterflyhuangxx@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-18virt: acrn: obtain pa from VMA with PFNMAP flagYonghua Huang
acrn_vm_ram_map can't pin the user pages with VM_PFNMAP flag by calling get_user_pages_fast(), the PA(physical pages) may be mapped by kernel driver and set PFNMAP flag. This patch fixes logic to setup EPT mapping for PFN mapped RAM region by checking the memory attribute before adding EPT mapping for them. Fixes: 88f537d5e8dd ("virt: acrn: Introduce EPT mapping management") Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Signed-off-by: Fei Li <fei1.li@intel.com> Link: https://lore.kernel.org/r/20220228022212.419406-1-yonghua.huang@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-18virt: acrn: Remove unsued acrn_irqfds_mutex.Sebastian Andrzej Siewior
acrn_irqfds_mutex is not used, never was. Remove acrn_irqfds_mutex. Fixes: aa3b483ff1d71 ("virt: acrn: Introduce irqfd") Cc: Fei Li <fei1.li@intel.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://lore.kernel.org/r/YidLo57Kw/u/cpA5@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-01-15all: replace find_next{,_zero}_bit with find_first{,_zero}_bit where appropriateYury Norov
find_first{,_zero}_bit is a more effective analogue of 'next' version if start == 0. This patch replaces 'next' with 'first' where things look trivial. Signed-off-by: Yury Norov <yury.norov@gmail.com> Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
2021-10-05virt: acrn: Introduce interfaces for virtual device creating/destroyingShuo Liu
The ACRN hypervisor can emulate a virtual device within hypervisor for a Guest VM. The emulated virtual device can work without the ACRN userspace after creation. The hypervisor do the emulation of that device. To support the virtual device creating/destroying, HSM provides the following ioctls: - ACRN_IOCTL_CREATE_VDEV Pass data struct acrn_vdev from userspace to the hypervisor, and inform the hypervisor to create a virtual device for a User VM. - ACRN_IOCTL_DESTROY_VDEV Pass data struct acrn_vdev from userspace to the hypervisor, and inform the hypervisor to destroy a virtual device of a User VM. These new APIs will be used by user space code vm_add_hv_vdev and vm_remove_hv_vdev in https://github.com/projectacrn/acrn-hypervisor/blob/master/devicemodel/core/vmmapi.c Signed-off-by: Shuo Liu <shuo.a.liu@intel.com> Signed-off-by: Fei Li <fei1.li@intel.com> Link: https://lore.kernel.org/r/20210923084128.18902-3-fei1.li@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-05virt: acrn: Introduce interfaces for MMIO device passthroughShuo Liu
MMIO device passthrough enables an OS in a virtual machine to directly access a MMIO device in the host. It promises almost the native performance, which is required in performance-critical scenarios of ACRN. HSM provides the following ioctls: - Assign - ACRN_IOCTL_ASSIGN_MMIODEV Pass data struct acrn_mmiodev from userspace to the hypervisor, and inform the hypervisor to assign a MMIO device to a User VM. - De-assign - ACRN_IOCTL_DEASSIGN_PCIDEV Pass data struct acrn_mmiodev from userspace to the hypervisor, and inform the hypervisor to de-assign a MMIO device from a User VM. These new APIs will be used by user space code vm_assign_mmiodev and vm_deassign_mmiodev in https://github.com/projectacrn/acrn-hypervisor/blob/master/devicemodel/core/vmmapi.c Signed-off-by: Shuo Liu <shuo.a.liu@intel.com> Signed-off-by: Fei Li <fei1.li@intel.com> Link: https://lore.kernel.org/r/20210923084128.18902-2-fei1.li@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-27virt: acrn: Do hcall_destroy_vm() before resource releaseShuo Liu
The ACRN hypervisor has scenarios which could run a real-time guest VM. The real-time guest VM occupies dedicated CPU cores, be assigned with dedicated PCI devices. It can run without the Service VM after boot up. hcall_destroy_vm() returns failure when a real-time guest VM refuses. The clearing of flag ACRN_VM_FLAG_DESTROYED causes some kernel resource double-freed in a later acrn_vm_destroy(). Do hcall_destroy_vm() before resource release to drop this chance to destroy the VM if hypercall fails. Fixes: 9c5137aedd11 ("virt: acrn: Introduce VM management interfaces") Cc: stable <stable@vger.kernel.org> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com> Signed-off-by: Fei Li <fei1.li@intel.com> Link: https://lore.kernel.org/r/20210722062736.15050-1-fei1.li@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24virt: acrn: Fix document of acrn_msi_inject()Shuo Liu
This fixes below build warning with extra build checks. $ make W=1 ../drivers/virt/acrn/vm.c:105: warning: expecting prototype for acrn_inject_msi(). Prototype was for acrn_msi_inject() instead Fixes: c7cf8d27244f ("virt: acrn: Introduce interrupt injection interfaces") Signed-off-by: Shuo Liu <shuo.a.liu@intel.com> Link: https://lore.kernel.org/r/20210311015206.19715-1-shuo.a.liu@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-10virt: acrn: Correct type casting of argument of copy_from_user()Shuo Liu
hsm.c:336:50: warning: incorrect type in argument 2 (different address spaces) hsm.c:336:50: expected void const [noderef] __user *from hsm.c:336:50: got void * This patch fixes above sparse warning. Fixes: 3d679d5aec64 ("virt: acrn: Introduce interfaces to query C-states and P-states allowed by hypervisor") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com> Link: https://lore.kernel.org/r/20210310153708.17451-1-shuo.a.liu@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-10virt: acrn: Use EPOLLIN instead of POLLINYejune Deng
This fixes the following sparse warning: "sparse warnings: (new ones prefixed by >>)" >> drivers/virt/acrn/irqfd.c:163:13: sparse: sparse: restricted __poll_t degrades to integer Fixes: dcf9625f2adf ("virt: acrn: Use vfs_poll() instead of f_op->poll()") Reported-by: kernel test robot <lkp@intel.com> Acked-by: Shuo Liu <shuo.a.liu@intel.com> Signed-off-by: Yejune Deng <yejune.deng@gmail.com> Link: https://lore.kernel.org/r/20210310074901.7486-1-yejune.deng@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-10virt: acrn: Use vfs_poll() instead of f_op->poll()Yejune Deng
Use a more advanced function vfs_poll() in acrn_irqfd_assign(). At the same time, modify the definition of events. Signed-off-by: Yejune Deng <yejune.deng@gmail.com> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com> Link: https://lore.kernel.org/r/20210221133306.33530-1-shuo.a.liu@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-10virt: acrn: Make remove_cpu sysfs invisible with !CONFIG_HOTPLUG_CPUShuo Liu
Without cpu hotplug support, vCPU cannot be removed from a Service VM. Don't expose remove_cpu sysfs when CONFIG_HOTPLUG_CPU disabled. Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Qais Yousef <qais.yousef@arm.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Signed-off-by: Shuo Liu <shuo.a.liu@intel.com> Link: https://lore.kernel.org/r/20210221134339.57851-2-shuo.a.liu@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-09virt: acrn: Introduce an interface for Service VM to control vCPUShuo Liu
ACRN supports partition mode to achieve real-time requirements. In partition mode, a CPU core can be dedicated to a vCPU of User VM. The local APIC of the dedicated CPU core can be passthrough to the User VM. The Service VM controls the assignment of the CPU cores. Introduce an interface for the Service VM to remove the control of CPU core from hypervisor perspective so that the CPU core can be a dedicated CPU core of User VM. Cc: Zhi Wang <zhi.a.wang@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Yu Wang <yu1.wang@intel.com> Cc: Reinette Chatre <reinette.chatre@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com> Link: https://lore.kernel.org/r/20210207031040.49576-18-shuo.a.liu@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-09virt: acrn: Introduce irqfdShuo Liu
irqfd is a mechanism to inject a specific interrupt to a User VM using a decoupled eventfd mechanism. Vhost is a kernel-level virtio server which uses eventfd for interrupt injection. To support vhost on ACRN, irqfd is introduced in HSM. HSM provides ioctls to associate a virtual Message Signaled Interrupt (MSI) with an eventfd. The corresponding virtual MSI will be injected into a User VM once the eventfd got signal. Cc: Zhi Wang <zhi.a.wang@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Yu Wang <yu1.wang@intel.com> Cc: Reinette Chatre <reinette.chatre@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com> Link: https://lore.kernel.org/r/20210207031040.49576-17-shuo.a.liu@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-09virt: acrn: Introduce ioeventfdShuo Liu
ioeventfd is a mechanism to register PIO/MMIO regions to trigger an eventfd signal when written to by a User VM. ACRN userspace can register any arbitrary I/O address with a corresponding eventfd and then pass the eventfd to a specific end-point of interest for handling. Vhost is a kernel-level virtio server which uses eventfd for signalling. To support vhost on ACRN, ioeventfd is introduced in HSM. A new I/O client dedicated to ioeventfd is associated with a User VM during VM creation. HSM provides ioctls to associate an I/O region with a eventfd. The I/O client signals a eventfd once its corresponding I/O region is matched with an I/O request. Cc: Zhi Wang <zhi.a.wang@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Yu Wang <yu1.wang@intel.com> Cc: Reinette Chatre <reinette.chatre@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com> Link: https://lore.kernel.org/r/20210207031040.49576-16-shuo.a.liu@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-09virt: acrn: Introduce I/O ranges operation interfacesShuo Liu
An I/O request of a User VM, which is constructed by hypervisor, is distributed by the ACRN Hypervisor Service Module to an I/O client corresponding to the address range of the I/O request. I/O client maintains a list of address ranges. Introduce acrn_ioreq_range_{add,del}() to manage these address ranges. Cc: Zhi Wang <zhi.a.wang@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Yu Wang <yu1.wang@intel.com> Cc: Reinette Chatre <reinette.chatre@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com> Link: https://lore.kernel.org/r/20210207031040.49576-15-shuo.a.liu@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-09virt: acrn: Introduce interfaces to query C-states and P-states allowed by ↵Shuo Liu
hypervisor The C-states and P-states data are used to support CPU power management. The hypervisor controls C-states and P-states for a User VM. ACRN userspace need to query the data from the hypervisor to build ACPI tables for a User VM. HSM provides ioctls for ACRN userspace to query C-states and P-states data obtained from the hypervisor. Cc: Zhi Wang <zhi.a.wang@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Yu Wang <yu1.wang@intel.com> Cc: Reinette Chatre <reinette.chatre@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com> Link: https://lore.kernel.org/r/20210207031040.49576-14-shuo.a.liu@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-09virt: acrn: Introduce interrupt injection interfacesShuo Liu
ACRN userspace need to inject virtual interrupts into a User VM in devices emulation. HSM needs provide interfaces to do so. Introduce following interrupt injection interfaces: ioctl ACRN_IOCTL_SET_IRQLINE: Pass data from userspace to the hypervisor, and inform the hypervisor to inject a virtual IOAPIC GSI interrupt to a User VM. ioctl ACRN_IOCTL_INJECT_MSI: Pass data struct acrn_msi_entry from userspace to the hypervisor, and inform the hypervisor to inject a virtual MSI to a User VM. ioctl ACRN_IOCTL_VM_INTR_MONITOR: Set a 4-Kbyte aligned shared page for statistics information of interrupts of a User VM. Cc: Zhi Wang <zhi.a.wang@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Yu Wang <yu1.wang@intel.com> Cc: Reinette Chatre <reinette.chatre@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com> Link: https://lore.kernel.org/r/20210207031040.49576-13-shuo.a.liu@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-09virt: acrn: Introduce interfaces for PCI device passthroughShuo Liu
PCI device passthrough enables an OS in a virtual machine to directly access a PCI device in the host. It promises almost the native performance, which is required in performance-critical scenarios of ACRN. HSM provides the following ioctls: - Assign - ACRN_IOCTL_ASSIGN_PCIDEV Pass data struct acrn_pcidev from userspace to the hypervisor, and inform the hypervisor to assign a PCI device to a User VM. - De-assign - ACRN_IOCTL_DEASSIGN_PCIDEV Pass data struct acrn_pcidev from userspace to the hypervisor, and inform the hypervisor to de-assign a PCI device from a User VM. - Set a interrupt of a passthrough device - ACRN_IOCTL_SET_PTDEV_INTR Pass data struct acrn_ptdev_irq from userspace to the hypervisor, and inform the hypervisor to map a INTx interrupt of passthrough device of User VM. - Reset passthrough device interrupt - ACRN_IOCTL_RESET_PTDEV_INTR Pass data struct acrn_ptdev_irq from userspace to the hypervisor, and inform the hypervisor to unmap a INTx interrupt of passthrough device of User VM. Cc: Zhi Wang <zhi.a.wang@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Yu Wang <yu1.wang@intel.com> Cc: Reinette Chatre <reinette.chatre@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com> Link: https://lore.kernel.org/r/20210207031040.49576-12-shuo.a.liu@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-09virt: acrn: Introduce PCI configuration space PIO accesses combinerShuo Liu
A User VM can access its virtual PCI configuration spaces via port IO approach, which has two following steps: 1) writes address into port 0xCF8 2) put/get data in/from port 0xCFC To distribute a complete PCI configuration space access one time, HSM need to combine such two accesses together. Combine two paired PIO I/O requests into one PCI I/O request and continue the I/O request distribution. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com> Link: https://lore.kernel.org/r/20210207031040.49576-11-shuo.a.liu@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-09virt: acrn: Introduce I/O request managementShuo Liu
An I/O request of a User VM, which is constructed by the hypervisor, is distributed by the ACRN Hypervisor Service Module to an I/O client corresponding to the address range of the I/O request. For each User VM, there is a shared 4-KByte memory region used for I/O requests communication between the hypervisor and Service VM. An I/O request is a 256-byte structure buffer, which is 'struct acrn_io_request', that is filled by an I/O handler of the hypervisor when a trapped I/O access happens in a User VM. ACRN userspace in the Service VM first allocates a 4-KByte page and passes the GPA (Guest Physical Address) of the buffer to the hypervisor. The buffer is used as an array of 16 I/O request slots with each I/O request slot being 256 bytes. This array is indexed by vCPU ID. An I/O client, which is 'struct acrn_ioreq_client', is responsible for handling User VM I/O requests whose accessed GPA falls in a certain range. Multiple I/O clients can be associated with each User VM. There is a special client associated with each User VM, called the default client, that handles all I/O requests that do not fit into the range of any other I/O clients. The ACRN userspace acts as the default client for each User VM. The state transitions of a ACRN I/O request are as follows. FREE -> PENDING -> PROCESSING -> COMPLETE -> FREE -> ... FREE: this I/O request slot is empty PENDING: a valid I/O request is pending in this slot PROCESSING: the I/O request is being processed COMPLETE: the I/O request has been processed An I/O request in COMPLETE or FREE state is owned by the hypervisor. HSM and ACRN userspace are in charge of processing the others. The processing flow of I/O requests are listed as following: a) The I/O handler of the hypervisor will fill an I/O request with PENDING state when a trapped I/O access happens in a User VM. b) The hypervisor makes an upcall, which is a notification interrupt, to the Service VM. c) The upcall handler schedules a worker to dispatch I/O requests. d) The worker looks for the PENDING I/O requests, assigns them to different registered clients based on the address of the I/O accesses, updates their state to PROCESSING, and notifies the corresponding client to handle. e) The notified client handles the assigned I/O requests. f) The HSM updates I/O requests states to COMPLETE and notifies the hypervisor of the completion via hypercalls. Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Zhi Wang <zhi.a.wang@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Yu Wang <yu1.wang@intel.com> Cc: Reinette Chatre <reinette.chatre@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Acked-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com> Link: https://lore.kernel.org/r/20210207031040.49576-10-shuo.a.liu@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-09virt: acrn: Introduce EPT mapping managementShuo Liu
The HSM provides hypervisor services to the ACRN userspace. While launching a User VM, ACRN userspace needs to allocate memory and request the ACRN Hypervisor to set up the EPT mapping for the VM. A mapping cache is introduced for accelerating the translation between the Service VM kernel virtual address and User VM physical address. >From the perspective of the hypervisor, the types of GPA of User VM can be listed as following: 1) RAM region, which is used by User VM as system ram. 2) MMIO region, which is recognized by User VM as MMIO. MMIO region is used to be utilized for devices emulation. Generally, User VM RAM regions mapping is set up before VM started and is released in the User VM destruction. MMIO regions mapping may be set and unset dynamically during User VM running. To achieve this, ioctls ACRN_IOCTL_SET_MEMSEG and ACRN_IOCTL_UNSET_MEMSEG are introduced in HSM. Cc: Zhi Wang <zhi.a.wang@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Yu Wang <yu1.wang@intel.com> Cc: Reinette Chatre <reinette.chatre@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com> Link: https://lore.kernel.org/r/20210207031040.49576-9-shuo.a.liu@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-09virt: acrn: Introduce an ioctl to set vCPU registers stateShuo Liu
A virtual CPU of User VM has different context due to the different registers state. ACRN userspace needs to set the virtual CPU registers state (e.g. giving a initial registers state to a virtual BSP of a User VM). HSM provides an ioctl ACRN_IOCTL_SET_VCPU_REGS to do the virtual CPU registers state setting. The ioctl passes the registers state from ACRN userspace to the hypervisor directly. Cc: Zhi Wang <zhi.a.wang@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Yu Wang <yu1.wang@intel.com> Cc: Reinette Chatre <reinette.chatre@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com> Link: https://lore.kernel.org/r/20210207031040.49576-8-shuo.a.liu@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-09virt: acrn: Introduce VM management interfacesShuo Liu
The VM management interfaces expose several VM operations to ACRN userspace via ioctls. For example, creating VM, starting VM, destroying VM and so on. The ACRN Hypervisor needs to exchange data with the ACRN userspace during the VM operations. HSM provides VM operation ioctls to the ACRN userspace and communicates with the ACRN Hypervisor for VM operations via hypercalls. HSM maintains a list of User VM. Each User VM will be bound to an existing file descriptor of /dev/acrn_hsm. The User VM will be destroyed when the file descriptor is closed. Cc: Zhi Wang <zhi.a.wang@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Yu Wang <yu1.wang@intel.com> Cc: Reinette Chatre <reinette.chatre@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com> Link: https://lore.kernel.org/r/20210207031040.49576-7-shuo.a.liu@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-09virt: acrn: Introduce ACRN HSM basic driverShuo Liu
ACRN Hypervisor Service Module (HSM) is a kernel module in Service VM which communicates with ACRN userspace through ioctls and talks to ACRN Hypervisor through hypercalls. Add a basic HSM driver which allows Service VM userspace to communicate with ACRN. The following patches will add more ioctls, guest VM memory mapping caching, I/O request processing, ioeventfd and irqfd into this module. HSM exports a char device interface (/dev/acrn_hsm) to userspace. Cc: Dave Hansen <dave.hansen@intel.com> Cc: Zhi Wang <zhi.a.wang@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Yu Wang <yu1.wang@intel.com> Cc: Reinette Chatre <reinette.chatre@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com> Link: https://lore.kernel.org/r/20210207031040.49576-6-shuo.a.liu@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>