path: root/arch/arm64/kvm/vgic
Age    Commit message    Author
2025-01-11  KVM: arm64: vgic: Use str_enabled_disabled() in vgic_v3_probe()  (Thorsten Blum)
Remove hard-coded strings by using the str_enabled_disabled() helper function. Suggested-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20250110225310.369980-2-thorsten.blum@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>
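For context, a minimal sketch of what such a conversion typically looks like; the format string and the boolean shown here are stand-ins, not the actual vgic_v3_probe() hunk:

    #include <linux/kvm_host.h>
    #include <linux/string_choices.h>

    static void report_gicv4(bool has_gicv4)
    {
            /* Before: open-coded strings selected by a ternary. */
            kvm_info("GICv4 support %s\n", has_gicv4 ? "enabled" : "disabled");

            /* After: the helper maps the bool to "enabled"/"disabled". */
            kvm_info("GICv4 support %s\n", str_enabled_disabled(has_gicv4));
    }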
2024-12-20  KVM: arm64: Introduce __pkvm_vcpu_{load,put}()  (Marc Zyngier)
Rather than look up the hyp vCPU on every run hypercall at EL2, introduce a per-CPU 'loaded_hyp_vcpu' tracking variable which is updated by a pair of load/put hypercalls called directly from kvm_arch_vcpu_{load,put}() when pKVM is enabled. Tested-by: Fuad Tabba <tabba@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20241218194059.3670226-10-qperret@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-12-03  KVM: arm64: vgic-its: Add error handling in vgic_its_cache_translation  (Keisuke Nishimura)
The return value of xa_store() needs to be checked. This fix adds an error handling path that resolves the kref inconsistency on failure. As suggested by Oliver Upton, this function does not return the error code intentionally because the translation cache is best effort. Fixes: 8201d1028caa ("KVM: arm64: vgic-its: Maintain a translation cache per ITS") Signed-off-by: Keisuke Nishimura <keisuke.nishimura@inria.fr> Suggested-by: Oliver Upton <oliver.upton@linux.dev> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20241130144952.23729-1-keisuke.nishimura@inria.fr Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
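A simplified sketch of the pattern described above, relying on the vgic-internal types and vgic_put_irq(); names and context are illustrative, not the literal vgic_its_cache_translation() hunk:

    #include <linux/xarray.h>

    static void cache_translation(struct kvm *kvm, struct xarray *cache,
                                  unsigned long idx, struct vgic_irq *irq)
    {
            /* xa_store() reports failure as an errno-encoded pointer. */
            void *old = xa_store(cache, idx, irq, GFP_KERNEL_ACCOUNT);

            if (xa_is_err(old)) {
                    /* Drop the reference taken for the would-be cache entry. */
                    vgic_put_irq(kvm, irq);
                    return; /* best effort: no error is reported to the caller */
            }
    }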
2024-11-20  KVM: arm64: vgic-its: Add stronger type-checking to the ITS entry sizes  (Marc Zyngier)
The ITS ABI infrastructure allows for some pretty lax code, where the size of the data doesn't have to match the size of the entry, potentially leading to a collection of interesting bugs. Commit 7fe28d7e68f9 ("KVM: arm64: vgic-its: Add a data length check in vgic_its_save_*") added some checks, but it starts by implicitly casting all writes to a 64bit value, hiding some of the issues. Instead, introduce macros that will check the data type actually used for dealing with the table entries. The macros take a symbolic entry type that is used to fetch the size of the entry type for the current ABI. This immediately catches a couple of low-impact gotchas (zero values that are implicitly 32bit), easy enough to fix. Given that we currently only have a single ABI, hardcode a couple of BUILD_BUG_ON()s that will fire if we use anything but a 64bit quantity, and some (currently unreachable) fallback code that may become useful one day. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20241117165757.247686-5-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
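The general shape of such a build-time size check, as a hypothetical sketch: the macro and constant names here are invented for illustration and differ from the macros actually added by the patch; vgic_write_guest_lock() is the vgic-internal helper mentioned in the series.

    #include <linux/build_bug.h>

    /* ABI v0: ITEs are 8 bytes. */
    #define MY_ABI_ITE_ESZ	8

    /* Refuse to build if the value written does not match the entry size. */
    #define my_its_write_ite(its, gpa, val)					\
    ({									\
    	BUILD_BUG_ON(sizeof(val) != MY_ABI_ITE_ESZ);			\
    	vgic_write_guest_lock((its)->dev->kvm, (gpa), &(val), sizeof(val)); \
    })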
2024-11-20  KVM: arm64: vgic: Kill VGIC_MAX_PRIVATE definition  (Marc Zyngier)
VGIC_MAX_PRIVATE is a pretty useless definition, and is better replaced with VGIC_NR_PRIVATE_IRQS. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20241117165757.247686-4-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-11-20  KVM: arm64: vgic: Make vgic_get_irq() more robust  (Marc Zyngier)
vgic_get_irq() has an awkward signature, as it takes both a kvm *and* a vcpu, where the vcpu is allowed to be NULL if the INTID being looked up is a global interrupt (SPI or LPI). This leads to potentially problematic situations where the INTID passed is a private interrupt, but there is no vcpu. In order to make things less ambiguous, let's have *two* helpers instead: - vgic_get_irq(struct kvm *kvm, u32 intid), which is only concerned with *global* interrupts, as indicated by the lack of vcpu. - vgic_get_vcpu_irq(struct kvm_vcpu *vcpu, u32 intid), which can return *any* interrupt class, but of course requires a non-NULL vcpu. Most of the code nicely falls under one or the other situation, except for a couple of places (close to the UABI or in the debug code) where we have to distinguish between the two cases. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20241117165757.247686-3-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
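The resulting split, with the prototypes as given in the message (bodies omitted; the call sites are illustrative):

    /* Global interrupts only (SPI/LPI): no vcpu involved. */
    struct vgic_irq *vgic_get_irq(struct kvm *kvm, u32 intid);

    /* Any interrupt class, but the vcpu must be non-NULL. */
    struct vgic_irq *vgic_get_vcpu_irq(struct kvm_vcpu *vcpu, u32 intid);

    /* Typical call sites: */
    irq = vgic_get_irq(kvm, spi_intid);        /* SPI or LPI */
    irq = vgic_get_vcpu_irq(vcpu, ppi_intid);  /* SGI/PPI, or anything else */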
2024-11-20  KVM: arm64: vgic-v3: Sanitise guest writes to GICR_INVLPIR  (Marc Zyngier)
Make sure we filter out non-LPI invalidation when handling writes to GICR_INVLPIR. Fixes: 4645d11f4a553 ("KVM: arm64: vgic-v3: Implement MMIO-based LPI invalidation") Reported-by: Alexander Potapenko <glider@google.com> Tested-by: Alexander Potapenko <glider@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20241117165757.247686-2-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-11-11  KVM: arm64: vgic-its: Clear ITE when DISCARD frees an ITE  (Kunkun Jiang)
When DISCARD frees an ITE, it does not invalidate the corresponding table entry. With repeated saves and restores, an ITE that was never saved may therefore be restored, which is wrong and may cause the restore to fail. This patch clears the corresponding ITE when DISCARD frees an ITE. Cc: stable@vger.kernel.org Fixes: eff484e0298d ("KVM: arm64: vgic-its: ITT save and restore") Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com> [Jing: Update with entry write helper] Signed-off-by: Jing Zhang <jingzhangos@google.com> Link: https://lore.kernel.org/r/20241107214137.428439-6-jingzhangos@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-11-11  KVM: arm64: vgic-its: Clear DTE when MAPD unmaps a device  (Kunkun Jiang)
vgic_its_save_device_tables() walks its->device_list and saves a DTE for each device. vgic_its_restore_device_tables() walks each entry of the device table, checks whether it is valid, and restores it if so. However, when MAPD unmaps a device, the corresponding DTE is not invalidated. With repeated saves and restores, a device's DTE that was never saved may therefore be restored, which is wrong and may cause the restore to fail. This patch clears the corresponding DTE when MAPD unmaps a device. Cc: stable@vger.kernel.org Fixes: 57a9a117154c ("KVM: arm64: vgic-its: Device table save/restore") Co-developed-by: Shusen Li <lishusen2@huawei.com> Signed-off-by: Shusen Li <lishusen2@huawei.com> Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com> [Jing: Update with entry write helper] Signed-off-by: Jing Zhang <jingzhangos@google.com> Link: https://lore.kernel.org/r/20241107214137.428439-5-jingzhangos@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-11-11  KVM: arm64: vgic-its: Add a data length check in vgic_its_save_*  (Jing Zhang)
None of the vgic_its_save_*() functions check whether the data length is 8 bytes before calling vgic_write_guest_lock(). This patch adds the check. To avoid bringing down the whole kernel when such a fault occurs, KVM_BUG_ON() is used, and the other BUG_ON()s are converted as well. Cc: stable@vger.kernel.org Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com> [Jing: Update with the new entry read/write helpers] Signed-off-by: Jing Zhang <jingzhangos@google.com> Link: https://lore.kernel.org/r/20241107214137.428439-4-jingzhangos@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
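A simplified sketch of the kind of check being added; the helper name and error handling are invented for the example, and the real vgic_its_save_*() paths differ:

    #include <linux/kvm_host.h>

    static int my_save_entry(struct vgic_its *its, gpa_t gpa, u64 val)
    {
            /* Table entries must be exactly 8 bytes in the current ABI;
             * mark the VM as broken instead of BUG()ing the whole kernel. */
            if (KVM_BUG_ON(sizeof(val) != 8, its->dev->kvm))
                    return -EINVAL;

            return vgic_write_guest_lock(its->dev->kvm, gpa, &val, sizeof(val));
    }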
2024-10-17  KVM: arm64: Ensure vgic_ready() is ordered against MMIO registration  (Oliver Upton)
kvm_vgic_map_resources() prematurely marks the distributor as 'ready', potentially allowing vCPUs to enter the guest before the distributor's MMIO registration has been made visible. Plug the race by marking the distributor as ready only after MMIO registration is completed. Rely on the implied ordering of synchronize_srcu() to ensure the MMIO registration is visible before vgic_dist::ready. This also means that writers to vgic_dist::ready are now serialized by the slots_lock, which was effectively the case already as all writers held the slots_lock in addition to the config_lock. Fixes: 59112e9c390b ("KVM: arm64: vgic: Fix a circular locking issue") Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20241017001947.2707312-3-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-10-17  KVM: arm64: vgic: Don't check for vgic_ready() when setting NR_IRQS  (Oliver Upton)
KVM commits to a particular sizing of SPIs when the vgic is initialized, which is before the point a vgic becomes ready. On top of that, KVM supplies a default amount of SPIs should userspace not explicitly configure this. As such, the check for vgic_ready() in the handling of KVM_DEV_ARM_VGIC_GRP_NR_IRQS is completely wrong, and testing if nr_spis is nonzero is sufficient for preventing userspace from playing games with us. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20241017001947.2707312-2-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-10-11  KVM: arm64: Don't eagerly teardown the vgic on init error  (Marc Zyngier)
As there is very little ordering in the KVM API, userspace can instantiate a half-baked GIC (missing its memory map, for example) at almost any time. This means that, with the right timing, a thread running vcpu-0 can enter the kernel without a GIC configured and get a GIC created behind its back by another thread. Amusingly, it will pick up that GIC and start messing with the data structures without the GIC having been fully initialised. Similarly, a thread running vcpu-1 can enter the kernel, and try to init the GIC that was previously created. Since this GIC isn't properly configured (no memory map), it fails to correctly initialise. And that's the point where we decide to tear down the GIC, freeing all its resources. Behind vcpu-0's back. Things stop pretty abruptly, with a variety of symptoms. Clearly, this isn't good; we should be a bit more careful about this. It is obvious that this guest is not viable, as it is missing some important part of its configuration. So instead of trying to tear bits of it down, let's just mark it as *dead*. It means that any further interaction from userspace will result in -EIO. The memory will be released on the "normal" path, when userspace gives up. Cc: stable@vger.kernel.org Reported-by: Alexander Potapenko <glider@google.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20241009183603.3221824-1-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-10-08  KVM: arm64: Unregister redistributor for failed vCPU creation  (Oliver Upton)
Alex reports that syzkaller has managed to trigger a use-after-free when tearing down a VM:

  BUG: KASAN: slab-use-after-free in kvm_put_kvm+0x300/0xe68 virt/kvm/kvm_main.c:5769
  Read of size 8 at addr ffffff801c6890d0 by task syz.3.2219/10758
  CPU: 3 UID: 0 PID: 10758 Comm: syz.3.2219 Not tainted 6.11.0-rc6-dirty #64
  Hardware name: linux,dummy-virt (DT)
  Call trace:
   dump_backtrace+0x17c/0x1a8 arch/arm64/kernel/stacktrace.c:317
   show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:324
   __dump_stack lib/dump_stack.c:93 [inline]
   dump_stack_lvl+0x94/0xc0 lib/dump_stack.c:119
   print_report+0x144/0x7a4 mm/kasan/report.c:377
   kasan_report+0xcc/0x128 mm/kasan/report.c:601
   __asan_report_load8_noabort+0x20/0x2c mm/kasan/report_generic.c:381
   kvm_put_kvm+0x300/0xe68 virt/kvm/kvm_main.c:5769
   kvm_vm_release+0x4c/0x60 virt/kvm/kvm_main.c:1409
   __fput+0x198/0x71c fs/file_table.c:422
   ____fput+0x20/0x30 fs/file_table.c:450
   task_work_run+0x1cc/0x23c kernel/task_work.c:228
   do_notify_resume+0x144/0x1a0 include/linux/resume_user_mode.h:50
   el0_svc+0x64/0x68 arch/arm64/kernel/entry-common.c:169
   el0t_64_sync_handler+0x90/0xfc arch/arm64/kernel/entry-common.c:730
   el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598

Upon closer inspection, it appears that we do not properly tear down the MMIO registration for a vCPU that fails creation late in the game, e.g. a vCPU w/ the same ID already exists in the VM. It is important to consider the context of the commit that introduced this bug, which moved the unregistration out of __kvm_vgic_vcpu_destroy(). That change correctly sought to avoid an srcu v. config_lock inversion by breaking up the vCPU teardown into two parts, one guarded by the config_lock. Fix the use-after-free while avoiding lock inversion by adding a special-cased unregistration to __kvm_vgic_vcpu_destroy(). This is safe because failed vCPUs are torn down outside of the config_lock. Cc: stable@vger.kernel.org Fixes: f616506754d3 ("KVM: arm64: vgic: Don't hold config_lock while unregistering redistributors") Reported-by: Alexander Potapenko <glider@google.com> Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20241007223909.2157336-1-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-08-27  KVM: arm64: Zero ID_AA64PFR0_EL1.GIC when no GICv3 is presented to the guest  (Marc Zyngier)
In order to be consistent, we shouldn't advertise a GICv3 when none is actually usable by the guest. Wipe the feature when these conditions apply, and allow the field to be written from userspace. This now allows us to rewrite the kvm_has_gicv3() helper in terms of kvm_has_feat(), given that it is always evaluated at runtime. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240827152517.3909653-6-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-08-27  KVM: arm64: Force GICv3 trap activation when no irqchip is configured on VHE  (Marc Zyngier)
On a VHE system, no GICv3 traps get configured when no irqchip is present. This does not quite match the "no GICv3" semantics that we want to present. Force such traps to be configured in this case. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240827152517.3909653-4-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-08-27  KVM: arm64: Force SRE traps when SRE access is not enabled  (Marc Zyngier)
So far, we only write the ICH_HCR_EL2 config in two situations: - when we need to emulate the GICv3 CPU interface due to HW bugs - when we do direct injection, as the virtual CPU interface needs to be enabled This is all good. But it also means that we don't do anything special when we emulate a GICv2, or when there is no GIC at all. What happens in this case when the guest uses the GICv3 system registers? The *guest* gets a trap for a sysreg access (EC=0x18) while we'd really like it to get an UNDEF. Fixing this is a bit involved: - we need to set all the required trap bits (TC, TALL0, TALL1, TDIR) - for these traps to take effect, we need to (counter-intuitively) set ICC_SRE_EL1.SRE to 1 so that the above traps take priority. Note that this doesn't fully work when GICv2 emulation is enabled, as we cannot set ICC_SRE_EL1.SRE to 1 (it breaks Group0 delivery as IRQ). Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240827152517.3909653-3-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-08-27  KVM: arm64: Move GICv3 trap configuration to kvm_calculate_traps()  (Marc Zyngier)
Follow the pattern introduced with vcpu_set_hcr(), and introduce vcpu_set_ich_hcr(), which configures the GICv3 traps at the same point. This will allow future changes to introduce trap configuration on a per-VM basis. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240827152517.3909653-2-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-08-22  KVM: arm64: Make ICC_*SGI*_EL1 undef in the absence of a vGICv3  (Marc Zyngier)
On a system with a GICv3, if a guest hasn't been configured with GICv3 and the host is not capable of GICv2 emulation, a write to any of the ICC_*SGI*_EL1 registers is trapped to EL2. We therefore try to emulate the SGI access, only to hit a NULL pointer as no private interrupt is allocated (no GIC, remember?). The obvious fix is to give the guest what it deserves, in the shape of an UNDEF exception. Reported-by: Alexander Potapenko <glider@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240820100349.3544850-2-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-08-19  KVM: arm64: vgic: Don't hold config_lock while unregistering redistributors  (Marc Zyngier)
We recently moved the teardown of the vgic part of a vcpu inside a critical section guarded by the config_lock. This teardown phase involves calling into kvm_io_bus_unregister_dev(), which takes the kvm->srcu lock. However, this violates the established order where kvm->srcu is taken on a memory fault (such as an MMIO access), possibly followed by taking the config_lock if the GIC emulation requires mutual exclusion from the other vcpus. It therefore results in a bad lockdep splat, as reported by Zenghui. Fix this by moving the call to kvm_io_bus_unregister_dev() outside of the config_lock critical section. At this stage, there shouldn't be any need to hold the config_lock. As an additional bonus, document the ordering between kvm->slots_lock, kvm->srcu and kvm->arch.config_lock so that I cannot pretend I didn't know about those anymore. Fixes: 9eb18136af9f ("KVM: arm64: vgic: Hold config_lock while tearing down a CPU interface") Reported-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Tested-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20240819125045.3474845-1-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-08-19  KVM: arm64: vgic-debug: Don't put unmarked LPIs  (Zenghui Yu)
If there were LPIs being mapped behind our back (i.e., between .start() and .stop()), we would put them at iter_unmark_lpis() without checking if they were actually *marked*, which is obviously not good. Switch to use the xa_for_each_marked() iterator to fix it. Cc: stable@vger.kernel.org Fixes: 85d3ccc8b75b ("KVM: arm64: vgic-debug: Use an xarray mark for debug iterator") Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240817101541.1664-1-yuzenghui@huawei.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
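The fix boils down to visiting only marked entries. A sketch of the shape of iter_unmark_lpis() after the change; the lpi_xa field and LPI_XA_MARK_DEBUG_ITER mark names are assumptions based on the debug-iterator series, and the body is simplified:

    static void iter_unmark_lpis(struct kvm *kvm)
    {
            struct vgic_dist *dist = &kvm->arch.vgic;
            struct vgic_irq *irq;
            unsigned long intid;

            /* Only visit LPIs actually marked by .start(); LPIs mapped
             * afterwards are skipped instead of being spuriously put. */
            xa_for_each_marked(&dist->lpi_xa, intid, irq, LPI_XA_MARK_DEBUG_ITER) {
                    xa_clear_mark(&dist->lpi_xa, intid, LPI_XA_MARK_DEBUG_ITER);
                    vgic_put_irq(kvm, irq);
            }
    }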
2024-08-08  KVM: arm64: vgic: Hold config_lock while tearing down a CPU interface  (Marc Zyngier)
Tearing down a vcpu CPU interface involves freeing the private interrupt array. If we don't hold the lock, we may race against another thread trying to configure it. Yeah, fuzzers do wonderful things... Taking the lock early solves this particular problem. Fixes: 03b3d00a70b5 ("KVM: arm64: vgic: Allocate private interrupts on demand") Reported-by: Alexander Potapenko <glider@google.com> Tested-by: Alexander Potapenko <glider@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240808091546.3262111-1-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-08-07  KVM: arm64: vgic-debug: Exit the iterator properly w/o LPI  (Zenghui Yu)
If the guest doesn't have any LPIs, we previously relied on the iterator setting 'intid = nr_spis + VGIC_NR_PRIVATE_IRQS' && 'lpi_idx = 1' to exit the iterator. But this was broken by commit 85d3ccc8b75b ("KVM: arm64: vgic-debug: Use an xarray mark for debug iterator") -- the intid remains at 'nr_spis + VGIC_NR_PRIVATE_IRQS - 1', and we end up endlessly printing the last SPI's state. Since it's meaningless to search the LPI xarray and populate lpi_idx when there is no LPI, just skip that step in this case. The result is that * If there's no LPI, we focus on the intid and exit the iterator when it runs out of the valid SPI range. * Otherwise we keep the current logic and let the xarray drive the iterator. Fixes: 85d3ccc8b75b ("KVM: arm64: vgic-debug: Use an xarray mark for debug iterator") Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240807052024.2084-1-yuzenghui@huawei.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-08-02  KVM: arm64: vgic: fix unexpected unlock sparse warnings  (Sebastian Ott)
Get rid of unexpected unlock sparse warnings in vgic code by adding an annotation to vgic_queue_irq_unlock(). arch/arm64/kvm/vgic/vgic.c:334:17: warning: context imbalance in 'vgic_queue_irq_unlock' - unexpected unlock arch/arm64/kvm/vgic/vgic.c:419:5: warning: context imbalance in 'kvm_vgic_inject_irq' - different lock contexts for basic block Signed-off-by: Sebastian Ott <sebott@redhat.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240723101204.7356-4-sebott@redhat.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
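Sparse tracks lock context via annotations, and the usual cure for an 'unexpected unlock' warning is a context annotation on the function that drops a lock taken by its caller. A sketch of that shape; the exact annotation and argument used in the actual patch may differ:

    /* Tell sparse that this function releases irq->irq_lock on behalf of
     * its caller, so the imbalance is intentional. */
    bool vgic_queue_irq_unlock(struct kvm *kvm, struct vgic_irq *irq,
                               unsigned long flags) __releases(&irq->irq_lock);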
2024-08-02  KVM: arm64: fix kdoc warnings in W=1 builds  (Sebastian Ott)
Fix kdoc warnings by adding missing function parameter descriptions or by conversion to a normal comment. Signed-off-by: Sebastian Ott <sebott@redhat.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240723101204.7356-3-sebott@redhat.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-06  KVM: arm64: Disassociate vcpus from redistributor region on teardown  (Marc Zyngier)
When tearing down a redistributor region, make sure we don't have any dangling pointer to that region stored in a vcpu. Fixes: e5a35635464b ("kvm: arm64: vgic-v3: Introduce vgic_v3_free_redist_region()") Reported-by: Alexander Potapenko <glider@google.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240605175637.1635653-1-maz@kernel.org Cc: stable@vger.kernel.org
2024-05-12  Merge tag 'kvmarm-6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD  (Paolo Bonzini)
KVM/arm64 updates for Linux 6.10 - Move a lot of state that was previously stored on a per vcpu basis into a per-CPU area, because it is only pertinent to the host while the vcpu is loaded. This results in better state tracking, and a smaller vcpu structure. - Add full handling of the ERET/ERETAA/ERETAB instructions in nested virtualisation. The last two instructions also require emulating part of the pointer authentication extension. As a result, the trap handling of pointer authentication has been greatly simplified. - Turn the global (and not very scalable) LPI translation cache into a per-ITS, scalable cache, making non-directly-injected LPIs much cheaper to make visible to the vcpu. - A batch of pKVM patches, mostly fixes and cleanups, as the upstreaming process seems to be resuming. Fingers crossed! - Allocate PPIs and SGIs outside of the vcpu structure, allowing for smaller EL2 mapping and some flexibility in implementing more or less than 32 private IRQs. - Purge stale mpidr_data if a vcpu is created after the MPIDR map has been created. - Preserve vcpu-specific ID registers across a vcpu reset. - Various minor cleanups and improvements.
2024-05-08  Merge branch kvm-arm64/misc-6.10 into kvmarm-master/next  (Marc Zyngier)
* kvm-arm64/misc-6.10: : . : Misc fixes and updates targeting 6.10 : : - Improve boot-time diagnostics when the sysreg tables : are not correctly sorted : : - Allow FFA_MSG_SEND_DIRECT_REQ in the FFA proxy : : - Fix duplicate XNX field in the ID_AA64MMFR1_EL1 : writeable mask : : - Allocate PPIs and SGIs outside of the vcpu structure, allowing : for smaller EL2 mapping and some flexibility in implementing : more or less than 32 private IRQs. : : - Use bitmap_gather() instead of its open-coded equivalent : : - Make protected mode use hVHE if available : : - Purge stale mpidr_data if a vcpu is created after the MPIDR : map has been created : . KVM: arm64: Destroy mpidr_data for 'late' vCPU creation KVM: arm64: Use hVHE in pKVM by default on CPUs with VHE support KVM: arm64: Fix hvhe/nvhe early alias parsing KVM: arm64: Convert kvm_mpidr_index() to bitmap_gather() KVM: arm64: vgic: Allocate private interrupts on demand KVM: arm64: Remove duplicated AA64MMFR1_EL1 XNX KVM: arm64: Remove FFA_MSG_SEND_DIRECT_REQ from the denylist KVM: arm64: Improve out-of-order sysreg table diagnostics Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-03  Merge branch kvm-arm64/pkvm-6.10 into kvmarm-master/next  (Marc Zyngier)
* kvm-arm64/pkvm-6.10: (25 commits) : . : At last, a bunch of pKVM patches, courtesy of Fuad Tabba. : From the cover letter: : : "This series is a bit of a bombay-mix of patches we've been : carrying. There's no one overarching theme, but they do improve : the code by fixing existing bugs in pKVM, refactoring code to : make it more readable and easier to re-use for pKVM, or adding : functionality to the existing pKVM code upstream." : . KVM: arm64: Force injection of a data abort on NISV MMIO exit KVM: arm64: Restrict supported capabilities for protected VMs KVM: arm64: Refactor setting the return value in kvm_vm_ioctl_enable_cap() KVM: arm64: Document the KVM/arm64-specific calls in hypercalls.rst KVM: arm64: Rename firmware pseudo-register documentation file KVM: arm64: Reformat/beautify PTP hypercall documentation KVM: arm64: Clarify rationale for ZCR_EL1 value restored on guest exit KVM: arm64: Introduce and use predicates that check for protected VMs KVM: arm64: Add is_pkvm_initialized() helper KVM: arm64: Simplify vgic-v3 hypercalls KVM: arm64: Move setting the page as dirty out of the critical section KVM: arm64: Change kvm_handle_mmio_return() return polarity KVM: arm64: Fix comment for __pkvm_vcpu_init_traps() KVM: arm64: Prevent kmemleak from accessing .hyp.data KVM: arm64: Do not map the host fpsimd state to hyp in pKVM KVM: arm64: Rename __tlb_switch_to_{guest,host}() in VHE KVM: arm64: Support TLB invalidation in guest context KVM: arm64: Avoid BBM when changing only s/w bits in Stage-2 PTE KVM: arm64: Check for PTE validity when checking for executable/cacheable KVM: arm64: Avoid BUG-ing from the host abort path ... Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-03  KVM: arm64: vgic: Allocate private interrupts on demand  (Marc Zyngier)
Private interrupts are currently part of the CPU interface structure that is embedded in each and every vcpu we create. We have 32 of them per vcpu, resulting in a per-vcpu array that is just shy of 4kB. On its own, that's no big deal, but it gets in the way of other things: - each vcpu gets mapped at EL2 on nVHE/hVHE configurations. This requires memory that is physically contiguous. However, the EL2 code has no reason to look at the interrupt structures and could do without them being mapped. - supporting features such as EPPIs, which extend the number of private interrupts past the 32 limit, would make the array even larger, even for VMs that do not use the EPPI feature. Address these issues by moving the private interrupt array outside of the vcpu, and replace it with a simple pointer. We take this opportunity to make it obvious what gets initialised when, as that path was remarkably opaque, and tighten the locking. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240502154545.3012089-1-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
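The gist of the data-structure change, shown with hypothetical before/after struct names; the real structure is struct vgic_cpu and carries many more fields:

    /* Before: ~4kB of vgic_irq embedded in every vcpu. */
    struct vgic_cpu_before {
            struct vgic_irq private_irqs[VGIC_NR_PRIVATE_IRQS];
    };

    /* After: just a pointer; the backing array is allocated on demand and
     * no longer has to be mapped at EL2 along with the vcpu. */
    struct vgic_cpu_after {
            struct vgic_irq *private_irqs;
    };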
2024-05-01  KVM: arm64: Simplify vgic-v3 hypercalls  (Marc Zyngier)
Consolidate the GICv3 VMCR accessor hypercalls into the APR save/restore hypercalls so that all of the EL2 GICv3 state is covered by a single pair of hypercalls. Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-17-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-25  KVM: arm64: vgic-its: Get rid of the lpi_list_lock  (Oliver Upton)
The last genuine use case for the lpi_list_lock was the global LPI translation cache, which has been removed in favor of a per-ITS xarray. Remove a layer from the locking puzzle by getting rid of it. vgic_add_lpi() still has a critical section that needs to protect against the insertion of other LPIs; change it to take the LPI xarray's xa_lock to retain this property. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240422200158.2606761-13-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-25  KVM: arm64: vgic-its: Rip out the global translation cache  (Oliver Upton)
The MSI injection fast path has been transitioned away from the global translation cache. Rip it out. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240422200158.2606761-12-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-25  KVM: arm64: vgic-its: Use the per-ITS translation cache for injection  (Oliver Upton)
Everything is in place to switch to per-ITS translation caches. Start using the per-ITS cache to avoid the lock serialization related to the global translation cache. Explicitly check for out-of-range device and event IDs as the cache index is packed based on the range the ITS actually supports. Take the RCU read lock to protect against the returned descriptor being freed while trying to take a reference on it, as it is no longer necessary to acquire the lpi_list_lock. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240422200158.2606761-11-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-25  KVM: arm64: vgic-its: Spin off helper for finding ITS by doorbell addr  (Oliver Upton)
The fast path will soon need to find an ITS by doorbell address, as the translation caches will become local to an ITS. Spin off a helper to do just that. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240422200158.2606761-10-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-25  KVM: arm64: vgic-its: Maintain a translation cache per ITS  (Oliver Upton)
Within the context of a single ITS, it is possible to use an xarray to cache the device ID & event ID translation to a particular irq descriptor. Take advantage of this to build a translation cache capable of fitting all valid translations for a given ITS. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240422200158.2606761-9-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-25  KVM: arm64: vgic-its: Scope translation cache invalidations to an ITS  (Oliver Upton)
As the current LPI translation cache is global, the corresponding invalidation helpers are also globally-scoped. In anticipation of constructing a translation cache per ITS, add a helper for scoped cache invalidations. We still need to support global invalidations when LPIs are toggled on a redistributor, as a property of the translation cache is that all stored LPIs are known to be deliverable. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240422200158.2606761-8-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-25  KVM: arm64: vgic-its: Get rid of vgic_copy_lpi_list()  (Oliver Upton)
The last user has been transitioned to walking the LPI xarray directly. Cut the wart off, and get rid of the now unneeded lpi_count while doing so. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240422200158.2606761-7-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-25  KVM: arm64: vgic-debug: Use an xarray mark for debug iterator  (Oliver Upton)
The vgic debug iterator is the final user of vgic_copy_lpi_list(), but is a bit more complicated to transition to something else. Use a mark in the LPI xarray to record the indices 'known' to the debug iterator. Protect the LPIs against being freed by associating an additional reference with the xarray mark. Rework iter_next() to let the xarray walk 'drive' the iteration after visiting all of the SGIs, PPIs, and SPIs. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240422200158.2606761-6-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-25  KVM: arm64: vgic-its: Walk LPI xarray in vgic_its_cmd_handle_movall()  (Oliver Upton)
The new LPI xarray makes it possible to walk the VM's LPIs without holding a lock, meaning that vgic_copy_lpi_list() is no longer necessary. Prepare for the deletion by walking the LPI xarray directly in vgic_its_cmd_handle_movall(). Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240422200158.2606761-5-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-25  KVM: arm64: vgic-its: Walk LPI xarray in vgic_its_invall()  (Oliver Upton)
The new LPI xarray makes it possible to walk the VM's LPIs without holding a lock, meaning that vgic_copy_lpi_list() is no longer necessary. Prepare for the deletion by walking the LPI xarray directly in vgic_its_invall(). Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240422200158.2606761-4-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-25  KVM: arm64: vgic-its: Walk LPI xarray in its_sync_lpi_pending_table()  (Oliver Upton)
The new LPI xarray makes it possible to walk the VM's LPIs without holding a lock, meaning that vgic_copy_lpi_list() is no longer necessary. Prepare for the deletion by walking the LPI xarray directly in its_sync_lpi_pending_table(). Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240422200158.2606761-3-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-24  KVM: arm64: vgic-v2: Check for non-NULL vCPU in vgic_v2_parse_attr()  (Oliver Upton)
vgic_v2_parse_attr() is responsible for finding the vCPU that matches the user-provided CPUID, which (of course) may not be valid. If the ID is invalid, kvm_get_vcpu_by_id() returns NULL, which isn't handled gracefully. Similar to the GICv3 uaccess flow, check that kvm_get_vcpu_by_id() actually returns something and fail the ioctl if not. Cc: stable@vger.kernel.org Fixes: 7d450e282171 ("KVM: arm/arm64: vgic-new: Add userland access to VGIC dist registers") Reported-by: Alexander Potapenko <glider@google.com> Tested-by: Alexander Potapenko <glider@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240424173959.3776798-2-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
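The shape of the fix, as a simplified sketch; the wrapper function and variable names here are illustrative, not the actual vgic_v2_parse_attr() code:

    static int parse_attr_vcpu(struct kvm_device *dev, u32 cpuid,
                               struct kvm_vcpu **vcpup)
    {
            /* The user-supplied CPUID may not map to any vCPU. */
            struct kvm_vcpu *vcpu = kvm_get_vcpu_by_id(dev->kvm, cpuid);

            if (!vcpu)
                    return -EINVAL;

            *vcpup = vcpu;
            return 0;
    }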
2024-03-11  Merge tag 'kvmarm-6.9' of https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD  (Paolo Bonzini)
KVM/arm64 updates for 6.9 - Infrastructure for building KVM's trap configuration based on the architectural features (or lack thereof) advertised in the VM's ID registers - Support for mapping vfio-pci BARs as Normal-NC (vaguely similar to x86's WC) at stage-2, improving the performance of interacting with assigned devices that can tolerate it - Conversion of KVM's representation of LPIs to an xarray, utilized to address some of the serialization on the LPI injection path - Support for _architectural_ VHE-only systems, advertised through the absence of FEAT_E2H0 in the CPU's ID register - Miscellaneous cleanups, fixes, and spelling corrections to KVM and selftests
2024-03-07  Merge branch kvm-arm64/kerneldoc into kvmarm/next  (Oliver Upton)
* kvm-arm64/kerneldoc: : kerneldoc warning fixes, courtesy of Randy Dunlap : : Fixes addressing the widespread misuse of kerneldoc-style comments : throughout KVM/arm64. KVM: arm64: vgic: fix a kernel-doc warning KVM: arm64: vgic-its: fix kernel-doc warnings KVM: arm64: vgic-init: fix a kernel-doc warning KVM: arm64: sys_regs: fix kernel-doc warnings KVM: arm64: PMU: fix kernel-doc warnings KVM: arm64: mmu: fix a kernel-doc warning KVM: arm64: vhe: fix a kernel-doc warning KVM: arm64: hyp/aarch32: fix kernel-doc warnings KVM: arm64: guest: fix kernel-doc warnings KVM: arm64: debug: fix kernel-doc warnings Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-03-07  Merge branch kvm-arm64/lpi-xarray into kvmarm/next  (Oliver Upton)
* kvm-arm64/lpi-xarray: : xarray-based representation of vgic LPIs : : KVM's linked-list of LPI state has proven to be a bottleneck in LPI : injection paths, due to lock serialization when acquiring / releasing a : reference on an IRQ. : : Start the tedious process of reworking KVM's LPI injection by replacing : the LPI linked-list with an xarray, leveraging this to allow RCU readers : to walk it outside of the spinlock. KVM: arm64: vgic: Don't acquire the lpi_list_lock in vgic_put_irq() KVM: arm64: vgic: Ensure the irq refcount is nonzero when taking a ref KVM: arm64: vgic: Rely on RCU protection in vgic_get_lpi() KVM: arm64: vgic: Free LPI vgic_irq structs in an RCU-safe manner KVM: arm64: vgic: Use atomics to count LPIs KVM: arm64: vgic: Get rid of the LPI linked-list KVM: arm64: vgic-its: Walk the LPI xarray in vgic_copy_lpi_list() KVM: arm64: vgic-v3: Iterate the xarray to find pending LPIs KVM: arm64: vgic: Use xarray to find LPI in vgic_get_lpi() KVM: arm64: vgic: Store LPIs in an xarray Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-02-24  KVM: arm64: Fix typos  (Bjorn Helgaas)
Fix typos, most reported by "codespell arch/arm64". Only touches comments, no code changes. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: James Morse <james.morse@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Zenghui Yu <yuzenghui@huawei.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: kvmarm@lists.linux.dev Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20240103231605.1801364-6-helgaas@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-02-23  KVM: arm64: vgic: Don't acquire the lpi_list_lock in vgic_put_irq()  (Oliver Upton)
The LPI xarray's xa_lock is sufficient for synchronizing writers when freeing a given LPI. Furthermore, readers can only take a new reference on an IRQ if its refcount was already nonzero. Stop taking the lpi_list_lock unnecessarily and get rid of __vgic_put_lpi_locked(). Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240221054253.3848076-11-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-02-23  KVM: arm64: vgic: Ensure the irq refcount is nonzero when taking a ref  (Oliver Upton)
It will soon be possible for get() and put() calls to happen in parallel, which means in most cases we must ensure the refcount is nonzero when taking a new reference. Switch to using vgic_try_get_irq_kref() where necessary, and document the few conditions where an IRQ's refcount is guaranteed to be nonzero. Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240221054253.3848076-10-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-02-23  KVM: arm64: vgic: Rely on RCU protection in vgic_get_lpi()  (Oliver Upton)
Stop acquiring the lpi_list_lock in favor of RCU for protecting the read-side critical section in vgic_get_lpi(). In order for this to be safe, we also need to be careful not to take a reference on an irq with a refcount of 0, as it is about to be freed. Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240221054253.3848076-9-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
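Putting the last two changes together, the lookup now looks roughly like the sketch below. It assumes the lpi_xa field from the xarray series and that vgic_try_get_irq_kref() (introduced two commits earlier) returns whether the reference was successfully taken; it is not the literal vgic_get_lpi() body.

    static struct vgic_irq *get_lpi(struct kvm *kvm, u32 intid)
    {
            struct vgic_dist *dist = &kvm->arch.vgic;
            struct vgic_irq *irq;

            rcu_read_lock();

            irq = xa_load(&dist->lpi_xa, intid);
            /* A zero refcount means the IRQ is being freed: don't resurrect it. */
            if (irq && !vgic_try_get_irq_kref(irq))
                    irq = NULL;

            rcu_read_unlock();

            return irq;
    }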