Age | Commit message | Author |
|
Fix the comment to clarify that __pkvm_vcpu_init_traps()
initializes traps for all VMs in protected mode, and not only
for protected VMs.
Signed-off-by: Fuad Tabba <tabba@google.com>
Acked-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240423150538.2103045-14-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
We've added a .data section for the hypervisor, which kmemleak is
eager to parse. This clearly doesn't go well, so add the section
to kmemleak's block list.
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Acked-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240423150538.2103045-13-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
pKVM maintains its own state at EL2 for tracking the host fpsimd
state. Therefore, no need to map and share the host's view with
it.
Signed-off-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Acked-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240423150538.2103045-12-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Rename __tlb_switch_to_{guest,host}() to
{enter,exit}_vmid_context() in VHE code to maintain symmetry
between the nVHE and VHE TLB invalidations.
No functional change intended.
Suggested-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Fuad Tabba <tabba@google.com>
Acked-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240423150538.2103045-11-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Typically, TLB invalidation of guest stage-2 mappings using nVHE is
performed by a hypercall originating from the host. For the invalidation
instruction to be effective, therefore, __tlb_switch_to_{guest,host}()
swizzle the active stage-2 context around the TLBI instruction.
With guest-to-host memory sharing and unsharing hypercalls
originating from the guest under pKVM, there is a need to support
both guest and host VMID invalidations issued from guest context.
Replace the __tlb_switch_to_{guest,host}() functions with a more general
{enter,exit}_vmid_context() implementation which supports being invoked
from guest context and acts as a no-op if the target context matches the
running context.
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
Acked-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240423150538.2103045-10-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
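As a rough sketch of the resulting shape (get_current_mmu() is a
hypothetical stand-in for however the currently-installed stage-2
context is determined; the real code also handles the host/guest
distinction and TGE):

  static void enter_vmid_context(struct kvm_s2_mmu *mmu,
                                 struct tlb_inv_context *cxt)
  {
          /* Hypothetical: whichever stage-2 context is live on this CPU. */
          struct kvm_s2_mmu *current_mmu = get_current_mmu();

          cxt->mmu = NULL;

          /* Already in the target context: the TLBI can run as-is. */
          if (mmu == current_mmu)
                  return;

          cxt->mmu = current_mmu;
          __load_stage2(mmu, mmu->arch);  /* install the target VMID/VTTBR */
  }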
|
|
Break-before-make (BBM) can be expensive, as transitioning via an
invalid mapping (i.e. the "break" step) requires the completion of TLB
invalidation and can also cause other agents to fault concurrently on
the invalid mapping.
Since BBM is not required when changing only the software bits of a PTE,
avoid the sequence in this case and just update the PTE directly.
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
Acked-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240423150538.2103045-9-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
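A sketch of the idea, where stage2_pte_hw_bits_changed() is an
illustrative name and KVM_PTE_LEAF_ATTR_HI_SW is the existing mask for
the software-defined PTE bits:

  static bool stage2_pte_hw_bits_changed(kvm_pte_t old, kvm_pte_t new)
  {
          /* Compare everything except the software-defined bits. */
          return (old & ~KVM_PTE_LEAF_ATTR_HI_SW) !=
                 (new & ~KVM_PTE_LEAF_ATTR_HI_SW);
  }

          /* In the map walker: software-only change, skip break-before-make. */
          if (!stage2_pte_hw_bits_changed(ctx->old, new)) {
                  WRITE_ONCE(*ctx->ptep, new);
                  return 0;
          }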
|
|
Don't just assume that the PTE is valid when checking whether it
describes an executable or cacheable mapping.
This makes sure that we don't issue CMOs for invalid mappings.
Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
Acked-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240423150538.2103045-8-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
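For instance, the cacheability predicate ends up along these lines (a
sketch; the executable check gains the same kvm_pte_valid() guard):

  static bool stage2_pte_cacheable(struct kvm_pgtable *pgt, kvm_pte_t pte)
  {
          u64 memattr = pte & KVM_PTE_LEAF_ATTR_LO_S2_MEMATTR;

          /* An invalid PTE maps nothing, so never consider it cacheable. */
          return kvm_pte_valid(pte) && memattr == KVM_S2_MEMATTR(pgt, NORMAL);
  }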
|
|
Under certain circumstances __get_fault_info() may resolve the faulting
address using the AT instruction. Given that this is being done outside
of the host lock critical section, it is racy and the resolution via AT
may fail. We currently BUG() in this situation, which is obviously less
than ideal. Moving the address resolution to the critical section may
have a performance impact, so let's keep it where it is, but bail out
and return to the host to try a second time.
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Acked-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240423150538.2103045-7-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
On the guest teardown path, pKVM will zero the pages used to back
the guest data structures before returning them to the host as
they may contain secrets (e.g. in the vCPU registers). However,
the zeroing is done using a cacheable alias, and CMOs are
missing, hence giving the host a potential opportunity to read
the original content of the guest structs from memory.
Fix this by issuing CMOs after zeroing the pages.
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Acked-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240423150538.2103045-6-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
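Conceptually, the teardown path needs something like the following
(a sketch with an illustrative function name; kvm_flush_dcache_to_poc()
is the existing CMO helper):

  static void destroy_hyp_page(void *va)          /* illustrative name */
  {
          memset(va, 0, PAGE_SIZE);

          /*
           * Push the zeroes past the cacheable alias so the host cannot
           * read stale guest data once the page is returned to it.
           */
          kvm_flush_dcache_to_poc(va, PAGE_SIZE);
  }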
|
|
The lock is already initialized in core KVM code at
kvm_create_vm().
Fixes: 9d0c063a4d1d ("KVM: arm64: Instantiate pKVM hypervisor VM and vCPU structures from EL1")
Signed-off-by: Fuad Tabba <tabba@google.com>
Acked-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240423150538.2103045-5-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
To avoid direct comparison against the fp_owner enum, add a new
function that performs the check, host_owns_fp_regs(), to
complement the existing guest_owns_fp_regs().
To check for fpsimd state ownership, use the helpers instead of
directly using the enums.
No functional change intended.
Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Acked-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240423150538.2103045-4-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
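The new helper is essentially the mirror image of guest_owns_fp_regs();
a minimal sketch, assuming the fp_owner tracking that now lives in the
per-CPU host data:

  static inline bool host_owns_fp_regs(void)
  {
          return *host_data_ptr(fp_owner) == FP_STATE_HOST_OWNED;
  }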
|
|
guest_owns_fp_regs() will be used to check fpsimd state ownership
across kvm/arm64. Therefore, move it to kvm_host.h to widen its
scope.
Moreover, since the host state is no longer per-vcpu, the vcpu
parameter isn't used, so remove it as well.
No functional change intended.
Signed-off-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Acked-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240423150538.2103045-3-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Since the host_fpsimd_state has been removed from kvm_vcpu_arch,
it isn't pointing to the hyp's version of the host fp_regs in
protected mode.
Initialize the host_data fpsimd_state pointer to point to the
host_data's context fp_regs on pKVM initialization.
Fixes: 51e09b5572d6 ("KVM: arm64: Exclude host_fpsimd_state pointer from kvm_vcpu_arch")
Signed-off-by: Fuad Tabba <tabba@google.com>
Acked-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240423150538.2103045-2-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Commit d5a32b60dc18 ("KVM: arm64: Allow userspace to change
ID_AA64MMFR{0-2}_EL1") made certain fields in these registers writable,
but in doing so, ID_AA64MMFR1_EL1_XNX was listed twice. Remove the
duplication.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Zenghui Yu <zenghui.yu@linux.dev>
Link: https://lore.kernel.org/r/E1s2AxF-00AWLv-03@rmk-PC.armlinux.org.uk
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The last genuine use case for the lpi_list_lock was the global LPI
translation cache, which has been removed in favor of a per-ITS xarray.
Remove a layer from the locking puzzle by getting rid of it.
vgic_add_lpi() still has a critical section that needs to protect
against the insertion of other LPIs; change it to take the LPI xarray's
xa_lock to retain this property.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240422200158.2606761-13-oliver.upton@linux.dev
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The MSI injection fast path has been transitioned away from the global
translation cache. Rip it out.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240422200158.2606761-12-oliver.upton@linux.dev
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Everything is in place to switch to per-ITS translation caches. Start
using the per-ITS cache to avoid the lock serialization related to the
global translation cache. Explicitly check for out-of-range device and
event IDs as the cache index is packed based on the range the ITS
actually supports.
Take the RCU read lock to protect against the returned descriptor being
freed while trying to take a reference on it, as it is no longer
necessary to acquire the lpi_list_lock.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240422200158.2606761-11-oliver.upton@linux.dev
Signed-off-by: Marc Zyngier <maz@kernel.org>
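The lookup side of the fast path is then roughly as follows (a sketch:
the translation_cache field name and vgic_try_get_irq_kref() helper are
assumed, and cache_idx stands for the packed device/event ID index):

  struct vgic_irq *irq;

  /* RCU keeps the irq from being freed while we try to take a reference. */
  rcu_read_lock();
  irq = xa_load(&its->translation_cache, cache_idx);
  if (irq && !vgic_try_get_irq_kref(irq))
          irq = NULL;
  rcu_read_unlock();

  return irq;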
|
|
The fast path will soon need to find an ITS by doorbell address, as the
translation caches will become local to an ITS. Spin off a helper to do
just that.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240422200158.2606761-10-oliver.upton@linux.dev
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Within the context of a single ITS, it is possible to use an xarray to
cache the device ID & event ID translation to a particular irq
descriptor. Take advantage of this to build a translation cache capable
of fitting all valid translations for a given ITS.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240422200158.2606761-9-oliver.upton@linux.dev
Signed-off-by: Marc Zyngier <maz@kernel.org>
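The cache index can be a simple packing of the two IDs based on the ID
widths the ITS advertises; a sketch with hypothetical field and
function names:

  static unsigned long vgic_its_cache_key(struct vgic_its *its,
                                          u32 devid, u32 eventid)
  {
          /* Event ID in the low bits, device ID above it. */
          return ((unsigned long)devid << its->event_id_bits) | eventid;
  }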
|
|
As the current LPI translation cache is global, the corresponding
invalidation helpers are also globally-scoped. In anticipation of
constructing a translation cache per ITS, add a helper for scoped cache
invalidations.
We still need to support global invalidations when LPIs are toggled on
a redistributor, as a property of the translation cache is that all
stored LPIs are known to be deliverable.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240422200158.2606761-8-oliver.upton@linux.dev
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The last user has been transitioned to walking the LPI xarray directly.
Cut the wart off, and get rid of the now unneeded lpi_count while doing
so.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240422200158.2606761-7-oliver.upton@linux.dev
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The vgic debug iterator is the final user of vgic_copy_lpi_list(), but
is a bit more complicated to transition to something else. Use a mark
in the LPI xarray to record the indices 'known' to the debug iterator.
Protect against the LPIs from being freed by associating an additional
reference with the xarray mark.
Rework iter_next() to let the xarray walk 'drive' the iteration after
visiting all of the SGIs, PPIs, and SPIs.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240422200158.2606761-6-oliver.upton@linux.dev
Signed-off-by: Marc Zyngier <maz@kernel.org>
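The iterator body can then rely on xa_for_each_marked(), something like
the sketch below, where LPI_XA_MARK_DEBUG_ITER names the mark set (and
the reference taken) at seq_start time:

  unsigned long intid;
  struct vgic_irq *irq;

  /* Only visit the LPIs that were marked when iteration started. */
  xa_for_each_marked(&dist->lpi_xa, intid, irq, LPI_XA_MARK_DEBUG_ITER) {
          /* ... advance the iterator to this INTID ... */
  }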
|
|
The new LPI xarray makes it possible to walk the VM's LPIs without
holding a lock, meaning that vgic_copy_lpi_list() is no longer
necessary. Prepare for the deletion by walking the LPI xarray directly
in vgic_its_cmd_handle_movall().
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240422200158.2606761-5-oliver.upton@linux.dev
Signed-off-by: Marc Zyngier <maz@kernel.org>
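The MOVALL handler can then walk the xarray directly, roughly as below
(a simplified sketch; the real code still takes a reference on each irq
and goes through the usual affinity update):

  unsigned long intid;
  struct vgic_irq *irq;

  xa_for_each(&dist->lpi_xa, intid, irq) {
          if (irq->target_vcpu != vcpu1)
                  continue;

          irq->target_vcpu = vcpu2;       /* real code: update_affinity() */
  }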
|
|
The new LPI xarray makes it possible to walk the VM's LPIs without
holding a lock, meaning that vgic_copy_lpi_list() is no longer
necessary. Prepare for the deletion by walking the LPI xarray directly
in vgic_its_invall().
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240422200158.2606761-4-oliver.upton@linux.dev
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The new LPI xarray makes it possible to walk the VM's LPIs without
holding a lock, meaning that vgic_copy_lpi_list() is no longer
necessary. Prepare for the deletion by walking the LPI xarray directly
in its_sync_lpi_pending_table().
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240422200158.2606761-3-oliver.upton@linux.dev
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
vgic_v2_parse_attr() is responsible for finding the vCPU that matches
the user-provided CPUID, which (of course) may not be valid. If the ID
is invalid, kvm_get_vcpu_by_id() returns NULL, which isn't handled
gracefully.
Similar to the GICv3 uaccess flow, check that kvm_get_vcpu_by_id()
actually returns something and fail the ioctl if not.
Cc: stable@vger.kernel.org
Fixes: 7d450e282171 ("KVM: arm/arm64: vgic-new: Add userland access to VGIC dist registers")
Reported-by: Alexander Potapenko <glider@google.com>
Tested-by: Alexander Potapenko <glider@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240424173959.3776798-2-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
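The fix amounts to checking the lookup result before dereferencing it,
along these lines:

  struct kvm_vcpu *vcpu;

  vcpu = kvm_get_vcpu_by_id(dev->kvm, cpuid);
  if (!vcpu)
          return -EINVAL;

  reg_attr->vcpu = vcpu;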
|
|
We still support GCC 8.x, and it appears that this toolchain
usually comes with an assembler that does not understand "pauth"
as a valid architectural extension.
This results in the NV ERETAx code breaking the build, as it relies
on this extension to make use of the PACGA instruction (required
by assemblers such as LLVM's).
Work around it by hand-assembling the instruction, which removes the
requirement for any assembler directive.
Fixes: 6ccc971ee2c6 ("KVM: arm64: nv: Add emulation for ERETAx instructions")
Reported-by: Linaro Kernel Functional Testing <lkft@linaro.org>
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Marc Zyngier <maz@kernel.org>
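For reference, PACGA Xd, Xn, Xm encodes as 0x9ac03000 | Rm << 16 |
Rn << 5 | Rd, so the instruction can be emitted with a plain .inst and
fixed registers. A sketch (not the actual kernel macro, which derives
the register numbers from the asm operands):

  static u64 compute_pac(u64 ptr, u64 modifier)   /* illustrative wrapper */
  {
          register u64 d asm("x0");
          register u64 n asm("x1") = ptr;
          register u64 m asm("x2") = modifier;

          asm (".inst 0x9ac23020"                 /* pacga x0, x1, x2 */
               : "=r" (d)
               : "r" (n), "r" (m));

          return d;                               /* PAC in the upper 32 bits */
  }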
|
|
We currently insist on disabling PAuth on vcpu_load(), and get to
enable it on first guest use of an instruction or a key (ignoring
the NV case for now).
It isn't clear at all what this is trying to achieve: guests tend
to use PAuth when available, and nothing forces you to expose it
to the guest if you don't want to. This also isn't totally free:
we take a full GPR save/restore between host and guest, only to
write ten 64bit registers. The "value proposition" escapes me.
So let's forget this stuff and enable PAuth eagerly if exposed to
the guest. This results in much simpler code. Performance wise,
that's not bad either (tested on M2 Pro running a fully automated
Debian installer as the workload):
- On a non-NV guest, I can see a reduction of 0.24% in the number
of cycles (measured with perf over 10 consecutive runs)
- On a NV guest (L2), I see a 2% reduction in wall-clock time
(measured with 'time', as M2 doesn't have a PMUv3 and NV
doesn't support it either)
So overall, a much reduced complexity and a (small) performance
improvement.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240419102935.1935571-16-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Now that we (hopefully) correctly handle ERETAx, drop the masking
of the PAuth feature (something that was not even complete, as
APA3 and GPA3 were still exposed).
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240419102935.1935571-15-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Now that we have some emulation in place for ERETA[AB], we can
plug it into the exception handling machinery.
As for a bare ERET, an "easy" ERETAx instruction is processed as
a fixup, while something that requires a translation regime
transition or an exception delivery is left to the slow path.
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240419102935.1935571-14-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
FEAT_NV has the interesting property of relying on ERET being
trapped. An added complexity is that it also traps ERETAA and
ERETAB, meaning that the Pointer Authentication aspect of these
instructions must be emulated.
Add an emulation of Pointer Authentication, limited to ERETAx
(always using SP_EL2 as the modifier and ELR_EL2 as the pointer),
using the Generic Authentication instructions.
The emulation, however small, is placed in its own compilation
unit so that it can be avoided if the configuration doesn't
include it (or the toolchain is not up to the task).
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240419102935.1935571-13-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
In order for a L1 hypervisor to correctly handle PAuth instructions,
it must observe traps caused by a L1 PAuth instruction when
HCR_EL2.API==0. Since we already handle the case for API==1 as
a fixup, only the exception injection case needs to be handled.
Rework the kvm_handle_ptrauth() callback to reinject the trap
in this case. Note that APK==0 is already handled by the existing
triage_sysreg_trap() helper.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240419102935.1935571-11-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Although KVM couples API and APK for simplicity, the architecture
makes no such requirement, and the two can be independently set or
cleared.
Check for which of the two possible reasons we have trapped here,
and if the corresponding L1 control bit isn't set, delegate the
handling by forwarding the trap.
Otherwise, set this exact bit in HCR_EL2 and resume the guest.
Of course, in the non-NV case, we keep setting both bits and
are done with it. Note that the entry code already saves/restores
the keys should either of the two control bits be set.
This results in a bit of rework, and the removal of the (trivial)
vcpu_ptrauth_enable() helper.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240419102935.1935571-10-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
If the L1 hypervisor decides to trap ERETs while running L2,
make sure we don't try to emulate it, just like we wouldn't
if it had its NV bit set.
The exception will be reinjected from the core handler.
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240419102935.1935571-9-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
A significant part of the FEAT_NV extension is to trap ERET
instructions so that the hypervisor gets a chance to switch
from a vEL2 L1 guest to an EL1 L2 guest.
But this also has the unfortunate consequence of trapping ERET
in unsuspecting circumstances, such as staying at vEL2 (interrupt
handling while being in the guest hypervisor), or returning to host
userspace in the case of a VHE guest.
Although we already make some effort to handle these ERETs more quickly
by not doing the put/load dance, it is still way too far down the
line for it to be efficient enough.
For these cases, it would be ideal to ERET directly, no questions asked.
Of course, we can't do that. But the next best thing is to do it as
early as possible, in fixup_guest_exit(), much as we would handle
FPSIMD exceptions.
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240419102935.1935571-8-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Honor the trap forwarding bits for both ERET and SMC, using a new
helper that checks for common conditions.
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Co-developed-by: Jintack Lim <jintack.lim@linaro.org>
Signed-off-by: Jintack Lim <jintack.lim@linaro.org>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240419102935.1935571-7-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Add the HCR_EL2 configuration for FEAT_NV2, adding the required
bits for running a guest hypervisor, and overall merging the
allowed bits provided by the guest.
This heavily relies on unavailable features being sanitised
when the HCR_EL2 shadow register is accessed, and only a couple
of bits must be explicitly disabled.
Non-NV guests are completely unaffected by any of this.
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240419102935.1935571-6-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
It has become obvious that HCR_EL2.NV serves the exact same use
as VCPU_HYP_CONTEXT, only in an architectural way. So just drop
the flag for good.
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240419102935.1935571-5-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
PAuth comes in two parts: address authentication, and generic
authentication. So far, KVM mandates that both are implemented.
PAuth also comes in three flavours: Q5, Q3, and IMPDEF. Only one
flavour can be implemented for each of address and generic authentication.
Crucially, the architecture doesn't mandate that address and generic
authentication implement the *same* flavour. This would make
implementing ERETAx very difficult for NV, something we are not
terribly keen on.
So only allow PAuth support for KVM on systems that are not totally
insane. Which is so far 100% of the known HW.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240419102935.1935571-4-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The ESR_ELx_ERET_ISS_ERET* macros are a bit confusing:
- ESR_ELx_ERET_ISS_ERET really indicates that we have trapped an
ERETA* instruction, as opposed to an ERET
- ESR_ELx_ERET_ISS_ERETA really indicates that we have trapped
an ERETAB instruction, as opposed to an ERETAA.
We could repaint those to make more sense, but these are the
names that are present in the ARM ARM, and we are sentimentally
attached to those.
Instead, add two new helpers:
- esr_iss_is_eretax() being true tells you that you need to
authenticate the ERET
- esr_iss_is_eretab() tells you that you need to use the B key
instead of the A key
Following patches will make use of these primitives.
Suggested-by: Joey Gouly <joey.gouly@arm.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240419102935.1935571-3-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
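The helpers are thin wrappers over the existing ISS bits, likely along
these lines:

  static inline bool esr_iss_is_eretax(unsigned long esr)
  {
          /* Set for ERETAA/ERETAB, clear for a plain ERET. */
          return esr & ESR_ELx_ERET_ISS_ERET;
  }

  static inline bool esr_iss_is_eretab(unsigned long esr)
  {
          /* Set for ERETAB, clear for ERETAA. */
          return esr & ESR_ELx_ERET_ISS_ERETA;
  }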
|
|
The denylist is blocking the 32-bit version of the call but is allowing
the 64-bit version of it. There is no reason for blocking only one of
them, and the hypervisor should support these calls.
Signed-off-by: Sebastian Ene <sebastianene@google.com>
Link: https://lore.kernel.org/r/20240411135700.2140550-1-sebastianene@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Adding new entries to our system register tables is a painful exercise,
as we require them to be ordered by Op0,Op1,CRn,CRm,Op2.
If an entry is misordered, we output an error that indicates the
pointer to the entry and the number *of the last valid one*.
That's not very helpful; it would be much better if we printed the
number of the *offending* entry as well as its name (which is present
in the vast majority of the cases).
This makes debugging new additions to the tables much easier.
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20240410152503.3593890-1-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
In retrospect, it is fairly obvious that the FP state ownership
is only meaningful for a given CPU, and that locating this
information in the vcpu was just a mistake.
Move the ownership tracking into the host data structure, and
rename it from fp_state to fp_owner, which is a better description
(name suggested by Mark Brown).
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
As the name of the field indicates, host_fpsimd_state is strictly
a host piece of data, and we reset this pointer on each PID change.
So let's move it where it belongs, and set it at load-time. Although
this happens slightly more often, it is a well-defined life-cycle which
matches other pieces of data.
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
As for the rest of the host debug state, the host copy of mdcr_el2
has little to do in the vcpu, and is better placed in the host_data
structure.
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Keeping host_debug_state on a per-vcpu basis is completely
pointless. The lifetime of this data is only that of the inner
run-loop, which means it is never accessed outside of the core
EL2 code.
Move the structure into kvm_host_data, and save over 500 bytes
per vcpu.
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
In order to facilitate the introduction of new per-CPU state,
add a new host_data_ptr() helper that hides some of the per-CPU
verbosity, and make it easier to move that state around in the
future.
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
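In its simplest form the accessor is just a wrapper around
this_cpu_ptr(); a sketch:

  #define host_data_ptr(f)        (&this_cpu_ptr(&kvm_host_data)->f)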
|
|
The .change_pte() MMU notifier callback was intended as an
optimization. The original point of it was that KSM could tell KVM to flip
its secondary PTE to a new location without having to first zap it. At
the time there was also an .invalidate_page() callback; both of them were
*not* bracketed by calls to mmu_notifier_invalidate_range_{start,end}(),
and .invalidate_page() also doubled as a fallback implementation of
.change_pte().
Later on, however, both callbacks were changed to occur within an
invalidate_range_start/end() block.
In the case of .change_pte(), commit 6bdb913f0a70 ("mm: wrap calls to
set_pte_at_notify with invalidate_range_start and invalidate_range_end",
2012-10-09) did so to remove the fallback from .invalidate_page() to
.change_pte() and allow sleepable .invalidate_page() hooks.
This however made KVM's usage of the .change_pte() callback completely
moot, because KVM unmaps the sPTEs during .invalidate_range_start()
and therefore .change_pte() has no hope of finding a sPTE to change.
Drop the generic KVM code that dispatches to kvm_set_spte_gfn(), as
well as all the architecture specific implementations.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Anup Patel <anup@brainfault.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Message-ID: <20240405115815.3226315-2-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
We are not very consistent when it comes to displaying which mode
we're in (VHE, {n,h}VHE, protected or not). For example, booting
in protected mode with hVHE results in:
[ 0.969545] kvm [1]: Protected nVHE mode initialized successfully
which is mildly amusing considering that the machine is VHE only.
We already cleaned this up a bit with commit 1f3ca7023fe6 ("KVM:
arm64: print Hyp mode"), but that's still unsatisfactory.
Unify the three strings into one and use a mess of conditional
statements to sort it out (yes, it's a slow day).
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240321173706.3280796-1-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
When zapping a table entry in stage2_try_break_pte(), we issue range
TLB invalidation for the region that was mapped by the table. However,
we neglect to align the base address down to the granule size and so
if we ended up reaching the table entry via a misaligned address then
we will accidentally skip invalidation for some prefix of the affected
address range.
Align 'ctx->addr' down to the granule size when performing TLB
invalidation for an unmapped table in stage2_try_break_pte().
Cc: Raghavendra Rao Ananta <rananta@google.com>
Cc: Gavin Shan <gshan@redhat.com>
Cc: Shaoqin Huang <shahuang@redhat.com>
Cc: Quentin Perret <qperret@google.com>
Fixes: defc8cc7abf0 ("KVM: arm64: Invalidate the table entries upon a range")
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240327124853.11206-5-will@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
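The fix boils down to aligning the base of the invalidated range,
roughly:

  /* Invalidate the full range covered by the zapped table entry. */
  kvm_tlb_flush_vmid_range(mmu,
                           ALIGN_DOWN(ctx->addr, kvm_granule_size(ctx->level)),
                           kvm_granule_size(ctx->level));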
|