Age | Commit message (Collapse) | Author |
|
... otherwise we can inherit the host configuration if this differs from
the KVM configuration.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
[maz: simplified a couple of things]
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Now that we have computed RES0 bits, use them to sanitise the
guest view of FGT registers.
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The current FTP tables are only concerned with the bits generating
ESR_ELx.EC==0x18. However, we want an exhaustive view of what KVM
really knows about.
So let's add another small table that provides that extra information.
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
In the process of decoupling KVM's view of the FGT bits from the
wider architectural state, use KVM's own FGT tables to build
a synthetic view of what is actually known.
This allows for some checking along the way.
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
We don't seem to be handling the GCS-specific exception class.
Handle it by delivering an UNDEF to the guest, and populate the
relevant trap bits.
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Treating HCRX_EL2 as yet another FGT register seems excessive, and
gets in a way of further improvements. It is actually simpler to
just be explicit about the masking, so just to that.
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
We currently unconditionally make ACCDATA_EL1 accesses UNDEF.
As we are about to support it, restrict the UNDEF behaviour to cases
where FEAT_LS64_ACCDATA is not exposed to the guest.
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
We generally don't expect FEAT_LS64* instructions to trap, unless
they are trapped by a guest hypervisor.
Otherwise, this is just the guest playing tricks on us by using
an instruction that isn't advertised, which we handle with a well
deserved UNDEF.
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
check_fgt_bit() and triage_sysreg_trap() implement the same thing
twice for no good reason. We have to lookup the FGT register twice,
as we don't communicate it. Similarly, we extract the register value
at the wrong spot.
Reorganise the code in a more logical way so that things are done
at the correct location, removing a lot of duplication.
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
triage_sysreg_trap() assumes that it knows all the possible values
for FGT groups, which won't be the case as we start adding more
FGT registers (unless we add everything in one go, which is obviously
undesirable).
At the same time, it doesn't offer much in terms of debugging info
when things go wrong.
Turn the "__NR_FGT_GROUP_IDS__" case into a default, covering any
unhandled value, and give the kernel hacker a bit of a clue about
what's wrong (system register and full trap descriptor).
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Treating HFGRTR_EL2 and HFGWTR_EL2 identically was a mistake.
It makes things hard to reason about, has the potential to
introduce bugs by giving a meaning to bits that are really reserved,
and is in general a bad description of the architecture.
Given that #defines are cheap, let's describe both registers as
intended by the architecture, and repaint all the existing uses.
Yes, this is painful.
The registers themselves are generated from the JSON file in
an automated way.
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The pKVM selftest intends to test as many memory 'transitions' as
possible, so extend it to cover sharing pages with non-protected guests,
including in the case of multi-sharing.
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20250416160900.3078417-5-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
We have recently found a bug [1] in the pKVM memory ownership
transitions by code inspection, but it could have been caught with a
test.
Introduce a boot-time selftest exercising all the known pKVM memory
transitions and importantly checks the rejection of illegal transitions.
The new test is hidden behind a new Kconfig option separate from
CONFIG_EL2_NVHE_DEBUG on purpose as that has side effects on the
transition checks ([1] doesn't reproduce with EL2 debug enabled).
[1] https://lore.kernel.org/kvmarm/20241128154406.602875-1-qperret@google.com/
Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20250416160900.3078417-4-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
We currently WARN() if the host attempts to share a page that is not in
an acceptable state with a guest. This isn't strictly necessary and
makes testing much harder, so drop the WARN and make sure to propage the
error code instead.
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20250416160900.3078417-3-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The hypervisor has not needed its own .data section because all globals
were either .rodata or .bss. To avoid having to initialize future
data-structures at run-time, let's introduce add a .data section to the
hypervisor.
Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20250416160900.3078417-2-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
We keep setting and clearing these bits depending on the role of
the host kernel, mimicking what we do for nVHE. But that's actually
pretty pointless, as we always want physical interrupts to make it
to the host, at EL2.
This has also two problems:
- it prevents IRQs from being taken when these bits are cleared
if the implementation has chosen to implement these bits as
masks when HCR_EL2.{TGE,xMO}=={0,0}
- it triggers a bad erratum on the AmpereOne HW, which catches
fire on clearing these bits while an interrupt is being taken
(AC03_CPU_36).
Let's kill these two birds with a single stone, and permanently
set the xMO bits when running VHE. This involves a bit of surgery
on code paths that rely on flipping these bits on and off for
other purposes.
Note that the earliest setting of hcr_el2 (in the init_hcr_el2
macro) is left untouched as is runs extremely early, with interrupts
disabled, and soon enough overwritten with the final value containing
the xMO bits.
Reported-by: D Scott Phillips <scott@os.amperecomputing.com>
Link: https://lore.kernel.org/r/20250429114326.3618875-1-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Replace repetitive ternary expressions with the str_on_off() helper
function. This change improves code readability and ensures consistency
in tracepoint string formatting
Signed-off-by: Seongsu Park <sgsu.park@samsung.com>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/1891546521.01744691102904.JavaMail.epsvc@epcpadp1new
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
virtualisable EL
A sorry excuse for a selftest is trying to disable AArch64 support.
And yes, this goes as well as you can imagine.
Let's forbid this sort of things. Normal userspace shouldn't get
caught doing that.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
Link: https://lore.kernel.org/r/20250429114117.3618800-2-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
We keep setting and clearing these bits depending on the role of
the host kernel, mimicking what we do for nVHE. But that's actually
pretty pointless, as we always want physical interrupts to make it
to the host, at EL2.
This has also two problems:
- it prevents IRQs from being taken when these bits are cleared
if the implementation has chosen to implement these bits as
masks when HCR_EL2.{TGE,xMO}=={0,0}
- it triggers a bad erratum on the AmpereOne HW, which catches
fire on clearing these bits while an interrupt is being taken
(AC03_CPU_36).
Let's kill these two birds with a single stone, and permanently
set the xMO bits when running VHE. This involves a bit of surgery
on code paths that rely on flipping these bits on and off for
other purposes.
Note that the earliest setting of hcr_el2 (in the init_hcr_el2
macro) is left untouched as is runs extremely early, with interrupts
disabled, and soon enough overwritten with the final value containing
the xMO bits.
Reported-by: D Scott Phillips <scott@os.amperecomputing.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250429114326.3618875-1-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Commit fce886a60207 ("KVM: arm64: Plumb the pKVM MMU in KVM") made the
initialization of the local memcache variable in user_mem_abort()
conditional, leaving a codepath where it is used uninitialized via
kvm_pgtable_stage2_map().
This can fail on any path that requires a stage-2 allocation
without transition via a permission fault or dirty logging.
Fix this by making sure that memcache is always valid.
Fixes: fce886a60207 ("KVM: arm64: Plumb the pKVM MMU in KVM")
Signed-off-by: Sebastian Ott <sebott@redhat.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/kvmarm/3f5db4c7-ccce-fb95-595c-692fa7aad227@redhat.com/
Link: https://lore.kernel.org/r/20250505173148.33900-1-sebott@redhat.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Now that gcc-8 and binutils-2.30 are the minimum versions, a lot of
the individual feature checks can go away for simplification.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
* kvm-arm64/nv-pmu-fixes:
: .
: Fixes for NV PMU emulation. From the cover letter:
:
: "Joey reports that some of his PMU tests do not behave quite as
: expected:
:
: - MDCR_EL2.HPMN is set to 0 out of reset
:
: - PMCR_EL0.P should reset all the counters when written from EL2
:
: Oliver points out that setting PMCR_EL0.N from userspace by writing to
: the register is silly with NV, and that we need a new PMU attribute
: instead.
:
: On top of that, I figured out that we had a number of little gotchas:
:
: - It is possible for a guest to write an HPMN value that is out of
: bound, and it seems valuable to limit it
:
: - PMCR_EL0.N should be the maximum number of counters when read from
: EL2, and MDCR_EL2.HPMN when read from EL0/EL1
:
: - Prevent userspace from updating PMCR_EL0.N when EL2 is available"
: .
KVM: arm64: Let kvm_vcpu_read_pmcr() return an EL-dependent value for PMCR_EL0.N
KVM: arm64: Handle out-of-bound write to MDCR_EL2.HPMN
KVM: arm64: Don't let userspace write to PMCR_EL0.N when the vcpu has EL2
KVM: arm64: Allow userspace to limit the number of PMU counters for EL2 VMs
KVM: arm64: Contextualise the handling of PMCR_EL0.P writes
KVM: arm64: Fix MDCR_EL2.HPMN reset value
KVM: arm64: Repaint pmcr_n into nr_pmu_counters
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Now that the hypervisor's state is stored in the hyp_vmemmap, we no
longer need an expensive page-table walk to read it. This means we can
now afford to cross check the hyp-state during all memory ownership
transitions where the hyp is involved unconditionally, hence avoiding
problems such as [1].
[1] https://lore.kernel.org/kvmarm/20241128154406.602875-1-qperret@google.com/
Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20250416152648.2982950-8-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
We currently blindly map into EL2 stage-1 *any* page passed to the
__pkvm_host_share_hyp() HVC. This is less than ideal from a security
perspective as it makes exploitation of potential hypervisor gadgets
easier than it should be. But interestingly, pKVM should never need to
access SHARED_BORROWED pages that it hasn't previously pinned, so there
is no need to map the page before that.
Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20250416152648.2982950-7-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Tracking the hypervisor's ownership state into struct hyp_page has
several benefits, including allowing far more efficient lookups (no
page-table walk needed) and de-corelating the state from the presence
of a mapping. This will later allow to map pages into EL2 stage-1 less
proactively which is generally a good thing for security. And in the
future this will help with tracking the state of pages mapped into the
hypervisor's private range without requiring an alias into the 'linear
map' range.
Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20250416152648.2982950-6-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Instead of directly accessing the host_state member in struct hyp_page,
introduce static inline accessors to do it. The future hyp_state member
will follow the same pattern as it will need some logic in the accessors.
Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20250416152648.2982950-5-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The page ownership state encoded as 0b11 is currently considered
reserved for future use, and PKVM_NOPAGE uses bit 2. In order to
simplify the relocation of the hyp ownership state into the
vmemmap in later patches, let's use the 'reserved' encoding for
the PKVM_NOPAGE state. The struct hyp_page layout isn't guaranteed
stable at all, so there is no real reason to have 'reserved' encodings.
No functional changes intended.
Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20250416152648.2982950-4-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Most of the comments relating to pKVM page-tracking in nvhe/memory.h are
now either slightly outdated or outright wrong. Fix the comments.
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20250416152648.2982950-3-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
When dealing with a guest with SVE enabled, make sure the host SVE
state is pinned at EL2 S1, and that the hypervisor vCPU state is
correctly initialised (and then unpinned on teardown).
Co-authored-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20250416152648.2982950-2-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
kvm_arch_has_irq_bypass() is a small function and even though it does
not appear in any *really* hot paths, it's also not entirely rare.
Make it inline---it also works out nicely in preparation for using it in
kvm-intel.ko and kvm-amd.ko, since the function is not currently exported.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
When EL2 is present, PMCR_EL0.N is the effective value of MDCR_EL2.HPMN
when accessed from EL1 or EL0.
Make sure we honor this requirement.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
We don't really pay attention to what gets written to MDCR_EL2.HPMN,
and funky guests could play ugly games on us.
Restrict what gets written there, and limit the number of counters
to what the PMU is allowed to have.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Now that userspace can provide its limit for hte maximum number of
counters, prevent it from writing to PMCR_EL0.N, as the value should
be derived from MDCR_EL2.HPMN in that case.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
As long as we had purely EL1 VMs, we could easily update the number
of guest-visible counters by letting userspace write to PMCR_EL0.N.
With VMs started at EL2, PMCR_EL1.N only reflects MDCR_EL2.HPMN,
and we don't have a good way to limit it.
For this purpose, introduce a new PMUv3 attribute that allows
limiting the maximum number of counters. This requires the explicit
selection of a PMU.
Suggested-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Contrary to what the comment says in kvm_pmu_handle_pmcr(),
writing PMCR_EL0.P==1 has the following effects:
<quote>
The event counters affected by this field are:
* All event counters in the first range.
* If any of the following are true, all event counters in the second
range:
- EL2 is disabled or not implemented in the current Security state.
- The PE is executing at EL2 or EL3.
</quote>
where the "first range" represent the counters in the [0..HPMN-1]
range, and the "second range" the counters in the [HPMN..MAX] range.
It so appears that writing P from EL2 should nuke all counters,
and not just the "guest" view. Just do that, and nuke the misleading
comment.
Reported-by: Joey Gouly <joey.gouly@arm.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The MDCR_EL2 documentation indicates that the HPMN field has
the following behaviour:
"On a Warm reset, this field resets to the expression NUM_PMU_COUNTERS."
However, it appears we reset it to zero, which is not very useful.
Add a reset helper for MDCR_EL2, and handle the case where userspace
changes the target PMU, which may force us to change HPMN again.
Reported-by: Joey Gouly <joey.gouly@arm.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The pmcr_n field obviously refers to PMCR_EL0.N, but is generally used
as the number of counters seen by the guest. Rename it accordingly.
Suggested-by: Oliver upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Pull kvm fixes from Paolo Bonzini:
"ARM:
- Rework heuristics for resolving the fault IPA (HPFAR_EL2 v. re-walk
stage-1 page tables) to align with the architecture. This avoids
possibly taking an SEA at EL2 on the page table walk or using an
architecturally UNKNOWN fault IPA
- Use acquire/release semantics in the KVM FF-A proxy to avoid
reading a stale value for the FF-A version
- Fix KVM guest driver to match PV CPUID hypercall ABI
- Use Inner Shareable Normal Write-Back mappings at stage-1 in KVM
selftests, which is the only memory type for which atomic
instructions are architecturally guaranteed to work
s390:
- Don't use %pK for debug printing and tracepoints
x86:
- Use a separate subclass when acquiring KVM's per-CPU posted
interrupts wakeup lock in the scheduled out path, i.e. when adding
a vCPU on the list of vCPUs to wake, to workaround a false positive
deadlock. The schedule out code runs with a scheduler lock that the
wakeup handler takes in the opposite order; but it does so with
IRQs disabled and cannot run concurrently with a wakeup
- Explicitly zero-initialize on-stack CPUID unions
- Allow building irqbypass.ko as as module when kvm.ko is a module
- Wrap relatively expensive sanity check with KVM_PROVE_MMU
- Acquire SRCU in KVM_GET_MP_STATE to protect guest memory accesses
selftests:
- Add more scenarios to the MONITOR/MWAIT test
- Add option to rseq test to override /dev/cpu_dma_latency
- Bring list of exit reasons up to date
- Cleanup Makefile to list once tests that are valid on all
architectures
Other:
- Documentation fixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (26 commits)
KVM: arm64: Use acquire/release to communicate FF-A version negotiation
KVM: arm64: selftests: Explicitly set the page attrs to Inner-Shareable
KVM: arm64: selftests: Introduce and use hardware-definition macros
KVM: VMX: Use separate subclasses for PI wakeup lock to squash false positive
KVM: VMX: Assert that IRQs are disabled when putting vCPU on PI wakeup list
KVM: x86: Explicitly zero-initialize on-stack CPUID unions
KVM: Allow building irqbypass.ko as as module when kvm.ko is a module
KVM: x86/mmu: Wrap sanity check on number of TDP MMU pages with KVM_PROVE_MMU
KVM: selftests: Add option to rseq test to override /dev/cpu_dma_latency
KVM: x86: Acquire SRCU in KVM_GET_MP_STATE to protect guest memory accesses
Documentation: kvm: remove KVM_CAP_MIPS_TE
Documentation: kvm: organize capabilities in the right section
Documentation: kvm: fix some definition lists
Documentation: kvm: drop "Capability" heading from capabilities
Documentation: kvm: give correct name for KVM_CAP_SPAPR_MULTITCE
Documentation: KVM: KVM_GET_SUPPORTED_CPUID now exposes TSC_DEADLINE
selftests: kvm: list once tests that are valid on all architectures
selftests: kvm: bring list of exit reasons up to date
selftests: kvm: revamp MONITOR/MWAIT tests
KVM: arm64: Don't translate FAR if invalid/unsafe
...
|
|
The pKVM FF-A proxy rejects FF-A requests other than FFA_VERSION until
version negotiation is complete, which is signalled by setting the
global 'has_version_negotiated' variable.
To avoid excessive locking, this variable is checked directly from
kvm_host_ffa_handler() in response to an FF-A call, but this can race
against another CPU performing the negotiation and potentially lead to
reading a torn value (incredibly unlikely for a 'bool') or problematic
re-ordering of the accesses to 'has_version_negotiated' and
'hyp_ffa_version' whereby a stale version number could be read by
__do_ffa_mem_xfer().
Use acquire/release primitives when writing 'has_version_negotiated'
with the version lock held and when reading without the lock held.
Cc: Sebastian Ene <sebastianene@google.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Quentin Perret <qperret@google.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Marc Zyngier <maz@kernel.org>
Fixes: c9c012625e12 ("KVM: arm64: Trap FFA_VERSION host call in pKVM")
Signed-off-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250407152755.1041-1-will@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Don't re-walk the page tables if an SEA occurred during the faulting
page table walk to avoid taking a fatal exception in the hyp.
Additionally, check that FAR_EL2 is valid for SEAs not taken on PTW
as the architecture doesn't guarantee it contains the fault VA.
Finally, fix up the rest of the abort path by checking for SEAs early
and bugging the VM if we get further along with an UNKNOWN fault IPA.
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250402201725.2963645-4-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Switch over to the typical sysreg table for HPFAR_EL2 as we're about to
start using more fields in the register.
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250402201725.2963645-3-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
KVM's logic for deciding when HPFAR_EL2 is UNKNOWN doesn't align with
the architecture. Most notably, KVM assumes HPFAR_EL2 contains the
faulting IPA even in the case of an SEA.
Align the logic with the architecture rather than attempting to
paraphrase it. Additionally, take the opportunity to improve the
language around ARM erratum #834220 such that it actually describes the
bug.
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250402201725.2963645-2-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- The series "Enable strict percpu address space checks" from Uros
Bizjak uses x86 named address space qualifiers to provide
compile-time checking of percpu area accesses.
This has caused a small amount of fallout - two or three issues were
reported. In all cases the calling code was found to be incorrect.
- The series "Some cleanup for memcg" from Chen Ridong implements some
relatively monir cleanups for the memcontrol code.
- The series "mm: fixes for device-exclusive entries (hmm)" from David
Hildenbrand fixes a boatload of issues which David found then using
device-exclusive PTE entries when THP is enabled. More work is
needed, but this makes thins better - our own HMM selftests now
succeed.
- The series "mm: zswap: remove z3fold and zbud" from Yosry Ahmed
remove the z3fold and zbud implementations. They have been deprecated
for half a year and nobody has complained.
- The series "mm: further simplify VMA merge operation" from Lorenzo
Stoakes implements numerous simplifications in this area. No runtime
effects are anticipated.
- The series "mm/madvise: remove redundant mmap_lock operations from
process_madvise()" from SeongJae Park rationalizes the locking in the
madvise() implementation. Performance gains of 20-25% were observed
in one MADV_DONTNEED microbenchmark.
- The series "Tiny cleanup and improvements about SWAP code" from
Baoquan He contains a number of touchups to issues which Baoquan
noticed when working on the swap code.
- The series "mm: kmemleak: Usability improvements" from Catalin
Marinas implements a couple of improvements to the kmemleak
user-visible output.
- The series "mm/damon/paddr: fix large folios access and schemes
handling" from Usama Arif provides a couple of fixes for DAMON's
handling of large folios.
- The series "mm/damon/core: fix wrong and/or useless damos_walk()
behaviors" from SeongJae Park fixes a few issues with the accuracy of
kdamond's walking of DAMON regions.
- The series "expose mapping wrprotect, fix fb_defio use" from Lorenzo
Stoakes changes the interaction between framebuffer deferred-io and
core MM. No functional changes are anticipated - this is preparatory
work for the future removal of page structure fields.
- The series "mm/damon: add support for hugepage_size DAMOS filter"
from Usama Arif adds a DAMOS filter which permits the filtering by
huge page sizes.
- The series "mm: permit guard regions for file-backed/shmem mappings"
from Lorenzo Stoakes extends the guard region feature from its
present "anon mappings only" state. The feature now covers shmem and
file-backed mappings.
- The series "mm: batched unmap lazyfree large folios during
reclamation" from Barry Song cleans up and speeds up the unmapping
for pte-mapped large folios.
- The series "reimplement per-vma lock as a refcount" from Suren
Baghdasaryan puts the vm_lock back into the vma. Our reasons for
pulling it out were largely bogus and that change made the code more
messy. This patchset provides small (0-10%) improvements on one
microbenchmark.
- The series "Docs/mm/damon: misc DAMOS filters documentation fixes and
improves" from SeongJae Park does some maintenance work on the DAMON
docs.
- The series "hugetlb/CMA improvements for large systems" from Frank
van der Linden addresses a pile of issues which have been observed
when using CMA on large machines.
- The series "mm/damon: introduce DAMOS filter type for unmapped pages"
from SeongJae Park enables users of DMAON/DAMOS to filter my the
page's mapped/unmapped status.
- The series "zsmalloc/zram: there be preemption" from Sergey
Senozhatsky teaches zram to run its compression and decompression
operations preemptibly.
- The series "selftests/mm: Some cleanups from trying to run them" from
Brendan Jackman fixes a pile of unrelated issues which Brendan
encountered while runnimg our selftests.
- The series "fs/proc/task_mmu: add guard region bit to pagemap" from
Lorenzo Stoakes permits userspace to use /proc/pid/pagemap to
determine whether a particular page is a guard page.
- The series "mm, swap: remove swap slot cache" from Kairui Song
removes the swap slot cache from the allocation path - it simply
wasn't being effective.
- The series "mm: cleanups for device-exclusive entries (hmm)" from
David Hildenbrand implements a number of unrelated cleanups in this
code.
- The series "mm: Rework generic PTDUMP configs" from Anshuman Khandual
implements a number of preparatoty cleanups to the GENERIC_PTDUMP
Kconfig logic.
- The series "mm/damon: auto-tune aggregation interval" from SeongJae
Park implements a feedback-driven automatic tuning feature for
DAMON's aggregation interval tuning.
- The series "Fix lazy mmu mode" from Ryan Roberts fixes some issues in
powerpc, sparc and x86 lazy MMU implementations. Ryan did this in
preparation for implementing lazy mmu mode for arm64 to optimize
vmalloc.
- The series "mm/page_alloc: Some clarifications for migratetype
fallback" from Brendan Jackman reworks some commentary to make the
code easier to follow.
- The series "page_counter cleanup and size reduction" from Shakeel
Butt cleans up the page_counter code and fixes a size increase which
we accidentally added late last year.
- The series "Add a command line option that enables control of how
many threads should be used to allocate huge pages" from Thomas
Prescher does that. It allows the careful operator to significantly
reduce boot time by tuning the parallalization of huge page
initialization.
- The series "Fix calculations in trace_balance_dirty_pages() for cgwb"
from Tang Yizhou fixes the tracing output from the dirty page
balancing code.
- The series "mm/damon: make allow filters after reject filters useful
and intuitive" from SeongJae Park improves the handling of allow and
reject filters. Behaviour is made more consistent and the documention
is updated accordingly.
- The series "Switch zswap to object read/write APIs" from Yosry Ahmed
updates zswap to the new object read/write APIs and thus permits the
removal of some legacy code from zpool and zsmalloc.
- The series "Some trivial cleanups for shmem" from Baolin Wang does as
it claims.
- The series "fs/dax: Fix ZONE_DEVICE page reference counts" from
Alistair Popple regularizes the weird ZONE_DEVICE page refcount
handling in DAX, permittig the removal of a number of special-case
checks.
- The series "refactor mremap and fix bug" from Lorenzo Stoakes is a
preparatoty refactoring and cleanup of the mremap() code.
- The series "mm: MM owner tracking for large folios (!hugetlb) +
CONFIG_NO_PAGE_MAPCOUNT" from David Hildenbrand reworks the manner in
which we determine whether a large folio is known to be mapped
exclusively into a single MM.
- The series "mm/damon: add sysfs dirs for managing DAMOS filters based
on handling layers" from SeongJae Park adds a couple of new sysfs
directories to ease the management of DAMON/DAMOS filters.
- The series "arch, mm: reduce code duplication in mem_init()" from
Mike Rapoport consolidates many per-arch implementations of
mem_init() into code generic code, where that is practical.
- The series "mm/damon/sysfs: commit parameters online via
damon_call()" from SeongJae Park continues the cleaning up of sysfs
access to DAMON internal data.
- The series "mm: page_ext: Introduce new iteration API" from Luiz
Capitulino reworks the page_ext initialization to fix a boot-time
crash which was observed with an unusual combination of compile and
cmdline options.
- The series "Buddy allocator like (or non-uniform) folio split" from
Zi Yan reworks the code to split a folio into smaller folios. The
main benefit is lessened memory consumption: fewer post-split folios
are generated.
- The series "Minimize xa_node allocation during xarry split" from Zi
Yan reduces the number of xarray xa_nodes which are generated during
an xarray split.
- The series "drivers/base/memory: Two cleanups" from Gavin Shan
performs some maintenance work on the drivers/base/memory code.
- The series "Add tracepoints for lowmem reserves, watermarks and
totalreserve_pages" from Martin Liu adds some more tracepoints to the
page allocator code.
- The series "mm/madvise: cleanup requests validations and
classifications" from SeongJae Park cleans up some warts which
SeongJae observed during his earlier madvise work.
- The series "mm/hwpoison: Fix regressions in memory failure handling"
from Shuai Xue addresses two quite serious regressions which Shuai
has observed in the memory-failure implementation.
- The series "mm: reliable huge page allocator" from Johannes Weiner
makes huge page allocations cheaper and more reliable by reducing
fragmentation.
- The series "Minor memcg cleanups & prep for memdescs" from Matthew
Wilcox is preparatory work for the future implementation of memdescs.
- The series "track memory used by balloon drivers" from Nico Pache
introduces a way to track memory used by our various balloon drivers.
- The series "mm/damon: introduce DAMOS filter type for active pages"
from Nhat Pham permits users to filter for active/inactive pages,
separately for file and anon pages.
- The series "Adding Proactive Memory Reclaim Statistics" from Hao Jia
separates the proactive reclaim statistics from the direct reclaim
statistics.
- The series "mm/vmscan: don't try to reclaim hwpoison folio" from
Jinjiang Tu fixes our handling of hwpoisoned pages within the reclaim
code.
* tag 'mm-stable-2025-03-30-16-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (431 commits)
mm/page_alloc: remove unnecessary __maybe_unused in order_to_pindex()
x86/mm: restore early initialization of high_memory for 32-bits
mm/vmscan: don't try to reclaim hwpoison folio
mm/hwpoison: introduce folio_contain_hwpoisoned_page() helper
cgroup: docs: add pswpin and pswpout items in cgroup v2 doc
mm: vmscan: split proactive reclaim statistics from direct reclaim statistics
selftests/mm: speed up split_huge_page_test
selftests/mm: uffd-unit-tests support for hugepages > 2M
docs/mm/damon/design: document active DAMOS filter type
mm/damon: implement a new DAMOS filter type for active pages
fs/dax: don't disassociate zero page entries
MM documentation: add "Unaccepted" meminfo entry
selftests/mm: add commentary about 9pfs bugs
fork: use __vmalloc_node() for stack allocation
docs/mm: Physical Memory: Populate the "Zones" section
xen: balloon: update the NR_BALLOON_PAGES state
hv_balloon: update the NR_BALLOON_PAGES state
balloon_compaction: update the NR_BALLOON_PAGES state
meminfo: add a per node counter for balloon drivers
mm: remove references to folio in __memcg_kmem_uncharge_page()
...
|
|
Pull kvm updates from Paolo Bonzini:
"ARM:
- Nested virtualization support for VGICv3, giving the nested
hypervisor control of the VGIC hardware when running an L2 VM
- Removal of 'late' nested virtualization feature register masking,
making the supported feature set directly visible to userspace
- Support for emulating FEAT_PMUv3 on Apple silicon, taking advantage
of an IMPLEMENTATION DEFINED trap that covers all PMUv3 registers
- Paravirtual interface for discovering the set of CPU
implementations where a VM may run, addressing a longstanding issue
of guest CPU errata awareness in big-little systems and
cross-implementation VM migration
- Userspace control of the registers responsible for identifying a
particular CPU implementation (MIDR_EL1, REVIDR_EL1, AIDR_EL1),
allowing VMs to be migrated cross-implementation
- pKVM updates, including support for tracking stage-2 page table
allocations in the protected hypervisor in the 'SecPageTable' stat
- Fixes to vPMU, ensuring that userspace updates to the vPMU after
KVM_RUN are reflected into the backing perf events
LoongArch:
- Remove unnecessary header include path
- Assume constant PGD during VM context switch
- Add perf events support for guest VM
RISC-V:
- Disable the kernel perf counter during configure
- KVM selftests improvements for PMU
- Fix warning at the time of KVM module removal
x86:
- Add support for aging of SPTEs without holding mmu_lock.
Not taking mmu_lock allows multiple aging actions to run in
parallel, and more importantly avoids stalling vCPUs. This includes
an implementation of per-rmap-entry locking; aging the gfn is done
with only a per-rmap single-bin spinlock taken, whereas locking an
rmap for write requires taking both the per-rmap spinlock and the
mmu_lock.
Note that this decreases slightly the accuracy of accessed-page
information, because changes to the SPTE outside aging might not
use atomic operations even if they could race against a clear of
the Accessed bit.
This is deliberate because KVM and mm/ tolerate false
positives/negatives for accessed information, and testing has shown
that reducing the latency of aging is far more beneficial to
overall system performance than providing "perfect" young/old
information.
- Defer runtime CPUID updates until KVM emulates a CPUID instruction,
to coalesce updates when multiple pieces of vCPU state are
changing, e.g. as part of a nested transition
- Fix a variety of nested emulation bugs, and add VMX support for
synthesizing nested VM-Exit on interception (instead of injecting
#UD into L2)
- Drop "support" for async page faults for protected guests that do
not set SEND_ALWAYS (i.e. that only want async page faults at CPL3)
- Bring a bit of sanity to x86's VM teardown code, which has
accumulated a lot of cruft over the years. Particularly, destroy
vCPUs before the MMU, despite the latter being a VM-wide operation
- Add common secure TSC infrastructure for use within SNP and in the
future TDX
- Block KVM_CAP_SYNC_REGS if guest state is protected. It does not
make sense to use the capability if the relevant registers are not
available for reading or writing
- Don't take kvm->lock when iterating over vCPUs in the suspend
notifier to fix a largely theoretical deadlock
- Use the vCPU's actual Xen PV clock information when starting the
Xen timer, as the cached state in arch.hv_clock can be stale/bogus
- Fix a bug where KVM could bleed PVCLOCK_GUEST_STOPPED across
different PV clocks; restrict PVCLOCK_GUEST_STOPPED to kvmclock, as
KVM's suspend notifier only accounts for kvmclock, and there's no
evidence that the flag is actually supported by Xen guests
- Clean up the per-vCPU "cache" of its reference pvclock, and instead
only track the vCPU's TSC scaling (multipler+shift) metadata (which
is moderately expensive to compute, and rarely changes for modern
setups)
- Don't write to the Xen hypercall page on MSR writes that are
initiated by the host (userspace or KVM) to fix a class of bugs
where KVM can write to guest memory at unexpected times, e.g.
during vCPU creation if userspace has set the Xen hypercall MSR
index to collide with an MSR that KVM emulates
- Restrict the Xen hypercall MSR index to the unofficial synthetic
range to reduce the set of possible collisions with MSRs that are
emulated by KVM (collisions can still happen as KVM emulates
Hyper-V MSRs, which also reside in the synthetic range)
- Clean up and optimize KVM's handling of Xen MSR writes and
xen_hvm_config
- Update Xen TSC leaves during CPUID emulation instead of modifying
the CPUID entries when updating PV clocks; there is no guarantee PV
clocks will be updated between TSC frequency changes and CPUID
emulation, and guest reads of the TSC leaves should be rare, i.e.
are not a hot path
x86 (Intel):
- Fix a bug where KVM unnecessarily reads XFD_ERR from hardware and
thus modifies the vCPU's XFD_ERR on a #NM due to CR0.TS=1
- Pass XFD_ERR as the payload when injecting #NM, as a preparatory
step for upcoming FRED virtualization support
- Decouple the EPT entry RWX protection bit macros from the EPT
Violation bits, both as a general cleanup and in anticipation of
adding support for emulating Mode-Based Execution Control (MBEC)
- Reject KVM_RUN if userspace manages to gain control and stuff
invalid guest state while KVM is in the middle of emulating nested
VM-Enter
- Add a macro to handle KVM's sanity checks on entry/exit VMCS
control pairs in anticipation of adding sanity checks for secondary
exit controls (the primary field is out of bits)
x86 (AMD):
- Ensure the PSP driver is initialized when both the PSP and KVM
modules are built-in (the initcall framework doesn't handle
dependencies)
- Use long-term pins when registering encrypted memory regions, so
that the pages are migrated out of MIGRATE_CMA/ZONE_MOVABLE and
don't lead to excessive fragmentation
- Add macros and helpers for setting GHCB return/error codes
- Add support for Idle HLT interception, which elides interception if
the vCPU has a pending, unmasked virtual IRQ when HLT is executed
- Fix a bug in INVPCID emulation where KVM fails to check for a
non-canonical address
- Don't attempt VMRUN for SEV-ES+ guests if the vCPU's VMSA is
invalid, e.g. because the vCPU was "destroyed" via SNP's AP
Creation hypercall
- Reject SNP AP Creation if the requested SEV features for the vCPU
don't match the VM's configured set of features
Selftests:
- Fix again the Intel PMU counters test; add a data load and do
CLFLUSH{OPT} on the data instead of executing code. The theory is
that modern Intel CPUs have learned new code prefetching tricks
that bypass the PMU counters
- Fix a flaw in the Intel PMU counters test where it asserts that an
event is counting correctly without actually knowing what the event
counts on the underlying hardware
- Fix a variety of flaws, bugs, and false failures/passes
dirty_log_test, and improve its coverage by collecting all dirty
entries on each iteration
- Fix a few minor bugs related to handling of stats FDs
- Add infrastructure to make vCPU and VM stats FDs available to tests
by default (open the FDs during VM/vCPU creation)
- Relax an assertion on the number of HLT exits in the xAPIC IPI test
when running on a CPU that supports AMD's Idle HLT (which elides
interception of HLT if a virtual IRQ is pending and unmasked)"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (216 commits)
RISC-V: KVM: Optimize comments in kvm_riscv_vcpu_isa_disable_allowed
RISC-V: KVM: Teardown riscv specific bits after kvm_exit
LoongArch: KVM: Register perf callbacks for guest
LoongArch: KVM: Implement arch-specific functions for guest perf
LoongArch: KVM: Add stub for kvm_arch_vcpu_preempted_in_kernel()
LoongArch: KVM: Remove PGD saving during VM context switch
LoongArch: KVM: Remove unnecessary header include path
KVM: arm64: Tear down vGIC on failed vCPU creation
KVM: arm64: PMU: Reload when resetting
KVM: arm64: PMU: Reload when user modifies registers
KVM: arm64: PMU: Fix SET_ONE_REG for vPMC regs
KVM: arm64: PMU: Assume PMU presence in pmu-emul.c
KVM: arm64: PMU: Set raw values from user to PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR}
KVM: arm64: Create each pKVM hyp vcpu after its corresponding host vcpu
KVM: arm64: Factor out pKVM hyp vcpu creation to separate function
KVM: arm64: Initialize HCRX_EL2 traps in pKVM
KVM: arm64: Factor out setting HCRX_EL2 traps into separate function
KVM: x86: block KVM_CAP_SYNC_REGS if guest state is protected
KVM: x86: Add infrastructure for secure TSC
KVM: x86: Push down setting vcpu.arch.user_set_tsc
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
"Nothing major this time around.
Apart from the usual perf/PMU updates, some page table cleanups, the
notable features are average CPU frequency based on the AMUv1
counters, CONFIG_HOTPLUG_SMT and MOPS instructions (memcpy/memset) in
the uaccess routines.
Perf and PMUs:
- Support for the 'Rainier' CPU PMU from Arm
- Preparatory driver changes and cleanups that pave the way for BRBE
support
- Support for partial virtualisation of the Apple-M1 PMU
- Support for the second event filter in Arm CSPMU designs
- Minor fixes and cleanups (CMN and DWC PMUs)
- Enable EL2 requirements for FEAT_PMUv3p9
Power, CPU topology:
- Support for AMUv1-based average CPU frequency
- Run-time SMT control wired up for arm64 (CONFIG_HOTPLUG_SMT). It
adds a generic topology_is_primary_thread() function overridden by
x86 and powerpc
New(ish) features:
- MOPS (memcpy/memset) support for the uaccess routines
Security/confidential compute:
- Fix the DMA address for devices used in Realms with Arm CCA. The
CCA architecture uses the address bit to differentiate between
shared and private addresses
- Spectre-BHB: assume CPUs Linux doesn't know about vulnerable by
default
Memory management clean-ups:
- Drop the P*D_TABLE_BIT definition in preparation for 128-bit PTEs
- Some minor page table accessor clean-ups
- PIE/POE (permission indirection/overlay) helpers clean-up
Kselftests:
- MTE: skip hugetlb tests if MTE is not supported on such mappings
and user correct naming for sync/async tag checking modes
Miscellaneous:
- Add a PKEY_UNRESTRICTED definition as 0 to uapi (toolchain people
request)
- Sysreg updates for new register fields
- CPU type info for some Qualcomm Kryo cores"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (72 commits)
arm64: mm: Don't use %pK through printk
perf/arm_cspmu: Fix missing io.h include
arm64: errata: Add newer ARM cores to the spectre_bhb_loop_affected() lists
arm64: cputype: Add MIDR_CORTEX_A76AE
arm64: errata: Add KRYO 2XX/3XX/4XX silver cores to Spectre BHB safe list
arm64: errata: Assume that unknown CPUs _are_ vulnerable to Spectre BHB
arm64: errata: Add QCOM_KRYO_4XX_GOLD to the spectre_bhb_k24_list
arm64/sysreg: Enforce whole word match for open/close tokens
arm64/sysreg: Fix unbalanced closing block
arm64: Kconfig: Enable HOTPLUG_SMT
arm64: topology: Support SMT control on ACPI based system
arch_topology: Support SMT control for OF based system
cpu/SMT: Provide a default topology_is_primary_thread()
arm64/mm: Define PTDESC_ORDER
perf/arm_cspmu: Add PMEVFILT2R support
perf/arm_cspmu: Generalise event filtering
perf/arm_cspmu: Move register definitons to header
arm64/kernel: Always use level 2 or higher for early mappings
arm64/mm: Drop PXD_TABLE_BIT
arm64/mm: Check pmd_table() in pmd_trans_huge()
...
|
|
'for-next/sysreg', 'for-next/misc', 'for-next/pgtable-cleanups', 'for-next/kselftest', 'for-next/uaccess-mops', 'for-next/pie-poe-cleanup', 'for-next/cputype-kryo', 'for-next/cca-dma-address', 'for-next/drop-pxd_table_bit' and 'for-next/spectre-bhb-assume-vulnerable', remote-tracking branch 'arm64/for-next/perf' into for-next/core
* arm64/for-next/perf:
perf/arm_cspmu: Fix missing io.h include
perf/arm_cspmu: Add PMEVFILT2R support
perf/arm_cspmu: Generalise event filtering
perf/arm_cspmu: Move register definitons to header
drivers/perf: apple_m1: Support host/guest event filtering
drivers/perf: apple_m1: Refactor event select/filter configuration
perf/dwc_pcie: fix duplicate pci_dev devices
perf/dwc_pcie: fix some unreleased resources
perf/arm-cmn: Minor event type housekeeping
perf: arm_pmu: Move PMUv3-specific data
perf: apple_m1: Don't disable counter in m1_pmu_enable_event()
perf: arm_v7_pmu: Don't disable counter in (armv7|krait_|scorpion_)pmu_enable_event()
perf: arm_v7_pmu: Drop obvious comments for enabling/disabling counters and interrupts
perf: arm_pmuv3: Don't disable counter in armv8pmu_enable_event()
perf: arm_pmu: Don't disable counter in armpmu_add()
perf: arm_pmuv3: Call kvm_vcpu_pmu_resync_el0() before enabling counters
perf: arm_pmuv3: Add support for ARM Rainier PMU
* for-next/amuv1-avg-freq:
: Add support for AArch64 AMUv1-based average freq
arm64: Utilize for_each_cpu_wrap for reference lookup
arm64: Update AMU-based freq scale factor on entering idle
arm64: Provide an AMU-based version of arch_freq_get_on_cpu
cpufreq: Introduce an optional cpuinfo_avg_freq sysfs entry
cpufreq: Allow arch_freq_get_on_cpu to return an error
arch_topology: init capacity_freq_ref to 0
* for-next/pkey_unrestricted:
: mm/pkey: Add PKEY_UNRESTRICTED macro
selftest/powerpc/mm/pkey: fix build-break introduced by commit 00894c3fc917
selftests/powerpc: Use PKEY_UNRESTRICTED macro
selftests/mm: Use PKEY_UNRESTRICTED macro
mm/pkey: Add PKEY_UNRESTRICTED macro
* for-next/sysreg:
: arm64 sysreg updates
arm64/sysreg: Enforce whole word match for open/close tokens
arm64/sysreg: Fix unbalanced closing block
arm64/sysreg: Add register fields for HFGWTR2_EL2
arm64/sysreg: Add register fields for HFGRTR2_EL2
arm64/sysreg: Add register fields for HFGITR2_EL2
arm64/sysreg: Add register fields for HDFGWTR2_EL2
arm64/sysreg: Add register fields for HDFGRTR2_EL2
arm64/sysreg: Update register fields for ID_AA64MMFR0_EL1
* for-next/misc:
: Miscellaneous arm64 patches
arm64: mm: Don't use %pK through printk
arm64/fpsimd: Remove unused declaration fpsimd_kvm_prepare()
* for-next/pgtable-cleanups:
: arm64 pgtable accessors cleanup
arm64/mm: Define PTDESC_ORDER
arm64/kernel: Always use level 2 or higher for early mappings
arm64/hugetlb: Consistently use pud_sect_supported()
arm64/mm: Convert __pte_to_phys() and __phys_to_pte_val() as functions
* for-next/kselftest:
: arm64 kselftest updates
kselftest/arm64: mte: Skip the hugetlb tests if MTE not supported on such mappings
kselftest/arm64: mte: Use the correct naming for tag check modes in check_hugetlb_options.c
* for-next/uaccess-mops:
: Implement the uaccess memory copy/set using MOPS instructions
arm64: lib: Use MOPS for usercopy routines
arm64: mm: Handle PAN faults on uaccess CPY* instructions
arm64: extable: Add fixup handling for uaccess CPY* instructions
* for-next/pie-poe-cleanup:
: PIE/POE helpers cleanup
arm64/sysreg: Move POR_EL0_INIT to asm/por.h
arm64/sysreg: Rename POE_RXW to POE_RWX
arm64/sysreg: Improve PIR/POR helpers
* for-next/cputype-kryo:
: Add cputype info for some Qualcomm Kryo cores
arm64: cputype: Add comments about Qualcomm Kryo 5XX and 6XX cores
arm64: cputype: Add QCOM_CPU_PART_KRYO_3XX_GOLD
* for-next/cca-dma-address:
: Fix DMA address for devices used in realms with Arm CCA
arm64: realm: Use aliased addresses for device DMA to shared buffers
dma: Introduce generic dma_addr_*crypted helpers
dma: Fix encryption bit clearing for dma_to_phys
* for-next/drop-pxd_table_bit:
: Drop the arm64 PXD_TABLE_BIT (clean-up in preparation for 128-bit PTEs)
arm64/mm: Drop PXD_TABLE_BIT
arm64/mm: Check pmd_table() in pmd_trans_huge()
arm64/mm: Check PUD_TYPE_TABLE in pud_bad()
arm64/mm: Check PXD_TYPE_TABLE in [p4d|pgd]_bad()
arm64/mm: Clear PXX_TYPE_MASK and set PXD_TYPE_SECT in [pmd|pud]_mkhuge()
arm64/mm: Clear PXX_TYPE_MASK in mk_[pmd|pud]_sect_prot()
arm64/ptdump: Test PMD_TYPE_MASK for block mapping
KVM: arm64: ptdump: Test PMD_TYPE_MASK for block mapping
* for-next/spectre-bhb-assume-vulnerable:
: Rework Spectre BHB mitigations to not assume "safe"
arm64: errata: Add newer ARM cores to the spectre_bhb_loop_affected() lists
arm64: cputype: Add MIDR_CORTEX_A76AE
arm64: errata: Add KRYO 2XX/3XX/4XX silver cores to Spectre BHB safe list
arm64: errata: Assume that unknown CPUs _are_ vulnerable to Spectre BHB
arm64: errata: Add QCOM_KRYO_4XX_GOLD to the spectre_bhb_k24_list
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer cleanups from Thomas Gleixner:
"A treewide hrtimer timer cleanup
hrtimers are initialized with hrtimer_init() and a subsequent store to
the callback pointer. This turned out to be suboptimal for the
upcoming Rust integration and is obviously a silly implementation to
begin with.
This cleanup replaces the hrtimer_init(T); T->function = cb; sequence
with hrtimer_setup(T, cb);
The conversion was done with Coccinelle and a few manual fixups.
Once the conversion has completely landed in mainline, hrtimer_init()
will be removed and the hrtimer::function becomes a private member"
* tag 'timers-cleanups-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (100 commits)
wifi: rt2x00: Switch to use hrtimer_update_function()
io_uring: Use helper function hrtimer_update_function()
serial: xilinx_uartps: Use helper function hrtimer_update_function()
ASoC: fsl: imx-pcm-fiq: Switch to use hrtimer_setup()
RDMA: Switch to use hrtimer_setup()
virtio: mem: Switch to use hrtimer_setup()
drm/vmwgfx: Switch to use hrtimer_setup()
drm/xe/oa: Switch to use hrtimer_setup()
drm/vkms: Switch to use hrtimer_setup()
drm/msm: Switch to use hrtimer_setup()
drm/i915/request: Switch to use hrtimer_setup()
drm/i915/uncore: Switch to use hrtimer_setup()
drm/i915/pmu: Switch to use hrtimer_setup()
drm/i915/perf: Switch to use hrtimer_setup()
drm/i915/gvt: Switch to use hrtimer_setup()
drm/i915/huc: Switch to use hrtimer_setup()
drm/amdgpu: Switch to use hrtimer_setup()
stm class: heartbeat: Switch to use hrtimer_setup()
i2c: Switch to use hrtimer_setup()
iio: Switch to use hrtimer_setup()
...
|
|
* kvm-arm64/pmu-fixes:
: vPMU fixes for 6.15 courtesy of Akihiko Odaki
:
: Various fixes to KVM's vPMU implementation, notably ensuring
: userspace-directed changes to the PMCs are reflected in the backing perf
: events.
KVM: arm64: PMU: Reload when resetting
KVM: arm64: PMU: Reload when user modifies registers
KVM: arm64: PMU: Fix SET_ONE_REG for vPMC regs
KVM: arm64: PMU: Assume PMU presence in pmu-emul.c
KVM: arm64: PMU: Set raw values from user to PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR}
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
* kvm-arm64/pkvm-6.15:
: pKVM updates for 6.15
:
: - SecPageTable stats for stage-2 table pages allocated by the protected
: hypervisor (Vincent Donnefort)
:
: - HCRX_EL2 trap + vCPU initialization fixes for pKVM (Fuad Tabba)
KVM: arm64: Create each pKVM hyp vcpu after its corresponding host vcpu
KVM: arm64: Factor out pKVM hyp vcpu creation to separate function
KVM: arm64: Initialize HCRX_EL2 traps in pKVM
KVM: arm64: Factor out setting HCRX_EL2 traps into separate function
KVM: arm64: Count pKVM stage-2 usage in secondary pagetable stats
KVM: arm64: Distinct pKVM teardown memcache for stage-2
KVM: arm64: Add flags to kvm_hyp_memcache
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
* kvm-arm64/writable-midr:
: Writable implementation ID registers, courtesy of Sebastian Ott
:
: Introduce a new capability that allows userspace to set the
: ID registers that identify a CPU implementation: MIDR_EL1, REVIDR_EL1,
: and AIDR_EL1. Also plug a hole in KVM's trap configuration where
: SMIDR_EL1 was readable at EL1, despite the fact that KVM does not
: support SME.
KVM: arm64: Fix documentation for KVM_CAP_ARM_WRITABLE_IMP_ID_REGS
KVM: arm64: Copy MIDR_EL1 into hyp VM when it is writable
KVM: arm64: Copy guest CTR_EL0 into hyp VM
KVM: selftests: arm64: Test writes to MIDR,REVIDR,AIDR
KVM: arm64: Allow userspace to change the implementation ID registers
KVM: arm64: Load VPIDR_EL2 with the VM's MIDR_EL1 value
KVM: arm64: Maintain per-VM copy of implementation ID regs
KVM: arm64: Set HCR_EL2.TID1 unconditionally
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|