path: root/arch/arm64/include
2024-11-11  Merge branch kvm-arm64/misc into kvmarm/next  (Oliver Upton)

* kvm-arm64/misc:
  : Miscellaneous updates
  :
  : - Drop useless check against vgic state in ICC_CTLR_EL1.SEIS read
  :   emulation
  :
  : - Fix trap configuration for pKVM
  :
  : - Close the door on initialization bugs surrounding userspace irqchip
  :   static key by removing it.
  KVM: selftests: Don't bother deleting memslots in KVM when freeing VMs
  KVM: arm64: Get rid of userspace_irqchip_in_use
  KVM: arm64: Initialize trap register values in hyp in pKVM
  KVM: arm64: Initialize the hypervisor's VM state at EL2
  KVM: arm64: Refactor kvm_vcpu_enable_ptrauth() for hyp use
  KVM: arm64: Move pkvm_vcpu_init_traps() to init_pkvm_hyp_vcpu()
  KVM: arm64: Don't map 'kvm_vgic_global_state' at EL2 with pKVM
  KVM: arm64: Just advertise SEIS as 0 when emulating ICC_CTLR_EL1

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>

2024-11-11  Merge branch kvm-arm64/mpam-ni into kvmarm/next  (Oliver Upton)

* kvm-arm64/mpam-ni:
  : Hiding FEAT_MPAM from KVM guests, courtesy of James Morse + Joey Gouly
  :
  : Fix a longstanding bug where FEAT_MPAM was accidentally exposed to KVM
  : guests + the EL2 trap configuration was not explicitly configured. As
  : part of this, bring in skeletal support for initialising the MPAM CPU
  : context so KVM can actually set traps for its guests.
  :
  : Be warned -- if this series leads to boot failures on your system,
  : you're running on turd firmware.
  :
  : As an added bonus (that builds upon the infrastructure added by the
  : MPAM series), allow userspace to configure CTR_EL0.L1Ip, courtesy of
  : Shameer Kolothum.
  KVM: arm64: Make L1Ip feature in CTR_EL0 writable from userspace
  KVM: arm64: selftests: Test ID_AA64PFR0.MPAM isn't completely ignored
  KVM: arm64: Disable MPAM visibility by default and ignore VMM writes
  KVM: arm64: Add a macro for creating filtered sys_reg_descs entries
  KVM: arm64: Fix missing traps of guest accesses to the MPAM registers
  arm64: cpufeature: discover CPU support for MPAM
  arm64: head.S: Initialise MPAM EL2 registers and disable traps
  arm64/sysreg: Convert existing MPAM sysregs and add the remaining entries

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>

2024-11-11  Merge branch kvm-arm64/psci-1.3 into kvmarm/next  (Oliver Upton)

* kvm-arm64/psci-1.3:
  : PSCI v1.3 support, courtesy of David Woodhouse
  :
  : Bump KVM's PSCI implementation up to v1.3, with the added bonus of
  : implementing the SYSTEM_OFF2 call. Like other system-scoped PSCI calls,
  : this gets relayed to userspace for further processing with a new
  : KVM_SYSTEM_EVENT_SHUTDOWN flag.
  :
  : As an added bonus, implement client-side support for hibernation with
  : the SYSTEM_OFF2 call.
  arm64: Use SYSTEM_OFF2 PSCI call to power off for hibernate
  KVM: arm64: nvhe: Pass through PSCI v1.3 SYSTEM_OFF2 call
  KVM: selftests: Add test for PSCI SYSTEM_OFF2
  KVM: arm64: Add support for PSCI v1.2 and v1.3
  KVM: arm64: Add PSCI v1.3 SYSTEM_OFF2 function for hibernation
  firmware/psci: Add definitions for PSCI v1.3 specification

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>

2024-11-10  Merge tag 'mm-hotfixes-stable-2024-11-09-22-40' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm  (Linus Torvalds)

Pull misc fixes from Andrew Morton:
 "20 hotfixes, 14 of which are cc:stable. Three affect DAMON.

  Lorenzo's five-patch series to address the mmap_region error handling
  is here also.

  Apart from that, various singletons"

* tag 'mm-hotfixes-stable-2024-11-09-22-40' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mailmap: add entry for Thorsten Blum
  ocfs2: remove entry once instead of null-ptr-dereference in ocfs2_xa_remove()
  signal: restore the override_rlimit logic
  fs/proc: fix compile warning about variable 'vmcore_mmap_ops'
  ucounts: fix counter leak in inc_rlimit_get_ucounts()
  selftests: hugetlb_dio: check for initial conditions to skip in the start
  mm: fix docs for the kernel parameter ``thp_anon=``
  mm/damon/core: avoid overflow in damon_feed_loop_next_input()
  mm/damon/core: handle zero schemes apply interval
  mm/damon/core: handle zero {aggregation,ops_update} intervals
  mm/mlock: set the correct prev on failure
  objpool: fix to make percpu slot allocation more robust
  mm/page_alloc: keep track of free highatomic
  mm: resolve faulty mmap_region() error path behaviour
  mm: refactor arch_calc_vm_flag_bits() and arm64 MTE handling
  mm: refactor map_deny_write_exec()
  mm: unconditionally close VMAs on error
  mm: avoid unsafe VMA hook invocation when error arises on mmap hook
  mm/thp: fix deferred split unqueue naming and locking
  mm/thp: fix deferred split queue not partially_mapped

2024-11-08  Merge tag 'kvm-riscv-6.13-1' of https://github.com/kvm-riscv/linux into HEAD  (Paolo Bonzini)

KVM/riscv changes for 6.13

- Accelerate KVM RISC-V when running as a guest
- Perf support to collect KVM guest statistics from host side

2024-11-08  arm64/scs: Drop unused prototype __pi_scs_patch_vmlinux()  (Ard Biesheuvel)

The function scs_patch_vmlinux() was removed in the LPA2 boot code
refactoring, so remove the declaration as well.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Tested-by: Sami Tolvanen <samitolvanen@google.com>
Link: https://lore.kernel.org/r/20241106185513.3096442-8-ardb+git@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

2024-11-08  arm64/scs: Fix handling of DWARF augmentation data in CIE/FDE frames  (Ard Biesheuvel)

The dynamic SCS patching code pretends to parse the DWARF augmentation
data in the CIE (header) frame, and to handle the individual FDE frames
accordingly based on that CIE frame. However, the boolean variable is
defined inside the loop, and so the parsed value is ignored. The same
applies to the code alignment field, which is also read from the header
but then discarded.

This was never spotted before because Clang is the only compiler that
supports dynamic SCS patching (which is essentially an Android
feature), and the unwind tables it produces are highly uniform, and
match the de facto defaults.

So instead of testing for the 'z' flag in the augmentation data field,
require a fixed augmentation data string of 'zR', and simplify the rest
of the code accordingly.

Also introduce some error codes to specify why the patching failed, and
log them to the kernel console when patching fails while loading a
module. (Doing so for vmlinux is infeasible, as the patching is done
extremely early in the boot.)

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Tested-by: Sami Tolvanen <samitolvanen@google.com>
Link: https://lore.kernel.org/r/20241106185513.3096442-6-ardb+git@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

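The resulting check can be as simple as the following sketch; the
function and parameter names here are illustrative, not the patch's
actual identifiers:

```c
#include <linux/string.h>

/*
 * Illustrative sketch: rather than parsing the 'z' flag and its optional
 * fields, accept only the fixed augmentation string "zR" that Clang's
 * uniform unwind tables actually emit; anything else is rejected.
 */
static bool cie_augmentation_supported(const char *augmentation)
{
	return strcmp(augmentation, "zR") == 0;
}
```
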
2024-11-07  arch: introduce set_direct_map_valid_noflush()  (Mike Rapoport (Microsoft))

Add an API that will allow updates of the direct/linear map for a set
of physically contiguous pages. It will be used in the following
patches.

Link: https://lkml.kernel.org/r/20241023162711.2579610-6-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Tested-by: kdevops <kdevops@lists.linux.dev>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Brian Cain <bcain@quicinc.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Song Liu <song@kernel.org>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

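A sketch of the new API's arm64 shape, modelled on the existing
set_direct_map_{default,invalid}_noflush() helpers; treat the body as
an assumption rather than the exact patch:

```c
#include <linux/mm.h>
#include <asm/set_memory.h>

/*
 * Sketch: flip the validity of 'nr' physically contiguous pages in the
 * linear map in one call, without a TLB flush (the caller batches that).
 */
int set_direct_map_valid_noflush(struct page *page, unsigned nr, bool valid)
{
	unsigned long addr = (unsigned long)page_address(page);

	/* Nothing to do if the linear map isn't page-granular */
	if (!can_set_direct_map())
		return 0;

	return set_memory_valid(addr, nr, valid);
}
```
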
2024-11-07  asm-generic: introduce text-patching.h  (Mike Rapoport (Microsoft))

Several architectures support text patching, but they name the header
files that declare patching functions differently.

Make all such headers consistently named text-patching.h and add an
empty header in asm-generic for architectures that do not support text
patching.

Link: https://lkml.kernel.org/r/20241023162711.2579610-4-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Tested-by: kdevops <kdevops@lists.linux.dev>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Brian Cain <bcain@quicinc.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Song Liu <song@kernel.org>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

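For callers the change is mechanical; for example (arm64's header was
previously <asm/patching.h>):

```c
/* Before: each architecture used its own header name */
#include <asm/patching.h>		/* arm64 */

/* After: one consistent name on every architecture, with an empty
 * asm-generic fallback for architectures without text patching */
#include <asm/text-patching.h>
```
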
2024-11-06  kaslr: rename physmem_end and PHYSMEM_END to direct_map_physmem_end  (John Hubbard)

For clarity. It's increasingly hard to reason about the code when KASLR
is moving around the boundaries. In this case, where KASLR is
randomizing the location of the kernel image within physical memory,
the maximum number of address bits for physical memory has not changed.
What has changed is the ending address of memory that is allowed to be
directly mapped by the kernel.

Let's name the variable, and the associated macro, accordingly.

Also, enhance the comment above the direct_map_physmem_end definition
to further clarify how this all works.

Link: https://lkml.kernel.org/r/20241009025024.89813-1-jhubbard@nvidia.com
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Jordan Niethe <jniethe@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

2024-11-06  ACPI: processor: Move arch_init_invariance_cppc() call later  (Mario Limonciello)

arch_init_invariance_cppc() is called at the end of
acpi_cppc_processor_probe() in order to configure frequency invariance
based upon the values from _CPC.

This however doesn't work on AMD CPPC shared memory designs that have
AMD preferred cores enabled, because _CPC needs to be analyzed from all
cores to judge if preferred cores are enabled.

This issue manifests to users as a warning since commit 21fb59ab4b97
("ACPI: CPPC: Adjust debug messages in amd_set_max_freq_ratio() to
warn"):

```
Could not retrieve highest performance (-19)
```

However, the warning isn't the cause of this; it was actually commit
279f838a61f9 ("x86/amd: Detect preferred cores in
amd_get_boost_ratio_numerator()") that exposed the issue.

To fix this problem, change arch_init_invariance_cppc() into a new weak
symbol that is called at the end of acpi_processor_driver_init(). Each
architecture that supports it can declare the symbol to override the
weak one. Define it for x86, in arch/x86/kernel/acpi/cppc.c, and for
all of the architectures using the generic arch_topology.c code.

Fixes: 279f838a61f9 ("x86/amd: Detect preferred cores in amd_get_boost_ratio_numerator()")
Reported-by: Ivan Shapovalov <intelfx@intelfx.name>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219431
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20241104222855.3959267-1-superm1@kernel.org
[ rjw: Changelog edit ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

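The weak-symbol pattern described here looks roughly like this (a
sketch, not the verbatim patch):

```c
/* Generic no-op; any architecture may provide a strong definition. */
void __weak arch_init_invariance_cppc(void)
{
}

static int __init acpi_processor_driver_init(void)
{
	/* ... driver registration elided ... */

	/* Now called once, after every core's _CPC has been probed, so
	 * AMD preferred-core detection sees the full picture. */
	arch_init_invariance_cppc();
	return 0;
}
```
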
2024-11-05  mm: refactor arch_calc_vm_flag_bits() and arm64 MTE handling  (Lorenzo Stoakes)

Currently MTE is permitted in two circumstances (desiring to use MTE
having been specified by the VM_MTE flag): where MAP_ANONYMOUS is
specified, as checked by arch_calc_vm_flag_bits() and actualised by
setting the VM_MTE_ALLOWED flag, or if the file backing the mapping is
shmem, in which case we set VM_MTE_ALLOWED in shmem_mmap() when the
mmap hook is activated in mmap_region().

The function that checks that, if VM_MTE is set, VM_MTE_ALLOWED is also
set is the arm64 implementation of arch_validate_flags().

Unfortunately, we intend to refactor mmap_region() to perform this
check earlier, meaning that in the case of a shmem backing we will not
have invoked shmem_mmap() yet, causing the mapping to fail spuriously.

It is inappropriate to set this architecture-specific flag in general
mm code anyway, so a sensible resolution of this issue is to instead
move the check somewhere else.

We resolve this by setting VM_MTE_ALLOWED much earlier in do_mmap(),
via the arch_calc_vm_flag_bits() call. This is an appropriate place to
do this as we already check for the MAP_ANONYMOUS case here, and the
shmem file case is simply a variant of the same idea - we permit
RAM-backed memory.

This requires a modification to the arch_calc_vm_flag_bits() signature
to pass in a pointer to the struct file associated with the mapping;
however, this is not too egregious as it is only used by two
architectures anyway - arm64 and parisc.

So this patch performs this adjustment and removes the unnecessary
assignment of VM_MTE_ALLOWED in shmem_mmap().

[akpm@linux-foundation.org: fix whitespace, per Catalin]
Link: https://lkml.kernel.org/r/ec251b20ba1964fb64cf1607d2ad80c47f3873df.1730224667.git.lorenzo.stoakes@oracle.com
Fixes: deb0f6562884 ("mm/mmap: undo ->mmap() when arch_validate_flags() fails")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Jann Horn <jannh@google.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Helge Deller <deller@gmx.de>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

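A sketch of the resulting arm64 helper with the new file argument; the
body is inferred from the description above, so treat the details as
assumptions:

```c
/* arch/arm64/include/asm/mman.h (sketch) */
static inline unsigned long arch_calc_vm_flag_bits(struct file *file,
						   unsigned long flags)
{
	/*
	 * Permit MTE only for anonymous or shmem mappings: both are
	 * guaranteed to be backed by tags-capable, RAM-backed memory.
	 */
	if (system_supports_mte() &&
	    ((flags & MAP_ANONYMOUS) || shmem_file(file)))
		return VM_MTE_ALLOWED;

	return 0;
}
```

With this in place, arch_validate_flags() can run before the ->mmap()
hook without spuriously failing shmem-backed VM_MTE mappings.
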
2024-11-05  arm64: pgtable: Warn unexpected pmdp_test_and_clear_young()  (Yicong Yang)

Young bit operations on a PMD table entry are only supported if
FEAT_HAFT is enabled system wide. Add a warning to flag the
misbehaviour.

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20241102104235.62560-6-yangyicong@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

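A sketch of the guarded operation (assumed placement in arm64's
pgtable.h; the pmd_table() filter keeps THP block entries, which never
needed FEAT_HAFT, out of the warning):

```c
/* Sketch: warn if a young-bit op hits a PMD *table* entry while the
 * hardware AF-for-table feature isn't available system wide. */
static inline int pmdp_test_and_clear_young(struct vm_area_struct *vma,
					    unsigned long address,
					    pmd_t *pmdp)
{
	VM_WARN_ON(pmd_table(READ_ONCE(*pmdp)) && !system_supports_haft());
	return __ptep_test_and_clear_young(vma, address, (pte_t *)pmdp);
}
```
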
2024-11-05  arm64: Enable ARCH_HAS_NONLEAF_PMD_YOUNG  (Yicong Yang)

With the support of FEAT_HAFT, NONLEAF_PMD_YOUNG can be enabled on
arm64, since the hardware is capable of updating the AF flag for a PMD
table descriptor. Since the AF bit of the table descriptor shares the
same bit position as in block descriptors, we only need to implement
arch_has_hw_nonleaf_pmd_young() and select the related configs. The
related pmd_young test/update operations stay the same as those already
implemented for transparent hugepage support.

Currently ARCH_HAS_NONLEAF_PMD_YOUNG is used to improve the efficiency
of lru-gen aging.

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20241102104235.62560-5-yangyicong@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

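Given that description, the arm64 wiring can be as small as (sketch):

```c
/* arch/arm64/include/asm/pgtable.h (sketch): non-leaf PMD AF updates
 * are usable exactly when FEAT_HAFT is enabled system wide. */
#define arch_has_hw_nonleaf_pmd_young	system_supports_haft
```
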
2024-11-05  arm64: Add support for FEAT_HAFT  (Yicong Yang)

Armv8.9/v9.4 introduces the feature Hardware managed Access Flag for
Table descriptors (FEAT_HAFT). The feature is indicated by
ID_AA64MMFR1_EL1.HAFDBS == 0b0011 and can be enabled by TCR2_EL1.HAFT,
so it has a dependency on FEAT_TCR2.

Add the Kconfig for FEAT_HAFT and support for detecting and enabling
the feature. The feature is enabled in __cpu_setup() before the MMU is
on, just like HA. A CPU capability is added to notify the user of the
feature.

Add definitions of the P{G,4,U,M}D_TABLE_AF bit and set the AF bit when
creating the page table, which will save the hardware from having to
update them at runtime. This will be ignored if FEAT_HAFT is not
enabled.

The AF bit of table descriptors cannot be managed by the software per
spec, unlike HA. So this should be used only if it's supported system
wide, as reported by system_supports_haft().

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20241102104235.62560-4-yangyicong@huawei.com
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
[catalin.marinas@arm.com: added the ID check back to __cpu_setup in case of future CPU errata]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

2024-11-05  arm64/mm: Sanity check PTE address before runtime P4D/PUD folding  (Ard Biesheuvel)

The runtime P4D/PUD folding logic assumes that the respective pgd_t*
and p4d_t* arguments are pointers into actual page tables that are part
of the hierarchy being operated on.

This may not always be the case, and we have been bitten once by this
already [0], where the argument was actually a stack variable, and in
this case, the logic does not work at all.

So let's add a VM_BUG_ON() for each case, to ensure that the address of
the provided page table entry is consistent with the address being
translated.

[0] https://lore.kernel.org/all/20240725090345.28461-1-will@kernel.org/T/#u

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20241105093919.1312049-2-ardb+git@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

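The consistency check can be sketched as follows; the helper name is
hypothetical, and the real patch folds the check into the p4d/pud
accessors themselves:

```c
#include <linux/mmdebug.h>

/*
 * Hypothetical sketch: the entry pointer's index within its table must
 * match the index that 'addr' translates to at this level, which cannot
 * hold for a stray stack variable passed in as 'pgdp'.
 */
static inline void check_pgd_entry(pgd_t *pgdp, unsigned long addr)
{
	VM_BUG_ON((((u64)pgdp / sizeof(pgd_t)) % PTRS_PER_PGD) !=
		  ((addr >> PGDIR_SHIFT) % PTRS_PER_PGD));
}
```
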
2024-11-04  jump_label: adjust inline asm to be consistent  (Alice Ryhl)

To avoid duplication of inline asm between C and Rust, we need to
import the inline asm from the relevant `jump_label.h` header into
Rust. To make that easier, this patch updates the header files to
expose the inline asm via a new ARCH_STATIC_BRANCH_ASM macro.

The header files are all updated to define an ARCH_STATIC_BRANCH_ASM
that takes the same arguments in a consistent order so that Rust can
use the same logic for every architecture.

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Andreas Hindborg <a.hindborg@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Uros Bizjak <ubizjak@gmail.com>
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Anup Patel <apatel@ventanamicro.com>
Cc: Andrew Jones <ajones@ventanamicro.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Conor Dooley <conor.dooley@microchip.com>
Cc: Samuel Holland <samuel.holland@sifive.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Tianrui Zhao <zhaotianrui@loongson.cn>
Cc: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/20241030-tracepoint-v12-4-eec7f0f8ad22@google.com
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Co-developed-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com> # RISC-V
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

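A sketch of what the arm64 flavour looks like after the change (details
assumed; the point is that the whole asm body now lives behind one
macro that Rust can emit with the same arguments):

```c
/* Sketch: a patchable NOP plus a __jump_table entry recording the
 * instruction address, the branch target, and the key. */
#define JUMP_TABLE_ENTRY(key, label)			\
	".pushsection	__jump_table, \"aw\"\n\t"	\
	".align		3\n\t"				\
	".long		1b - ., " label " - .\n\t"	\
	".quad		" key " - .\n\t"		\
	".popsection\n\t"

#define ARCH_STATIC_BRANCH_ASM(key, label)		\
	"1:	nop\n\t"				\
	JUMP_TABLE_ENTRY(key, label)
```
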
2024-11-04  arm64/mm: Drop setting PTE_TYPE_PAGE in pte_mkcont()  (Anshuman Khandual)

PTE_TYPE_PAGE bits were being set in pte_mkcont() because PTE_TABLE_BIT
was being cleared in pte_mkhuge(). But after the arch_make_huge_pte()
modification in commit f8192813dcbe ("arm64/mm: Re-organize
arch_make_huge_pte()"), which dropped pte_mkhuge() completely, setting
back the PTE_TYPE_PAGE bits is no longer necessary. Change pte_mkcont()
to only set PTE_CONT.

Cc: Will Deacon <will@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20241104041617.3804617-1-anshuman.khandual@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

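After this change the helper presumably reduces to (sketch):

```c
/* Sketch: only the contiguous hint is set; the entry type is left
 * alone now that pte_mkhuge() no longer clears PTE_TABLE_BIT. */
static inline pte_t pte_mkcont(pte_t pte)
{
	return set_pte_bit(pte, __pgprot(PTE_CONT));
}
```
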
2024-11-02  arm64: vdso: Drop LBASE_VDSO  (Thomas Weißschuh)

This constant is always "0", providing no value and making the logic
harder to understand.

Also prepare for a consolidation of the vdso linker script logic by
aligning it with other architectures.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/all/20241010-vdso-generic-base-v1-4-b64f0842d512@linutronix.de

2024-11-01  arm64/mm: Re-organize arch_make_huge_pte()  (Anshuman Khandual)

Core HugeTLB defines a fallback definition for arch_make_huge_pte(),
which calls the platform-provided pte_mkhuge(). But if a platform
already provides an override for arch_make_huge_pte(), then it does not
need to provide the pte_mkhuge() helper. The arm64 override for
arch_make_huge_pte() calls pte_mkhuge() internally, thus creating the
impression that both of these callbacks are being used in core HugeTLB
and hence required to be defined. This drops pte_mkhuge(), which was
never required to begin with, as there could not be any section
mappings at the PTE level.

Re-organize arch_make_huge_pte() based on the requested page size and
create the entry for the applicable page table level as needed. This
also removes the redundancy of clearing the PTE_TABLE_BIT followed by
setting both the PTE_TABLE_BIT and PTE_VALID bits (via PTE_TYPE_MASK)
in the pte while creating CONT_PTE_SIZE entries.

Cc: Will Deacon <will@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20241029044529.2624785-1-anshuman.khandual@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

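A sketch of the size-driven shape described above (exact level helpers
and case labels are assumptions):

```c
/* Sketch: build the entry for whichever level the requested hugepage
 * shift implies, instead of funnelling everything through pte_mkhuge(). */
pte_t arch_make_huge_pte(pte_t entry, unsigned int shift, vm_flags_t flags)
{
	switch (shift) {
#ifndef __PAGETABLE_PMD_FOLDED
	case PUD_SHIFT:
		entry = pud_pte(pud_mkhuge(pte_pud(entry)));
		break;
#endif
	case CONT_PMD_SHIFT:
		entry = pmd_pte(pmd_mkcont(pte_pmd(entry)));
		break;
	case PMD_SHIFT:
		entry = pmd_pte(pmd_mkhuge(pte_pmd(entry)));
		break;
	case CONT_PTE_SHIFT:
		entry = pte_mkcont(entry);
		break;
	}
	return entry;
}
```
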
2024-10-31  KVM: arm64: Get rid of userspace_irqchip_in_use  (Raghavendra Rao Ananta)

Improper use of userspace_irqchip_in_use led to syzbot hitting the
following WARN_ON() in kvm_timer_update_irq():

WARNING: CPU: 0 PID: 3281 at arch/arm64/kvm/arch_timer.c:459
kvm_timer_update_irq+0x21c/0x394
Call trace:
  kvm_timer_update_irq+0x21c/0x394 arch/arm64/kvm/arch_timer.c:459
  kvm_timer_vcpu_reset+0x158/0x684 arch/arm64/kvm/arch_timer.c:968
  kvm_reset_vcpu+0x3b4/0x560 arch/arm64/kvm/reset.c:264
  kvm_vcpu_set_target arch/arm64/kvm/arm.c:1553 [inline]
  kvm_arch_vcpu_ioctl_vcpu_init arch/arm64/kvm/arm.c:1573 [inline]
  kvm_arch_vcpu_ioctl+0x112c/0x1b3c arch/arm64/kvm/arm.c:1695
  kvm_vcpu_ioctl+0x4ec/0xf74 virt/kvm/kvm_main.c:4658
  vfs_ioctl fs/ioctl.c:51 [inline]
  __do_sys_ioctl fs/ioctl.c:907 [inline]
  __se_sys_ioctl fs/ioctl.c:893 [inline]
  __arm64_sys_ioctl+0x108/0x184 fs/ioctl.c:893
  __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
  invoke_syscall+0x78/0x1b8 arch/arm64/kernel/syscall.c:49
  el0_svc_common+0xe8/0x1b0 arch/arm64/kernel/syscall.c:132
  do_el0_svc+0x40/0x50 arch/arm64/kernel/syscall.c:151
  el0_svc+0x54/0x14c arch/arm64/kernel/entry-common.c:712
  el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
  el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598

The following sequence led to the scenario:

 - Userspace creates a VM and a vCPU.
 - The vCPU is initialized with KVM_ARM_VCPU_PMU_V3 during
   KVM_ARM_VCPU_INIT.
 - Without any other setup, such as vGIC or vPMU, userspace issues
   KVM_RUN on the vCPU. Since the vPMU is requested, but not setup,
   kvm_arm_pmu_v3_enable() fails in kvm_arch_vcpu_run_pid_change().
   As a result, KVM_RUN returns after enabling the timer, but before
   incrementing 'userspace_irqchip_in_use':

   kvm_arch_vcpu_run_pid_change()
       ret = kvm_arm_pmu_v3_enable()
           if (!vcpu->arch.pmu.created)
               return -EINVAL;
       if (ret)
           return ret;
       [...]
       if (!irqchip_in_kernel(kvm))
           static_branch_inc(&userspace_irqchip_in_use);

 - Userspace ignores the error and issues KVM_ARM_VCPU_INIT again.
   Since the timer is already enabled, control moves through the
   following flow, ultimately hitting the WARN_ON():

   kvm_timer_vcpu_reset()
       if (timer->enabled)
           kvm_timer_update_irq()
               if (!userspace_irqchip())
                   ret = kvm_vgic_inject_irq()
                       ret = vgic_lazy_init()
                           if (unlikely(!vgic_initialized(kvm)))
                               if (kvm->arch.vgic.vgic_model !=
                                   KVM_DEV_TYPE_ARM_VGIC_V2)
                                   return -EBUSY;
                   WARN_ON(ret);

Theoretically, since userspace_irqchip_in_use's functionality can be
simply replaced by '!irqchip_in_kernel()', get rid of the static key to
avoid the mismanagement, which also helps with the syzbot issue.

Cc: <stable@vger.kernel.org>
Reported-by: syzbot <syzkaller@googlegroups.com>
Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>

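Since the static key's only real information is already available
per-VM, the replacement predicate can be sketched as (illustrative;
KVM's actual helper may differ in name and placement):

```c
/* Sketch: a userspace irqchip is simply the absence of an in-kernel
 * one -- no global refcounted static key required. */
static inline bool userspace_irqchip(struct kvm *kvm)
{
	return unlikely(!irqchip_in_kernel(kvm));
}
```
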
2024-10-31  KVM: arm64: nv: Reinject traps that take effect in Host EL0  (Oliver Upton)

Wire up the other end of traps that affect host EL0 by actually
injecting them into the guest hypervisor. Skip over FGT entirely, as a
cursory glance suggests no FGT is effective in host EL0.

Note that kvm_inject_nested() is already equipped for handling
exceptions while the VM is already in a host context.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241025182354.3364124-9-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>

2024-10-31  KVM: arm64: Describe RES0/RES1 bits of MDCR_EL2  (Oliver Upton)

Add support for sanitising MDCR_EL2 and describe the RES0/RES1 bits
according to the feature set exposed to the VM.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241025182354.3364124-6-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>

2024-10-31  arm64: sysreg: Migrate MDCR_EL2 definition to table  (Oliver Upton)

Migrate MDCR_EL2 over to the sysreg table and align definitions with
DDI0601 2024-09.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241025182354.3364124-4-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>

2024-10-31  KVM: arm64: Refactor kvm_vcpu_enable_ptrauth() for hyp use  (Fuad Tabba)

Move kvm_vcpu_enable_ptrauth() to a shared header to be used by
hypervisor code in protected mode.

No functional change intended.

Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20241018074833.2563674-3-tabba@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>

2024-10-31  KVM: arm64: Move pkvm_vcpu_init_traps() to init_pkvm_hyp_vcpu()  (Fuad Tabba)

Move pkvm_vcpu_init_traps() to the initialization of the hypervisor's
vcpu state in init_pkvm_hyp_vcpu(), and remove the associated
hypercall.

In protected mode, traps need to be initialized whenever a VCPU is
initialized anyway, and not only for protected VMs. This also saves an
unnecessary hypercall.

Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20241018074833.2563674-2-tabba@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>

2024-10-31  KVM: arm64: Fix missing traps of guest accesses to the MPAM registers  (James Morse)

commit 011e5f5bf529f ("arm64/cpufeature: Add remaining feature bits in
ID_AA64PFR0 register") exposed the MPAM field of AA64PFR0_EL1 to
guests, but didn't add trap handling. If you are unlucky, this results
in an MPAM aware guest being delivered an undef during boot. The host
prints:

| kvm [97]: Unsupported guest sys_reg access at: ffff800080024c64 [00000005]
| { Op0( 3), Op1( 0), CRn(10), CRm( 5), Op2( 0), func_read },

Which results in:

| Internal error: Oops - Undefined instruction: 0000000002000000 [#1] PREEMPT SMP
| Modules linked in:
| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.6.0-rc7-00559-gd89c186d50b2 #14616
| Hardware name: linux,dummy-virt (DT)
| pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : test_has_mpam+0x18/0x30
| lr : test_has_mpam+0x10/0x30
| sp : ffff80008000bd90
...
| Call trace:
|  test_has_mpam+0x18/0x30
|  update_cpu_capabilities+0x7c/0x11c
|  setup_cpu_features+0x14/0xd8
|  smp_cpus_done+0x24/0xb8
|  smp_init+0x7c/0x8c
|  kernel_init_freeable+0xf8/0x280
|  kernel_init+0x24/0x1e0
|  ret_from_fork+0x10/0x20
| Code: 910003fd 97ffffde 72001c00 54000080 (d538a500)
| ---[ end trace 0000000000000000 ]---
| Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
| ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---

Add the support to enable the traps, and handle the three guest
accessible registers by injecting an UNDEF. This stops KVM from
spamming the host log, but doesn't yet hide the feature from the id
registers.

With MPAM v1.0 we can trap the MPAMIDR_EL1 register only if
ARM64_HAS_MPAM_HCR; with v1.1 an additional MPAM2_EL2.TIDR bit traps
MPAMIDR_EL1 on platforms that don't have MPAMHCR_EL2. Enable one of
these if either is supported. If neither is supported, the guest can
discover that the CPU has MPAM support, and how many PARTID etc the
host has ... but it can't influence anything, so it's harmless.

Fixes: 011e5f5bf529f ("arm64/cpufeature: Add remaining feature bits in ID_AA64PFR0 register")
CC: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/linux-arm-kernel/20200925160102.118858-1-james.morse@arm.com/
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241030160317.2528209-5-joey.gouly@arm.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>

2024-10-31  arm64: cpufeature: discover CPU support for MPAM  (James Morse)

ARMv8.4 adds support for 'Memory Partitioning And Monitoring' (MPAM),
which describes an interface to cache and bandwidth controls wherever
they appear in the system.

Add support to detect MPAM. Like SVE, MPAM has an extra id register
that describes some more properties, including the virtualisation
support, which is optional. Detect this separately so we can detect
mismatched/insane systems, but still use MPAM on the host even if the
virtualisation support is missing.

MPAM needs enabling at the highest implemented exception level,
otherwise the register accesses trap. The 'enabled' flag is accessible
to lower exception levels, but it's in a register that traps when MPAM
isn't enabled. The cpufeature 'matches' hook is extended to test this
on one of the CPUs, so that firmware can emulate MPAM as disabled if it
is reserved for use by secure world.

Secondary CPUs that appear late could trip cpufeature's 'lower safe'
behaviour after the MPAM properties have been advertised to user-space.
Add a verify call to ensure late secondaries match the existing CPUs.

(If you have a boot failure that bisects here, it's likely your CPUs
advertise MPAM in the id registers, but firmware failed to either
enable MPAM, or emulate the trap as if it were disabled.)

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241030160317.2528209-4-joey.gouly@arm.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>

2024-10-31  arm64: head.S: Initialise MPAM EL2 registers and disable traps  (James Morse)

Add code to head.S's el2_setup to detect MPAM and disable any EL2
traps. This register resets to an unknown value; setting it to the
default partitions/pmg before we enable the MMU is the best thing to
do. Kexec/kdump will depend on this if the previous kernel left the CPU
configured with a restrictive configuration.

If Linux is booted at the highest implemented exception level,
el2_setup will clear the enable bit, disabling MPAM.

This code can't be enabled until a subsequent patch adds the Kconfig
and cpufeature boilerplate.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241030160317.2528209-3-joey.gouly@arm.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>

2024-10-31  arm64/sysreg: Convert existing MPAM sysregs and add the remaining entries  (James Morse)

Move the existing MPAM system register defines from sysreg.h to
tools/sysreg and add the remaining system registers.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241030160317.2528209-2-joey.gouly@arm.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>

2024-10-31  KVM: arm64: Add basic support for POR_EL2  (Marc Zyngier)

S1POE support implies support for POR_EL2, which we provide by:

- adding it to the vcpu_sysreg enum
- advertising it as mapped to its EL1 counterpart in
  get_el2_to_el1_mapping
- wiring it in the sys_reg_desc table with the correct visibility
- handling POR_EL1 in __vcpu_{read,write}_sys_reg_from_cpu()

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241023145345.1613824-32-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>

2024-10-31  KVM: arm64: Add kvm_has_s1poe() helper  (Marc Zyngier)

Just like we have kvm_has_s1pie(), add its S1POE counterpart, making
the code slightly more readable.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241023145345.1613824-31-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>

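Given the existing kvm_has_s1pie() pattern, the new helper is
presumably just (sketch):

```c
/* Sketch: mirror of kvm_has_s1pie(), keyed on the S1POE field of
 * ID_AA64MMFR3_EL1 as sanitised for this VM. */
#define kvm_has_s1poe(k)	\
	(kvm_has_feat((k), ID_AA64MMFR3_EL1, S1POE, IMP))
```
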
2024-10-31  KVM: arm64: Hide S1PIE registers from userspace when disabled for guests  (Mark Brown)

When the guest does not support S1PIE we should not allow any access to
the system registers it adds in order to ensure that we do not create
spurious issues with guest migration. Add a visibility operation for
these registers.

Fixes: 86f9de9db178 ("KVM: arm64: Save/restore PIE registers")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20240822-kvm-arm64-hide-pie-regs-v2-3-376624fa829c@kernel.org
[maz: simplify by using __el2_visibility(), kvm_has_s1pie() throughout]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241023145345.1613824-26-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>

2024-10-31  KVM: arm64: Hide TCR2_EL1 from userspace when disabled for guests  (Mark Brown)

When the guest does not support FEAT_TCR2 we should not allow any
access to it in order to ensure that we do not create spurious issues
with guest migration. Add a visibility operation for it.

Fixes: fbff56068232 ("KVM: arm64: Save/restore TCR2_EL1")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20240822-kvm-arm64-hide-pie-regs-v2-2-376624fa829c@kernel.org
[maz: simplify by using __el2_visibility(), kvm_has_tcr2() throughout]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241023145345.1613824-25-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>

2024-10-31  KVM: arm64: Add PIR{,E0}_EL2 to the sysreg arrays  (Marc Zyngier)

Add the FEAT_S1PIE EL2 registers to the per-vcpu sysreg register array.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241023145345.1613824-15-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>

2024-10-31  KVM: arm64: Extend masking facility to arbitrary registers  (Marc Zyngier)

We currently only use the masking (RES0/RES1) facility for VNCR
registers, as they are memory-based and thus easy to sanitise. But we
could apply the same thing to other registers if we:

- split the sanitisation from __VNCR_START__
- apply the sanitisation when reading from a HW register

This involves a new "marker" in the vcpu_sysreg enum, which defines the
point at which the sanitisation applies (the VNCR registers being of
course after this marker).

While we are at it, rename kvm_vcpu_sanitise_vncr_reg() to
kvm_vcpu_apply_reg_masks(), which is vaguely more explicit, and harden
set_sysreg_masks() against setting masks for random registers...

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Link: https://lore.kernel.org/r/20241023145345.1613824-10-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>

2024-10-31  KVM: arm64: Correctly access TCR2_EL1, PIR_EL1, PIRE0_EL1 with VHE  (Marc Zyngier)

For code that accesses any of the guest registers for emulation
purposes, it is crucial to know where the most up-to-date data is.
While this is pretty clear for nVHE (memory is the sole repository),
things are a lot muddier for VHE, as depending on the SYSREGS_ON_CPU
flag, registers can either be loaded on the HW or be in memory. Even
worse with NV, where the loaded state is by definition partial.

For these reasons, KVM offers the vcpu_read_sys_reg() and
vcpu_write_sys_reg() primitives that always do the right thing.
However, these primitives must know which register to access, and this
is the role of the __vcpu_read_sys_reg_from_cpu() and
__vcpu_write_sys_reg_to_cpu() helpers.

As it turns out, TCR2_EL1, PIR_EL1 and PIRE0_EL1 are not described in
the latter helpers, meaning that the AT code cannot use them to emulate
S1PIE.

Add the three registers to the (long) list.

Fixes: 86f9de9db178 ("KVM: arm64: Save/restore PIE registers")
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: Joey Gouly <joey.gouly@arm.com>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Link: https://lore.kernel.org/r/20241023145345.1613824-9-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>

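An illustrative fragment of what "describing" a register in those
helpers means on VHE, where an EL1 register is redirected to its _EL12
encoding while the guest's state is loaded on the CPU (assumed shape,
not the verbatim patch):

```c
/* Sketch: read the loaded-on-CPU value of an S1PIE/TCR2 register via
 * its _EL12 alias; returns false if the register isn't handled here. */
static bool __read_pie_reg_from_cpu(int reg, u64 *val)
{
	switch (reg) {
	case TCR2_EL1:	*val = read_sysreg_s(SYS_TCR2_EL12);	return true;
	case PIR_EL1:	*val = read_sysreg_s(SYS_PIR_EL12);	return true;
	case PIRE0_EL1:	*val = read_sysreg_s(SYS_PIRE0_EL12);	return true;
	default:	return false;
	}
}
```
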
2024-10-31  KVM: arm64: Add TCR2_EL2 to the sysreg arrays  (Marc Zyngier)

Add the TCR2_EL2 register to the per-vcpu sysreg register array, the
sysreg descriptor array, and advertise it as mapped to TCR2_EL1 for NV
purposes.

Access to this register is conditional based on ID_AA64MMFR3_EL1.TCRX
being advertised.

Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241023145345.1613824-12-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>

2024-10-31  arm64: Remove VNCR definition for PIRE0_EL2  (Marc Zyngier)

As of the ARM ARM Known Issues document 102105_K.a_04_en, D22677 fixes
a problem with the PIRE0_EL2 register, resulting in its removal from
the VNCR page (it had no purpose being there in the first place).

Follow the architecture update by removing this offset.

Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241023145345.1613824-3-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>

2024-10-30  KVM: Protect vCPU's "last run PID" with rwlock, not RCU  (Sean Christopherson)

To avoid jitter on KVM_RUN due to synchronize_rcu(), use an rwlock
instead of RCU to protect vcpu->pid, a.k.a. the PID of the task last
used to run a vCPU. When userspace is doing M:N scheduling of tasks to
vCPUs, e.g. to run SEV migration helper vCPUs during post-copy, the
synchronize_rcu() needed to change the PID associated with the vCPU can
stall for hundreds of milliseconds, which is problematic for latency
sensitive post-copy operations.

In the directed yield path, do not acquire the lock if it's contended,
i.e. if the associated PID is changing, as that means the vCPU's task
is already running.

Reported-by: Steve Rutherford <srutherford@google.com>
Reviewed-by: Steve Rutherford <srutherford@google.com>
Acked-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240802200136.329973-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

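A sketch of the directed-yield read path described above; vcpu->pid_lock
is the field this change introduces, while the helper name and the
get_pid_task() usage here are illustrative:

```c
#include <linux/pid.h>
#include <linux/kvm_host.h>

/* Sketch: grab the target vCPU's task, but give up immediately if the
 * lock is contended -- a contended lock means the PID is being changed,
 * i.e. the vCPU's task is already running and yielding is pointless. */
static struct task_struct *vcpu_task_or_null(struct kvm_vcpu *vcpu)
{
	struct task_struct *task = NULL;

	if (read_trylock(&vcpu->pid_lock)) {
		if (vcpu->pid)
			task = get_pid_task(vcpu->pid, PIDTYPE_PID);
		read_unlock(&vcpu->pid_lock);
	}
	return task;	/* caller must put_task_struct() when done */
}
```
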
2024-10-28  arm64: Use new fallback IO memcpy/memset  (Julian Vetter)

Use the new fallback memcpy_{from,to}io and memset_io functions from
lib/iomem_copy.c on the arm64 processor architecture.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Yann Sionneau <ysionneau@kalrayinc.com>
Signed-off-by: Julian Vetter <jvetter@kalrayinc.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

2024-10-28  asm-generic: provide generic page_to_phys and phys_to_page implementations  (Christoph Hellwig)

page_to_phys is duplicated by all architectures, and for some strange
reason placed in <asm/io.h>, where it doesn't fit at all.

phys_to_page is only provided by a few architectures despite having a
lot of open coded users.

Provide generic versions in <asm-generic/memory_model.h> to make these
helpers more easily usable.

Note that with this patch powerpc loses the CONFIG_DEBUG_VIRTUAL
pfn_valid check. It will be added back in a generic version later.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

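The generic fallbacks can be this small (a sketch of the idea; the
ifndef guards let architectures keep their own definitions):

```c
#include <linux/pfn.h>

/* Sketch: route both conversions through the existing pfn helpers. */
#ifndef page_to_phys
#define page_to_phys(page)	PFN_PHYS(page_to_pfn(page))
#endif

#ifndef phys_to_page
#define phys_to_page(phys)	pfn_to_page(PHYS_PFN(phys))
#endif
```
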
2024-10-28  perf/arm_pmuv3: Add PMUv3.9 per counter EL0 access control  (Rob Herring (Arm))

Armv8.9/9.4 PMUv3.9 adds per counter EL0 access controls. Per counter
access is enabled with the UEN bit in the PMUSERENR_EL1 register.
Individual counters are enabled/disabled in the PMUACR_EL1 register.
When UEN is set, the CR/ER bits control EL0 write access and must be
set to disable write access.

With the access controls, the clearing of unused counters can be
skipped.

KVM also configures PMUSERENR_EL1 in order to trap to EL2. UEN does not
need to be set for it since only PMUv3.5 is exposed to guests.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://lore.kernel.org/r/20241002184326.1105499-1-robh@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>

2024-10-25  KVM: arm64: Don't mark "struct page" accessed when making SPTE young  (Sean Christopherson)

Don't mark pages/folios as accessed in the primary MMU when making a
SPTE young in KVM's secondary MMU, as doing so relies on
kvm_pfn_to_refcounted_page(), and generally speaking is unnecessary and
wasteful. KVM participates in page aging via mmu_notifiers, so there's
no need to push "accessed" updates to the primary MMU.

Dropping use of kvm_set_pfn_accessed() also paves the way for removing
kvm_pfn_to_refcounted_page() and all its users.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20241010182427.1434605-84-seanjc@google.com>

2024-10-24  KVM: arm64: Add PSCI v1.3 SYSTEM_OFF2 function for hibernation  (David Woodhouse)

The PSCI v1.3 specification adds support for a SYSTEM_OFF2 function
which is analogous to the ACPI S4 state. This will allow hosting
environments to determine that a guest is hibernated rather than just
powered off, and ensure that they preserve the virtual environment
appropriately to allow the guest to resume safely (or bump the
hardware_signature in the FACS to trigger a clean reboot instead).

This feature is safe to enable unconditionally (in a subsequent commit)
because it is exposed to userspace through the existing
KVM_SYSTEM_EVENT_SHUTDOWN event, just with an additional flag which
userspace can use to know that the instance intended hibernation
instead of a plain power-off.

As with SYSTEM_RESET2, there is only one type available (in this case
HIBERNATE_OFF), and it is not explicitly reported to userspace through
the event; userspace can get it from the registers if it cares.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Miguel Luis <miguel.luis@oracle.com>
Link: https://lore.kernel.org/r/20241019172459.2241939-3-dwmw2@infradead.org
[oliver: slight cleanup of comments]
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>

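A sketch of how the relay to userspace can look; the helper and flag
names are assumptions based on the description above, not verified
against the series:

```c
/* Sketch: surface SYSTEM_OFF2 to the VMM as the existing shutdown
 * event, carrying a flag that distinguishes hibernation from a plain
 * SYSTEM_OFF power-off. */
static void kvm_psci_system_off2(struct kvm_vcpu *vcpu)
{
	kvm_prepare_system_event(vcpu, KVM_SYSTEM_EVENT_SHUTDOWN,
				 KVM_SYSTEM_EVENT_SHUTDOWN_FLAG_PSCI_OFF2);
}
```
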
2024-10-23  arm64/mm: Drop _PROT_SECT_DEFAULT  (Anshuman Khandual)

Commit db95ea787bd1 ("arm64: mm: Wire up TCR.DS bit to PTE shareability
fields") dropped the last reference to the symbol _PROT_SECT_DEFAULT
while transitioning from PMD_SECT_S to PMD_MAYBE_SHARED for
PROT_SECT_DEFAULT. Hence let's just drop that symbol, which is now
unused.

Cc: Will Deacon <will@kernel.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20241021063713.750870-1-anshuman.khandual@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

2024-10-23  arm64: Enable memory encrypt for Realms  (Suzuki K Poulose)

Use the memory encryption APIs to trigger an RSI call to request a
transition between protected memory and shared memory (or vice versa),
and update the kernel's linear map of modified pages to flip the top
bit of the IPA. This requires that block mappings are not used in the
direct map for realm guests.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Co-developed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Link: https://lore.kernel.org/r/20241017131434.40935-10-steven.price@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

2024-10-23  arm64: rsi: Add support for checking whether an MMIO is protected  (Suzuki K Poulose)

On Arm CCA, with RMM-v1.0, all MMIO regions are shared. However, in the
future, an Arm CCA-v1.0 compliant guest may be run in a lesser
privileged partition in the Realm World (with the Arm CCA-v1.1 Planes
feature). In this case, some of the MMIO regions may be emulated by a
higher privileged component in the Realm world, i.e., protected. Thus
the guest must decide today whether a given MMIO region is shared vs.
protected and create the stage 1 mapping accordingly. On Arm CCA, this
detection is based on the "IPA State" (RIPAS == RIPAS_IO).

Provide a helper to run this check on a given range of MMIO. Also,
provide an arm64 helper which may be hooked in by other solutions.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Link: https://lore.kernel.org/r/20241017131434.40935-5-steven.price@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

2024-10-23  arm64: realm: Query IPA size from the RMM  (Steven Price)

The top bit of the configured IPA size is used as an attribute to
control whether the address is protected or shared. Query the
configuration from the RMM to ascertain which bit this is.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Co-developed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Link: https://lore.kernel.org/r/20241017131434.40935-4-steven.price@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

2024-10-23  arm64: Detect if in a realm and set RIPAS RAM  (Suzuki K Poulose)

Detect that the VM is a realm guest by the presence of the RSI
interface. This is done after PSCI has been initialised so that we can
check the SMCCC conduit before making any RSI calls.

If in a realm, iterate over all memory, ensuring that it is marked as
RIPAS RAM. The loader is required to do this for us; however, if some
memory is missed, this will cause the guest to receive a hard-to-debug
external abort at some random point in the future. So, for a
belt-and-braces approach, set all memory to RIPAS RAM. Any failure here
implies that the RAM regions passed to Linux are incorrect, so panic()
promptly to make the situation clear.

Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Co-developed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Link: https://lore.kernel.org/r/20241017131434.40935-3-steven.price@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>