|
Merge tag 'mm-hotfixes-stable-2025-09-10-20-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"20 hotfixes. 15 are cc:stable and the remainder address post-6.16
issues or aren't considered necessary for -stable kernels. 14 of these
fixes are for MM.
This includes
- kexec fixes from Breno for a recently introduced
use-uninitialized bug
- DAMON fixes from Quanmin Yan to avoid div-by-zero crashes
which can occur if the operator uses poorly-chosen insmod
parameters
and misc singleton fixes"
* tag 'mm-hotfixes-stable-2025-09-10-20-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
MAINTAINERS: add tree entry to numa memblocks and emulation block
mm/damon/sysfs: fix use-after-free in state_show()
proc: fix type confusion in pde_set_flags()
compiler-clang.h: define __SANITIZE_*__ macros only when undefined
mm/vmalloc, mm/kasan: respect gfp mask in kasan_populate_vmalloc()
ocfs2: fix recursive semaphore deadlock in fiemap call
mm/memory-failure: fix VM_BUG_ON_PAGE(PagePoisoned(page)) when unpoison memory
mm/mremap: fix regression in vrm->new_addr check
percpu: fix race on alloc failed warning limit
mm/memory-failure: fix redundant updates for already poisoned pages
s390: kexec: initialize kexec_buf struct
riscv: kexec: initialize kexec_buf struct
arm64: kexec: initialize kexec_buf struct in load_other_segments()
mm/damon/reclaim: avoid divide-by-zero in damon_reclaim_apply_parameters()
mm/damon/lru_sort: avoid divide-by-zero in damon_lru_sort_apply_parameters()
mm/damon/core: set quota->charged_from to jiffies at first charge window
mm/hugetlb: add missing hugetlb_lock in __unmap_hugepage_range()
init/main.c: fix boot time tracing crash
mm/memory_hotplug: fix hwpoisoned large folio handling in do_migrate_range()
mm/khugepaged: fix the address passed to notifier on testing young
|
|
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- Incorrect __BITS_PER_LONG as 64 when compiling the compat vDSO
- Unreachable PLT for ftrace_caller() in a module's .init.text
following past reworking of the module VA range selection
- Memory leak in the ACPI iort_rmr_alloc_sids() after a failed
krealloc_array()
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: ftrace: fix unreachable PLT for ftrace_caller in init_module with CONFIG_DYNAMIC_FTRACE
ACPI/IORT: Fix memory leak in iort_rmr_alloc_sids()
arm64: uapi: Provide correct __BITS_PER_LONG for the compat vDSO
|
|
arm64: ftrace: fix unreachable PLT for ftrace_caller in init_module with CONFIG_DYNAMIC_FTRACE
On arm64, it has been possible for a module's sections to be placed more
than 128M away from each other since commit 3e35d303ab7d ("arm64:
module: rework module VA range selection").
Due to this, an ftrace callsite in a module's .init.text section can be
out of branch range for the module's ftrace PLT entry (in the module's
.text section). Any attempt to enable tracing of that callsite will
result in a BRK being patched into the callsite, resulting in a fatal
exception when the callsite is later executed.
Fix this by adding an additional trampoline for .init.text, which will
be within range.
No additional trampolines are necessary due to the way a given
module's executable sections are packed together. Any executable
section beginning with ".init" will be placed in MOD_INIT_TEXT,
and any other executable section, including those beginning with ".exit",
will be placed in MOD_TEXT.
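A minimal sketch of the idea, assuming a per-module init-text PLT (the
init_ftrace_trampolines field is hypothetical; the real patch wires the
extra trampoline through the arm64 module loader):
	/* sketch: pick the PLT that is in branch range of the callsite */
	static struct plt_entry *get_ftrace_plt(struct module *mod, unsigned long addr)
	{
		if (within_module_init(addr, mod))
			return &mod->arch.init_ftrace_trampolines[FTRACE_PLT_IDX]; /* hypothetical */
		return &mod->arch.ftrace_trampolines[FTRACE_PLT_IDX];
	}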
Fixes: 3e35d303ab7d ("arm64: module: rework module VA range selection")
Cc: <stable@vger.kernel.org> # 6.5.x
Signed-off-by: panfan <panfan@qti.qualcomm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20250905032236.3220885-1-panfan@qti.qualcomm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Patch series "kexec: Fix invalid field access".
The kexec_buf structure was previously declared without initialization.
commit bf454ec31add ("kexec_file: allow to place kexec_buf randomly")
added a field that is always read but not consistently populated by all
architectures. This uninitialized field will contain garbage.
This also triggers a UBSAN warning when the uninitialized data is
accessed:
------------[ cut here ]------------
UBSAN: invalid-load in ./include/linux/kexec.h:210:10
load of value 252 is not a valid value for type '_Bool'
Zero-initializing kexec_buf at declaration ensures all fields are cleanly
set, preventing future instances of uninitialized memory being used.
An initial fix was already landed for arm64[0], and this patchset fixes
the problem on the remaining arm64 code and on riscv, as raised by Mark.
Discussions about this problem could be found at[1][2].
This patch (of 3):
Zero-initialize kexec_buf at declaration, ensuring all fields are
cleanly set and preventing uninitialized memory from being read.
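The shape of the fix at each affected declaration (illustrative diff;
the variable names differ per call site):
	-	struct kexec_buf kbuf;
	+	struct kexec_buf kbuf = {};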
Link: https://lkml.kernel.org/r/20250827-kbuf_all-v1-0-1df9882bb01a@debian.org
Link: https://lkml.kernel.org/r/20250827-kbuf_all-v1-1-1df9882bb01a@debian.org
Link: https://lore.kernel.org/all/20250826180742.f2471131255ec1c43683ea07@linux-foundation.org/ [0]
Link: https://lore.kernel.org/all/oninomspajhxp4omtdapxnckxydbk2nzmrix7rggmpukpnzadw@c67o7njgdgm3/ [1]
Link: https://lore.kernel.org/all/20250826-akpm-v1-1-3c831f0e3799@debian.org/ [2]
Fixes: bf454ec31add ("kexec_file: allow to place kexec_buf randomly")
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Coiby Xu <coxu@redhat.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- CFI failure due to kpti_ng_pgd_alloc() signature mismatch
- Underallocation bug in the SVE ptrace kselftest
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
kselftest/arm64: Don't open code SVE_PT_SIZE() in fp-ptrace
arm64: mm: Fix CFI failure due to kpti_ng_pgd_alloc function signature
|
|
Seen during KPTI initialization:
CFI failure at create_kpti_ng_temp_pgd+0x124/0xce8 (target: kpti_ng_pgd_alloc+0x0/0x14; expected type: 0xd61b88b6)
The call site is alloc_init_pud() at arch/arm64/mm/mmu.c:
pud_phys = pgtable_alloc(TABLE_PUD);
alloc_init_pud() has the prototype:
static void alloc_init_pud(p4d_t *p4dp, unsigned long addr, unsigned long end,
phys_addr_t phys, pgprot_t prot,
phys_addr_t (*pgtable_alloc)(enum pgtable_type),
int flags)
where the pgtable_alloc() prototype is declared.
The target (kpti_ng_pgd_alloc) is used in arch/arm64/kernel/cpufeature.c:
create_kpti_ng_temp_pgd(kpti_ng_temp_pgd, __pa(alloc), KPTI_NG_TEMP_VA,
PAGE_SIZE, PAGE_KERNEL, kpti_ng_pgd_alloc, 0);
which is an alias for __create_pgd_mapping_locked() with prototype:
extern __alias(__create_pgd_mapping_locked)
void create_kpti_ng_temp_pgd(pgd_t *pgdir, phys_addr_t phys,
unsigned long virt,
phys_addr_t size, pgprot_t prot,
phys_addr_t (*pgtable_alloc)(enum pgtable_type),
int flags);
__create_pgd_mapping_locked() passes the function pointer down:
__create_pgd_mapping_locked() -> alloc_init_p4d() -> alloc_init_pud()
But the target function (kpti_ng_pgd_alloc) has the wrong signature:
static phys_addr_t __init kpti_ng_pgd_alloc(int shift);
The "int" should be "enum pgtable_type".
To make "enum pgtable_type" available to cpufeature.c, move
enum pgtable_type definition from arch/arm64/mm/mmu.c to
arch/arm64/include/asm/mmu.h.
Adjust kpti_ng_pgd_alloc to use "enum pgtable_type" instead of "int".
The function behavior remains identical (parameter is unused).
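In sketch form, the resulting signature change:
	-static phys_addr_t __init kpti_ng_pgd_alloc(int shift);
	+static phys_addr_t __init kpti_ng_pgd_alloc(enum pgtable_type type);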
Fixes: c64f46ee1377 ("arm64: mm: use enum to identify pgtable level instead of *_SHIFT")
Cc: <stable@vger.kernel.org> # 6.16.x
Signed-off-by: Kees Cook <kees@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250829190721.it.373-kees@kernel.org
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 changes for 6.17, take #2
- Correctly handle 'invariant' system registers for protected VMs
- Improved handling of VNCR data aborts, including external aborts
- Fixes for handling of FEAT_RAS for NV guests, providing a sane
fault context during SEA injection and preventing the use of
RASv1p1 fault injection hardware
- Ensure that page table destruction when a VM is destroyed gives an
opportunity to reschedule
- Large fix to KVM's infrastructure for managing guest context loaded
on the CPU, addressing issues where the output of AT emulation
doesn't get reflected to the guest
- Fix AT S12 emulation to actually perform stage-2 translation when
necessary
- Avoid attempting vLPI irqbypass when GICv4 has been explicitly
disabled for a VM
- Minor KVM + selftest fixes
|
|
Detecting FEAT_RASv1p1 is rather complicated, as there are two
ways for the architecture to advertise the same thing (always a
delight...).
Add a capability that will advertise this in a synthetic way to
the rest of the kernel.
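A sketch of the dual check (field offsets per the ARM ARM; the helper
name is assumed):
	/* FEAT_RASv1p1 if ID_AA64PFR0_EL1.RAS >= 0b0010, or RAS == 0b0001
	 * together with ID_AA64PFR1_EL1.RAS_frac == 0b0001 */
	static bool cpu_has_rasv1p1(u64 pfr0, u64 pfr1)
	{
		u64 ras      = (pfr0 >> 28) & 0xf;  /* ID_AA64PFR0_EL1.RAS[31:28] */
		u64 ras_frac = (pfr1 >> 12) & 0xf;  /* ID_AA64PFR1_EL1.RAS_frac[15:12] */

		return ras >= 2 || (ras == 1 && ras_frac == 1);
	}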
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Link: https://lore.kernel.org/r/20250817202158.395078-2-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"ARM:
- Host driver for GICv5, the next generation interrupt controller for
arm64, including support for interrupt routing, MSIs, interrupt
translation and wired interrupts
- Use FEAT_GCIE_LEGACY on GICv5 systems to virtualize GICv3 VMs on
GICv5 hardware, leveraging the legacy VGIC interface
- Userspace control of the 'nASSGIcap' GICv3 feature, allowing
userspace to disable support for SGIs w/o an active state on
hardware that previously advertised it unconditionally
- Map supporting endpoints with cacheable memory attributes on
systems with FEAT_S2FWB and DIC where KVM no longer needs to
perform cache maintenance on the address range
- Nested support for FEAT_RAS and FEAT_DoubleFault2, allowing the
guest hypervisor to inject external aborts into an L2 VM and take
traps of masked external aborts to the hypervisor
- Convert more system register sanitization to the config-driven
implementation
- Fixes to the visibility of EL2 registers, namely making VGICv3
system registers accessible through the VGIC device instead of the
ONE_REG vCPU ioctls
- Various cleanups and minor fixes
LoongArch:
- Add stat information for in-kernel irqchip
- Add tracepoints for CPUCFG and CSR emulation exits
- Enhance in-kernel irqchip emulation
- Various cleanups
RISC-V:
- Enable ring-based dirty memory tracking
- Improve perf kvm stat to report interrupt events
- Delegate illegal instruction trap to VS-mode
- MMU improvements related to upcoming nested virtualization
s390x:
- Fixes
x86:
- Add CONFIG_KVM_IOAPIC for x86 to allow disabling support for I/O
APIC, PIC, and PIT emulation at compile time
- Share device posted IRQ code between SVM and VMX and harden it
against bugs and runtime errors
- Use vcpu_idx, not vcpu_id, for GA log tag/metadata, to make lookups
O(1) instead of O(n)
- For MMIO stale data mitigation, track whether or not a vCPU has
access to (host) MMIO based on whether the page tables have MMIO
pfns mapped; using VFIO is prone to false negatives
- Rework the MSR interception code so that the SVM and VMX APIs are
more or less identical
- Recalculate all MSR intercepts from scratch on MSR filter changes,
instead of maintaining shadow bitmaps
- Advertise support for LKGS (Load Kernel GS base), a new instruction
that's loosely related to FRED, but is supported and enumerated
independently
- Fix a user-triggerable WARN that syzkaller found by setting the
vCPU in INIT_RECEIVED state (aka wait-for-SIPI), and then putting
the vCPU into VMX Root Mode (post-VMXON). Trying to detect every
possible path leading to architecturally forbidden states is hard
and even risks breaking userspace (if it goes from valid to valid
state but passes through invalid states), so just wait until
KVM_RUN to detect that the vCPU state isn't allowed
- Add KVM_X86_DISABLE_EXITS_APERFMPERF to allow disabling
interception of APERF/MPERF reads, so that a "properly" configured
VM can access APERF/MPERF. This has many caveats (APERF/MPERF
cannot be zeroed on vCPU creation or saved/restored on suspend and
resume, or preserved over thread migration let alone VM migration)
but can be useful whenever you're interested in letting Linux
guests see the effective physical CPU frequency in /proc/cpuinfo
- Reject KVM_SET_TSC_KHZ for vm file descriptors if vCPUs have been
created, as there's no known use case for changing the default
frequency for other VM types and it goes counter to the very reason
why the ioctl was added to the vm file descriptor. And also, there
would be no way to make it work for confidential VMs with a
"secure" TSC, so kill two birds with one stone
- Dynamically allocate the shadow MMU's hashed page list, and defer
allocating the hashed list until it's actually needed (the TDP MMU
doesn't use the list)
- Extract many of KVM's helpers for accessing architectural local
APIC state to common x86 so that they can be shared by guest-side
code for Secure AVIC
- Various cleanups and fixes
x86 (Intel):
- Preserve the host's DEBUGCTL.FREEZE_IN_SMM when running the guest.
Failure to honor FREEZE_IN_SMM can leak host state into guests
- Explicitly check vmcs12.GUEST_DEBUGCTL on nested VM-Enter to
prevent L1 from running L2 with features that KVM doesn't support,
e.g. BTF
x86 (AMD):
- WARN and reject loading kvm-amd.ko instead of panicking the kernel
if the nested SVM MSRPM offsets tracker can't handle an MSR (which
is pretty much a static condition and therefore should never
happen, but still)
- Fix a variety of flaws and bugs in the AVIC device posted IRQ code
- Inhibit AVIC if a vCPU's ID is too big (relative to what hardware
supports) instead of rejecting vCPU creation
- Extend enable_ipiv module param support to SVM, by simply leaving
IsRunning clear in the vCPU's physical ID table entry
- Disable IPI virtualization, via enable_ipiv, if the CPU is affected
by erratum #1235, to allow (safely) enabling AVIC on such CPUs
- Request GA Log interrupts if and only if the target vCPU is
blocking, i.e. only if KVM needs a notification in order to wake
the vCPU
- Intercept SPEC_CTRL on AMD if the MSR shouldn't exist according to
the vCPU's CPUID model
- Accept any SNP policy that is accepted by the firmware with respect
to SMT and single-socket restrictions. An incompatible policy
doesn't put the kernel at risk in any way, so there's no reason for
KVM to care
- Drop a superfluous WBINVD (on all CPUs!) when destroying a VM and
use WBNOINVD instead of WBINVD when possible for SEV cache
maintenance
- When reclaiming memory from an SEV guest, only do cache flushes on
CPUs that have ever run a vCPU for the guest, i.e. don't flush the
caches for CPUs that can't possibly have cache lines with dirty,
encrypted data
Generic:
- Rework irqbypass to track/match producers and consumers via an
xarray instead of a linked list. Using a linked list leads to
O(n^2) insertion times, which is hugely problematic for use cases
that create large numbers of VMs. Such use cases typically don't
actually use irqbypass, but eliminating the pointless registration
is a future problem to solve as it likely requires new uAPI
- Track irqbypass's "token" as "struct eventfd_ctx *" instead of a
"void *", to avoid making a simple concept unnecessarily difficult
to understand
- Decouple device posted IRQs from VFIO device assignment, as binding
a VM to a VFIO group is not a requirement for enabling device
posted IRQs
- Clean up and document/comment the irqfd assignment code
- Disallow binding multiple irqfds to an eventfd with a priority
waiter, i.e. ensure an eventfd is bound to at most one irqfd
through the entire host, and add a selftest to verify eventfd:irqfd
bindings are globally unique
- Add a tracepoint for KVM_SET_MEMORY_ATTRIBUTES to help debug issues
related to private <=> shared memory conversions
- Drop guest_memfd's .getattr() implementation as the VFS layer will
call generic_fillattr() if inode_operations.getattr is NULL
- Fix issues with dirty ring harvesting where KVM doesn't bound the
processing of entries in any way, which allows userspace to keep
KVM in a tight loop indefinitely
- Kill off kvm_arch_{start,end}_assignment() and x86's associated
tracking, now that KVM no longer uses assigned_device_count as a
heuristic for either irqbypass usage or MDS mitigation
Selftests:
- Fix a comment typo
- Verify KVM is loaded when getting any KVM module param so that
attempting to run a selftest without kvm.ko loaded results in a
SKIP message about KVM not being loaded/enabled (versus some random
parameter not existing)
- Skip tests that hit EACCES when attempting to access a file, and
print a "Root required?" help message. In most cases, the test just
needs to be run with elevated permissions"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (340 commits)
Documentation: KVM: Use unordered list for pre-init VGIC registers
RISC-V: KVM: Avoid re-acquiring memslot in kvm_riscv_gstage_map()
RISC-V: KVM: Use find_vma_intersection() to search for intersecting VMAs
RISC-V: perf/kvm: Add reporting of interrupt events
RISC-V: KVM: Enable ring-based dirty memory tracking
RISC-V: KVM: Fix inclusion of Smnpm in the guest ISA bitmap
RISC-V: KVM: Delegate illegal instruction fault to VS mode
RISC-V: KVM: Pass VMID as parameter to kvm_riscv_hfence_xyz() APIs
RISC-V: KVM: Factor-out g-stage page table management
RISC-V: KVM: Add vmid field to struct kvm_riscv_hfence
RISC-V: KVM: Introduce struct kvm_gstage_mapping
RISC-V: KVM: Factor-out MMU related declarations into separate headers
RISC-V: KVM: Use ncsr_xyz() in kvm_riscv_vcpu_trap_redirect()
RISC-V: KVM: Implement kvm_arch_flush_remote_tlbs_range()
RISC-V: KVM: Don't flush TLB when PTE is unchanged
RISC-V: KVM: Replace KVM_REQ_HFENCE_GVMA_VMID_ALL with KVM_REQ_TLB_FLUSH
RISC-V: KVM: Rename and move kvm_riscv_local_tlb_sanitize()
RISC-V: KVM: Drop the return value of kvm_riscv_vcpu_aia_init()
RISC-V: KVM: Check kvm_riscv_vcpu_alloc_vector_context() return value
KVM: arm64: selftests: Add FEAT_RAS EL2 registers to get-reg-list
...
|
|
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
"A quick summary: perf support for Branch Record Buffer Extensions
(BRBE), typical PMU hardware updates, small additions to MTE for
store-only tag checking and exposing non-address bits to signal
handlers, HAVE_LIVEPATCH enabled on arm64, VMAP_STACK forced on.
There is also a TLBI optimisation on hardware that does not require
break-before-make when changing the user PTEs between contiguous and
non-contiguous.
More details:
Perf and PMU updates:
- Add support for new (v3) Hisilicon SLLC and DDRC PMUs
- Add support for Arm-NI PMU integrations that share interrupts
between clock domains within a given instance
- Allow SPE to be configured with a lower sample period than the
minimum recommendation advertised by PMSIDR_EL1.Interval
- Add support for Arm's "Branch Record Buffer Extension" (BRBE)
- Adjust the perf watchdog period according to cpu frequency changes
- Minor driver fixes and cleanups
Hardware features:
- Support for MTE store-only checking (FEAT_MTE_STORE_ONLY)
- Support for reporting the non-address bits during a synchronous MTE
tag check fault (FEAT_MTE_TAGGED_FAR)
- Optimise the TLBI when folding/unfolding contiguous PTEs on
hardware with FEAT_BBM (break-before-make) level 2 and no TLB
conflict aborts
Software features:
- Enable HAVE_LIVEPATCH after implementing arch_stack_walk_reliable()
and using the text-poke API for late module relocations
- Force VMAP_STACK always on and change arm64_efi_rt_init() to use
arch_alloc_vmap_stack() in order to avoid KASAN false positives
ACPI:
- Improve SPCR handling and messaging on systems lacking an SPCR
table
Debug:
- Simplify the debug exception entry path
- Drop redundant DBG_MDSCR_* macros
Kselftests:
- Cleanups and improvements for SME, SVE and FPSIMD tests
Miscellaneous:
- Optimise loop to reduce redundant operations in contpte_ptep_get()
- Remove ISB when resetting POR_EL0 during signal handling
- Mark the kernel as tainted on SEA and SError panic
- Remove redundant gcs_free() call"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (93 commits)
arm64/gcs: task_gcs_el0_enable() should use passed task
arm64: Kconfig: Keep selects somewhat alphabetically ordered
arm64: signal: Remove ISB when resetting POR_EL0
kselftest/arm64: Handle attempts to disable SM on SME only systems
kselftest/arm64: Fix SVE write data generation for SME only systems
kselftest/arm64: Test SME on SME only systems in fp-ptrace
kselftest/arm64: Test FPSIMD format data writes via NT_ARM_SVE in fp-ptrace
kselftest/arm64: Allow sve-ptrace to run on SME only systems
arm64/mm: Drop redundant addr increment in set_huge_pte_at()
kselftest/arm4: Provide local defines for AT_HWCAP3
arm64: Mark kernel as tainted on SAE and SError panic
arm64/gcs: Don't call gcs_free() when releasing task_struct
drivers/perf: hisi: Support PMUs with no interrupt
drivers/perf: hisi: Relax the event number check of v2 PMUs
drivers/perf: hisi: Add support for HiSilicon SLLC v3 PMU driver
drivers/perf: hisi: Use ACPI driver_data to retrieve SLLC PMU information
drivers/perf: hisi: Add support for HiSilicon DDRC v3 PMU driver
drivers/perf: hisi: Simplify the probe process for each DDRC version
perf/arm-ni: Support sharing IRQs within an NI instance
perf/arm-ni: Consolidate CPU affinity handling
...
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 changes for 6.17, round #1
- Host driver for GICv5, the next generation interrupt controller for
arm64, including support for interrupt routing, MSIs, interrupt
translation and wired interrupts.
- Use FEAT_GCIE_LEGACY on GICv5 systems to virtualize GICv3 VMs on
GICv5 hardware, leveraging the legacy VGIC interface.
- Userspace control of the 'nASSGIcap' GICv3 feature, allowing
userspace to disable support for SGIs w/o an active state on hardware
that previously advertised it unconditionally.
- Map supporting endpoints with cacheable memory attributes on systems
with FEAT_S2FWB and DIC where KVM no longer needs to perform cache
maintenance on the address range.
- Nested support for FEAT_RAS and FEAT_DoubleFault2, allowing the guest
hypervisor to inject external aborts into an L2 VM and take traps of
masked external aborts to the hypervisor.
- Convert more system register sanitization to the config-driven
implementation.
- Fixes to the visibility of EL2 registers, namely making VGICv3 system
registers accessible through the VGIC device instead of the ONE_REG
vCPU ioctls.
- Various cleanups and minor fixes.
|
|
Merge tag 'hardening-v6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening updates from Kees Cook:
- Introduce and start using TRAILING_OVERLAP() helper for fixing
embedded flex array instances (Gustavo A. R. Silva)
- mux: Convert mux_control_ops to a flex array member in mux_chip
(Thorsten Blum)
- string: Group str_has_prefix() and strstarts() (Andy Shevchenko)
- Remove KCOV instrumentation from __init and __head (Ritesh Harjani,
Kees Cook)
- Refactor and rename stackleak feature to support Clang
- Add KUnit test for seq_buf API
- Fix KUnit fortify test under LTO
* tag 'hardening-v6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (22 commits)
sched/task_stack: Add missing const qualifier to end_of_stack()
kstack_erase: Support Clang stack depth tracking
kstack_erase: Add -mgeneral-regs-only to silence Clang warnings
init.h: Disable sanitizer coverage for __init and __head
kstack_erase: Disable kstack_erase for all of arm compressed boot code
x86: Handle KCOV __init vs inline mismatches
arm64: Handle KCOV __init vs inline mismatches
s390: Handle KCOV __init vs inline mismatches
arm: Handle KCOV __init vs inline mismatches
mips: Handle KCOV __init vs inline mismatch
powerpc/mm/book3s64: Move kfence and debug_pagealloc related calls to __init section
configs/hardening: Enable CONFIG_INIT_ON_FREE_DEFAULT_ON
configs/hardening: Enable CONFIG_KSTACK_ERASE
stackleak: Split KSTACK_ERASE_CFLAGS from GCC_PLUGINS_CFLAGS
stackleak: Rename stackleak_track_stack to __sanitizer_cov_stack_depth
stackleak: Rename STACKLEAK to KSTACK_ERASE
seq_buf: Introduce KUnit tests
string: Group str_has_prefix() and strstarts()
kunit/fortify: Add back "volatile" for sizeof() constants
acpi: nfit: intel: avoid multiple -Wflex-array-member-not-at-end warnings
...
|
|
Merge tag 'execve-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull execve updates from Kees Cook:
- Introduce regular REGSET note macros arch-wide (Dave Martin)
- Remove arbitrary 4K limitation of program header size (Yin Fengwei)
- Reorder function qualifiers for copy_clone_args_from_user() (Dishank Jogi)
* tag 'execve-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (25 commits)
fork: reorder function qualifiers for copy_clone_args_from_user
binfmt_elf: remove the 4k limitation of program header size
binfmt_elf: Warn on missing or suspicious regset note names
xtensa: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
um: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
x86/ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
sparc: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
sh: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
s390/ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
riscv: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
powerpc/ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
parisc: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
openrisc: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
nios2: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
MIPS: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
m68k: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
LoongArch: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
hexagon: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
csky: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
arm64: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
...
|
|
GICv5 initial host support
Add host kernel support for the new arm64 GICv5 architecture, which is
quite a departure from the previous ones.
Include support for the full gamut of the architecture (interrupt
routing and delivery to CPUs, wired interrupts, MSIs, and interrupt
translation).
* tag 'irqchip-gic-v5-host': (32 commits)
arm64: smp: Fix pNMI setup after GICv5 rework
arm64: Kconfig: Enable GICv5
docs: arm64: gic-v5: Document booting requirements for GICv5
irqchip/gic-v5: Add GICv5 IWB support
irqchip/gic-v5: Add GICv5 ITS support
irqchip/msi-lib: Add IRQ_DOMAIN_FLAG_FWNODE_PARENT handling
irqchip/gic-v3: Rename GICv3 ITS MSI parent
PCI/MSI: Add pci_msi_map_rid_ctlr_node() helper function
of/irq: Add of_msi_xlate() helper function
irqchip/gic-v5: Enable GICv5 SMP booting
irqchip/gic-v5: Add GICv5 LPI/IPI support
irqchip/gic-v5: Add GICv5 IRS/SPI support
irqchip/gic-v5: Add GICv5 PPI support
arm64: Add support for GICv5 GSB barriers
arm64: smp: Support non-SGIs for IPIs
arm64: cpucaps: Add GICv5 CPU interface (GCIE) capability
arm64: cpucaps: Rename GICv3 CPU interface capability
arm64: Disable GICv5 read/write/instruction traps
arm64/sysreg: Add ICH_HFGITR_EL2
arm64/sysreg: Add ICH_HFGWTR_EL2
...
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
* for-next/feat_mte_store_only:
: MTE feature to restrict tag checking to store only operations
kselftest/arm64/mte: Add MTE_STORE_ONLY testcases
kselftest/arm64/mte: Preparation for mte store only test
kselftest/arm64/abi: Add MTE_STORE_ONLY feature hwcap test
KVM: arm64: Expose MTE_STORE_ONLY feature to guest
arm64/hwcaps: Add MTE_STORE_ONLY hwcaps
arm64/kernel: Support store-only mte tag check
prctl: Introduce PR_MTE_STORE_ONLY
arm64/cpufeature: Add MTE_STORE_ONLY feature
|
|
'for-next/misc', 'for-next/acpi', 'for-next/debug-entry', 'for-next/feat_mte_tagged_far', 'for-next/kselftest', 'for-next/mdscr-cleanup' and 'for-next/vmap-stack', remote-tracking branch 'arm64/for-next/perf' into for-next/core
* arm64/for-next/perf: (23 commits)
drivers/perf: hisi: Support PMUs with no interrupt
drivers/perf: hisi: Relax the event number check of v2 PMUs
drivers/perf: hisi: Add support for HiSilicon SLLC v3 PMU driver
drivers/perf: hisi: Use ACPI driver_data to retrieve SLLC PMU information
drivers/perf: hisi: Add support for HiSilicon DDRC v3 PMU driver
drivers/perf: hisi: Simplify the probe process for each DDRC version
perf/arm-ni: Support sharing IRQs within an NI instance
perf/arm-ni: Consolidate CPU affinity handling
perf/cxlpmu: Fix typos in cxl_pmu.c comments and documentation
perf/cxlpmu: Remove unintended newline from IRQ name format string
perf/cxlpmu: Fix devm_kcalloc() argument order in cxl_pmu_probe()
perf: arm_spe: Relax period restriction
perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE)
KVM: arm64: nvhe: Disable branch generation in nVHE guests
arm64: Handle BRBE booting requirements
arm64/sysreg: Add BRBE registers and fields
perf/arm: Add missing .suppress_bind_attrs
perf/arm-cmn: Reduce stack usage during discovery
perf: imx9_perf: make the read-only array mask static const
perf/arm-cmn: Broaden module description for wider interconnect support
...
* for-next/livepatch:
: Support for HAVE_LIVEPATCH on arm64
arm64: Kconfig: Keep selects somewhat alphabetically ordered
arm64: Implement HAVE_LIVEPATCH
arm64: stacktrace: Implement arch_stack_walk_reliable()
arm64: stacktrace: Check kretprobe_find_ret_addr() return value
arm64/module: Use text-poke API for late relocations.
* for-next/user-contig-bbml2:
: Optimise the TLBI when folding/unfolding contiguous PTEs on hardware with BBML2 and no TLB conflict aborts
arm64/mm: Elide tlbi in contpte_convert() under BBML2
iommu/arm: Add BBM Level 2 smmu feature
arm64: Add BBM Level 2 cpu feature
arm64: cpufeature: Introduce MATCH_ALL_EARLY_CPUS capability type
* for-next/misc:
: Miscellaneous arm64 patches
arm64/gcs: task_gcs_el0_enable() should use passed task
arm64: signal: Remove ISB when resetting POR_EL0
arm64/mm: Drop redundant addr increment in set_huge_pte_at()
arm64: Mark kernel as tainted on SAE and SError panic
arm64/gcs: Don't call gcs_free() when releasing task_struct
arm64: fix unnecessary rebuilding when CONFIG_DEBUG_EFI=y
arm64/mm: Optimize loop to reduce redundant operations of contpte_ptep_get
arm64: pi: use 'targets' instead of extra-y in Makefile
* for-next/acpi:
: Various ACPI arm64 changes
ACPI: Suppress misleading SPCR console message when SPCR table is absent
ACPI: Return -ENODEV from acpi_parse_spcr() when SPCR support is disabled
* for-next/debug-entry:
: Simplify the debug exception entry path
arm64: debug: remove debug exception registration infrastructure
arm64: debug: split bkpt32 exception entry
arm64: debug: split brk64 exception entry
arm64: debug: split hardware watchpoint exception entry
arm64: debug: split single stepping exception entry
arm64: debug: refactor reinstall_suspended_bps()
arm64: debug: split hardware breakpoint exception entry
arm64: entry: Add entry and exit functions for debug exceptions
arm64: debug: remove break/step handler registration infrastructure
arm64: debug: call step handlers statically
arm64: debug: call software breakpoint handlers statically
arm64: refactor aarch32_break_handler()
arm64: debug: clean up single_step_handler logic
* for-next/feat_mte_tagged_far:
: Support for reporting the non-address bits during a synchronous MTE tag check fault
kselftest/arm64/mte: Add mtefar tests on check_mmap_options
kselftest/arm64/mte: Refactor check_mmap_option test
kselftest/arm64/mte: Add verification for address tag in signal handler
kselftest/arm64/mte: Add address tag related macro and function
kselftest/arm64/mte: Check MTE_FAR feature is supported
kselftest/arm64/mte: Register mte signal handler with SA_EXPOSE_TAGBITS
kselftest/arm64: Add MTE_FAR hwcap test
KVM: arm64: Expose FEAT_MTE_TAGGED_FAR feature to guest
arm64: Report address tag when FEAT_MTE_TAGGED_FAR is supported
arm64/cpufeature: Add FEAT_MTE_TAGGED_FAR feature
* for-next/kselftest:
: Kselftest updates for arm64
kselftest/arm64: Handle attempts to disable SM on SME only systems
kselftest/arm64: Fix SVE write data generation for SME only systems
kselftest/arm64: Test SME on SME only systems in fp-ptrace
kselftest/arm64: Test FPSIMD format data writes via NT_ARM_SVE in fp-ptrace
kselftest/arm64: Allow sve-ptrace to run on SME only systems
kselftest/arm4: Provide local defines for AT_HWCAP3
kselftest/arm64: Specify SVE data when testing VL set in sve-ptrace
kselftest/arm64: Fix test for streaming FPSIMD write in sve-ptrace
kselftest/arm64: Fix check for setting new VLs in sve-ptrace
kselftest/arm64: Convert tpidr2 test to use kselftest.h
* for-next/mdscr-cleanup:
: Drop redundant DBG_MDSCR_* macros
KVM: selftests: Change MDSCR_EL1 register holding variables as uint64_t
arm64/debug: Drop redundant DBG_MDSCR_* macros
* for-next/vmap-stack:
: Force VMAP_STACK on arm64
arm64: remove CONFIG_VMAP_STACK checks from entry code
arm64: remove CONFIG_VMAP_STACK checks from SDEI stack handling
arm64: remove CONFIG_VMAP_STACK checks from stacktrace overflow logic
arm64: remove CONFIG_VMAP_STACK conditionals from traps overflow stack
arm64: remove CONFIG_VMAP_STACK conditionals from irq stack setup
arm64: Remove CONFIG_VMAP_STACK conditionals from THREAD_SHIFT and THREAD_ALIGN
arm64: efi: Remove CONFIG_VMAP_STACK check
arm64: Mandate VMAP_STACK
arm64: efi: Fix KASAN false positive for EFI runtime stack
arm64/ptrace: Fix stack-out-of-bounds read in regs_get_kernel_stack_nth()
arm64/gcs: Don't call gcs_free() during flush_gcs()
arm64: Restrict pagetable teardown to avoid false warning
docs: arm64: Fix ICC_SRE_EL2 register typo in booting.rst
|
|
Mark Rutland noticed that the task parameter is ignored and
'current' is being used instead. Since this is usually
what it's passed, it hasn't been causing problems so far, but likely
will as the code gets more testing.
But, once this is fixed, it creates a new bug in copy_thread_gcs(),
since gcs_el0_mode isn't yet set for the task before it's being
checked. Move gcs_alloc_thread_stack() after the new task's
gcs_el0_mode initialization to avoid this.
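The gist of the fix, as a sketch (PR_SHADOW_STACK_ENABLE is the
existing prctl flag checked by the helper):
	static inline bool task_gcs_el0_enable(struct task_struct *task)
	{
	-	return current->thread.gcs_el0_mode & PR_SHADOW_STACK_ENABLE;
	+	return task->thread.gcs_el0_mode & PR_SHADOW_STACK_ENABLE;
	}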
Fixes: fc84bc5378a8 ("arm64/gcs: Context switch GCS state for EL0")
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20250719043740.4548-2-jeremy.linton@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
`cpu_switch_to()` and `call_on_irq_stack()` manipulate SP to change
to different stacks along with the Shadow Call Stack if it is enabled.
Those two stack changes cannot be done atomically, and both functions
can be interrupted by SErrors or Debug Exceptions, which, though
unlikely, is very much broken: if interrupted, we can end up with
mismatched stacks and Shadow Call Stack, leading to clobbered stacks.
In `cpu_switch_to()`, it can happen when SP_EL0 points to the new task,
but x18 still points to the old task's SCS. When the interrupt handler
tries to save the task's SCS pointer, it will save the old task
SCS pointer (x18) into the new task struct (pointed to by SP_EL0),
clobbering it.
In `call_on_irq_stack()`, it can happen when switching from the task stack
to the IRQ stack and when switching back. In both cases, we can be
interrupted when the SCS pointer points to the IRQ SCS, but SP points to
the task stack. The nested interrupt handler pushes its return addresses
on the IRQ SCS. It then detects that SP points to the task stack,
calls `call_on_irq_stack()` and clobbers the task SCS pointer with
the IRQ SCS pointer, which it will also use!
This leads to tasks returning to addresses on the wrong SCS,
or even on the IRQ SCS, triggering kernel panics via CONFIG_VMAP_STACK
or FPAC if enabled.
This is possible on a default config, but unlikely.
However, when enabling CONFIG_ARM64_PSEUDO_NMI, DAIF is unmasked and
instead the GIC is responsible for filtering what interrupts the CPU
should receive based on priority.
Given the goal of emulating NMIs, pseudo-NMIs can be received by the CPU
even in `cpu_switch_to()` and `call_on_irq_stack()`, possibly *very*
frequently depending on the system configuration and workload, leading
to unpredictable kernel panics.
Completely mask DAIF in `cpu_switch_to()` and restore it when returning.
Do the same in `call_on_irq_stack()`, but restore and mask around
the branch.
Mask DAIF even if CONFIG_SHADOW_CALL_STACK is not enabled for consistency
of behaviour between all configurations.
Introduce and use an assembly macro for saving and masking DAIF,
as the existing one saves but only masks IF.
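A sketch of the new macro (name assumed), alongside the restore:
	/* save DAIF, then mask all of D, A, I and F */
	.macro save_and_disable_daif, flags
	mrs	\flags, daif
	msr	daifset, #0xf
	.endm

	/* restore the previously saved DAIF state */
	.macro restore_daif, flags
	msr	daif, \flags
	.endm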
Cc: <stable@vger.kernel.org>
Signed-off-by: Ada Couprie Diaz <ada.coupriediaz@arm.com>
Reported-by: Cristian Prundeanu <cpru@amazon.com>
Fixes: 59b37fe52f49 ("arm64: Stash shadow stack pointer in the task struct on interrupt")
Tested-by: Cristian Prundeanu <cpru@amazon.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250718142814.133329-1-ada.coupriediaz@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
POR_EL0 is set to its most permissive value before setting up the
signal frame, to ensure that uaccess succeeds regardless of the
signal stack's pkey.
We are now tolerant to spurious POE faults. This means that we do
not strictly need to issue an ISB after updating POR_EL0, even when
followed by uaccess. The question is whether a fault is likely to
happen or not if the ISB is omitted; in this case the answer seems
to be no. If the regular stack is used, then it should already be
accessible. If the alternate signal stack is used, then a special
(inaccessible) pkey may be used - the assumption is that this
situation is very uncommon.
Remove the ISB to speed up the regular path - this should not have
any functional impact regardless of the scenario.
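In sketch form, the fast path now reduces to the bare write (helper and
constant as used elsewhere in the arm64 POE code):
	if (system_supports_poe())
		write_sysreg_s(POR_EL0_INIT, SYS_POR_EL0);
	/* no ISB: a spurious POE fault, if any, is now tolerated */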
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Link: https://lore.kernel.org/r/20250619160042.2499290-3-kevin.brodsky@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
In preparation for Clang stack depth tracking for KSTACK_ERASE,
split the stackleak-specific cflags out of GCC_PLUGINS_CFLAGS into
KSTACK_ERASE_CFLAGS.
Link: https://lore.kernel.org/r/20250717232519.2984886-3-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
In preparation for adding Clang sanitizer coverage stack depth tracking
that can support stack depth callbacks:
- Add the new top-level CONFIG_KSTACK_ERASE option which will be
implemented either with the stackleak GCC plugin, or with the Clang
stack depth callback support.
- Rename CONFIG_GCC_PLUGIN_STACKLEAK as needed to CONFIG_KSTACK_ERASE,
but keep it for anything specific to the GCC plugin itself.
- Rename all exposed "STACKLEAK" names and files to "KSTACK_ERASE" (named
for what it does rather than what it protects against), but leave as
many of the internals alone as possible to avoid even more churn.
While here, also split "prev_lowest_stack" into CONFIG_KSTACK_ERASE_METRICS,
since that's the only place it is referenced from.
Suggested-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250717232519.2984886-1-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
Set TAINT_MACHINE_CHECK when SError or Synchronous External Abort (SEA)
interrupts trigger a panic to flag potential hardware faults. This
tainting mechanism aids in debugging and enables correlation of
hardware-related crashes in large-scale deployments.
This change aligns with similar patches[1] that mark machine check
events when the system crashes due to hardware errors.
Link: https://lore.kernel.org/all/20250702-add_tain-v1-1-9187b10914b9@debian.org/ [1]
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20250716-vmcore_hw_error-v2-1-f187f7d62aba@debian.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Breno reports that pNMIs are not behaving the way they should since
they were reworked for GICv5. Turns out we feed the IRQ number to
the pNMI helper instead of the IPI number -- not a good idea.
Fix it by providing the correct number (duh).
Fixes: ba1004f861d16 ("arm64: smp: Support non-SGIs for IPIs")
Reported-by: Breno Leitao <leitao@debian.org>
Suggested-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Currently we call gcs_free() when releasing task_struct, but this is
redundant: it attempts to deallocate any kernel-managed userspace GCS,
which should no longer be relevant, and resets values in the struct
we're in the process of freeing.
By the time arch_release_task_struct() is called, the mm will have been
disassociated from the task, so the check for an mm in gcs_free() will
always be false; for threads that exit leaving the mm active,
deactivate_mm() will have been called previously and will have freed any
kernel-managed GCS.
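In arch_release_task_struct(), the fix boils down to dropping the call
(sketch):
	-	gcs_free(tsk);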
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20250714-arm64-gcs-release-task-v2-1-8a83cadfc846@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Instead of having the core code guess the note name for each regset,
use USER_REGSET_NOTE_TYPE() to pick the correct name from elf.h.
This does not affect the correctness of switch(note_type) and similar
code, since note type values known to Linux for coredump purposes were
already required to be unique.
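A sketch of the usage pattern (the regset entry shown is illustrative;
the macro supplies both the note type and its canonical name from
elf.h):
	 	[REGSET_GPR] = {
	-		.core_note_type = NT_PRSTATUS,
	+		USER_REGSET_NOTE_TYPE(PRSTATUS),
	 		.n = ELF_NGREG,
	 		...
	 	},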
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Kees Cook <kees@kernel.org>
Cc: Akihiko Odaki <akihiko.odaki@daynix.com>
Cc: linux-arm-kernel@lists.infradead.org
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250701135616.29630-7-Dave.Martin@arm.com
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
KVM will soon support FEAT_DoubleFault2. Add a descriptor for the
corresponding ID register field.
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250708172532.1699409-3-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
KVM is about to pick up support for SCTLR2. Add cpucap for later use in
the guest/host context switch hot path.
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250708172532.1699409-2-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
An IRS supports Logical Peripheral Interrupts (LPIs); implement
Linux IPIs on top of them.
LPIs are used for interrupt signals that are translated by a
GICv5 ITS (Interrupt Translation Service) but also for software
generated IRQs - namely interrupts that are not driven by a HW
signal, i.e. IPIs.
LPIs rely on memory storage for interrupt routing and state.
LPI state and routing information is kept in the Interrupt
State Table (IST).
IRSes provide support for 1- or 2-level IST tables, configured
to support a maximum number of interrupts that depends on the
OS configuration and the HW capabilities.
On systems that provide 2-level IST support, always allow
the maximum number of LPIs; on systems with only 1-level
support, limit the number of LPIs to 2^12 to prevent
wasting memory (presumably a system that supports a 1-level
only IST is not expecting a large number of interrupts).
On a 2-level IST system, L2 entries are allocated on
demand.
The IST table memory is allocated using the kmalloc() interface;
the allocation required may be smaller than a page and must be
made up of contiguous physical pages if larger than a page.
On systems where the IRS is not cache-coherent with the CPUs,
cache maintenance operations are executed to clean and
invalidate the allocated memory to the point of coherency,
making it visible to the IRS components.
On GICv5 systems, IPIs are implemented using LPIs.
Add an LPI IRQ domain and implement an IPI-specific IRQ domain created
as a child/subdomain of the LPI domain to allocate the required number
of LPIs needed to implement the IPIs.
IPIs are backed by LPIs; add LPI allocation/de-allocation
functions.
The LPI INTID namespace is managed using an IDA to alloc/free LPI INTIDs.
Associate an IPI irqchip with IPI IRQ descriptors to provide
core code with the irqchip.ipi_send_single() method required
to raise an IPI.
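A sketch of the INTID management (names assumed):
	static DEFINE_IDA(lpi_ida);

	static int lpi_alloc_intid(u32 max_intid)
	{
		/* hand out a free INTID in [0, max_intid] */
		return ida_alloc_range(&lpi_ida, 0, max_intid, GFP_KERNEL);
	}

	static void lpi_free_intid(u32 intid)
	{
		ida_free(&lpi_ida, intid);
	}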
Co-developed-by: Sascha Bischoff <sascha.bischoff@arm.com>
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Co-developed-by: Timothy Hayes <timothy.hayes@arm.com>
Signed-off-by: Timothy Hayes <timothy.hayes@arm.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20250703-gicv5-host-v7-22-12e71f1b3528@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The arm64 arch has relied so far on GIC architectural software
generated interrupts (SGIs) to handle IPIs. Those are per-cpu
software generated interrupts.
arm64 architecture code that allocates the IPIs virtual IRQs and
IRQ descriptors was written accordingly.
On GICv5 systems, IPIs are implemented using LPIs that are not
per-cpu interrupts - they are just normal routable IRQs.
Add arch code to set up IPIs on systems where they are handled
using normal routable IRQs.
For those systems, force the IRQ affinity (and make it immutable)
to the CPU a given IRQ was assigned to.
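In sketch form, using the generic IRQ layer (the patch may use
different flags):
	/* pin the IPI's IRQ to its CPU and keep the affinity fixed */
	irq_set_status_flags(irq, IRQ_NO_BALANCING);
	irq_force_affinity(irq, cpumask_of(cpu));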
Signed-off-by: Timothy Hayes <timothy.hayes@arm.com>
[lpieralisi: changed affinity set-up, log]
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20250703-gicv5-host-v7-18-12e71f1b3528@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Implement the GCIE capability as a strict boot cpu capability to
detect whether architectural GICv5 support is available in HW.
Plug it in with a naming consistent with the existing GICv3
CPU interface capability.
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20250703-gicv5-host-v7-17-12e71f1b3528@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
In preparation for adding a GICv5 CPU interface capability,
rework the existing GICv3 CPUIF capability - change its name and
description so that the subsequent GICv5 CPUIF capability
can be added with a more consistent naming on top.
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20250703-gicv5-host-v7-16-12e71f1b3528@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
When CONFIG_DEBUG_EFI is enabled, some objects are needlessly rebuilt.
[Steps to reproduce]
Enable CONFIG_DEBUG_EFI and run 'make' twice in a clean source tree.
On the second run, arch/arm64/kernel/head.o is rebuilt even though
no files have changed.
$ make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- clean
$ make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu-
[ snip ]
$ make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu-
CALL scripts/checksyscalls.sh
AS arch/arm64/kernel/head.o
AR arch/arm64/kernel/built-in.a
AR arch/arm64/built-in.a
AR built-in.a
[ snip ]
The issue is caused by the use of the $(realpath ...) function.
At the time arch/arm64/kernel/Makefile is parsed on the first run,
$(objtree)/vmlinux does not exist. As a result,
$(realpath $(objtree)/vmlinux) expands to an empty string.
On the second run of Make, $(objtree)/vmlinux already exists, so
$(realpath $(objtree)/vmlinux) expands to the absolute path of vmlinux.
However, this change in the command line causes arch/arm64/kernel/head.o
to be rebuilt.
To address this issue, use $(abspath ...) instead, which does not require
the file to exist. While $(abspath ...) does not resolve symlinks, this
should be fine from a debugging perspective.
The GNU Make manual [1] clearly explains the difference between the two:
$(realpath names...)
For each file name in names return the canonical absolute name.
A canonical name does not contain any . or .. components, nor any
repeated path separators (/) or symlinks. In case of a failure the
empty string is returned. Consult the realpath(3) documentation for
a list of possible failure causes.
$(abspath names...)
For each file name in names return an absolute name that does not
contain any . or .. components, nor any repeated path separators (/).
Note that, in contrast to realpath function, abspath does not resolve
symlinks and does not require the file names to refer to an existing
file or directory. Use the wildcard function to test for existence.
The same problem exists in drivers/firmware/efi/libstub/Makefile.zboot.
On the first run of Make, $(obj)/vmlinuz.efi.elf does not exist when the
Makefile is parsed, so -DZBOOT_EFI_PATH is set to an empty string.
Replace $(realpath ...) with $(abspath ...) there as well.
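A sketch of the change for the CONFIG_DEBUG_EFI hook in
arch/arm64/kernel/Makefile (the zboot Makefile gets the same
treatment):
	-AFLAGS_head.o += -DVMLINUX_PATH="\"$(realpath $(objtree)/vmlinux)\""
	+AFLAGS_head.o += -DVMLINUX_PATH="\"$(abspath $(objtree)/vmlinux)\""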
[1]: https://www.gnu.org/software/make/manual/make.html#File-Name-Functions
Fixes: 757b435aaabe ("efi: arm64: Add vmlinux debug link to the Image binary")
Fixes: a050910972bb ("efi/libstub: implement generic EFI zboot")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250625125555.2504734-1-masahiroy@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
With VMAP_STACK now always enabled on arm64, remove all CONFIG_VMAP_STACK
conditionals from entry handling in arch/arm64/kernel/entry-common.c and
arch/arm64/kernel/entry.S.
This change unconditionally includes the bad stack handling and overflow
detection logic, simplifying the code and reflecting the mandatory use of
VMAP_STACK for all arm64 kernel builds.
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20250707-arm64_vmap-v1-8-8de98ca0f91c@debian.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
With VMAP_STACK now always enabled on arm64, remove all
CONFIG_VMAP_STACK conditionals from SDEI stack allocation and
initialization in arch/arm64/kernel/sdei.c.
This change unconditionally defines the SDEI stack pointers and replaces
runtime checks with BUILD_BUG_ON() assertions, ensuring that the code is
only built when VMAP_STACK is enabled. This simplifies the logic and
reflects the mandatory use of VMAP_STACK for all arm64 kernel builds.
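e.g., in sketch form:
	/* fail the build, rather than check at runtime */
	BUILD_BUG_ON(!IS_ENABLED(CONFIG_VMAP_STACK));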
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20250707-arm64_vmap-v1-7-8de98ca0f91c@debian.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
With VMAP_STACK now always enabled on arm64, remove all CONFIG_VMAP_STACK
conditionals from overflow stack handling in stacktrace code.
This change unconditionally defines the per-CPU overflow_stack and
stackinfo_get_overflow() helper in arch/arm64/include/asm/stacktrace.h,
and always includes the overflow stack in the stack_info array in
arch/arm64/kernel/stacktrace.c. Also, drop redundant CONFIG_VMAP_STACK
checks from SDEI stack declarations.
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20250707-arm64_vmap-v1-6-8de98ca0f91c@debian.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
With VMAP_STACK now always enabled on arm64, remove the
CONFIG_VMAP_STACK checks from overflow stack definitions and related
code in arch/arm64/kernel/traps.c. The overflow_stack and
panic_bad_stack() logic are now unconditionally included, simplifying
the source and matching the mandatory stack model.
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20250707-arm64_vmap-v1-5-8de98ca0f91c@debian.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
With VMAP_STACK always enabled on arm64, drop the CONFIG_VMAP_STACK
checks and legacy irq stack allocation from arch/arm64/kernel/irq.c. The
code now unconditionally uses the VMAP_STACK path for irq stack
initialization, simplifying the logic.
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20250707-arm64_vmap-v1-4-8de98ca0f91c@debian.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Remove the CONFIG_VMAP_STACK check in arm64_efi_rt_init() since
VMAP_STACK is now always enabled on arm64.
The arch_alloc_vmap_stack() call will fail to build if VMAP_STACK
is not set, providing sufficient protection without the explicit
runtime check.
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20250707-arm64_vmap-v1-2-8de98ca0f91c@debian.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Now that debug exceptions are handled individually and without the need
for dynamic registration, remove the unused registration infrastructure.
This removes the external caller for `debug_exception_enter()` and
`debug_exception_exit()`.
Make them static again and remove them from the header.
Remove `early_brk64()` as it has been made redundant by
(arm64: debug: split brk64 exception entry) and is not used anymore.
Note: in `early_brk64()`, `bug_brk_handler()` is called unconditionally
as a fall-through, but now `call_break_hook()` only calls it if the
immediate matches.
This does not change the behaviour in early boot, as if
`bug_brk_handler()` was called on a non-BUG immediate it would return
DBG_HOOK_ERROR anyway, which `call_break_hook()` will do if no immediate
matches.
Remove `trap_init()`, as it would be empty and a weak definition already
exists in `init/main.c`.
Signed-off-by: Ada Couprie Diaz <ada.coupriediaz@arm.com>
Tested-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Reviewed-by: Will Deacon <will@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20250707114109.35672-14-ada.coupriediaz@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Currently all debug exceptions share common entry code and are routed
to `do_debug_exception()`, which calls dynamically-registered
handlers for each specific debug exception. This is unfortunate as
different debug exceptions have different entry handling requirements,
and it would be better to handle these distinct requirements earlier.
The BKPT32 exception can only be triggered by a BKPT instruction. Thus,
we know that the PC is a legitimate address and isn't being used to train
a branch predictor with a bogus address: we don't need to call
`arm64_apply_bp_hardening()`.
The handler for this exception only pends a signal and doesn't depend
on any per-CPU state: we don't need to inhibit preemption, nor do we
need to keep the DAIF exceptions masked, so we can unmask them earlier.
Split the BKPT32 exception entry and adjust function signatures and its
behaviour to match its relaxed constraints compared to other
debug exceptions.
We can also remove `NOKPROBE_SYMBOL`, as this cannot lead to a kprobe
recursion.
This replaces the last usage of `el0_dbg()`, so remove it.
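A sketch of the dedicated entry this introduces (handler name assumed
from the series' naming scheme; body illustrative):

    static void noinstr el0_bkpt32(struct pt_regs *regs, unsigned long esr)
    {
            enter_from_user_mode(regs);
            /* no per-CPU state is needed: unmask DAIF early */
            local_daif_restore(DAIF_PROCCTX);
            do_bkpt32(esr, regs);
            exit_to_user_mode(regs);
    }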
Signed-off-by: Ada Couprie Diaz <ada.coupriediaz@arm.com>
Tested-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Reviewed-by: Will Deacon <will@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20250707114109.35672-13-ada.coupriediaz@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Currently all debug exceptions share common entry code and are routed
to `do_debug_exception()`, which calls dynamically-registered
handlers for each specific debug exception. This is unfortunate as
different debug exceptions have different entry handling requirements,
and it would be better to handle these distinct requirements earlier.
The BRK64 exception can only be triggered by a BRK instruction. Thus,
we know that the PC is a legitimate address and isn't being used to train
a branch predictor with a bogus address: we don't need to call
`arm64_apply_bp_hardening()`.
We do not need to handle the Cortex-A76 erratum #1463225 either, as it
is only relevant for single stepping at EL1.
BRK64 does not write FAR_EL1 either, as only hardware watchpoints do so.
Split the BRK64 exception entry and adjust the function signature and
behaviour to match the lack of needed mitigations.
Further, as the EL0 and EL1 code paths are cleanly separated, we can split
`do_brk64()` into `do_el0_brk64()` and `do_el1_brk64()`, and call them
directly from the relevant entry paths.
Use `die()` directly for the EL1 error path, as in `do_el1_bti()` and
`do_el1_undef()`.
We can also remove `NOKPROBE_SYMBOL` for the EL0 path, as it cannot
lead to a kprobe recursion.
When taking a BRK64 exception from EL0, the exception handling is safely
preemptible: the only possible handler is `uprobe_brk_handler()`.
It only operates on task-local data and properly checks its validity,
then raises a Thread Information Flag, processed before returning
to userspace in `do_notify_resume()`, which is already preemptible.
Thus we can safely unmask interrupts and enable preemption before
handling the break itself, fixing a PREEMPT_RT issue where the handler
could call a sleeping function with preemption disabled.
Given that the break hook registration is handled statically in
`call_break_hook` since
(arm64: debug: call software break handlers statically)
and that we now bypass the exception handler registration, this change
renders `early_brk64` redundant: its functionality is now handled through
the post-init path.
This also removes the last usage of `el1_dbg()`, as well as the last
usage of `el0_dbg()` when `CONFIG_COMPAT` is not set. Mark `el0_dbg()`
`__maybe_unused` to prevent a warning when building this patch without
`CONFIG_COMPAT`, as the following patch removes `el0_dbg()` entirely.
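A sketch of the resulting EL0 entry (handler names as described above;
body illustrative):

    static void noinstr el0_brk64(struct pt_regs *regs, unsigned long esr)
    {
            enter_from_user_mode(regs);
            /* only uprobe_brk_handler() can run here: safe to unmask
             * DAIF and allow preemption before handling the break */
            local_daif_restore(DAIF_PROCCTX);
            do_el0_brk64(esr, regs);
            exit_to_user_mode(regs);
    }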
Signed-off-by: Ada Couprie Diaz <ada.coupriediaz@arm.com>
Tested-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Reviewed-by: Will Deacon <will@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20250707114109.35672-12-ada.coupriediaz@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Currently all debug exceptions share common entry code and are routed
to `do_debug_exception()`, which calls dynamically-registered
handlers for each specific debug exception. This is unfortunate as
different debug exceptions have different entry handling requirements,
and it would be better to handle these distinct requirements earlier.
Hardware watchpoints are the only debug exceptions that will write
FAR_EL1, so we need to preserve it and pass it down.
However, they cannot be used to maliciously train branch predictors, so
we can omit the call to `arm64_apply_bp_hardening()`; nor do they need to
handle the Cortex-A76 erratum #1463225, as it only applies to single
stepping exceptions.
As the hardware watchpoint handler only returns 0 and never triggers
the call to `arm64_notify_die()`, we can call it directly from
`entry-common.c`.
Split the hardware watchpoint exception entry and adjust the behaviour
to match the lack of needed mitigations.
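Sketch of an entry that preserves FAR_EL1 (names and body illustrative):

    static void noinstr el0_watchpt(struct pt_regs *regs, unsigned long esr)
    {
            /* watchpoints are the only debug exceptions writing FAR_EL1 */
            unsigned long far = read_sysreg(far_el1);

            enter_from_user_mode(regs);
            do_watchpoint(far, esr, regs);
            exit_to_user_mode(regs);
    }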
Signed-off-by: Ada Couprie Diaz <ada.coupriediaz@arm.com>
Tested-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Reviewed-by: Will Deacon <will@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20250707114109.35672-11-ada.coupriediaz@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Currently all debug exceptions share common entry code and are routed
to `do_debug_exception()`, which calls dynamically-registered
handlers for each specific debug exception. This is unfortunate as
different debug exceptions have different entry handling requirements,
and it would be better to handle these distinct requirements earlier.
The single stepping exception has the most constraints: it can be
exploited to train branch predictors and it needs special handling at EL1
for the Cortex-A76 erratum #1463225. We need to preserve all of these
mitigations.
However, it does not write an address at FAR_EL1, as only hardware
watchpoints do so.
The single-step handler does its own signaling if it needs to and only
returns 0, so we can call it directly from `entry-common.c`.
Split the single stepping exception entry, adjust the function signature,
and keep the security mitigation and erratum handling.
Further, as the EL0 and EL1 code paths are cleanly separated, we can split
`do_softstep()` into `do_el0_softstep()` and `do_el1_softstep()` and
call them directly from the relevant entry paths.
We can also remove `NOKPROBE_SYMBOL` for the EL0 path, as it cannot
lead to a kprobe recursion.
Move the call to `arm64_apply_bp_hardening()` to `entry-common.c` so that
we can do it as early as possible, and only for the exceptions coming
from EL0, where it is needed.
This is safe to do as it is `noinstr`, as are all the functions it
may call. `el0_ia()` and `el0_pc()` already call it this way.
When taking a soft-step exception from EL0, most of the single stepping
handling is safely preemptible: the only possible handler is
`uprobe_single_step_handler()`. It only operates on task-local data and
properly checks its validity, then raises a Thread Information Flag,
processed before returning to userspace in `do_notify_resume()`, which
is already preemptible.
However, the soft-step handler first calls `try_step_suspended_breakpoints()`
(renamed from `reinstall_suspended_bps()` by an earlier patch in this
series) to check if there is any hardware breakpoint or watchpoint pending
or already stepped through.
This cannot be preempted as it manipulates the hardware breakpoint and
watchpoint registers.
Move the call to `try_step_suspended_breakpoints()` to `entry-common.c`
and adjust the relevant comments.
We can now safely unmask interrupts before handling the step itself,
fixing a PREEMPT_RT issue where the handler could call a sleeping function
with preemption disabled.
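Putting the pieces together, the EL0 entry could look like this
(sketch; exact control flow illustrative):

    static void noinstr el0_softstep(struct pt_regs *regs, unsigned long esr)
    {
            /* mitigation first, and only for exceptions from EL0 */
            if (!is_ttbr0_addr(regs->pc))
                    arm64_apply_bp_hardening();
            enter_from_user_mode(regs);
            /* stepping suspended breakpoints/watchpoints touches the
             * debug registers and must stay non-preemptible */
            if (!try_step_suspended_breakpoints(regs)) {
                    local_daif_restore(DAIF_PROCCTX); /* now preemptible */
                    do_el0_softstep(esr, regs);
            }
            exit_to_user_mode(regs);
    }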
Signed-off-by: Ada Couprie Diaz <ada.coupriediaz@arm.com>
Closes: https://lore.kernel.org/linux-arm-kernel/Z6YW_Kx4S2tmj2BP@uudg.org/
Tested-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Reviewed-by: Will Deacon <will@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20250707114109.35672-10-ada.coupriediaz@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
`reinstall_suspended_bps()` plays a key part in the stepping process
when we have hardware breakpoints and watchpoints enabled.
It checks whether a suspended breakpoint or watchpoint needs to be
stepped, re-enables it once it has been handled, and returns whether or
not we need to proceed with a single-step.
However, the current naming and return values make it harder to understand
the logic and goal of the function.
Rename it `try_step_suspended_breakpoints()` and change the return value
to a boolean, aligning it with similar functions used in
`do_el0_undef()` like `try_emulate_mrs()`, and making its behaviour
more obvious.
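The new shape, mirroring the `try_*()` helpers used in `do_el0_undef()`
(sketch; the comment states the intended semantics):

    /*
     * Returns true if a suspended breakpoint/watchpoint was stepped and
     * the single-step exception is consumed; false if the generic
     * single-step handling should proceed.
     */
    bool try_step_suspended_breakpoints(struct pt_regs *regs);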
Signed-off-by: Ada Couprie Diaz <ada.coupriediaz@arm.com>
Tested-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Will Deacon <will@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20250707114109.35672-9-ada.coupriediaz@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Currently all debug exceptions share common entry code and are routed
to `do_debug_exception()`, which calls dynamically-registered
handlers for each specific debug exception. This is unfortunate as
different debug exceptions have different entry handling requirements,
and it would be better to handle these distinct requirements earlier.
Hardware breakpoint exceptions are generated by the hardware after user
configuration. As such, they can be exploited to train branch
predictors outside of the userspace VA range: the entry path still needs
to call `arm64_apply_bp_hardening()` when needed to mitigate this attack.
However, they do not need to handle the Cortex-A76 erratum #1463225 as
it only applies to single stepping exceptions.
They do not set an address in FAR_EL1 either; only hardware
watchpoints do.
As the hardware breakpoint handler only returns 0 and never triggers
the call to `arm64_notify_die()`, we can call it directly from
`entry-common.c`.
Split the hardware breakpoint exception entry and adjust the function
signature and the handling of the Cortex-A76 erratum to fit the
behaviour of the exception.
Move the call to `arm64_apply_bp_hardening()` to `entry-common.c` so that
we can do it as early as possible, and only for the exceptions coming
from EL0, where it is needed.
This is safe to do as it is `noinstr`, as are all the functions it
may call. `el0_ia()` and `el0_pc()` already call it this way.
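The early, EL0-only hardening call, as `el0_ia()` and `el0_pc()`
already do it (sketch):

    /* in the new EL0 breakpoint entry, before enter_from_user_mode() */
    if (!is_ttbr0_addr(regs->pc))
            arm64_apply_bp_hardening();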
Signed-off-by: Ada Couprie Diaz <ada.coupriediaz@arm.com>
Tested-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Reviewed-by: Will Deacon <will@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20250707114109.35672-8-ada.coupriediaz@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Move the `debug_exception_enter()` and `debug_exception_exit()`
functions from mm/fault.c, as they are needed to split
the debug exception entry paths from the current unified one.
Make them externally visible in include/asm/exception.h until
the caller in mm/fault.c is cleaned up.
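The temporary exposure, sketched:

    /* arch/arm64/include/asm/exception.h -- temporary, until the
     * remaining caller in mm/fault.c is cleaned up */
    void debug_exception_enter(struct pt_regs *regs);
    void debug_exception_exit(struct pt_regs *regs);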
Signed-off-by: Ada Couprie Diaz <ada.coupriediaz@arm.com>
Tested-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Will Deacon <will@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20250707114109.35672-7-ada.coupriediaz@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Remove all infrastructure for the dynamic registration previously used by
software breakpoints and stepping handlers.
Signed-off-by: Ada Couprie Diaz <ada.coupriediaz@arm.com>
Tested-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Will Deacon <will@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20250707114109.35672-6-ada.coupriediaz@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Software stepping checks for the correct handler by iterating over a list
of dynamically registered handlers, calling each of them until one
handles the exception.
This is the only generic way to handle software stepping handlers on
arm64, as the exception does not provide an immediate that could be
checked, unlike software breakpoints.
However, the registration mechanism is not exported and has only
two current users: the KGDB stepping handler and the uprobe single-step
handler.
Given that one comes from user mode and the other from kernel mode, call
the appropriate one by checking the source EL of the exception.
Add a stand-in that returns DBG_HOOK_ERROR when the configuration
options are not enabled.
Remove `arch_init_uprobes()` as it is not useful anymore and is
specific to arm64.
Unify the naming of the handlers to XXX_single_step_handler(), making it
clear they are related.
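Dispatch on the source EL could then look like this (sketch; names
follow the unified scheme above, signatures illustrative, and the
stand-ins for disabled configs are elided):

    static int single_step_handler(struct pt_regs *regs, unsigned long esr)
    {
            if (user_mode(regs))
                    return uprobe_single_step_handler(regs, esr);

            return kgdb_single_step_handler(regs, esr);
    }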
Signed-off-by: Ada Couprie Diaz <ada.coupriediaz@arm.com>
Tested-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Reviewed-by: Will Deacon <will@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20250707114109.35672-5-ada.coupriediaz@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Software breakpoints pass an immediate value in ESR ("comment") that can
be used to call a specialized handler (KGDB, KASAN...).
We do so in two different ways:
- During early boot, `early_brk64` statically checks against known
immediates and calls the corresponding handler,
- During init, handlers are dynamically registered into a list. When
called, the generic software breakpoint handler will iterate over
the list to find the appropriate handler.
The dynamic registration does not provide any benefit here as it is not
exported and all its uses are within the arm64 tree. It also depends on an
RCU list, whose safe access currently relies on the non-preemptible state
of `do_debug_exception`.
Replace the list iteration logic in `call_break_hook()` to call
the breakpoint handlers statically if they are enabled, like in
`early_brk64`.
Expose the handlers in their respective headers to be reachable from
`arch/arm64/kernel/debug-monitors.c` at link time.
Unify the naming of the software breakpoint handlers to XXX_brk_handler(),
making it clear they are related and to differentiate from the
hardware breakpoints.
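The replacement dispatch follows the `early_brk64` pattern (sketch; the
set of handlers is abridged and the exact guards are illustrative):

    static int call_break_hook(struct pt_regs *regs, unsigned long esr)
    {
            unsigned long comment = esr & ESR_ELx_BRK64_ISS_COMMENT_MASK;

            if (comment == BUG_BRK_IMM)
                    return bug_brk_handler(regs, esr);
            if (IS_ENABLED(CONFIG_KASAN_SW_TAGS) &&
                (comment & ~KASAN_BRK_MASK) == KASAN_BRK_IMM)
                    return kasan_brk_handler(regs, esr);
            /* KGDB, kprobes, uprobes, ... follow the same pattern */
            return DBG_HOOK_ERROR;
    }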
Signed-off-by: Ada Couprie Diaz <ada.coupriediaz@arm.com>
Tested-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Reviewed-by: Will Deacon <will@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20250707114109.35672-4-ada.coupriediaz@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
`aarch32_break_handler()` is called in `do_el0_undef()` when we
are trying to handle an exception whose Exception Syndrome is unknown.
It checks whether the instruction hit might be a 32-bit arm break (be it
A32 or T2), and sends a SIGTRAP to userspace if it is, so that the
break can be handled.
However, this is badly represented in the naming of the function, and
is not consistent with the other functions called with the same logic
in `do_el0_undef()`.
Rename it `try_handle_aarch32_break()` and change the return value to
a boolean to align with the logic of the other tentative handlers in
`do_el0_undef()`, the previous error code being ignored anyway.
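In `do_el0_undef()` it now reads like the other tentative handlers
(abridged sketch; `insn` is the instruction fetched earlier in the
function):

    /* do_el0_undef(), abridged */
    if (try_emulate_mrs(regs, insn))
            return;
    if (try_handle_aarch32_break(regs))
            return;
    force_signal_inject(SIGILL, ILL_ILLOPC, regs->pc, 0);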
Signed-off-by: Ada Couprie Diaz <ada.coupriediaz@arm.com>
Tested-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250707114109.35672-3-ada.coupriediaz@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|