summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-06-28cpu: Do not warn on arch_register_cpu() returning -EPROBE_DEFERJonathan Cameron
For arm64 the CPU registration cannot complete until the ACPI interpreter us up and running so in those cases the arch specific arch_register_cpu() will return -EPROBE_DEFER at this stage and the registration will be attempted later. Suggested-by: Rafael J. Wysocki <rafael@kernel.org> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Tested-by: Miguel Luis <miguel.luis@oracle.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240529133446.28446-3-Jonathan.Cameron@huawei.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-28ACPI: processor: Simplify initial onlining to use same path for cold and hotplugJonathan Cameron
Separate code paths, combined with a flag set in acpi_processor.c to indicate a struct acpi_processor was for a hotplugged CPU ensured that per CPU data was only set up the first time that a CPU was initialized. This appears to be unnecessary as the paths can be combined by letting the online logic also handle any CPUs online at the time of driver load. Motivation for this change, beyond simplification, is that ARM64 virtual CPU HP uses the same code paths for hotplug and cold path in acpi_processor.c so had no easy way to set the flag for hotplug only. Removing this necessity will enable ARM64 vCPU HP to reuse the existing code paths. Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Tested-by: Miguel Luis <miguel.luis@oracle.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Miguel Luis <miguel.luis@oracle.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240529133446.28446-2-Jonathan.Cameron@huawei.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-24arm64: Cleanup __cpu_set_tcr_t0sz()Seongsu Park
The T0SZ field of TCR_EL1 occupies bits 0-5 of the register and encode the virtual address space translated by TTBR0_EL1. When updating the field, for example because we are switching to/from the idmap page-table, __cpu_set_tcr_t0sz() erroneously treats its 't0sz' argument as unshifted, resulting in harmless but confusing double shifts by 0 in the code. Co-developed-by: Leem ChaeHoon <infinite.run@gmail.com> Signed-off-by: Leem ChaeHoon <infinite.run@gmail.com> Signed-off-by: Seongsu Park <sgsu.park@samsung.com> Link: https://lore.kernel.org/r/20240523122146.144483-1-sgsu.park@samsung.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-24ACPI / amba: Drop unnecessary check for registered amba_dummy_clkYouwan Wang
amba_register_dummy_clk() is called only once from acpi_amba_init() and acpi_amba_init() itself is called once during the initialisation. amba_dummy_clk can't be initialised before this in any other code path and hence the check for already registered amba_dummy_clk is not necessary. Drop the same. Signed-off-by: Youwan Wang <youwan@nfschina.com> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Hanjun Guo <guohanjun@huawei.com> Link: https://lore.kernel.org/r/20240624023101.369633-1-youwan@nfschina.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-24arm64: irqchip/gic-v3: Select priorities at boot timeMark Rutland
The distributor and PMR/RPR can present different views of the interrupt priority space dependent upon the values of GICD_CTLR.DS and SCR_EL3.FIQ. Currently we treat the distributor's view of the priority space as canonical, and when the two differ we change the way we handle values in the PMR/RPR, using the `gic_nonsecure_priorities` static key to decide what to do. This approach works, but it's sub-optimal. When using pseudo-NMI we manipulate the distributor rarely, and we manipulate the PMR/RPR registers very frequently in code spread out throughout the kernel (e.g. local_irq_{save,restore}()). It would be nicer if we could use fixed values for the PMR/RPR, and dynamically choose the values programmed into the distributor. This patch changes the GICv3 driver and arm64 code accordingly. PMR values are chosen at compile time, and the GICv3 driver determines the appropriate values to program into the distributor at boot time. This removes the need for the `gic_nonsecure_priorities` static key and results in smaller and better generated code for saving/restoring the irqflags. Before this patch, local_irq_disable() compiles to: | 0000000000000000 <outlined_local_irq_disable>: | 0: d503201f nop | 4: d50343df msr daifset, #0x3 | 8: d65f03c0 ret | c: d503201f nop | 10: d2800c00 mov x0, #0x60 // #96 | 14: d5184600 msr icc_pmr_el1, x0 | 18: d65f03c0 ret | 1c: d2801400 mov x0, #0xa0 // #160 | 20: 17fffffd b 14 <outlined_local_irq_disable+0x14> After this patch, local_irq_disable() compiles to: | 0000000000000000 <outlined_local_irq_disable>: | 0: d503201f nop | 4: d50343df msr daifset, #0x3 | 8: d65f03c0 ret | c: d2801800 mov x0, #0xc0 // #192 | 10: d5184600 msr icc_pmr_el1, x0 | 14: d65f03c0 ret ... with 3 fewer instructions per call. For defconfig + CONFIG_PSEUDO_NMI=y, this results in a minor saving of ~4K of text, and will make it easier to make further improvements to the way we manipulate irqflags and DAIF bits. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Tested-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240617111841.2529370-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de>
2024-06-24irqchip/gic-v3: Detect GICD_CTRL.DS and SCR_EL3.FIQ earlierMark Rutland
In subsequent patches the GICv3 driver will choose the regular interrupt priority at boot time, dependent on the configuration of GICD_CTRL.DS and SCR_EL3.FIQ. This will need to be chosen before we configure the distributor with default prioirities for all the interrupts, which happens before we currently detect these in gic_cpu_sys_reg_init(). Add a new gic_prio_init() function to detect these earlier and log them to the console so that any problems can be debugged more easily. This also allows the uniformity checks in gic_cpu_sys_reg_init() to be simplified, as we can compare directly with the boot CPU values which were recorded earlier. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Tested-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240617111841.2529370-5-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de>
2024-06-24irqchip/gic-v3: Make distributor priorities variablesMark Rutland
In subsequent patches the GICv3 driver will choose the regular interrupt priority at boot time. In preparation for using dynamic priorities, place the priorities in variables and update the code to pass these as parameters. Users of GICD_INT_DEF_PRI_X4 are modified to replicate the priority byte using REPEAT_BYTE_U32(). There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Tested-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240617111841.2529370-4-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de>
2024-06-24irqchip/gic-common: Remove sync_access callbackMark Rutland
The gic_configure_irq(), gic_dist_config(), and gic_cpu_config() functions each take an optional "sync_access" callback, but in almost all cases this is not used. The only user is the GICv3 driver's gic_cpu_init() function, which uses gic_redist_wait_for_rwp() as the "sync_access" callback for gic_cpu_config(). It would be simpler and clearer to remove the callback and have the GICv3 driver call gic_redist_wait_for_rwp() explicitly after gic_cpu_config(). Remove the "sync_access" callback, and call gic_redist_wait_for_rwp() explicitly in the GICv3 driver. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Tested-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240617111841.2529370-3-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de>
2024-06-24wordpart.h: Add REPEAT_BYTE_U32()Mark Rutland
In some cases it's necessary to replicate a byte across a u32 value, for which REPEAT_BYTE() would be helpful. Currently this requires explicit masking of the result to avoid sparse warnings, as e.g. (u32)REPEAT_BYTE(0xa0)) ... will result in a warning: cast truncates bits from constant value (a0a0a0a0a0a0a0a0 becomes a0a0a0a0) Add a new REPEAT_BYTE_U32() which does the necessary masking internally, so that we don't need to duplicate this for every usage. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240617111841.2529370-2-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de>
2024-06-24arm64/mm: Stop using ESR_ELx_FSC_TYPE during faultAnshuman Khandual
Fault status codes at page table level 0, 1, 2 and 3 for access, permission and translation faults are architecturally organized in a way, that masking out ESR_ELx_FSC_TYPE, fetches Level 0 status code for the respective fault. Helpers like esr_fsc_is_[translation|permission|access_flag]_fault() mask out ESR_ELx_FSC_TYPE before comparing against corresponding Level 0 status code as the kernel does not yet care about the page table level, where in the fault really occurred previously. This scheme is starting to crumble after FEAT_LPA2 when level -1 got added. Fault status code for translation fault at level -1 is 0x2B which does not follow ESR_ELx_FSC_TYPE, requiring esr_fsc_is_translation_fault() changes. This changes above helpers to compare against individual fault status code values for each page table level and stop using ESR_ELx_FSC_TYPE, which is losing its value as a common mask. Cc: Will Deacon <will@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20240618034703.3622510-1-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-21arm64: Kconfig: fix typo in __builtin_return_adddressMike Rapoport (IBM)
Comment about BUILTIN_RETURN_ADDRESS_STRIPS_PAC spells __builtin_return_adddress with a triple 'd', fix it. Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org> Link: https://lore.kernel.org/r/20240620174038.3721466-1-rppt@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-13ARM64: reloc_test: add missing MODULE_DESCRIPTION() macroJeff Johnson
With ARCH=arm64, make allmodconfig && make W=1 C=1 reports: WARNING: modpost: missing MODULE_DESCRIPTION() in arch/arm64/kernel/arm64-reloc-test.o Add the missing invocation of the MODULE_DESCRIPTION() macro. Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Link: https://lore.kernel.org/r/20240612-md-arch-arm64-kernel-v1-1-1fafe8d11df3@quicinc.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-13kselftest/arm64: Fix a couple of spelling mistakesColin Ian King
There are two spelling mistakes in some error messages. Fix them. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240613073429.1797451-1-colin.i.king@gmail.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-13arm64: FFH: Move ACPI specific code into drivers/acpi/arm64/Sudeep Holla
The ACPI FFH Opregion code can be moved out of arm64 arch code as it just uses SMCCC. Move all the ACPI FFH Opregion code into drivers/acpi/arm64/ffh.c Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Hanjun Guo <guohanjun@huawei.com> Link: https://lore.kernel.org/r/20240605131458.3341095-4-sudeep.holla@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-13arm64: cpuidle: Move ACPI specific code into drivers/acpi/arm64/Sudeep Holla
The ACPI cpuidle LPI FFH code can be moved out of arm64 arch code as it just uses SMCCC. Move all the ACPI cpuidle LPI FFH code into drivers/acpi/arm64/cpuidle.c Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Hanjun Guo <guohanjun@huawei.com> Link: https://lore.kernel.org/r/20240605131458.3341095-3-sudeep.holla@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-13ACPI: arm64: Sort entries alphabeticallySudeep Holla
Sort the entries in the Makefile alphabetically. Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Hanjun Guo <guohanjun@huawei.com> Link: https://lore.kernel.org/r/20240605131458.3341095-2-sudeep.holla@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-12arm64: errata: Expand speculative SSBS workaroundMark Rutland
A number of Arm Ltd CPUs suffer from errata whereby an MSR to the SSBS special-purpose register does not affect subsequent speculative instructions, permitting speculative store bypassing for a window of time. We worked around this for Cortex-X4 and Neoverse-V3, in commit: 7187bb7d0b5c7dfa ("arm64: errata: Add workaround for Arm errata 3194386 and 3312417") ... as per their Software Developer Errata Notice (SDEN) documents: * Cortex-X4 SDEN v8.0, erratum 3194386: https://developer.arm.com/documentation/SDEN-2432808/0800/ * Neoverse-V3 SDEN v6.0, erratum 3312417: https://developer.arm.com/documentation/SDEN-2891958/0600/ Since then, similar errata have been published for a number of other Arm Ltd CPUs, for which the mitigation is the same. This is described in their respective SDEN documents: * Cortex-A710 SDEN v19.0, errataum 3324338 https://developer.arm.com/documentation/SDEN-1775101/1900/?lang=en * Cortex-A720 SDEN v11.0, erratum 3456091 https://developer.arm.com/documentation/SDEN-2439421/1100/?lang=en * Cortex-X2 SDEN v19.0, erratum 3324338 https://developer.arm.com/documentation/SDEN-1775100/1900/?lang=en * Cortex-X3 SDEN v14.0, erratum 3324335 https://developer.arm.com/documentation/SDEN-2055130/1400/?lang=en * Cortex-X925 SDEN v8.0, erratum 3324334 https://developer.arm.com/documentation/109108/800/?lang=en * Neoverse-N2 SDEN v17.0, erratum 3324339 https://developer.arm.com/documentation/SDEN-1982442/1700/?lang=en * Neoverse-V2 SDEN v9.0, erratum 3324336 https://developer.arm.com/documentation/SDEN-2332927/900/?lang=en Note that due to shared design lineage, some CPUs share the same erratum number. Add these to the existing mitigation under CONFIG_ARM64_ERRATUM_3194386. As listing all of the erratum IDs in the runtime description would be unwieldy, this is reduced to: "SSBS not fully self-synchronizing" ... matching the description of the errata in all of the SDENs. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240603111812.1514101-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-12arm64: errata: Unify speculative SSBS errata logicMark Rutland
Cortex-X4 erratum 3194386 and Neoverse-V3 erratum 3312417 are identical, with duplicate Kconfig text and some unsightly ifdeffery. While we try to share code behind CONFIG_ARM64_WORKAROUND_SPECULATIVE_SSBS, having separate options results in a fair amount of boilerplate code, and this will only get worse as we expand the set of affected CPUs. To reduce this boilerplate, unify the two behind a common Kconfig option. This removes the duplicate text and Kconfig logic, and removes the need for the intermediate ARM64_WORKAROUND_SPECULATIVE_SSBS option. The set of affected CPUs is described as a list so that this can easily be extended. I've used ARM64_ERRATUM_3194386 (matching the Neoverse-V3 erratum ID) as the common option, matching the way we use ARM64_ERRATUM_1319367 to cover Cortex-A57 erratum 1319537 and Cortex-A72 erratum 1319367. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <wilL@kernel.org> Link: https://lore.kernel.org/r/20240603111812.1514101-5-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-12arm64: cputype: Add Cortex-X925 definitionsMark Rutland
Add cputype definitions for Cortex-X925. These will be used for errata detection in subsequent patches. These values can be found in Table A-285 ("MIDR_EL1 bit descriptions") in issue 0001-05 of the Cortex-X925 TRM, which can be found at: https://developer.arm.com/documentation/102807/0001/?lang=en Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240603111812.1514101-4-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-12arm64: cputype: Add Cortex-A720 definitionsMark Rutland
Add cputype definitions for Cortex-A720. These will be used for errata detection in subsequent patches. These values can be found in Table A-186 ("MIDR_EL1 bit descriptions") in issue 0002-05 of the Cortex-A720 TRM, which can be found at: https://developer.arm.com/documentation/102530/0002/?lang=en Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240603111812.1514101-3-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-12arm64: cputype: Add Cortex-X3 definitionsMark Rutland
Add cputype definitions for Cortex-X3. These will be used for errata detection in subsequent patches. These values can be found in Table A-263 ("MIDR_EL1 bit descriptions") in issue 07 of the Cortex-X3 TRM, which can be found at: https://developer.arm.com/documentation/101593/0102/?lang=en Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240603111812.1514101-2-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-12arm64: mte: Make mte_check_tfsr_*() conditional on KASAN instead of MTEPeter Collingbourne
The check in mte_check_tfsr_el1() is only necessary if HW tag based KASAN is enabled. However, we were also executing the check if MTE is enabled and KASAN is enabled at build time but disabled at runtime. This turned out to cause a measurable increase in power consumption on a specific microarchitecture after enabling MTE. Moreover, on the same system, an increase in invalid syscall latency (as measured by [1]) of around 20-30% (depending on the cluster) was observed after enabling MTE; this almost entirely goes away after removing this check. Therefore, make the check conditional on whether KASAN is enabled rather than on whether MTE is enabled. [1] https://lore.kernel.org/all/CAMn1gO4MwRV8bmFJ_SeY5tsYNPn2ZP56LjAhafygjFaKuu5ouw@mail.gmail.com/ Signed-off-by: Peter Collingbourne <pcc@google.com> Link: https://linux-review.googlesource.com/id/I22d98d1483dd400a95595946552b769a5a1ad7bd Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Link: https://lore.kernel.org/r/20240528225131.3577704-1-pcc@google.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-12kselftest/arm64: Fix redundancy of a testcaseDev Jain
Currently, we are writing the same value as we read into the TLS register, hence we cannot confirm update of the register, making the testcase "verify_tpidr_one" redundant. Fix this. Signed-off-by: Dev Jain <dev.jain@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240605115448.640717-1-dev.jain@arm.com [catalin.marinas@arm.com: remove the increment style change] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-12kselftest/arm64: Include kernel mode NEON in fp-stressMark Brown
Currently fp-stress only covers userspace use of floating point, it does not cover any kernel mode uses. Since currently kernel mode floating point usage can't be preempted and there are explicit preemption points in the existing implementations this isn't so important for fp-stress but when we readd preemption it will be good to try to exercise it. When the arm64 accelerated crypto operations are implemented we can relatively straightforwardly trigger kernel mode floating point usage by using the crypto userspace API to hash data, using the splice() support in an effort to minimise copying. We use /proc/crypto to check which accelerated implementations are available, picking the first symmetric hash we find. We run the kernel mode test unconditionally, replacing the second copy of the FPSIMD testcase for systems with FPSIMD only. If we don't think there are any suitable kernel mode implementations we fall back to running another copy of fpsimd-stress. There are a number issues with this approach, we don't actually verify that we are using an accelerated (or even CPU) implementation of the algorithm being tested and even with attempting to use splice() to minimise copying there are sizing limits on how much data gets spliced at once. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240521-arm64-fp-stress-kernel-v1-1-e38f107baad4@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-12arm64: implement raw_smp_processor_id() using thread_infoPuranjay Mohan
Historically, arm64 implemented raw_smp_processor_id() as a read of current_thread_info()->cpu. This changed when arm64 moved thread_info into task struct, as at the time CONFIG_THREAD_INFO_IN_TASK made core code use thread_struct::cpu for the cpu number, and due to header dependencies prevented using this in raw_smp_processor_id(). As a workaround, we moved to using a percpu variable in commit: 57c82954e77fa12c ("arm64: make cpu number a percpu variable") Since then, thread_info::cpu was reintroduced, and core code was made to use this in commits: 001430c1910df65a ("arm64: add CPU field to struct thread_info") bcf9033e5449bdca ("sched: move CPU field back into thread_info if THREAD_INFO_IN_TASK=y") Consequently it is possible to use current_thread_info()->cpu again. This decreases the number of emitted instructions like in the following example: Dump of assembler code for function bpf_get_smp_processor_id: 0xffff8000802cd608 <+0>: nop 0xffff8000802cd60c <+4>: nop 0xffff8000802cd610 <+8>: adrp x0, 0xffff800082138000 0xffff8000802cd614 <+12>: mrs x1, tpidr_el1 0xffff8000802cd618 <+16>: add x0, x0, #0x8 0xffff8000802cd61c <+20>: ldrsw x0, [x0, x1] 0xffff8000802cd620 <+24>: ret After this patch: Dump of assembler code for function bpf_get_smp_processor_id: 0xffff8000802c9130 <+0>: nop 0xffff8000802c9134 <+4>: nop 0xffff8000802c9138 <+8>: mrs x0, sp_el0 0xffff8000802c913c <+12>: ldr w0, [x0, #24] 0xffff8000802c9140 <+16>: ret A microbenchmark[1] was built to measure the performance improvement provided by this change. It calls the following function given number of times and finds the runtime overhead: static noinline int get_cpu_id(void) { return smp_processor_id(); } Run the benchmark like: modprobe smp_processor_id nr_function_calls=1000000000 +--------------------------+------------------------+ | | Number of Calls | Time taken | +--------+-----------------+------------------------+ | Before | 1000000000 | 1602888401ns | +--------+-----------------+------------------------+ | After | 1000000000 | 1206212658ns | +--------+-----------------+------------------------+ | Difference (decrease) | 396675743ns (24.74%) | +---------------------------------------------------+ Remove the percpu variable cpu_number as it is used only in set_smp_ipi_range() as a dummy variable to be passed to ipi_handler(). Use irq_stat in place of cpu_number here like arm32. [1] https://github.com/puranjaymohan/linux/commit/77d3fdd Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Link: https://lore.kernel.org/r/20240503171847.68267-2-puranjay@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-12arm64/arch_timer: include <linux/percpu.h>Puranjay Mohan
arch_timer.h includes linux/smp.h since the commit: 6acc71ccac7187fc ("arm64: arch_timer: Allows a CPU-specific erratum to only affect a subset of CPUs") It was included to use DEFINE_PER_CPU(), etc. But It should have included <linux/percpu.h> rather than <linux/smp.h>. It worked because smp.h includes percpu.h. The next commit will remove percpu.h from smp.h and it will break this usage. Explicitly include percpu.h and remove smp.h Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20240503171847.68267-1-puranjay@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-12arm64/cpufeatures/kvm: Add ARMv8.9 FEAT_ECBHB bits in ID_AA64MMFR1 registerNianyao Tang
Enable ECBHB bits in ID_AA64MMFR1 register as per ARM DDI 0487K.a specification. When guest OS read ID_AA64MMFR1_EL1, kvm emulate this reg using ftr_id_aa64mmfr1 and always return ID_AA64MMFR1_EL1.ECBHB=0 to guest. It results in guest syscall jump to tramp ventry, which is not needed in implementation with ID_AA64MMFR1_EL1.ECBHB=1. Let's make the guest syscall process the same as the host. Signed-off-by: Nianyao Tang <tangnianyao@huawei.com> Link: https://lore.kernel.org/r/20240611122049.2758600-1-tangnianyao@huawei.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-09Linux 6.10-rc3v6.10-rc3Linus Torvalds
2024-06-09Merge tag 'perf-tools-fixes-for-v6.10-2-2024-06-09' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools fixes from Arnaldo Carvalho de Melo: - Update copies of kernel headers, which resulted in support for the new 'mseal' syscall, SUBVOL statx return mask bit, RISC-V and PPC prctls, fcntl's DUPFD_QUERY, POSTED_MSI_NOTIFICATION IRQ vector, 'map_shadow_stack' syscall for x86-32. - Revert perf.data record memory allocation optimization that ended up causing a regression, work is being done to re-introduce it in the next merge window. - Fix handling of minimal vmlinux.h file used with BPF's CO-RE when interrupting the build. * tag 'perf-tools-fixes-for-v6.10-2-2024-06-09' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: perf bpf: Fix handling of minimal vmlinux.h file when interrupting the build Revert "perf record: Reduce memory for recording PERF_RECORD_LOST_SAMPLES event" tools headers arm64: Sync arm64's cputype.h with the kernel sources tools headers uapi: Sync linux/stat.h with the kernel sources to pick STATX_SUBVOL tools headers UAPI: Update i915_drm.h with the kernel sources tools headers UAPI: Sync kvm headers with the kernel sources tools arch x86: Sync the msr-index.h copy with the kernel sources tools headers: Update the syscall tables and unistd.h, mostly to support the new 'mseal' syscall perf trace beauty: Update the arch/x86/include/asm/irq_vectors.h copy with the kernel sources to pick POSTED_MSI_NOTIFICATION perf beauty: Update copy of linux/socket.h with the kernel sources tools headers UAPI: Sync fcntl.h with the kernel sources to pick F_DUPFD_QUERY tools headers UAPI: Sync linux/prctl.h with the kernel sources tools include UAPI: Sync linux/stat.h with the kernel sources
2024-06-09Merge tag 'edac_urgent_for_v6.10_rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC fixes from Borislav Petkov: - Convert PCI core error codes to proper error numbers since latter get propagated all the way up to the module loading functions * tag 'edac_urgent_for_v6.10_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC/igen6: Convert PCIBIOS_* return codes to errnos EDAC/amd64: Convert PCIBIOS_* return codes to errnos
2024-06-08Merge tag 'clk-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fix from Stephen Boyd: "One fix for the SiFive PRCI clocks so that the device boots again. This driver was registering clkdev lookups that were always going to be useless. This wasn't a problem until clkdev started returning an error in these cases, causing this driver to fail probe, and thus boot to fail because clks are essential for most drivers. The fix is simple, don't use clkdev because this is a DT based system where clkdev isn't used" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: sifive: Do not register clkdevs for PRCI clocks
2024-06-08Merge tag '6.10-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull smb client fixes from Steve French: "Two small smb3 client fixes: - fix deadlock in umount - minor cleanup due to netfs change" * tag '6.10-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: Don't advance the I/O iterator before terminating subrequest smb: client: fix deadlock in smb2_find_smb_tcon()
2024-06-08Merge tag 'for-linus-2024060801' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Benjamin Tissoires: - fix potential read out of bounds in hid-asus (Andrew Ballance) - fix endian-conversion on little endian systems in intel-ish-hid (Arnd Bergmann) - A couple of new input event codes (Aseda Aboagye) - errors handling fixes in hid-nvidia-shield (Chen Ni), hid-nintendo (Christophe JAILLET), hid-logitech-dj (José Expósito) - current leakage fix while the device is in suspend on a i2c-hid laptop (Johan Hovold) - other assorted smaller fixes and device ID / quirk entry additions * tag 'for-linus-2024060801' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: Ignore battery for ELAN touchscreens 2F2C and 4116 HID: i2c-hid: elan: fix reset suspend current leakage dt-bindings: HID: i2c-hid: elan: add 'no-reset-on-power-off' property dt-bindings: HID: i2c-hid: elan: add Elan eKTH5015M dt-bindings: HID: i2c-hid: add dedicated Ilitek ILI2901 schema input: Add support for "Do Not Disturb" input: Add event code for accessibility key hid: asus: asus_report_fixup: fix potential read out of bounds HID: logitech-hidpp: add missing MODULE_DESCRIPTION() macro HID: intel-ish-hid: fix endian-conversion HID: nintendo: Fix an error handling path in nintendo_hid_probe() HID: logitech-dj: Fix memory leak in logi_dj_recv_switch_to_dj_mode() HID: core: remove unnecessary WARN_ON() in implement() HID: nvidia-shield: Add missing check for input_ff_create_memless HID: intel-ish-hid: Fix build error for COMPILE_TEST
2024-06-08Merge tag 'kbuild-fixes-v6.10-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Fix the initial state of the save button in 'make gconfig' - Improve the Kconfig documentation - Fix a Kconfig bug regarding property visibility - Fix build breakage for systems where 'sed' is not installed in /bin - Fix a false warning about missing MODULE_DESCRIPTION() * tag 'kbuild-fixes-v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: modpost: do not warn about missing MODULE_DESCRIPTION() for vmlinux.o kbuild: explicitly run mksysmap as sed script from link-vmlinux.sh kconfig: remove wrong expr_trans_bool() kconfig: doc: document behavior of 'select' and 'imply' followed by 'if' kconfig: doc: fix a typo in the note about 'imply' kconfig: gconf: give a proper initial state to the Save button kconfig: remove unneeded code for user-supplied values being out of range
2024-06-08Merge tag 'media/v6.10-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: - fixes for the new ipu6 driver (and related fixes to mei csi driver) - fix a double debugfs remove logic at mgb4 driver - a documentation fix * tag 'media/v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: media: intel/ipu6: add csi2 port sanity check in notifier bound media: intel/ipu6: update the maximum supported csi2 port number to 6 media: mei: csi: Warn less verbosely of a missing device fwnode media: mei: csi: Put the IPU device reference media: intel/ipu6: fix the buffer flags caused by wrong parentheses media: intel/ipu6: Fix an error handling path in isys_probe() media: intel/ipu6: Move isys_remove() close to isys_probe() media: intel/ipu6: Fix some redundant resources freeing in ipu6_pci_remove() media: Documentation: v4l: Fix ACTIVE route flag media: mgb4: Fix double debugfs remove
2024-06-08Merge tag 'irq-urgent-2024-06-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Ingo Molnar: - Fix possible memory leak the riscv-intc irqchip driver load failures - Fix boot crash in the sifive-plic irqchip driver caused by recently changed boot initialization order - Fix race condition in the gic-v3-its irqchip driver * tag 'irq-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/gic-v3-its: Fix potential race condition in its_vlpi_prop_update() irqchip/sifive-plic: Chain to parent IRQ after handlers are ready irqchip/riscv-intc: Prevent memory leak when riscv_intc_init_common() fails
2024-06-08Merge tag 'x86-urgent-2024-06-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Miscellaneous fixes: - Fix kexec() crash if call depth tracking is enabled - Fix SMN reads on inaccessible registers on certain AMD systems" * tag 'x86-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/amd_nb: Check for invalid SMN reads x86/kexec: Fix bug with call depth tracking
2024-06-08Merge tag 'perf-urgent-2024-06-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf event fix from Ingo Molnar: "Fix race between perf_event_free_task() and perf_event_release_kernel() that can result in missed wakeups and hung tasks" * tag 'perf-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/core: Fix missing wakeup when waiting for context reference
2024-06-08Merge tag 'locking-urgent-2024-06-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking doc fix from Ingo Molnar: "Fix typos in the kerneldoc of some of the atomic APIs" * tag 'locking-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/atomic: scripts: fix ${atomic}_sub_and_test() kerneldoc
2024-06-07Merge tag 'mm-hotfixes-stable-2024-06-07-15-24' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "14 hotfixes, 6 of which are cc:stable. All except the nilfs2 fix affect MM and all are singletons - see the chagelogs for details" * tag 'mm-hotfixes-stable-2024-06-07-15-24' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: nilfs2: fix nilfs_empty_dir() misjudgment and long loop on I/O errors mm: fix xyz_noprof functions calling profiled functions codetag: avoid race at alloc_slab_obj_exts mm/hugetlb: do not call vma_add_reservation upon ENOMEM mm/ksm: fix ksm_zero_pages accounting mm/ksm: fix ksm_pages_scanned accounting kmsan: do not wipe out origin when doing partial unpoisoning vmalloc: check CONFIG_EXECMEM in is_vmalloc_or_module_addr() mm: page_alloc: fix highatomic typing in multi-block buddies nilfs2: fix potential kernel bug due to lack of writeback flag waiting memcg: remove the lockdep assert from __mod_objcg_mlstate() mm: arm64: fix the out-of-bounds issue in contpte_clear_young_dirty_ptes mm: huge_mm: fix undefined reference to `mthp_stats' for CONFIG_SYSFS=n mm: drop the 'anon_' prefix for swap-out mTHP counters
2024-06-07Merge tag 'gpio-fixes-for-v6.10-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio fixes from Bartosz Golaszewski: - interrupt handling and Kconfig fixes for gpio-tqmx86 - add a buffer for storing output values in gpio-tqmx86 as reading back the registers always returns the input values - add missing MODULE_DESCRIPTION()s to several GPIO drivers * tag 'gpio-fixes-for-v6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: gpio: add missing MODULE_DESCRIPTION() macros gpio: tqmx86: fix broken IRQ_TYPE_EDGE_BOTH interrupt type gpio: tqmx86: store IRQ trigger type and unmask status separately gpio: tqmx86: introduce shadow register for GPIO output value gpio: tqmx86: fix typo in Kconfig label
2024-06-07Merge tag 'block-6.10-20240607' of git://git.kernel.dk/linuxLinus Torvalds
Pull block fixes from Jens Axboe: - Fix for null_blk block size validation (Andreas) - NVMe pull request via Keith: - Use reserved tags for special fabrics operations (Chunguang) - Persistent Reservation status masking fix (Weiwen) * tag 'block-6.10-20240607' of git://git.kernel.dk/linux: null_blk: fix validation of block size nvme: fix nvme_pr_* status code parsing nvme-fabrics: use reserved tag for reg read/write command
2024-06-07Merge tag 'io_uring-6.10-20240607' of git://git.kernel.dk/linuxLinus Torvalds
Pull io_uring fixes from Jens Axboe: - Fix a locking order issue with setting max async thread workers (Hagar) - Fix for a NULL pointer dereference for failed async flagged requests using ring provided buffers. This doesn't affect the current kernel, but it does affect older kernels, and is being queued up for 6.10 just to make the stable process easier (me) - Fix for NAPI timeout calculations for how long to busy poll, and subsequently how much to sleep post that if a wait timeout is passed in (me) - Fix for a regression in this release cycle, where we could end up using a partially unitialized match value for io-wq (Su) * tag 'io_uring-6.10-20240607' of git://git.kernel.dk/linux: io_uring: fix possible deadlock in io_register_iowq_max_workers() io_uring/io-wq: avoid garbage value of 'match' in io_wq_enqueue() io_uring/napi: fix timeout calculation io_uring: check for non-NULL file pointer in io_file_can_poll()
2024-06-07Merge tag 'for-6.10-rc2-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - fix handling of folio private changes. The private value holds pointer to our extent buffer structure representing a metadata range. Release and create of the range was not properly synchronized when updating the private bit which ended up in double folio_put, leading to all sorts of breakage - fix a crash, reported as duplicate key in metadata, but caused by a race of fsync and size extending write. Requires prealloc target range + fsync and other conditions (log tree state, timing) - fix leak of qgroup extent records after transaction abort * tag 'for-6.10-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: protect folio::private when attaching extent buffer folios btrfs: fix leak of qgroup extent records after transaction abort btrfs: fix crash on racing fsync and size-extending write into prealloc
2024-06-07Merge tag 'nfsd-6.10-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fix from Chuck Lever: - Fix an occasional memory overwrite caused by a fix added in 6.10 * tag 'nfsd-6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: SUNRPC: Fix loop termination condition in gss_free_in_token_pages()
2024-06-07Merge tag 'riscv-for-linus-6.10-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - Another fix to avoid allocating pages that overlap with ERR_PTR, which manifests on rv32 - A revert for the badaccess patch I incorrectly picked up an early version of * tag 'riscv-for-linus-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: Revert "riscv: mm: accelerate pagefault when badaccess" riscv: fix overlap of allocated page and PTR_ERR
2024-06-07Merge tag 's390-6.10-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Alexander Gordeev: - Do not create PT_LOAD program header for the kenel image when the virtual memory informaton in OS_INFO data is not available. That fixes stand-alone dump failures against kernels that do not provide the virtual memory informaton - Add KVM s390 shared zeropage selftest * tag 's390-6.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: KVM: s390x: selftests: Add shared zeropage test s390/crash: Do not use VM info if os_info does not have it
2024-06-07Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: - Fix spurious CPU hotplug warning message from SETEND emulation code - Fix the build when GCC wasn't inlining our I/O accessor internals * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64/io: add constant-argument check arm64: armv8_deprecated: Fix warning in isndep cpuhp starting process
2024-06-07Merge tag 'platform-drivers-x86-v6.10-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Hans de Goede: - Default silead touchscreen driver to 10 fingers and drop 10 finger setting from all DMI quirks. More of a cleanup then a pure fix, but since the DMI quirks always get updated through the fixes branch this avoids conflicts. - Kconfig fix for randconfig builds - dell-smbios: Fix wrong token data in sysfs - amd-hsmp: Fix driver poking unsupported hw when loaded manually * tag 'platform-drivers-x86-v6.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86/amd/hsmp: Check HSMP support on AMD family of processors platform/x86: dell-smbios: Simplify error handling platform/x86: dell-smbios: Fix wrong token data in sysfs platform/x86: yt2-1380: add CONFIG_EXTCON dependency platform/x86: touchscreen_dmi: Use 2-argument strscpy() platform/x86: touchscreen_dmi: Drop "silead,max-fingers" property Input: silead - Always support 10 fingers
2024-06-07Merge tag 'iommu-fixes-v6.10-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu fixes from Joerg Roedel: "Core: - Make iommu-dma code recognize 'force_aperture' again - Fix for potential NULL-ptr dereference from iommu_sva_bind_device() return value AMD IOMMU fixes: - Fix lockdep splat for invalid wait context - Add feature bit check before enabling PPR - Make workqueue name fit into buffer - Fix memory leak in sysfs code" * tag 'iommu-fixes-v6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/amd: Fix Invalid wait context issue iommu/amd: Check EFR[EPHSup] bit before enabling PPR iommu/amd: Fix workqueue name iommu: Return right value in iommu_sva_bind_device() iommu/dma: Fix domain init iommu/amd: Fix sysfs leak in iommu init