summaryrefslogtreecommitdiff
path: root/arch/arm64/include/asm/percpu.h
AgeCommit message (Collapse)Author
2020-12-04KVM: arm64: Support per_cpu_ptr in nVHE hyp codeDavid Brazdil
When compiling with __KVM_NVHE_HYPERVISOR__, redefine per_cpu_offset() to __hyp_per_cpu_offset() which looks up the base of the nVHE per-CPU region of the given cpu and computes its offset from the .hyp.data..percpu section. This enables use of per_cpu_ptr() helpers in nVHE hyp code. Until now only this_cpu_ptr() was supported by setting TPIDR_EL2. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201202184122.26046-14-dbrazdil@google.com
2020-09-30kvm: arm64: Remove __hyp_this_cpu_readDavid Brazdil
this_cpu_ptr is meant for use in kernel proper because it selects between TPIDR_EL1/2 based on nVHE/VHE. __hyp_this_cpu_ptr was used in hyp to always select TPIDR_EL2. Unify all users behind this_cpu_ptr and friends by selecting _EL2 register under __KVM_NVHE_HYPERVISOR__. VHE continues selecting the register using alternatives. Under CONFIG_DEBUG_PREEMPT, the kernel helpers perform a preemption check which is omitted by the hyp helpers. Preserve the behavior for nVHE by overriding the corresponding macros under __KVM_NVHE_HYPERVISOR__. Extend the checks into VHE hyp code. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Andrew Scull <ascull@google.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200922204910.7265-5-dbrazdil@google.com
2019-06-19treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234Thomas Gleixner
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not see http www gnu org licenses extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 503 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Enrico Weigelt <info@metux.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190602204653.811534538@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-12arm64: percpu: Fix LSE implementation of value-returning pcpu atomicsWill Deacon
Commit 959bf2fd03b5 ("arm64: percpu: Rewrite per-cpu ops to allow use of LSE atomics") introduced alternative code sequences for the arm64 percpu atomics, so that the LSE instructions can be patched in at runtime if they are supported by the CPU. Unfortunately, when patching in the LSE sequence for a value-returning pcpu atomic, the argument registers are the wrong way round. The implementation of this_cpu_add_return() therefore ends up adding uninitialised stack to the percpu variable and returning garbage. As it turns out, there aren't very many users of the value-returning percpu atomics in mainline and we only spotted this due to a failure in the kprobes selftests. In this case, when attempting to single-step over the out-of-line instruction slot, the debug monitors would not be enabled because calling this_cpu_inc_return() on the kernel debug monitor refcount would fail to detect the transition from 0. We would consequently execute past the slot and take an undefined instruction exception from the kernel, resulting in a BUG: | kernel BUG at arch/arm64/kernel/traps.c:421! | PREEMPT SMP | pc : do_undefinstr+0x268/0x278 | lr : do_undefinstr+0x124/0x278 | Process swapper/0 (pid: 1, stack limit = 0x(____ptrval____)) | Call trace: | do_undefinstr+0x268/0x278 | el1_undef+0x10/0x78 | 0xffff00000803c004 | init_kprobes+0x150/0x180 | do_one_initcall+0x74/0x178 | kernel_init_freeable+0x188/0x224 | kernel_init+0x10/0x100 | ret_from_fork+0x10/0x1c Fix the argument order to get the value-returning pcpu atomics working correctly when implemented using the LSE instructions. Reported-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-07arm64: percpu: Rewrite per-cpu ops to allow use of LSE atomicsWill Deacon
Our percpu code is a bit of an inconsistent mess: * It rolls its own xchg(), but reuses cmpxchg_local() * It uses various different flavours of preempt_{enable,disable}() * It returns values even for the non-returning RmW operations * It makes no use of LSE atomics outside of the cmpxchg() ops * There are individual macros for different sizes of access, but these are all funneled through a switch statement rather than dispatched directly to the relevant case This patch rewrites the per-cpu operations to address these shortcomings. Whilst the new code is a lot cleaner, the big advantage is that we can use the non-returning ST- atomic instructions when we have LSE. Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-09-25arm64: percpu: Initialize ret in the default caseNathan Chancellor
Clang warns that if the default case is taken, ret will be uninitialized. ./arch/arm64/include/asm/percpu.h:196:2: warning: variable 'ret' is used uninitialized whenever switch default is taken [-Wsometimes-uninitialized] default: ^~~~~~~ ./arch/arm64/include/asm/percpu.h:200:9: note: uninitialized use occurs here return ret; ^~~ ./arch/arm64/include/asm/percpu.h:157:19: note: initialize the variable 'ret' to silence this warning unsigned long ret, loop; ^ = 0 This warning appears several times while building the erofs filesystem. While it's not strictly wrong, the BUILD_BUG will prevent this from becoming a true problem. Initialize ret to 0 in the default case right before the BUILD_BUG to silence all of these warnings. Reported-by: Prasad Sodagudi <psodagud@codeaurora.org> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Dennis Zhou <dennis@kernel.org>
2018-03-27arm64: move percpu cmpxchg implementation from cmpxchg.h to percpu.hWill Deacon
We want to avoid pulling linux/preempt.h into cmpxchg.h, since that can introduce a circular dependency on linux/bitops.h. linux/preempt.h is only needed by the per-cpu cmpxchg implementation, which is better off alongside the per-cpu xchg implementation in percpu.h, so move it there and add the missing #include. Reported-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-01-13arm64: alternatives: use tpidr_el2 on VHE hostsJames Morse
Now that KVM uses tpidr_el2 in the same way as Linux's cpu_offset in tpidr_el1, merge the two. This saves KVM from save/restoring tpidr_el1 on VHE hosts, and allows future code to blindly access per-cpu variables without triggering world-switch. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-11-11arm64: factor out current_stack_pointerMark Rutland
We define current_stack_pointer in <asm/thread_info.h>, though other files and header relying upon it do not have this necessary include, and are thus fragile to changes in the header soup. Subsequent patches will affect the header soup such that directly including <asm/thread_info.h> may result in a circular header include in some of these cases, so we can't simply include <asm/thread_info.h>. Instead, factor current_thread_info into its own header, and have all existing users include this explicitly. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Laura Abbott <labbott@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-11-09arm64: percpu: kill off final ACCESS_ONCE() usesMark Rutland
For several reasons it is preferable to use {READ,WRITE}_ONCE() rather than ACCESS_ONCE(). For example, these handle aggregate types, result in shorter source code, and better document the intended access (which may be useful for instrumentation features such as the upcoming KTSAN). Over a number of patches, most uses of ACCESS_ONCE() in arch/arm64 have been migrated to {READ,WRITE}_ONCE(). For consistency, and the above reasons, this patch migrates the final remaining uses. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Acked-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-10-19arm64: percpu: rewrite ll/sc loops in assemblyWill Deacon
Writing the outer loop of an LL/SC sequence using do {...} while constructs potentially allows the compiler to hoist memory accesses between the STXR and the branch back to the LDXR. On CPUs that do not guarantee forward progress of LL/SC loops when faced with memory accesses to the same ERG (up to 2k) between the failed STXR and the branch back, we may end up livelocking. This patch avoids this issue in our percpu atomics by rewriting the outer loop as part of the LL/SC inline assembly block. Cc: <stable@vger.kernel.org> Fixes: f97fc810798c ("arm64: percpu: Implement this_cpu operations") Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09arm64: use preempt_disable_notrace in _percpu_read/writeChunyan Zhang
When debug preempt or preempt tracer is enabled, preempt_count_add/sub() can be traced by function and function graph tracing, and preempt_disable/enable() would call preempt_count_add/sub(), so in Ftrace subsystem we should use preempt_disable/enable_notrace instead. In the commit 345ddcc882d8 ("ftrace: Have set_ftrace_pid use the bitmap like events do") the function this_cpu_read() was added to trace_graph_entry(), and if this_cpu_read() calls preempt_disable(), graph tracer will go into a recursive loop, even if the tracing_on is disabled. So this patch change to use preempt_enable/disable_notrace instead in this_cpu_read(). Since Yonghui Yang helped a lot to find the root cause of this problem, so also add his SOB. Signed-off-by: Yonghui Yang <mark.yang@spreadtrum.com> Signed-off-by: Chunyan Zhang <zhang.chunyan@linaro.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-07-27arm64: force CONFIG_SMP=y and remove redundant #ifdefsWill Deacon
Nobody seems to be producing !SMP systems anymore, so this is just becoming a source of kernel bugs, particularly if people want to use coherent DMA with non-shared pages. This patch forces CONFIG_SMP=y for arm64, removing a modest amount of code in the process. Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-03-24arm64: percpu: Make this_cpu accessors pre-empt safeSteve Capper
this_cpu operations were implemented for arm64 in: 5284e1b arm64: xchg: Implement cmpxchg_double f97fc81 arm64: percpu: Implement this_cpu operations Unfortunately, it is possible for pre-emption to take place between address generation and data access. This can lead to cases where data is being manipulated by this_cpu for a different CPU than it was called on. Which effectively breaks the spec. This patch disables pre-emption for the this_cpu operations guaranteeing that address generation and data manipulation take place without a pre-emption in-between. Fixes: 5284e1b4bc8a ("arm64: xchg: Implement cmpxchg_double") Fixes: f97fc810798c ("arm64: percpu: Implement this_cpu operations") Reported-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Steve Capper <steve.capper@linaro.org> [catalin.marinas@arm.com: remove space after type cast] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-11-20arm64: percpu: Implement this_cpu operationsSteve Capper
The generic this_cpu operations disable interrupts to ensure that the requested operation is protected from pre-emption. For arm64, this is overkill and can hurt throughput and latency. This patch provides arm64 specific implementations for the this_cpu operations. Rather than disable interrupts, we use the exclusive monitor or atomic operations as appropriate. The following operations are implemented: add, add_return, and, or, read, write, xchg. We also wire up a cmpxchg implementation from cmpxchg.h. Testing was performed using the percpu_test module and hackbench on a Juno board running 3.18-rc4. Signed-off-by: Steve Capper <steve.capper@linaro.org> Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-09-08arm64: LLVMLinux: Use global stack register variable for aarch64Mark Charlebois
To support both Clang and GCC, use the global stack register variable vs a local register variable. Author: Mark Charlebois <charlebm@gmail.com> Signed-off-by: Mark Charlebois <charlebm@gmail.com> Signed-off-by: Behan Webster <behanw@converseincode.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-02-28arm64: Fix !CONFIG_SMP kernel buildCatalin Marinas
Commit fb4a96029c8a (arm64: kernel: fix per-cpu offset restore on resume) uses per_cpu_offset() unconditionally during CPU wakeup, however, this is only defined for the SMP case. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Reported-by: Dave P Martin <Dave.Martin@arm.com>
2013-12-19arm64: percpu: implement optimised pcpu access using tpidr_el1Will Deacon
This patch implements optimised percpu variable accesses using the el1 r/w thread register (tpidr_el1) along the same lines as arch/arm/. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>