path: root/arch/powerpc/include/asm/vdso/processor.h
author: Nicholas Piggin <npiggin@gmail.com> 2022-09-20 22:22:59 +1000
committer: Michael Ellerman <mpe@ellerman.id.au> 2022-09-28 19:22:10 +1000
commit: 9c7bfc2dc21e737e8e4a753630bce675e1e7c0ad
tree: b0feda9271c2502caa3e8f2b6fac5f51460a2dc3
parent: dabeb572adf24bbd7cb21d1cc4d118bdf2c2ab74
powerpc/64s: Make POWER10 and later use pause_short in cpu_relax loops
We want to move away from using SMT priority updates for cpu_relax, and use a 'wait' instruction which is similar to x86. As well as being a much better fit for what everybody else uses and tests with, priority nops are stateful which is nasty (interrupts have to consider they might be taken at a different priority), and they're expensive to execute, similar to a mtSPR which can affect other threads in the pipe.

This has been shown to give results that are less affected by code alignment on benchmarks that cause a lot of spin waiting (e.g., rwsem contention on unixbench filesystem benchmarks) on POWER10.

QEMU TCG only supports this instruction correctly since v7.1; versions without the fix may cause hangs when running POWER10 CPUs.

Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Fix checkpatch warnings RE the macros]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220920122259.363092-2-npiggin@gmail.com
Diffstat (limited to 'arch/powerpc/include/asm/vdso/processor.h')
-rw-r--r-- arch/powerpc/include/asm/vdso/processor.h | 8
1 files changed, 7 insertions, 1 deletions
diff --git a/arch/powerpc/include/asm/vdso/processor.h b/arch/powerpc/include/asm/vdso/processor.h
index 8d79f994b4aa..80d13207c568 100644
--- a/arch/powerpc/include/asm/vdso/processor.h
+++ b/arch/powerpc/include/asm/vdso/processor.h
@@ -22,7 +22,13 @@
 #endif
 #ifdef CONFIG_PPC64
-#define cpu_relax()	do { HMT_low(); HMT_medium(); barrier(); } while (0)
+#define cpu_relax()							\
+	asm volatile(ASM_FTR_IFCLR(					\
+		/* Pre-POWER10 uses low ; medium priority nops */	\
+		"or 1,1,1 ; or 2,2,2",					\
+		/* POWER10 onward uses pause_short (wait 2,0) */	\
+		PPC_WAIT(2, 0),						\
+		%0) :: "i" (CPU_FTR_ARCH_31) : "memory")
 #else
 #define cpu_relax()	barrier()
 #endif
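
For readers unfamiliar with where cpu_relax() sits, the following is a minimal userspace sketch of a bounded spin-wait loop. It is not kernel code and not part of this patch: relax_hint() and spin_wait() are hypothetical names chosen for illustration. The powerpc64 branch reuses only the pre-POWER10 priority-nop pair quoted in the diff; the POWER10 'wait 2,0' path is selected by kernel feature patching (ASM_FTR_IFCLR) and is not reproduced here.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical userspace stand-in for cpu_relax(). Each variant also acts
 * as a compiler barrier through the "memory" clobber, as in the patch.
 */
static inline void relax_hint(void)
{
#if defined(__powerpc64__)
	/* Pre-POWER10 form from the diff: low, then medium priority nop. */
	__asm__ volatile("or 1,1,1 ; or 2,2,2" ::: "memory");
#elif defined(__x86_64__) || defined(__i386__)
	/* x86 spin-loop hint. */
	__asm__ volatile("pause" ::: "memory");
#else
	/* Fallback: compiler barrier only, like the non-PPC64 branch. */
	__asm__ volatile("" ::: "memory");
#endif
}

/*
 * Spin until *flag becomes nonzero or max_spins iterations have passed.
 * Returns true if the flag was observed set, false on timeout.
 */
static bool spin_wait(atomic_int *flag, unsigned long max_spins)
{
	for (unsigned long i = 0; i < max_spins; i++) {
		if (atomic_load_explicit(flag, memory_order_acquire))
			return true;
		relax_hint();	/* hint to the CPU that this is a busy-wait */
	}
	return atomic_load_explicit(flag, memory_order_acquire) != 0;
}
```

In a real program another thread would set the flag with a release store; the iteration bound is only there so the sketch terminates when run single-threaded.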