author | Michael Ellerman <mpe@ellerman.id.au> | 2022-12-02 18:04:56 +1100 |
---|---|---|
committer | Michael Ellerman <mpe@ellerman.id.au> | 2022-12-02 18:04:56 +1100 |
commit | 22db71bcba826c607324a8ee1b21f5cf7ec71e8b (patch) | |
tree | 69e8c363296c8b68184e9fc9d9eb702289d5fabc /arch/powerpc/Kconfig | |
parent | 94ba4f2c33f42dae7813dc169a177e922a39560c (diff) | |
parent | 0b2199841a7952d01a717b465df028b40b2cf3e9 (diff) |
Merge branch 'topic/qspinlock' into next
Merge Nick's powerpc qspinlock implementation. From his cover letter:
This replaces the generic queued spinlock code (like s390 does) with our
own implementation.
Generic PV qspinlock code is causing latency / starvation regressions on
large systems that are resulting in hard lockups reported (mostly in
pathological cases). The generic qspinlock code has a number of issues
important for powerpc hardware and hypervisors that aren't easily solved
without changing code that would impact other architectures. Follow
s390's lead and implement our own for now.
Issues for powerpc using generic qspinlocks:
- The previous lock value should not be loaded with simple loads, and
need not be passed around from previous loads or cmpxchg results,
because powerpc uses ll/sc-style atomics which can perform more
complex operations that do not require this. powerpc implementations
tend to prefer that loads use larx for improved coherency performance.
- The queueing process should absolutely minimise the number of stores
to the lock word to reduce exclusive coherency probes, important for
large system scalability. The pending logic is counterproductive
here.
- Non-atomic unlock for paravirt locks is important (atomic
instructions tend to still be more expensive than on x86 CPUs).
- Yielding to the lock owner is important in the oversubscribed
paravirt case, which requires storing the owner CPU in the lock
word.
- More control of lock stealing for the paravirt case is important to
keep latency down on large systems.
- The lock acquisition operation should always be made with a special
variant of atomic instructions with the lock hint bit set,
including (especially) in the queueing paths. This is more a matter
of adding more arch lock helpers so not an insurmountable problem
for generic code.
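To make two of the points above concrete (storing the owner CPU in the lock word, and a non-atomic unlock), here is a minimal userspace sketch in C11. It is not the code from this merge: the `sketch_*` names and the bit layout (bit 31 as the locked flag, the low 16 bits as the owner CPU) are assumptions made up for illustration, and the sketch has no queue, no lock stealing, and no lock hint bit.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical lock-word layout, for illustration only:
 * bit 31 is the "locked" flag, bits 0..15 hold the owner CPU number. */
#define SKETCH_LOCKED     0x80000000u
#define SKETCH_OWNER_MASK 0x0000ffffu

typedef struct {
    _Atomic uint32_t val;
} sketch_lock_t;

/* Build the value a holder on `cpu` would write into the lock word. */
static inline uint32_t sketch_locked_val(unsigned int cpu)
{
    return SKETCH_LOCKED | (cpu & SKETCH_OWNER_MASK);
}

/* Try once to take the lock, recording the caller's CPU as the owner.
 * On ll/sc machines the compiler lowers this to a larx/stcx. style loop. */
static bool sketch_trylock(sketch_lock_t *lock, unsigned int cpu)
{
    uint32_t expected = 0;
    return atomic_compare_exchange_strong_explicit(
        &lock->val, &expected, sketch_locked_val(cpu),
        memory_order_acquire, memory_order_relaxed);
}

/* Return the owner CPU, or -1 if the lock is free. A waiter in an
 * oversubscribed guest could use this to pick a vCPU to yield to. */
static int sketch_owner(sketch_lock_t *lock)
{
    uint32_t v = atomic_load_explicit(&lock->val, memory_order_relaxed);
    return (v & SKETCH_LOCKED) ? (int)(v & SKETCH_OWNER_MASK) : -1;
}

/* Unlock with a plain release store instead of a read-modify-write.
 * Clearing the whole word is only valid here because this sketch keeps
 * no queue-tail bits in it. */
static void sketch_unlock(sketch_lock_t *lock)
{
    atomic_store_explicit(&lock->val, 0, memory_order_release);
}
```

A waiter that fails `sketch_trylock()` can call `sketch_owner()` to learn which CPU holds the lock. The unlock path never needs an atomic read-modify-write, which is the property the cover letter calls out for paravirt locks.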
Diffstat (limited to 'arch/powerpc/Kconfig')
-rw-r--r-- | arch/powerpc/Kconfig | 3 |
1 files changed, 1 insertions, 2 deletions
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 1a134c9769f8..fe2aa445b654 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -99,7 +99,7 @@ config LOCKDEP_SUPPORT
 config GENERIC_LOCKBREAK
 	bool
 	default y
-	depends on SMP && PREEMPTION
+	depends on SMP && PREEMPTION && !PPC_QUEUED_SPINLOCKS
 
 config GENERIC_HWEIGHT
 	bool
@@ -158,7 +158,6 @@ config PPC
 	select ARCH_USE_CMPXCHG_LOCKREF if PPC64
 	select ARCH_USE_MEMTEST
 	select ARCH_USE_QUEUED_RWLOCKS if PPC_QUEUED_SPINLOCKS
-	select ARCH_USE_QUEUED_SPINLOCKS if PPC_QUEUED_SPINLOCKS
 	select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT
 	select ARCH_WANT_IPC_PARSE_VERSION
 	select ARCH_WANT_IRQS_OFF_ACTIVATE_MM