summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-11-03s390/qdio: sanitize put_indicatorSebastian Ott
qdio maintains an array of struct indicator_t. put_indicator takes a pointer to a member of a struct indicator_t within that array, calculates the index, and uses the array and the index to get the struct indicator_t. Simply use the pointer directly. Although the pointer happens to point to the first member of that struct use the container_of macro. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Acked-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-11-03s390/qdio: use atomic_cmpxchgSebastian Ott
qdio uses atomic_read to find an unused indicator and atomic_set to flag it as used. This could lead to multiple users getting the same indicator. Use atomic_cmpxchg instead. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Acked-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-11-02s390/nmi: avoid using long-displacement facilityVasily Gorbik
__LC_MCESAD is currently 4528 /* offsetof(struct lowcore, mcesad) */ that would require long-displacement facility for lg, which we don't have on z900. Fixes: 3037a52f9846 ("s390/nmi: do register validation as early as possible") Signed-off-by: Vasily Gorbik <gor@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-11-02s390: pass endianness info to sparseLuc Van Oostenryck
s390 is big-endian only but sparse assumes the same endianness as the building machine. This is problematic for code which expect __BYTE_ORDER__ being correctly predefined by the compiler which sparse can then pre-process differently from what gcc would, depending on the building machine endianness. Fix this by letting sparse know about the architecture endianness. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-26s390/decompressor: remove informational messagesMartin Schwidefsky
The decompressor for bzImage prints two informational messages which are not really helpful. The decompression step is fast and if something bad happens an error message will be printed. Remove the noise. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-26vmur: convert urdev.ref_count from atomic_t to refcount_tElena Reshetova
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable urdev.ref_count is used as pure reference counter. Convert it to refcount_t and fix up the operations. Suggested-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Windsor <dwindsor@gmail.com> Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-26s390/cpum_cf: add hardware counter support for IBM z14Hendrik Brueckner
Add the hardware counters that are available with z14. With z14, the number of problem-state counters is reduced. The initialization is updated respectively. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-23s390/zcrypt: Introduce QACT support for AP bus devices.Harald Freudenberger
This patch introduces a new ap_qact() function which exploits the PQAP(QACT) subfunction. QACT is a new interface to Query the Ap Compatilibity Type based on a given AP qid, type, mode and version. Based on this new function the AP bus scan code is slightly reworked to use this new interface for querying the compatible type for each new AP queue device detected. So new and unknown devices can get automatically mapped to a compatible type and handled without the need for toleration patches for every new hardware. The currently highest known hardware is CEX6S. With this patch a possible successor can get queried for a combatible type known by the device driver without the need for an toleration patch. Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-23s390/zcrypt: Enable special header file flag for AU CPRPHarald Freudenberger
With the CEX6 there is a new CPRB (subfunction AU) used to generate protected keys from secure keys. This new CPRB needs to have the special flag set in the queue message header struct which is introduced with this fix. Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-23s390/zcrypt: CEX6S exploitationHarald Freudenberger
This patch adds the full CEX6S card support to the zcrypt device driver. A CEX6A/C/P is detected and displayed as such, the card and queue device driver code is updated to recognize it and the relative weight values for CEX4, CEX5 and CEX6 have been updated. Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-19s390/nmi: do register validation as early as possibleMartin Schwidefsky
The validation of the CPU registers in the machine check handler is currently split into two parts. The first part is done at the start of the low level mcck_int_handler function, this includes the CPU timer register and the general purpose registers. The second part is done a bit later in s390_do_machine_check for all the other registers, including the control registers, floating pointer control, vector or floating pointer registers, the access registers, the guarded storage registers, the TOD programmable registers and the clock comparator. This is working fine to far but in theory a future extensions could cause the C code to use registers that are not validated yet. A better approach is to validate all CPU registers in "safe" assembler code before any C function is called. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-19s390/nmi: allocation of the extended save areaMartin Schwidefsky
The machine check extended save area is needed to store the vector registers and the guarded storage control block when a CPU is interrupted by a machine check. Move the slab cache allocation of the full save area to nmi.c, for early boot use a static __initdata block. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-19s390/ctl_reg: move control register definitions to ctl_reg.hMartin Schwidefsky
The nmi.h header has some constant defines for control register bits. These definitions should really be located in ctl_reg.h. Move and rename the defines. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-19s390/ctl_reg: use decoding unions in update_cr_regsMartin Schwidefsky
Add a decoding union for the bits in control registers 2 and use 'union ctlreg0' and 'union ctlreg2' in update_cr_regs to improve readability. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-19s390/nmi: use smp_emergency_stop instead of smp_send_stopMartin Schwidefsky
The smp_send_stop() function can be called from s390_handle_damage while DAT is off. This happens if a machine check indicates that kernel gprs or control registers can not be restored. The function smp_send_stop reenables DAT via __load_psw_mask. That should work for the case of lost kernel gprs and the system will do the expected stop of all CPUs. But if control registers are lost, in particular CR13 with the home space ASCE, interesting secondary crashes may occur. Make smp_emergency_stop callable from nmi.c and remove the cpumask argument. Replace the smp_send_stop call with smp_emergency_stop in the s390_handle_damage function. In addition add notrace and NOKPROBE_SYMBOL annotations for all functions required for the emergency shutdown. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-18s390/vdso: move boot_vdso_data to vdso.cMartin Schwidefsky
The boot_vdso_data variable is related to the vdso code, the magic of the initial vdso area for the early boot and the replacement of it in vdso_init should all be put into vdso.c. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-18s390/spinlock: use cpu alternatives to enable niai instructionVasily Gorbik
Enable niai instruction in the spinlock code at run-time for machines on which facility 49 is available (zEC12 and newer). Signed-off-by: Vasily Gorbik <gor@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-18s390: introduce CPU alternativesVasily Gorbik
Implement CPU alternatives, which allows to optionally patch newer instructions at runtime, based on CPU facilities availability. A new kernel boot parameter "noaltinstr" disables patching. Current implementation is derived from x86 alternatives. Although ideal instructions padding (when altinstr is longer then oldinstr) is added at compile time, and no oldinstr nops optimization has to be done at runtime. Also couple of compile time sanity checks are done: 1. oldinstr and altinstr must be <= 254 bytes long, 2. oldinstr and altinstr must not have an odd length. alternative(oldinstr, altinstr, facility); alternative_2(oldinstr, altinstr1, facility1, altinstr2, facility2); Both compile time and runtime padding consists of either 6/4/2 bytes nop or a jump (brcl) + 2 bytes nop filler if padding is longer then 6 bytes. .altinstructions and .altinstr_replacement sections are part of __init_begin : __init_end region and are freed after initialization. Signed-off-by: Vasily Gorbik <gor@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-18s390/dasd: remove unused debug macrosSebastian Ott
Get rid of unused wrapper macros around debug_sprintf_exception. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-18s390/debug: only write data onceSebastian Ott
debug_event_common memsets the active debug entry with zeros to prevent stale data leakage. This is overwritten with the actual debug data in the next step. Only write zeros to that part of the debug entry that's not used by new debug data. Micro benchmarks show a 2-10% reduction of cpu cycles with this approach. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Acked-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-18s390/debug: improve debug_eventSebastian Ott
debug_event currently truncates the data if used with a size larger than the buf_size of the debug feature. For lots of callers of this function, wrappers have been implemented that loop until all data is handled. Move that functionality into debug_event_common and get rid of the wrappers. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Acked-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-18s390/kexec: Fix checksum validation return code for kdumpPhilipp Rudo
Before kexec boots to a crash kernel it checks whether the image in memory changed after load. This is done by the function kdump_csum_valid, which returns true, i.e. an int != 0, on success and 0 otherwise. In other words when kdump_csum_valid returns an error code it means that the validation succeeded. This is not only counterintuitive but also produces the wrong result if the kernel was build without CONFIG_CRASH_DUMP. Fix this by making kdump_csum_valid return a bool. Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com> Acked-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-16Merge tag 'vfio-ccw-20171016' of ↵Martin Schwidefsky
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/vfio-ccw into features Pull vfio-ccw update from Cornelia Huck: "Some improvements in data handling for vfio-ccw."
2017-10-16vfio: ccw: validate the count field of a ccw before pinningDong Jia Shi
If the count field of a ccw is zero, there is no need to try to pin page(s) for it. Let's check the count value before starting pinning operations. Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com> Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Message-Id: <20171011023822.42948-3-bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-10-16vfio: ccw: bypass bad idaw address when fetching IDAL ccwsDong Jia Shi
We currently return the same error code (-EFAULT) to indicate two different error cases: 1. a bug in vfio-ccw implementation has been found. 2. a buggy channel program has been detected. This brings difficulty for userland program (specifically Qemu) to handle. Let's use -EFAULT to only indicate the first case. For the second case, we simply hand over the ccws to lower level for further handling. Notice: Once a bad idaw address is detected, the current behavior is to suppress the ssch. With this fix, the channel program will be accepted, and part of the channel program (the part ahead of the bad idaw) could possibly be executed by the device before I/O conclusion. Suggested-by: Halil Pasic <pasic@linux.vnet.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com> Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Message-Id: <20171011023822.42948-2-bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2017-10-16s390: update defconfigMartin Schwidefsky
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-16s390/debug: adjust coding styleHeiko Carstens
The debug feature code hasn't been touched in ages and the code also looks like this. Therefore clean up the code so it looks a bit more like current coding style. There is no functional change - actually I made also sure that the generated code with performance_defconfig is identical. A diff of old vs new with "objdump -d" is empty. The code is still not checkpatch clean, but that was not the goal. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-16s390/pkey: fix kzalloc-simple.cocci warningsVasyl Gomonovych
drivers/s390/crypto/pkey_api.c:128:11-18: WARNING: kzalloc should be used for cprbmem, instead of kmalloc/memset Use kzalloc rather than kmalloc followed by memset with 0 Signed-off-by: Vasyl Gomonovych <gomonovych@gmail.com> Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-12s390/kprobes: remove KPROBE_SWAP_INST stateHeiko Carstens
For an unknown reason the s390 kprobes instruction replacement function modifies the kprobe_status of the current CPU to KPROBE_SWAP_INST. This was supposed to catch traps that happened during instruction patching. Such a fault is not supposed to happen, and silently discarding such a fault is certainly also not what we want. In fact s390 is the only architecture which has this odd piece of code. Just remove this and behave like all other architectures. This was pointed out by Jens Remus. Reported-by: Jens Remus <jremus@linux.vnet.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-09s390: cleanup string ops prototypesHeiko Carstens
Just some trivial changes like removing the extern keyword from the header file, renaming arguments to match the man pages, and whitespace removal. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-09s390: optimize memset implementationHeiko Carstens
Like for the memset16/32/64 variants avoid that subsequent mvc instructions depend on each other since that might have negative performance impacts. This patch is currently hardly relevant since at least gcc 7.1 generates only inline memset code and not a single memset call. However there is no reason to not provide an optimized version just in case gcc generates memset calls again, like it did in the past. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-09s390/mm: use memset64 instead of clear_tableHeiko Carstens
Use memset64 instead of the (now) open-coded variant clear_table. Performance wise there is no difference. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-09s390: implement memset16, memset32 & memset64Heiko Carstens
Provide fast versions of the new memset variants. E.g. the generic memset64 is ten times slower than the optimized version if used on a whole page. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-09Merge branch 'sthyi' into featuresMartin Schwidefsky
Add the store-hypervisor-information code into features using a tip branch for parallel merging into the KVM tree.
2017-10-09s390/sthyi: add s390_sthyi system callQingFeng Hao
Add a syscall of s390_sthyi to implement STHYI instruction in LPAR which reuses the implementation for KVM by Janosch Frank - commit 95ca2cb57985 ("KVM: s390: Add sthyi emulation"). STHYI(Store Hypervisor Information) is an emulated z/VM instruction that provides a guest with basic information about the layers it is running on. This includes information about the cpu configuration of both the machine and the lpar, as well as their names, machine model and machine type. This information enables an application to determine the maximum capacity of CPs and IFLs available to software. For the arguments of s390_sthyi, code shall be 0 and flags is reserved for future use, info is the output argument to store the required hypervisor info. Signed-off-by: QingFeng Hao <haoqf@linux.vnet.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-09s390/sthyi: add cache to store hypervisor infoQingFeng Hao
STHYI requires extensive locking in the higher hypervisors and is very computational/memory expensive. Therefore we cache the retrieved hypervisor info whose valid period is 1s with mutex to allow concurrent access. rw semaphore can't benefit here due to cache line bounce. Signed-off-by: QingFeng Hao <haoqf@linux.vnet.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-09s390/sthyi: reorganize sthyi implementationQingFeng Hao
As we need to support sthyi instruction on LPAR too, move the common code to kernel part and kvm related code to intercept.c for better reuse. Signed-off-by: QingFeng Hao <haoqf@linux.vnet.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-04s390: use generic rwsem implementationHeiko Carstens
We never optimized our rwsem inline assemblies to make use of the new atomic instructions. The generic rwsem implementation implicitly makes use of the new instructions, since it implements the required rwsem primitives with atomic operations, which we did optimize. However even when compiling for old architectures the generic variant still generates better code. So it's time to simply remove our old code and switch to the generic implementation. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-09-29s390/disassembler: add new z14 instructionsHeiko Carstens
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-09-29s390/disassembler: add missing z13 instructionsHeiko Carstens
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-09-29s390/disassembler: add sthyi instructionHeiko Carstens
This instruction came with a z/VM extension and not with a specific machine generation. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-09-29s390/disassembler: remove double instructionsHeiko Carstens
Remove a couple of instructions that are listed twice. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-09-29s390/disassembler: fix LRDFU formatHeiko Carstens
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-09-29s390/disassembler: add missing end marker for e7 tableHeiko Carstens
The e7 opcode table does not have an end marker. Hence when trying to find an unknown e7 instruction the code will access memory behind the table until it finds something that matches the opcode, or the kernel crashes, whatever comes first. This affects not only the in-kernel disassembler but also uprobes and kprobes which refuse to set a probe on unknown instructions, and therefore search the opcode tables to figure out if instructions are known or not. Cc: <stable@vger.kernel.org> # v3.18+ Fixes: 3585cb0280654 ("s390/disassembler: add vector instructions") Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-09-29s390/virtio: simplify MakefileHeiko Carstens
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-09-29s390/virtio: remove the old KVM virtio transportThomas Huth
There is no recent user space application available anymore which still supports this old virtio transport. Additionally, commit 3b2fbb3f06ef ("virtio/s390: deprecate old transport") introduced a deprecation message in the driver, and apparently nobody complained so far that it is still required. So let's simply remove it. Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Halil Pasic <pasic@linux.vnet.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-09-29s390/ccwgroup: tie a ccwgroup driver to its ccw driverJulian Wiedmann
When grouping devices, the ccwgroup core only checks whether all of the devices are bound to the same ccw_driver. It has no means of checking if the requesting ccwgroup driver actually supports this device type. qeth implements its own device matching in qeth_core_probe_device(), while ctcm and lcs currently have no sanity-checking at all. Enable ccwgroup drivers to optionally defer the device type checking to the ccwgroup core, by specifying their supported ccw_driver. This allows us drop the device type matching from qeth, and improves the robustness of ctcm and lcs. Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Acked-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-09-29s390/crypto: add s390 platform specific aes gcm support.Harald Freudenberger
This patch introduces gcm(aes) support into the aes_s390 kernel module. Signed-off-by: Patrick Steuer <patrick.steuer@de.ibm.com> Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-09-29s390/crypto: add inline assembly for KMA instruction to cpacf.hPatrick Steuer
Signed-off-by: Patrick Steuer <patrick.steuer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-09-28s390/rwlock: introduce rwlock wait queueingMartin Schwidefsky
Like the common queued rwlock code the s390 implementation uses the queued spinlock code on a spinlock_t embedded in the rwlock_t to achieve the queueing. The encoding of the rwlock_t differs though, the counter field in the rwlock_t is split into two parts. The upper two bytes hold the write bit and the write wait counter, the lower two bytes hold the read counter. The arch_read_lock operation works exactly like the common qrwlock but the enqueue operation for a writer follows a diffent logic. After the failed inline try to get the rwlock in write, the writer first increases the write wait counter, acquires the wait spin_lock for the queueing, and then loops until there are no readers and the write bit is zero. Without the write wait counter a CPU that just released the rwlock could immediately reacquire the lock in the inline code, bypassing all outstanding read and write waiters. For s390 this would cause massive imbalances in favour of writers in case of a contended rwlock. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>