path: root/arch/s390/include/asm/fpu-insn-asm.h
Age | Commit message | Author
2025-06-16 | s390: Replace __ASSEMBLY__ with __ASSEMBLER__ in non-uapi headers | Thomas Huth
While the GCC and Clang compilers already define __ASSEMBLER__ automatically when compiling assembler code, __ASSEMBLY__ is a macro that only gets defined by the Makefiles in the kernel. This is bad since macros starting with two underscores are names that are reserved by the C language. It can also be very confusing for the developers when switching between userspace and kernelspace coding, or when dealing with uapi headers that rather should use __ASSEMBLER__ instead. So let's now standardize on the __ASSEMBLER__ macro that is provided by the compilers.

This is a completely mechanical patch (done with a simple "sed -i" statement), with some manual fixups done later while rebasing the patch.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Link: https://lore.kernel.org/r/20250611140046.137739-3-thuth@redhat.com
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
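As an illustration of the guard this commit standardizes on, here is a minimal single-file sketch. It is not taken from fpu-insn-asm.h: the names `demo_regs`, `demo_is_c_context`, and `DEMO_REG_COUNT` are hypothetical. It relies only on the fact stated above, that the compiler defines `__ASSEMBLER__` itself when preprocessing assembler sources, so no Makefile-provided macro is needed.

```c
#include <stddef.h>

/* Visible to both C and assembler sources: a plain preprocessor constant. */
#define DEMO_REG_COUNT 16

#ifndef __ASSEMBLER__
/*
 * C-only part: type and function definitions that an assembler could not
 * parse. When this text is included from a .S file, the compiler driver
 * defines __ASSEMBLER__ and this whole section is skipped.
 */
struct demo_regs {
	unsigned long gpr[DEMO_REG_COUNT];
};

static inline int demo_is_c_context(void)
{
	return 1;
}
#else
/* Assembler-only part: .macro definitions would go here (none in this sketch). */
#endif /* __ASSEMBLER__ */
```

When this is compiled as C, `__ASSEMBLER__` is undefined, so the struct and the inline function are available; only `DEMO_REG_COUNT` would survive in an assembler translation unit.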
2024-09-13 | s390/vdso: Wire up getrandom() vdso implementation | Heiko Carstens
Provide the s390 specific vdso getrandom() architecture backend.

_vdso_rng_data required data is placed within the _vdso_data vvar page, by using a hardcoded offset larger than vdso_data.

As required the chacha20 implementation does not write to the stack.

The implementation follows more or less the arm64 implementations and makes use of vector instructions. It has a fallback to the getrandom() system call for machines where the vector facility is not installed. The check if the vector facility is installed, as well as an optimization for machines with the vector-enhancements facility 2, is implemented with alternatives, avoiding runtime checks.

Note that __kernel_getrandom() is implemented without the vdso user wrapper which would set up a stack frame for odd cases (aka very old glibc variants) where the caller has not done that. All callers of __kernel_getrandom() are required to set up a stack frame, as the C ABI requires.

The vdso testcases vdso_test_getrandom and vdso_test_chacha pass.

Benchmark on a z16:

$ ./vdso_test_getrandom bench-single
vdso: 25000000 times in 0.493703559 seconds
syscall: 25000000 times in 6.584025337 seconds

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
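To put the z16 benchmark figures above in perspective, the following small sketch derives the per-call cost and the relative speedup from the two reported timings. The helper names `demo_ns_per_call` and `demo_speedup` are hypothetical; only the input numbers come from the commit message.

```c
/* Average cost of one call in nanoseconds, given total seconds and call count. */
static double demo_ns_per_call(double total_seconds, long calls)
{
	return total_seconds * 1e9 / (double)calls;
}

/* How many times faster the "fast" path is compared to the "slow" path. */
static double demo_speedup(double slow_seconds, double fast_seconds)
{
	return slow_seconds / fast_seconds;
}
```

With the reported values, `demo_ns_per_call(0.493703559, 25000000)` comes to a little under 20 ns per vdso call, `demo_ns_per_call(6.584025337, 25000000)` to roughly 263 ns per system call, and `demo_speedup(6.584025337, 0.493703559)` to about 13.3.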
2024-02-16 | s390/checksum: provide csum_partial_copy_nocheck() | Heiko Carstens
With csum_partial(), which reads all bytes into registers, it is easy to also implement csum_partial_copy_nocheck(), which copies the buffer while calculating its checksum.

For a 512 byte buffer this reduces the runtime by 19%. Compared to the old generic variant (memcpy() + cksm instruction) runtime is reduced by 42%.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
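The idea of checksumming while copying can be sketched in portable C. This is not the kernel's implementation (which, per the commit above, works on vector registers) and it does not reproduce the kernel's return type or exact folding; it is a hypothetical helper, `demo_csum_copy`, that computes a 16-bit ones' complement sum over big-endian 16-bit words in the same pass that copies the bytes, so the source buffer is read only once.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: copy len bytes from src to dst and return the folded
 * 16-bit ones' complement sum of the data (not inverted). Bytes are paired
 * into big-endian 16-bit words; an odd trailing byte is padded with zero.
 */
static uint16_t demo_csum_copy(void *dst, const void *src, size_t len)
{
	const uint8_t *s = src;
	uint8_t *d = dst;
	uint64_t sum = 0;
	size_t i;

	/* Each byte is loaded once and used for both the copy and the sum. */
	for (i = 0; i + 1 < len; i += 2) {
		d[i] = s[i];
		d[i + 1] = s[i + 1];
		sum += ((uint32_t)s[i] << 8) | s[i + 1];
	}
	if (i < len) {
		d[i] = s[i];
		sum += (uint32_t)s[i] << 8;
	}

	/* End-around carry: fold the upper bits back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)sum;
}
```

For the eight bytes `00 01 f2 03 f4 f5 f6 f7`, the word sum is 0x2ddf0, which folds to 0xddf2. The separate memcpy() plus checksum pass that the commit message compares against would touch the same data twice.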
2024-02-16 | s390/checksum: provide vector register variant of csum_partial() | Heiko Carstens
Provide a faster variant of csum_partial() which uses vector registers instead of the cksm instruction.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-02-16 | s390/fpu: move, rename, and merge header files | Heiko Carstens
Move, rename, and merge the fpu and vx header files. This way fpu header files have a consistent naming scheme (fpu*.h).

Also get rid of the fpu subdirectory and move header files to asm directory, so that all fpu and vx header files can be found at the same location.

Merge internal.h header file into other header files, since the internal helpers are used at many locations, so those helper functions are really not internal.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>