summaryrefslogtreecommitdiff
path: root/arch/s390/include/asm/word-at-a-time.h
AgeCommit message (Collapse)Author
2025-03-18s390: Use inline qualifier for all EX_TABLE and ALTERNATIVE inline assembliesHeiko Carstens
Use asm_inline for all inline assemblies which make use of the EX_TABLE or ALTERNATIVE macros. These macros expand to many lines and the compiler assumes the number of lines within an inline assembly is the same as the number of instructions within an inline assembly. This has an effect on inlining and loop unrolling decisions. In order to avoid incorrect assumptions use asm_inline, which tells the compiler that an inline assembly has the smallest possible size. In order to avoid confusion when asm_inline should be used or not, since a couple of inline assemblies are quite large: the rule is to always use asm_inline whenever the EX_TABLE or ALTERNATIVE macro is used. In specific cases there may be reasons to not follow this guideline, but that should be documented with the corresponding code. Using the inline qualifier everywhere has only a small effect on the kernel image size: add/remove: 0/10 grow/shrink: 19/8 up/down: 1492/-1858 (-366) The only location where this seems to matter is load_unaligned_zeropad() from word-at-a-time.h where the compiler inlines more functions within the dcache code, which is indeed code where performance matters. Suggested-by: Juergen Christ <jchrist@linux.ibm.com> Reviewed-by: Juergen Christ <jchrist@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-02-01kernel.h: removed REPEAT_BYTE from kernel.hTanzir Hasan
This patch creates wordpart.h and includes it in asm/word-at-a-time.h for all architectures. WORD_AT_A_TIME_CONSTANTS depends on kernel.h because of REPEAT_BYTE. Moving this to another header and including it where necessary allows us to not include the bloated kernel.h. Making this implicit dependency on REPEAT_BYTE explicit allows for later improvements in the lib/string.c inclusion list. Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Tanzir Hasan <tanzirh@google.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Link: https://lore.kernel.org/r/20231226-libstringheader-v6-1-80aa08c7652c@google.com Signed-off-by: Kees Cook <keescook@chromium.org>
2023-10-16s390: add support for DCACHE_WORD_ACCESSHeiko Carstens
Implement load_unaligned_zeropad() and enable DCACHE_WORD_ACCESS to speed up string operations in fs/dcache.c and fs/namei.c. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-10-16s390: provide word-at-a-time implementationHeiko Carstens
Provide an s390 specific word-at-a-time implementation. Compared to the generic variant the generated code for has_zero() is slightly better. However find_zero() is much simpler since it reuses the result of __fls() aka flogr() and now comes without any conditional branches, while the generic variant has three of them. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>