path: root/arch/riscv/lib
Age  Commit message  Author
2024-02-09  work around gcc bugs with 'asm goto' with outputs  (Linus Torvalds)
We've had issues with gcc and 'asm goto' before, and we created an 'asm_volatile_goto()' macro for that in the past: see commits 3f0116c3238a ("compiler/gcc4: Add quirk for 'asm goto' miscompilation bug") and a9f180345f53 ("compiler/gcc4: Make quirk for asm_volatile_goto() unconditional").

Then, much later, we ended up removing the workaround in commit 43c249ea0b1e ("compiler-gcc.h: remove ancient workaround for gcc PR 58670") because we no longer supported building the kernel with the affected gcc versions, but we left the macro uses around.

Now, Sean Christopherson reports a new version of a very similar problem, which is fixed by re-applying that ancient workaround. But the problem in question is limited to only the 'asm goto with outputs' cases, so instead of re-introducing the old workaround as-is, let's rename and limit the workaround to just that much less common case.

It looks like there are at least two separate issues that all hit in this area:

 (a) some versions of gcc don't mark the asm goto as 'volatile' when it has outputs:

      https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98619
      https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110420

     which is easy to work around by just adding the 'volatile' by hand.

 (b) Internal compiler errors:

      https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110422

     which are worked around by adding the extra empty 'asm' as a barrier, as in the original workaround.

But the problem Sean sees may be a third thing, since it involves bad code generation (not an ICE) even with the manually added 'volatile'. The same old workaround works for this case too, even if this feels a bit like voodoo programming and may only be hiding the issue.

Reported-and-tested-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/all/20240208220604.140859-1-seanjc@google.com/
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Uros Bizjak <ubizjak@gmail.com>
Cc: Jakub Jelinek <jakub@redhat.com>
Cc: Andrew Pinski <quic_apinski@quicinc.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
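The shape of the resulting workaround, as a hedged C sketch (the macro body mirrors what the message describes, a forced 'volatile' plus an empty asm barrier; treat it as illustrative rather than the exact tree contents):

    /* Rename of the old asm_volatile_goto() idea, now limited to the
     * 'asm goto with outputs' case: force 'volatile' by hand (issue (a))
     * and add an empty asm as a barrier afterwards (issue (b)). */
    #define asm_goto_output(x...) \
            do { asm volatile goto(x); asm (""); } while (0)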
2024-01-18  riscv: lib: Check if output in asm goto supported  (Charlie Jenkins)
The output field of an asm goto statement is not supported by all compilers. If it is not supported, fall back to the non-optimized code.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Fixes: a04c192eabfb ("riscv: Add checksum library")
Link: https://lore.kernel.org/r/20240118-csum_remove_output_operands_asm_goto-v2-1-5d1b73cf93d4@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
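A sketch of the compile-time fallback this implies, assuming the kernel's CONFIG_CC_HAS_ASM_GOTO_OUTPUT capability symbol; the function body is purely illustrative:

    static inline unsigned long csum_step(unsigned long x)
    {
    #ifdef CONFIG_CC_HAS_ASM_GOTO_OUTPUT
            unsigned long out;

            /* Optimized path: asm goto with an output operand, only
             * emitted when the compiler supports it. */
            asm goto("beqz %1, %l[fallback]\n\t"
                     "mv %0, %1"
                     : "=r"(out) : "r"(x) : : fallback);
            return out;
    fallback:
    #endif
            return x; /* non-optimized fallback code */
    }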
2024-01-17  Merge patch series "riscv: Add fine-tuned checksum functions"  (Palmer Dabbelt)
Charlie Jenkins <charlie@rivosinc.com> says:

Each architecture generally implements fine-tuned checksum functions to leverage the instruction set. This patch adds the main checksum functions that are used in networking. Tested on QEMU, this series allows the CHECKSUM_KUNIT tests to complete an average of 50.9% faster. This patch makes heavy use of the Zbb extension using alternatives patching. To test this patch, enable the configs for KUNIT, then CHECKSUM_KUNIT.

I have attempted to make these functions as optimal as possible, but I have not run anything on actual riscv hardware. My performance testing has been limited to inspecting the assembly, running the algorithms on x86 hardware, and running in QEMU.

ip_fast_csum is a relatively small function, so even though it is possible to read 64 bits at a time on compatible hardware, the bottleneck becomes the cleanup and setup code, so loading 32 bits at a time is actually faster.

* b4-shazam-merge:
  kunit: Add tests for csum_ipv6_magic and ip_fast_csum
  riscv: Add checksum library
  riscv: Add checksum header
  riscv: Add static key for misaligned accesses
  asm-generic: Improve csum_fold

Link: https://lore.kernel.org/r/20240108-optimize_checksum-v15-0-1c50de5f2167@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-17  riscv: Add checksum library  (Charlie Jenkins)
Provide 32-bit and 64-bit versions of do_csum. When compiled for 32-bit, it will load from the buffer in groups of 32 bits; when compiled for 64-bit, it will load in groups of 64 bits. Additionally, provide a riscv-optimized implementation of csum_ipv6_magic.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Xiao Wang <xiao.w.wang@intel.com>
Link: https://lore.kernel.org/r/20240108-optimize_checksum-v15-4-1c50de5f2167@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
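A minimal C sketch of the word-at-a-time accumulation this describes, assuming an aligned buffer and a whole number of words (the real do_csum also handles heads, tails, and Zbb variants):

    static unsigned short do_csum_sketch(const unsigned long *p, size_t nwords)
    {
            unsigned long csum = 0, w;

            while (nwords--) {
                    w = *p++;
                    csum += w;
                    csum += (csum < w);     /* end-around carry */
            }
            while (csum >> 16)              /* fold down to 16 bits */
                    csum = (csum & 0xffff) + (csum >> 16);
            return (unsigned short)csum;
    }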
2024-01-16  Merge patch series "riscv: support kernel-mode Vector"  (Palmer Dabbelt)
Andy Chiu <andy.chiu@sifive.com> says:

This series provides support for running Vector in kernel mode. Additionally, kernel-mode Vector can be configured to run without turning off preemption on a CONFIG_PREEMPT kernel. Along with the support, we add Vector-optimized copy_{to,from}_user. And provide a simple threshold to decide when to run the vectorized functions.

We decided to drop vectorized memcpy/memset/memmove for the moment due to the concern of memory side-effects in kernel_vector_begin(). The detailed description can be found at v9[0].

This series is composed of 4 parts:
  patch 1-4: adds basic support for kernel-mode Vector
  patch 5: includes vectorized copy_{to,from}_user into the kernel
  patch 6: refactors context switch code in fpu [1]
  patch 7-10: provides some code refactors and support for preemptible kernel-mode Vector

This series can be merged if we feel any part of {1~4, 5, 6, 7~10} is mature enough.

This patch is tested on QEMU with V and verified that booting and normal userspace operations all work as usual with thresholds set to 0. Also, we test by launching multiple kernel threads which continuously execute and verify Vector operations in the background. The module that tests these operations is expected to be upstreamed later.

* b4-shazam-merge:
  riscv: vector: allow kernel-mode Vector with preemption
  riscv: vector: use kmem_cache to manage vector context
  riscv: vector: use a mask to write vstate_ctrl
  riscv: vector: do not pass task_struct into riscv_v_vstate_{save,restore}()
  riscv: fpu: drop SR_SD bit checking
  riscv: lib: vectorize copy_to_user/copy_from_user
  riscv: sched: defer restoring Vector context for user
  riscv: Add vector extension XOR implementation
  riscv: vector: make Vector always available for softirq context
  riscv: Add support for kernel mode vector

Link: https://lore.kernel.org/r/20240115055929.4736-1-andy.chiu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
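A hedged usage sketch of the core API added by patches 1-4: kernel-mode Vector is bracketed by kernel_vector_begin()/kernel_vector_end(); the XOR helper name and body are illustrative:

    void xor_vector_sketch(unsigned long bytes, unsigned long *p1,
                           const unsigned long *p2)
    {
            kernel_vector_begin();  /* save user V state, enable V */
            /* ... vectorized p1[i] ^= p2[i] loop in V assembly ... */
            kernel_vector_end();    /* disable V again */
    }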
2024-01-16  riscv: lib: vectorize copy_to_user/copy_from_user  (Andy Chiu)
This patch utilizes Vector to perform copy_to_user/copy_from_user. If Vector is available and the size of the copy is large enough for Vector to perform better than scalar, then direct the kernel to do Vector copies for userspace. Though the best programming practice for users is to reduce the copy, this provides a faster variant when copies are inevitable.

The optimal size for using Vector, copy_to_user_thres, is only a heuristic for now. We can add DT parsing if people feel the need to customize it.

The exception fixup code of __asm_vector_usercopy must fall back to the scalar one because accessing user pages might fault, and must be sleepable. Current kernel-mode Vector does not allow tasks to be preemptible, so we must deactivate Vector and perform a scalar fallback in such a case.

The original implementation of Vector operations comes from https://github.com/sifive/sifive-libc, which we agree to contribute to the Linux kernel.

Co-developed-by: Jerry Shih <jerry.shih@sifive.com>
Signed-off-by: Jerry Shih <jerry.shih@sifive.com>
Co-developed-by: Nick Knight <nick.knight@sifive.com>
Signed-off-by: Nick Knight <nick.knight@sifive.com>
Suggested-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://lore.kernel.org/r/20240115055929.4736-6-andy.chiu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
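A sketch of the size-threshold dispatch described above; has_vector() and __asm_vector_usercopy() are names from this series, while the threshold symbol is a stand-in for the heuristic:

    unsigned long copy_to_user_dispatch(void __user *to, const void *from,
                                        unsigned long n)
    {
            /* Vector only pays off above some size, and the fixup path
             * must be able to fall back to the scalar copy. */
            if (has_vector() && n >= vector_usercopy_threshold /* stand-in */)
                    return __asm_vector_usercopy(to, from, n);

            return __asm_copy_to_user(to, from, n); /* scalar fallback */
    }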
2024-01-16  riscv: Add vector extension XOR implementation  (Greentime Hu)
This patch adds support for vector-optimized XOR and it is tested in qemu.

Co-developed-by: Han-Kuan Chen <hankuan.chen@sifive.com>
Signed-off-by: Han-Kuan Chen <hankuan.chen@sifive.com>
Signed-off-by: Greentime Hu <greentime.hu@sifive.com>
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://lore.kernel.org/r/20240115055929.4736-4-andy.chiu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-09  use linux/export.h rather than asm-generic/export.h  (Al Viro)
asm-generic/export.h is a wrapper for linux/export.h, with an explicit request to use linux/export.h directly.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Link: https://lore.kernel.org/r/20231214191922.GQ1674809@ZenIV
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-11-06  riscv: Use SYM_*() assembly macros instead of deprecated ones  (Clément Léger)
ENTRY()/END()/WEAK() macros are deprecated and we should make use of the new SYM_*() macros [1] for better annotation of symbols. Replace the deprecated ones with the new ones and fix wrong usage of END()/ENDPROC() to correctly describe the symbols.

[1] https://docs.kernel.org/core-api/asm-annotations.html

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20231024132655.730417-3-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-11-06  riscv: use ".L" local labels in assembly when applicable  (Clément Léger)
For the sake of coherency, use local labels in assembly when applicable. This also avoids confusing kprobes when applying a kprobe, since the size of a function is computed by checking where the next visible symbol is located. This could end up computing a function size that is way shorter than expected, and thus failing to apply a kprobe at the specified offset.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20231024132655.730417-2-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-11-05  RISC-V: capitalise CMO op macros  (Conor Dooley)
The CMO op macros initially used lower case, as the original iteration of the ALT_CMO_OP alternative stringified the first parameter to finalise the assembly for the standard variant. As a knock-on, the T-Head versions of these CMOs had to use mixed-case defines. Commit dd23e9535889 ("RISC-V: replace cbom instructions with an insn-def") removed the asm construction with stringify, replacing it with an insn-def macro, rendering the lower case surplus to requirements.

As far as I can tell from a brief check, CBO_zero does not see similar use and didn't require the mixed-case define in the first place. Replace the lower-case characters now for consistency with other insn-def macros in the standard and T-Head forms, and adjust the callsites.

Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20230915-aloe-dollar-994937477776@spud
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-08-16  riscv: uaccess: Return the number of bytes effectively not copied  (Alexandre Ghiti)
It was reported that the riscv kernel hangs while executing the test in [1]. Indeed, the test hangs when trying to write a buffer to a file. The problem is that the riscv implementation of raw_copy_from_user() does not return the correct number of bytes not written when an exception happens and is fixed up; instead it always returns the initial size to copy, even if some bytes were actually copied.

generic_perform_write() pre-faults the user pages and bails out if nothing can be written; otherwise it will access the userspace buffer. Here the riscv implementation keeps returning that it was not able to copy any byte, though the pre-faulting indicates otherwise. So generic_perform_write() keeps retrying to access the user memory and ends up in an infinite loop.

Note that before the commit mentioned in [1] that introduced this regression, it worked because generic_perform_write() would bail out if only one byte could not be written.

So fix this by returning the number of bytes effectively not written in __asm_copy_[to|from]_user() and __clear_user(), as expected.

Link: https://lore.kernel.org/linux-riscv/20230309151841.bomov6hq3ybyp42a@debian/ [1]
Fixes: ebcbd75e3962 ("riscv: Fix the bug in memory access fixup code")
Reported-by: Bo YU <tsu.yubo@gmail.com>
Closes: https://lore.kernel.org/linux-riscv/20230309151841.bomov6hq3ybyp42a@debian/#t
Reported-by: Aurelien Jarno <aurelien@aurel32.net>
Closes: https://lore.kernel.org/linux-riscv/ZNOnCakhwIeue3yr@aurel32.net/
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Link: https://lore.kernel.org/r/20230811150604.1621784-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
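In sketch form, the contract being fixed (raw_copy_from_user() is real; the wrapper is hypothetical): the return value is the number of bytes NOT copied, so callers can distinguish partial progress from total failure:

    static ssize_t write_chunk(void *dst, const void __user *usrc, size_t bytes)
    {
            size_t not_copied = raw_copy_from_user(dst, usrc, bytes);
            size_t copied = bytes - not_copied;

            if (!copied)            /* truly no progress: bail out */
                    return -EFAULT;
            return copied;          /* partial progress: caller advances */
    }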
2023-04-26  riscv: Allow to downgrade paging mode from the command line  (Alexandre Ghiti)
Add 2 early command line parameters that allow downgrading the satp mode (using the same naming as x86):
  - "no5lvl": use a 4-level page table (down from sv57 to sv48)
  - "no4lvl": use a 3-level page table (down from sv57/sv48 to sv39)

Note that going through the device tree to get the kernel command line works with ACPI too, since the efi stub creates a device tree anyway with the command line.

In KASAN kernels, we can't use libfdt that early in the boot process since we are not ready to execute instrumented functions. So instead of using the "generic" libfdt, we compile our own versions of those functions that are not instrumented and that are prefixed so that they do not conflict with the generic ones. We also need the non-instrumented versions of the string functions and the prefixed versions of memcpy/memmove.

This is largely inspired by commit aacd149b6238 ("arm64: head: avoid relocating the kernel twice for KASLR") from which I removed compilation flags that were not relevant to RISC-V at the moment (LTO, SCS). Also note that we have to link with -z norelro to avoid ld.lld throwing a warning with the new .got sections, like in commit 311bea3cb9ee ("arm64: link with -z norelro for LLD or aarch64-elf").

Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20230424092313.178699-2-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-03-14  RISC-V: Use Zicboz in clear_page when available  (Andrew Jones)
Using memset() to zero a 4K page takes 563 total instructions, where 20 are branches. clear_page(), with Zicboz and a 64-byte block size, takes 169 total instructions, where 4 are branches and 33 are nops. Even though the block size is a variable, thanks to alternatives, we can still implement a Duff device without having to do any preliminary calculations. This is achieved by using the alternatives' cpufeature value (the upper 16 bits of patch_id). The value used is the maximum zicboz block size order accepted at the patch site. This enables us to stop patching / unrolling when 4K bytes have been zeroed (we would loop and continue after 4K if the page size were larger).

For 4K pages, unrolling 16 times allows block sizes of 64 and 128 to only loop a few times and larger block sizes to not loop at all. Since cbo.zero doesn't take an offset, we also need an 'add' after each instruction, making the loop body 112 to 160 bytes. Hopefully this is small enough to not cause icache misses.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20230224162631.405473-7-ajones@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
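A hedged C rendering of the loop the assembly implements (the real clear_page() is assembly patched via alternatives, and the cbo.zero mnemonic assumes a Zicboz-aware assembler):

    static void clear_page_zicboz(void *page, unsigned long block_size)
    {
            void *end = page + 4096;        /* one 4K page */

            /* cbo.zero takes no offset, hence the per-block 'add'
             * (here: the pointer increment). */
            for (; page < end; page += block_size)
                    asm volatile("cbo.zero (%0)" : : "r"(page) : "memory");
    }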
2023-03-14  riscv: lib: Include hwcap.h directly  (Andrew Jones)
When using alternatives for cpufeatures we should include hwcap.h directly, rather than through errata_list.h. Opportunistically drop an unused include too.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
Link: https://lore.kernel.org/r/20230224154601.88163-6-ajones@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-02-28  riscv, lib: Fix Zbb strncmp  (Björn Töpel)
The Zbb-optimized strncmp has two parts: a fast path that does XLEN/8 bytes per iteration, and a slow path that does one byte per iteration. The idea is to compare aligned XLEN-sized chunks for most of the string, and do the remainder tail in the slow path.

The Zbb strncmp has two issues in the fast path:

Incorrect remainder handling (wrong compare): Assume that the string length is 9. On 64b systems, the fast path should do one iteration, and one iteration in the slow path. Instead, both were done in the fast path, which led to incorrect results. An example:
  strncmp("/dev/vda", "/dev/", 5);
Correct by changing "bgt" to "bge".

Missing NULL checks in the second string: This could lead to incorrect results for:
  strncmp("/dev/vda", "/dev/vda\0", 8);
Correct by adding an additional check.

Fixes: b6fcdb191e36 ("RISC-V: add zbb support to string functions")
Suggested-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20230228184211.1585641-1-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
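A simplified C model of the corrected behaviour (64-bit model; word_has_zero() stands in for the orc.b-based test; alignment handling omitted): the fast-path condition is the 'bge', i.e. it runs only while a full word remains, and the NUL test covers the second string too:

    static int word_has_zero(unsigned long w)   /* classic haszero trick */
    {
            return !!((w - 0x0101010101010101UL) & ~w & 0x8080808080808080UL);
    }

    static int strncmp_model(const char *s1, const char *s2, size_t len)
    {
            while (len >= sizeof(unsigned long)) {  /* fast path: 'bge' */
                    unsigned long a = *(const unsigned long *)s1;
                    unsigned long b = *(const unsigned long *)s2;

                    if (a != b || word_has_zero(a) || word_has_zero(b))
                            break;          /* finish in the slow path */
                    s1 += sizeof(unsigned long);
                    s2 += sizeof(unsigned long);
                    len -= sizeof(unsigned long);
            }
            while (len--) {                 /* slow path: one byte/iter */
                    if (*s1 != *s2 || !*s1)
                            return (unsigned char)*s1 - (unsigned char)*s2;
                    s1++; s2++;
            }
            return 0;
    }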
2023-02-28  RISC-V: improve string-function assembly  (Heiko Stuebner)
Adapt the suggestions for the assembly string functions that Andrew suggested but that I didn't manage to include into the series that got applied. This includes improvements to two comments, removal of unneeded labels, and moving one instruction slightly higher so it no longer contradicts its explanatory comment.

Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20230208225328.1636017-3-heiko@sntech.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-02-14  riscv: Fix Zbb alternative IDs  (Samuel Holland)
Commit 4bf8860760d9 ("riscv: cpufeature: extend riscv_cpufeature_patch_func to all ISA extensions") switched ISA extension alternatives to use the RISCV_ISA_EXT_* macros instead of CPUFEATURE_*. This was mismerged when applied on top of the Zbb series, so the Zbb alternatives referenced the wrong errata ID values.

Fixes: 9daca9a5b9ac ("Merge patch series "riscv: improve boot time isa extensions handling"")
Signed-off-by: Samuel Holland <samuel@sholland.org>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Guo Ren <guoren@kernel.org>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20230212021534.59121-3-samuel@sholland.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-31  RISC-V: add zbb support to string functions  (Heiko Stuebner)
Add handling for the Zbb extension and add support for using it as a variant for optimized string functions. Support for the Zbb-str-variants is limited to the GNU assembler for now, as LLVM has not yet acquired the functionality to selectively change the arch option in assembler code. This is still under review at https://reviews.llvm.org/D123515.

Co-developed-by: Christoph Muellner <christoph.muellner@vrull.eu>
Signed-off-by: Christoph Muellner <christoph.muellner@vrull.eu>
Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20230113212301.3534711-3-heiko@sntech.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-31  RISC-V: add infrastructure to allow different str* implementations  (Heiko Stuebner)
Depending on the extensions supported by specific RISC-V cores, optimized str* functions might make sense. This adds basic infrastructure to allow patching the function calls via alternatives later on.

The Linux kernel provides standard implementations for string functions, but when architectures want to extend them, they need to provide their own. The added generic string functions are done in assembler (taken from disassembling the main-kernel functions for now) to allow us to control the used registers and extend them with optimized variants.

This doesn't override the compiler's use of builtin replacements: the compiler will still decide first whether a builtin is better suited, e.g. for known strings. For all regular cases we will want to later select possible optimized variants, and in the worst case fall back to the generic implementation added with this change.

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20230113212301.3534711-2-heiko@sntech.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-08-10  riscv: lib: uaccess: fix CSR_STATUS SR_SUM bit  (Chen Lifu)
Since commit 5d8544e2d007 ("RISC-V: Generic library routines and assembly") and commit ebcbd75e3962 ("riscv: Fix the bug in memory access fixup code"), if __clear_user and __copy_user return from a fixup branch, the CSR_STATUS SR_SUM bit is left set. This is a vulnerability: S-mode memory accesses to pages that are accessible by U-mode will succeed. Disabling S-mode access to U-mode memory requires clearing the SR_SUM bit.

Fixes: 5d8544e2d007 ("RISC-V: Generic library routines and assembly")
Fixes: ebcbd75e3962 ("riscv: Fix the bug in memory access fixup code")
Signed-off-by: Chen Lifu <chenlifu@huawei.com>
Reviewed-by: Ben Dooks <ben.dooks@codethink.co.uk>
Link: https://lore.kernel.org/r/20220615014714.1650349-1-chenlifu@huawei.com
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-10  riscv: Fixed misaligned memory access. Fixed pointer comparison.  (Michael T. Kloos)
Rewrote the RISC-V memmove() assembly implementation. The previous implementation did not check memory alignment and it compared 2 pointers with a signed comparison. The misaligned memory access would cause the kernel to crash on systems that did not emulate it in firmware and did not support it in hardware. Firmware emulation is slow and may not exist. The RISC-V spec does not guarantee that support for misaligned memory accesses will exist. It should not be depended on.

This patch now checks for XLEN granularity of co-alignment between the pointers. Failing that, copying is done by loading from the 2 contiguous and naturally aligned XLEN-sized memory locations containing the overlapping XLEN-sized data to be copied. The data is shifted into the correct place and binary or'ed together on each iteration. The result is then stored into the corresponding naturally aligned XLEN-sized location in the destination. For unaligned data at the terminations of the regions to be copied, or for copies less than (2 * XLEN) in size, byte copy is used.

This patch also now uses unsigned comparison for the pointers and migrates to the newer assembler annotations from the now-deprecated ones.

Signed-off-by: Michael T. Kloos <michael@michaelkloos.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
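A hedged C sketch of that shift-and-or combining for a forward copy when src sits 'off' bytes past XLEN alignment (off != 0; little-endian, as on RISC-V):

    static void copy_misaligned_words(unsigned long *dst,
                                      const unsigned long *src_aligned,
                                      size_t nwords, unsigned int off)
    {
            unsigned int rshift = 8 * off;
            unsigned int lshift = 8 * (sizeof(unsigned long) - off);
            unsigned long lo = *src_aligned++;      /* first aligned load */

            while (nwords--) {
                    unsigned long hi = *src_aligned++;

                    /* splice the two naturally aligned words together */
                    *dst++ = (lo >> rshift) | (hi << lshift);
                    lo = hi;        /* reuse the high word next round */
            }
    }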
2022-01-05  riscv: extable: consolidate definitions  (Jisheng Zhang)
This is a riscv port of commit 819771cc2892 ("arm64: extable: consolidate definitions").

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-05  riscv: lib: uaccess: fold fixups into body  (Jisheng Zhang)
uaccess functions such as __asm_copy_to_user(), __arch_copy_from_user() and __clear_user() place their exception fixups in the `.fixup` section without any clear association with themselves. If we backtrace the fixup code, it will be symbolized as an offset from the nearest prior symbol. As arm64 does, we must move fixups into the body of the functions themselves, after the usual fast-path returns.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-05  riscv: switch to relative exception tables  (Jisheng Zhang)
Similar to other architectures such as arm64 and x86, use offsets relative to the exception table entry values rather than absolute addresses for both the exception location and the fixup. However, a RISC-V label difference will actually produce two relocations, a pair of R_RISCV_ADD32 and R_RISCV_SUB32. Take the simple code below for example:

  $ cat test.S
  .section .text
  1:
          nop
  .section __ex_table,"a"
          .balign 4
          .long (1b - .)
  .previous

  $ riscv64-linux-gnu-gcc -c test.S
  $ riscv64-linux-gnu-readelf -r test.o
  Relocation section '.rela__ex_table' at offset 0x100 contains 2 entries:
    Offset          Info           Type       Sym. Value   Sym. Name + Addend
  000000000000  000600000023 R_RISCV_ADD32  0000000000000000 .L1^B1 + 0
  000000000000  000500000027 R_RISCV_SUB32  0000000000000000 .L0 + 0

modpost will complain about the R_RISCV_SUB32 relocation, so we need to patch modpost.c to skip this relocation for the .rela__ex_table section.

After this patch, the __ex_table section size of the defconfig vmlinux is reduced from 7072 bytes to 3536 bytes.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
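A sketch of how such a relative entry is consumed, mirroring the arm64 scheme this ports: the stored value is "target - &field", so the absolute address is recovered by adding the field's own address back in:

    struct exception_table_entry {
            int insn, fixup;        /* offsets relative to each field */
    };

    static inline unsigned long
    ex_to_insn(const struct exception_table_entry *ex)
    {
            return (unsigned long)&ex->insn + ex->insn;
    }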
2021-11-09  include/linux/delay.h: replace kernel.h with the necessary inclusions  (Andy Shevchenko)
When kernel.h is used in the headers it adds a lot to dependency hell, especially when circular dependencies are involved. Replace the kernel.h inclusion with the list of what is really being used.

[akpm@linux-foundation.org: cxd2880_common.h needs bits.h for GENMASK()]
[andriy.shevchenko@linux.intel.com: delay.h: fix for removed kernel.h]
Link: https://lkml.kernel.org/r/20211028170143.56523-1-andriy.shevchenko@linux.intel.com
[akpm@linux-foundation.org: include/linux/fwnode.h needs bits.h for BIT()]
Link: https://lkml.kernel.org/r/20211027150324.79827-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-23  riscv: __asm_copy_to-from_user: Fix: Typos in comments  (Akira Tsukamoto)
Fix typos and grammar mistakes and use a more intuitive label name.

Signed-off-by: Akira Tsukamoto <akira.tsukamoto@gmail.com>
Fixes: ca6eaaa210de ("riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall")
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-07-23  riscv: __asm_copy_to-from_user: Remove unnecessary size check  (Akira Tsukamoto)
Clean up: a size of 0 will be evaluated in the next step, so the check is not required here.

Signed-off-by: Akira Tsukamoto <akira.tsukamoto@gmail.com>
Fixes: ca6eaaa210de ("riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall")
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-07-23  riscv: __asm_copy_to-from_user: Fix: fail on RV32  (Akira Tsukamoto)
There was a bug when converting bytes to bits when the cpu was rv32. a3 contains the number of bytes, and multiplying it by 8 gives the bits. LGREG holds 2 for RV32 and 3 for RV64, so to multiply by 8 the shift must always be the constant 3. The 2 was mistakenly used for rv32.

Signed-off-by: Akira Tsukamoto <akira.tsukamoto@gmail.com>
Fixes: ca6eaaa210de ("riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall")
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-07-23  riscv: __asm_copy_to-from_user: Fix: overrun copy  (Akira Tsukamoto)
There were two causes of the overrun memory access.

The threshold size was too small: aligning dst requires one SZREG and the unrolled word copy requires 8*SZREG, so the total has to be at least 9*SZREG.

Inside the unrolled copy, subtracting -(8*SZREG-1) made the iteration run one extra loop. The proper value is -(8*SZREG).

Signed-off-by: Akira Tsukamoto <akira.tsukamoto@gmail.com>
Fixes: ca6eaaa210de ("riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall")
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-07-06  riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall  (Akira Tsukamoto)
This patch reduces cpu usage dramatically in kernel space, especially for applications which use syscalls with large buffer sizes, such as network applications. The main reason behind this is that every unaligned memory access raises an exception and switches between s-mode and m-mode, causing large overhead.

First, copy in bytes until the destination memory address reaches the first word-aligned boundary. This is the preparation before the bulk aligned word copy.

The destination address is aligned now, but oftentimes the source address is not on an aligned boundary. To reduce the unaligned memory access, it reads the data from the source in aligned boundaries, which causes the data to have an offset, and then combines the data in the next iteration by fixing the offset with shifting before writing to the destination. The majority of the improved copy speed comes from this shift copy.

In the lucky situation that both the source and destination addresses are on aligned boundaries, perform load and store with register size to copy the data. Without unrolling, this would reduce the speed, since the next store instruction using the same register as the load would stall the pipeline.

At last, the remainder is copied one byte at a time.

Signed-off-by: Akira Tsukamoto <akira.tsukamoto@gmail.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-01-14  riscv: Add support for function error injection  (Guo Ren)
Inspired by commit 42d038c4fb00 ("arm64: Add support for function error injection"), this patch supports function error injection for riscv. It mainly supports two functions: one is regs_set_return_value(), which is used to overwrite the return value; the other is override_function_with_return(), which overrides the probed function's return and jumps back to its caller.

Test log:
  cd /sys/kernel/debug/fail_function
  echo sys_clone > inject
  echo 100 > probability
  echo 1 > interval
  ls /
  [  313.176875] FAULT_INJECTION: forcing a failure.
  [  313.176875] name fail_function, interval 1, probability 100, space 0, times 1
  [  313.184357] CPU: 0 PID: 87 Comm: sh Not tainted 5.8.0-rc5-00007-g6a758cc #117
  [  313.187616] Call Trace:
  [  313.189100] [<ffffffe0002036b6>] walk_stackframe+0x0/0xc2
  [  313.191626] [<ffffffe00020395c>] show_stack+0x40/0x4c
  [  313.193927] [<ffffffe000556c60>] dump_stack+0x7c/0x96
  [  313.194795] [<ffffffe0005522e8>] should_fail+0x140/0x142
  [  313.195923] [<ffffffe000299ffc>] fei_kprobe_handler+0x2c/0x5a
  [  313.197687] [<ffffffe0009e2ec4>] kprobe_breakpoint_handler+0xb4/0x18a
  [  313.200054] [<ffffffe00020357e>] do_trap_break+0x36/0xca
  [  313.202147] [<ffffffe000201bca>] ret_from_exception+0x0/0xc
  [  313.204556] [<ffffffe000201bbc>] ret_from_syscall+0x0/0x2
  -sh: can't fork: Invalid argument

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Palmer Dabbelt <palmerdabbelt@google.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-12-10  riscv: provide memmove implementation  (Nylon Chen)
memmove is used by kernel features like KASAN.

Signed-off-by: Nick Hu <nickhu@andestech.com>
Signed-off-by: Nick Hu <nick650823@gmail.com>
Signed-off-by: Nylon Chen <nylon7@andestech.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-10-04  riscv: use memcpy based uaccess for nommu again  (Christoph Hellwig)
This reverts commit adccfb1a805ea84d2db38eb53032533279bdaa97. Now that the generic uaccess-by-memcpy code handles unaligned addresses, the generic code can be used for all RISC-V CPUs.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-04-09  Merge tag 'riscv-for-linus-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux  (Linus Torvalds)
Pull RISC-V updates from Palmer Dabbelt:
 "This contains a handful of new features:

  - Partial support for the Kendryte K210. There are still a few outstanding issues that I have patches for, but I don't actually have a board to test them so they're not included yet.

  - SBI v0.2 support.

  - Fixes to support for building with LLVM-based toolchains. The resulting images are known not to boot yet.

  I don't anticipate a part two, but I'll probably have something early in the RCs to finish up the K210 support"

* tag 'riscv-for-linus-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (38 commits)
  riscv: create a loader.bin boot image for Kendryte SoC
  riscv: Kendryte K210 default config
  riscv: Add Kendryte K210 device tree
  riscv: Select required drivers for Kendryte SOC
  riscv: Add Kendryte K210 SoC support
  riscv: Add SOC early init support
  riscv: Unaligned load/store handling for M_MODE
  RISC-V: Support cpu hotplug
  RISC-V: Add supported for ordered booting method using HSM
  RISC-V: Add SBI HSM extension definitions
  RISC-V: Export SBI error to linux error mapping function
  RISC-V: Add cpu_ops and modify default booting method
  RISC-V: Move relocate and few other functions out of __init
  RISC-V: Implement new SBI v0.2 extensions
  RISC-V: Introduce a new config for SBI v0.1
  RISC-V: Add SBI v0.2 extension definitions
  RISC-V: Add basic support for SBI v0.2
  RISC-V: Mark existing SBI as 0.1 SBI.
  riscv: Use macro definition instead of magic number
  riscv: Add support to dump the kernel page tables
  ...
2020-03-18  riscv: uaccess should be used in nommu mode  (Greentime Hu)
An unaligned access exception can occur when trying to exchange data with a user space program; in this case it failed in tty_ioctl(). Therefore we should enable uaccess.S for NOMMU mode, since the generic code doesn't handle the unaligned access cases.

  0x8013a212 <tty_ioctl+462>: ld a5,460(s1)

  [    0.115279] Oops - load address misaligned [#1]
  [    0.115284] CPU: 0 PID: 29 Comm: sh Not tainted 5.4.0-rc5-00020-gb4c27160d562-dirty #36
  [    0.115294] epc: 000000008013a212 ra : 000000008013a212 sp : 000000008f48dd50
  [    0.115303]  gp : 00000000801cac28 tp : 000000008fb80000 t0 : 00000000000000e8
  [    0.115312]  t1 : 000000008f58f108 t2 : 0000000000000009 s0 : 000000008f48ddf0
  [    0.115321]  s1 : 000000008f8c6220 a0 : 0000000000000001 a1 : 000000008f48dd28
  [    0.115330]  a2 : 000000008fb80000 a3 : 00000000801a7398 a4 : 0000000000000000
  [    0.115339]  a5 : 0000000000000000 a6 : 000000008f58f0c6 a7 : 000000000000001d
  [    0.115348]  s2 : 000000008f8c6308 s3 : 000000008f78b7c8 s4 : 000000008fb834c0
  [    0.115357]  s5 : 0000000000005413 s6 : 0000000000000000 s7 : 000000008f58f2b0
  [    0.115366]  s8 : 000000008f858008 s9 : 000000008f776818 s10: 000000008f776830
  [    0.115375]  s11: 000000008fb840a8 t3 : 1999999999999999 t4 : 000000008f78704c
  [    0.115384]  t5 : 0000000000000005 t6 : 0000000000000002
  [    0.115391] status: 0000000200001880 badaddr: 000000008f8c63ec cause: 0000000000000004
  [    0.115401] ---[ end trace 00d490c6a8b6c9ac ]---

This failure is fixed after this patch is applied:

  [    0.002282] Run /init as init process
  Initializing random number generator... [    0.005573] random: dd: uninitialized urandom read (512 bytes read)
  done.

  Welcome to Buildroot
  buildroot login: root
  Password:
  Jan  1 00:00:00 login[62]: root login on 'ttySIF0'
  ~ #

Signed-off-by: Greentime Hu <greentime.hu@sifive.com>
Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-03-03  RISC-V: Stop using LOCAL for the uaccess fixups  (Palmer Dabbelt)
LLVM's integrated assembler doesn't support the LOCAL directive, which we're using when generating our uaccess fixup tables. Luckily the table fragment is small enough that there's only one internal symbol, so using a relative symbol reference doesn't really complicate anything.

Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-01-22  riscv: Add KASAN support  (Nick Hu)
This patch ports the Kernel Address SANitizer (KASAN) feature.

Note: The start address of shadow memory is at the beginning of kernel space, which is 2^64 - (2^39 / 2) in SV39. The size of the kernel space is 2^38 bytes, so the size of shadow memory should be 2^38 / 8. Thus, the shadow memory does not overlap with the fixmap area.

There are currently two limitations in this port:
  1. RV64 only: KASAN needs a large address space for the extra shadow memory region.
  2. KASAN can't debug modules, since modules are allocated in the VMALLOC area. We map the shadow memory corresponding to the VMALLOC area to kasan_early_shadow_page, because we don't have enough physical space for all the shadow memory corresponding to the VMALLOC area.

Signed-off-by: Nick Hu <nickhu@andestech.com>
Reported-by: Greentime Hu <green.hu@gmail.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-01-18  riscv: Less inefficient gcc tishift helpers (and export their symbols)  (Olof Johansson)
The existing __lshrti3 was really inefficient, and the other two helpers are also needed to compile some modules. Add the missing versions, and export all of the symbols like arm64 already does. This code is based on the assembly generated by libgcc builds.

This fixes a build break triggered by ubsan:

  riscv64-unknown-linux-gnu-ld: lib/ubsan.o: in function `.L2':
  ubsan.c:(.text.unlikely+0x38): undefined reference to `__ashlti3'
  riscv64-unknown-linux-gnu-ld: ubsan.c:(.text.unlikely+0x42): undefined reference to `__ashrti3'

Signed-off-by: Olof Johansson <olof@lixom.net>
[paul.walmsley@sifive.com: use SYM_FUNC_{START,END} instead of ENTRY/ENDPROC; note libgcc origin]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-12-27  riscv: fix compile failure with EXPORT_SYMBOL() & !MMU  (Luc Van Oostenryck)
When support for !MMU was added, the declarations of __asm_copy_to_user() & __asm_copy_from_user() were #ifdefed out, hence their EXPORT_SYMBOL() gives error messages like:

  .../riscv_ksyms.c:13:15: error: '__asm_copy_to_user' undeclared here
  .../riscv_ksyms.c:14:15: error: '__asm_copy_from_user' undeclared here

Since these symbols are not defined with !MMU it's wrong to export them. The same goes for __clear_user() (even though this one is also declared in include/asm-generic/uaccess.h and thus doesn't give an error message). Fix this by doing the EXPORT_SYMBOL() directly where these symbols are defined: inside lib/uaccess.S itself.

Fixes: 6bd33e1ece52 ("riscv: add nommu support")
Reported-by: kbuild test robot <lkp@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-11-17  riscv: add nommu support  (Christoph Hellwig)
The kernel runs in M-mode without using page tables, and thus can't run bare metal without help from additional firmware.

Most of the patch is just stubbing out code not needed without page tables, but there is an interesting detail in the signals implementation:

  - The normal RISC-V syscall ABI only implements rt_sigreturn as a VDSO entry point, but the ELF VDSO is not supported for nommu Linux. We instead copy the code to call the syscall onto the stack.

In addition to enabling the nommu code, a new defconfig for a small kernel image that can run in nommu mode on qemu is also provided. To run a kernel in qemu you can use the following command line:

  qemu-system-riscv64 -smp 2 -m 64 -machine virt -nographic \
      -kernel arch/riscv/boot/loader \
      -drive file=rootfs.ext2,format=raw,id=hd0 \
      -device virtio-blk-device,drive=hd0

Contains contributions from Damien Le Moal <Damien.LeMoal@wdc.com>.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Anup Patel <anup@brainfault.org>
[paul.walmsley@sifive.com: updated to apply; add CONFIG_MMU guards around PCI_IOBASE definition to fix build issues; fixed checkpatch issues; move the PCI_IO_* and VMEMMAP address space macros along with the others; resolve sparse warning]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-11-05  riscv: abstract out CSR names for supervisor vs machine mode  (Christoph Hellwig)
Many of the privileged CSRs exist in a supervisor and a machine version that are used very similarly. Provide versions of the CSR names and fields that map to either the S-mode or M-mode variant depending on a new CONFIG_RISCV_M_MODE kconfig symbol.

Contains contributions from Damien Le Moal <Damien.LeMoal@wdc.com> and Paul Walmsley <paul.walmsley@sifive.com>.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de> # for drivers/clocksource, drivers/irqchip
[paul.walmsley@sifive.com: updated to apply]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-08-30  riscv: Using CSR numbers to access CSRs  (Bin Meng)
Since commit a3182c91ef4e ("RISC-V: Access CSRs using CSR numbers"), we should prefer accessing CSRs using their CSR numbers, but there are several leftovers, like sstatus / sptbr, that we missed.

Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-08-08  RISC-V: Remove udivdi3  (Palmer Dabbelt)
This should never have landed in the first place: it was added as part of 64-bit divide support for 32-bit systems, but the kernel doesn't allow this sort of division. I must have forgotten to remove it. This patch removes the support. Since this routine only worked on 64-bit platforms but was only built on 32-bit platforms, it's essentially just nonsense anyway.

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Link: https://lore.kernel.org/linux-riscv/nycvar.YSQ.7.76.1908061413360.19480@knanqh.ubzr/T/#t
Reported-by: Eric Lin <tesheng@andestech.com>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-08-08  riscv: delay: use do_div() instead of __udivdi3()  (Paul Walmsley)
In preparation for removing __udivdi3() from the RISC-V architecture-specific files, convert its one user to use do_div(). This avoids breaking the RV32 build after __udivdi3() is removed. This second version removes the assignment of the remainder to an unused temporary variable. Thanks to Nicolas Pitre <nico@fluxnic.net> for the suggestion.

Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Nicolas Pitre <nico@fluxnic.net>
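For reference, do_div() semantics per the generic kernel helper (the surrounding function is illustrative): it divides the 64-bit dividend in place by a 32-bit divisor and returns the remainder, so no __udivdi3 libgcc call is needed on RV32:

    static u64 usecs_to_units(u64 usecs, u32 per_sec)
    {
            u64 units = usecs * per_sec;
            u32 rem = do_div(units, 1000000);       /* units /= 1000000 */

            (void)rem;      /* remainder unused, as in the delay code */
            return units;
    }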
2019-06-17  Merge tag 'riscv-for-v5.2/fixes-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux  (Linus Torvalds)
Pull RISC-V fixes from Paul Walmsley:
 "This contains fixes, defconfig, and DT data changes for the v5.2-rc series.

  The fixes are relatively straightforward:

  - Addition of a TLB fence in the vmalloc_fault path, so the CPU doesn't enter an infinite page fault loop

  - Readdition of the pm_power_off export, so device drivers that reassign it can now be built as modules

  - A udelay() fix for RV32, fixing a miscomputation of the delay time

  - Removal of deprecated smp_mb__*() barriers

  This also adds initial DT data infrastructure for arch/riscv, along with initial data for the SiFive FU540-C000 SoC and the corresponding HiFive Unleashed board. We also update the RV64 defconfig to include some core drivers for the FU540 in the build"

* tag 'riscv-for-v5.2/fixes-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: remove unused barrier defines
  riscv: mm: synchronize MMU after pte change
  riscv: dts: add initial board data for the SiFive HiFive Unleashed
  riscv: dts: add initial support for the SiFive FU540-C000 SoC
  dt-bindings: riscv: convert cpu binding to json-schema
  dt-bindings: riscv: sifive: add YAML documentation for the SiFive FU540
  arch: riscv: add support for building DTB files from DT source data
  riscv: Fix udelay in RV32.
  riscv: export pm_power_off again
  RISC-V: defconfig: enable clocks, serial console
2019-06-11  riscv: Fix udelay in RV32.  (Nick Hu)
In RV32, udelay would delay for the wrong number of cycles: when it shifts right by UDELAY_SHIFT bits, it either delays 0 cycles or 1 cycle. It only works correctly in RV64, because 'ucycles' always needs to be a 64-bit variable.

Signed-off-by: Nick Hu <nickhu@andestech.com>
Reviewed-by: Palmer Dabbelt <palmer@sifive.com>
[paul.walmsley@sifive.com: fixed minor spelling error]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
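A sketch of the fix (UDELAY_SHIFT appears in the text above; UDELAY_MULT and the exact expression are assumptions): the scaled cycle count must live in a 64-bit variable so the right shift does not operate on a truncated value on RV32:

    static void udelay_sketch(unsigned long usecs, unsigned long lpj)
    {
            u64 ucycles = (u64)usecs * lpj * UDELAY_MULT;   /* stays 64-bit */

            __delay(ucycles >> UDELAY_SHIFT);       /* nonzero on RV32 too */
    }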
2019-06-05  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 286  (Thomas Gleixner)
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation version 2 this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 97 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141901.025053186@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21  treewide: Add SPDX license identifier - Makefile/Kconfig  (Thomas Gleixner)
Add SPDX license identifiers to all Make/Kconfig files which:

  - Have no license information of any form

These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-21  RISC-V: lib: minor asm cleanup  (Olof Johansson)
Fix tab/space conversion and use ENTRY/ENDPROC macros.

Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>