summaryrefslogtreecommitdiff
path: root/arch/riscv
AgeCommit message (Collapse)Author
2021-04-26RISC-V: Add crash kernel supportNick Kossifidis
This patch allows Linux to act as a crash kernel for use with kdump. Userspace will let the crash kernel know about the memory region it can use through linux,usable-memory property on the /memory node (overriding its reg property), and about the memory region where the elf core header of the previous kernel is saved, through a reserved-memory node with a compatible string of "linux,elfcorehdr". This approach is the least invasive and re-uses functionality already present. I tested this on riscv64 qemu and it works as expected, you may test it by retrieving the dmesg of the previous kernel through /proc/vmcore, using the vmcore-dmesg utility from kexec-tools. Signed-off-by: Nick Kossifidis <mick@ics.forth.gr> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-04-26RISC-V: Add kdump supportNick Kossifidis
This patch adds support for kdump, the kernel will reserve a region for the crash kernel and jump there on panic. In order for userspace tools (kexec-tools) to prepare the crash kernel kexec image, we also need to expose some information on /proc/iomem for the memory regions used by the kernel and for the region reserved for crash kernel. Note that on userspace the device tree is used to determine the system's memory layout so the "System RAM" on /proc/iomem is ignored. I tested this on riscv64 qemu and works as expected, you may test it by triggering a crash through /proc/sysrq_trigger: echo c > /proc/sysrq_trigger Signed-off-by: Nick Kossifidis <mick@ics.forth.gr> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-04-26RISC-V: Improve init_resources()Nick Kossifidis
The kernel region is always present and we know where it is, no need to look for it inside the loop, just ignore it like the rest of the reserved regions within system's memory. Additionally, we don't need to call memblock_free inside the loop, as if called it'll split the region of pre-allocated resources in two parts, messing things up, just re-use the previous pre-allocated resource and free any unused resources after both loops finish. Signed-off-by: Nick Kossifidis <mick@ics.forth.gr> [Palmer: commit text] Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-04-26RISC-V: Add kexec supportNick Kossifidis
This patch adds support for kexec on RISC-V. On SMP systems it depends on HOTPLUG_CPU in order to be able to bring up all harts after kexec. It also needs a recent OpenSBI version that supports the HSM extension. I tested it on riscv64 QEMU on both an smp and a non-smp system. Signed-off-by: Nick Kossifidis <mick@ics.forth.gr> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-04-26riscv: vdso: fix and clean-up MakefileJisheng Zhang
Running "make" on an already compiled kernel tree will rebuild the kernel even without any modifications: CALL linux/scripts/checksyscalls.sh CALL linux/scripts/atomic/check-atomics.sh CHK include/generated/compile.h SO2S arch/riscv/kernel/vdso/vdso-syms.S AS arch/riscv/kernel/vdso/vdso-syms.o AR arch/riscv/kernel/vdso/built-in.a AR arch/riscv/kernel/built-in.a AR arch/riscv/built-in.a GEN .version CHK include/generated/compile.h UPD include/generated/compile.h CC init/version.o AR init/built-in.a LD vmlinux.o The reason is "Any target that utilizes if_changed must be listed in $(targets), otherwise the command line check will fail, and the target will always be built" as explained by Documentation/kbuild/makefiles.rst Fix this build bug by adding vdso-syms.S to $(targets) At the same time, there are two trivial clean up modifications: - the vdso-dummy.o is not needed any more after so remove it. - vdso.lds is a generated file, so it should be prefixed with $(obj)/ instead of $(src)/ Fixes: c2c81bb2f691 ("RISC-V: Fix the VDSO symbol generaton for binutils-2.35+") Cc: stable@vger.kernel.org Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-04-26riscv/mm: Use BUG_ON instead of if condition followed by BUG.zhouchuangao
BUG_ON() uses unlikely in if(), which can be optimized at compile time. Signed-off-by: zhouchuangao <zhouchuangao@vivo.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-04-26riscv/kprobe: fix kernel panic when invoking sys_read traced by kprobeLiao Chang
The execution of sys_read end up hitting a BUG_ON() in __find_get_block after installing kprobe at sys_read, the BUG message like the following: [ 65.708663] ------------[ cut here ]------------ [ 65.709987] kernel BUG at fs/buffer.c:1251! [ 65.711283] Kernel BUG [#1] [ 65.712032] Modules linked in: [ 65.712925] CPU: 0 PID: 51 Comm: sh Not tainted 5.12.0-rc4 #1 [ 65.714407] Hardware name: riscv-virtio,qemu (DT) [ 65.715696] epc : __find_get_block+0x218/0x2c8 [ 65.716835] ra : __getblk_gfp+0x1c/0x4a [ 65.717831] epc : ffffffe00019f11e ra : ffffffe00019f56a sp : ffffffe002437930 [ 65.719553] gp : ffffffe000f06030 tp : ffffffe0015abc00 t0 : ffffffe00191e038 [ 65.721290] t1 : ffffffe00191e038 t2 : 000000000000000a s0 : ffffffe002437960 [ 65.723051] s1 : ffffffe00160ad00 a0 : ffffffe00160ad00 a1 : 000000000000012a [ 65.724772] a2 : 0000000000000400 a3 : 0000000000000008 a4 : 0000000000000040 [ 65.726545] a5 : 0000000000000000 a6 : ffffffe00191e000 a7 : 0000000000000000 [ 65.728308] s2 : 000000000000012a s3 : 0000000000000400 s4 : 0000000000000008 [ 65.730049] s5 : 000000000000006c s6 : ffffffe00240f800 s7 : ffffffe000f080a8 [ 65.731802] s8 : 0000000000000001 s9 : 000000000000012a s10: 0000000000000008 [ 65.733516] s11: 0000000000000008 t3 : 00000000000003ff t4 : 000000000000000f [ 65.734434] t5 : 00000000000003ff t6 : 0000000000040000 [ 65.734613] status: 0000000000000100 badaddr: 0000000000000000 cause: 0000000000000003 [ 65.734901] Call Trace: [ 65.735076] [<ffffffe00019f11e>] __find_get_block+0x218/0x2c8 [ 65.735417] [<ffffffe00020017a>] __ext4_get_inode_loc+0xb2/0x2f6 [ 65.735618] [<ffffffe000201b6c>] ext4_get_inode_loc+0x3a/0x8a [ 65.735802] [<ffffffe000203380>] ext4_reserve_inode_write+0x2e/0x8c [ 65.735999] [<ffffffe00020357a>] __ext4_mark_inode_dirty+0x4c/0x18e [ 65.736208] [<ffffffe000206bb0>] ext4_dirty_inode+0x46/0x66 [ 65.736387] [<ffffffe000192914>] __mark_inode_dirty+0x12c/0x3da [ 65.736576] [<ffffffe000180dd2>] touch_atime+0x146/0x150 [ 65.736748] [<ffffffe00010d762>] filemap_read+0x234/0x246 [ 65.736920] [<ffffffe00010d834>] generic_file_read_iter+0xc0/0x114 [ 65.737114] [<ffffffe0001f5d7a>] ext4_file_read_iter+0x42/0xea [ 65.737310] [<ffffffe000163f2c>] new_sync_read+0xe2/0x15a [ 65.737483] [<ffffffe000165814>] vfs_read+0xca/0xf2 [ 65.737641] [<ffffffe000165bae>] ksys_read+0x5e/0xc8 [ 65.737816] [<ffffffe000165c26>] sys_read+0xe/0x16 [ 65.737973] [<ffffffe000003972>] ret_from_syscall+0x0/0x2 [ 65.738858] ---[ end trace fe93f985456c935d ]--- A simple reproducer looks like: echo 'p:myprobe sys_read fd=%a0 buf=%a1 count=%a2' > /sys/kernel/debug/tracing/kprobe_events echo 1 > /sys/kernel/debug/tracing/events/kprobes/myprobe/enable cat /sys/kernel/debug/tracing/trace Here's what happens to hit that BUG_ON(): 1) After installing kprobe at entry of sys_read, the first instruction is replaced by 'ebreak' instruction on riscv64 platform. 2) Once kernel reach the 'ebreak' instruction at the entry of sys_read, it trap into the riscv breakpoint handler, where it do something to setup for coming single-step of origin instruction, including backup the 'sstatus' in pt_regs, followed by disable interrupt during single stepping via clear 'SIE' bit of 'sstatus' in pt_regs. 3) Then kernel restore to the instruction slot contains two instructions, one is original instruction at entry of sys_read, the other is 'ebreak'. Here it trigger a 'Instruction page fault' exception (value at 'scause' is '0xc'), if PF is not filled into PageTabe for that slot yet. 4) Again kernel trap into page fault exception handler, where it choose different policy according to the state of running kprobe. Because afte 2) the state is KPROBE_HIT_SS, so kernel reset the current kprobe and 'pc' points back to the probe address. 5) Because 'epc' point back to 'ebreak' instrution at sys_read probe, kernel trap into breakpoint handler again, and repeat the operations at 2), however 'sstatus' without 'SIE' is keep at 4), it cause the real 'sstatus' saved at 2) is overwritten by the one withou 'SIE'. 6) When kernel cross the probe the 'sstatus' CSR restore with value without 'SIE', and reach __find_get_block where it requires the interrupt must be enabled. Fix this is very trivial, just restore the value of 'sstatus' in pt_regs with backup one at 2) when the instruction being single stepped cause a page fault. Fixes: c22b0bcb1dd02 ("riscv: Add kprobes supported") Signed-off-by: Liao Chang <liaochang1@huawei.com> Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-04-26riscv: Set ARCH_HAS_STRICT_MODULE_RWX if MMUJisheng Zhang
Now we can set ARCH_HAS_STRICT_MODULE_RWX for MMU riscv platforms, this is good from security perspective. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-04-26riscv: module: Create module allocations without exec permissionsJisheng Zhang
The core code manages the executable permissions of code regions of modules explicitly, it is not necessary to create the module vmalloc regions with RWX permissions. Create them with RW- permissions instead. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-04-26riscv: bpf: Avoid breaking W^XJisheng Zhang
We allocate Non-executable pages, then call bpf_jit_binary_lock_ro() to enable executable permission after mapping them read-only. This is to prepare for STRICT_MODULE_RWX in following patch. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-04-26riscv: bpf: Move bpf_jit_alloc_exec() and bpf_jit_free_exec() to coreJisheng Zhang
We will drop the executable permissions of the code pages from the mapping at allocation time soon. Move bpf_jit_alloc_exec() and bpf_jit_free_exec() to bpf_jit_core.c so that they can be shared by both RV64I and RV32I. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Acked-by: Luke Nelson <luke.r.nels@gmail.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-04-26riscv: kprobes: Implement alloc_insn_page()Jisheng Zhang
Allocate PAGE_KERNEL_READ_EXEC(read only, executable) page for kprobes insn page. This is to prepare for STRICT_MODULE_RWX. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-04-26riscv: Constify sbi_ipi_opsJisheng Zhang
Constify the sbi_ipi_ops so that it will be placed in the .rodata section. This will cause attempts to modify it to fail when strict page permissions are in place. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-04-26riscv: Constify sys_call_tableJisheng Zhang
Constify the sys_call_table so that it will be placed in the .rodata section. This will cause attempts to modify the table to fail when strict page permissions are in place. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-04-26riscv: Mark some global variables __ro_after_initJisheng Zhang
All of these are never modified after init, so they can be __ro_after_init. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-04-26riscv: add __init section marker to some functionsJisheng Zhang
They are not needed after booting, so mark them as __init to move them to the __init section. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-04-26riscv: Prepare ptdump for vm layout dynamic addressesAlexandre Ghiti
This is a preparatory patch for sv48 support that will introduce dynamic PAGE_OFFSET. Dynamic PAGE_OFFSET implies that all zones (vmalloc, vmemmap, fixaddr...) whose addresses depend on PAGE_OFFSET become dynamic and can't be used to statically initialize the array used by ptdump to identify the different zones of the vm layout. Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-04-26riscv: Move kernel mapping outside of linear mappingAlexandre Ghiti
This is a preparatory patch for relocatable kernel and sv48 support. The kernel used to be linked at PAGE_OFFSET address therefore we could use the linear mapping for the kernel mapping. But the relocated kernel base address will be different from PAGE_OFFSET and since in the linear mapping, two different virtual addresses cannot point to the same physical address, the kernel mapping needs to lie outside the linear mapping so that we don't have to copy it at the same physical offset. The kernel mapping is moved to the last 2GB of the address space, BPF is now always after the kernel and modules use the 2GB memory range right before the kernel, so BPF and modules regions do not overlap. KASLR implementation will simply have to move the kernel in the last 2GB range and just take care of leaving enough space for BPF. In addition, by moving the kernel to the end of the address space, both sv39 and sv48 kernels will be exactly the same without needing to be relocated at runtime. Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> [Palmer: Squash the STRICT_RWX fix, and a !MMU fix] Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-04-26riscv: Select HAVE_DYNAMIC_FTRACE when -fpatchable-function-entry is availableNathan Chancellor
clang prior to 13.0.0 does not support -fpatchable-function-entry for RISC-V. clang: error: unsupported option '-fpatchable-function-entry=8' for target 'riscv64-unknown-linux-gnu' To avoid this error, only select HAVE_DYNAMIC_FTRACE when this option is not available. Fixes: afc76b8b8011 ("riscv: Using PATCHABLE_FUNCTION_ENTRY instead of MCOUNT") Link: https://github.com/ClangBuiltLinux/linux/issues/1268 Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Fangrui Song <maskray@google.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-04-26riscv: Workaround mcount name prior to clang-13Nathan Chancellor
Prior to clang 13.0.0, the RISC-V name for the mcount symbol was "mcount", which differs from the GCC version of "_mcount", which results in the following errors: riscv64-linux-gnu-ld: init/main.o: in function `__traceiter_initcall_level': main.c:(.text+0xe): undefined reference to `mcount' riscv64-linux-gnu-ld: init/main.o: in function `__traceiter_initcall_start': main.c:(.text+0x4e): undefined reference to `mcount' riscv64-linux-gnu-ld: init/main.o: in function `__traceiter_initcall_finish': main.c:(.text+0x92): undefined reference to `mcount' riscv64-linux-gnu-ld: init/main.o: in function `.LBB32_28': main.c:(.text+0x30c): undefined reference to `mcount' riscv64-linux-gnu-ld: init/main.o: in function `free_initmem': main.c:(.text+0x54c): undefined reference to `mcount' This has been corrected in https://reviews.llvm.org/D98881 but the minimum supported clang version is 10.0.1. To avoid build errors and to gain a working function tracer, adjust the name of the mcount symbol for older versions of clang in mount.S and recordmcount.pl. Link: https://github.com/ClangBuiltLinux/linux/issues/1331 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-04-26riscv: Use $(LD) instead of $(CC) to link vDSONathan Chancellor
Currently, the VDSO is being linked through $(CC). This does not match how the rest of the kernel links objects, which is through the $(LD) variable. When linking with clang, there are a couple of warnings about flags that will not be used during the link: clang-12: warning: argument unused during compilation: '-no-pie' [-Wunused-command-line-argument] clang-12: warning: argument unused during compilation: '-pg' [-Wunused-command-line-argument] '-no-pie' was added in commit 85602bea297f ("RISC-V: build vdso-dummy.o with -no-pie") to override '-pie' getting added to the ld command from distribution versions of GCC that enable PIE by default. It is technically no longer needed after commit c2c81bb2f691 ("RISC-V: Fix the VDSO symbol generaton for binutils-2.35+"), which removed vdso-dummy.o in favor of generating vdso-syms.S from vdso.so with $(NM) but this also resolves the issue in case it ever comes back due to having full control over the $(LD) command. '-pg' is for function tracing, it is not used during linking as clang states. These flags could be removed/filtered to fix the warnings but it is easier to just match the rest of the kernel and use $(LD) directly for linking. See commits fe00e50b2db8 ("ARM: 8858/1: vdso: use $(LD) instead of $(CC) to link VDSO") 691efbedc60d ("arm64: vdso: use $(LD) instead of $(CC) to link VDSO") 2ff906994b6c ("MIPS: VDSO: Use $(LD) instead of $(CC) to link VDSO") 2b2a25845d53 ("s390/vdso: Use $(LD) instead of $(CC) to link vDSO") for more information. The flags are converted to linker flags and '--eh-frame-hdr' is added to match what is added by GCC implicitly, which can be seen by adding '-v' to GCC's invocation. Additionally, since this area is being modified, use the $(OBJCOPY) variable instead of an open coded $(CROSS_COMPILE)objcopy so that the user's choice of objcopy binary is respected. Link: https://github.com/ClangBuiltLinux/linux/issues/803 Link: https://github.com/ClangBuiltLinux/linux/issues/970 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Fangrui Song <maskray@google.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-04-26riscv: sifive: Apply errata "cip-1200" patchVincent Chen
For certain SiFive CPUs, "sfence.vma addr" cannot exactly flush addr from TLB in the particular cases. The details could be found here: https://sifive.cdn.prismic.io/sifive/167a1a56-03f4-4615-a79e-b2a86153148f_FU740_errata_20210205.pdf In order to ensure the functionality, this patch uses the Alternative scheme to replace all "sfence.vma addr" with "sfence.vma" at runtime. Signed-off-by: Vincent Chen <vincent.chen@sifive.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-04-26riscv: sifive: Apply errata "cip-453" patchVincent Chen
Add sign extension to the $badaddr before addressing the instruction page fault and instruction access fault to workaround the issue "cip-453". To avoid affecting the existing code sequence, this patch will creates two trampolines to add sign extension to the $badaddr. By the "alternative" mechanism, these two trampolines will replace the original exception handler of instruction page fault and instruction access fault in the excp_vect_table. In this case, only the specific SiFive CPU core jumps to the do_page_fault and do_trap_insn_fault through these two trampolines. Other CPUs are not affected. Signed-off-by: Vincent Chen <vincent.chen@sifive.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-04-26riscv: sifive: Add SiFive alternative portsVincent Chen
Add required ports of the Alternative scheme for SiFive. Signed-off-by: Vincent Chen <vincent.chen@sifive.com> Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-04-26riscv: Introduce alternative mechanism to apply errata solutionVincent Chen
Introduce the "alternative" mechanism from ARM64 and x86 to apply the CPU vendors' errata solution at runtime. The main purpose of this patch is to provide a framework. Therefore, the implementation is quite basic for now so that some scenarios could not use this schemei, such as patching code to a module, relocating the patching code and heterogeneous CPU topology. Users could use the macro ALTERNATIVE to apply an errata to the existing code flow. In the macro ALTERNATIVE, users need to specify the manufacturer information(vendorid, archid, and impid) for this errata. Therefore, kernel will know this errata is suitable for which CPU core. During the booting procedure, kernel will select the errata required by the CPU core and then patch it. It means that the kernel only applies the errata to the specified CPU core. In this case, the vendor's errata does not affect each other at runtime. The above patching procedure only occurs during the booting phase, so we only take the overhead of the "alternative" mechanism once. This "alternative" mechanism is enabled by default to ensure that all required errata will be applied. However, users can disable this feature by the Kconfig "CONFIG_RISCV_ERRATA_ALTERNATIVE". Signed-off-by: Vincent Chen <vincent.chen@sifive.com> Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-04-26riscv: Add 3 SBI wrapper functions to get cpu manufacturer informationVincent Chen
Add 3 wrapper functions to get vendor id, architecture id and implement id from M-mode Signed-off-by: Vincent Chen <vincent.chen@sifive.com> Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-03-29riscv: Cleanup KASAN_VMALLOC supportAlexandre Ghiti
When KASAN vmalloc region is populated, there is no userspace process and the page table in use is swapper_pg_dir, so there is no need to read SATP. Then we can use the same scheme used by kasan_populate_p*d functions to go through the page table, which harmonizes the code. In addition, make use of set_pgd that goes through all unused page table levels, contrary to p*d_populate functions, which makes this function work whatever the number of page table levels. Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-03-29RISC-V: Don't print SBI version for all detected extensionsAnup Patel
The sbi_init() already prints SBI version before detecting various SBI extensions so we don't need to print SBI version for all detected SBI extensions. Signed-off-by: Anup Patel <anup.patel@wdc.com> Reviewed-by: Atish Patra <atish.patra@wdc.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-03-16riscv: Enable generic clockevent broadcastGuo Ren
When percpu-timers are stopped by deep power saving mode, we need system timer help to broadcast IPI_TIMER. This is first introduced by broken x86 hardware, where the local apic timer stops in C3 state. But many other architectures(powerpc, mips, arm, hexagon, openrisc, sh) have supported the infrastructure to deal with Power Management issues. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-03-09riscv: Add ARCH_HAS_FORTIFY_SOURCEKefeng Wang
FORTIFY_SOURCE could detect various overflows at compile and run time. ARCH_HAS_FORTIFY_SOURCE means that the architecture can be built and run with CONFIG_FORTIFY_SOURCE. Select it in RISCV. See more about this feature from commit 6974f0c4555e ("include/linux/string.h: add the option of fortified string.h functions"). Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-03-09riscv: Add support for memtestKefeng Wang
The riscv [rv32_]defconfig enabled CONFIG_MEMTEST, but memtest feature is not supported in RISCV. Add early_memtest() to support for memtest. Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-02-28Merge tag 'riscv-for-linus-5.12-mw1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull more RISC-V updates from Palmer Dabbelt: "A pair of patches that slipped through the cracks: - enable CPU hotplug in the defconfigs - some cleanups to setup_bootmem There's also a single fix for some randconfig build failures: - make NUMA depend on SMP" * tag 'riscv-for-linus-5.12-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: Cleanup setup_bootmem() RISC-V: Enable CPU Hotplug in defconfigs RISC-V: Make NUMA depend on SMP
2021-02-27Merge tag 'io_uring-worker.v3-2021-02-25' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull io_uring thread rewrite from Jens Axboe: "This converts the io-wq workers to be forked off the tasks in question instead of being kernel threads that assume various bits of the original task identity. This kills > 400 lines of code from io_uring/io-wq, and it's the worst part of the code. We've had several bugs in this area, and the worry is always that we could be missing some pieces for file types doing unusual things (recent /dev/tty example comes to mind, userfaultfd reads installing file descriptors is another fun one... - both of which need special handling, and I bet it's not the last weird oddity we'll find). With these identical workers, we can have full confidence that we're never missing anything. That, in itself, is a huge win. Outside of that, it's also more efficient since we're not wasting space and code on tracking state, or switching between different states. I'm sure we're going to find little things to patch up after this series, but testing has been pretty thorough, from the usual regression suite to production. Any issue that may crop up should be manageable. There's also a nice series of further reductions we can do on top of this, but I wanted to get the meat of it out sooner rather than later. The general worry here isn't that it's fundamentally broken. Most of the little issues we've found over the last week have been related to just changes in how thread startup/exit is done, since that's the main difference between using kthreads and these kinds of threads. In fact, if all goes according to plan, I want to get this into the 5.10 and 5.11 stable branches as well. That said, the changes outside of io_uring/io-wq are: - arch setup, simple one-liner to each arch copy_thread() implementation. - Removal of net and proc restrictions for io_uring, they are no longer needed or useful" * tag 'io_uring-worker.v3-2021-02-25' of git://git.kernel.dk/linux-block: (30 commits) io-wq: remove now unused IO_WQ_BIT_ERROR io_uring: fix SQPOLL thread handling over exec io-wq: improve manager/worker handling over exec io_uring: ensure SQPOLL startup is triggered before error shutdown io-wq: make buffered file write hashed work map per-ctx io-wq: fix race around io_worker grabbing io-wq: fix races around manager/worker creation and task exit io_uring: ensure io-wq context is always destroyed for tasks arch: ensure parisc/powerpc handle PF_IO_WORKER in copy_thread() io_uring: cleanup ->user usage io-wq: remove nr_process accounting io_uring: flag new native workers with IORING_FEAT_NATIVE_WORKERS net: remove cmsg restriction from io_uring based send/recvmsg calls Revert "proc: don't allow async path resolution of /proc/self components" Revert "proc: don't allow async path resolution of /proc/thread-self components" io_uring: move SQPOLL thread io-wq forked worker io-wq: make io_wq_fork_thread() available to other users io-wq: only remove worker from free_list, if it was there io_uring: remove io_identity io_uring: remove any grabbing of context ...
2021-02-26riscv: Cleanup setup_bootmem()Kefeng Wang
After the following patches, commit de043da0b9e7 ("RISC-V: Fix usage of memblock_enforce_memory_limit") commit 1bd14a66ee52 ("RISC-V: Remove any memblock representing unusable memory area") commit b10d6bca8720 ("arch, drivers: replace for_each_membock() with for_each_mem_range()") some logic is useless, kill the mem_start/start/end and unneeded code. Reviewed-by: Atish Patra <atish.patra@wdc.com> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-02-26RISC-V: Enable CPU Hotplug in defconfigsAnup Patel
The CPU hotplug support has been tested on QEMU, Spike, and SiFive Unleashed so let's enable it by default in RV32 and RV64 defconfigs. Signed-off-by: Anup Patel <anup.patel@wdc.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-02-26RISC-V: Make NUMA depend on SMPPalmer Dabbelt
In theory these are orthogonal, but in practice all NUMA systems are SMP. NUMA && !SMP doesn't build, everyone else is coupling them, and I don't really see any value in supporting that configuration. Fixes: 4f0e8eef772e ("riscv: Add numa support for riscv64 platform") Suggested-by: Andrew Morton <akpm@linux-foundation.org> Suggested-by: Atish Patra <atishp@atishpatra.org> Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com> Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-02-26Merge tag 'riscv-for-linus-5.12-mw0' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: "A handful of new RISC-V related patches for this merge window: - A check to ensure drivers are properly using uaccess. This isn't manifesting with any of the drivers I'm currently using, but may catch errors in new drivers. - Some preliminary support for the FU740, along with the HiFive Unleashed it will appear on. - NUMA support for RISC-V, which involves making the arm64 code generic. - Support for kasan on the vmalloc region. - A handful of new drivers for the Kendryte K210, along with the DT plumbing required to boot on a handful of K210-based boards. - Support for allocating ASIDs. - Preliminary support for kernels larger than 128MiB. - Various other improvements to our KASAN support, including the utilization of huge pages when allocating the KASAN regions. We may have already found a bug with the KASAN_VMALLOC code, but it's passing my tests. There's a fix in the works, but that will probably miss the merge window. * tag 'riscv-for-linus-5.12-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (75 commits) riscv: Improve kasan population by using hugepages when possible riscv: Improve kasan population function riscv: Use KASAN_SHADOW_INIT define for kasan memory initialization riscv: Improve kasan definitions riscv: Get rid of MAX_EARLY_MAPPING_SIZE soc: canaan: Sort the Makefile alphabetically riscv: Disable KSAN_SANITIZE for vDSO riscv: Remove unnecessary declaration riscv: Add Canaan Kendryte K210 SD card defconfig riscv: Update Canaan Kendryte K210 defconfig riscv: Add Kendryte KD233 board device tree riscv: Add SiPeed MAIXDUINO board device tree riscv: Add SiPeed MAIX GO board device tree riscv: Add SiPeed MAIX DOCK board device tree riscv: Add SiPeed MAIX BiT board device tree riscv: Update Canaan Kendryte K210 device tree dt-bindings: add resets property to dw-apb-timer dt-bindings: fix sifive gpio properties dt-bindings: update sifive uart compatible string dt-bindings: update sifive clint compatible string ...
2021-02-22riscv: Improve kasan population by using hugepages when possibleAlexandre Ghiti
The kasan functions that populates the shadow regions used to allocate them page by page and did not take advantage of hugepages, so fix this by trying to allocate hugepages of 1GB and fallback to 2MB hugepages or 4K pages in case it fails. This reduces the page table memory consumption and improves TLB usage, as shown below: Before this patch: ---[ Kasan shadow start ]--- 0xffffffc000000000-0xffffffc400000000 0x00000000818ef000 16G PTE . A . . . . R V 0xffffffc400000000-0xffffffc447fc0000 0x00000002b7f4f000 1179392K PTE D A . . . W R V 0xffffffc480000000-0xffffffc800000000 0x00000000818ef000 14G PTE . A . . . . R V ---[ Kasan shadow end ]--- After this patch: ---[ Kasan shadow start ]--- 0xffffffc000000000-0xffffffc400000000 0x00000000818ef000 16G PTE . A . . . . R V 0xffffffc400000000-0xffffffc440000000 0x0000000240000000 1G PGD D A . . . W R V 0xffffffc440000000-0xffffffc447e00000 0x00000002b7e00000 126M PMD D A . . . W R V 0xffffffc447e00000-0xffffffc447fc0000 0x00000002b818f000 1792K PTE D A . . . W R V 0xffffffc480000000-0xffffffc800000000 0x00000000818ef000 14G PTE . A . . . . R V ---[ Kasan shadow end ]--- Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-02-22riscv: Improve kasan population functionAlexandre Ghiti
Current population code populates a whole page table without taking care of what could have been already allocated and without taking into account possible index in page table, assuming the virtual address to map is always aligned on the page table size, which, for example, won't be the case when the kernel will get pushed to the end of the address space. Address those problems by rewriting the kasan population function, splitting it into subfunctions for each different page table level. Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-02-22riscv: Use KASAN_SHADOW_INIT define for kasan memory initializationAlexandre Ghiti
Instead of hardcoding memory initialization to 0, use KASAN_SHADOW_INIT. Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-02-22riscv: Improve kasan definitionsAlexandre Ghiti
There is no functional change here, only improvement in code readability by adding comments to explain where the kasan constants come from and by replacing hardcoded numerical constant by the corresponding define. Note that the comments come from arm64. Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-02-22riscv: Get rid of MAX_EARLY_MAPPING_SIZEAlexandre Ghiti
At early boot stage, we have a whole PGDIR to map the kernel, so there is no need to restrict the early mapping size to 128MB. Removing this define also allows us to simplify some compile time logic. This fixes large kernel mappings with a size greater than 128MB, as it is the case for syzbot kernels whose size was just ~130MB. Note that on rv64, for now, we are then limited to PGDIR size for early mapping as we can't use PGD mappings (see [1]). That should be enough given the relative small size of syzbot kernels compared to PGDIR_SIZE which is 1GB. [1] https://lore.kernel.org/lkml/20200603153608.30056-1-alex@ghiti.fr/ Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Tested-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-02-22riscv: Disable KSAN_SANITIZE for vDSOTobias Klauser
We use the generic C VDSO implementations of a handful of clock-related functions. When kasan is enabled this results in asan stub calls that are unlikely to be resolved by userspace, this just disables KASAN when building the VDSO. Verified the fix on a kernel with KASAN enabled using vDSO selftests. Link: https://lore.kernel.org/lkml/CACT4Y+ZNJBnkKHXUf=tm_yuowvZvHwN=0rmJ=7J+xFd+9r_6pQ@mail.gmail.com/ Tested-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Tested-by: Dmitry Vyukov <dvyukov@google.com> [Palmer: commit text] Fixes: ad5d1122b82f ("riscv: use vDSO common flow to reduce the latency of the time-related functions") Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-02-22riscv: Remove unnecessary declarationKefeng Wang
max_low_pfn and min_low_pfn are declared in linux/memblock.h, and it also is included in arch/riscv/mm/init.c, drop unnecessary declaration. Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: Atish Patra <atish.patra@wdc.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-02-22riscv: Add Canaan Kendryte K210 SD card defconfigDamien Le Moal
The nommu_k210_defconfig default configuration allows booting a Canaan Kendryte K210 SoC based boards using an embedded intramfs cpio file. Modifying this configuration to enable support for the board SD card is not trivial for all users. To help beginners getting started with these boards, add the nommu_k210_sdcard_defconfig default configuration file to set all configuration options necessary to use the board mmc-spi sd card for the root file system. This new configuration adds support for the block layer, the mmc-spi driver and modifies the boot options to specify the rootfs device as mmcblk0p1 (first partition of the sd card block device). The ext2 file system is selected by default to encourage its use as that results in only about 4KB added to the kernel image size. As ext2 does not have journaling, the boot options specify a read-only mount of the file system. Similarly to the smaller nommu_k210_defconfig, this new default configuration disables virtual terminal support to reduce the kernel image size. The default device tree selected is unchanged, specifying the simple "k210_generic" device tree file. The user must change this setting to specify the device tree suitable for the board being used (sipeed_maix_bit, sipeed_maix_dock, sipeed_maix_go, sipeed_maixduino or canaan_kd233). Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Atish Patra <atish.patra@wdc.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-02-22riscv: Update Canaan Kendryte K210 defconfigDamien Le Moal
Update the Kendryte k210 nommu default configuration file (nommu_k210_defconfig) to include device drivers for reset, reboot, I2C, SPI, gpio and LEDs support. Virtual Terminal support is also disabled as no terminal devices are supported and enabled. Disabling CONFIG_VT (removing the no longer needed override for CONFIG_VGA_CONSOLE) reduces the kernel image size by about 65 KB. This default configuration remains suitable for a system using an initramfs cpio file linked into the kernel image. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Atish Patra <atish.patra@wdc.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-02-22riscv: Add Kendryte KD233 board device treeDamien Le Moal
Add the device tree canaan_kd233.dts for the Canaan Kendryte KD233 development board. This device tree enables LEDs, some gpios and spi/mmc SD card device. The WS2812B RGB LED and the 10 positions rotary dip switch present on the board are left undefined. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Anup Patel <anup@brainfault.org> [Palmer: Remove undocumented microphone entry, along with the use.] Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-02-22riscv: Add SiPeed MAIXDUINO board device treeDamien Le Moal
Add the device tree sipeed_maixduino.dts for the SiPeed MAIXDUINO board. This device tree enables LEDs and spi/mmc SD card device. Additionally, gpios and i2c are also enabled and mapped to the board header pins as indicated on the board itself. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> [Palmer: Remove undocumented microphone entry, along with the use.] Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-02-22riscv: Add SiPeed MAIX GO board device treeDamien Le Moal
Add the device tree sipeed_maix_go.dts for the SiPeed MAIX GO board. This device tree enables buttons, LEDs, gpio, i2c and spi/mmc SD card devices. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> [Palmer: Remove undocumented microphone entry, along with the use.] Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-02-22riscv: Add SiPeed MAIX DOCK board device treeDamien Le Moal
Add the device tree sipeed_maix_dock.dts for the SiPeed MAIX DOCK m1 and m1w boards. This device tree enables LEDs, gpio, i2c and spi/mmc SD card devices. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> [Palmer: Remove undocumented microphone entry, along with the use.] Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>