linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2024-03-12	perf: RISC-V: Introduce Andes PMU to support perf event sampling	Yu Chien Peter Lin
	Assign riscv_pmu_irq_num the value of (256 + 18) for the custome PMU and add SSCOUNTOVF and SIP alternatives to ALT_SBI_PMU_OVERFLOW() and ALT_SBI_PMU_OVF_CLEAR_PENDING() macros, respectively. To make use of Andes PMU extension, "xandespmu" needs to be appended to the riscv,isa-extensions for each cpu node in device-tree, and make sure CONFIG_ANDES_CUSTOM_PMU is enabled. Signed-off-by: Yu Chien Peter Lin <peterlin@andestech.com> Reviewed-by: Charles Ci-Jyun Wu <dminus@andestech.com> Reviewed-by: Leo Yu-Chi Liang <ycliang@andestech.com> Co-developed-by: Locus Wei-Han Chen <locus84@andestech.com> Signed-off-by: Locus Wei-Han Chen <locus84@andestech.com> Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Link: https://lore.kernel.org/r/20240222083946.3977135-8-peterlin@andestech.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-12	riscv: errata: Rename defines for Andes	Yu Chien Peter Lin
	Use "ANDES" rather than "ANDESTECH" to unify the naming convention with directory, file names, Kconfig options and other definitions. Signed-off-by: Yu Chien Peter Lin <peterlin@andestech.com> Reviewed-by: Charles Ci-Jyun Wu <dminus@andestech.com> Reviewed-by: Leo Yu-Chi Liang <ycliang@andestech.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Link: https://lore.kernel.org/r/20240222083946.3977135-2-peterlin@andestech.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-11	Merge tag 'for-netdev' of ↵	Jakub Kicinski
	https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Alexei Starovoitov says: ==================== pull-request: bpf-next 2024-03-11 We've added 59 non-merge commits during the last 9 day(s) which contain a total of 88 files changed, 4181 insertions(+), 590 deletions(-). The main changes are: 1) Enforce VM_IOREMAP flag and range in ioremap_page_range and introduce VM_SPARSE kind and vm_area_[un]map_pages to be used in bpf_arena, from Alexei. 2) Introduce bpf_arena which is sparse shared memory region between bpf program and user space where structures inside the arena can have pointers to other areas of the arena, and pointers work seamlessly for both user-space programs and bpf programs, from Alexei and Andrii. 3) Introduce may_goto instruction that is a contract between the verifier and the program. The verifier allows the program to loop assuming it's behaving well, but reserves the right to terminate it, from Alexei. 4) Use IETF format for field definitions in the BPF standard document, from Dave. 5) Extend struct_ops libbpf APIs to allow specify version suffixes for stuct_ops map types, share the same BPF program between several map definitions, and other improvements, from Eduard. 6) Enable struct_ops support for more than one page in trampolines, from Kui-Feng. 7) Support kCFI + BPF on riscv64, from Puranjay. 8) Use bpf_prog_pack for arm64 bpf trampoline, from Puranjay. 9) Fix roundup_pow_of_two undefined behavior on 32-bit archs, from Toke. ==================== Link: https://lore.kernel.org/r/20240312003646.8692-1-alexei.starovoitov@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-11	Merge tag 'kvm-riscv-6.9-1' of https://github.com/kvm-riscv/linux into HEAD	Paolo Bonzini
	KVM/riscv changes for 6.9 - Exception and interrupt handling for selftests - Sstc (aka arch_timer) selftest - Forward seed CSR access to KVM userspace - Ztso extension support for Guest/VM - Zacas extension support for Guest/VM
2024-03-11	Merge tag 'loongarch-kvm-6.9' of ↵	Paolo Bonzini
	git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD LoongArch KVM changes for v6.9 * Set reserved bits as zero in CPUCFG. * Start SW timer only when vcpu is blocking. * Do not restart SW timer when it is expired. * Remove unnecessary CSR register saving during enter guest.
2024-03-09	Merge tag 'kvm-x86-guest_memfd_fixes-6.8' of ↵	Paolo Bonzini
	https://github.com/kvm-x86/linux into HEAD KVM GUEST_MEMFD fixes for 6.8: - Make KVM_MEM_GUEST_MEMFD mutually exclusive with KVM_MEM_READONLY to avoid creating ABI that KVM can't sanely support. - Update documentation for KVM_SW_PROTECTED_VM to make it abundantly clear that such VMs are purely a development and testing vehicle, and come with zero guarantees. - Limit KVM_SW_PROTECTED_VM guests to the TDP MMU, as the long term plan is to support confidential VMs with deterministic private memory (SNP and TDX) only in the TDP MMU. - Fix a bug in a GUEST_MEMFD negative test that resulted in false passes when verifying that KVM_MEM_GUEST_MEMFD memslots can't be dirty logged.
2024-03-06	bpf, riscv64/cfi: Support kCFI + BPF on riscv64	Puranjay Mohan
	The riscv BPF JIT doesn't emit proper kCFI prologues for BPF programs and struct_ops trampolines when CONFIG_CFI_CLANG is enabled. This causes CFI failures when calling BPF programs and can even crash the kernel due to invalid memory accesses. Example crash: root@rv-selftester:~/bpf# ./test_progs -a dummy_st_ops Unable to handle kernel paging request at virtual address ffffffff78204ffc Oops [#1] Modules linked in: bpf_testmod(OE) [....] CPU: 3 PID: 356 Comm: test_progs Tainted: P OE 6.8.0-rc1 #1 Hardware name: riscv-virtio,qemu (DT) epc : bpf_struct_ops_test_run+0x28c/0x5fc ra : bpf_struct_ops_test_run+0x26c/0x5fc epc : ffffffff82958010 ra : ffffffff82957ff0 sp : ff200000007abc80 gp : ffffffff868d6218 tp : ff6000008d87b840 t0 : 000000000000000f t1 : 0000000000000000 t2 : 000000002005793e s0 : ff200000007abcf0 s1 : ff6000008a90fee0 a0 : 0000000000000000 a1 : 0000000000000000 a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000 a5 : ffffffff868dba26 a6 : 0000000000000001 a7 : 0000000052464e43 s2 : 00007ffffc0a95f0 s3 : ff6000008a90fe80 s4 : ff60000084c24c00 s5 : ffffffff78205000 s6 : ff60000088750648 s7 : ff20000000035008 s8 : fffffffffffffff4 s9 : ffffffff86200610 s10: 0000000000000000 s11: 0000000000000000 t3 : ffffffff8483dc30 t4 : ffffffff8483dc10 t5 : ffffffff8483dbf0 t6 : ffffffff8483dbd0 status: 0000000200000120 badaddr: ffffffff78204ffc cause: 000000000000000d [<ffffffff82958010>] bpf_struct_ops_test_run+0x28c/0x5fc [<ffffffff805083ee>] bpf_prog_test_run+0x170/0x548 [<ffffffff805029c8>] __sys_bpf+0x2d2/0x378 [<ffffffff804ff570>] __riscv_sys_bpf+0x5c/0x120 [<ffffffff8000e8fe>] syscall_handler+0x62/0xe4 [<ffffffff83362df6>] do_trap_ecall_u+0xc6/0x27c [<ffffffff833822c4>] ret_from_exception+0x0/0x64 Code: b603 0109 b683 0189 b703 0209 8493 0609 157d 8d65 (a303) ffca ---[ end trace 0000000000000000 ]--- Kernel panic - not syncing: Fatal exception SMP: stopping secondary CPUs Implement proper kCFI prologues for the BPF programs and callbacks and drop __nocfi for riscv64. Fix the trampoline generation code to emit kCFI prologue when a struct_ops trampoline is being prepared. Signed-off-by: Puranjay Mohan <puranjay12@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/bpf/20240303170207.82201-2-puranjay12@gmail.com
2024-03-06	mm/treewide: align up pXd_leaf() retval across archs	Peter Xu
	Even if pXd_leaf() API is defined globally, it's not clear on the retval, and there are three types used (bool, int, unsigned log). Always return a boolean for pXd_leaf() APIs. Link: https://lkml.kernel.org/r/20240305043750.93762-11-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Muchun Song <muchun.song@linux.dev> Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-03-06	arch: define CONFIG_PAGE_SIZE_*KB on all architectures	Arnd Bergmann
	Most architectures only support a single hardcoded page size. In order to ensure that each one of these sets the corresponding Kconfig symbols, change over the PAGE_SHIFT definition to the common one and allow only the hardware page size to be selected. Acked-by: Guo Ren <guoren@kernel.org> Acked-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Stafford Horne <shorne@gmail.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-03-06	RISC-V: KVM: Allow Zacas extension for Guest/VM	Anup Patel
	Extend the KVM ISA extension ONE_REG interface to allow KVM user space to detect and enable Zacas extension for Guest/VM. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Anup Patel <anup@brainfault.org>
2024-03-06	RISC-V: KVM: Allow Ztso extension for Guest/VM	Anup Patel
	Extend the KVM ISA extension ONE_REG interface to allow KVM user space to detect and enable Ztso extension for Guest/VM. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Anup Patel <anup@brainfault.org>
2024-03-01	Merge tag 'riscv-for-linus-6.8-rc7' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - detect ".option arch" support on not-yet-released LLVM builds - fix missing TLB flush when modifying non-leaf PTEs - fixes for T-Head custom extensions - fix for systems with the legacy PMU, that manifests as a crash on kernels built without SBI PMU support - fix for systems that clear envcfg on suspend, which manifests as cbo.zero trapping after resume - fixes for Svnapot systems, including removing Svnapot support for huge vmalloc/vmap regions tag 'riscv-for-linus-6.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: Sparse-Memory/vmemmap out-of-bounds fix riscv: Fix pte_leaf_size() for NAPOT Revert "riscv: mm: support Svnapot in huge vmap" riscv: Save/restore envcfg CSR during CPU suspend riscv: Add a custom ISA extension for the [ms]envcfg CSR riscv: Fix enabling cbo.zero when running in M-mode perf: RISCV: Fix panic on pmu overflow handler MAINTAINERS: Update SiFive driver maintainers drivers: perf: ctr_get_width function for legacy is not defined drivers: perf: added capabilities for legacy PMU RISC-V: Ignore V from the riscv,isa DT property on older T-Head CPUs riscv: Fix build error if !CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION riscv: mm: fix NOCACHE_THEAD does not set bit[61] correctly riscv: add CALLER_ADDRx support RISC-V: Drop invalid test from CONFIG_AS_HAS_OPTION_ARCH kbuild: Add -Wa,--fatal-warnings to as-instr invocation riscv: tlb: fix __p*d_free_tlb()
2024-02-29	riscv: Sparse-Memory/vmemmap out-of-bounds fix	Dimitris Vlachos
	Offset vmemmap so that the first page of vmemmap will be mapped to the first page of physical memory in order to ensure that vmemmap’s bounds will be respected during pfn_to_page()/page_to_pfn() operations. The conversion macros will produce correct SV39/48/57 addresses for every possible/valid DRAM_BASE inside the physical memory limits. v2:Address Alex's comments Suggested-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Dimitris Vlachos <dvlachos@ics.forth.gr> Reported-by: Dimitris Vlachos <dvlachos@ics.forth.gr> Closes: https://lore.kernel.org/linux-riscv/20240202135030.42265-1-csd4492@csd.uoc.gr Fixes: d95f1a542c3d ("RISC-V: Implement sparsemem") Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240229191723.32779-1-dvlachos@ics.forth.gr Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-29	Merge patch series "NAPOT Fixes"	Palmer Dabbelt
	Alexandre Ghiti <alexghiti@rivosinc.com> says: This contains 2 fixes for NAPOT: patch 1 disables the use of NAPOT mapping for vmalloc/vmap and patch 2 implements pte_leaf_size() to report NAPOT size. * b4-shazam-merge: riscv: Fix pte_leaf_size() for NAPOT Revert "riscv: mm: support Svnapot in huge vmap" Link: https://lore.kernel.org/r/20240227205016.121901-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-29	riscv: Fix pte_leaf_size() for NAPOT	Alexandre Ghiti
	pte_leaf_size() must be reimplemented to add support for NAPOT mappings. Fixes: 82a1a1f3bfb6 ("riscv: mm: support Svnapot in hugetlb page") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240227205016.121901-3-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-29	Revert "riscv: mm: support Svnapot in huge vmap"	Alexandre Ghiti
	This reverts commit ce173474cf19fe7fbe8f0fc74e3c81ec9c3d9807. We cannot correctly deal with NAPOT mappings in vmalloc/vmap because if some part of a NAPOT mapping is unmapped, the remaining mapping is not updated accordingly. For example: ptr = vmalloc_huge(64 * 1024, GFP_KERNEL); vunmap_range((unsigned long)(ptr + PAGE_SIZE), (unsigned long)(ptr + 64 * 1024)); leads to the following kernel page table dump: 0xffff8f8000ef0000-0xffff8f8000ef1000 0x00000001033c0000 4K PTE N .. .. D A G . . W R V Meaning the first entry which was not unmapped still has the N bit set, which, if accessed first and cached in the TLB, could allow access to the unmapped range. That's because the logic to break the NAPOT mapping does not exist and likely won't. Indeed, to break a NAPOT mapping, we first have to clear the whole mapping, flush the TLB and then set the new mapping ("break- before-make" equivalent). That works fine in userspace since we can handle any pagefault occurring on the remaining mapping but we can't handle a kernel pagefault on such mapping. So fix this by reverting the commit that introduced the vmap/vmalloc support. Fixes: ce173474cf19 ("riscv: mm: support Svnapot in huge vmap") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240227205016.121901-2-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-29	Merge patch series "riscv: cbo.zero fixes"	Palmer Dabbelt
	Samuel Holland <samuel.holland@sifive.com> says: This series fixes a couple of issues related to using the cbo.zero instruction in userspace. The first patch fixes a bug where the wrong enable bit gets set if the kernel is running in M-mode. The remaining patches fix a bug where the enable bit gets reset to its default value after a nonretentive idle state. I have hardware which reproduces this: Before this series: $ tools/testing/selftests/riscv/hwprobe/cbo TAP version 13 1..3 ok 1 Zicboz block size # Zicboz block size: 64 Illegal instruction After applying this series: $ tools/testing/selftests/riscv/hwprobe/cbo TAP version 13 1..3 ok 1 Zicboz block size # Zicboz block size: 64 ok 2 cbo.zero ok 3 cbo.zero check # Totals: pass:3 fail:0 xfail:0 xpass:0 skip:0 error:0 * b4-shazam-merge: riscv: Save/restore envcfg CSR during CPU suspend riscv: Add a custom ISA extension for the [ms]envcfg CSR riscv: Fix enabling cbo.zero when running in M-mode Link: https://lore.kernel.org/r/20240228065559.3434837-1-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-29	riscv: Save/restore envcfg CSR during CPU suspend	Samuel Holland
	The value of the [ms]envcfg CSR is lost when entering a nonretentive idle state, so the CSR must be rewritten when resuming the CPU. Cc: <stable@vger.kernel.org> # v6.7+ Fixes: 43c16d51a19b ("RISC-V: Enable cbo.zero in usermode") Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20240228065559.3434837-4-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-29	riscv: Add a custom ISA extension for the [ms]envcfg CSR	Samuel Holland
	The [ms]envcfg CSR was added in version 1.12 of the RISC-V privileged ISA (aka S[ms]1p12). However, bits in this CSR are defined by several other extensions which may be implemented separately from any particular version of the privileged ISA (for example, some unrelated errata may prevent an implementation from claiming conformance with Ss1p12). As a result, Linux cannot simply use the privileged ISA version to determine if the CSR is present. It must also check if any of these other extensions are implemented. It also cannot probe the existence of the CSR at runtime, because Linux does not require Sstrict, so (in the absence of additional information) it cannot know if a CSR at that address is [ms]envcfg or part of some non-conforming vendor extension. Since there are several standard extensions that imply the existence of the [ms]envcfg CSR, it becomes unwieldy to check for all of them wherever the CSR is accessed. Instead, define a custom Xlinuxenvcfg ISA extension bit that is implied by the other extensions and denotes that the CSR exists as defined in the privileged ISA, containing at least one of the fields common between menvcfg and senvcfg. This extension does not need to be parsed from the devicetree or ISA string because it can only be implemented as a subset of some other standard extension. Cc: <stable@vger.kernel.org> # v6.7+ Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20240228065559.3434837-3-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-29	riscv: Fix enabling cbo.zero when running in M-mode	Samuel Holland
	When the kernel is running in M-mode, the CBZE bit must be set in the menvcfg CSR, not in senvcfg. Cc: <stable@vger.kernel.org> Fixes: 43c16d51a19b ("RISC-V: Enable cbo.zero in usermode") Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20240228065559.3434837-2-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-23	kexec: split crashkernel reservation code out from crash_core.c	Baoquan He
	Patch series "Split crash out from kexec and clean up related config items", v3. Motivation: ============= Previously, LKP reported a building error. When investigating, it can't be resolved reasonablly with the present messy kdump config items. https://lore.kernel.org/oe-kbuild-all/202312182200.Ka7MzifQ-lkp@intel.com/ The kdump (crash dumping) related config items could causes confusions: Firstly, CRASH_CORE enables codes including - crashkernel reservation; - elfcorehdr updating; - vmcoreinfo exporting; - crash hotplug handling; Now fadump of powerpc, kcore dynamic debugging and kdump all selects CRASH_CORE, while fadump - fadump needs crashkernel parsing, vmcoreinfo exporting, and accessing global variable 'elfcorehdr_addr'; - kcore only needs vmcoreinfo exporting; - kdump needs all of the current kernel/crash_core.c. So only enabling PROC_CORE or FA_DUMP will enable CRASH_CORE, this mislead people that we enable crash dumping, actual it's not. Secondly, It's not reasonable to allow KEXEC_CORE select CRASH_CORE. Because KEXEC_CORE enables codes which allocate control pages, copy kexec/kdump segments, and prepare for switching. These codes are shared by both kexec reboot and kdump. We could want kexec reboot, but disable kdump. In that case, CRASH_CORE should not be selected. -------------------- CONFIG_CRASH_CORE=y CONFIG_KEXEC_CORE=y CONFIG_KEXEC=y CONFIG_KEXEC_FILE=y --------------------- Thirdly, It's not reasonable to allow CRASH_DUMP select KEXEC_CORE. That could make KEXEC_CORE, CRASH_DUMP are enabled independently from KEXEC or KEXEC_FILE. However, w/o KEXEC or KEXEC_FILE, the KEXEC_CORE code built in doesn't make any sense because no kernel loading or switching will happen to utilize the KEXEC_CORE code. --------------------- CONFIG_CRASH_CORE=y CONFIG_KEXEC_CORE=y CONFIG_CRASH_DUMP=y --------------------- In this case, what is worse, on arch sh and arm, KEXEC relies on MMU, while CRASH_DUMP can still be enabled when !MMU, then compiling error is seen as the lkp test robot reported in above link. ------arch/sh/Kconfig------ config ARCH_SUPPORTS_KEXEC def_bool MMU config ARCH_SUPPORTS_CRASH_DUMP def_bool BROKEN_ON_SMP --------------------------- Changes: =========== 1, split out crash_reserve.c from crash_core.c; 2, split out vmcore_infoc. from crash_core.c; 3, move crash related codes in kexec_core.c into crash_core.c; 4, remove dependency of FA_DUMP on CRASH_DUMP; 5, clean up kdump related config items; 6, wrap up crash codes in crash related ifdefs on all 8 arch-es which support crash dumping, except of ppc; Achievement: =========== With above changes, I can rearrange the config item logic as below (the right item depends on or is selected by the left item): PROC_KCORE -----------> VMCORE_INFO \|----------> VMCORE_INFO FA_DUMP----\| \|----------> CRASH_RESERVE ---->VMCORE_INFO / \|---->CRASH_RESERVE KEXEC --\| /\| \|--> KEXEC_CORE--> CRASH_DUMP-->/-\|---->PROC_VMCORE KEXEC_FILE --\| \ \| \---->CRASH_HOTPLUG KEXEC --\| \|--> KEXEC_CORE (for kexec reboot only) KEXEC_FILE --\| Test ======== On all 8 architectures, including x86_64, arm64, s390x, sh, arm, mips, riscv, loongarch, I did below three cases of config item setting and building all passed. Take configs on x86_64 as exampmle here: (1) Both CONFIG_KEXEC and KEXEC_FILE is unset, then all kexec/kdump items are unset automatically: # Kexec and crash features # CONFIG_KEXEC is not set # CONFIG_KEXEC_FILE is not set # end of Kexec and crash features (2) set CONFIG_KEXEC_FILE and 'make olddefconfig': --------------- # Kexec and crash features CONFIG_CRASH_RESERVE=y CONFIG_VMCORE_INFO=y CONFIG_KEXEC_CORE=y CONFIG_KEXEC_FILE=y CONFIG_CRASH_DUMP=y CONFIG_CRASH_HOTPLUG=y CONFIG_CRASH_MAX_MEMORY_RANGES=8192 # end of Kexec and crash features --------------- (3) unset CONFIG_CRASH_DUMP in case 2 and execute 'make olddefconfig': ------------------------ # Kexec and crash features CONFIG_KEXEC_CORE=y CONFIG_KEXEC_FILE=y # end of Kexec and crash features ------------------------ Note: For ppc, it needs investigation to make clear how to split out crash code in arch folder. Hope Hari and Pingfan can help have a look, see if it's doable. Now, I make it either have both kexec and crash enabled, or disable both of them altogether. This patch (of 14): Both kdump and fa_dump of ppc rely on crashkernel reservation. Move the relevant codes into separate files: crash_reserve.c, include/linux/crash_reserve.h. And also add config item CRASH_RESERVE to control its enabling of the codes. And update config items which has relationship with crashkernel reservation. And also change ifdeffery from CONFIG_CRASH_CORE to CONFIG_CRASH_RESERVE when those scopes are only crashkernel reservation related. And also rename arch/XXX/include/asm/{crash_core.h => crash_reserve.h} on arm64, x86 and risc-v because those architectures' crash_core.h is only related to crashkernel reservation. [akpm@linux-foundation.org: s/CRASH_RESEERVE/CRASH_RESERVE/, per Klara Modin] Link: https://lkml.kernel.org/r/20240124051254.67105-1-bhe@redhat.com Link: https://lkml.kernel.org/r/20240124051254.67105-2-bhe@redhat.com Signed-off-by: Baoquan He <bhe@redhat.com> Acked-by: Hari Bathini <hbathini@linux.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Pingfan Liu <piliu@redhat.com> Cc: Klara Modin <klarasmodin@gmail.com> Cc: Michael Kelley <mhklinux@outlook.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-22	riscv: remove MCOUNT_NAME workaround	Nathan Chancellor
	Now that the minimum supported version of LLVM for building the kernel has been bumped to 13.0.1, the condition for using _mcount as MCOUNT_NAME is always true, as the build will fail during the configuration stage for older LLVM versions. Replace MCOUNT_NAME with _mcount directly. This effectively reverts commit 7ce047715030 ("riscv: Workaround mcount name prior to clang-13"). Link: https://lkml.kernel.org/r/20240125-bump-min-llvm-ver-to-13-0-1-v1-7-f5ff9bda41c5@kernel.org Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: "Aneesh Kumar K.V (IBM)" <aneesh.kumar@kernel.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Borislav Petkov (AMD) <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Conor Dooley <conor@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Nicolas Schier <nicolas@fjasle.eu> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-22	arch and include: update LLVM Phabricator links	Nathan Chancellor
	reviews.llvm.org was LLVM's Phabricator instances for code review. It has been abandoned in favor of GitHub pull requests. While the majority of links in the kernel sources still work because of the work Fangrui has done turning the dynamic Phabricator instance into a static archive, there are some issues with that work, so preemptively convert all the links in the kernel sources to point to the commit on GitHub. Most of the commits have the corresponding differential review link in the commit message itself so there should not be any loss of fidelity in the relevant information. Link: https://discourse.llvm.org/t/update-on-github-pull-requests/71540/172 Link: https://lkml.kernel.org/r/20240109-update-llvm-links-v1-2-eb09b59db071@kernel.org Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Fangrui Song <maskray@google.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Mykola Lysenko <mykolal@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-22	riscv: Fix build error if !CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION	Alexandre Ghiti
	The new riscv specific arch_hugetlb_migration_supported() must be guarded with a #ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION to avoid the following build error: In file included from include/linux/hugetlb.h:851, from kernel/fork.c:52: >> arch/riscv/include/asm/hugetlb.h:15:42: error: static declaration of 'arch_hugetlb_migration_supported' follows non-static declaration 15 \| #define arch_hugetlb_migration_supported arch_hugetlb_migration_supported \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/hugetlb.h:916:20: note: in expansion of macro 'arch_hugetlb_migration_supported' 916 \| static inline bool arch_hugetlb_migration_supported(struct hstate h) \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ arch/riscv/include/asm/hugetlb.h:14:6: note: previous declaration of 'arch_hugetlb_migration_supported' with type 'bool(struct hstate )' {aka '_Bool(struct hstate )'} 14 \| bool arch_hugetlb_migration_supported(struct hstate h); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202402110258.CV51JlEI-lkp@intel.com/ Fixes: ce68c035457b ("riscv: Fix arch_hugetlb_migration_supported() for NAPOT") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240211083640.756583-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-22	riscv: mm: fix NOCACHE_THEAD does not set bit[61] correctly	Yangyu Chen
	Previous commit dbfbda3bd6bf ("riscv: mm: update T-Head memory type definitions") from patch [1] missed a `<` for bit shifting, result in bit(61) does not set in _PAGE_NOCACHE_THEAD and leaves bit(0) set instead. This patch get this fixed. Link: https://lore.kernel.org/linux-riscv/20230912072510.2510-1-jszhang@kernel.org/ [1] Fixes: dbfbda3bd6bf ("riscv: mm: update T-Head memory type definitions") Signed-off-by: Yangyu Chen <cyy@cyyself.name> Reviewed-by: Guo Ren <guoren@kernel.org> Reviewed-by: Jisheng Zhang <jszhang@kernel.org> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/tencent_E19FA1A095768063102E654C6FC858A32F06@qq.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-22	riscv: add CALLER_ADDRx support	Zong Li
	CALLER_ADDRx returns caller's address at specified level, they are used for several tracers. These macros eventually use __builtin_return_address(n) to get the caller's address if arch doesn't define their own implementation. In RISC-V, __builtin_return_address(n) only works when n == 0, we need to walk the stack frame to get the caller's address at specified level. data.level started from 'level + 3' due to the call flow of getting caller's address in RISC-V implementation. If we don't have additional three iteration, the level is corresponding to follows: callsite -> return_address -> arch_stack_walk -> walk_stackframe \| \| \| \| level 3 level 2 level 1 level 0 Fixes: 10626c32e382 ("riscv/ftrace: Add basic support") Cc: stable@vger.kernel.org Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Zong Li <zong.li@sifive.com> Link: https://lore.kernel.org/r/20240202015102.26251-1-zong.li@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-22	Merge commit '8246601a7d391ce8207408149d65732f28af81a1' into fixes	Palmer Dabbelt
	This single fix is also part of a larger cleanup, so I'm merging it into my fixes branch so it can be shared with for-next. * commit '8246601a7d391ce8207408149d65732f28af81a1': riscv: tlb: fix __p*d_free_tlb()
2024-02-22	riscv/pgtable: define PFN_PTE_SHIFT	David Hildenbrand
	We want to make use of pte_next_pfn() outside of set_ptes(). Let's simply define PFN_PTE_SHIFT, required by pte_next_pfn(). Link: https://lkml.kernel.org/r/20240129124649.189745-6-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Tested-by: Ryan Roberts <ryan.roberts@arm.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Aneesh Kumar K.V <aneesh.kumar@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: David S. Miller <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Naveen N. Rao <naveen.n.rao@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Russell King (Oracle) <linux@armlinux.org.uk> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-22	arm64, powerpc, riscv, s390, x86: ptdump: refactor CONFIG_DEBUG_WX	Christophe Leroy
	All architectures using the core ptdump functionality also implement CONFIG_DEBUG_WX, and they all do it more or less the same way, with a function called debug_checkwx() that is called by mark_rodata_ro(), which is a substitute to ptdump_check_wx() when CONFIG_DEBUG_WX is set and a no-op otherwise. Refactor by centrally defining debug_checkwx() in linux/ptdump.h and call debug_checkwx() immediately after calling mark_rodata_ro() instead of calling it at the end of every mark_rodata_ro(). On x86_32, mark_rodata_ro() first checks __supported_pte_mask has _PAGE_NX before calling debug_checkwx(). Now the check is inside the callee ptdump_walk_pgd_level_checkwx(). On powerpc_64, mark_rodata_ro() bails out early before calling ptdump_check_wx() when the MMU doesn't have KERNEL_RO feature. The check is now also done in ptdump_check_wx() as it is called outside mark_rodata_ro(). Link: https://lkml.kernel.org/r/a59b102d7964261d31ead0316a9f18628e4e7a8e.1706610398.git.christophe.leroy@csgroup.eu Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: "Aneesh Kumar K.V (IBM)" <aneesh.kumar@kernel.org> Cc: Borislav Petkov (AMD) <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Greg KH <greg@kroah.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Phong Tran <tranmanphong@gmail.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Steven Price <steven.price@arm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-15	Merge patch series "membarrier: riscv: Core serializing command"	Palmer Dabbelt
	RISC-V was lacking a membarrier implementation for the store/fetch ordering, which is a bit tricky because of the deferred icache flushing we use in RISC-V. * b4-shazam-merge: membarrier: riscv: Provide core serializing command locking: Introduce prepare_sync_core_cmd() membarrier: Create Documentation/scheduler/membarrier.rst membarrier: riscv: Add full memory barrier in switch_mm() Link: https://lore.kernel.org/r/20240131144936.29190-1-parri.andrea@gmail.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-15	membarrier: riscv: Provide core serializing command	Andrea Parri
	RISC-V uses xRET instructions on return from interrupt and to go back to user-space; the xRET instruction is not core serializing. Use FENCE.I for providing core serialization as follows: - by calling sync_core_before_usermode() on return from interrupt (cf. ipi_sync_core()), - via switch_mm() and sync_core_before_usermode() (respectively, for uthread->uthread and kthread->uthread transitions) before returning to user-space. On RISC-V, the serialization in switch_mm() is activated by resetting the icache_stale_mask of the mm at prepare_sync_core_cmd(). Suggested-by: Palmer Dabbelt <palmer@dabbelt.com> Signed-off-by: Andrea Parri <parri.andrea@gmail.com> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/r/20240131144936.29190-5-parri.andrea@gmail.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-15	membarrier: riscv: Add full memory barrier in switch_mm()	Andrea Parri
	The membarrier system call requires a full memory barrier after storing to rq->curr, before going back to user-space. The barrier is only needed when switching between processes: the barrier is implied by mmdrop() when switching from kernel to userspace, and it's not needed when switching from userspace to kernel. Rely on the feature/mechanism ARCH_HAS_MEMBARRIER_CALLBACKS and on the primitive membarrier_arch_switch_mm(), already adopted by the PowerPC architecture, to insert the required barrier. Fixes: fab957c11efe2f ("RISC-V: Atomic and Locking Code") Signed-off-by: Andrea Parri <parri.andrea@gmail.com> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/r/20240131144936.29190-2-parri.andrea@gmail.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-09	work around gcc bugs with 'asm goto' with outputs	Linus Torvalds
	We've had issues with gcc and 'asm goto' before, and we created a 'asm_volatile_goto()' macro for that in the past: see commits 3f0116c3238a ("compiler/gcc4: Add quirk for 'asm goto' miscompilation bug") and a9f180345f53 ("compiler/gcc4: Make quirk for asm_volatile_goto() unconditional"). Then, much later, we ended up removing the workaround in commit 43c249ea0b1e ("compiler-gcc.h: remove ancient workaround for gcc PR 58670") because we no longer supported building the kernel with the affected gcc versions, but we left the macro uses around. Now, Sean Christopherson reports a new version of a very similar problem, which is fixed by re-applying that ancient workaround. But the problem in question is limited to only the 'asm goto with outputs' cases, so instead of re-introducing the old workaround as-is, let's rename and limit the workaround to just that much less common case. It looks like there are at least two separate issues that all hit in this area: (a) some versions of gcc don't mark the asm goto as 'volatile' when it has outputs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98619 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110420 which is easy to work around by just adding the 'volatile' by hand. (b) Internal compiler errors: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110422 which are worked around by adding the extra empty 'asm' as a barrier, as in the original workaround. but the problem Sean sees may be a third thing since it involves bad code generation (not an ICE) even with the manually added 'volatile'. but the same old workaround works for this case, even if this feels a bit like voodoo programming and may only be hiding the issue. Reported-and-tested-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/all/20240208220604.140859-1-seanjc@google.com/ Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Uros Bizjak <ubizjak@gmail.com> Cc: Jakub Jelinek <jakub@redhat.com> Cc: Andrew Pinski <quic_apinski@quicinc.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-02-09	Merge tag 'riscv-for-linus-6.8-rc4' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - fix missing TLB flush during early boot on SPARSEMEM_VMEMMAP configurations - fixes to correctly implement the break-before-make behavior requried by the ISA for NAPOT mappings - fix a missing TLB flush on intermediate mapping changes - fix build warning about a missing declaration of overflow_stack - fix performace regression related to incorrect tracking of completed batch TLB flushes * tag 'riscv-for-linus-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: Fix arch_tlbbatch_flush() by clearing the batch cpumask riscv: declare overflow_stack as exported from traps.c riscv: Fix arch_hugetlb_migration_supported() for NAPOT riscv: Flush the tlb when a page directory is freed riscv: Fix hugetlb_mask_last_page() when NAPOT is enabled riscv: Fix set_huge_pte_at() for NAPOT mapping riscv: mm: execute local TLB flush after populating vmemmap
2024-02-08	kvm: replace __KVM_HAVE_READONLY_MEM with Kconfig symbol	Paolo Bonzini
	KVM uses __KVM_HAVE_* symbols in the architecture-dependent uapi/asm/kvm.h to mask unused definitions in include/uapi/linux/kvm.h. __KVM_HAVE_READONLY_MEM however was nothing but a misguided attempt to define KVM_CAP_READONLY_MEM only on architectures where KVM_CHECK_EXTENSION(KVM_CAP_READONLY_MEM) could possibly return nonzero. This however does not make sense, and it prevented userspace from supporting this architecture-independent feature without recompilation. Therefore, these days __KVM_HAVE_READONLY_MEM does not mask anything and is only used in virt/kvm/kvm_main.c. Userspace does not need to test it and there should be no need for it to exist. Remove it and replace it with a Kconfig symbol within Linux source code. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-02-07	riscv: declare overflow_stack as exported from traps.c	Ben Dooks
	The percpu area overflow_stacks is exported from arch/riscv/kernel/traps.c for use in the entry code, but is not declared anywhere. Add the relevant declaration to arch/riscv/include/asm/stacktrace.h to silence the following sparse warning: arch/riscv/kernel/traps.c:395:1: warning: symbol '__pcpu_scope_overflow_stack' was not declared. Should it be static? We don't add the stackinfo_get_overflow() call as for some of the other architectures as this doesn't seem to be used yet, so just silence the warning. Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Fixes: be97d0db5f44 ("riscv: VMAP_STACK overflow detection thread-safe") Link: https://lore.kernel.org/r/20231123134214.81481-1-ben.dooks@codethink.co.uk Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-07	riscv: Fix arch_hugetlb_migration_supported() for NAPOT	Alexandre Ghiti
	arch_hugetlb_migration_supported() must be reimplemented to add support for NAPOT hugepages, which is done here. Fixes: 82a1a1f3bfb6 ("riscv: mm: support Svnapot in hugetlb page") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240130120114.106003-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-06	riscv: Flush the tlb when a page directory is freed	Alexandre Ghiti
	The riscv privileged specification mandates to flush the TLB whenever a page directory is modified, so add that to tlb_flush(). Fixes: c5e9b2c2ae82 ("riscv: Improve tlb_flush()") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Link: https://lore.kernel.org/r/20240128120405.25876-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-01	kernel.h: removed REPEAT_BYTE from kernel.h	Tanzir Hasan
	This patch creates wordpart.h and includes it in asm/word-at-a-time.h for all architectures. WORD_AT_A_TIME_CONSTANTS depends on kernel.h because of REPEAT_BYTE. Moving this to another header and including it where necessary allows us to not include the bloated kernel.h. Making this implicit dependency on REPEAT_BYTE explicit allows for later improvements in the lib/string.c inclusion list. Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Tanzir Hasan <tanzirh@google.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Link: https://lore.kernel.org/r/20231226-libstringheader-v6-1-80aa08c7652c@google.com Signed-off-by: Kees Cook <keescook@chromium.org>
2024-01-31	riscv: mm: execute local TLB flush after populating vmemmap	Vincent Chen
	The spare_init() calls memmap_populate() many times to create VA to PA mapping for the VMEMMAP area, where all "struct page" are located once CONFIG_SPARSEMEM_VMEMMAP is defined. These "struct page" are later initialized in the zone_sizes_init() function. However, during this process, no sfence.vma instruction is executed for this VMEMMAP area. This omission may cause the hart to fail to perform page table walk because some data related to the address translation is invisible to the hart. To solve this issue, the local_flush_tlb_kernel_range() is called right after the sparse_init() to execute a sfence.vma instruction for this VMEMMAP area, ensuring that all data related to the address translation is visible to the hart. Fixes: d95f1a542c3d ("RISC-V: Implement sparsemem") Signed-off-by: Vincent Chen <vincent.chen@sifive.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240117140333.2479667-1-vincent.chen@sifive.com Fixes: 7a92fc8b4d20 ("mm: Introduce flush_cache_vmap_early()") Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-26	Merge tag 'kvm-riscv-6.8-2' of https://github.com/kvm-riscv/linux into HEAD	Paolo Bonzini
	KVM/riscv changes for 6.8 part #2 - Zbc extension support for Guest/VM - Scalar crypto extensions support for Guest/VM - Vector crypto extensions support for Guest/VM - Zfh[min] extensions support for Guest/VM - Zihintntl extension support for Guest/VM - Zvfh[min] extensions support for Guest/VM - Zfa extension support for Guest/VM
2024-01-24	riscv: Avoid code duplication with generic bitops implementation	Xiao Wang
	There's code duplication between the fallback implementation for bitops __ffs/__fls/ffs/fls API and the generic C implementation in include/asm-generic/bitops/. To avoid this duplication, this patch renames the generic C implementation by adding a "generic_" prefix to them, then we can use these generic APIs as fallback. Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Xiao Wang <xiao.w.wang@intel.com> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Link: https://lore.kernel.org/r/20231112094421.4014931-1-xiao.w.wang@intel.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-24	riscv: blacklist assembly symbols for kprobe	Clément Léger
	Adding kprobes on some assembly functions (mainly exception handling) will result in crashes (either recursive trap or panic). To avoid such errors, add ASM_NOKPROBE() macro which allow adding specific symbols into the __kprobe_blacklist section and use to blacklist the following symbols that showed to be problematic: - handle_exception() - ret_from_exception() - handle_kernel_stack_overflow() Signed-off-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Link: https://lore.kernel.org/r/20231004131009.409193-1-cleger@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-24	Merge patch series "riscv: support fast gup"	Palmer Dabbelt
	Jisheng Zhang <jszhang@kernel.org> says: This series adds fast gup support to riscv. The First patch fixes a bug in __pd_free_tlb(). Per the riscv privileged spec, if non-leaf PTEs I.E pmd, pud or p4d is modified, a sfence.vma is a must. The 2nd patch is a preparation patch. The last two patches do the real work: In order to implement fast gup we need to ensure that the page table walker is protected from page table pages being freed from under it. riscv situation is more complicated than other architectures: some riscv platforms may use IPI to perform TLB shootdown, for example, those platforms which support AIA, usually the riscv_ipi_for_rfence is true on these platforms; some riscv platforms may rely on the SBI to perform TLB shootdown, usually the riscv_ipi_for_rfence is false on these platforms. To keep software pagetable walkers safe in this case we switch to RCU based table free (MMU_GATHER_RCU_TABLE_FREE). See the comment below 'ifdef CONFIG_MMU_GATHER_RCU_TABLE_FREE' in include/asm-generic/tlb.h for more details. This patch enables MMU_GATHER_RCU_TABLE_FREE, then use tlb_remove_page_ptdesc() for those platforms which use IPI to perform TLB shootdown; tlb_remove_ptdesc() for those platforms which use SBI to perform TLB shootdown; Both case mean that disabling interrupts will block the free and protect the fast gup page walker. So after the 3rd patch, everything is well prepared, let's select HAVE_FAST_GUP if MMU. b4-shazam-merge: riscv: enable HAVE_FAST_GUP if MMU riscv: enable MMU_GATHER_RCU_TABLE_FREE for SMP && MMU riscv: tlb: convert __pd_free_tlb() to inline functions riscv: tlb: fix __pd_free_tlb() Link: https://lore.kernel.org/r/20231219175046.2496-1-jszhang@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-24	riscv: enable HAVE_FAST_GUP if MMU	Jisheng Zhang
	Activate the fast gup for riscv mmu platforms. Here are some GUP_FAST_BENCHMARK performance numbers: Before the patch: GUP_FAST_BENCHMARK: Time: get:53203 put:5085 us After the patch: GUP_FAST_BENCHMARK: Time: get:17711 put:5060 us The get time is reduced by 66.7%! IOW, 3x get speed! Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20231219175046.2496-5-jszhang@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-24	riscv: enable MMU_GATHER_RCU_TABLE_FREE for SMP && MMU	Jisheng Zhang
	In order to implement fast gup we need to ensure that the page table walker is protected from page table pages being freed from under it. riscv situation is more complicated than other architectures: some riscv platforms may use IPI to perform TLB shootdown, for example, those platforms which support AIA, usually the riscv_ipi_for_rfence is true on these platforms; some riscv platforms may rely on the SBI to perform TLB shootdown, usually the riscv_ipi_for_rfence is false on these platforms. To keep software pagetable walkers safe in this case we switch to RCU based table free (MMU_GATHER_RCU_TABLE_FREE). See the comment below 'ifdef CONFIG_MMU_GATHER_RCU_TABLE_FREE' in include/asm-generic/tlb.h for more details. This patch enables MMU_GATHER_RCU_TABLE_FREE, then use tlb_remove_page_ptdesc() for those platforms which use IPI to perform TLB shootdown; tlb_remove_ptdesc() for those platforms which use SBI to perform TLB shootdown; Both case mean that disabling interrupts will block the free and protect the fast gup page walker. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20231219175046.2496-4-jszhang@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-24	riscv: tlb: convert __p*d_free_tlb() to inline functions	Jisheng Zhang
	This is to prepare for enabling MMU_GATHER_RCU_TABLE_FREE. No functionality changes. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20231219175046.2496-3-jszhang@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-24	riscv: tlb: fix __p*d_free_tlb()	Jisheng Zhang
	If non-leaf PTEs I.E pmd, pud or p4d is modified, a sfence.vma is a must for safe, imagine if an implementation caches the non-leaf translation in TLB, although I didn't meet this HW so far, but it's possible in theory. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Fixes: c5e9b2c2ae82 ("riscv: Improve tlb_flush()") Link: https://lore.kernel.org/r/20231219175046.2496-2-jszhang@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-22	RISC-V: add helper function to read the vector VLEN	Heiko Stuebner
	VLEN describes the length of each vector register and some instructions need specific minimal VLENs to work correctly. The vector code already includes a variable riscv_v_vsize that contains the value of "32 vector registers with vlenb length" that gets filled during boot. vlenb is the value contained in the CSR_VLENB register and the value represents "VLEN / 8". So add riscv_vector_vlen() to return the actual VLEN value for in-kernel users when they need to check the available VLEN. Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu> Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Jerry Shih <jerry.shih@sifive.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20240122002024.27477-2-ebiggers@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-20	Merge tag 'riscv-for-linus-6.8-mw4' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull more RISC-V updates from Palmer Dabbelt: - Support for tuning for systems with fast misaligned accesses. - Support for SBI-based suspend. - Support for the new SBI debug console extension. - The T-Head CMOs now use PA-based flushes. - Support for enabling the V extension in kernel code. - Optimized IP checksum routines. - Various ftrace improvements. - Support for archrandom, which depends on the Zkr extension. - The build is no longer broken under NET=n, KUNIT=y for ports that don't define their own ipv6 checksum. * tag 'riscv-for-linus-6.8-mw4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (56 commits) lib: checksum: Fix build with CONFIG_NET=n riscv: lib: Check if output in asm goto supported riscv: Fix build error on rv32 + XIP riscv: optimize ELF relocation function in riscv RISC-V: Implement archrandom when Zkr is available riscv: Optimize hweight API with Zbb extension riscv: add dependency among Image(.gz), loader(.bin), and vmlinuz.efi samples: ftrace: Add RISC-V support for SAMPLE_FTRACE_DIRECT[_MULTI] riscv: ftrace: Add DYNAMIC_FTRACE_WITH_DIRECT_CALLS support riscv: ftrace: Make function graph use ftrace directly riscv: select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY lib/Kconfig.debug: Update AS_HAS_NON_CONST_LEB128 comment and name riscv: Restrict DWARF5 when building with LLVM to known working versions riscv: Hoist linker relaxation disabling logic into Kconfig kunit: Add tests for csum_ipv6_magic and ip_fast_csum riscv: Add checksum library riscv: Add checksum header riscv: Add static key for misaligned accesses asm-generic: Improve csum_fold RISC-V: selftests: cbo: Ensure asm operands match constraints ...