summaryrefslogtreecommitdiff
path: root/arch/riscv/kernel/setup.c
AgeCommit message (Collapse)Author
2025-02-14riscv: signal: fix signal_minsigstkszYong-Xuan Wang
The init_rt_signal_env() funciton is called before the alternative patch is applied, so using the alternative-related API to check the availability of an extension within this function doesn't have the intended effect. This patch reorders the init_rt_signal_env() and apply_boot_alternatives() to get the correct signal_minsigstksz. Fixes: e92f469b0771 ("riscv: signal: Report signal frame size to userspace via auxv") Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com> Reviewed-by: Zong Li <zong.li@sifive.com> Reviewed-by: Andy Chiu <andybnac@gmail.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20241220083926.19453-3-yongxuan.wang@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-25mm/memblock: add memblock_alloc_or_panic interfaceGuo Weikang
Before SLUB initialization, various subsystems used memblock_alloc to allocate memory. In most cases, when memory allocation fails, an immediate panic is required. To simplify this behavior and reduce repetitive checks, introduce `memblock_alloc_or_panic`. This function ensures that memory allocation failures result in a panic automatically, improving code readability and consistency across subsystems that require this behavior. [guoweikang.kernel@gmail.com: arch/s390: save_area_alloc default failure behavior changed to panic] Link: https://lkml.kernel.org/r/20250109033136.2845676-1-guoweikang.kernel@gmail.com Link: https://lore.kernel.org/lkml/Z2fknmnNtiZbCc7x@kernel.org/ Link: https://lkml.kernel.org/r/20250102072528.650926-1-guoweikang.kernel@gmail.com Signed-off-by: Guo Weikang <guoweikang.kernel@gmail.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> [s390] Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-12-11riscv: Fix wrong usage of __pa() on a fixmap addressAlexandre Ghiti
riscv uses fixmap addresses to map the dtb so we can't use __pa() which is reserved for linear mapping addresses. Fixes: b2473a359763 ("of/fdt: add dt_phys arg to early_init_dt_scan and early_init_dt_verify") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20241209074508.53037-1-alexghiti@rivosinc.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-11-27Merge tag 'riscv-for-linus-6.13-mw1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-v updates from Palmer Dabbelt: - Support for pointer masking in userspace - Support for probing vector misaligned access performance - Support for qspinlock on systems with Zacas and Zabha * tag 'riscv-for-linus-6.13-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (38 commits) RISC-V: Remove unnecessary include from compat.h riscv: Fix default misaligned access trap riscv: Add qspinlock support dt-bindings: riscv: Add Ziccrse ISA extension description riscv: Add ISA extension parsing for Ziccrse asm-generic: ticket-lock: Add separate ticket-lock.h asm-generic: ticket-lock: Reuse arch_spinlock_t of qspinlock riscv: Implement xchg8/16() using Zabha riscv: Implement arch_cmpxchg128() using Zacas riscv: Improve zacas fully-ordered cmpxchg() riscv: Implement cmpxchg8/16() using Zabha dt-bindings: riscv: Add Zabha ISA extension description riscv: Implement cmpxchg32/64() using Zacas riscv: Do not fail to build on byte/halfword operations with Zawrs riscv: Move cpufeature.h macros into their own header KVM: riscv: selftests: Add Smnpm and Ssnpm to get-reg-list test RISC-V: KVM: Allow Smnpm and Ssnpm extensions for guests riscv: hwprobe: Export the Supm ISA extension riscv: selftests: Add a pointer masking test riscv: Allow ptrace control of the tagged address ABI ...
2024-11-11riscv: Add qspinlock supportAlexandre Ghiti
In order to produce a generic kernel, a user can select CONFIG_COMBO_SPINLOCKS which will fallback at runtime to the ticket spinlock implementation if Zabha or Ziccrse are not present. Note that we can't use alternatives here because the discovery of extensions is done too late and we need to start with the qspinlock implementation because the ticket spinlock implementation would pollute the spinlock value, so let's use static keys. This is largely based on Guo's work and Leonardo reviews at [1]. Link: https://lore.kernel.org/linux-riscv/20231225125847.2778638-1-guoren@kernel.org/ [1] Signed-off-by: Guo Ren <guoren@kernel.org> Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Andrea Parri <parri.andrea@gmail.com> Link: https://lore.kernel.org/r/20241103145153.105097-14-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-29of/fdt: add dt_phys arg to early_init_dt_scan and early_init_dt_verifyUsama Arif
__pa() is only intended to be used for linear map addresses and using it for initial_boot_params which is in fixmap for arm64 will give an incorrect value. Hence save the physical address when it is known at boot time when calling early_init_dt_scan for arm64 and use it at kexec time instead of converting the virtual address using __pa(). Note that arm64 doesn't need the FDT region reserved in the DT as the kernel explicitly reserves the passed in FDT. Therefore, only a debug warning is fixed with this change. Reported-by: Breno Leitao <leitao@debian.org> Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Usama Arif <usamaarif642@gmail.com> Fixes: ac10be5cdbfa ("arm64: Use common of_kexec_alloc_and_setup_fdt()") Link: https://lore.kernel.org/r/20241023171426.452688-1-usamaarif642@gmail.com Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2024-07-22ACPI: RISCV: Add NUMA support based on SRAT and SLITHaibo Xu
Add acpi_numa.c file to enable parse NUMA information from ACPI SRAT and SLIT tables. SRAT table provide CPUs(Hart) and memory nodes to proximity domain mapping, while SLIT table provide the distance metrics between proximity domains. Signed-off-by: Haibo Xu <haibo1.xu@intel.com> Reviewed-by: Sunil V L <sunilvl@ventanamicro.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Link: https://lore.kernel.org/r/65dbad1fda08a32922c44886e4581e49b4a2fecc.1718268003.git.haibo1.xu@intel.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-18Merge tag 'driver-core-6.8-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here are the set of driver core and kernfs changes for 6.8-rc1. Nothing major in here this release cycle, just lots of small cleanups and some tweaks on kernfs that in the very end, got reverted and will come back in a safer way next release cycle. Included in here are: - more driver core 'const' cleanups and fixes - fw_devlink=rpm is now the default behavior - kernfs tiny changes to remove some string functions - cpu handling in the driver core is updated to work better on many systems that add topologies and cpus after booting - other minor changes and cleanups All of the cpu handling patches have been acked by the respective maintainers and are coming in here in one series. Everything has been in linux-next for a while with no reported issues" * tag 'driver-core-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (51 commits) Revert "kernfs: convert kernfs_idr_lock to an irq safe raw spinlock" kernfs: convert kernfs_idr_lock to an irq safe raw spinlock class: fix use-after-free in class_register() PM: clk: make pm_clk_add_notifier() take a const pointer EDAC: constantify the struct bus_type usage kernfs: fix reference to renamed function driver core: device.h: fix Excess kernel-doc description warning driver core: class: fix Excess kernel-doc description warning driver core: mark remaining local bus_type variables as const driver core: container: make container_subsys const driver core: bus: constantify subsys_register() calls driver core: bus: make bus_sort_breadthfirst() take a const pointer kernfs: d_obtain_alias(NULL) will do the right thing... driver core: Better advertise dev_err_probe() kernfs: Convert kernfs_path_from_node_locked() from strlcpy() to strscpy() kernfs: Convert kernfs_name_locked() from strlcpy() to strscpy() kernfs: Convert kernfs_walk_ns() from strlcpy() to strscpy() initramfs: Expose retained initrd as sysfs file fs/kernfs/dir: obey S_ISGID kernel/cgroup: use kernfs_create_dir_ns() ...
2024-01-04riscv: Remove unused members from struct cpu_operationsSamuel Holland
name is not used anywhere at all. cpu_prepare and cpu_disable do nothing and always return 0 if implemented. Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20231121234736.3489608-3-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-12-06riscv: convert to use arch_cpu_is_hotpluggable()Russell King (Oracle)
Convert riscv to use the arch_cpu_is_hotpluggable() helper rather than arch_register_cpu(). Acked-by: Palmer Dabbelt <palmer@rivosinc.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Signed-off-by: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Samuel Holland <samuel.holland@sifive.com> Tested-by: Samuel Holland <samuel.holland@sifive.com> # On HiFive Unmatched Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/E1r5R4L-00Ct0d-To@rmk-PC.armlinux.org.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-06riscv: Switch over to GENERIC_CPU_DEVICESJames Morse
Now that GENERIC_CPU_DEVICES calls arch_register_cpu(), which can be overridden by the arch code, switch over to this to allow common code to choose when the register_cpu() call is made. This allows topology_init() to be removed. This is an intermediate step to the logic being moved to drivers/acpi, where GENERIC_CPU_DEVICES will do the work when booting with acpi=off. This patch also has the effect of moving the registration of CPUs from subsys to driver core initialisation, prior to any initcalls running. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Palmer Dabbelt <palmer@rivosinc.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Signed-off-by: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk> Reviewed-by: Samuel Holland <samuel.holland@sifive.com> Tested-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/E1r5R4G-00Ct0M-PS@rmk-PC.armlinux.org.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-11-08Merge tag 'riscv-for-linus-6.7-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: - Support for cbo.zero in userspace - Support for CBOs on ACPI-based systems - A handful of improvements for the T-Head cache flushing ops - Support for software shadow call stacks - Various cleanups and fixes * tag 'riscv-for-linus-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (31 commits) RISC-V: hwprobe: Fix vDSO SIGSEGV riscv: configs: defconfig: Enable configs required for RZ/Five SoC riscv: errata: prefix T-Head mnemonics with th. riscv: put interrupt entries into .irqentry.text riscv: mm: Update the comment of CONFIG_PAGE_OFFSET riscv: Using TOOLCHAIN_HAS_ZIHINTPAUSE marco replace zihintpause riscv/mm: Fix the comment for swap pte format RISC-V: clarify the QEMU workaround in ISA parser riscv: correct pt_level name via pgtable_l5/4_enabled RISC-V: Provide pgtable_l5_enabled on rv32 clocksource: timer-riscv: Increase rating of clock_event_device for Sstc clocksource: timer-riscv: Don't enable/disable timer interrupt lkdtm: Fix CFI_BACKWARD on RISC-V riscv: Use separate IRQ shadow call stacks riscv: Implement Shadow Call Stack riscv: Move global pointer loading to a macro riscv: Deduplicate IRQ stack switching riscv: VMAP_STACK overflow detection thread-safe RISC-V: cacheflush: Initialize CBO variables on ACPI systems RISC-V: ACPI: RHCT: Add function to get CBO block sizes ...
2023-10-17efi: move screen_info into efi init codeArnd Bergmann
After the vga console no longer relies on global screen_info, there are only two remaining use cases: - on the x86 architecture, it is used for multiple boot methods (bzImage, EFI, Xen, kexec) to commucate the initial VGA or framebuffer settings to a number of device drivers. - on other architectures, it is only used as part of the EFI stub, and only for the three sysfb framebuffers (simpledrm, simplefb, efifb). Remove the duplicate data structure definitions by moving it into the efi-init.c file that sets it up initially for the EFI case, leaving x86 as an exception that retains its own definition for non-EFI boots. The added #ifdefs here are optional, I added them to further limit the reach of screen_info to configurations that have at least one of the users enabled. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Acked-by: Helge Deller <deller@gmx.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20231017093947.3627976-1-arnd@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-17vgacon: rework screen_info #ifdef checksArnd Bergmann
On non-x86 architectures, the screen_info variable is generally only used for the VGA console where supported, and in some cases the EFI framebuffer or vga16fb. Now that we have a definite list of which architectures actually use it for what, use consistent #ifdef checks so the global variable is only defined when it is actually used on those architectures. Loongarch and riscv have no support for vgacon or vga16fb, but they support EFI firmware, so only that needs to be checked, and the initialization can be removed because that is handled by EFI. IA64 has both vgacon and EFI, though EFI apparently never uses a framebuffer here. Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Khalid Aziz <khalid@gonehiking.org> Acked-by: Helge Deller <deller@gmx.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20231009211845.3136536-3-arnd@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-12riscv: kdump: fix crashkernel reserving problem on RISC-VChen Jiahao
When testing on risc-v QEMU environment with "crashkernel=" parameter enabled, a problem occurred with the following message: [ 0.000000] crashkernel low memory reserved: 0xf8000000 - 0x100000000 (128 MB) [ 0.000000] crashkernel reserved: 0x0000000177e00000 - 0x0000000277e00000 (4096 MB) [ 0.000000] ------------[ cut here ]------------ [ 0.000000] WARNING: CPU: 0 PID: 0 at kernel/resource.c:779 __insert_resource+0x8e/0xd0 [ 0.000000] Modules linked in: [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.6.0-rc2-next-20230920 #1 [ 0.000000] Hardware name: riscv-virtio,qemu (DT) [ 0.000000] epc : __insert_resource+0x8e/0xd0 [ 0.000000] ra : insert_resource+0x28/0x4e [ 0.000000] epc : ffffffff80017344 ra : ffffffff8001742e sp : ffffffff81203db0 [ 0.000000] gp : ffffffff812ece98 tp : ffffffff8120dac0 t0 : ff600001f7ff2b00 [ 0.000000] t1 : 0000000000000000 t2 : 3428203030303030 s0 : ffffffff81203dc0 [ 0.000000] s1 : ffffffff81211e18 a0 : ffffffff81211e18 a1 : ffffffff81289380 [ 0.000000] a2 : 0000000277dfffff a3 : 0000000177e00000 a4 : 0000000177e00000 [ 0.000000] a5 : ffffffff81289380 a6 : 0000000277dfffff a7 : 0000000000000078 [ 0.000000] s2 : ffffffff81289380 s3 : ffffffff80a0bac8 s4 : ff600001f7ff2880 [ 0.000000] s5 : 0000000000000280 s6 : 8000000a00006800 s7 : 000000000000007f [ 0.000000] s8 : 0000000080017038 s9 : 0000000080038ea0 s10: 0000000000000000 [ 0.000000] s11: 0000000000000000 t3 : ffffffff80a0bc00 t4 : ffffffff80a0bc00 [ 0.000000] t5 : ffffffff80a0bbd0 t6 : ffffffff80a0bc00 [ 0.000000] status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003 [ 0.000000] [<ffffffff80017344>] __insert_resource+0x8e/0xd0 [ 0.000000] ---[ end trace 0000000000000000 ]--- [ 0.000000] Failed to add a Crash kernel resource at 177e00000 The crashkernel memory has been allocated successfully, whereas it failed to insert into iomem_resource. This is due to the unique reserving logic in risc-v arch specific code, i.e. crashk_res/crashk_low_res will be added into iomem_resource later in init_resources(), which is not aligned with current unified reserving logic in reserve_crashkernel_{generic,low}() and therefore leads to the failure of crashkernel reservation. Removing the arch specific code within #ifdef CONFIG_KEXEC_CORE in init_resources() to fix above problem. Fixes: 31549153088e ("riscv: kdump: use generic interface to simplify crashkernel reservation") Signed-off-by: Chen Jiahao <chenjiahao16@huawei.com> Link: https://lore.kernel.org/r/20230925024333.730964-1-chenjiahao16@huawei.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-09-21RISC-V: Enable cbo.zero in usermodeAndrew Jones
When Zicboz is present, enable its instruction (cbo.zero) in usermode by setting its respective senvcfg bit. We don't bother trying to set this bit per-task, which would also require an interface for tasks to request enabling and/or disabling. Instead, permanently set the bit for each hart which has the extension when bringing it online. This patch also introduces riscv_cpu_has_extension_[un]likely() functions to check a specific hart's ISA bitmap for extensions. Prior to checking the specific hart's bitmap in these functions we try the bitmap which represents the LCD of extensions, but only when we know it will use its optimized, alternatives path by gating its call on CONFIG_RISCV_ALTERNATIVE. When alternatives are used, the compiler ensures that the invocation of the LCD search becomes a constant true or false. When it's true, even the new functions will completely vanish from their callsites. OTOH, when the LCD check is false, we need to do a search of the hart's ISA bitmap. Had we also checked the LCD bitmap without the use of alternatives, then we would have ended up with two bitmap searches instead of one. Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230918131518.56803-10-ajones@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-09-08Merge patch series "riscv: Introduce KASLR"Palmer Dabbelt
Alexandre Ghiti <alexghiti@rivosinc.com> says: The following KASLR implementation allows to randomize the kernel mapping: - virtually: we expect the bootloader to provide a seed in the device-tree - physically: only implemented in the EFI stub, it relies on the firmware to provide a seed using EFI_RNG_PROTOCOL. arm64 has a similar implementation hence the patch 3 factorizes KASLR related functions for riscv to take advantage. The new virtual kernel location is limited by the early page table that only has one PUD and with the PMD alignment constraint, the kernel can only take < 512 positions. * b4-shazam-merge: riscv: libstub: Implement KASLR by using generic functions libstub: Fix compilation warning for rv32 arm64: libstub: Move KASLR handling functions to kaslr.c riscv: Dump out kernel offset information on panic riscv: Introduce virtual kernel mapping KASLR Link: https://lore.kernel.org/r/20230722123850.634544-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-09-05riscv: Dump out kernel offset information on panicAlexandre Ghiti
Dump out the KASLR virtual kernel offset when panic to help debug kernel. Signed-off-by: Zong Li <zong.li@sifive.com> Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Tested-by: Song Shuai <songshuaishuai@tinylab.org> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Tested-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/r/20230722123850.634544-3-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-08-31Merge patch series "riscv: Reduce ARCH_KMALLOC_MINALIGN to 8"Palmer Dabbelt
Jisheng Zhang <jszhang@kernel.org> says: Currently, riscv defines ARCH_DMA_MINALIGN as L1_CACHE_BYTES, I.E 64Bytes, if CONFIG_RISCV_DMA_NONCOHERENT=y. To support unified kernel Image, usually we have to enable CONFIG_RISCV_DMA_NONCOHERENT, thus it brings some bad effects to coherent platforms: Firstly, it wastes memory, kmalloc-96, kmalloc-32, kmalloc-16 and kmalloc-8 slab caches don't exist any more, they are replaced with either kmalloc-128 or kmalloc-64. Secondly, larger than necessary kmalloc aligned allocations results in unnecessary cache/TLB pressure. This issue also exists on arm64 platforms. From last year, Catalin tried to solve this issue by decoupling ARCH_KMALLOC_MINALIGN from ARCH_DMA_MINALIGN, limiting kmalloc() minimum alignment to dma_get_cache_alignment() and replacing ARCH_KMALLOC_MINALIGN usage in various drivers with ARCH_DMA_MINALIGN etc.[1] One fact we can make use of for riscv: if the CPU doesn't support ZICBOM or T-HEAD CMO, we know the platform is coherent. Based on Catalin's work and above fact, we can easily solve the kmalloc align issue for riscv: we can override dma_get_cache_alignment(), then let it return ARCH_DMA_MINALIGN at the beginning and return 1 once we know the underlying HW neither supports ZICBOM nor supports T-HEAD CMO. So what about if the CPU supports ZICBOM or T-HEAD CMO, but all the devices are dma coherent? Well, we use ARCH_DMA_MINALIGN as the kmalloc minimum alignment, nothing changed in this case. This case can be improved in the future once we see such platforms in mainline. After this patch, a simple test of booting to a small buildroot rootfs on qemu shows: kmalloc-96 5041 5041 96 ... kmalloc-64 9606 9606 64 ... kmalloc-32 5128 5128 32 ... kmalloc-16 7682 7682 16 ... kmalloc-8 10246 10246 8 ... So we save about 1268KB memory. The saving will be much larger in normal OS env on real HW platforms. patch1 allows kmalloc() caches aligned to the smallest value. patch2 enables DMA_BOUNCE_UNALIGNED_KMALLOC. After this series: As for coherent platforms, kmalloc-{8,16,32,96} caches come back on coherent both RV32 and RV64 platforms, I.E !ZICBOM and !THEAD_CMO. As for noncoherent RV32 platforms, nothing changed. As for noncoherent RV64 platforms, I.E either ZICBOM or THEAD_CMO, the above kmalloc caches also come back if > 4GB memory or users pass "swiotlb=mmnn,force" to force swiotlb creation if <= 4GB memory. How much mmnn should be depends on the specific platform, it needs to be tried and tested all possible usage case on the specific hardware. For example, I can use the minimal I/O TLB slabs on Sipeed M1S Dock. * b4-shazam-merge: riscv: enable DMA_BOUNCE_UNALIGNED_KMALLOC for !dma_coherent riscv: allow kmalloc() caches aligned to the smallest value Link: https://lore.kernel.org/linux-arm-kernel/20230524171904.3967031-1-catalin.marinas@arm.com/ [1] Link: https://lore.kernel.org/r/20230718152214.2907-1-jszhang@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-08-23riscv: allow kmalloc() caches aligned to the smallest valueJisheng Zhang
Currently, riscv defines ARCH_DMA_MINALIGN as L1_CACHE_BYTES, I.E 64Bytes, if CONFIG_RISCV_DMA_NONCOHERENT=y. To support unified kernel Image, usually we have to enable CONFIG_RISCV_DMA_NONCOHERENT, thus it brings some bad effects to coherent platforms: Firstly, it wastes memory, kmalloc-96, kmalloc-32, kmalloc-16 and kmalloc-8 slab caches don't exist any more, they are replaced with either kmalloc-128 or kmalloc-64. Secondly, larger than necessary kmalloc aligned allocations results in unnecessary cache/TLB pressure. This issue also exists on arm64 platforms. From last year, Catalin tried to solve this issue by decoupling ARCH_KMALLOC_MINALIGN from ARCH_DMA_MINALIGN, limiting kmalloc() minimum alignment to dma_get_cache_alignment() and replacing ARCH_KMALLOC_MINALIGN usage in various drivers with ARCH_DMA_MINALIGN etc.[1] One fact we can make use of for riscv: if the CPU doesn't support ZICBOM or T-HEAD CMO, we know the platform is coherent. Based on Catalin's work and above fact, we can easily solve the kmalloc align issue for riscv: we can override dma_get_cache_alignment(), then let it return ARCH_DMA_MINALIGN at the beginning and return 1 once we know the underlying HW neither supports ZICBOM nor supports T-HEAD CMO. So what about if the CPU supports ZICBOM or T-HEAD CMO, but all the devices are dma coherent? Well, we use ARCH_DMA_MINALIGN as the kmalloc minimum alignment, nothing changed in this case. This case can be improved in the future. After this patch, a simple test of booting to a small buildroot rootfs on qemu shows: kmalloc-96 5041 5041 96 ... kmalloc-64 9606 9606 64 ... kmalloc-32 5128 5128 32 ... kmalloc-16 7682 7682 16 ... kmalloc-8 10246 10246 8 ... So we save about 1268KB memory. The saving will be much larger in normal OS env on real HW platforms. Link: https://lore.kernel.org/linux-arm-kernel/20230524171904.3967031-1-catalin.marinas@arm.com/ [1] Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230718152214.2907-2-jszhang@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-08-16riscv: kdump: Implement crashkernel=X,[high,low]Chen Jiahao
On riscv, the current crash kernel allocation logic is trying to allocate within 32bit addressible memory region by default, if failed, try to allocate without 4G restriction. In need of saving DMA zone memory while allocating a relatively large crash kernel region, allocating the reserved memory top down in high memory, without overlapping the DMA zone, is a mature solution. Here introduce the parameter option crashkernel=X,[high,low]. One can reserve the crash kernel from high memory above DMA zone range by explicitly passing "crashkernel=X,high"; or reserve a memory range below 4G with "crashkernel=X,low". Signed-off-by: Chen Jiahao <chenjiahao16@huawei.com> Acked-by: Guo Ren <guoren@kernel.org> Acked-by: Baoquan He <bhe@redhat.com> Link: https://lore.kernel.org/r/20230726175000.2536220-2-chenjiahao16@huawei.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-06-08Merge patch series "riscv: Add vector ISA support"Palmer Dabbelt
Andy Chiu <andy.chiu@sifive.com> says: This is the v21 patch series for adding Vector extension support in Linux. Please refer to [1] for the introduction of the patchset. The v21 patch series was aimed to solve build issues from v19, provide usage guideline for the prctl interface, and address review comments on v20. Thank every one who has been reviewing, suggesting on the topic. Hope this get a step closer to the final merge. * b4-shazam-merge: (27 commits) selftests: add .gitignore file for RISC-V hwprobe selftests: Test RISC-V Vector prctl interface riscv: Add documentation for Vector riscv: Enable Vector code to be built riscv: detect assembler support for .option arch riscv: Add sysctl to set the default vector rule for new processes riscv: Add prctl controls for userspace vector management riscv: hwcap: change ELF_HWCAP to a function riscv: KVM: Add vector lazy save/restore support riscv: kvm: Add V extension to KVM ISA riscv: prevent stack corruption by reserving task_pt_regs(p) early riscv: signal: validate altstack to reflect Vector riscv: signal: Report signal frame size to userspace via auxv riscv: signal: Add sigcontext save/restore for vector riscv: signal: check fp-reserved words unconditionally riscv: Add ptrace vector support riscv: Allocate user's vector context in the first-use trap riscv: Add task switch support for vector riscv: Introduce struct/helpers to save/restore per-task Vector state riscv: Introduce riscv_v_vsize to record size of Vector context ... Link: https://lore.kernel.org/r/20230605110724.21391-1-andy.chiu@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-06-08riscv: signal: Add sigcontext save/restore for vectorGreentime Hu
This patch facilitates the existing fp-reserved words for placement of the first extension's context header on the user's sigframe. A context header consists of a distinct magic word and the size, including the header itself, of an extension on the stack. Then, the frame is followed by the context of that extension, and then a header + context body for another extension if exists. If there is no more extension to come, then the frame must be ended with a null context header. A special case is rv64gc, where the kernel support no extensions requiring to expose additional regfile to the user. In such case the kernel would place the null context header right after the first reserved word of __riscv_q_ext_state when saving sigframe. And the kernel would check if all reserved words are zeros when a signal handler returns. __riscv_q_ext_state---->| |<-__riscv_extra_ext_header ~ ~ .reserved[0]--->|0 |<- .reserved <-------|magic |<- .hdr | |size |_______ end of sc_fpregs | |ext-bdy| | ~ ~ +)size ------->|magic |<- another context header |size | |ext-bdy| ~ ~ |magic:0|<- null context header |size:0 | The vector registers will be saved in datap pointer. The datap pointer will be allocated dynamically when the task needs in kernel space. On the other hand, datap pointer on the sigframe will be set right after the __riscv_v_ext_state data structure. Co-developed-by: Vincent Chen <vincent.chen@sifive.com> Signed-off-by: Vincent Chen <vincent.chen@sifive.com> Signed-off-by: Greentime Hu <greentime.hu@sifive.com> Suggested-by: Vineet Gupta <vineetg@rivosinc.com> Suggested-by: Richard Henderson <richard.henderson@linaro.org> Co-developed-by: Andy Chiu <andy.chiu@sifive.com> Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Acked-by: Heiko Stuebner <heiko.stuebner@vrull.eu> Tested-by: Heiko Stuebner <heiko.stuebner@vrull.eu> Link: https://lore.kernel.org/r/20230605110724.21391-15-andy.chiu@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-06-01RISC-V: ACPI: Cache and retrieve the RINTC structureSunil V L
RINTC structures in the MADT provide mapping between the hartid and the CPU. This is required many times even at run time like cpuinfo. So, instead of parsing the ACPI table every time, cache the RINTC structures and provide a function to get the correct RINTC structure for a given cpu. Signed-off-by: Sunil V L <sunilvl@ventanamicro.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20230515054928.2079268-10-sunilvl@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-06-01RISC-V: Add ACPI initialization in setup_arch()Sunil V L
Initialize the ACPI core for RISC-V during boot. ACPI tables and interpreter are initialized based on the information passed from the firmware and the value of the kernel parameter 'acpi'. With ACPI support added for RISC-V, the kernel parameter 'acpi' is also supported on RISC-V. Hence, update the documentation. Signed-off-by: Sunil V L <sunilvl@ventanamicro.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230515054928.2079268-9-sunilvl@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-06-01riscv: move sbi_init() earlier before jump_label_init()Jisheng Zhang
We call jump_label_init() in setup_arch() is to use static key mechanism earlier, but riscv jump label relies on the sbi functions, If we enable static key before sbi_init(), the code path looks like: static_branch_enable() .. arch_jump_label_transform() patch_text_nosync() flush_icache_range() flush_icache_all() sbi_remote_fence_i() for CONFIG_RISCV_SBI case __sbi_rfence() Since sbi isn't initialized, so NULL deference! Here is a typical panic log: [ 0.000000] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 [ 0.000000] Oops [#1] [ 0.000000] Modules linked in: [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.18.0-rc7+ #79 [ 0.000000] Hardware name: riscv-virtio,qemu (DT) [ 0.000000] epc : 0x0 [ 0.000000] ra : sbi_remote_fence_i+0x1e/0x26 [ 0.000000] epc : 0000000000000000 ra : ffffffff80005826 sp : ffffffff80c03d50 [ 0.000000] gp : ffffffff80ca6178 tp : ffffffff80c0ad80 t0 : 6200000000000000 [ 0.000000] t1 : 0000000000000000 t2 : 62203a6b746e6972 s0 : ffffffff80c03d60 [ 0.000000] s1 : ffffffff80001af6 a0 : 0000000000000000 a1 : 0000000000000000 [ 0.000000] a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000 [ 0.000000] a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000000080200 [ 0.000000] s2 : ffffffff808b3e48 s3 : ffffffff808bf698 s4 : ffffffff80cb2818 [ 0.000000] s5 : 0000000000000001 s6 : ffffffff80c9c345 s7 : ffffffff80895aa0 [ 0.000000] s8 : 0000000000000001 s9 : 000000000000007f s10: 0000000000000000 [ 0.000000] s11: 0000000000000000 t3 : ffffffff80824d08 t4 : 0000000000000022 [ 0.000000] t5 : 000000000000003d t6 : 0000000000000000 [ 0.000000] status: 0000000000000100 badaddr: 0000000000000000 cause: 000000000000000c [ 0.000000] ---[ end trace 0000000000000000 ]--- [ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task! [ 0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]--- Fix this issue by moving sbi_init() earlier before jump_label_init() Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20230515054928.2079268-2-sunilvl@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-04-28Merge tag 'riscv-for-linus-6.4-mw1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: - Support for runtime detection of the Svnapot extension - Support for Zicboz when clearing pages - We've moved to GENERIC_ENTRY - Support for !MMU on rv32 systems - The linear region is now mapped via huge pages - Support for building relocatable kernels - Support for the hwprobe interface - Various fixes and cleanups throughout the tree * tag 'riscv-for-linus-6.4-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (57 commits) RISC-V: hwprobe: Explicity check for -1 in vdso init RISC-V: hwprobe: There can only be one first riscv: Allow to downgrade paging mode from the command line dt-bindings: riscv: add sv57 mmu-type RISC-V: hwprobe: Remove __init on probe_vendor_features() riscv: Use --emit-relocs in order to move .rela.dyn in init riscv: Check relocations at compile time powerpc: Move script to check relocations at compile time in scripts/ riscv: Introduce CONFIG_RELOCATABLE riscv: Move .rela.dyn outside of init to avoid empty relocations riscv: Prepare EFI header for relocatable kernels riscv: Unconditionnally select KASAN_VMALLOC if KASAN riscv: Fix ptdump when KASAN is enabled riscv: Fix EFI stub usage of KASAN instrumented strcmp function riscv: Move DTB_EARLY_BASE_VA to the kernel address space riscv: Rework kasan population functions riscv: Split early and final KASAN population functions riscv: Use PUD/P4D/PGD pages for the linear mapping riscv: Move the linear mapping creation in its own function riscv: Get rid of riscv_pfn_base variable ...
2023-04-27Merge tag 'devicetree-for-6.4-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull more devicetree updates from Rob Herring: - First part of DT header detangling dropping cpu.h from of_device.h and replacing some includes with forward declarations. A handful of drivers needed some adjustment to their includes as a result. - Refactor of_device.h to be used by bus drivers rather than various device drivers. This moves non-bus related functions out of of_device.h. The end goal is for of_platform.h and of_device.h to stop including each other. - Refactor open coded parsing of "ranges" in some bus drivers to use DT address parsing functions - Add some new address parsing functions of_property_read_reg(), of_range_count(), and of_range_to_resource() in preparation to convert more open coded parsing of DT addresses to use them. - Treewide clean-ups to use of_property_read_bool() and of_property_present() as appropriate. The ones here are the ones that didn't get picked up elsewhere. * tag 'devicetree-for-6.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (34 commits) bus: tegra-gmi: Replace of_platform.h with explicit includes hte: Use of_property_present() for testing DT property presence w1: w1-gpio: Use of_property_read_bool() for boolean properties virt: fsl: Use of_property_present() for testing DT property presence soc: fsl: Use of_property_present() for testing DT property presence sbus: display7seg: Use of_property_read_bool() for boolean properties sparc: Use of_property_read_bool() for boolean properties sparc: Use of_property_present() for testing DT property presence bus: mvebu-mbus: Remove open coded "ranges" parsing of/address: Add of_property_read_reg() helper of/address: Add of_range_count() helper of/address: Add support for 3 address cell bus of/address: Add of_range_to_resource() helper of: unittest: Add bus address range parsing tests of: Drop cpu.h include from of_device.h OPP: Adjust includes to remove of_device.h irqchip: loongson-eiointc: Add explicit include for cpuhotplug.h cpuidle: Adjust includes to remove of_device.h cpufreq: sun50i: Add explicit include for cpu.h cpufreq: Adjust includes to remove of_device.h ...
2023-04-13riscv: Do not set initial_boot_params to the linear address of the dtbAlexandre Ghiti
early_init_dt_verify() is already called in parse_dtb() and since the dtb address does not change anymore (it is now in the fixmap region), no need to reset initial_boot_params by calling early_init_dt_verify() again. Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20230329081932.79831-3-alexghiti@rivosinc.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-04-13riscv: Move early dtb mapping into the fixmap regionAlexandre Ghiti
riscv establishes 2 virtual mappings: - early_pg_dir maps the kernel which allows to discover the system memory - swapper_pg_dir installs the final mapping (linear mapping included) We used to map the dtb in early_pg_dir using DTB_EARLY_BASE_VA, and this mapping was not carried over in swapper_pg_dir. It happens that early_init_fdt_scan_reserved_mem() must be called before swapper_pg_dir is setup otherwise we could allocate reserved memory defined in the dtb. And this function initializes reserved_mem variable with addresses that lie in the early_pg_dir dtb mapping: when those addresses are reused with swapper_pg_dir, this mapping does not exist and then we trap. The previous "fix" was incorrect as early_init_fdt_scan_reserved_mem() must be called before swapper_pg_dir is set up otherwise we could allocate in reserved memory defined in the dtb. So move the dtb mapping in the fixmap region which is established in early_pg_dir and handed over to swapper_pg_dir. Fixes: 922b0375fc93 ("riscv: Fix memblock reservation for device tree blob") Fixes: 8f3a2b4a96dc ("RISC-V: Move DT mapping outof fixmap") Fixes: 50e63dd8ed92 ("riscv: fix reserved memory setup") Reported-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/all/f8e67f82-103d-156c-deb0-d6d6e2756f5e@microchip.com/ Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230329081932.79831-2-alexghiti@rivosinc.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-04-13riscv: Add explicit include for cpu.hRob Herring
Removing the include of cpu.h from of_device.h (included by of_platform.h) causes an error in setup.c: arch/riscv/kernel/setup.c:313:22: error: arithmetic on a pointer to an incomplete type 'typeof(struct cpu)' (aka 'struct cpu') The of_platform.h header is not necessary either, so it can be dropped. Link: https://lore.kernel.org/r/20230329-dt-cpu-header-cleanups-v1-8-581e2605fe47@kernel.org Signed-off-by: Rob Herring <robh@kernel.org>
2023-03-14RISC-V: Add Zicboz detection and block size parsingAndrew Jones
Parse "riscv,cboz-block-size" from the DT by piggybacking on Zicbom's riscv_init_cbom_blocksize(). Additionally check the DT for the presence of the "zicboz" extension and, when it's present, validate the parsed cboz block size as we do Zicbom's cbom block size with riscv_isa_extension_check(). Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230224162631.405473-5-ajones@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-31riscv: move riscv_noncoherent_supported() out of ZICBOM probeJisheng Zhang
It's a bit weird to call riscv_noncoherent_supported() each time when insmoding a module. Move the calling out of feature patch func. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230128172856.3814-2-jszhang@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-11-29riscv: mm: Proper page permissions after initmem freeBjörn Töpel
64-bit RISC-V kernels have the kernel image mapped separately to alias the linear map. The linear map and the kernel image map are documented as "direct mapping" and "kernel" respectively in [1]. At image load time, the linear map corresponding to the kernel image is set to PAGE_READ permission, and the kernel image map is set to PAGE_READ|PAGE_EXEC. When the initmem is freed, the pages in the linear map should be restored to PAGE_READ|PAGE_WRITE, whereas the corresponding pages in the kernel image map should be restored to PAGE_READ, by removing the PAGE_EXEC permission. This is not the case. For 64-bit kernels, only the linear map is restored to its proper page permissions at initmem free, and not the kernel image map. In practise this results in that the kernel can potentially jump to dead __init code, and start executing invalid instructions, without getting an exception. Restore the freed initmem properly, by setting both the kernel image map to the correct permissions. [1] Documentation/riscv/vm-layout.rst Fixes: e5c35fa04019 ("riscv: Map the kernel with correct permissions the first time") Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Alexandre Ghiti <alex@ghiti.fr> Tested-by: Alexandre Ghiti <alex@ghiti.fr> Link: https://lore.kernel.org/r/20221115090641.258476-1-bjorn@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-11-10riscv: fix reserved memory setupConor Dooley
Currently, RISC-V sets up reserved memory using the "early" copy of the device tree. As a result, when trying to get a reserved memory region using of_reserved_mem_lookup(), the pointer to reserved memory regions is using the early, pre-virtual-memory address which causes a kernel panic when trying to use the buffer's name: Unable to handle kernel paging request at virtual address 00000000401c31ac Oops [#1] Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 6.0.0-rc1-00001-g0d9d6953d834 #1 Hardware name: Microchip PolarFire-SoC Icicle Kit (DT) epc : string+0x4a/0xea ra : vsnprintf+0x1e4/0x336 epc : ffffffff80335ea0 ra : ffffffff80338936 sp : ffffffff81203be0 gp : ffffffff812e0a98 tp : ffffffff8120de40 t0 : 0000000000000000 t1 : ffffffff81203e28 t2 : 7265736572203a46 s0 : ffffffff81203c20 s1 : ffffffff81203e28 a0 : ffffffff81203d22 a1 : 0000000000000000 a2 : ffffffff81203d08 a3 : 0000000081203d21 a4 : ffffffffffffffff a5 : 00000000401c31ac a6 : ffff0a00ffffff04 a7 : ffffffffffffffff s2 : ffffffff81203d08 s3 : ffffffff81203d00 s4 : 0000000000000008 s5 : ffffffff000000ff s6 : 0000000000ffffff s7 : 00000000ffffff00 s8 : ffffffff80d9821a s9 : ffffffff81203d22 s10: 0000000000000002 s11: ffffffff80d9821c t3 : ffffffff812f3617 t4 : ffffffff812f3617 t5 : ffffffff812f3618 t6 : ffffffff81203d08 status: 0000000200000100 badaddr: 00000000401c31ac cause: 000000000000000d [<ffffffff80338936>] vsnprintf+0x1e4/0x336 [<ffffffff80055ae2>] vprintk_store+0xf6/0x344 [<ffffffff80055d86>] vprintk_emit+0x56/0x192 [<ffffffff80055ed8>] vprintk_default+0x16/0x1e [<ffffffff800563d2>] vprintk+0x72/0x80 [<ffffffff806813b2>] _printk+0x36/0x50 [<ffffffff8068af48>] print_reserved_mem+0x1c/0x24 [<ffffffff808057ec>] paging_init+0x528/0x5bc [<ffffffff808031ae>] setup_arch+0xd0/0x592 [<ffffffff8080070e>] start_kernel+0x82/0x73c early_init_fdt_scan_reserved_mem() takes no arguments as it operates on initial_boot_params, which is populated by early_init_dt_verify(). On RISC-V, early_init_dt_verify() is called twice. Once, directly, in setup_arch() if CONFIG_BUILTIN_DTB is not enabled and once indirectly, very early in the boot process, by parse_dtb() when it calls early_init_dt_scan_nodes(). This first call uses dtb_early_va to set initial_boot_params, which is not usable later in the boot process when early_init_fdt_scan_reserved_mem() is called. On arm64 for example, the corresponding call to early_init_dt_scan_nodes() uses fixmap addresses and doesn't suffer the same fate. Move early_init_fdt_scan_reserved_mem() further along the boot sequence, after the direct call to early_init_dt_verify() in setup_arch() so that the names use the correct virtual memory addresses. The above supposed that CONFIG_BUILTIN_DTB was not set, but should work equally in the case where it is - unflatted_and_copy_device_tree() also updates initial_boot_params. Reported-by: Valentina Fernandez <valentina.fernandezalanis@microchip.com> Reported-by: Evgenii Shatokhin <e.shatokhin@yadro.com> Link: https://lore.kernel.org/linux-riscv/f8e67f82-103d-156c-deb0-d6d6e2756f5e@microchip.com/ Fixes: 922b0375fc93 ("riscv: Fix memblock reservation for device tree blob") Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Tested-by: Evgenii Shatokhin <e.shatokhin@yadro.com> Link: https://lore.kernel.org/r/20221107151524.3941467-1-conor.dooley@microchip.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-11riscv: always honor the CONFIG_CMDLINE_FORCE when parsing dtbWenting Zhang
When CONFIG_CMDLINE_FORCE is enabled, cmdline provided by CONFIG_CMDLINE are always used. This allows CONFIG_CMDLINE to be used regardless of the result of device tree scanning. This especially fixes the case where a device tree without the chosen node is supplied to the kernel. In such cases, early_init_dt_scan would return true. But inside early_init_dt_scan_chosen, the cmdline won't be updated as there is no chosen node in the device tree. As a result, CONFIG_CMDLINE is not copied into boot_command_line even if CONFIG_CMDLINE_FORCE is enabled. This commit allows properly update boot_command_line in this situation. Fixes: 8fd6e05c7463 ("arch: riscv: support kernel command line forcing when no DTB passed") Signed-off-by: Wenting Zhang <zephray@outlook.com> Reviewed-by: Björn Töpel <bjorn@kernel.org> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/PSBPR04MB399135DFC54928AB958D0638B1829@PSBPR04MB3991.apcprd04.prod.outlook.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-09-13RISC-V: Clean up the Zicbom block size probingPalmer Dabbelt
This fixes two issues: I truncated the warning's hart ID when porting to the 64-bit hart ID code, and the original code's warning handling could fire on an uninitialized hart ID. The biggest change here is that riscv_cbom_block_size is no longer initialized, as IMO the default isn't sane: there's nothing in the ISA that mandates any specific cache block size, so falling back to one will just silently produce the wrong answer on some systems. This also changes the probing order so the cache block size is known before enabling Zicbom support. CC: stable@vger.kernel.org CC: Andrew Jones <ajones@ventanamicro.com> CC: Heiko Stuebner <heiko@sntech.de> CC: Atish Patra <atishp@rivosinc.com> Fixes: 3aefb2ee5bdd ("riscv: implement Zicbom-based CMO instructions + the t-head variant") Fixes: 1631ba1259d6 ("riscv: Add support for non-coherent devices using zicbom extension") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> [Conor: fixed the redefinition errors] Tested-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220912224800.998121-1-mail@conchuod.ie Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-07-28riscv: Add support for non-coherent devices using zicbom extensionHeiko Stuebner
The Zicbom ISA-extension was ratified in november 2021 and introduces instructions for dcache invalidate, clean and flush operations. Implement cache management operations for non-coherent devices based on them. Of course not all cores will support this, so implement an alternative-based mechanism that replaces empty instructions with ones done around Zicbom instructions. As discussed in previous versions, assume the platform being coherent by default so that non-coherent devices need to get marked accordingly by firmware. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/20220706231536.2041855-4-heiko@sntech.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-06-01RISC-V: Mark IORESOURCE_EXCLUSIVE for reserved mem instead of IORESOURCE_BUSYXianting Tian
Commit 00ab027a3b82 ("RISC-V: Add kernel image sections to the resource tree") marked IORESOURCE_BUSY for reserved memory, which caused resource map failed in subsequent operations of related driver, so remove the IORESOURCE_BUSY flag. In order to prohibit userland mapping reserved memory, mark IORESOURCE_EXCLUSIVE for it. The code to reproduce the issue, dts: mem0: memory@a0000000 { reg = <0x0 0xa0000000 0 0x1000000>; no-map; }; &test { status = "okay"; memory-region = <&mem0>; }; code: np = of_parse_phandle(pdev->dev.of_node, "memory-region", 0); ret = of_address_to_resource(np, 0, &r); base = devm_ioremap_resource(&pdev->dev, &r); // base = -EBUSY Fixes: 00ab027a3b82 ("RISC-V: Add kernel image sections to the resource tree") Reported-by: Huaming Jiang <jianghuaming.jhm@alibaba-inc.com> Reviewed-by: Guo Ren <guoren@kernel.org> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Tested-by: Heiko Stuebner <heiko@sntech.de> Co-developed-by: Nick Kossifidis <mick@ics.forth.gr> Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com> Link: https://lore.kernel.org/r/20220518013428.1338983-1-xianting.tian@linux.alibaba.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-05-11riscv: move boot alternatives to after fill_hwcapHeiko Stuebner
Move the application of boot alternatives to after the hw-capabilities are populated. This allows to check for available extensions when determining which alternatives to apply and also makes it actually work if CONFIG_SMP is disabled for whatever reason. Signed-off-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Philipp Tomsich <philipp.tomsich@vrull.eu> Reviewed-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/20220511192921.2223629-8-heiko@sntech.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-22drivers/base/node: consolidate node device subsystem initialization in ↵David Hildenbrand
node_dev_init() ... and call node_dev_init() after memory_dev_init() from driver_init(), so before any of the existing arch/subsys calls. All online nodes should be known at that point: early during boot, arch code determines node and zone ranges and sets the relevant nodes online; usually this happens in setup_arch(). This is in line with memory_dev_init(), which initializes the memory device subsystem and creates all memory block devices. Similar to memory_dev_init(), panic() if anything goes wrong, we don't want to continue with such basic initialization errors. The important part is that node_dev_init() gets called after memory_dev_init() and after cpu_dev_init(), but before any of the relevant archs call register_cpu() to register the new cpu device under the node device. The latter should be the case for the current users of topology_init(). Link: https://lkml.kernel.org/r/20220203105212.30385-1-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Tested-by: Anatoly Pugachev <matorola@gmail.com> (sparc64) Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Mike Rapoport <rppt@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-20RISC-V: Do not use cpumask data structure for hartid bitmapAtish Patra
Currently, SBI APIs accept a hartmask that is generated from struct cpumask. Cpumask data structure can hold upto NR_CPUs value. Thus, it is not the correct data structure for hartids as it can be higher than NR_CPUs for platforms with sparse or discontguous hartids. Remove all association between hartid mask and struct cpumask. Reviewed-by: Anup Patel <anup@brainfault.org> (For Linux RISC-V changes) Acked-by: Anup Patel <anup@brainfault.org> (For KVM RISC-V changes) Signed-off-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-09RISC-V: Use common riscv_cpuid_to_hartid_mask() for both SMP=y and SMP=nSean Christopherson
Use what is currently the SMP=y version of riscv_cpuid_to_hartid_mask() for both SMP=y and SMP=n to fix a build failure with KVM=m and SMP=n due to boot_cpu_hartid not being exported. This also fixes a second bug where the SMP=n version assumes the sole CPU in the system is in the incoming mask, which may not hold true in kvm_riscv_vcpu_sbi_ecall() if the KVM guest VM has multiple vCPUs (on a SMP=n system). Fixes: 1ef46c231df4 ("RISC-V: Implement new SBI v0.2 extensions") Reported-by: Adam Borowski <kilobyte@angband.pl> Reviewed-by: Anup Patel <anup.patel@wdc.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2021-11-06memblock: use memblock_free for freeing virtual pointersMike Rapoport
Rename memblock_free_ptr() to memblock_free() and use memblock_free() when freeing a virtual pointer so that memblock_free() will be a counterpart of memblock_alloc() The callers are updated with the below semantic patch and manual addition of (void *) casting to pointers that are represented by unsigned long variables. @@ identifier vaddr; expression size; @@ ( - memblock_phys_free(__pa(vaddr), size); + memblock_free(vaddr, size); | - memblock_free_ptr(vaddr, size); + memblock_free(vaddr, size); ) [sfr@canb.auug.org.au: fixup] Link: https://lkml.kernel.org/r/20211018192940.3d1d532f@canb.auug.org.au Link: https://lkml.kernel.org/r/20210930185031.18648-7-rppt@kernel.org Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Juergen Gross <jgross@suse.com> Cc: Shahab Vahedi <Shahab.Vahedi@synopsys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06memblock: rename memblock_free to memblock_phys_freeMike Rapoport
Since memblock_free() operates on a physical range, make its name reflect it and rename it to memblock_phys_free(), so it will be a logical counterpart to memblock_phys_alloc(). The callers are updated with the below semantic patch: @@ expression addr; expression size; @@ - memblock_free(addr, size); + memblock_phys_free(addr, size); Link: https://lkml.kernel.org/r/20210930185031.18648-6-rppt@kernel.org Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Juergen Gross <jgross@suse.com> Cc: Shahab Vahedi <Shahab.Vahedi@synopsys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-05Merge tag 'riscv-for-linus-5.15-mw0' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: - support PC-relative instructions (auipc and branches) in kprobes - support for forced IRQ threading - support for the hlt/nohlt kernel command line options, via the generic idle loop - show the edge/level triggered behavior of interrupts in /proc/interrupts - a handful of cleanups to our address mapping mechanisms - support for allocating gigantic hugepages via CMA - support for the undefined behavior sanitizer (UBSAN) - a handful of cleanups to the VDSO that allow the kernel to build with LLD. - support for hugepage migration * tag 'riscv-for-linus-5.15-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (21 commits) riscv: add support for hugepage migration RISC-V: Fix VDSO build for !MMU riscv: use strscpy to replace strlcpy riscv: explicitly use symbol offsets for VDSO riscv: Enable Undefined Behavior Sanitizer UBSAN riscv: Keep the riscv Kconfig selects sorted riscv: Support allocating gigantic hugepages using CMA riscv: fix the global name pfn_base confliction error riscv: Move early fdt mapping creation in its own function riscv: Simplify BUILTIN_DTB device tree mapping handling riscv: Use __maybe_unused instead of #ifdefs around variable declarations riscv: Get rid of map_size parameter to create_kernel_page_table riscv: Introduce va_kernel_pa_offset for 32-bit kernel riscv: Optimize kernel virtual address conversion macro dt-bindings: riscv: add starfive jh7100 bindings riscv: Enable GENERIC_IRQ_SHOW_LEVEL riscv: Enable idle generic idle loop riscv: Allow forced irq threading riscv: Implement thread_struct whitelist for hardened usercopy riscv: kprobes: implement the branch instructions ...
2021-08-25riscv: use strscpy to replace strlcpyJason Wang
The strlcpy should not be used because it doesn't limit the source length. As linus says, it's a completely useless function if you can't implicitly trust the source string - but that is almost always why people think they should use it! All in all the BSD function will lead some potential bugs. But the strscpy doesn't require reading memory from the src string beyond the specified "count" bytes, and since the return value is easier to error-check than strlcpy()'s. In addition, the implementation is robust to the string changing out from underneath it, unlike the current strlcpy() implementation. Thus, We prefer using strscpy instead of strlcpy. Signed-off-by: Jason Wang <wangborong@cdjrlc.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-20riscv: Fix a number of free'd resources in init_resources()Petr Pavlu
Function init_resources() allocates a boot memory block to hold an array of resources which it adds to iomem_resource. The array is filled in from its end and the function then attempts to free any unused memory at the beginning. The problem is that size of the unused memory is incorrectly calculated and this can result in releasing memory which is in use by active resources. Their data then gets corrupted later when the memory is reused by a different part of the system. Fix the size of the released memory to correctly match the number of unused resource entries. Fixes: ffe0e5261268 ("RISC-V: Improve init_resources()") Signed-off-by: Petr Pavlu <petr.pavlu@suse.com> Reviewed-by: Sunil V L <sunilvl@ventanamicro.com> Acked-by: Nick Kossifidis <mick@ics.forth.gr> Tested-by: Sunil V L <sunilvl@ventanamicro.com> Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-07-09Merge tag 'riscv-for-linus-5.14-mw0' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: "We have a handful of new features for 5.14: - Support for transparent huge pages. - Support for generic PCI resources mapping. - Support for the mem= kernel parameter. - Support for KFENCE. - A handful of fixes to avoid W+X mappings in the kernel. - Support for VMAP_STACK based overflow detection. - An optimized copy_{to,from}_user" * tag 'riscv-for-linus-5.14-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (37 commits) riscv: xip: Fix duplicate included asm/pgtable.h riscv: Fix PTDUMP output now BPF region moved back to module region riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall riscv: add VMAP_STACK overflow detection riscv: ptrace: add argn syntax riscv: mm: fix build errors caused by mk_pmd() riscv: Introduce structure that group all variables regarding kernel mapping riscv: Map the kernel with correct permissions the first time riscv: Introduce set_kernel_memory helper riscv: Enable KFENCE for riscv64 RISC-V: Use asm-generic for {in,out}{bwlq} riscv: add ASID-based tlbflushing methods riscv: pass the mm_struct to __sbi_tlb_flush_range riscv: Add mem kernel parameter support riscv: Simplify xip and !xip kernel address conversion macros riscv: Remove CONFIG_PHYS_RAM_BASE_FIXED riscv: Only initialize swiotlb when necessary riscv: fix typo in init.c riscv: Cleanup unused functions riscv: mm: Use better bitmap_zalloc() ...
2021-07-08riscv: convert to setup_initial_init_mm()Kefeng Wang
Use setup_initial_init_mm() helper to simplify code. Link: https://lkml.kernel.org/r/20210608083418.137226-13-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Acked-by: Palmer Dabbelt <palmerdabbelt@google.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>