summaryrefslogtreecommitdiff
path: root/arch/riscv/kernel/sys_riscv.c
AgeCommit message (Collapse)Author
2024-06-25syscalls: mmap(): use unsigned offset type consistentlyArnd Bergmann
Most architectures that implement the old-style mmap() with byte offset use 'unsigned long' as the type for that offset, but microblaze and riscv have the off_t type that is shared with userspace, matching the prototype in include/asm-generic/syscalls.h. Make this consistent by using an unsigned argument everywhere. This changes the behavior slightly, as the argument is shifted to a page number, and an user input with the top bit set would result in a negative page offset rather than a large one as we use elsewhere. For riscv, the 32-bit sys_mmap2() definition actually used a custom type that is different from the global declaration, but this was missed due to an incorrect type check. Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-03-27riscv: remove unused headertanzirh@google.com
arch/riscv/kernel/sys_riscv.c builds without using anything from asm-generic/mman-common.h. There seems to be no reason to include this file. Signed-off-by: Tanzir Hasan <tanzirh@google.com> Fixes: 9e2e6042a7ec ("riscv: Allow PROT_WRITE-only mmap()") Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://lore.kernel.org/r/20231212-removeasmgeneric-v2-1-a0e6d7df34a7@google.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-03RISC-V: Move the hwprobe syscall to its own fileAndrew Jones
As Palmer says, hwprobe is "sort of its own thing now, and it's only going to get bigger..." Suggested-by: Palmer Dabbelt <palmer@dabbelt.com> Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20231122164700.127954-8-ajones@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-03RISC-V: hwprobe: Clarify cpus size parameterAndrew Jones
The "count" parameter associated with the 'cpus' parameter of the hwprobe syscall is the size in bytes of 'cpus'. Naming it 'cpu_count' may mislead users (it did me) to think it's the number of CPUs that are or can be represented by 'cpus' instead. This is particularly easy (IMO) to get wrong since 'cpus' is documented to be defined by CPU_SET(3) and CPU_SET(3) also documents a CPU_COUNT() (the number of CPUs in set) macro. CPU_SET(3) refers to the size of cpu sets with 'setsize'. Adopt 'cpusetsize' for the hwprobe parameter and specifically state it is in bytes in Documentation/riscv/hwprobe.rst to clarify. Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20231122164700.127954-7-ajones@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-11-08Merge tag 'riscv-for-linus-6.7-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: - Support for cbo.zero in userspace - Support for CBOs on ACPI-based systems - A handful of improvements for the T-Head cache flushing ops - Support for software shadow call stacks - Various cleanups and fixes * tag 'riscv-for-linus-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (31 commits) RISC-V: hwprobe: Fix vDSO SIGSEGV riscv: configs: defconfig: Enable configs required for RZ/Five SoC riscv: errata: prefix T-Head mnemonics with th. riscv: put interrupt entries into .irqentry.text riscv: mm: Update the comment of CONFIG_PAGE_OFFSET riscv: Using TOOLCHAIN_HAS_ZIHINTPAUSE marco replace zihintpause riscv/mm: Fix the comment for swap pte format RISC-V: clarify the QEMU workaround in ISA parser riscv: correct pt_level name via pgtable_l5/4_enabled RISC-V: Provide pgtable_l5_enabled on rv32 clocksource: timer-riscv: Increase rating of clock_event_device for Sstc clocksource: timer-riscv: Don't enable/disable timer interrupt lkdtm: Fix CFI_BACKWARD on RISC-V riscv: Use separate IRQ shadow call stacks riscv: Implement Shadow Call Stack riscv: Move global pointer loading to a macro riscv: Deduplicate IRQ stack switching riscv: VMAP_STACK overflow detection thread-safe RISC-V: cacheflush: Initialize CBO variables on ACPI systems RISC-V: ACPI: RHCT: Add function to get CBO block sizes ...
2023-10-10docs: move riscv under archCosta Shulyupin
and fix all in-tree references. Architecture-specific documentation is being moved into Documentation/arch/ as a way of cleaning up the top-level documentation directory and making the docs hierarchy more closely match the source hierarchy. Signed-off-by: Costa Shulyupin <costa.shul@redhat.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20230930185354.3034118-1-costa.shul@redhat.com
2023-09-21RISC-V: hwprobe: Expose Zicboz extension and its block sizeAndrew Jones
Expose Zicboz through hwprobe and also provide a key to extract its respective block size. Opportunistically add a macro and apply it to current extensions in order to avoid duplicating code. Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Evan Green <evan@rivosinc.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230918131518.56803-11-ajones@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-08-23riscv: Implement syscall wrappersSami Tolvanen
Commit f0bddf50586d ("riscv: entry: Convert to generic entry") moved syscall handling to C code, which exposed function pointer type mismatches that trip fine-grained forward-edge Control-Flow Integrity (CFI) checks as syscall handlers are all called through the same syscall_t pointer type. To fix the type mismatches, implement pt_regs based syscall wrappers similarly to x86 and arm64. This patch is based on arm64 syscall wrappers added in commit 4378a7d4be30 ("arm64: implement syscall wrappers"), where the main goal was to minimize the risk of userspace-controlled values being used under speculation. This may be a concern for riscv in future as well. Following other architectures, the syscall wrappers generate three functions for each syscall; __riscv_<compat_>sys_<name> takes a pt_regs pointer and extracts arguments from registers, __se_<compat_>sys_<name> is a sign-extension wrapper that casts the long arguments to the correct types for the real syscall implementation, which is named __do_<compat_>sys_<name>. Reviewed-by: Kees Cook <keescook@chromium.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/r/20230710183544.999540-9-samitolvanen@google.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-06-19Merge patch series "RISC-V: Export Zba, Zbb to usermode via hwprobe"Palmer Dabbelt
Evan Green <evan@rivosinc.com> says: This change detects the presence of Zba, Zbb, and Zbs extensions and exports them per-hart to userspace via the hwprobe mechanism. Glibc can then use these in setting up hwcaps-based library search paths. There's a little bit of extra housekeeping here: the first change adds Zba and Zbs to the set of extensions the kernel recognizes, and the second change starts tracking ISA features per-hart (in addition to the ANDed mask of features across all harts which the kernel uses to make decisions). Now that we track the ISA information per-hart, we could even fix up /proc/cpuinfo to accurately report extension per-hart, though I've left that out of this series for now. * b4-shazam-merge: RISC-V: hwprobe: Expose Zba, Zbb, and Zbs RISC-V: Track ISA extensions per hart RISC-V: Add Zba, Zbs extension probing Link: https://lore.kernel.org/r/20230509182504.2997252-1-evan@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-06-19RISC-V: hwprobe: Expose Zba, Zbb, and ZbsEvan Green
Add two new bits to the IMA_EXT_0 key for ZBA, ZBB, and ZBS extensions. These are accurately reported per CPU. Signed-off-by: Evan Green <evan@rivosinc.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Heiko Stuebner <heiko.stuebner@vrull.eu> Link: https://lore.kernel.org/r/20230509182504.2997252-4-evan@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-06-08riscv: hwprobe: Add support for probing V in RISCV_HWPROBE_KEY_IMA_EXT_0Andy Chiu
Probing kernel support for Vector extension is available now. This only add detection for V only. Extenions like Zvfh, Zk are not in this scope. Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Evan Green <evan@rivosinc.com> Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com> Link: https://lore.kernel.org/r/20230605110724.21391-4-andy.chiu@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-04-26RISC-V: hwprobe: Explicity check for -1 in vdso initAndrew Jones
id_bitsmash is unsigned. We need to explicitly check for -1, rather than use > 0. Fixes: aa5af0aa90ba ("RISC-V: Add hwprobe vDSO function and data") Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Evan Green <evan@rivosinc.com> Link: https://lore.kernel.org/r/20230426141333.10063-3-ajones@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-04-26RISC-V: hwprobe: There can only be one firstAndrew Jones
Only capture the first cpu_id in order for the comparison below to be of any use. Fixes: ea3de9ce8aa2 ("RISC-V: Add a syscall for HW probing") Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Evan Green <evan@rivosinc.com> Link: https://lore.kernel.org/r/20230426141333.10063-2-ajones@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-04-18RISC-V: Add hwprobe vDSO function and dataEvan Green
Add a vDSO function __vdso_riscv_hwprobe, which can sit in front of the riscv_hwprobe syscall and answer common queries. We stash a copy of static answers for the "all CPUs" case in the vDSO data page. This data is private to the vDSO, so we can decide later to change what's stored there or under what conditions we defer to the syscall. Currently all data can be discovered at boot, so the vDSO function answers all queries when the cpumask is set to the "all CPUs" hint. There's also a boolean in the data that lets the vDSO function know that all CPUs are the same. In that case, the vDSO will also answer queries for arbitrary CPU masks in addition to the "all CPUs" hint. Signed-off-by: Evan Green <evan@rivosinc.com> Link: https://lore.kernel.org/r/20230407231103.2622178-7-evan@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-04-18RISC-V: hwprobe: Support probing of misaligned access performanceEvan Green
This allows userspace to select various routines to use based on the performance of misaligned access on the target hardware. Rather than adding DT bindings, this change taps into the alternatives mechanism used to probe CPU errata. Add a new function pointer alongside the vendor-specific errata_patch_func() that probes for desirable errata (otherwise known as "features"). Unlike the errata_patch_func(), this function is called on each CPU as it comes up, so it can save feature information per-CPU. The T-head C906 has fast unaligned access, both as defined by GCC [1], and in performing a basic benchmark, which determined that byte copies are >50% slower than a misaligned word copy of the same data size (source for this test at [2]): bytecopy size f000 count 50000 offset 0 took 31664899 us wordcopy size f000 count 50000 offset 0 took 5180919 us wordcopy size f000 count 50000 offset 1 took 13416949 us [1] https://github.com/gcc-mirror/gcc/blob/master/gcc/config/riscv/riscv.cc#L353 [2] https://pastebin.com/EPXvDHSW Co-developed-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Evan Green <evan@rivosinc.com> Reviewed-by: Heiko Stuebner <heiko.stuebner@vrull.eu> Tested-by: Heiko Stuebner <heiko.stuebner@vrull.eu> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Paul Walmsley <paul.walmsley@sifive.com> Link: https://lore.kernel.org/r/20230407231103.2622178-5-evan@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-04-18RISC-V: hwprobe: Add support for RISCV_HWPROBE_BASE_BEHAVIOR_IMAEvan Green
We have an implicit set of base behaviors that userspace depends on, which are mostly defined in various ISA specifications. Co-developed-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Evan Green <evan@rivosinc.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Heiko Stuebner <heiko.stuebner@vrull.eu> Tested-by: Heiko Stuebner <heiko.stuebner@vrull.eu> Reviewed-by: Paul Walmsley <paul.walmsley@sifive.com> Link: https://lore.kernel.org/r/20230407231103.2622178-4-evan@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-04-18RISC-V: Add a syscall for HW probingEvan Green
We don't have enough space for these all in ELF_HWCAP{,2} and there's no system call that quite does this, so let's just provide an arch-specific one to probe for hardware capabilities. This currently just provides m{arch,imp,vendor}id, but with the key-value pairs we can pass more in the future. Co-developed-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Evan Green <evan@rivosinc.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Heiko Stuebner <heiko.stuebner@vrull.eu> Tested-by: Heiko Stuebner <heiko.stuebner@vrull.eu> Reviewed-by: Paul Walmsley <paul.walmsley@sifive.com> Link: https://lore.kernel.org/r/20230407231103.2622178-3-evan@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-09-22riscv: Allow PROT_WRITE-only mmap()Andrew Bresticker
Commit 2139619bcad7 ("riscv: mmap with PROT_WRITE but no PROT_READ is invalid") made mmap() return EINVAL if PROT_WRITE was set wihtout PROT_READ with the justification that a write-only PTE is considered a reserved PTE permission bit pattern in the privileged spec. This check is unnecessary since we let VM_WRITE imply VM_READ on RISC-V, and it is inconsistent with other architectures that don't support write-only PTEs, creating a potential software portability issue. Just remove the check altogether and let PROT_WRITE imply PROT_READ as is the case on other architectures. Note that this also allows PROT_WRITE|PROT_EXEC mappings which were disallowed prior to the aforementioned commit; PROT_READ is implied in such mappings as well. Fixes: 2139619bcad7 ("riscv: mmap with PROT_WRITE but no PROT_READ is invalid") Reviewed-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Andrew Bresticker <abrestic@rivosinc.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220915193702.2201018-3-abrestic@rivosinc.com/ Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-07-21riscv: mmap with PROT_WRITE but no PROT_READ is invalidCeleste Liu
As mentioned in Table 4.5 in RISC-V spec Volume 2 Section 4.3, write but not read is "Reserved for future use.". For now, they are not valid. In the current code, -wx is marked as invalid, but -w- is not marked as invalid. This patch refines that judgment. Reported-by: xctan <xc-tan@outlook.com> Co-developed-by: dram <dramforever@live.com> Signed-off-by: dram <dramforever@live.com> Co-developed-by: Ruizhe Pan <c141028@gmail.com> Signed-off-by: Ruizhe Pan <c141028@gmail.com> Signed-off-by: Celeste Liu <coelacanthus@outlook.com> Link: https://lore.kernel.org/r/PH7PR14MB559464DBDD310E755F5B21E8CEDC9@PH7PR14MB5594.namprd14.prod.outlook.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-04-26riscv: compat: syscall: Add compat_sys_call_table implementationGuo Ren
Implement compat sys_call_table and some system call functions: truncate64, ftruncate64, fallocate, pread64, pwrite64, sync_file_range, readahead, fadvise64_64 which need argument translation. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Tested-by: Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20220405071314.3225832-12-guoren@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2020-06-18RISC-V: Don't allow write+exec only page mapping request in mmapYash Shah
As per the table 4.4 of version "20190608-Priv-MSU-Ratified" of the RISC-V instruction set manual[0], the PTE permission bit combination of "write+exec only" is reserved for future use. Hence, don't allow such mapping request in mmap call. An issue is been reported by David Abdurachmanov, that while running stress-ng with "sysbadaddr" argument, RCU stalls are observed on RISC-V specific kernel. This issue arises when the stress-sysbadaddr request for pages with "write+exec only" permission bits and then passes the address obtain from this mmap call to various system call. For the riscv kernel, the mmap call should fail for this particular combination of permission bits since it's not valid. [0]: http://dabbelt.com/~palmer/keep/riscv-isa-manual/riscv-privileged-20190608-1.pdf Signed-off-by: Yash Shah <yash.shah@sifive.com> Reported-by: David Abdurachmanov <david.abdurachmanov@gmail.com> [Palmer: Refer to the latest ISA specification at the only link I could find, and update the terminology.] Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2019-06-05treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 286Thomas Gleixner
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation version 2 this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 97 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190529141901.025053186@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-28RISC-V: Use a less ugly workaround for unused variable warningsPalmer Dabbelt
Thanks to Christoph Hellwig for pointing out a cleaner way to do this, as my approach was quite ugly. CC: Christoph Hellwig <hch@lst.de> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-08-20RISC-V: Define sys_riscv_flush_icache when SMP=nPalmer Dabbelt
This would be necessary to make non-SMP builds work, but there is another error in the implementation of our syscall linkage that actually just causes sys_riscv_flush_icache to never build. I've build tested this on allnoconfig and allnoconfig+SMP=y, as well as defconfig like normal. CC: Christoph Hellwig <hch@infradead.org> CC: Guenter Roeck <linux@roeck-us.net> In-Reply-To: <20180809055830.GA17533@infradead.org> In-Reply-To: <20180809132612.GA31058@roeck-us.net> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02mm: add ksys_mmap_pgoff() helper; remove in-kernel calls to sys_mmap_pgoff()Dominik Brodowski
Using this helper allows us to avoid the in-kernel calls to the sys_mmap_pgoff() syscall. The ksys_ prefix denotes that this function is meant as a drop-in replacement for the syscall. In particular, it uses the same calling convention as sys_mmap_pgoff(). This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linux-mm@kvack.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2017-12-11RISC-V: Logical vs Bitwise typoDan Carpenter
In the current code, there is a ! logical NOT where a bitwise ~ NOT was intended. It means that we never return -EINVAL. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2017-11-30RISC-V: Clean up an unused includePalmer Dabbelt
We used to have some cmpxchg syscalls. They're no longer there, so we no longer need the include. CC: Christoph Hellwig <hch@infradead.org> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2017-11-30RISC-V: Allow userspace to flush the instruction cacheAndrew Waterman
Despite RISC-V having a direct 'fence.i' instruction available to userspace (which we can't trap!), that's not actually viable when running on Linux because the kernel might schedule a process on another hart. There is no way for userspace to handle this without invoking the kernel (as it doesn't know the thread->hart mappings), so we've defined a RISC-V specific system call to flush the instruction cache. This patch adds both a system call and a VDSO entry. If possible, we'd like to avoid having the system call be considered part of the user-facing ABI and instead restrict that to the VDSO entry -- both just in general to avoid having additional user-visible ABI to maintain, and because we'd prefer that users just call the VDSO entry because there might be a better way to do this in the future (ie, one that doesn't require entering the kernel). Signed-off-by: Andrew Waterman <andrew@sifive.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2017-09-26RISC-V: User-facing APIPalmer Dabbelt
This patch contains code that is in some way visible to the user: including via system calls, the VDSO, module loading and signal handling. It also contains some generic code that is ABI visible. Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>