Age | Commit message (Collapse) | Author |
|
This patch adds purgatory, the name and concept have been taken
from kexec-tools. Purgatory runs between two kernels, and do
verify sha256 hash to ensure the kernel to jump to is fine and
has not been corrupted after loading. Makefile is modified based
on x86 platform.
Signed-off-by: Li Zhengyu <lizhengyu3@huawei.com>
Link: https://lore.kernel.org/r/20220408100914.150110-6-lizhengyu3@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
This patch adds support for loading a kexec on panic (kdump) kernel.
It has been tested with vmcore-dmesg on riscv64 QEMU on both an smp
and a non-smp system.
Signed-off-by: Li Zhengyu <lizhengyu3@huawei.com>
Link: https://lore.kernel.org/r/20220408100914.150110-5-lizhengyu3@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
This patch adds support for kexec_file on RISC-V. I tested it on riscv64
QEMU with busybear-linux and single core along with the OpenSBI firmware
fw_jump.bin for generic platform.
On SMP system, it depends on CONFIG_{HOTPLUG_CPU, RISCV_SBI} to
resume/stop hart through OpenSBI firmware, it also needs a OpenSBI that
support the HSM extension.
Signed-off-by: Liao Chang <liaochang1@huawei.com>
Signed-off-by: Li Zhengyu <lizhengyu3@huawei.com>
Link: https://lore.kernel.org/r/20220408100914.150110-4-lizhengyu3@huawei.com
[Palmer: Make 64-bit only]
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
The pointer to buffer loading kernel binaries is in kernel space for
kexec_fil mode, When copy_from_user copies data from pointer to a block
of memory, it checkes that the pointer is in the user space range, on
RISCV-V that is:
static inline bool __access_ok(unsigned long addr, unsigned long size)
{
return size <= TASK_SIZE && addr <= TASK_SIZE - size;
}
and TASK_SIZE is 0x4000000000 for 64-bits, which now causes
copy_from_user to reject the access of the field 'buf' of struct
kexec_segment that is in range [CONFIG_PAGE_OFFSET - VMALLOC_SIZE,
CONFIG_PAGE_OFFSET), is invalid user space pointer.
This patch fixes this issue by skipping access_ok(), use mempcy() instead.
Signed-off-by: Liao Chang <liaochang1@huawei.com>
Link: https://lore.kernel.org/r/20220408100914.150110-3-lizhengyu3@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Fixes dtbs_check warnings like:
dma@3000000: $nodename:0: 'dma@3000000' does not match '^dma-controller(@.*)?$'
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Link: https://lore.kernel.org/r/20220407193856.18223-1-krzysztof.kozlowski@linaro.org
Fixes: c5ab54e9945b ("riscv: dts: add support for PDMA device of HiFive Unleashed Rev A00")
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Kernel now supports chained power-off handlers. Use do_kernel_power_off()
that invokes chained power-off handlers. It also invokes legacy
pm_power_off() for now, which will be removed once all drivers will
be converted to the new sys-off API.
Acked-by: Palmer Dabbelt <palmer@dabbelt.com>
Reviewed-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The RISC-V port supports the rv32i and rv64i base ISAs, but provides no
mechanism to run 32-bit userspace on 64-bit systems. This adds that
support, via the COMPAT framework. As the RISC-V ISAs (and uABIs) were
developed concurrently, the resulting compat support is mostly generic.
This includes a handful of cleanups to the generic compat infrastructure
to more cleanly support RISC-V, followed by the RISC-V implementation.
* palmer/riscv-compat:
riscv: compat: Add COMPAT Kbuild skeletal support
riscv: compat: ptrace: Add compat_arch_ptrace implement
riscv: compat: signal: Add rt_frame implementation
riscv: compat: vdso: Add setup additional pages implementation
riscv: compat: vdso: Add COMPAT_VDSO base code implementation
riscv: compat: Add hw capability check for elf
riscv: compat: Add elf.h implementation
riscv: compat: process: Add UXL_32 support in start_thread
riscv: compat: syscall: Add entry.S implementation
riscv: compat: syscall: Add compat_sys_call_table implementation
riscv: compat: Support TASK_SIZE for compat mode
riscv: compat: Add basic compat data type implementation
riscv: Fixup difference with defconfig
syscalls: compat: Fix the missing part for __SYSCALL_COMPAT
asm-generic: compat: Cleanup duplicate definitions
fs: stat: compat: Add __ARCH_WANT_COMPAT_STAT
arch: Add SYSVIPC_COMPAT for all architectures
compat: consolidate the compat_flock{,64} definition
uapi: always define F_GETLK64/F_SETLK64/F_SETLKW64 in fcntl.h
uapi: simplify __ARCH_FLOCK{,64}_PAD a little
|
|
Adds initial skeletal COMPAT Kbuild (Running 32bit U-mode on
64bit S-mode) support.
- Setup kconfig & dummy functions for compiling.
- Implement compat_start_thread by the way.
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/20220405071314.3225832-21-guoren@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Now, you can use native gdb on riscv64 for rv32 app debugging.
$ uname -a
Linux buildroot 5.16.0-rc4-00036-gbef6b82fdf23-dirty #53 SMP Mon Dec 20 23:06:53 CST 2021 riscv64 GNU/Linux
$ cat /proc/cpuinfo
processor : 0
hart : 0
isa : rv64imafdcsuh
mmu : sv48
$ file /bin/busybox
/bin/busybox: setuid ELF 32-bit LSB shared object, UCB RISC-V, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-riscv32-ilp32d.so.1, for GNU/Linux 5.15.0, stripped
$ file /usr/bin/gdb
/usr/bin/gdb: ELF 32-bit LSB shared object, UCB RISC-V, version 1 (GNU/Linux), dynamically linked, interpreter /lib/ld-linux-riscv32-ilp32d.so.1, for GNU/Linux 5.15.0, stripped
$ /usr/bin/gdb /bin/busybox
GNU gdb (GDB) 10.2
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
...
Reading symbols from /bin/busybox...
(No debugging symbols found in /bin/busybox)
(gdb) b main
Breakpoint 1 at 0x8ddc
(gdb) r
Starting program: /bin/busybox
Failed to read a valid object file image from memory.
Breakpoint 1, 0x555a8ddc in main ()
(gdb) i r
ra 0x77df0b74 0x77df0b74
sp 0x7fdd3d10 0x7fdd3d10
gp 0x5567e800 0x5567e800 <bb_common_bufsiz1+160>
tp 0x77f64280 0x77f64280
t0 0x0 0
t1 0x555a6fac 1431990188
t2 0x77dd8db4 2011008436
fp 0x7fdd3e34 0x7fdd3e34
s1 0x7fdd3e34 2145205812
a0 0xffffffff -1
a1 0x2000 8192
a2 0x7fdd3e3c 2145205820
a3 0x0 0
a4 0x7fdd3d30 2145205552
a5 0x555a8dc0 1431997888
a6 0x77f2c170 2012397936
a7 0x6a7c7a2f 1786542639
s2 0x0 0
s3 0x0 0
s4 0x555a8dc0 1431997888
s5 0x77f8a3a8 2012783528
s6 0x7fdd3e3c 2145205820
s7 0x5567cecc 1432866508
--Type <RET> for more, q to quit, c to continue without paging--
s8 0x1 1
s9 0x0 0
s10 0x55634448 1432568904
s11 0x0 0
t3 0x77df0bb8 2011106232
t4 0x42fc 17148
t5 0x0 0
t6 0x40 64
pc 0x555a8ddc 0x555a8ddc <main+28>
(gdb) si
0x555a78f0 in mallopt@plt ()
(gdb) c
Continuing.
BusyBox v1.34.1 (2021-12-19 22:39:48 CST) multi-call binary.
BusyBox is copyrighted by many authors between 1998-2015.
Licensed under GPLv2. See source distribution for detailed
copyright notices.
Usage: busybox [function [arguments]...]
or: busybox --list[-full]
...
[Inferior 1 (process 107) exited normally]
(gdb) q
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/20220405071314.3225832-20-guoren@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Implement compat_setup_rt_frame for sigcontext save & restore. The
main process is the same with signal, but the rv32 pt_regs' size
is different from rv64's, so we needs convert them.
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/20220405071314.3225832-19-guoren@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
In the event that random_get_entropy() can't access a cycle counter or
similar, falling back to returning 0 is really not the best we can do.
Instead, at least calling random_get_entropy_fallback() would be
preferable, because that always needs to return _something_, even
falling back to jiffies eventually. It's not as though
random_get_entropy_fallback() is super high precision or guaranteed to
be entropic, but basically anything that's not zero all the time is
better than returning zero all the time.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
As commit d283d422c6c4 ("x86: mm: add x86_64 support for page table
check"), enable ARCH_SUPPORTS_PAGE_TABLE_CHECK on riscv.
Add additional page table check stubs for page table helpers, these stubs
can be used to check the existing page table entries.
Link: https://lkml.kernel.org/r/20220507110114.4128854-7-tongtiangen@huawei.com
Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
No conflicts.
Build issue in drivers/net/ethernet/sfc/ptp.c
54fccfdd7c66 ("sfc: efx_default_channel_type APIs can be static")
49e6123c65da ("net: sfc: fix memory leak due to ptp channel")
https://lore.kernel.org/all/20220510130556.52598fe2@canb.auug.org.au/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Adds support for Svpbmt, the "Supervisor-mode: page-based memory types"
extension, which allows pages to be marked as non-cacheable and/or I/O.
This also includes support for the Allwinner D1's page table attributes
via the alternatives framework, which differ from Svpbmt in various ways
but are necessary to make the D1 function.
* palmer/riscv-d1:
riscv: add memory-type errata for T-Head
riscv: don't use global static vars to store alternative data
riscv: remove FIXMAP_PAGE_IO and fall back to its default value
riscv: add RISC-V Svpbmt extension support
riscv: Fix accessing pfn bits in PTEs for non-32bit variants
riscv: move boot alternatives to after fill_hwcap
riscv: prevent compressed instructions in alternatives
riscv: extend concatenated alternatives-lines to the same length
riscv: implement ALTERNATIVE_2 macro
riscv: implement module alternatives
riscv: allow different stages with alternatives
riscv: integrate alternatives better into the main architecture
|
|
Some current cpus based on T-Head cores implement memory-types
way different than described in the svpbmt spec even going
so far as using PTE bits marked as reserved.
Add the T-Head vendor-id and necessary errata code to
replace the affected instructions.
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Samuel Holland <samuel@sholland.org>
Link: https://lore.kernel.org/r/20220511192921.2223629-13-heiko@sntech.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Right now the code uses a global struct to store vendor-ids
and another global variable to store the vendor-patch-function.
There exist specific cases where we'll need to patch the kernel
at an even earlier stage, where trying to write to a static
variable might actually result in hangs.
Also collecting the vendor-information consists of 3 sbi-ecalls
(or csr-reads) which is pretty negligible in the context of
booting a kernel.
So rework the code to not rely on static variables and instead
collect the vendor-information when a round of alternatives is
to be applied.
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Link: https://lore.kernel.org/r/20220511192921.2223629-12-heiko@sntech.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
If not defined in the arch, FIXMAP_PAGE_IO defaults to PAGE_KERNEL_IO,
which we defined when adding the svpbmt implementation.
So drop the FIXMAP_PAGE_IO riscv define.
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Reviewed-by: Guo Ren <guoren@kernel.org>
Link: https://lore.kernel.org/r/20220511192921.2223629-11-heiko@sntech.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Svpbmt (the S should be capitalized) is the
"Supervisor-mode: page-based memory types" extension
that specifies attributes for cacheability, idempotency
and ordering.
The relevant settings are done in special bits in PTEs:
Here is the svpbmt PTE format:
| 63 | 62-61 | 60-8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
N MT RSW D A G U X W R V
^
Of the Reserved bits [63:54] in a leaf PTE, the high bit is already
allocated (as the N bit), so bits [62:61] are used as the MT (aka
MemType) field. This field specifies one of three memory types that
are close equivalents (or equivalent in effect) to the three main x86
and ARMv8 memory types - as shown in the following table.
RISC-V
Encoding &
MemType RISC-V Description
---------- ------------------------------------------------
00 - PMA Normal Cacheable, No change to implied PMA memory type
01 - NC Non-cacheable, idempotent, weakly-ordered Main Memory
10 - IO Non-cacheable, non-idempotent, strongly-ordered I/O memory
11 - Rsvd Reserved for future standard use
As the extension will not be present on all implementations,
implement a method to handle cpufeatures via alternatives
to not incur runtime penalties on cpu variants not supporting
specific extensions and patch relevant code parts at runtime.
Co-developed-by: Wei Fu <wefu@redhat.com>
Signed-off-by: Wei Fu <wefu@redhat.com>
Co-developed-by: Liu Shaohua <liush@allwinnertech.com>
Signed-off-by: Liu Shaohua <liush@allwinnertech.com>
Co-developed-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Guo Ren <guoren@kernel.org>
[moved to use the alternatives mechanism]
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Link: https://lore.kernel.org/r/20220511192921.2223629-10-heiko@sntech.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
On rv32 the PFN part of PTEs is defined to use bits [xlen-1:10]
while on rv64 it is defined to use bits [53:10], leaving [63:54]
as reserved.
With upcoming optional extensions like svpbmt these previously
reserved bits will get used so simply right-shifting the PTE
to get the PFN won't be enough.
So introduce a _PAGE_PFN_MASK constant to mask the correct bits
for both rv32 and rv64 before shifting.
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Link: https://lore.kernel.org/r/20220511192921.2223629-9-heiko@sntech.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Move the application of boot alternatives to after the hw-capabilities
are populated. This allows to check for available extensions when
determining which alternatives to apply and also makes it actually
work if CONFIG_SMP is disabled for whatever reason.
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Reviewed-by: Guo Ren <guoren@kernel.org>
Link: https://lore.kernel.org/r/20220511192921.2223629-8-heiko@sntech.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Instructions are opportunistically compressed by the RISC-V assembler
when possible, but in alternatives-blocks both the old and new content
need to be the same size, so having the toolchain do somewhat random
optimizations will cause strange side-effects like
"attempt to move .org backwards" compile-time errors.
Already a simple "and" used in alternatives assembly will cause these
mismatched code sizes.
So prevent compressed instructions to be generated in alternatives-
code and use option-push and -pop to only limit this to the relevant
code blocks
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Link: https://lore.kernel.org/r/20220511192921.2223629-7-heiko@sntech.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
ALT_NEW_CONTENT already uses same-length assembler lines, so
extend this to the other elements as well.
This makes it more readable when these elements need to be extended
in the future.
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Link: https://lore.kernel.org/r/20220511192921.2223629-6-heiko@sntech.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
When the alternatives were added the commit already provided a template
on how to implement 2 different alternatives for one piece of code.
Make this usable.
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Link: https://lore.kernel.org/r/20220511192921.2223629-5-heiko@sntech.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
This allows alternatives to also be applied when loading modules
and follows the implementation of other architectures (e.g. arm64).
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Link: https://lore.kernel.org/r/20220511192921.2223629-4-heiko@sntech.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Future features may need to be applied at a different
time during boot, so allow defining stages for alternatives
and handling them differently depending on the stage.
Also make the alternatives-location more flexible so that
future stages may provide their own location.
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Link: https://lore.kernel.org/r/20220511192921.2223629-3-heiko@sntech.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Right now the alternatives need to be explicitly enabled and
erratas are limited to SiFive ones.
We want to use alternatives not only for patching soc erratas,
but in the future also for handling different behaviour depending
on the existence of future extensions.
So move the core alternatives over to the kernel subdirectory
and move the CONFIG_RISCV_ALTERNATIVE to be a hidden symbol
which we expect relevant erratas and extensions to just select
if needed.
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Link: https://lore.kernel.org/r/20220511192921.2223629-2-heiko@sntech.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Now that we have fair spinlocks we can use the generic queued rwlocks,
so we might as well do so.
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Our existing spinlocks aren't fair and replacing them has been on the
TODO list for a long time. This moves to the recently-introduced ticket
spinlocks, which are simple enough that they are likely to be correct
and fast on the vast majority of extant implementations.
This introduces a horrible hack that allows us to split out the spinlock
conversion from the rwlock conversion. We have to do the spinlocks
first because qrwlock needs fair spinlocks, but we don't want to pollute
the asm-generic code to support the generic spinlocks without qrwlocks.
Thus we pollute the RISC-V code, but just until the next commit as it's
all going away.
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Guo Ren <guoren@kernel.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Many architectures have similar install.sh scripts.
The first half is really generic; it verifies that the kernel image
and System.map exist, then executes ~/bin/${INSTALLKERNEL} or
/sbin/${INSTALLKERNEL} if available.
The second half is kind of arch-specific; it copies the kernel image
and System.map to the destination, but the code is slightly different.
Factor out the generic part into scripts/install.sh.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <n.schier@avm.de>
|
|
In preparation for Clang supporting randstruct, reorganize the Kconfigs,
move the attribute macros, and generalize the feature to be named
CONFIG_RANDSTRUCT for on/off, CONFIG_RANDSTRUCT_FULL for the full
randomization mode, and CONFIG_RANDSTRUCT_PERFORMANCE for the cache-line
sized mode.
Cc: linux-hardening@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220503205503.3054173-4-keescook@chromium.org
|
|
Add fn and fn_arg members into struct kernel_clone_args and test for
them in copy_thread (instead of testing for PF_KTHREAD | PF_IO_WORKER).
This allows any task that wants to be a user space task that only runs
in kernel mode to use this functionality.
The code on x86 is an exception and still retains a PF_KTHREAD test
because x86 unlikely everything else handles kthreads slightly
differently than user space tasks that start with a function.
The functions that created tasks that start with a function
have been updated to set ".fn" and ".fn_arg" instead of
".stack" and ".stack_size". These functions are fork_idle(),
create_io_thread(), kernel_thread(), and user_mode_thread().
Link: https://lkml.kernel.org/r/20220506141512.516114-4-ebiederm@xmission.com
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
With io_uring we have started supporting tasks that are for most
purposes user space tasks that exclusively run code in kernel mode.
The kernel task that exec's init and tasks that exec user mode
helpers are also user mode tasks that just run kernel code
until they call kernel execve.
Pass kernel_clone_args into copy_thread so these oddball
tasks can be supported more cleanly and easily.
v2: Fix spelling of kenrel_clone_args on h8300
Link: https://lkml.kernel.org/r/20220506141512.516114-2-ebiederm@xmission.com
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fix from Palmer Dabbelt:
- A fix to relocate the DTB early in boot, in cases where the
bootloader doesn't put the DTB in a region that will end up
mapped by the kernel.
This manifests as a crash early in boot on a handful of
configurations.
* tag 'riscv-for-linus-5.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
RISC-V: relocate DTB if it's outside memory region
|
|
tools/testing/selftests/net/forwarding/Makefile
f62c5acc800e ("selftests/net/forwarding: add missing tests to Makefile")
50fe062c806e ("selftests: forwarding: new test, verify host mdb entries")
https://lore.kernel.org/all/20220502111539.0b7e4621@canb.auug.org.au/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Pull kvm fixes from Paolo Bonzini:
"ARM:
- Take care of faults occuring between the PARange and IPA range by
injecting an exception
- Fix S2 faults taken from a host EL0 in protected mode
- Work around Oops caused by a PMU access from a 32bit guest when PMU
has been created. This is a temporary bodge until we fix it for
good.
x86:
- Fix potential races when walking host page table
- Fix shadow page table leak when KVM runs nested
- Work around bug in userspace when KVM synthesizes leaf 0x80000021
on older (pre-EPYC) or Intel processors
Generic (but affects only RISC-V):
- Fix bad user ABI for KVM_EXIT_SYSTEM_EVENT"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: work around QEMU issue with synthetic CPUID leaves
Revert "x86/mm: Introduce lookup_address_in_mm()"
KVM: x86/mmu: fix potential races when walking host page table
KVM: fix bad user ABI for KVM_EXIT_SYSTEM_EVENT
KVM: x86/mmu: Do not create SPTEs for GFNs that exceed host.MAXPHYADDR
KVM: arm64: Inject exception on out-of-IPA-range translation fault
KVM/arm64: Don't emulate a PMU for 32-bit guests if feature not set
KVM: arm64: Handle host stage-2 faults from 32-bit EL0
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"A semi-large pile of clk driver fixes this time around.
Nothing is touching the core so these fixes are fairly well contained
to specific devices that use these clk drivers.
- Some Allwinner SoC fixes to gracefully handle errors and mark an
RTC clk as critical so that the RTC keeps ticking.
- Fix AXI bus clks and RTC clk design for Microchip PolarFire SoC
driver introduced this cycle. This has some devicetree bits acked
by riscv maintainers. We're fixing it now so that the prior
bindings aren't released in a major kernel version.
- Remove a reset on Microchip PolarFire SoCs that broke when enabling
CONFIG_PM.
- Set a min/max for the Qualcomm graphics clk. This got broken by the
clk rate range patches introduced this cycle"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: sunxi: sun9i-mmc: check return value after calling platform_get_resource()
clk: sunxi-ng: sun6i-rtc: Mark rtc-32k as critical
riscv: dts: microchip: reparent mpfs clocks
clk: microchip: mpfs: add RTCREF clock control
clk: microchip: mpfs: re-parent the configurable clocks
dt-bindings: rtc: add refclk to mpfs-rtc
dt-bindings: clk: mpfs: add defines for two new clocks
dt-bindings: clk: mpfs document msspll dri registers
riscv: dts: microchip: fix usage of fic clocks on mpfs
clk: microchip: mpfs: mark CLK_ATHENA as critical
clk: microchip: mpfs: fix parents for FIC clocks
clk: qcom: clk-rcg2: fix gfx3d frequency calculation
clk: microchip: mpfs: don't reset disabled peripherals
clk: sunxi-ng: fix not NULL terminated coccicheck error
|
|
Patch series "Convert vmcore to use an iov_iter", v5.
For some reason several people have been sending bad patches to fix
compiler warnings in vmcore recently. Here's how it should be done.
Compile-tested only on x86. As noted in the first patch, s390 should take
this conversion a bit further, but I'm not inclined to do that work
myself.
This patch (of 3):
Instead of passing in a 'buf' and 'userbuf' argument, pass in an iov_iter.
s390 needs more work to pass the iov_iter down further, or refactor, but
I'd be more comfortable if someone who can test on s390 did that work.
It's more convenient to convert the whole of read_from_oldmem() to take an
iov_iter at the same time, so rename it to read_from_oldmem_iter() and add
a temporary read_from_oldmem() wrapper that creates an iov_iter.
Link: https://lkml.kernel.org/r/20220408090636.560886-1-bhe@redhat.com
Link: https://lkml.kernel.org/r/20220408090636.560886-2-bhe@redhat.com
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
- A fix to properly ensure a single CPU is running during patch_text().
- A defconfig update to include RPMSG_CTRL when RPMSG_CHAR was set,
necessary after a recent refactoring.
* tag 'riscv-for-linus-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
RISC-V: configs: Configs that had RPMSG_CHAR now get RPMSG_CTRL
riscv: patch_text: Fixup last cpu should be master
|
|
Fixes for (relatively) old bugs, to be merged in both the -rc and next
development trees:
* Fix potential races when walking host page table
* Fix bad user ABI for KVM_EXIT_SYSTEM_EVENT
* Fix shadow page table leak when KVM runs nested
|
|
When KVM_EXIT_SYSTEM_EVENT was introduced, it included a flags
member that at the time was unused. Unfortunately this extensibility
mechanism has several issues:
- x86 is not writing the member, so it would not be possible to use it
on x86 except for new events
- the member is not aligned to 64 bits, so the definition of the
uAPI struct is incorrect for 32- on 64-bit userspace. This is a
problem for RISC-V, which supports CONFIG_KVM_COMPAT, but fortunately
usage of flags was only introduced in 5.18.
Since padding has to be introduced, place a new field in there
that tells if the flags field is valid. To allow further extensibility,
in fact, change flags to an array of 16 values, and store how many
of the values are valid. The availability of the new ndata field
is tied to a system capability; all architectures are changed to
fill in the field.
To avoid breaking compilation of userspace that was using the flags
field, provide a userspace-only union to overlap flags with data[0].
The new field is placed at the same offset for both 32- and 64-bit
userspace.
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Peter Gonda <pgonda@google.com>
Cc: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: kernel test robot <lkp@intel.com>
Message-Id: <20220422103013.34832-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
In case the DTB provided by the bootloader/BootROM is before the kernel
image or outside /memory, we won't be able to access it through the
linear mapping, and get a segfault on setup_arch(). Currently OpenSBI
relocates DTB but that's not always the case (e.g. if FW_JUMP_FDT_ADDR
is not specified), and it's also not the most portable approach since
the default FW_JUMP_FDT_ADDR of the generic platform relocates the DTB
at a specific offset that may not be available. To avoid this situation
copy DTB so that it's visible through the linear mapping.
Signed-off-by: Nick Kossifidis <mick@ics.forth.gr>
Link: https://lore.kernel.org/r/20220322132839.3653682-1-mick@ics.forth.gr
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Fixes: f105aa940e78 ("riscv: add BUILTIN_DTB support for MMU-enabled targets")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
include/linux/netdevice.h
net/core/dev.c
6510ea973d8d ("net: Use this_cpu_inc() to increment net->core_stats")
794c24e9921f ("net-core: rx_otherhost_dropped to core_stats")
https://lore.kernel.org/all/20220428111903.5f4304e0@canb.auug.org.au/
drivers/net/wan/cosa.c
d48fea8401cf ("net: cosa: fix error check return value of register_chrdev()")
89fbca3307d4 ("net: wan: remove support for COSA and SRP synchronous serial boards")
https://lore.kernel.org/all/20220428112130.1f689e5e@canb.auug.org.au/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Daniel Borkmann says:
====================
pull-request: bpf-next 2022-04-27
We've added 85 non-merge commits during the last 18 day(s) which contain
a total of 163 files changed, 4499 insertions(+), 1521 deletions(-).
The main changes are:
1) Teach libbpf to enhance BPF verifier log with human-readable and relevant
information about failed CO-RE relocations, from Andrii Nakryiko.
2) Add typed pointer support in BPF maps and enable it for unreferenced pointers
(via probe read) and referenced ones that can be passed to in-kernel helpers,
from Kumar Kartikeya Dwivedi.
3) Improve xsk to break NAPI loop when rx queue gets full to allow for forward
progress to consume descriptors, from Maciej Fijalkowski & Björn Töpel.
4) Fix a small RCU read-side race in BPF_PROG_RUN routines which dereferenced
the effective prog array before the rcu_read_lock, from Stanislav Fomichev.
5) Implement BPF atomic operations for RV64 JIT, and add libbpf parsing logic
for USDT arguments under riscv{32,64}, from Pu Lehui.
6) Implement libbpf parsing of USDT arguments under aarch64, from Alan Maguire.
7) Enable bpftool build for musl and remove nftw with FTW_ACTIONRETVAL usage
so it can be shipped under Alpine which is musl-based, from Dominique Martinet.
8) Clean up {sk,task,inode} local storage trace RCU handling as they do not
need to use call_rcu_tasks_trace() barrier, from KP Singh.
9) Improve libbpf API documentation and fix error return handling of various
API functions, from Grant Seltzer.
10) Enlarge offset check for bpf_skb_{load,store}_bytes() helpers given data
length of frags + frag_list may surpass old offset limit, from Liu Jian.
11) Various improvements to prog_tests in area of logging, test execution
and by-name subtest selection, from Mykola Lysenko.
12) Simplify map_btf_id generation for all map types by moving this process
to build time with help of resolve_btfids infra, from Menglong Dong.
13) Fix a libbpf bug in probing when falling back to legacy bpf_probe_read*()
helpers; the probing caused always to use old helpers, from Runqing Yang.
14) Add support for ARCompact and ARCv2 platforms for libbpf's PT_REGS
tracing macros, from Vladimir Isaev.
15) Cleanup BPF selftests to remove old & unneeded rlimit code given kernel
switched to memcg-based memory accouting a while ago, from Yafang Shao.
16) Refactor of BPF sysctl handlers to move them to BPF core, from Yan Zhu.
17) Fix BPF selftests in two occasions to work around regressions caused by latest
LLVM to unblock CI until their fixes are worked out, from Yonghong Song.
18) Misc cleanups all over the place, from various others.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (85 commits)
selftests/bpf: Add libbpf's log fixup logic selftests
libbpf: Fix up verifier log for unguarded failed CO-RE relos
libbpf: Simplify bpf_core_parse_spec() signature
libbpf: Refactor CO-RE relo human description formatting routine
libbpf: Record subprog-resolved CO-RE relocations unconditionally
selftests/bpf: Add CO-RE relos and SEC("?...") to linked_funcs selftests
libbpf: Avoid joining .BTF.ext data with BPF programs by section name
libbpf: Fix logic for finding matching program for CO-RE relocation
libbpf: Drop unhelpful "program too large" guess
libbpf: Fix anonymous type check in CO-RE logic
bpf: Compute map_btf_id during build time
selftests/bpf: Add test for strict BTF type check
selftests/bpf: Add verifier tests for kptr
selftests/bpf: Add C tests for kptr
libbpf: Add kptr type tag macros to bpf_helpers.h
bpf: Make BTF type match stricter for release arguments
bpf: Teach verifier about kptr_get kfunc helpers
bpf: Wire up freeing of referenced kptr
bpf: Populate pairs of btf_id and destructor kfunc in btf
bpf: Adapt copy_map_value for multiple offset case
...
====================
Link: https://lore.kernel.org/r/20220427224758.20976-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Reconstruct __setup_additional_pages() by appending vdso info
pointer argument to meet compat_vdso_info requirement. And change
vm_special_mapping *dm, *cm initialization into static.
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/20220405071314.3225832-18-guoren@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
There is no vgettimeofday supported in rv32 that makes simple to
generate rv32 vdso code which only needs riscv64 compiler. Other
architectures need change compiler or -m (machine parameter) to
support vdso32 compiling. If rv32 support vgettimeofday (which
cause C compile) in future, we would add CROSS_COMPILE to support
that makes more requirement on compiler enviornment.
linux-rv64/arch/riscv/kernel/compat_vdso/compat_vdso.so.dbg:
file format elf64-littleriscv
Disassembly of section .text:
0000000000000800 <__vdso_rt_sigreturn>:
800: 08b00893 li a7,139
804: 00000073 ecall
808: 0000 unimp
...
000000000000080c <__vdso_getcpu>:
80c: 0a800893 li a7,168
810: 00000073 ecall
814: 8082 ret
...
0000000000000818 <__vdso_flush_icache>:
818: 10300893 li a7,259
81c: 00000073 ecall
820: 8082 ret
linux-rv32/arch/riscv/kernel/vdso/vdso.so.dbg:
file format elf32-littleriscv
Disassembly of section .text:
00000800 <__vdso_rt_sigreturn>:
800: 08b00893 li a7,139
804: 00000073 ecall
808: 0000 unimp
...
0000080c <__vdso_getcpu>:
80c: 0a800893 li a7,168
810: 00000073 ecall
814: 8082 ret
...
00000818 <__vdso_flush_icache>:
818: 10300893 li a7,259
81c: 00000073 ecall
820: 8082 ret
Finally, reuse all *.S from vdso in compat_vdso that makes
implementation clear and readable.
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/20220405071314.3225832-17-guoren@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Detect hardware COMPAT (32bit U-mode) capability in rv64. If not
support COMPAT mode in hw, compat_elf_check_arch would return
false by compat_binfmt_elf.c
Add CLASS to enhance (compat_)elf_check_arch to distinguish
32BIT/64BIT elf.
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/20220405071314.3225832-16-guoren@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Implement necessary type and macro for compat elf. See the code
comment for detail.
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/20220405071314.3225832-15-guoren@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
If the current task is in COMPAT mode, set SR_UXL_32 in status for
returning userspace. We need CONFIG _COMPAT to prevent compiling
errors with rv32 defconfig.
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/20220405071314.3225832-14-guoren@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Implement the entry of compat_sys_call_table[] in asm. Ref to
riscv-privileged spec 4.1.1 Supervisor Status Register (sstatus):
BIT[32:33] = UXL[1:0]:
- 1:32
- 2:64
- 3:128
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/20220405071314.3225832-13-guoren@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Implement compat sys_call_table and some system call functions:
truncate64, ftruncate64, fallocate, pread64, pwrite64,
sync_file_range, readahead, fadvise64_64 which need argument
translation.
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/20220405071314.3225832-12-guoren@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|