summaryrefslogtreecommitdiff
path: root/arch/riscv/kernel
AgeCommit message (Collapse)Author
2023-01-31Merge tag 'v6.2-rc6' into sched/core, to pick up fixesIngo Molnar
Pick up fixes before merging another batch of cpuidle updates. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2023-01-25riscv: Move call to init_cpu_topology() to later initialization stageLey Foon Tan
If "capacity-dmips-mhz" is present in a CPU DT node, topology_parse_cpu_capacity() will fail to allocate memory. arm64, with which this code path is shared, does not call topology_parse_cpu_capacity() until later in boot where memory allocation is available. While "capacity-dmips-mhz" is not yet a valid property on RISC-V, invalid properties should be ignored rather than cause issues. Move init_cpu_topology(), which calls topology_parse_cpu_capacity(), to a later initialization stage, to match arm64. As a side effect of this change, RISC-V is "protected" from changes to core topology code that would work on arm64 where memory allocation is safe but on RISC-V isn't. Fixes: 03f11f03dbfe ("RISC-V: Parse cpu topology during boot.") Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Ley Foon Tan <leyfoon.tan@starfivetech.com> Link: https://lore.kernel.org/r/20230105033705.3946130-1-leyfoon.tan@starfivetech.com [Palmer: use Conor's commit text] Link: https://lore.kernel.org/linux-riscv/20230104183033.755668-1-pierre.gondois@arm.com/T/#me592d4c8b9508642954839f0077288a353b0b9b2 Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-24riscv/kprobe: Fix instruction simulation of JALRLiao Chang
Set kprobe at 'jalr 1140(ra)' of vfs_write results in the following crash: [ 32.092235] Unable to handle kernel access to user memory without uaccess routines at virtual address 00aaaaaad77b1170 [ 32.093115] Oops [#1] [ 32.093251] Modules linked in: [ 32.093626] CPU: 0 PID: 135 Comm: ftracetest Not tainted 6.2.0-rc2-00013-gb0aa5e5df0cb-dirty #16 [ 32.093985] Hardware name: riscv-virtio,qemu (DT) [ 32.094280] epc : ksys_read+0x88/0xd6 [ 32.094855] ra : ksys_read+0xc0/0xd6 [ 32.095016] epc : ffffffff801cda80 ra : ffffffff801cdab8 sp : ff20000000d7bdc0 [ 32.095227] gp : ffffffff80f14000 tp : ff60000080f9cb40 t0 : ffffffff80f13e80 [ 32.095500] t1 : ffffffff8000c29c t2 : ffffffff800dbc54 s0 : ff20000000d7be60 [ 32.095716] s1 : 0000000000000000 a0 : ffffffff805a64ae a1 : ffffffff80a83708 [ 32.095921] a2 : ffffffff80f160a0 a3 : 0000000000000000 a4 : f229b0afdb165300 [ 32.096171] a5 : f229b0afdb165300 a6 : ffffffff80eeebd0 a7 : 00000000000003ff [ 32.096411] s2 : ff6000007ff76800 s3 : fffffffffffffff7 s4 : 00aaaaaad77b1170 [ 32.096638] s5 : ffffffff80f160a0 s6 : ff6000007ff76800 s7 : 0000000000000030 [ 32.096865] s8 : 00ffffffc3d97be0 s9 : 0000000000000007 s10: 00aaaaaad77c9410 [ 32.097092] s11: 0000000000000000 t3 : ffffffff80f13e48 t4 : ffffffff8000c29c [ 32.097317] t5 : ffffffff8000c29c t6 : ffffffff800dbc54 [ 32.097505] status: 0000000200000120 badaddr: 00aaaaaad77b1170 cause: 000000000000000d [ 32.098011] [<ffffffff801cdb72>] ksys_write+0x6c/0xd6 [ 32.098222] [<ffffffff801cdc06>] sys_write+0x2a/0x38 [ 32.098405] [<ffffffff80003c76>] ret_from_syscall+0x0/0x2 Since the rs1 and rd might be the same one, such as 'jalr 1140(ra)', hence it requires obtaining the target address from rs1 followed by updating rd. Fixes: c22b0bcb1dd0 ("riscv: Add kprobes supported") Signed-off-by: Liao Chang <liaochang1@huawei.com> Reviewed-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/20230116064342.2092136-1-liaochang1@huawei.com [Palmer: Pick Guo's cleanup] Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-24riscv: fix jal offsets in patched alternativesJisheng Zhang
Alternatives live in a different section, so offsets used by jal instruction will point to wrong locations after the patch got applied. Similar to arm64, adjust the location to consider that offset. Co-developed-by: Heiko Stuebner <heiko.stuebner@vrull.eu> Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu> Link: https://lore.kernel.org/r/20230113212205.3534622-1-heiko@sntech.de Fixes: 27c653c06505 ("RISC-V: fix auipc-jalr addresses in patched alternatives") Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-20Merge tag 'archtopo-cacheinfo-updates-6.3' of ↵Greg Kroah-Hartman
git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into driver-core-next Sudeep writes: "cacheinfo and arch_topology updates for v6.3 The main change is to build the cache topology information for all the CPUs from the primary CPU. Currently the cacheinfo for secondary CPUs is created during the early boot on the respective CPU itself. Preemption and interrupts are disabled at this stage. On PREEMPT_RT kernels, allocating memory and even parsing the PPTT table for ACPI based systems triggers a: 'BUG: sleeping function called from invalid context' To prevent this bug, the cacheinfo is now allocated from the primary CPU when preemption and interrupts are enabled and before booting secondary CPUs. The cache levels/leaves are computed from DT/ACPI PPTT information only, without relying on any architecture specific mechanism if done so early. The other minor change included here is to handle shared caches at different levels when not all the CPUs on the system have the same cache hierarchy." * tag 'archtopo-cacheinfo-updates-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: cacheinfo: Fix shared_cpu_map to handle shared caches at different levels arch_topology: Build cacheinfo from primary CPU ACPI: PPTT: Update acpi_find_last_cache_level() to acpi_get_cache_info() ACPI: PPTT: Remove acpi_find_cache_levels() cacheinfo: Check 'cache-unified' property to count cache leaves cacheinfo: Return error code in init_of_cache_level() cacheinfo: Use RISC-V's init_cache_level() as generic OF implementation
2023-01-19Merge patch series "Putting some basic order on isa extension lists"Palmer Dabbelt
This cleans up the ISA string handling to more closely match a version of the ISA spec. This is visible in /proc/cpuinfo and the ordering changes may break something in userspace, but these orderings have changed before without issues so with any luck that's still the case. This also adds documentation so userspace has a better idea of what is intended when it comes to compatibility for /proc/cpuinfo, which should help everyone as this will likely keep changing. * b4-shazam-merge: Documentation: riscv: add a section about ISA string ordering in /proc/cpuinfo RISC-V: resort all extensions in consistent orders RISC-V: clarify ISA string ordering rules in cpu.c Link: https://lore.kernel.org/r/20221205144525.2148448-1-conor.dooley@microchip.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-19riscv: fix -Wundef warning for CONFIG_RISCV_BOOT_SPINWAITMasahiro Yamada
Since commit 80b6093b55e3 ("kbuild: add -Wundef to KBUILD_CPPFLAGS for W=1 builds"), building with W=1 detects misuse of #if. $ make W=1 ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu- arch/riscv/kernel/ [snip] AS arch/riscv/kernel/head.o arch/riscv/kernel/head.S:329:5: warning: "CONFIG_RISCV_BOOT_SPINWAIT" is not defined, evaluates to 0 [-Wundef] 329 | #if CONFIG_RISCV_BOOT_SPINWAIT | ^~~~~~~~~~~~~~~~~~~~~~~~~~ CONFIG_RISCV_BOOT_SPINWAIT is a bool option. #ifdef should be used. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Fixes: 2ffc48fc7071 ("RISC-V: Move spinwait booting method to its own config") Link: https://lore.kernel.org/r/20230106161213.2374093-1-masahiroy@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-18mm: remove zap_page_range and create zap_vma_pagesMike Kravetz
zap_page_range was originally designed to unmap pages within an address range that could span multiple vmas. While working on [1], it was discovered that all callers of zap_page_range pass a range entirely within a single vma. In addition, the mmu notification call within zap_page range does not correctly handle ranges that span multiple vmas. When crossing a vma boundary, a new mmu_notifier_range_init/end call pair with the new vma should be made. Instead of fixing zap_page_range, do the following: - Create a new routine zap_vma_pages() that will remove all pages within the passed vma. Most users of zap_page_range pass the entire vma and can use this new routine. - For callers of zap_page_range not passing the entire vma, instead call zap_page_range_single(). - Remove zap_page_range. [1] https://lore.kernel.org/linux-mm/20221114235507.294320-2-mike.kravetz@oracle.com/ Link: https://lkml.kernel.org/r/20230104002732.232573-1-mike.kravetz@oracle.com Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Suggested-by: Peter Xu <peterx@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Peter Xu <peterx@redhat.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> [s390] Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Rik van Riel <riel@surriel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-18arch_topology: Build cacheinfo from primary CPUPierre Gondois
commit 3fcbf1c77d08 ("arch_topology: Fix cache attributes detection in the CPU hotplug path") adds a call to detect_cache_attributes() to populate the cacheinfo before updating the siblings mask. detect_cache_attributes() allocates memory and can take the PPTT mutex (on ACPI platforms). On PREEMPT_RT kernels, on secondary CPUs, this triggers a: 'BUG: sleeping function called from invalid context' [1] as the code is executed with preemption and interrupts disabled. The primary CPU was previously storing the cache information using the now removed (struct cpu_topology).llc_id: commit 5b8dc787ce4a ("arch_topology: Drop LLC identifier stash from the CPU topology") allocate_cache_info() tries to build the cacheinfo from the primary CPU prior secondary CPUs boot, if the DT/ACPI description contains cache information. If allocate_cache_info() fails, then fallback to the current state for the cacheinfo allocation. [1] will be triggered in such case. When unplugging a CPU, the cacheinfo memory cannot be freed. If it was, then the memory would be allocated early by the re-plugged CPU and would trigger [1]. Note that populate_cache_leaves() might be called multiple times due to populate_leaves being moved up. This is required since detect_cache_attributes() might be called with per_cpu_cacheinfo(cpu) being allocated but not populated. [1]: | BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46 | in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/111 | preempt_count: 1, expected: 0 | RCU nest depth: 1, expected: 1 | 3 locks held by swapper/111/0: | #0: (&pcp->lock){+.+.}-{3:3}, at: get_page_from_freelist+0x218/0x12c8 | #1: (rcu_read_lock){....}-{1:3}, at: rt_spin_trylock+0x48/0xf0 | #2: (&zone->lock){+.+.}-{3:3}, at: rmqueue_bulk+0x64/0xa80 | irq event stamp: 0 | hardirqs last enabled at (0): 0x0 | hardirqs last disabled at (0): copy_process+0x5dc/0x1ab8 | softirqs last enabled at (0): copy_process+0x5dc/0x1ab8 | softirqs last disabled at (0): 0x0 | Preemption disabled at: | migrate_enable+0x30/0x130 | CPU: 111 PID: 0 Comm: swapper/111 Tainted: G W 6.0.0-rc4-rt6-[...] | Call trace: | __kmalloc+0xbc/0x1e8 | detect_cache_attributes+0x2d4/0x5f0 | update_siblings_masks+0x30/0x368 | store_cpu_topology+0x78/0xb8 | secondary_start_kernel+0xd0/0x198 | __secondary_switched+0xb0/0xb4 Signed-off-by: Pierre Gondois <pierre.gondois@arm.com> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Palmer Dabbelt <palmer@rivosinc.com> Link: https://lore.kernel.org/r/20230104183033.755668-7-pierre.gondois@arm.com Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2023-01-17RISC-V: resort all extensions in consistent ordersConor Dooley
Ordering between each and every list of extensions is wildly inconsistent. Per discussion on the lists pick the following policy: - The array defining order in /proc/cpuinfo follows a narrow interpretation of the ISA specifications, described in a comment immediately presiding it. - All other lists of extensions are sorted alphabetically. This will hopefully allow for easier review & future additions, and reduce conflicts between patchsets as the number of extensions grows. Link: https://lore.kernel.org/all/20221129144742.2935581-2-conor.dooley@microchip.com/ Suggested-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Heiko Stuebner <heiko.stuebner@vrull.eu> Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20221205144525.2148448-3-conor.dooley@microchip.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-17RISC-V: clarify ISA string ordering rules in cpu.cConor Dooley
While the current list of rules may have been accurate when created it now lacks some clarity in the face of isa-manual updates. Instead of trying to continuously align this rule-set with the one in the specifications, change the role of this comment. This particular comment is important, as the array it "decorates" defines the order in which the ISA string appears to userspace in /proc/cpuinfo. Re-jig and strengthen the wording to provide contributors with a set order in which to add entries & note why this particular struct needs more attention than others. While in the area, add some whitespace and tweak some wording for readability's sake. Suggested-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20221205144525.2148448-2-conor.dooley@microchip.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-17cacheinfo: Use RISC-V's init_cache_level() as generic OF implementationPierre Gondois
RISC-V's implementation of init_of_cache_level() is following the Devicetree Specification v0.3 regarding caches, cf.: - s3.7.3 'Internal (L1) Cache Properties' - s3.8 'Multi-level and Shared Cache Nodes' Allow reusing the implementation by moving it. Also make 'levels', 'leaves' and 'level' unsigned int. Signed-off-by: Pierre Gondois <pierre.gondois@arm.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Acked-by: Palmer Dabbelt <palmer@rivosinc.com> Link: https://lore.kernel.org/r/20230104183033.755668-2-pierre.gondois@arm.com Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2023-01-13arch/idle: Change arch_cpu_idle() behavior: always exit with IRQs disabledPeter Zijlstra
Current arch_cpu_idle() is called with IRQs disabled, but will return with IRQs enabled. However, the very first thing the generic code does after calling arch_cpu_idle() is raw_local_irq_disable(). This means that architectures that can idle with IRQs disabled end up doing a pointless 'enable-disable' dance. Therefore, push this IRQ disabling into the idle function, meaning that those architectures can avoid the pointless IRQ state flipping. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Tony Lindgren <tony@atomide.com> Tested-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Acked-by: Mark Rutland <mark.rutland@arm.com> [arm64] Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Guo Ren <guoren@kernel.org> Acked-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/20230112195540.618076436@infradead.org
2023-01-13objtool/idle: Validate __cpuidle code as noinstrPeter Zijlstra
Idle code is very like entry code in that RCU isn't available. As such, add a little validation. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Tony Lindgren <tony@atomide.com> Tested-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/20230112195540.373461409@infradead.org
2023-01-05riscv, kprobes: Stricter c.jr/c.jalr decodingBjörn Töpel
In the compressed instruction extension, c.jr, c.jalr, c.mv, and c.add is encoded the following way (each instruction is 16b): ---+-+-----------+-----------+-- 100 0 rs1[4:0]!=0 00000 10 : c.jr 100 1 rs1[4:0]!=0 00000 10 : c.jalr 100 0 rd[4:0]!=0 rs2[4:0]!=0 10 : c.mv 100 1 rd[4:0]!=0 rs2[4:0]!=0 10 : c.add The following logic is used to decode c.jr and c.jalr: insn & 0xf007 == 0x8002 => instruction is an c.jr insn & 0xf007 == 0x9002 => instruction is an c.jalr When 0xf007 is used to mask the instruction, c.mv can be incorrectly decoded as c.jr, and c.add as c.jalr. Correct the decoding by changing the mask from 0xf007 to 0xf07f. Fixes: c22b0bcb1dd0 ("riscv: Add kprobes supported") Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/20230102160748.1307289-1-bjorn@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-29RISC-V: fix auipc-jalr addresses in patched alternativesHeiko Stuebner
Alternatives live in a different section, so addresses used by call functions will point to wrong locations after the patch got applied. Similar to arm64, adjust the location to consider that offset. Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu> Link: https://lore.kernel.org/r/20221223221332.4127602-13-heiko@sntech.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-29RISC-V: kprobes: use central defined funct3 constantsHeiko Stuebner
Don't redefine values that are already available in the central header asm/insn.h . Use the values from there instead. Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu> Link: https://lore.kernel.org/r/20221223221332.4127602-9-heiko@sntech.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-29RISC-V: rename parse_asm.h to insn.hHeiko Stuebner
The current parse_asm header should become a more centralized place for everything concerning parsing and constructing instructions. We already have a header insn-def.h similar to aarch64, so rename parse_asm.h to insn.h (again similar to aarch64) to show that it's meant for more than simple instruction parsing. Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu> Link: https://lore.kernel.org/r/20221223221332.4127602-8-heiko@sntech.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-29RISC-V: Move riscv_insn_is_* macros into a common headerHeiko Stuebner
Right now the riscv kernel has (at least) two independent sets of functions to check if an encoded instruction is of a specific type. One in kgdb and one kprobes simulate-insn code. More parts of the kernel will probably need this in the future, so instead of allowing this duplication to go on further, move macros that do the function declaration in a common header, similar to at least aarch64. Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu> Link: https://lore.kernel.org/r/20221223221332.4127602-7-heiko@sntech.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-29RISC-V: add prefix to all constants/macros in parse_asm.hHeiko Stuebner
Some of the constants and macros already have suitable RV_, RVG_ or RVC_ prefixes. Extend this to the rest of the file as well, as we want to use these things in a broader scope soon. Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu> Link: https://lore.kernel.org/r/20221223221332.4127602-3-heiko@sntech.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-21Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull RISC-V kvm updates from Paolo Bonzini: - Allow unloading KVM module - Allow KVM user-space to set mvendorid, marchid, and mimpid - Several fixes and cleanups * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: RISC-V: KVM: Add ONE_REG interface for mvendorid, marchid, and mimpid RISC-V: KVM: Save mvendorid, marchid, and mimpid when creating VCPU RISC-V: Export sbi_get_mvendorid() and friends RISC-V: KVM: Move sbi related struct and functions to kvm_vcpu_sbi.h RISC-V: KVM: Use switch-case in kvm_riscv_vcpu_set/get_reg() RISC-V: KVM: Remove redundant includes of asm/csr.h RISC-V: KVM: Remove redundant includes of asm/kvm_vcpu_timer.h RISC-V: KVM: Fix reg_val check in kvm_riscv_vcpu_set_reg_config() RISC-V: KVM: Simplify kvm_arch_prepare_memory_region() RISC-V: KVM: Exit run-loop immediately if xfer_to_guest fails RISC-V: KVM: use vma_lookup() instead of find_vma_intersection() RISC-V: KVM: Add exit logic to main.c
2022-12-14Merge tag 'riscv-for-linus-6.2-mw1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: - Support for the T-Head PMU via the perf subsystem - ftrace support for rv32 - Support for non-volatile memory devices - Various fixes and cleanups * tag 'riscv-for-linus-6.2-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (52 commits) Documentation: RISC-V: patch-acceptance: s/implementor/implementer Documentation: RISC-V: Mention the UEFI Standards Documentation: RISC-V: Allow patches for non-standard behavior Documentation: RISC-V: Fix a typo in patch-acceptance riscv: Fixup compile error with !MMU riscv: Fix P4D_SHIFT definition for 3-level page table mode riscv: Apply a static assert to riscv_isa_ext_id RISC-V: Add some comments about the shadow and overflow stacks RISC-V: Align the shadow stack RISC-V: Ensure Zicbom has a valid block size RISC-V: Introduce riscv_isa_extension_check RISC-V: Improve use of isa2hwcap[] riscv: Don't duplicate _ALTERNATIVE_CFG* macros riscv: alternatives: Drop the underscores from the assembly macro names riscv: alternatives: Don't name unused macro parameters riscv: Don't duplicate __ALTERNATIVE_CFG in __ALTERNATIVE_CFG_2 riscv: mm: call best_map_size many times during linear-mapping riscv: Move cast inside kernel_mapping_[pv]a_to_[vp]a riscv: Fix crash during early errata patching riscv: boot: add zstd support ...
2022-12-13Merge tag 'efi-next-for-v6.2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI updates from Ard Biesheuvel: "Another fairly sizable pull request, by EFI subsystem standards. Most of the work was done by me, some of it in collaboration with the distro and bootloader folks (GRUB, systemd-boot), where the main focus has been on removing pointless per-arch differences in the way EFI boots a Linux kernel. - Refactor the zboot code so that it incorporates all the EFI stub logic, rather than calling the decompressed kernel as a EFI app. - Add support for initrd= command line option to x86 mixed mode. - Allow initrd= to be used with arbitrary EFI accessible file systems instead of just the one the kernel itself was loaded from. - Move some x86-only handling and manipulation of the EFI memory map into arch/x86, as it is not used anywhere else. - More flexible handling of any random seeds provided by the boot environment (i.e., systemd-boot) so that it becomes available much earlier during the boot. - Allow improved arch-agnostic EFI support in loaders, by setting a uniform baseline of supported features, and adding a generic magic number to the DOS/PE header. This should allow loaders such as GRUB or systemd-boot to reduce the amount of arch-specific handling substantially. - (arm64) Run EFI runtime services from a dedicated stack, and use it to recover from synchronous exceptions that might occur in the firmware code. - (arm64) Ensure that we don't allocate memory outside of the 48-bit addressable physical range. - Make EFI pstore record size configurable - Add support for decoding CXL specific CPER records" * tag 'efi-next-for-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: (43 commits) arm64: efi: Recover from synchronous exceptions occurring in firmware arm64: efi: Execute runtime services from a dedicated stack arm64: efi: Limit allocations to 48-bit addressable physical region efi: Put Linux specific magic number in the DOS header efi: libstub: Always enable initrd command line loader and bump version efi: stub: use random seed from EFI variable efi: vars: prohibit reading random seed variables efi: random: combine bootloader provided RNG seed with RNG protocol output efi/cper, cxl: Decode CXL Error Log efi/cper, cxl: Decode CXL Protocol Error Section efi: libstub: fix efi_load_initrd_dev_path() kernel-doc comment efi: x86: Move EFI runtime map sysfs code to arch/x86 efi: runtime-maps: Clarify purpose and enable by default for kexec efi: pstore: Add module parameter for setting the record size efi: xen: Set EFI_PARAVIRT for Xen dom0 boot on all architectures efi: memmap: Move manipulation routines into x86 arch tree efi: memmap: Move EFI fake memmap support into x86 arch tree efi: libstub: Undeprecate the command line initrd loader efi: libstub: Add mixed mode support to command line initrd loader efi: libstub: Permit mixed mode return types other than efi_status_t ...
2022-12-12Merge tag 'timers-core-2022-12-10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner: "Updates for timers, timekeeping and drivers: Core: - The timer_shutdown[_sync]() infrastructure: Tearing down timers can be tedious when there are circular dependencies to other things which need to be torn down. A prime example is timer and workqueue where the timer schedules work and the work arms the timer. What needs to prevented is that pending work which is drained via destroy_workqueue() does not rearm the previously shutdown timer. Nothing in that shutdown sequence relies on the timer being functional. The conclusion was that the semantics of timer_shutdown_sync() should be: - timer is not enqueued - timer callback is not running - timer cannot be rearmed Preventing the rearming of shutdown timers is done by discarding rearm attempts silently. A warning for the case that a rearm attempt of a shutdown timer is detected would not be really helpful because it's entirely unclear how it should be acted upon. The only way to address such a case is to add 'if (in_shutdown)' conditionals all over the place. This is error prone and in most cases of teardown not required all. - The real fix for the bluetooth HCI teardown based on timer_shutdown_sync(). A larger scale conversion to timer_shutdown_sync() is work in progress. - Consolidation of VDSO time namespace helper functions - Small fixes for timer and timerqueue Drivers: - Prevent integer overflow on the XGene-1 TVAL register which causes an never ending interrupt storm. - The usual set of new device tree bindings - Small fixes and improvements all over the place" * tag 'timers-core-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits) dt-bindings: timer: renesas,cmt: Add r8a779g0 CMT support dt-bindings: timer: renesas,tmu: Add r8a779g0 support clocksource/drivers/arm_arch_timer: Use kstrtobool() instead of strtobool() clocksource/drivers/timer-ti-dm: Fix missing clk_disable_unprepare in dmtimer_systimer_init_clock() clocksource/drivers/timer-ti-dm: Clear settings on probe and free clocksource/drivers/timer-ti-dm: Make timer_get_irq static clocksource/drivers/timer-ti-dm: Fix warning for omap_timer_match clocksource/drivers/arm_arch_timer: Fix XGene-1 TVAL register math error clocksource/drivers/timer-npcm7xx: Enable timer 1 clock before use dt-bindings: timer: nuvoton,npcm7xx-timer: Allow specifying all clocks dt-bindings: timer: rockchip: Add rockchip,rk3128-timer clockevents: Repair kernel-doc for clockevent_delta2ns() clocksource/drivers/ingenic-ost: Define pm functions properly in platform_driver struct clocksource/drivers/sh_cmt: Access registers according to spec vdso/timens: Refactor copy-pasted find_timens_vvar_page() helper into one copy Bluetooth: hci_qca: Fix the teardown problem for real timers: Update the documentation to reflect on the new timer_shutdown() API timers: Provide timer_shutdown[_sync]() timers: Add shutdown mechanism to the internal functions timers: Split [try_to_]del_timer[_sync]() to prepare for shutdown mode ...
2022-12-12Merge patch series "RISC-V: Align the shadow stack"Palmer Dabbelt
Palmer Dabbelt <palmer@rivosinc.com> says: This contains a pair of cleanups that depend on a fix that has already landed upstream. * b4-shazam-merge: RISC-V: Add some comments about the shadow and overflow stacks RISC-V: Align the shadow stack riscv: fix race when vmap stack overflow Link: https://lore.kernel.org/r/20221130023515.20217-1-palmer@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-12RISC-V: Add some comments about the shadow and overflow stacksPalmer Dabbelt
It took me a while to page all this back in when trying to review the recent spin_shadow_stack, so I figured I'd just write up some comments. Reviewed-by: Guo Ren <guoren@kernel.org> Reviewed-by: Jisheng Zhang <jszhang@kernel.org> Link: https://lore.kernel.org/r/20221130023515.20217-2-palmer@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-12RISC-V: Align the shadow stackPalmer Dabbelt
The standard RISC-V ABIs all require 16-byte stack alignment. We're only calling that one function on the shadow stack so I doubt it'd result in a real issue, but might as well keep this lined up. Fixes: 31da94c25aea ("riscv: add VMAP_STACK overflow detection") Reviewed-by: Jisheng Zhang <jszhang@kernel.org> Link: https://lore.kernel.org/r/20221130023515.20217-1-palmer@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-09Merge patch series "RISC-V: Ensure Zicbom has a valid block size"Palmer Dabbelt
Andrew Jones <ajones@ventanamicro.com> says: When a DT puts zicbom in the isa string, but does not provide a block size, ALT_CMO_OP() will attempt to do cache operations on address zero since the start address will be ANDed with zero. We can't simply BUG() in riscv_init_cbom_blocksize() when we fail to find a block size because the failure will happen before logging works, leaving users to scratch their heads as to why the boot hung. Instead, ensure Zicbom is disabled and output an error which will hopefully alert people that the DT needs to be fixed. While at it, add a check that the block size is a power-of-2 too. * b4-shazam-merge: RISC-V: Ensure Zicbom has a valid block size RISC-V: Introduce riscv_isa_extension_check RISC-V: Improve use of isa2hwcap[] Link: https://lore.kernel.org/r/20221129143447.49714-1-ajones@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-09RISC-V: Ensure Zicbom has a valid block sizeAndrew Jones
When a DT puts zicbom in the isa string, but does not provide a block size, ALT_CMO_OP() will attempt to do cache operations on address zero since the start address will be ANDed with zero. We can't simply BUG() in riscv_init_cbom_blocksize() when we fail to find a block size because the failure will happen before logging works, leaving users to scratch their heads as to why the boot hung. Instead, ensure Zicbom is disabled and output an error which will hopefully alert people that the DT needs to be fixed. While at it, add a check that the block size is a power-of-2 too. Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20221129143447.49714-4-ajones@ventanamicro.com [Palmer: base on 5c20a3a9df19 ("RISC-V: Fix compilation without RISCV_ISA_ZICBOM"] Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-09RISC-V: Introduce riscv_isa_extension_checkAndrew Jones
Currently any isa extension found in the isa string is set in the isa bitmap. An isa extension set in the bitmap indicates that the extension is present and may be used (a.k.a is enabled). However, when an extension cannot be used due to missing dependencies or errata it should not be added to the bitmap. Introduce a function where additional checks may be placed in order to determine if an extension should be enabled or not. Note, the checks may simply indicate an issue with the DT, but, since extensions may be used in early boot, it's not always possible to simply produce an error at the point the issue is determined. It's best to keep the extension disabled and produce an error. No functional change intended, as the function is only introduced and always returns true. A later patch will provide checks for an isa extension. Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20221129143447.49714-3-ajones@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-09RISC-V: Improve use of isa2hwcap[]Andrew Jones
Improve isa2hwcap[] by removing it from static storage, as riscv_fill_hwcap() is only called once, and by reducing its size from 256 bytes to 26. The latter improvement is possible because isa2hwcap[] will never be indexed with capital letters and we can precompute the offsets from 'a'. No functional change intended. Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20221129143447.49714-2-ajones@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-08Merge patch "RISC-V: Fix unannoted hardirqs-on in return to userspace slow-path"Palmer Dabbelt
I'm merging this in as a single patch to make it easier to handle the backports. * b4-shazam-merge: RISC-V: Fix unannoted hardirqs-on in return to userspace slow-path Link: https://lore.kernel.org/r/20221111223108.1976562-1-abrestic@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-08RISC-V: Fix unannoted hardirqs-on in return to userspace slow-pathAndrew Bresticker
The return to userspace path in entry.S may enable interrupts without the corresponding lockdep annotation, producing a splat[0] when DEBUG_LOCKDEP is enabled. Simply calling __trace_hardirqs_on() here gets a bit messy due to the use of RA to point back to ret_from_exception, so just move the whole slow-path loop into C. It's more readable and it lets us use local_irq_{enable,disable}(), avoiding the need for manual annotations altogether. [0]: ------------[ cut here ]------------ DEBUG_LOCKS_WARN_ON(!lockdep_hardirqs_enabled()) WARNING: CPU: 2 PID: 1 at kernel/locking/lockdep.c:5512 check_flags+0x10a/0x1e0 Modules linked in: CPU: 2 PID: 1 Comm: init Not tainted 6.1.0-rc4-00160-gb56b6e2b4f31 #53 Hardware name: riscv-virtio,qemu (DT) epc : check_flags+0x10a/0x1e0 ra : check_flags+0x10a/0x1e0 <snip> status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003 [<ffffffff808edb90>] lock_is_held_type+0x78/0x14e [<ffffffff8003dae2>] __might_resched+0x26/0x22c [<ffffffff8003dd24>] __might_sleep+0x3c/0x66 [<ffffffff80022c60>] get_signal+0x9e/0xa70 [<ffffffff800054a2>] do_notify_resume+0x6e/0x422 [<ffffffff80003c68>] ret_from_exception+0x0/0x10 irq event stamp: 44512 hardirqs last enabled at (44511): [<ffffffff808f901c>] _raw_spin_unlock_irqrestore+0x54/0x62 hardirqs last disabled at (44512): [<ffffffff80008200>] __trace_hardirqs_off+0xc/0x14 softirqs last enabled at (44472): [<ffffffff808f9fbe>] __do_softirq+0x3de/0x51e softirqs last disabled at (44467): [<ffffffff80017760>] irq_exit+0xd6/0x104 ---[ end trace 0000000000000000 ]--- possible reason: unannotated irqs-on. Signed-off-by: Andrew Bresticker <abrestic@rivosinc.com> Fixes: 3c4697982982 ("riscv: Enable LOCKDEP_SUPPORT & fixup TRACE_IRQFLAGS_SUPPORT") Link: https://lore.kernel.org/r/20221111223108.1976562-1-abrestic@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-07Merge tag 'v6.1-rc8' into efi/nextArd Biesheuvel
Linux 6.1-rc8
2022-12-07RISC-V: Export sbi_get_mvendorid() and friendsAnup Patel
The sbi_get_mvendorid(), sbi_get_marchid(), and sbi_get_mimpid() can be used by KVM module so let us export these functions. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Acked-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Anup Patel <anup@brainfault.org>
2022-12-05riscv: stacktrace: Make walk_stackframe cross pt_regs frameGuo Ren
The current walk_stackframe with FRAME_POINTER would stop unwinding at ret_from_exception: BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:1518 in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 1, name: init CPU: 0 PID: 1 Comm: init Not tainted 5.10.113-00021-g15c15974895c-dirty #192 Call Trace: [<ffffffe0002038c8>] walk_stackframe+0x0/0xee [<ffffffe000aecf48>] show_stack+0x32/0x4a [<ffffffe000af1618>] dump_stack_lvl+0x72/0x8e [<ffffffe000af1648>] dump_stack+0x14/0x1c [<ffffffe000239ad2>] ___might_sleep+0x12e/0x138 [<ffffffe000239aec>] __might_sleep+0x10/0x18 [<ffffffe000afe3fe>] down_read+0x22/0xa4 [<ffffffe000207588>] do_page_fault+0xb0/0x2fe [<ffffffe000201b80>] ret_from_exception+0x0/0xc The optimization would help walk_stackframe cross the pt_regs frame and get more backtrace of debug info: BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:1518 in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 1, name: init CPU: 0 PID: 1 Comm: init Not tainted 5.10.113-00021-g15c15974895c-dirty #192 Call Trace: [<ffffffe0002038c8>] walk_stackframe+0x0/0xee [<ffffffe000aecf48>] show_stack+0x32/0x4a [<ffffffe000af1618>] dump_stack_lvl+0x72/0x8e [<ffffffe000af1648>] dump_stack+0x14/0x1c [<ffffffe000239ad2>] ___might_sleep+0x12e/0x138 [<ffffffe000239aec>] __might_sleep+0x10/0x18 [<ffffffe000afe3fe>] down_read+0x22/0xa4 [<ffffffe000207588>] do_page_fault+0xb0/0x2fe [<ffffffe000201b80>] ret_from_exception+0x0/0xc [<ffffffe000613c06>] riscv_intc_irq+0x1a/0x72 [<ffffffe000201b80>] ret_from_exception+0x0/0xc [<ffffffe00033f44a>] vma_link+0x54/0x160 [<ffffffe000341d7a>] mmap_region+0x2cc/0x4d0 [<ffffffe000342256>] do_mmap+0x2d8/0x3ac [<ffffffe000326318>] vm_mmap_pgoff+0x70/0xb8 [<ffffffe00032638a>] vm_mmap+0x2a/0x36 [<ffffffe0003cfdde>] elf_map+0x72/0x84 [<ffffffe0003d05f8>] load_elf_binary+0x69a/0xec8 [<ffffffe000376240>] bprm_execve+0x246/0x53a [<ffffffe00037786c>] kernel_execve+0xe8/0x124 [<ffffffe000aecdf2>] run_init_process+0xfa/0x10c [<ffffffe000aece16>] try_to_run_init_process+0x12/0x3c [<ffffffe000afa920>] kernel_init+0xb4/0xf8 [<ffffffe000201b80>] ret_from_exception+0x0/0xc Here is the error injection test code for the above output: drivers/irqchip/irq-riscv-intc.c: static asmlinkage void riscv_intc_irq(struct pt_regs *regs) { unsigned long cause = regs->cause & ~CAUSE_IRQ_FLAG; + u32 tmp; __get_user(tmp, (u32 *)0); Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/20221109064937.3643993-3-guoren@kernel.org [Palmer: use SYM_CODE_*] Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-05riscv: stacktrace: Fixup ftrace_graph_ret_addr retp argumentGuo Ren
The 'retp' is a pointer to the return address on the stack, so we must pass the current return address pointer as the 'retp' argument to ftrace_push_return_trace(). Not parent function's return address on the stack. Fixes: b785ec129bd9 ("riscv/ftrace: Add HAVE_FUNCTION_GRAPH_RET_ADDR_PTR support") Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/20221109064937.3643993-2-guoren@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-05RISC-V: kexec: Fix memory leak of elf header bufferLi Huafei
This is reported by kmemleak detector: unreferenced object 0xff2000000403d000 (size 4096): comm "kexec", pid 146, jiffies 4294900633 (age 64.792s) hex dump (first 32 bytes): 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 .ELF............ 04 00 f3 00 01 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000566ca97c>] kmemleak_vmalloc+0x3c/0xbe [<00000000979283d8>] __vmalloc_node_range+0x3ac/0x560 [<00000000b4b3712a>] __vmalloc_node+0x56/0x62 [<00000000854f75e2>] vzalloc+0x2c/0x34 [<00000000e9a00db9>] crash_prepare_elf64_headers+0x80/0x30c [<0000000067e8bf48>] elf_kexec_load+0x3e8/0x4ec [<0000000036548e09>] kexec_image_load_default+0x40/0x4c [<0000000079fbe1b4>] sys_kexec_file_load+0x1c4/0x322 [<0000000040c62c03>] ret_from_syscall+0x0/0x2 In elf_kexec_load(), a buffer is allocated via vzalloc() to store elf headers. While it's not freed back to system when kdump kernel is reloaded or unloaded, or when image->elf_header is successfully set and then fails to load kdump kernel for some reason. Fix it by freeing the buffer in arch_kimage_file_post_load_cleanup(). Fixes: 8acea455fafa ("RISC-V: Support for kexec_file on panic") Signed-off-by: Li Huafei <lihuafei1@huawei.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20221104095658.141222-2-lihuafei1@huawei.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-05RISC-V: kexec: Fix memory leak of fdt bufferLi Huafei
This is reported by kmemleak detector: unreferenced object 0xff60000082864000 (size 9588): comm "kexec", pid 146, jiffies 4294900634 (age 64.788s) hex dump (first 32 bytes): d0 0d fe ed 00 00 12 ed 00 00 00 48 00 00 11 40 ...........H...@ 00 00 00 28 00 00 00 11 00 00 00 02 00 00 00 00 ...(............ backtrace: [<00000000f95b17c4>] kmemleak_alloc+0x34/0x3e [<00000000b9ec8e3e>] kmalloc_order+0x9c/0xc4 [<00000000a95cf02e>] kmalloc_order_trace+0x34/0xb6 [<00000000f01e68b4>] __kmalloc+0x5c2/0x62a [<000000002bd497b2>] kvmalloc_node+0x66/0xd6 [<00000000906542fa>] of_kexec_alloc_and_setup_fdt+0xa6/0x6ea [<00000000e1166bde>] elf_kexec_load+0x206/0x4ec [<0000000036548e09>] kexec_image_load_default+0x40/0x4c [<0000000079fbe1b4>] sys_kexec_file_load+0x1c4/0x322 [<0000000040c62c03>] ret_from_syscall+0x0/0x2 In elf_kexec_load(), a buffer is allocated via kvmalloc() to store fdt. While it's not freed back to system when kexec kernel is reloaded or unloaded. Then memory leak is caused. Fix it by introducing riscv specific function arch_kimage_file_post_load_cleanup(), and freeing the buffer there. Fixes: 6261586e0c91 ("RISC-V: Add kexec_file support") Signed-off-by: Li Huafei <lihuafei1@huawei.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Liao Chang <liaochang1@huawei.com> Link: https://lore.kernel.org/r/20221104095658.141222-1-lihuafei1@huawei.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-02Merge patch series "Support VMCOREINFO export for RISCV64"Palmer Dabbelt
Add arch_crash_save_vmcoreinfo(), which exports VM layout(MODULES, VMALLOC, VMEMMAP ranges and KERNEL_LINK_ADDR), va bits and ram base for vmcore. * b4-shazam-merge: Documentation: kdump: describe VMCOREINFO export for RISCV64 RISC-V: Add arch_crash_save_vmcoreinfo support Link: https://lore.kernel.org/r/20221026144208.373504-1-xianting.tian@linux.alibaba.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-02RISC-V: Add arch_crash_save_vmcoreinfo supportXianting Tian
Add arch_crash_save_vmcoreinfo(), which exports VM layout(MODULES, VMALLOC, VMEMMAP ranges and KERNEL_LINK_ADDR), va bits and ram base for vmcore. Default pagetable levels and PAGE_OFFSET aren't same for different kernel version as below. For pagetable levels, it sets sv57 by default and falls back to setting sv48 at boot time if sv57 is not supported by the hardware. For ram base, the default value is 0x80200000 for qemu riscv64 env and, for example, is 0x200000 on the XuanTie 910 CPU. * Linux Kernel 5.18 ~ * PGTABLE_LEVELS = 5 * PAGE_OFFSET = 0xff60000000000000 * Linux Kernel 5.17 ~ * PGTABLE_LEVELS = 4 * PAGE_OFFSET = 0xffffaf8000000000 * Linux Kernel 4.19 ~ * PGTABLE_LEVELS = 3 * PAGE_OFFSET = 0xffffffe000000000 Since these configurations change from time to time and version to version, it is preferable to export them via vmcoreinfo than to change the crash's code frequently, it can simplify the development of crash tool. Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com> Tested-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Guo Ren <guoren@kernel.org> Acked-by: Baoquan He <bhe@redhat.com> Link: https://lore.kernel.org/r/20221026144208.373504-2-xianting.tian@linux.alibaba.com [Palmer: wrap commit text] Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-02riscv: add riscv rethook implementationBinglei Wang
Implement the kretprobes on riscv arch by using rethook machenism which abstracts general kretprobe info into a struct rethook_node to be embedded in the struct kretprobe_instance. Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Binglei Wang <l3b2w1@gmail.com> Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20221025151831.1097417-1-conor@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-02Merge patch series "RISC-V: Dynamic ftrace support for RV32I"Palmer Dabbelt
Jamie Iles <jamie@jamieiles.com> says: This series enables dynamic ftrace support for RV32I bringing it to parity with RV64I. Most of the work is already there, this is largely just assembly fixes to handle register sizes, correct handling of the psABI calling convention and Kconfig change. Validated with all ftrace boot time self test with qemu for RV32I and RV64I in addition to real tracing on an RV32I FPGA design. * b4-shazam-merge: RISC-V: enable dynamic ftrace for RV32I RISC-V: preserve a1 in mcount RISC-V: reduce mcount save space on RV32 RISC-V: use REG_S/REG_L for mcount Link: https://lore.kernel.org/r/20221115200832.706370-1-jamie@jamieiles.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-02RISC-V: preserve a1 in mcountJamie Iles
The RISC-V ELF psABI states that both a0 and a1 are used for return values so we should preserve them both in return_to_handler. This is especially important for RV32 for functions returning a 64-bit quantity otherwise the return value can be corrupted and undefined behaviour results. Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Jamie Iles <jamie@jamieiles.com> Link: https://lore.kernel.org/r/20221115200832.706370-4-jamie@jamieiles.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-02RISC-V: reduce mcount save space on RV32Jamie Iles
For RV32 we can reduce the size of the ABI save+restore state by using SZREG so that register stores are packed rather than on an 8 byte boundary. Signed-off-by: Jamie Iles <jamie@jamieiles.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20221115200832.706370-3-jamie@jamieiles.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-02RISC-V: use REG_S/REG_L for mcountJamie Iles
In preparation for rv32i ftrace support, convert mcount routines to use native sized loads/stores. Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Jamie Iles <jamie@jamieiles.com> Link: https://lore.kernel.org/r/20221115200832.706370-2-jamie@jamieiles.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-01RISC-V: Fix a race condition during kernel stack overflowPalmer Dabbelt
This fixes a concrete bug but is also the basis for some cleanup work, so I'm merging it based on the offending commit in order to minimize future conflicts. * commit '7e1864332fbc1b993659eab7974da9fe8bf8c128': riscv: fix race when vmap stack overflow
2022-12-01vdso/timens: Refactor copy-pasted find_timens_vvar_page() helper into one copyJann Horn
find_timens_vvar_page() is not architecture-specific, as can be seen from how all five per-architecture versions of it are the same. (arm64, powerpc and riscv are exactly the same; x86 and s390 have two characters difference inside a comment, less blank lines, and mark the !CONFIG_TIME_NS version as inline.) Refactor the five copies into a central copy in kernel/time/namespace.c. Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20221130115320.2918447-1-jannh@google.com
2022-11-29Merge patch series "riscv: kexec: Fxiup crash_save percpu and ↵Palmer Dabbelt
machine_kexec_mask_interrupts" guoren@kernel.org <guoren@kernel.org> says: From: Guo Ren <guoren@linux.alibaba.com> Current riscv kexec can't crash_save percpu states and disable interrupts properly. The patch series fix them, make kexec work correct. * b4-shazam-merge: riscv: kexec: Fixup crash_smp_send_stop without multi cores riscv: kexec: Fixup irq controller broken in kexec crash path Link: https://lore.kernel.org/r/20221020141603.2856206-1-guoren@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-11-29riscv: kexec: Fixup crash_smp_send_stop without multi coresGuo Ren
Current crash_smp_send_stop is the same as the generic one in kernel/panic and misses crash_save_cpu in percpu. This patch is inspired by 78fd584cdec0 ("arm64: kdump: implement machine_crash_shutdown()") and adds the same mechanism for riscv. Before this patch, test result: crash> help -r CPU 0: [OFFLINE] CPU 1: epc : ffffffff80009ff0 ra : ffffffff800b789a sp : ff2000001098bb40 gp : ffffffff815fca60 tp : ff60000004680000 t0 : 6666666666663c5b t1 : 0000000000000000 t2 : 666666666666663c s0 : ff2000001098bc90 s1 : ffffffff81600798 a0 : ff2000001098bb48 a1 : 0000000000000000 a2 : 0000000000000000 a3 : 0000000000000001 a4 : 0000000000000000 a5 : ff60000004690800 a6 : 0000000000000000 a7 : 0000000000000000 s2 : ff2000001098bb48 s3 : ffffffff81093ec8 s4 : ffffffff816004ac s5 : 0000000000000000 s6 : 0000000000000007 s7 : ffffffff80e7f720 s8 : 00fffffffffff3f0 s9 : 0000000000000007 s10: 00aaaaaaaab98700 s11: 0000000000000001 t3 : ffffffff819a8097 t4 : ffffffff819a8097 t5 : ffffffff819a8098 t6 : ff2000001098b9a8 CPU 2: [OFFLINE] CPU 3: [OFFLINE] After this patch, test result: crash> help -r CPU 0: epc : ffffffff80003f34 ra : ffffffff808caa7c sp : ffffffff81403eb0 gp : ffffffff815fcb48 tp : ffffffff81413400 t0 : 0000000000000000 t1 : 0000000000000000 t2 : 0000000000000000 s0 : ffffffff81403ec0 s1 : 0000000000000000 a0 : 0000000000000000 a1 : 0000000000000000 a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000 a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000000000000 s2 : ffffffff816001c8 s3 : ffffffff81600370 s4 : ffffffff80c32e18 s5 : ffffffff819d3018 s6 : ffffffff810e2110 s7 : 0000000000000000 s8 : 0000000000000000 s9 : 0000000080039eac s10: 0000000000000000 s11: 0000000000000000 t3 : 0000000000000000 t4 : 0000000000000000 t5 : 0000000000000000 t6 : 0000000000000000 CPU 1: epc : ffffffff80003f34 ra : ffffffff808caa7c sp : ff2000000068bf30 gp : ffffffff815fcb48 tp : ff6000000240d400 t0 : 0000000000000000 t1 : 0000000000000000 t2 : 0000000000000000 s0 : ff2000000068bf40 s1 : 0000000000000001 a0 : 0000000000000000 a1 : 0000000000000000 a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000 a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000000000000 s2 : ffffffff816001c8 s3 : ffffffff81600370 s4 : ffffffff80c32e18 s5 : ffffffff819d3018 s6 : ffffffff810e2110 s7 : 0000000000000000 s8 : 0000000000000000 s9 : 0000000080039ea8 s10: 0000000000000000 s11: 0000000000000000 t3 : 0000000000000000 t4 : 0000000000000000 t5 : 0000000000000000 t6 : 0000000000000000 CPU 2: epc : ffffffff80003f34 ra : ffffffff808caa7c sp : ff20000000693f30 gp : ffffffff815fcb48 tp : ff6000000240e900 t0 : 0000000000000000 t1 : 0000000000000000 t2 : 0000000000000000 s0 : ff20000000693f40 s1 : 0000000000000002 a0 : 0000000000000000 a1 : 0000000000000000 a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000 a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000000000000 s2 : ffffffff816001c8 s3 : ffffffff81600370 s4 : ffffffff80c32e18 s5 : ffffffff819d3018 s6 : ffffffff810e2110 s7 : 0000000000000000 s8 : 0000000000000000 s9 : 0000000080039eb0 s10: 0000000000000000 s11: 0000000000000000 t3 : 0000000000000000 t4 : 0000000000000000 t5 : 0000000000000000 t6 : 0000000000000000 CPU 3: epc : ffffffff8000a1e4 ra : ffffffff800b7bba sp : ff200000109bbb40 gp : ffffffff815fcb48 tp : ff6000000373aa00 t0 : 6666666666663c5b t1 : 0000000000000000 t2 : 666666666666663c s0 : ff200000109bbc90 s1 : ffffffff816007a0 a0 : ff200000109bbb48 a1 : 0000000000000000 a2 : 0000000000000000 a3 : 0000000000000001 a4 : 0000000000000000 a5 : ff60000002c61c00 a6 : 0000000000000000 a7 : 0000000000000000 s2 : ff200000109bbb48 s3 : ffffffff810941a8 s4 : ffffffff816004b4 s5 : 0000000000000000 s6 : 0000000000000007 s7 : ffffffff80e7f7a0 s8 : 00fffffffffff3f0 s9 : 0000000000000007 s10: 00aaaaaaaab98700 s11: 0000000000000001 t3 : ffffffff819a8097 t4 : ffffffff819a8097 t5 : ffffffff819a8098 t6 : ff200000109bb9a8 Fixes: ad943893d5f1 ("RISC-V: Fixup schedule out issue in machine_crash_shutdown()") Reviewed-by: Xianting Tian <xianting.tian@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Cc: Nick Kossifidis <mick@ics.forth.gr> Link: https://lore.kernel.org/r/20221020141603.2856206-3-guoren@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>