path: root/arch
Age  Commit message  Author
2024-03-11  Merge tag 'loongarch-kvm-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD  Paolo Bonzini
LoongArch KVM changes for v6.9
* Set reserved bits as zero in CPUCFG.
* Start SW timer only when vcpu is blocking.
* Do not restart SW timer when it is expired.
* Remove unnecessary CSR register saving during enter guest.
2024-03-09  Merge tag 'kvm-x86-guest_memfd_fixes-6.8' of https://github.com/kvm-x86/linux into HEAD  Paolo Bonzini
KVM GUEST_MEMFD fixes for 6.8:
- Make KVM_MEM_GUEST_MEMFD mutually exclusive with KVM_MEM_READONLY to avoid creating ABI that KVM can't sanely support.
- Update documentation for KVM_SW_PROTECTED_VM to make it abundantly clear that such VMs are purely a development and testing vehicle, and come with zero guarantees.
- Limit KVM_SW_PROTECTED_VM guests to the TDP MMU, as the long term plan is to support confidential VMs with deterministic private memory (SNP and TDX) only in the TDP MMU.
- Fix a bug in a GUEST_MEMFD negative test that resulted in false passes when verifying that KVM_MEM_GUEST_MEMFD memslots can't be dirty logged.
2024-03-06  LoongArch: KVM: Remove unnecessary CSR register saving during enter guest  Bibo Mao
Some CSR registers like CRMD/PRMD are saved during enter VM mode now. However they are not restored for actual use, so saving for these CSR registers can be removed. Reviewed-by: Tianrui Zhao <zhaotianrui@loongson.cn> Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-03-06  LoongArch: KVM: Do not restart SW timer when it is expired  Bibo Mao
LoongArch VCPUs have their own separate HW timers. SW timer is to wake up blocked vcpu thread, rather than HW timer emulation. When blocking vcpu scheduled out, SW timer is used to wakeup blocked vcpu thread and injects timer interrupt. It does not care about whether guest timer is in period mode or oneshot mode, and SW timer needs not to be restarted since vcpu has been woken. This patch does not restart SW timer when it is expired. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-03-06  LoongArch: KVM: Start SW timer only when vcpu is blocking  Bibo Mao
SW timer is enabled when vcpu thread is scheduled out, and it is to wake up vcpu from blocked queue. If vcpu thread is scheduled out but is not blocked, such as it is preempted by other threads, it is not necessary to enable SW timer. Since vcpu thread is still on running queue if it is preempted and SW timer is only to wake up vcpu on blocking queue, so SW timer is not useful in this situation. This patch enables SW timer only when vcpu is scheduled out and is blocking. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
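The two SW timer commits above describe a small state machine. The following is a minimal Python model of that behaviour, not the kernel code: the `Vcpu` class, its fields, and both function names are hypothetical and exist only to illustrate the rule "arm the timer only when blocking, and never re-arm it on expiry".

```python
from dataclasses import dataclass


@dataclass
class Vcpu:
    # All fields are hypothetical names for this illustration.
    blocking: bool = False          # vcpu thread is on a blocked queue
    sw_timer_armed: bool = False    # wake-up SW timer is running
    pending_timer_irq: bool = False # timer interrupt queued for the guest


def on_sched_out(vcpu: Vcpu) -> None:
    # A preempted vcpu is still runnable, so it needs no wake-up timer.
    vcpu.sw_timer_armed = vcpu.blocking


def on_sw_timer_expired(vcpu: Vcpu) -> None:
    # The timer only wakes the blocked thread and injects the interrupt.
    vcpu.pending_timer_irq = True
    vcpu.blocking = False
    # Not restarted, regardless of periodic or one-shot guest timer mode.
    vcpu.sw_timer_armed = False
```

In this model the guest's own HW timer mode never appears, which mirrors the commit's point that the SW timer is a wake-up mechanism rather than timer emulation.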
2024-03-06  LoongArch: KVM: Set reserved bits as zero in CPUCFG  Bibo Mao
Supported CPUCFG information comes from function _kvm_get_cpucfg_mask(). A bit should be zero if it is reserved by HW or if it is not supported by KVM. Also LoongArch software page table walk feature defined in CPUCFG2_LSPW is supported by KVM, it should be enabled by default. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-03-03  Merge tag 'powerpc-6.8-5' of ↵  Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix IOMMU table initialisation when doing kdump over SR-IOV - Fix incorrect RTAS function name for resetting TCE tables - Fix fpu_signal selftest failures since a recent change Thanks to Gaurav Batra and Nathan Lynch. * tag 'powerpc-6.8-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: selftests/powerpc: Fix fpu_signal failures powerpc/rtas: use correct function name for resetting TCE tables powerpc/pseries/iommu: IOMMU table is not initialized for kdump over SR-IOV
2024-03-03  Merge tag 'x86_urgent_for_v6.8_rc7' of ↵  Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Do not reserve SETUP_RNG_SEED setup data in the e820 map as it should be used by kexec only - Make sure MKTME feature detection happens at an earlier time in the boot process so that the physical address size supported by the CPU is properly corrected and MTRR masks are programmed properly, leading to TDX systems booting without disable_mtrr_cleanup on the cmdline - Make sure the different address sizes supported by the CPU are read out as early as possible * tag 'x86_urgent_for_v6.8_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/e820: Don't reserve SETUP_RNG_SEED in e820 x86/cpu/intel: Detect TME keyid bits before setting MTRR mask registers x86/cpu: Allow reducing x86_phys_bits during early_identify_cpu()
2024-03-01  Merge tag 'riscv-for-linus-6.8-rc7' of ↵  Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - detect ".option arch" support on not-yet-released LLVM builds - fix missing TLB flush when modifying non-leaf PTEs - fixes for T-Head custom extensions - fix for systems with the legacy PMU, that manifests as a crash on kernels built without SBI PMU support - fix for systems that clear *envcfg on suspend, which manifests as cbo.zero trapping after resume - fixes for Svnapot systems, including removing Svnapot support for huge vmalloc/vmap regions * tag 'riscv-for-linus-6.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: Sparse-Memory/vmemmap out-of-bounds fix riscv: Fix pte_leaf_size() for NAPOT Revert "riscv: mm: support Svnapot in huge vmap" riscv: Save/restore envcfg CSR during CPU suspend riscv: Add a custom ISA extension for the [ms]envcfg CSR riscv: Fix enabling cbo.zero when running in M-mode perf: RISCV: Fix panic on pmu overflow handler MAINTAINERS: Update SiFive driver maintainers drivers: perf: ctr_get_width function for legacy is not defined drivers: perf: added capabilities for legacy PMU RISC-V: Ignore V from the riscv,isa DT property on older T-Head CPUs riscv: Fix build error if !CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION riscv: mm: fix NOCACHE_THEAD does not set bit[61] correctly riscv: add CALLER_ADDRx support RISC-V: Drop invalid test from CONFIG_AS_HAS_OPTION_ARCH kbuild: Add -Wa,--fatal-warnings to as-instr invocation riscv: tlb: fix __p*d_free_tlb()
2024-03-01  x86/e820: Don't reserve SETUP_RNG_SEED in e820  Jiri Bohac
SETUP_RNG_SEED in setup_data is supplied by kexec and should not be reserved in the e820 map. Doing so reserves 16 bytes of RAM when booting with kexec. (16 bytes because data->len is zeroed by parse_setup_data so only sizeof(setup_data) is reserved.) When kexec is used repeatedly, each boot adds two entries in the kexec-provided e820 map as the 16-byte range splits a larger range of usable memory. Eventually all of the 128 available entries get used up. The next split will result in losing usable memory as the new entries cannot be added to the e820 map. Fixes: 68b8e9713c8e ("x86/setup: Use rng seeds from setup_data") Signed-off-by: Jiri Bohac <jbohac@suse.cz> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: <stable@kernel.org> Link: https://lore.kernel.org/r/ZbmOjKnARGiaYBd5@dwarf.suse.cz
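The arithmetic behind "eventually all of the 128 available entries get used up" can be checked with a toy model. This is a sketch under stated assumptions: the map starts with a single usable range, and each kexec boot places its 16-byte reservation in the middle of the largest usable range. Both the starting map and the placement policy are hypothetical; only the 16-byte size and the 128-entry limit come from the commit message.

```python
E820_MAX_ENTRIES = 128   # limit quoted in the commit message
RESERVED_BYTES = 16      # sizeof(setup_data), per the commit message


def reserve(entries, addr, size):
    """Split the usable range that strictly contains [addr, addr + size)."""
    out = []
    for start, end, kind in entries:
        if kind == "usable" and start < addr and addr + size < end:
            out.append((start, addr, "usable"))
            out.append((addr, addr + size, "reserved"))
            out.append((addr + size, end, "usable"))
        else:
            out.append((start, end, kind))
    return out


def boots_until_full(entries):
    """Count kexec boots before the next split no longer fits in the map."""
    boots = 0
    while True:
        usable = [e for e in entries if e[2] == "usable"]
        start, end, _ = max(usable, key=lambda e: e[1] - e[0])
        candidate = reserve(entries, (start + end) // 2, RESERVED_BYTES)
        if len(candidate) > E820_MAX_ENTRIES:
            return boots
        entries = candidate
        boots += 1


initial_map = [(0x100000, 0x100000000, "usable")]  # hypothetical starting map
print(boots_until_full(initial_map))
```

Each split turns one entry into three, a net gain of two, so a map that starts with fewer entries simply survives proportionally more boots before hitting the limit.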
2024-02-29  riscv: Sparse-Memory/vmemmap out-of-bounds fix  Dimitris Vlachos
Offset vmemmap so that the first page of vmemmap will be mapped to the first page of physical memory in order to ensure that vmemmap’s bounds will be respected during pfn_to_page()/page_to_pfn() operations. The conversion macros will produce correct SV39/48/57 addresses for every possible/valid DRAM_BASE inside the physical memory limits. v2:Address Alex's comments Suggested-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Dimitris Vlachos <dvlachos@ics.forth.gr> Reported-by: Dimitris Vlachos <dvlachos@ics.forth.gr> Closes: https://lore.kernel.org/linux-riscv/20240202135030.42265-1-csd4492@csd.uoc.gr Fixes: d95f1a542c3d ("RISC-V: Implement sparsemem") Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240229191723.32779-1-dvlachos@ics.forth.gr Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-29  Merge patch series "NAPOT Fixes"  Palmer Dabbelt
Alexandre Ghiti <alexghiti@rivosinc.com> says: This contains 2 fixes for NAPOT: patch 1 disables the use of NAPOT mapping for vmalloc/vmap and patch 2 implements pte_leaf_size() to report NAPOT size. * b4-shazam-merge: riscv: Fix pte_leaf_size() for NAPOT Revert "riscv: mm: support Svnapot in huge vmap" Link: https://lore.kernel.org/r/20240227205016.121901-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-29  riscv: Fix pte_leaf_size() for NAPOT  Alexandre Ghiti
pte_leaf_size() must be reimplemented to add support for NAPOT mappings. Fixes: 82a1a1f3bfb6 ("riscv: mm: support Svnapot in hugetlb page") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240227205016.121901-3-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-29  Revert "riscv: mm: support Svnapot in huge vmap"  Alexandre Ghiti
This reverts commit ce173474cf19fe7fbe8f0fc74e3c81ec9c3d9807.
We cannot correctly deal with NAPOT mappings in vmalloc/vmap because if some part of a NAPOT mapping is unmapped, the remaining mapping is not updated accordingly. For example:
    ptr = vmalloc_huge(64 * 1024, GFP_KERNEL);
    vunmap_range((unsigned long)(ptr + PAGE_SIZE), (unsigned long)(ptr + 64 * 1024));
leads to the following kernel page table dump:
    0xffff8f8000ef0000-0xffff8f8000ef1000 0x00000001033c0000 4K PTE N .. .. D A G . . W R V
Meaning the first entry which was not unmapped still has the N bit set, which, if accessed first and cached in the TLB, could allow access to the unmapped range.
That's because the logic to break the NAPOT mapping does not exist and likely won't. Indeed, to break a NAPOT mapping, we first have to clear the whole mapping, flush the TLB and then set the new mapping ("break-before-make" equivalent). That works fine in userspace since we can handle any pagefault occurring on the remaining mapping but we can't handle a kernel pagefault on such mapping.
So fix this by reverting the commit that introduced the vmap/vmalloc support.
Fixes: ce173474cf19 ("riscv: mm: support Svnapot in huge vmap")
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240227205016.121901-2-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-29  Merge patch series "riscv: cbo.zero fixes"  Palmer Dabbelt
Samuel Holland <samuel.holland@sifive.com> says:
This series fixes a couple of issues related to using the cbo.zero instruction in userspace. The first patch fixes a bug where the wrong enable bit gets set if the kernel is running in M-mode. The remaining patches fix a bug where the enable bit gets reset to its default value after a nonretentive idle state. I have hardware which reproduces this:
Before this series:
    $ tools/testing/selftests/riscv/hwprobe/cbo
    TAP version 13
    1..3
    ok 1 Zicboz block size
    # Zicboz block size: 64
    Illegal instruction
After applying this series:
    $ tools/testing/selftests/riscv/hwprobe/cbo
    TAP version 13
    1..3
    ok 1 Zicboz block size
    # Zicboz block size: 64
    ok 2 cbo.zero
    ok 3 cbo.zero check
    # Totals: pass:3 fail:0 xfail:0 xpass:0 skip:0 error:0
* b4-shazam-merge:
  riscv: Save/restore envcfg CSR during CPU suspend
  riscv: Add a custom ISA extension for the [ms]envcfg CSR
  riscv: Fix enabling cbo.zero when running in M-mode
Link: https://lore.kernel.org/r/20240228065559.3434837-1-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-29  riscv: Save/restore envcfg CSR during CPU suspend  Samuel Holland
The value of the [ms]envcfg CSR is lost when entering a nonretentive idle state, so the CSR must be rewritten when resuming the CPU. Cc: <stable@vger.kernel.org> # v6.7+ Fixes: 43c16d51a19b ("RISC-V: Enable cbo.zero in usermode") Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20240228065559.3434837-4-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-29  riscv: Add a custom ISA extension for the [ms]envcfg CSR  Samuel Holland
The [ms]envcfg CSR was added in version 1.12 of the RISC-V privileged ISA (aka S[ms]1p12). However, bits in this CSR are defined by several other extensions which may be implemented separately from any particular version of the privileged ISA (for example, some unrelated errata may prevent an implementation from claiming conformance with Ss1p12). As a result, Linux cannot simply use the privileged ISA version to determine if the CSR is present. It must also check if any of these other extensions are implemented. It also cannot probe the existence of the CSR at runtime, because Linux does not require Sstrict, so (in the absence of additional information) it cannot know if a CSR at that address is [ms]envcfg or part of some non-conforming vendor extension. Since there are several standard extensions that imply the existence of the [ms]envcfg CSR, it becomes unwieldy to check for all of them wherever the CSR is accessed. Instead, define a custom Xlinuxenvcfg ISA extension bit that is implied by the other extensions and denotes that the CSR exists as defined in the privileged ISA, containing at least one of the fields common between menvcfg and senvcfg. This extension does not need to be parsed from the devicetree or ISA string because it can only be implemented as a subset of some other standard extension. Cc: <stable@vger.kernel.org> # v6.7+ Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20240228065559.3434837-3-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-29  riscv: Fix enabling cbo.zero when running in M-mode  Samuel Holland
When the kernel is running in M-mode, the CBZE bit must be set in the menvcfg CSR, not in senvcfg. Cc: <stable@vger.kernel.org> Fixes: 43c16d51a19b ("RISC-V: Enable cbo.zero in usermode") Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20240228065559.3434837-2-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
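The three cbo.zero commits above fit together as one decision chain: derive an internal "envcfg exists" bit from other extensions, pick the CSR by privilege mode, and rewrite it on resume. The Python sketch below models that chain only. The set of implying extensions is an assumption (Zicboz is named in the series; Zicbom is an assumed second example), and every function name here is hypothetical.

```python
# Assumed set of extensions that imply the [ms]envcfg CSR exists.
IMPLIES_ENVCFG = {"zicboz", "zicbom"}
XLINUXENVCFG = "xlinuxenvcfg"  # internal bit, never parsed from the devicetree


def resolve_extensions(parsed):
    """Return the parsed extensions plus the implied internal envcfg bit."""
    exts = {e.lower() for e in parsed}
    if exts & IMPLIES_ENVCFG:
        exts.add(XLINUXENVCFG)
    return exts


def envcfg_csr_name(kernel_in_m_mode):
    """An M-mode kernel must set CBZE in menvcfg, not senvcfg."""
    return "menvcfg" if kernel_in_m_mode else "senvcfg"


def must_restore_envcfg_on_resume(exts):
    """The CSR value is lost in nonretentive idle, so rewrite it if present."""
    return XLINUXENVCFG in exts
```

Checking one derived bit at each access site avoids repeating the full list of implying extensions wherever the CSR is touched, which is the motivation the commit message gives.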
2024-02-28  Merge tag 'v6.8-p5' of ↵  Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: "This fixes a regression in lskcipher and an out-of-bound access in arm64/neonbs" * tag 'v6.8-p5' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: arm64/neonbs - fix out-of-bounds access on short input crypto: lskcipher - Copy IV in lskcipher glue code always
2024-02-26  x86/cpu/intel: Detect TME keyid bits before setting MTRR mask registers  Paolo Bonzini
MKTME repurposes the high bit of physical address to key id for encryption key and, even though MAXPHYADDR in CPUID[0x80000008] remains the same, the valid bits in the MTRR mask register are based on the reduced number of physical address bits. detect_tme() in arch/x86/kernel/cpu/intel.c detects TME and subtracts it from the total usable physical bits, but it is called too late. Move the call to early_init_intel() so that it is called in setup_arch(), before MTRRs are setup. This fixes boot on TDX-enabled systems, which until now only worked with "disable_mtrr_cleanup". Without the patch, the values written to the MTRRs mask registers were 52-bit wide (e.g. 0x000fffff_80000800) and the writes failed; with the patch, the values are 46-bit wide, which matches the reduced MAXPHYADDR that is shown in /proc/cpuinfo. Reported-by: Zixi Chen <zixchen@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc:stable@vger.kernel.org Link: https://lore.kernel.org/all/20240131230902.1867092-3-pbonzini%40redhat.com
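The 52-bit versus 46-bit mask values can be reproduced with a few lines of arithmetic. This sketch assumes the usual MTRR PHYSMASK layout (mask field above bit 12, valid bit at bit 11) and a 2 GiB variable range; the range size is inferred from the example value in the commit message, and the function name is hypothetical.

```python
MTRR_VALID_BIT = 1 << 11  # "valid" flag in the PHYSMASK register (assumed layout)


def mtrr_physmask(range_size, phys_bits):
    """Mask register value for a power-of-two range, given usable address bits."""
    address_mask = (1 << phys_bits) - 1
    return (~(range_size - 1) & address_mask) | MTRR_VALID_BIT


two_gib = 0x80000000
print(hex(mtrr_physmask(two_gib, 52)))  # width before the fix
print(hex(mtrr_physmask(two_gib, 46)))  # width after keyid bits are subtracted
```

With 52 bits the result is 0xfffff80000800, the value quoted in the commit message. With 46 bits the keyid bits are cleared from the top of the mask, which is what lets the MTRR write succeed.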
2024-02-26  x86/cpu: Allow reducing x86_phys_bits during early_identify_cpu()  Paolo Bonzini
In commit fbf6449f84bf ("x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach"), the initialization of c->x86_phys_bits was moved after this_cpu->c_early_init(c). This is incorrect because early_init_amd() expected to be able to reduce the value according to the contents of CPUID leaf 0x8000001f. Fortunately, the bug was negated by init_amd()'s call to early_init_amd(), which does reduce x86_phys_bits in the end. However, this is very late in the boot process and, most notably, the wrong value is used for x86_phys_bits when setting up MTRRs. To fix this, call get_cpu_address_sizes() as soon as X86_FEATURE_CPUID is set/cleared, and c->extended_cpuid_level is retrieved. Fixes: fbf6449f84bf ("x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc:stable@vger.kernel.org Link: https://lore.kernel.org/all/20240131230902.1867092-2-pbonzini%40redhat.com
2024-02-25  Merge tag 'x86_urgent_for_v6.8_rc6' of ↵  Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Make sure clearing CPU buffers using VERW happens at the latest possible point in the return-to-userspace path, otherwise memory accesses after the VERW execution could cause data to land in CPU buffers again * tag 'x86_urgent_for_v6.8_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: KVM/VMX: Move VERW closer to VMentry for MDS mitigation KVM/VMX: Use BT+JNC, i.e. EFLAGS.CF to select VMRESUME vs. VMLAUNCH x86/bugs: Use ALTERNATIVE() instead of mds_user_clear static key x86/entry_32: Add VERW just before userspace transition x86/entry_64: Add VERW just before userspace transition x86/bugs: Add asm helpers for executing VERW
2024-02-24  Merge tag 'powerpc-6.8-4' of ↵  Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix a crash when hot adding a PCI device to an LPAR since recent changes - Fix nested KVM level-2 guest reboot failure due to empty 'arch_compat' Thanks to Amit Machhiwal, Aneesh Kumar K.V (IBM), Brian King, Gaurav Batra, and Vaibhav Jain. * tag 'powerpc-6.8-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: KVM: PPC: Book3S HV: Fix L2 guest reboot failure due to empty 'arch_compat' powerpc/pseries/iommu: DLPAR add doesn't completely initialize pci_controller
2024-02-24  Merge tag 'cxl-fixes-6.8-rc6' of ↵  Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull cxl fixes from Dan Williams: "A collection of significant fixes for the CXL subsystem. The largest change in this set, that bordered on "new development", is the fix for the fact that the location of the new qos_class attribute did not match the Documentation. The fix ends up deleting more code than it added, and it has a new unit test to backstop basic errors in this interface going forward. So the "red-diff" and unit test saved the "rip it out and try again" response. In contrast, the new notification path for firmware reported CXL errors (CXL CPER notifications) has a locking context bug that can not be fixed with a red-diff. Given where the release cycle stands, it is not comfortable to squeeze in that fix in these waning days. So, that receives the "back it out and try again later" treatment. There is a regression fix in the code that establishes memory NUMA nodes for platform CXL regions. That has an ack from x86 folks. There are a couple more fixups for Linux to understand (reassemble) CXL regions instantiated by platform firmware. The policy around platforms that do not match host-physical-address with system-physical-address (i.e. systems that have an address translation mechanism between the address range reported in the ACPI CEDT.CFMWS and endpoint decoders) has been softened to abort driver load rather than teardown the memory range (can cause system hangs). Lastly, there is a robustness / regression fix for cases where the driver would previously continue in the face of error, and a fixup for PCI error notification handling. 
Summary: - Fix NUMA initialization from ACPI CEDT.CFMWS - Fix region assembly failures due to async init order - Fix / simplify export of qos_class information - Fix cxl_acpi initialization vs single-window-init failures - Fix handling of repeated 'pci_channel_io_frozen' notifications - Workaround platforms that violate host-physical-address == system-physical address assumptions - Defer CXL CPER notification handling to v6.9" * tag 'cxl-fixes-6.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl/acpi: Fix load failures due to single window creation failure acpi/ghes: Remove CXL CPER notifications cxl/pci: Fix disabling memory if DVSEC CXL Range does not match a CFMWS window cxl/test: Add support for qos_class checking cxl: Fix sysfs export of qos_class for memdev cxl: Remove unnecessary type cast in cxl_qos_class_verify() cxl: Change 'struct cxl_memdev_state' *_perf_list to single 'struct cxl_dpa_perf' cxl/region: Allow out of order assembly of autodiscovered regions cxl/region: Handle endpoint decoders in cxl_region_find_decoder() x86/numa: Fix the sort compare func used in numa_fill_memblks() x86/numa: Fix the address overlap check in numa_fill_memblks() cxl/pci: Skip to handle RAS errors if CXL.mem device is detached
2024-02-24  Merge tag 'loongarch-fixes-6.8-3' of ↵  Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch fixes from Huacai Chen: "Fix two cpu-hotplug issues, fix the init sequence about FDT system, fix the coding style of dts, and fix the wrong CPUCFG ID handling of KVM" * tag 'loongarch-fixes-6.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: LoongArch: KVM: Streamline kvm_check_cpucfg() and improve comments LoongArch: KVM: Rename _kvm_get_cpucfg() to _kvm_get_cpucfg_mask() LoongArch: KVM: Fix input validation of _kvm_get_cpucfg() & kvm_check_cpucfg() LoongArch: dts: Minor whitespace cleanup LoongArch: Call early_init_fdt_scan_reserved_mem() earlier LoongArch: Update cpu_sibling_map when disabling nonboot CPUs LoongArch: Disable IRQ before init_fn() for nonboot CPUs
2024-02-24  crypto: arm64/neonbs - fix out-of-bounds access on short input  Ard Biesheuvel
The bit-sliced implementation of AES-CTR operates on blocks of 128 bytes, and will fall back to the plain NEON version for tail blocks or inputs that are shorter than 128 bytes to begin with. It will call straight into the plain NEON asm helper, which performs all memory accesses in granules of 16 bytes (the size of a NEON register). For this reason, the associated plain NEON glue code will copy inputs shorter than 16 bytes into a temporary buffer, given that this is a rare occurrence and it is not worth the effort to work around this in the asm code. The fallback from the bit-sliced NEON version fails to take this into account, potentially resulting in out-of-bounds accesses. So clone the same workaround, and use a temp buffer for short in/outputs. Fixes: fc074e130051 ("crypto: arm64/aes-neonbs-ctr - fallback to plain NEON for final chunk") Cc: <stable@vger.kernel.org> Reported-by: syzbot+f1ceaa1a09ab891e1934@syzkaller.appspotmail.com Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
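The workaround described above (bounce short inputs through a temporary 16-byte buffer) can be illustrated without any real cryptography. In this sketch `granule_xor` is a stand-in for the asm helper that always touches a full 16-byte granule, and a plain XOR replaces AES-CTR. Both function names are hypothetical, and the kernel's actual buffer placement may differ.

```python
GRANULE = 16  # size of a NEON register in bytes, per the commit message


def granule_xor(buf, keystream):
    """Stand-in for the asm helper: always accesses a full 16-byte granule."""
    if len(buf) < GRANULE:
        raise IndexError("out-of-bounds access on short buffer")
    for i in range(GRANULE):
        buf[i] ^= keystream[i]


def ctr_tail(data, keystream):
    """Process 1..16 trailing bytes through a temporary full-size buffer."""
    n = len(data)
    if not 0 < n <= GRANULE:
        raise ValueError("tail must be between 1 and 16 bytes")
    tmp = bytearray(GRANULE)      # temp buffer absorbs the over-wide access
    tmp[:n] = data
    granule_xor(tmp, keystream)
    return bytes(tmp[:n])         # copy back only the bytes that exist
```

Calling `granule_xor` directly on a 3-byte buffer raises in this model, which corresponds to the out-of-bounds access; routing the same input through `ctr_tail` keeps every access inside the temporary buffer.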
2024-02-23  Merge tag 'parisc-for-6.8-rc6' of ↵  Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc architecture fixes from Helge Deller: "Fixes CPU hotplug, the parisc stack unwinder and two possible build errors in kprobes and ftrace area: - Fix CPU hotplug - Fix unaligned accesses and faults in stack unwinder - Fix potential build errors by always including asm-generic/kprobes.h - Fix build bug by add missing CONFIG_DYNAMIC_FTRACE check" * tag 'parisc-for-6.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Fix stack unwinder parisc/kprobes: always include asm-generic/kprobes.h parisc/ftrace: add missing CONFIG_DYNAMIC_FTRACE check Revert "parisc: Only list existing CPUs in cpu_possible_mask"
2024-02-23  Merge tag 'arm-fixes-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc  Linus Torvalds
Pull arm and RISC-V SoC fixes from Arnd Bergmann:
"The Rockchip and IMX8 platforms get a number of fixes for dts files in order to address some misconfigurations, including a regression for USB-C support on some boards.
The other dts fixes are part of a series by Rob Herring to clean up another class of dtc compiler warnings across all platforms, with a few others helping out as well. With this, we can enable the warning for the coming merge window without introducing regressions.
Conor Dooley has collected fixes for RISC-V platforms, both for the dts files and for platform-specific drivers.
The ep93xx platform gets a regression fix for its gpio descriptors"
* tag 'arm-fixes-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (28 commits)
  ARM: dts: renesas: rcar-gen2: Add missing #interrupt-cells to DA9063 nodes
  cache: ax45mp_cache: Align end size to cache boundary in ax45mp_dma_cache_wback()
  arm64: dts: qcom: Fix interrupt-map cell sizes
  arm: dts: Fix dtc interrupt_map warnings
  arm64: dts: Fix dtc interrupt_provider warnings
  arm: dts: Fix dtc interrupt_provider warnings
  arm64: dts: freescale: Disable interrupt_map check
  ARM: ep93xx: Add terminator to gpiod_lookup_table
  riscv: dts: sifive: add missing #interrupt-cells to pmic
  arm64: dts: rockchip: Correct Indiedroid Nova GPIO Names
  arm64: dts: rockchip: Drop interrupts property from rk3328 pwm-rockchip node
  arm64: dts: rockchip: set num-cs property for spi on px30
  arm64: dts: rockchip: minor rk3588 whitespace cleanup
  riscv: dts: starfive: replace underscores in node names
  bus: imx-weim: fix valid range check
  Revert "arm64: dts: imx8mn-var-som-symphony: Describe the USB-C connector"
  Revert "arm64: dts: imx8mp-dhcom-pdk3: Describe the USB-C connector"
  arm64: dts: tqma8mpql: fix audio codec iov-supply
  arm64: dts: rockchip: drop unneeded status from rk3588-jaguar gpio-leds
  ARM: dts: rockchip: Drop interrupts property from pwm-rockchip nodes
  ...
2024-02-23  Merge tag 'arm64-fixes' of ↵  Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "A simple fix to a definition in the CXL PMU driver, a couple of patches to restore SME control registers on the resume path (since Arm's fast model now clears them) and a revert for our jump label asm constraints after Geert noticed they broke the build with GCC 5.5. There was then the ensuing discussion about raising the minimum GCC (and corresponding binutils) versions at [1], but for now we'll keep things working as they were until that goes ahead. - Revert fix to jump label asm constraints, as it regresses the build with some GCC 5.5 toolchains. - Restore SME control registers when resuming from suspend - Fix incorrect filter definition in CXL PMU driver" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64/sme: Restore SMCR_EL1.EZT0 on exit from suspend arm64/sme: Restore SME registers on exit from suspend Revert "arm64: jump_label: use constraints "Si" instead of "i"" perf: CXL: fix CPMU filter value mask length
2024-02-23  Merge tag 's390-6.8-3' of ↵  Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Heiko Carstens: - Fix invalid -EBUSY on ccw_device_start() which can lead to failing device initialization - Add missing multiplication by 8 in __iowrite64_copy() to get the correct byte length before calling zpci_memcpy_toio() - Various config updates * tag 's390-6.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/cio: fix invalid -EBUSY on ccw_device_start s390: use the correct count for __iowrite64_copy() s390/configs: update default configurations s390/configs: enable INIT_STACK_ALL_ZERO in all configurations s390/configs: provide compat topic configuration target
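The `__iowrite64_copy()` item in the list above is a unit mismatch: the caller passes a count of 64-bit words while the underlying copy expects bytes. A minimal Python model of the difference follows; both function names are hypothetical stand-ins and the buffers are plain byte arrays rather than I/O memory.

```python
WORD_BYTES = 8  # one 64-bit quantity


def iowrite64_copy(dst, src, count):
    """Copy `count` 64-bit words by converting the count to a byte length."""
    nbytes = count * WORD_BYTES
    if len(src) < nbytes or len(dst) < nbytes:
        raise ValueError("buffers too small for requested word count")
    dst[:nbytes] = src[:nbytes]


def iowrite64_copy_unscaled(dst, src, count):
    """The pre-fix behaviour in this model: count is wrongly used as bytes."""
    dst[:count] = src[:count]


src = bytes(range(32))
good, bad = bytearray(32), bytearray(32)
iowrite64_copy(good, src, 4)
iowrite64_copy_unscaled(bad, src, 4)
print(good == src, bad == src)
```

The unscaled variant copies only the first 4 bytes of a 4-word request and leaves the rest of the destination untouched.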
2024-02-23  Merge tag 'drm-fixes-2024-02-23' of git://anongit.freedesktop.org/drm/drm  Linus Torvalds
Pull drm fixes from Dave Airlie:
"This is the weekly drm fixes. Non-drivers there is a fbdev/sparc fix, syncobj, ttm and buddy fixes.
On the driver side, ivpu, meson, i915 have a small fix each. Then amdgpu and xe have a bunch. Nouveau has some minor uapi additions to give userspace some useful info along with a Kconfig change to allow the new GSP firmware paths to be used by default on the GPUs it supports.
Seems about the usual amount for this time of release cycle.
fbdev:
- fix sparc undefined reference
syncobj:
- fix sync obj fence waiting
- handle NULL fence in syncobj eventfd code
ttm:
- fix invalid free
buddy:
- fix list handling
- fix 32-bit build
meson:
- don't remove bridges from other drivers
nouveau:
- fix build warnings
- add two minor info parameters
- add a Kconfig to allow GSP by default on some GPUs
ivpu:
- allow fw to do initial tile config
i915:
- fix TV mode
amdgpu:
- Suspend/resume fixes
- Backlight error fix
- DCN 3.5 fixes
- Misc fixes
xe:
- Remove support for persistent exec_queues
- Drop a redundant sysfs newline printout
- A three-patch fix for a VM_BIND rebind optimization path
- Fix a modpost warning on an xe KUNIT module"
* tag 'drm-fixes-2024-02-23' of git://anongit.freedesktop.org/drm/drm: (27 commits)
  nouveau: add an ioctl to report vram usage
  nouveau: add an ioctl to return vram bar size.
  nouveau/gsp: add kconfig option to enable GSP paths by default
  drm/amdgpu: Fix the runtime resume failure issue
  drm/amd/display: fix null-pointer dereference on edid reading
  drm/amd/display: Fix memory leak in dm_sw_fini()
  drm/amd/display: fix input states translation error for dcn35 & dcn351
  drm/amd/display: Fix potential null pointer dereference in dc_dmub_srv
  drm/amd/display: Only allow dig mapping to pwrseq in new asic
  drm/amd/display: adjust few initialization order in dm
  drm/syncobj: handle NULL fence in syncobj_eventfd_entry_func
  drm/syncobj: call drm_syncobj_fence_add_wait when WAIT_AVAILABLE flag is set
  drm/ttm: Fix an invalid freeing on already freed page in error path
  sparc: Fix undefined reference to fb_is_primary_device
  drm/xe: Fix modpost warning on xe_mocs kunit module
  drm/xe/xe_gt_idle: Drop redundant newline in name
  drm/xe: Return 2MB page size for compact 64k PTEs
  drm/xe: Add XE_VMA_PTE_64K VMA flag
  drm/xe: Fix xe_vma_set_pte_size
  drm/xe/uapi: Remove support for persistent exec_queues
  ...
2024-02-23  RISC-V: Ignore V from the riscv,isa DT property on older T-Head CPUs  Conor Dooley
Before attempting to support the pre-ratification version of vector found on older T-Head CPUs, disallow "v" in riscv,isa on these platforms. The deprecated property has no clear way to communicate the specific version of vector that is supported and much of the vendor provided software puts "v" in the isa string. riscv,isa-extensions should be used instead. This should not be too much of a burden for these systems, as the vendor shipped devicetrees and firmware do not work with a mainline kernel and will require updating. We can limit this restriction to only ignore v in riscv,isa on CPUs that report T-Head's vendor ID and a zero marchid. Newer T-Head CPUs that support the ratified version of vector should report non-zero marchid, according to Guo Ren [1]. Link: https://lore.kernel.org/linux-riscv/CAJF2gTRy5eK73=d6s7CVy9m9pB8p4rAoMHM3cZFwzg=AuF7TDA@mail.gmail.com/ [1] Fixes: dc6667a4e7e3 ("riscv: Extending cpufeature.c to detect V-extension") Co-developed-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Acked-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/20240223-tidings-shabby-607f086cb4d7@spud Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
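The restriction described above reduces to a single predicate on the vendor and architecture IDs. The sketch below shows that predicate in Python. The vendor ID constant is my understanding of T-Head's mvendorid and should be treated as an assumption, and the function name and list-based representation of the ISA string are hypothetical.

```python
THEAD_VENDOR_ID = 0x5B7  # assumed value of T-Head's mvendorid


def filter_riscv_isa(exts, mvendorid, marchid):
    """Drop "v" from riscv,isa on T-Head CPUs that report a zero marchid."""
    if mvendorid == THEAD_VENDOR_ID and marchid == 0:
        return [e for e in exts if e != "v"]
    return list(exts)


print(filter_riscv_isa(["i", "m", "a", "c", "v"], THEAD_VENDOR_ID, 0))
```

The filter applies only to the deprecated `riscv,isa` string; per the commit message, affected systems are expected to describe vector support through `riscv,isa-extensions` instead.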
2024-02-23Merge tag 'renesas-fixes-for-v6.8-tag1' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel into arm/fixes Renesas fixes for v6.8 - Add missing #interrupt-cells to DA9063 nodes. * tag 'renesas-fixes-for-v6.8-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel: ARM: dts: renesas: rcar-gen2: Add missing #interrupt-cells to DA9063 nodes Link: https://lore.kernel.org/r/cover.1708597150.git.geert+renesas@glider.be Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-02-23Merge tag 'riscv-dt-fixes-for-v6.8-rc6' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux into arm/fixes RISC-V Devicetree fixes for v6.8-rc6 Two fixes for W=2 issues in devicetrees, which should constitute fixes for all reasonable-to-fix W=2 problems on RISC-V. The others are caused by standard USB and MMC property names containing underscores that are not likely to ever change. Signed-off-by: Conor Dooley <conor.dooley@microchip.com> * tag 'riscv-dt-fixes-for-v6.8-rc6' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux: riscv: dts: sifive: add missing #interrupt-cells to pmic riscv: dts: starfive: replace underscores in node names Link: https://lore.kernel.org/r/20240221-foil-glade-09dbf1aa3fe2@spud Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-02-23x86: irq: unconditionally define KVM interrupt vectorsPaolo Bonzini
Unlike arch/x86/kernel/idt.c, FRED support chose to remove the #ifdefs from the .c files and concentrate them in the headers, where unused handlers are #define'd to NULL. However, the constants for KVM's 3 posted interrupt vectors are still defined conditionally in irq_vectors.h. In the tree that FRED support was developed on, this is innocuous because CONFIG_HAVE_KVM was effectively always set. With the cleanups that recently went into the KVM tree to remove CONFIG_HAVE_KVM, the conditional became IS_ENABLED(CONFIG_KVM). This causes a linux-next compilation failure in FRED code, when CONFIG_KVM=n. In preparation for the merging of FRED in Linux 6.9, define the interrupt vector numbers unconditionally. Cc: x86@kernel.org Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Suggested-by: Xin Li (Intel) <xin@zytor.com> Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
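The pattern described above (constants always defined, unused handlers collapsing to NULL in the header) might look roughly like the sketch below. All DEMO_ names and the vector values are hypothetical stand-ins, not the contents of irq_vectors.h:

```c
#include <stddef.h>

/* Hypothetical stand-in for a Kconfig switch; 0 models CONFIG_KVM=n. */
#ifndef DEMO_CONFIG_KVM
#define DEMO_CONFIG_KVM 0
#endif

/* Vector numbers are defined unconditionally (values are illustrative). */
#define DEMO_POSTED_INTR_VECTOR        0xf2
#define DEMO_POSTED_INTR_WAKEUP_VECTOR 0xf1
#define DEMO_POSTED_INTR_NESTED_VECTOR 0xf0

typedef void (*demo_handler_t)(void);

#if DEMO_CONFIG_KVM
static void demo_posted_intr_handler(void) { }
#else
/* The header, not the .c file, turns the unused handler into NULL. */
#define demo_posted_intr_handler NULL
#endif

/* The table needs no #ifdef: the index always exists, the entry may be NULL. */
static demo_handler_t demo_handlers[256] = {
    [DEMO_POSTED_INTR_VECTOR] = demo_posted_intr_handler,
};

/* Returns non-zero when a handler is installed for the given vector. */
static int demo_has_handler(unsigned int vector)
{
    return vector < 256 && demo_handlers[vector] != NULL;
}
```

If the vector constants were themselves wrapped in the configuration check, the table initializer would fail to compile whenever the option is off, which is the class of build failure the commit avoids.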
2024-02-23LoongArch: KVM: Streamline kvm_check_cpucfg() and improve commentsWANG Xuerui
All the checks currently done in kvm_check_cpucfg can be realized with early returns, so just do that to avoid extra cognitive burden related to the return value handling. While at it, clean up comments of _kvm_get_cpucfg_mask() and kvm_check_cpucfg(), by removing comments that are merely restatement of the code nearby, and paraphrasing the rest so they read more natural for English speakers (that likely are not familiar with the actual Chinese- influenced grammar). No functional changes intended. Reviewed-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-02-23LoongArch: KVM: Rename _kvm_get_cpucfg() to _kvm_get_cpucfg_mask()WANG Xuerui
The function is not actually a getter of guest CPUCFG, but rather validation of the input CPUCFG ID plus information about the supported bit flags of that CPUCFG leaf. So rename it to avoid confusion. Reviewed-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-02-23LoongArch: KVM: Fix input validation of _kvm_get_cpucfg() & kvm_check_cpucfg()WANG Xuerui
The range check for the CPUCFG ID is wrong (should have been a || instead of &&) and useless in effect, so fix the obvious mistake. Furthermore, the juggling of the temp return value is unnecessary, because it is semantically equivalent and more readable to just return at every switch case's end. This is done too to avoid potential bugs in the future related to the unwanted complexity. Also, the return value of _kvm_get_cpucfg is meant to be checked, but this was not done, so bad CPUCFG IDs wrongly fall back to the default case and 0 is incorrectly returned; check the return value to fix the UAPI behavior. While at it, also remove the redundant range check in kvm_check_cpucfg, because out-of-range CPUCFG IDs are already rejected by the -EINVAL as returned by _kvm_get_cpucfg(). Fixes: db1ecca22edf ("LoongArch: KVM: Add LSX (128bit SIMD) support") Fixes: 118e10cd893d ("LoongArch: KVM: Add LASX (256bit SIMD) support") Reviewed-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
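To see why the `&&` form of the range check is useless in effect, consider this small sketch. The bounds and function names are hypothetical; only the `&&` versus `||` contrast comes from the commit message:

```c
#include <errno.h>

/* Hypothetical bounds, chosen only for illustration. */
#define DEMO_CPUCFG_MIN 0
#define DEMO_CPUCFG_MAX 20

/*
 * Buggy form: no value is both below the minimum and above the
 * maximum, so the condition is never true and nothing is rejected.
 */
static int demo_check_id_buggy(int id)
{
    if (id < DEMO_CPUCFG_MIN && id > DEMO_CPUCFG_MAX)
        return -EINVAL;
    return 0;
}

/* Fixed form: reject the ID when it falls outside either bound. */
static int demo_check_id_fixed(int id)
{
    if (id < DEMO_CPUCFG_MIN || id > DEMO_CPUCFG_MAX)
        return -EINVAL;
    return 0;
}
```

With the fixed check, callers also have to test the return value; otherwise an out-of-range ID is still treated as valid further down the call chain.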
2024-02-23LoongArch: dts: Minor whitespace cleanupKrzysztof Kozlowski
The DTS code coding style expects exactly one space before '{' character. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-02-23LoongArch: Call early_init_fdt_scan_reserved_mem() earlierHuacai Chen
The unflatten_and_copy_device_tree() function contains a call to memblock_alloc(). This means that memblock is allocating memory before any of the reserved memory regions are set aside in the arch_mem_init() function which calls early_init_fdt_scan_reserved_mem(). Therefore, there is a possibility for memblock to allocate from any of the reserved memory regions. Hence, move the call to early_init_fdt_scan_reserved_mem() to be earlier in the init sequence, so that the reserved memory regions are set aside before any allocations are done using memblock. Cc: stable@vger.kernel.org Fixes: 88d4d957edc707e ("LoongArch: Add FDT booting support from efi system table") Signed-off-by: Oreoluwa Babatunde <quic_obabatun@quicinc.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-02-23LoongArch: Update cpu_sibling_map when disabling nonboot CPUsHuacai Chen
Update cpu_sibling_map when disabling nonboot CPUs by defining & calling clear_cpu_sibling_map(), otherwise we get such errors on SMT systems: jump label: negative count! WARNING: CPU: 6 PID: 45 at kernel/jump_label.c:263 __static_key_slow_dec_cpuslocked+0xec/0x100 CPU: 6 PID: 45 Comm: cpuhp/6 Not tainted 6.8.0-rc5+ #1340 pc 90000000004c302c ra 90000000004c302c tp 90000001005bc000 sp 90000001005bfd20 a0 000000000000001b a1 900000000224c278 a2 90000001005bfb58 a3 900000000224c280 a4 900000000224c278 a5 90000001005bfb50 a6 0000000000000001 a7 0000000000000001 t0 ce87a4763eb5234a t1 ce87a4763eb5234a t2 0000000000000000 t3 0000000000000000 t4 0000000000000006 t5 0000000000000000 t6 0000000000000064 t7 0000000000001964 t8 000000000009ebf6 u0 9000000001f2a068 s9 0000000000000000 s0 900000000246a2d8 s1 ffffffffffffffff s2 ffffffffffffffff s3 90000000021518c0 s4 0000000000000040 s5 9000000002151058 s6 9000000009828e40 s7 00000000000000b4 s8 0000000000000006 ra: 90000000004c302c __static_key_slow_dec_cpuslocked+0xec/0x100 ERA: 90000000004c302c __static_key_slow_dec_cpuslocked+0xec/0x100 CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE) PRMD: 00000004 (PPLV0 +PIE -PWE) EUEN: 00000000 (-FPE -SXE -ASXE -BTE) ECFG: 00071c1c (LIE=2-4,10-12 VS=7) ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0) PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV) CPU: 6 PID: 45 Comm: cpuhp/6 Not tainted 6.8.0-rc5+ #1340 Stack : 0000000000000000 900000000203f258 900000000179afc8 90000001005bc000 90000001005bf980 0000000000000000 90000001005bf988 9000000001fe0be0 900000000224c280 900000000224c278 90000001005bf8c0 0000000000000001 0000000000000001 ce87a4763eb5234a 0000000007f38000 90000001003f8cc0 0000000000000000 0000000000000006 0000000000000000 4c206e6f73676e6f 6f4c203a656d616e 000000000009ec99 0000000007f38000 0000000000000000 900000000214b000 9000000001fe0be0 0000000000000004 0000000000000000 0000000000000107 0000000000000009 ffffffffffafdabe 00000000000000b4 0000000000000006 90000000004c302c 
9000000000224528 00005555939a0c7c 00000000000000b0 0000000000000004 0000000000000000 0000000000071c1c ... Call Trace: [<9000000000224528>] show_stack+0x48/0x1a0 [<900000000179afc8>] dump_stack_lvl+0x78/0xa0 [<9000000000263ed0>] __warn+0x90/0x1a0 [<90000000017419b8>] report_bug+0x1b8/0x280 [<900000000179c564>] do_bp+0x264/0x420 [<90000000004c302c>] __static_key_slow_dec_cpuslocked+0xec/0x100 [<90000000002b4d7c>] sched_cpu_deactivate+0x2fc/0x300 [<9000000000266498>] cpuhp_invoke_callback+0x178/0x8a0 [<9000000000267f70>] cpuhp_thread_fun+0xf0/0x240 [<90000000002a117c>] smpboot_thread_fn+0x1dc/0x2e0 [<900000000029a720>] kthread+0x140/0x160 [<9000000000222288>] ret_from_kernel_thread+0xc/0xa4 Cc: stable@vger.kernel.org Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-02-23LoongArch: Disable IRQ before init_fn() for nonboot CPUsHuacai Chen
Disable IRQ before init_fn() for nonboot CPUs when hotplug, in order to silence such warnings (and also avoid potential errors due to unexpected interrupts): WARNING: CPU: 1 PID: 0 at kernel/rcu/tree.c:4503 rcu_cpu_starting+0x214/0x280 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.6.17+ #1198 pc 90000000048e3334 ra 90000000047bd56c tp 900000010039c000 sp 900000010039fdd0 a0 0000000000000001 a1 0000000000000006 a2 900000000802c040 a3 0000000000000000 a4 0000000000000001 a5 0000000000000004 a6 0000000000000000 a7 90000000048e3f4c t0 0000000000000001 t1 9000000005c70968 t2 0000000004000000 t3 000000000005e56e t4 00000000000002e4 t5 0000000000001000 t6 ffffffff80000000 t7 0000000000040000 t8 9000000007931638 u0 0000000000000006 s9 0000000000000004 s0 0000000000000001 s1 9000000006356ac0 s2 9000000007244000 s3 0000000000000001 s4 0000000000000001 s5 900000000636f000 s6 7fffffffffffffff s7 9000000002123940 s8 9000000001ca55f8 ra: 90000000047bd56c tlb_init+0x24c/0x528 ERA: 90000000048e3334 rcu_cpu_starting+0x214/0x280 CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE) PRMD: 00000000 (PPLV0 -PIE -PWE) EUEN: 00000000 (-FPE -SXE -ASXE -BTE) ECFG: 00071000 (LIE=12 VS=7) ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0) PRID: 0014c010 (Loongson-64bit, Loongson-3A5000) CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.6.17+ #1198 Stack : 0000000000000000 9000000006375000 9000000005b61878 900000010039c000 900000010039fa30 0000000000000000 900000010039fa38 900000000619a140 9000000006456888 9000000006456880 900000010039f950 0000000000000001 0000000000000001 cb0cb028ec7e52e1 0000000002b90000 9000000100348700 0000000000000000 0000000000000001 ffffffff916d12f1 0000000000000003 0000000000040000 9000000007930370 0000000002b90000 0000000000000004 9000000006366000 900000000619a140 0000000000000000 0000000000000004 0000000000000000 0000000000000009 ffffffffffc681f2 9000000002123940 9000000001ca55f8 9000000006366000 90000000047a4828 00007ffff057ded8 00000000000000b0 0000000000000000 
0000000000000000 0000000000071000 ... Call Trace: [<90000000047a4828>] show_stack+0x48/0x1a0 [<9000000005b61874>] dump_stack_lvl+0x84/0xcc [<90000000047f60ac>] __warn+0x8c/0x1e0 [<9000000005b0ab34>] report_bug+0x1b4/0x280 [<9000000005b63110>] do_bp+0x2d0/0x480 [<90000000047a2e20>] handle_bp+0x120/0x1c0 [<90000000048e3334>] rcu_cpu_starting+0x214/0x280 [<90000000047bd568>] tlb_init+0x248/0x528 [<90000000047a4c44>] per_cpu_trap_init+0x124/0x160 [<90000000047a19f4>] cpu_probe+0x494/0xa00 [<90000000047b551c>] start_secondary+0x3c/0xc0 [<9000000005b66134>] smpboot_entry+0x50/0x58 Cc: stable@vger.kernel.org Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-02-23powerpc/rtas: use correct function name for resetting TCE tablesNathan Lynch
The PAPR spec spells the function name as "ibm,reset-pe-dma-windows" but in practice firmware uses the singular form: "ibm,reset-pe-dma-window" in the device tree. Since we have the wrong spelling in the RTAS function table, reverse lookups (token -> name) fail and warn: unexpected failed lookup for token 86 WARNING: CPU: 1 PID: 545 at arch/powerpc/kernel/rtas.c:659 __do_enter_rtas_trace+0x2a4/0x2b4 CPU: 1 PID: 545 Comm: systemd-udevd Not tainted 6.8.0-rc4 #30 Hardware name: IBM,9105-22A POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1060.00 (NL1060_028) hv:phyp pSeries NIP [c0000000000417f0] __do_enter_rtas_trace+0x2a4/0x2b4 LR [c0000000000417ec] __do_enter_rtas_trace+0x2a0/0x2b4 Call Trace: __do_enter_rtas_trace+0x2a0/0x2b4 (unreliable) rtas_call+0x1f8/0x3e0 enable_ddw.constprop.0+0x4d0/0xc84 dma_iommu_dma_supported+0xe8/0x24c dma_set_mask+0x5c/0xd8 mlx5_pci_init.constprop.0+0xf0/0x46c [mlx5_core] probe_one+0xfc/0x32c [mlx5_core] local_pci_probe+0x68/0x12c pci_call_probe+0x68/0x1ec pci_device_probe+0xbc/0x1a8 really_probe+0x104/0x570 __driver_probe_device+0xb8/0x224 driver_probe_device+0x54/0x130 __driver_attach+0x158/0x2b0 bus_for_each_dev+0xa8/0x120 driver_attach+0x34/0x48 bus_add_driver+0x174/0x304 driver_register+0x8c/0x1c4 __pci_register_driver+0x68/0x7c mlx5_init+0xb8/0x118 [mlx5_core] do_one_initcall+0x60/0x388 do_init_module+0x7c/0x2a4 init_module_from_file+0xb4/0x108 idempotent_init_module+0x184/0x34c sys_finit_module+0x90/0x114 And oopses are possible when lockdep is enabled or the RTAS tracepoints are active, since those paths dereference the result of the lookup. Use the correct spelling to match firmware's behavior, adjusting the related constants to match. 
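The failure mode can be modelled with a small name/token table. This is a simplified sketch with hypothetical types and helpers, not the kernel's RTAS function table: tokens are recorded by matching the name firmware advertises, so a misspelled table entry never receives its token and the reverse lookup returns nothing.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: function name plus the firmware token. */
struct demo_rtas_function {
    const char *name;
    int token;          /* -1 until firmware advertises one */
};

/* Record a token for the name firmware uses; returns 1 if the name matched. */
static int demo_set_token(struct demo_rtas_function *tbl, size_t n,
                          const char *fw_name, int token)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(tbl[i].name, fw_name) == 0) {
            tbl[i].token = token;
            return 1;
        }
    }
    return 0;
}

/* Reverse lookup (token -> name); returns NULL when no entry holds the token. */
static const char *demo_token_to_name(const struct demo_rtas_function *tbl,
                                      size_t n, int token)
{
    for (size_t i = 0; i < n; i++) {
        if (tbl[i].token == token)
            return tbl[i].name;
    }
    return NULL;
}
```

A caller that dereferences the result of the reverse lookup without checking for NULL is where an oops could follow, which matches the lockdep and tracepoint paths mentioned in the commit message.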
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Fixes: 8252b88294d2 ("powerpc/rtas: improve function information lookups") Reported-by: Gaurav Batra <gbatra@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240222-rtas-fix-ibm-reset-pe-dma-window-v1-1-7aaf235ac63c@linux.ibm.com
2024-02-23powerpc/pseries/iommu: IOMMU table is not initialized for kdump over SR-IOVGaurav Batra
When kdump kernel tries to copy dump data over SR-IOV, LPAR panics due to NULL pointer exception: Kernel attempted to read user page (0) - exploit attempt? (uid: 0) BUG: Kernel NULL pointer dereference on read at 0x00000000 Faulting instruction address: 0xc000000020847ad4 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries Modules linked in: mlx5_core(+) vmx_crypto pseries_wdt papr_scm libnvdimm mlxfw tls psample sunrpc fuse overlay squashfs loop CPU: 12 PID: 315 Comm: systemd-udevd Not tainted 6.4.0-Test102+ #12 Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1060.00 (NH1060_008) hv:phyp pSeries NIP: c000000020847ad4 LR: c00000002083b2dc CTR: 00000000006cd18c REGS: c000000029162ca0 TRAP: 0300 Not tainted (6.4.0-Test102+) MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 48288244 XER: 00000008 CFAR: c00000002083b2d8 DAR: 0000000000000000 DSISR: 40000000 IRQMASK: 1 ... NIP _find_next_zero_bit+0x24/0x110 LR bitmap_find_next_zero_area_off+0x5c/0xe0 Call Trace: dev_printk_emit+0x38/0x48 (unreliable) iommu_area_alloc+0xc4/0x180 iommu_range_alloc+0x1e8/0x580 iommu_alloc+0x60/0x130 iommu_alloc_coherent+0x158/0x2b0 dma_iommu_alloc_coherent+0x3c/0x50 dma_alloc_attrs+0x170/0x1f0 mlx5_cmd_init+0xc0/0x760 [mlx5_core] mlx5_function_setup+0xf0/0x510 [mlx5_core] mlx5_init_one+0x84/0x210 [mlx5_core] probe_one+0x118/0x2c0 [mlx5_core] local_pci_probe+0x68/0x110 pci_call_probe+0x68/0x200 pci_device_probe+0xbc/0x1a0 really_probe+0x104/0x540 __driver_probe_device+0xb4/0x230 driver_probe_device+0x54/0x130 __driver_attach+0x158/0x2b0 bus_for_each_dev+0xa8/0x130 driver_attach+0x34/0x50 bus_add_driver+0x16c/0x300 driver_register+0xa4/0x1b0 __pci_register_driver+0x68/0x80 mlx5_init+0xb8/0x100 [mlx5_core] do_one_initcall+0x60/0x300 do_init_module+0x7c/0x2b0 At the time of LPAR dump, before kexec hands over control to kdump kernel, DDWs (Dynamic DMA Windows) are scanned and added to the FDT. 
For the SR-IOV case, default DMA window "ibm,dma-window" is removed from the FDT and DDW added, for the device. Now, kexec hands over control to the kdump kernel. When the kdump kernel initializes, PCI busses are scanned and IOMMU group/tables created, in pci_dma_bus_setup_pSeriesLP(). For the SR-IOV case, there is no "ibm,dma-window". The original commit: b1fc44eaa9ba, fixes the path where memory is pre-mapped (direct mapped) to the DDW. When TCEs are direct mapped, there is no need to initialize IOMMU tables. iommu_table_setparms_lpar() only considers "ibm,dma-window" property when initializing IOMMU table. In the scenario where TCEs are dynamically allocated for SR-IOV, newly created IOMMU table is not initialized. Later, when the device driver tries to enter TCEs for the SR-IOV device, NULL pointer exception is thrown from iommu_area_alloc(). The fix is to initialize the IOMMU table with DDW property stored in the FDT. There are 2 points to remember: 1. For the dedicated adapter, kdump kernel would encounter both default and DDW in FDT. In this case, DDW property is used to initialize the IOMMU table. 2. A DDW could be direct or dynamic mapped. kdump kernel would initialize IOMMU table and mark the existing DDW as "dynamic". This works fine since, at the time of table initialization, iommu_table_clear() makes some space in the DDW, for some predefined number of TCEs which are needed for kdump to succeed. Fixes: b1fc44eaa9ba ("pseries/iommu/ddw: Fix kdump to work in absence of ibm,dma-window") Signed-off-by: Gaurav Batra <gbatra@linux.vnet.ibm.com> Reviewed-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240125203017.61014-1-gbatra@linux.ibm.com
2024-02-22KVM: x86/mmu: Restrict KVM_SW_PROTECTED_VM to the TDP MMUSean Christopherson
Advertise and support software-protected VMs if and only if the TDP MMU is enabled, i.e. disallow KVM_SW_PROTECTED_VM if TDP is enabled for KVM's legacy/shadow MMU. TDP support for the shadow MMU is maintenance-only, e.g. support for TDX and SNP will also be restricted to the TDP MMU. Fixes: 89ea60c2c7b5 ("KVM: x86: Add support for "protected VMs" that can utilize private memory") Link: https://lore.kernel.org/r/20240222190612.2942589-4-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-02-22KVM: x86: Update KVM_SW_PROTECTED_VM docs to make it clear they're a WIPSean Christopherson
Rewrite the help message for KVM_SW_PROTECTED_VM to make it clear that software-protected VMs are a development and testing vehicle for guest_memfd(), and that attempting to use KVM_SW_PROTECTED_VM for anything remotely resembling a "real" VM will fail. E.g. any memory accesses from KVM will incorrectly access shared memory, nested TDP is wildly broken, and so on and so forth. Update KVM's API documentation with similar warnings to discourage anyone from attempting to run anything but selftests with KVM_X86_SW_PROTECTED_VM. Fixes: 89ea60c2c7b5 ("KVM: x86: Add support for "protected VMs" that can utilize private memory") Link: https://lore.kernel.org/r/20240222190612.2942589-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-02-23Merge tag 'drm-misc-fixes-2024-02-22' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes A list handling fix and 64bit division on 32bit platform fix for the drm/buddy allocator, a cast warning and an initialization fix for nouveau, a bridge handling fix for meson, an initialisation fix for ivpu, a SPARC build fix for fbdev, a double-free fix for ttm, and two fence handling fixes for syncobj. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/gl2antuifidtzn3dfm426p7xwh5fxj23behagwh26owfnosh2w@gqoa7vj5prnh
2024-02-22riscv: Fix build error if !CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATIONAlexandre Ghiti
The new riscv specific arch_hugetlb_migration_supported() must be guarded with a #ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION to avoid the following build error: In file included from include/linux/hugetlb.h:851, from kernel/fork.c:52: >> arch/riscv/include/asm/hugetlb.h:15:42: error: static declaration of 'arch_hugetlb_migration_supported' follows non-static declaration 15 | #define arch_hugetlb_migration_supported arch_hugetlb_migration_supported | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/hugetlb.h:916:20: note: in expansion of macro 'arch_hugetlb_migration_supported' 916 | static inline bool arch_hugetlb_migration_supported(struct hstate *h) | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ arch/riscv/include/asm/hugetlb.h:14:6: note: previous declaration of 'arch_hugetlb_migration_supported' with type 'bool(struct hstate *)' {aka '_Bool(struct hstate *)'} 14 | bool arch_hugetlb_migration_supported(struct hstate *h); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202402110258.CV51JlEI-lkp@intel.com/ Fixes: ce68c035457b ("riscv: Fix arch_hugetlb_migration_supported() for NAPOT") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240211083640.756583-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-22riscv: mm: fix NOCACHE_THEAD does not set bit[61] correctlyYangyu Chen
Commit dbfbda3bd6bf ("riscv: mm: update T-Head memory type definitions"), from patch [1], dropped a `<` from a bit-shift operator. As a result, bit(61) is not set in _PAGE_NOCACHE_THEAD and bit(0) is set instead. Fix this by restoring the shift. Link: https://lore.kernel.org/linux-riscv/20230912072510.2510-1-jszhang@kernel.org/ [1] Fixes: dbfbda3bd6bf ("riscv: mm: update T-Head memory type definitions") Signed-off-by: Yangyu Chen <cyy@cyyself.name> Reviewed-by: Guo Ren <guoren@kernel.org> Reviewed-by: Jisheng Zhang <jszhang@kernel.org> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/tencent_E19FA1A095768063102E654C6FC858A32F06@qq.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
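The effect of the missing `<` is easy to reproduce in isolation. In the sketch below the macro names are hypothetical and only the single affected bit is modelled: `1 < 61` is a comparison that evaluates to 1 (bit 0), whereas `1 << 61` is the intended shift.

```c
#include <stdbool.h>
#include <stdint.h>

/* Buggy: a comparison, not a shift. It evaluates to 1, i.e. bit 0. */
#define DEMO_BIT61_BUGGY ((uint64_t)(UINT64_C(1) < 61))

/* Fixed: a left shift that sets bit 61. */
#define DEMO_BIT61_FIXED (UINT64_C(1) << 61)

/* Returns true when the given bit (0..63) is set in value. */
static bool demo_bit_is_set(uint64_t value, unsigned int bit)
{
    return (value >> bit) & 1;
}
```

Because both forms are valid C, the mistake compiles silently and only shows up as a wrong page attribute value.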
2024-02-22riscv: add CALLER_ADDRx supportZong Li
CALLER_ADDRx returns the caller's address at the specified level and is used by several tracers. These macros eventually use __builtin_return_address(n) to get the caller's address if the arch doesn't define its own implementation. On RISC-V, __builtin_return_address(n) only works when n == 0, so we need to walk the stack frames to get the caller's address at the specified level. data.level starts from 'level + 3' because of the call flow used to get the caller's address in the RISC-V implementation. Without the three additional iterations, the levels would correspond to the following frames: callsite (level 3) -> return_address (level 2) -> arch_stack_walk (level 1) -> walk_stackframe (level 0). Fixes: 10626c32e382 ("riscv/ftrace: Add basic support") Cc: stable@vger.kernel.org Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Zong Li <zong.li@sifive.com> Link: https://lore.kernel.org/r/20240202015102.26251-1-zong.li@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
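The 'level + 3' offset can be illustrated with a simulated stack walk. This sketch does not unwind a real stack: the frames are modelled as an array of program counters, innermost first, and every demo_ name is hypothetical. The callback counts down from the starting level and records the first address seen once the count reaches zero.

```c
#include <stdbool.h>
#include <stddef.h>

/* Walk state: frames still to skip, and the address found (0 if none). */
struct demo_ra_data {
    unsigned int level;
    unsigned long addr;
};

/* Per-frame callback: returns false to stop the walk once the target is hit. */
static bool demo_save_return_addr(void *cookie, unsigned long pc)
{
    struct demo_ra_data *data = cookie;

    if (data->level == 0) {
        data->addr = pc;
        return false;
    }
    data->level--;
    return true;
}

/* Simulated stack walker: visits pcs[0..n-1], innermost frame first. */
static void demo_walk(const unsigned long *pcs, size_t n,
                      bool (*fn)(void *, unsigned long), void *cookie)
{
    for (size_t i = 0; i < n; i++) {
        if (!fn(cookie, pcs[i]))
            break;
    }
}

/* Skip the three innermost frames, then return the address at 'level'. */
static unsigned long demo_return_address(const unsigned long *pcs, size_t n,
                                         unsigned int level)
{
    struct demo_ra_data data = { .level = level + 3, .addr = 0 };

    demo_walk(pcs, n, demo_save_return_addr, &data);
    return data.addr;
}
```

With five simulated frames, level 0 resolves to the fourth entry and level 1 to the fifth; a level deeper than the recorded frames yields 0.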