summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-08-10riscv: Implement flush_cache_vmap()Alexandre Ghiti
The RISC-V kernel needs a sfence.vma after a page table modification: we used to rely on the vmalloc fault handling to emit an sfence.vma, but commit 7d3332be011e ("riscv: mm: Pre-allocate PGD entries for vmalloc/modules area") got rid of this path for 64-bit kernels, so now we need to explicitly emit a sfence.vma in flush_cache_vmap(). Note that we don't need to implement flush_cache_vunmap() as the generic code should emit a flush tlb after unmapping a vmalloc region. Fixes: 7d3332be011e ("riscv: mm: Pre-allocate PGD entries for vmalloc/modules area") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20230725132246.817726-1-alexghiti@rivosinc.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-08-10riscv: Do not allow vmap pud mappings for 3-level page tableAlexandre Ghiti
The vmalloc_fault() path was removed and to avoid syncing the vmalloc PGD mappings, they are now preallocated. But if the kernel can use a PUD mapping (which in sv39 is actually a PGD mapping) for large vmalloc allocation, it will free the current unused preallocated PGD mapping and install a new leaf one. Since there is no sync anymore, some page tables lack this new mapping and that triggers a panic. So only allow PUD mappings for sv48 and sv57. Fixes: 7d3332be011e ("riscv: mm: Pre-allocate PGD entries for vmalloc/modules area") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20230808130709.1502614-1-alexghiti@rivosinc.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-08-08riscv: mm: fix 2 instances of -Wmissing-variable-declarationsNick Desaulniers
I'm looking to enable -Wmissing-variable-declarations behind W=1. 0day bot spotted the following instance in ARCH=riscv builds: arch/riscv/mm/init.c:276:7: warning: no previous extern declaration for non-static variable 'trampoline_pg_dir' [-Wmissing-variable-declarations] 276 | pgd_t trampoline_pg_dir[PTRS_PER_PGD] __page_aligned_bss; | ^ arch/riscv/mm/init.c:276:1: note: declare 'static' if the variable is not intended to be used outside of this translation unit 276 | pgd_t trampoline_pg_dir[PTRS_PER_PGD] __page_aligned_bss; | ^ arch/riscv/mm/init.c:279:7: warning: no previous extern declaration for non-static variable 'early_pg_dir' [-Wmissing-variable-declarations] 279 | pgd_t early_pg_dir[PTRS_PER_PGD] __initdata __aligned(PAGE_SIZE); | ^ arch/riscv/mm/init.c:279:1: note: declare 'static' if the variable is not intended to be used outside of this translation unit 279 | pgd_t early_pg_dir[PTRS_PER_PGD] __initdata __aligned(PAGE_SIZE); | ^ These symbols are referenced by more than one translation unit, so make sure they're both declared and include the correct header for their declarations. Finally, sort the list of includes to help keep them tidy. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/llvm/202308081000.tTL1ElTr-lkp@intel.com/ Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://lore.kernel.org/r/20230808-riscv_static-v2-1-2a1e2d2c7a4f@google.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-08-08riscv,mmio: Fix readX()-to-delay() orderingAndrea Parri
Section 2.1 of the Platform Specification [1] states: Unless otherwise specified by a given I/O device, I/O devices are on ordering channel 0 (i.e., they are point-to-point strongly ordered). which is not sufficient to guarantee that a readX() by a hart completes before a subsequent delay() on the same hart (cf. memory-barriers.txt, "Kernel I/O barrier effects"). Set the I(nput) bit in __io_ar() to restore the ordering, align inline comments. [1] https://github.com/riscv/riscv-platform-specs Signed-off-by: Andrea Parri <parri.andrea@gmail.com> Link: https://lore.kernel.org/r/20230803042738.5937-1-parri.andrea@gmail.com Fixes: fab957c11efe ("RISC-V: Atomic and Locking Code") Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-08-08riscv: Fix CPU feature detection with SMP disabledSamuel Holland
commit 914d6f44fc50 ("RISC-V: only iterate over possible CPUs in ISA string parser") changed riscv_fill_hwcap() from iterating over CPU DT nodes to iterating over logical CPU IDs. Since this function runs long before cpu_dev_init() creates CPU devices, it hits the fallback path in of_cpu_device_node_get(), which itself iterates over the DT nodes, searching for a node with the requested CPU ID. (Incidentally, this makes riscv_fill_hwcap() now take quadratic time.) riscv_fill_hwcap() passes a logical CPU ID to of_cpu_device_node_get(), which uses the arch_match_cpu_phys_id() hook to translate the logical ID to a physical ID as found in the DT. arch_match_cpu_phys_id() has a generic weak definition, and RISC-V provides a strong definition using cpuid_to_hartid_map(). However, the RISC-V specific implementation is located in arch/riscv/kernel/smp.c, and that file is only compiled when SMP is enabled. As a result, when SMP is disabled, the generic definition is used, and riscv_isa gets initialized based on the ISA string of hart 0, not the boot hart. On FU740, this means has_fpu() returns false, and userspace crashes when trying to use floating-point instructions. Fix this by moving arch_match_cpu_phys_id() to a file which is always compiled. Fixes: 70114560b285 ("RISC-V: Add RISC-V specific arch_match_cpu_phys_id") Fixes: 914d6f44fc50 ("RISC-V: only iterate over possible CPUs in ISA string parser") Reported-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230803012608.3540081-1-samuel.holland@sifive.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-08-04riscv: Start of DRAM should at least be aligned on PMD size for the direct ↵Alexandre Ghiti
mapping So that we do not end up mapping the whole linear mapping using 4K pages, which is slow at boot time, and also very likely at runtime. So make sure we align the start of DRAM on a PMD boundary. Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reported-by: Song Shuai <suagrfillet@gmail.com> Fixes: 3335068f8721 ("riscv: Use PUD/P4D/PGD pages for the linear mapping") Tested-by: Song Shuai <suagrfillet@gmail.com> Link: https://lore.kernel.org/r/20230704121837.248976-1-alexghiti@rivosinc.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-08-04Merge patch series "RISC-V: Fix a few kexec_file_load(2) failures"Palmer Dabbelt
Petr Tesarik <petrtesarik@huaweicloud.com> says: From: Petr Tesarik <petr.tesarik.ext@huawei.com> The kexec_file_load(2) syscall does not work at least in some kernel builds. For details see the relevant section in this blog post: https://sigillatum.tesarici.cz/2023-07-21-state-of-riscv64-kdump.html This patch series handles an additional relocation types, removes the need to implement a Global Offset Table (GOT) for the purgatory and fixes the placement of initrd. * b4-shazam-merge: riscv/kexec: load initrd high in available memory riscv/kexec: handle R_RISCV_CALL_PLT relocation type Link: https://lore.kernel.org/r/cover.1690365011.git.petr.tesarik.ext@huawei.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-08-04riscv/kexec: load initrd high in available memoryTorsten Duwe
When initrd is loaded low, the secondary kernel fails like this: INITRD: 0xdc581000+0x00eef000 overlaps in-use memory region This initrd load address corresponds to the _end symbol, but the reservation is aligned on PMD_SIZE, as explained by a comment in setup_bootmem(). It is technically possible to align the initrd load address accordingly, leaving a hole between the end of kernel and the initrd, but it is much simpler to allocate the initrd top-down. Fixes: 838b3e28488f ("RISC-V: Load purgatory in kexec_file") Signed-off-by: Torsten Duwe <duwe@suse.de> Signed-off-by: Petr Tesarik <petr.tesarik.ext@huawei.com> Cc: stable@vger.kernel.org Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/all/67c8eb9eea25717c2c8208d9bfbfaa39e6e2a1c6.1690365011.git.petr.tesarik.ext@huawei.com/ Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-08-04riscv/kexec: handle R_RISCV_CALL_PLT relocation typeTorsten Duwe
R_RISCV_CALL has been deprecated and replaced by R_RISCV_CALL_PLT. See Enum 18-19 in Table 3. Relocation types here: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-elf.adoc It was deprecated in ("Deprecated R_RISCV_CALL, prefer R_RISCV_CALL_PLT"): https://github.com/riscv-non-isa/riscv-elf-psabi-doc/commit/a0dced85018d7a0ec17023c9389cbd70b1dbc1b0 Recent tools (at least GNU binutils-2.40) already use R_RISCV_CALL_PLT. Kernels built with such binutils fail kexec_load_file(2) with: kexec_image: Unknown rela relocation: 19 kexec_image: Error loading purgatory ret=-8 The binary code at the call site remains the same, so tell arch_kexec_apply_relocations_add() to handle _PLT alike. Fixes: 838b3e28488f ("RISC-V: Load purgatory in kexec_file") Signed-off-by: Torsten Duwe <duwe@suse.de> Signed-off-by: Petr Tesarik <petr.tesarik.ext@huawei.com> Cc: Li Zhengyu <lizhengyu3@huawei.com> Cc: stable@vger.kernel.org Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/all/b046b164af8efd33bbdb7d4003273bdf9196a5b0.1690365011.git.petr.tesarik.ext@huawei.com/ Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-08-02Documentation: kdump: Add va_kernel_pa_offset for RISCV64Song Shuai
RISC-V Linux exports "va_kernel_pa_offset" in vmcoreinfo to help Crash-utility translate the kernel virtual address correctly. Here adds the definition of "va_kernel_pa_offset". Fixes: 3335068f8721 ("riscv: Use PUD/P4D/PGD pages for the linear mapping") Link: https://lore.kernel.org/linux-riscv/20230724040649.220279-1-suagrfillet@gmail.com/ Signed-off-by: Song Shuai <suagrfillet@gmail.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20230724100917.309061-2-suagrfillet@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-08-02riscv: Export va_kernel_pa_offset in vmcoreinfoSong Shuai
Since RISC-V Linux v6.4, the commit 3335068f8721 ("riscv: Use PUD/P4D/PGD pages for the linear mapping") changes phys_ram_base from the physical start of the kernel to the actual start of the DRAM. The Crash-utility's VTOP() still uses phys_ram_base and kernel_map.virt_addr to translate kernel virtual address, that failed the Crash with Linux v6.4 [1]. Export kernel_map.va_kernel_pa_offset in vmcoreinfo to help Crash translate the kernel virtual address correctly. Fixes: 3335068f8721 ("riscv: Use PUD/P4D/PGD pages for the linear mapping") Link: https://lore.kernel.org/linux-riscv/20230724040649.220279-1-suagrfillet@gmail.com/ [1] Signed-off-by: Song Shuai <suagrfillet@gmail.com> Reviewed-by: Xianting Tian  <xianting.tian@linux.alibaba.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20230724100917.309061-1-suagrfillet@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-08-02RISC-V: ACPI: Fix acpi_os_ioremap to return iomem addressSunil V L
acpi_os_ioremap() currently is a wrapper to memremap() on RISC-V. But the callers of acpi_os_ioremap() expect it to return __iomem address and hence sparse tool reports a new warning. Fix this issue by type casting to __iomem type. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202307230357.egcTAefj-lkp@intel.com/ Fixes: a91a9ffbd3a5 ("RISC-V: Add support to build the ACPI core") Signed-off-by: Sunil V L <sunilvl@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230724100346.1302937-1-sunilvl@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-08-02selftests: riscv: Fix compilation error with vstate_exec_nolibc.cAlexandre Ghiti
The following error happens: In file included from vstate_exec_nolibc.c:2: /usr/include/riscv64-linux-gnu/sys/prctl.h:42:12: error: conflicting types for ‘prctl’; h ave ‘int(int, ...)’ 42 | extern int prctl (int __option, ...) __THROW; | ^~~~~ In file included from ./../../../../include/nolibc/nolibc.h:99, from <command-line>: ./../../../../include/nolibc/sys.h:892:5: note: previous definition of ‘prctl’ with type ‘int(int, long unsigned int, long unsigned int, long unsigned int, long unsigned int) ’ 892 | int prctl(int option, unsigned long arg2, unsigned long arg3, | ^~~~~ Fix this by not including <sys/prctl.h>, which is not needed here since prctl syscall is directly called using its number. Fixes: 7cf6198ce22d ("selftests: Test RISC-V Vector prctl interface") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20230713115829.110421-1-alexghiti@rivosinc.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-08-02selftests/riscv: fix potential build failure during the "emit_tests" stepJohn Hubbard
The riscv selftests (which were modeled after the arm64 selftests) are improperly declaring the "emit_tests" target to depend upon the "all" target. This approach, when combined with commit 9fc96c7c19df ("selftests: error out if kernel header files are not yet built"), has caused build failures [1] on arm64, and is likely to cause similar failures for riscv. To fix this, simply remove the unnecessary "all" dependency from the emit_tests target. The dependency is still effectively honored, because again, invocation is via "install", which also depends upon "all". An alternative approach would be to harden the emit_tests target so that it can depend upon "all", but that's a lot more complicated and hard to get right, and doesn't seem worth it, especially given that emit_tests should probably not be overridden at all. [1] https://lore.kernel.org/20230710-kselftest-fix-arm64-v1-1-48e872844f25@kernel.org Fixes: 9fc96c7c19df ("selftests: error out if kernel header files are not yet built") Signed-off-by: John Hubbard <jhubbard@nvidia.com> Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20230712193514.740033-1-jhubbard@nvidia.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-07-12RISC-V: Don't include Zicsr or Zifencei in I from ACPIPalmer Dabbelt
ACPI ISA strings are based on a specification after Zicsr and Zifencei were split out of I, so we shouldn't be treating them as part of I. We haven't release an ACPI-based kernel yet, so we don't need to worry about compatibility with the old ISA strings. Fixes: 07edc32779e3 ("RISC-V: always report presence of extensions formerly part of the base ISA") Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Sunil V L <sunilvl@ventanamicro.com> Link: https://lore.kernel.org/r/20230711224600.10879-1-palmer@rivosinc.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-07-12riscv: mm: fix truncation warning on RV32Jisheng Zhang
lkp reports below sparse warning when building for RV32: arch/riscv/mm/init.c:1204:48: sparse: warning: cast truncates bits from constant value (100000000 becomes 0) IMO, the reason we didn't see this truncates bug in real world is "0" means MEMBLOCK_ALLOC_ACCESSIBLE in memblock and there's no RV32 HW with more than 4GB memory. Fix it anyway to make sparse happy. Fixes: decf89f86ecd ("riscv: try to allocate crashkern region from 32bit addressible memory") Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202306080034.SLiCiOMn-lkp@intel.com/ Link: https://lore.kernel.org/r/20230709171036.1906-1-jszhang@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-07-12perf: RISC-V: Remove PERF_HES_STOPPED flag checking in riscv_pmu_start()Eric Lin
Since commit 096b52fd2bb4 ("perf: RISC-V: throttle perf events") the perf_sample_event_took() function was added to report time spent in overflow interrupts. If the interrupt takes too long, the perf framework will lower the sysctl_perf_event_sample_rate and max_samples_per_tick. When hwc->interrupts is larger than max_samples_per_tick, the hwc->interrupts will be set to MAX_INTERRUPTS, and events will be throttled within the __perf_event_account_interrupt() function. However, the RISC-V PMU driver doesn't call riscv_pmu_stop() to update the PERF_HES_STOPPED flag after perf_event_overflow() in pmu_sbi_ovf_handler() function to avoid throttling. When the perf framework unthrottled the event in the timer interrupt handler, it triggers riscv_pmu_start() function and causes a WARN_ON_ONCE() warning, as shown below: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 240 at drivers/perf/riscv_pmu.c:184 riscv_pmu_start+0x7c/0x8e Modules linked in: CPU: 0 PID: 240 Comm: ls Not tainted 6.4-rc4-g19d0788e9ef2 #1 Hardware name: SiFive (DT) epc : riscv_pmu_start+0x7c/0x8e ra : riscv_pmu_start+0x28/0x8e epc : ffffffff80aef864 ra : ffffffff80aef810 sp : ffff8f80004db6f0 gp : ffffffff81c83750 tp : ffffaf80069f9bc0 t0 : ffff8f80004db6c0 t1 : 0000000000000000 t2 : 000000000000001f s0 : ffff8f80004db720 s1 : ffffaf8008ca1068 a0 : 0000ffffffffffff a1 : 0000000000000000 a2 : 0000000000000001 a3 : 0000000000000870 a4 : 0000000000000000 a5 : 0000000000000000 a6 : 0000000000000840 a7 : 0000000000000030 s2 : 0000000000000000 s3 : ffffaf8005165800 s4 : ffffaf800424da00 s5 : ffffffffffffffff s6 : ffffffff81cc7590 s7 : 0000000000000000 s8 : 0000000000000006 s9 : 0000000000000001 s10: ffffaf807efbc340 s11: ffffaf807efbbf00 t3 : ffffaf8006a16028 t4 : 00000000dbfbb796 t5 : 0000000700000000 t6 : ffffaf8005269870 status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003 [<ffffffff80aef864>] riscv_pmu_start+0x7c/0x8e [<ffffffff80185b56>] perf_adjust_freq_unthr_context+0x15e/0x174 [<ffffffff80188642>] perf_event_task_tick+0x88/0x9c [<ffffffff800626a8>] scheduler_tick+0xfe/0x27c [<ffffffff800b5640>] update_process_times+0x9a/0xba [<ffffffff800c5bd4>] tick_sched_handle+0x32/0x66 [<ffffffff800c5e0c>] tick_sched_timer+0x64/0xb0 [<ffffffff800b5e50>] __hrtimer_run_queues+0x156/0x2f4 [<ffffffff800b6bdc>] hrtimer_interrupt+0xe2/0x1fe [<ffffffff80acc9e8>] riscv_timer_interrupt+0x38/0x42 [<ffffffff80090a16>] handle_percpu_devid_irq+0x90/0x1d2 [<ffffffff8008a9f4>] generic_handle_domain_irq+0x28/0x36 After referring other PMU drivers like Arm, Loongarch, Csky, and Mips, they don't call *_pmu_stop() to update with PERF_HES_STOPPED flag after perf_event_overflow() function nor do they add PERF_HES_STOPPED flag checking in *_pmu_start() which don't cause this warning. Thus, it's recommended to remove this unnecessary check in riscv_pmu_start() function to prevent this warning. Signed-off-by: Eric Lin <eric.lin@sifive.com> Link: https://lore.kernel.org/r/20230710154328.19574-1-eric.lin@sifive.com Fixes: 096b52fd2bb4 ("perf: RISC-V: throttle perf events") Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-07-11Documentation: RISC-V: hwprobe: Fix a formatting errorPalmer Dabbelt
I'm not sure what I was trying to do with the ':'s, but they're just rendered to HTML which looks odd. This makes "fence.i" look like "mvendorid" and such, which is seems reasonable to me. Reviewed-by: Evan Green <evan@rivosinc.com> Link: https://lore.kernel.org/r/20230710193329.2742-1-palmer@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-07-09Linux 6.5-rc1v6.5-rc1Linus Torvalds
2023-07-09MAINTAINERS 2: Electric BoogalooLinus Torvalds
We just sorted the entries and fields last release, so just out of a perverse sense of curiosity, I decided to see if we can keep things ordered for even just one release. The answer is "No. No we cannot". I suggest that all kernel developers will need weekly training sessions, involving a lot of Big Bird and Sesame Street. And at the yearly maintainer summit, we will all sing the alphabet song together. I doubt I will keep doing this. At some point "perverse sense of curiosity" turns into just a cold dark place filled with sadness and despair. Repeats: 80e62bc8487b ("MAINTAINERS: re-sort all entries and fields") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-07-09Merge tag 'dma-mapping-6.5-2023-07-09' of ↵Linus Torvalds
git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping fixes from Christoph Hellwig: - swiotlb area sizing fixes (Petr Tesarik) * tag 'dma-mapping-6.5-2023-07-09' of git://git.infradead.org/users/hch/dma-mapping: swiotlb: reduce the number of areas to match actual memory pool size swiotlb: always set the number of areas before allocating the pool
2023-07-09Merge tag 'irq_urgent_for_v6.5_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq update from Borislav Petkov: - Optimize IRQ domain's name assignment * tag 'irq_urgent_for_v6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqdomain: Use return value of strreplace()
2023-07-09Merge tag 'x86_urgent_for_v6.5_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fpu fix from Borislav Petkov: - Do FPU AP initialization on Xen PV too which got missed by the recent boot reordering work * tag 'x86_urgent_for_v6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/xen: Fix secondary processors' FPU initialization
2023-07-09Merge tag 'x86-core-2023-07-09' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fix from Thomas Gleixner: "A single fix for the mechanism to park CPUs with an INIT IPI. On shutdown or kexec, the kernel tries to park the non-boot CPUs with an INIT IPI. But the same code path is also used by the crash utility. If the CPU which panics is not the boot CPU then it sends an INIT IPI to the boot CPU which resets the machine. Prevent this by validating that the CPU which runs the stop mechanism is the boot CPU. If not, leave the other CPUs in HLT" * tag 'x86-core-2023-07-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/smp: Don't send INIT to boot CPU
2023-07-09Merge tag 'mips_6.5_1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS fixes from Thomas Bogendoerfer: - fixes for KVM - fix for loongson build and cpu probing - DT fixes * tag 'mips_6.5_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MIPS: kvm: Fix build error with KVM_MIPS_DEBUG_COP0_COUNTERS enabled MIPS: dts: add missing space before { MIPS: Loongson: Fix build error when make modules_install MIPS: KVM: Fix NULL pointer dereference MIPS: Loongson: Fix cpu_probe_loongson() again
2023-07-09Merge tag 'xfs-6.5-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull xfs fix from Darrick Wong: "Nothing exciting here, just getting rid of a gcc warning that I got tired of seeing when I turn on gcov" * tag 'xfs-6.5-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: fix uninit warning in xfs_growfs_data
2023-07-09Merge tag '6.5-rc-smb3-client-fixes-part2' of ↵Linus Torvalds
git://git.samba.org/sfrench/cifs-2.6 Pull more smb client updates from Steve French: - fix potential use after free in unmount - minor cleanup - add worker to cleanup stale directory leases * tag '6.5-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6: cifs: Add a laundromat thread for cached directories smb: client: remove redundant pointer 'server' cifs: fix session state transition to avoid use-after-free issue
2023-07-09Merge tag 'ntb-6.5' of https://github.com/jonmason/ntbLinus Torvalds
Pull NTB updates from Jon Mason: "Fixes for pci_clean_master, error handling in driver inits, and various other issues/bugs" * tag 'ntb-6.5' of https://github.com/jonmason/ntb: ntb: hw: amd: Fix debugfs_create_dir error checking ntb.rst: Fix copy and paste error ntb_netdev: Fix module_init problem ntb: intel: Remove redundant pci_clear_master ntb: epf: Remove redundant pci_clear_master ntb_hw_amd: Remove redundant pci_clear_master ntb: idt: drop redundant pci_enable_pcie_error_reporting() MAINTAINERS: git://github -> https://github.com for jonmason NTB: EPF: fix possible memory leak in pci_vntb_probe() NTB: ntb_tool: Add check for devm_kcalloc NTB: ntb_transport: fix possible memory leak while device_register() fails ntb: intel: Fix error handling in intel_ntb_pci_driver_init() NTB: amd: Fix error handling in amd_ntb_pci_driver_init() ntb: idt: Fix error handling in idt_pci_driver_init()
2023-07-08mm: lock newly mapped VMA with corrected orderingHugh Dickins
Lockdep is certainly right to complain about (&vma->vm_lock->lock){++++}-{3:3}, at: vma_start_write+0x2d/0x3f but task is already holding lock: (&mapping->i_mmap_rwsem){+.+.}-{3:3}, at: mmap_region+0x4dc/0x6db Invert those to the usual ordering. Fixes: 33313a747e81 ("mm: lock newly mapped VMA which can be modified after it becomes visible") Cc: stable@vger.kernel.org Signed-off-by: Hugh Dickins <hughd@google.com> Tested-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-07-08Merge tag 'mm-hotfixes-stable-2023-07-08-10-43' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull hotfixes from Andrew Morton: "16 hotfixes. Six are cc:stable and the remainder address post-6.4 issues" The merge undoes the disabling of the CONFIG_PER_VMA_LOCK feature, since it was all hopefully fixed in mainline. * tag 'mm-hotfixes-stable-2023-07-08-10-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: lib: dhry: fix sleeping allocations inside non-preemptable section kasan, slub: fix HW_TAGS zeroing with slub_debug kasan: fix type cast in memory_is_poisoned_n mailmap: add entries for Heiko Stuebner mailmap: update manpage link bootmem: remove the vmemmap pages from kmemleak in free_bootmem_page MAINTAINERS: add linux-next info mailmap: add Markus Schneider-Pargmann writeback: account the number of pages written back mm: call arch_swap_restore() from do_swap_page() squashfs: fix cache race with migration mm/hugetlb.c: fix a bug within a BUG(): inconsistent pte comparison docs: update ocfs2-devel mailing list address MAINTAINERS: update ocfs2-devel mailing list address mm: disable CONFIG_PER_VMA_LOCK until its fixed fork: lock VMAs of the parent process when forking
2023-07-08fork: lock VMAs of the parent process when forkingSuren Baghdasaryan
When forking a child process, the parent write-protects anonymous pages and COW-shares them with the child being forked using copy_present_pte(). We must not take any concurrent page faults on the source vma's as they are being processed, as we expect both the vma and the pte's behind it to be stable. For example, the anon_vma_fork() expects the parents vma->anon_vma to not change during the vma copy. A concurrent page fault on a page newly marked read-only by the page copy might trigger wp_page_copy() and a anon_vma_prepare(vma) on the source vma, defeating the anon_vma_clone() that wasn't done because the parent vma originally didn't have an anon_vma, but we now might end up copying a pte entry for a page that has one. Before the per-vma lock based changes, the mmap_lock guaranteed exclusion with concurrent page faults. But now we need to do a vma_start_write() to make sure no concurrent faults happen on this vma while it is being processed. This fix can potentially regress some fork-heavy workloads. Kernel build time did not show noticeable regression on a 56-core machine while a stress test mapping 10000 VMAs and forking 5000 times in a tight loop shows ~5% regression. If such fork time regression is unacceptable, disabling CONFIG_PER_VMA_LOCK should restore its performance. Further optimizations are possible if this regression proves to be problematic. Suggested-by: David Hildenbrand <david@redhat.com> Reported-by: Jiri Slaby <jirislaby@kernel.org> Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/ Reported-by: Holger Hoffstätte <holger@applied-asynchrony.com> Closes: https://lore.kernel.org/all/b198d649-f4bf-b971-31d0-e8433ec2a34c@applied-asynchrony.com/ Reported-by: Jacob Young <jacobly.alt@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217624 Fixes: 0bff0aaea03e ("x86/mm: try VMA lock-based page fault handling first") Cc: stable@vger.kernel.org Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-07-08mm: lock newly mapped VMA which can be modified after it becomes visibleSuren Baghdasaryan
mmap_region adds a newly created VMA into VMA tree and might modify it afterwards before dropping the mmap_lock. This poses a problem for page faults handled under per-VMA locks because they don't take the mmap_lock and can stumble on this VMA while it's still being modified. Currently this does not pose a problem since post-addition modifications are done only for file-backed VMAs, which are not handled under per-VMA lock. However, once support for handling file-backed page faults with per-VMA locks is added, this will become a race. Fix this by write-locking the VMA before inserting it into the VMA tree. Other places where a new VMA is added into VMA tree do not modify it after the insertion, so do not need the same locking. Cc: stable@vger.kernel.org Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-07-08mm: lock a vma before stack expansionSuren Baghdasaryan
With recent changes necessitating mmap_lock to be held for write while expanding a stack, per-VMA locks should follow the same rules and be write-locked to prevent page faults into the VMA being expanded. Add the necessary locking. Cc: stable@vger.kernel.org Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-07-08Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds
Pull more SCSI updates from James Bottomley: "A few late arriving patches that missed the initial pull request. It's mostly bug fixes (the dt-bindings is a fix for the initial pull)" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: core: Remove unused function declaration scsi: target: docs: Remove tcm_mod_builder.py scsi: target: iblock: Quiet bool conversion warning with pr_preempt use scsi: dt-bindings: ufs: qcom: Fix ICE phandle scsi: core: Simplify scsi_cdl_check_cmd() scsi: isci: Fix comment typo scsi: smartpqi: Replace one-element arrays with flexible-array members scsi: target: tcmu: Replace strlcpy() with strscpy() scsi: ncr53c8xx: Replace strlcpy() with strscpy() scsi: lpfc: Fix lpfc_name struct packing
2023-07-08Merge tag 'i2c-for-6.5-rc1-part2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull more i2c updates from Wolfram Sang: - xiic patch should have been in the original pull but slipped through - mpc patch fixes a build regression - nomadik cleanup * tag 'i2c-for-6.5-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: mpc: Drop unused variable i2c: nomadik: Remove a useless call in the remove function i2c: xiic: Don't try to handle more interrupt events after error
2023-07-08Merge tag 'hardening-v6.5-rc1-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening fixes from Kees Cook: - Check for NULL bdev in LoadPin (Matthias Kaehlcke) - Revert unwanted KUnit FORTIFY build default - Fix 1-element array causing boot warnings with xhci-hub * tag 'hardening-v6.5-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: usb: ch9: Replace bmSublinkSpeedAttr 1-element array with flexible array Revert "fortify: Allow KUnit test to build without FORTIFY" dm: verity-loadpin: Add NULL pointer check for 'bdev' parameter
2023-07-08ntb: hw: amd: Fix debugfs_create_dir error checkingAnup Sharma
The debugfs_create_dir function returns ERR_PTR in case of error, and the only correct way to check if an error occurred is 'IS_ERR' inline function. This patch will replace the null-comparison with IS_ERR. Signed-off-by: Anup Sharma <anupnewsmail@gmail.com> Suggested-by: Ivan Orlov <ivan.orlov0322@gmail.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2023-07-08Merge tag 'perf-tools-for-v6.5-2-2023-07-06' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next Pull more perf tools updates from Namhyung Kim: "These are remaining changes and fixes for this cycle. Build: - Allow generating vmlinux.h from BTF using `make GEN_VMLINUX_H=1` and skip if the vmlinux has no BTF. - Replace deprecated clang -target xxx option by --target=xxx. perf record: - Print event attributes with well known type and config symbols in the debug output like below: # perf record -e cycles,cpu-clock -C0 -vv true <SNIP> ------------------------------------------------------------ perf_event_attr: type 0 (PERF_TYPE_HARDWARE) size 136 config 0 (PERF_COUNT_HW_CPU_CYCLES) { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|CPU|PERIOD|IDENTIFIER read_format ID disabled 1 inherit 1 freq 1 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 5 ------------------------------------------------------------ perf_event_attr: type 1 (PERF_TYPE_SOFTWARE) size 136 config 0 (PERF_COUNT_SW_CPU_CLOCK) { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|CPU|PERIOD|IDENTIFIER read_format ID disabled 1 inherit 1 freq 1 sample_id_all 1 exclude_guest 1 - Update AMD IBS event error message since it now support per-process profiling but no priviledge filters. $ sudo perf record -e ibs_op//k -C 0 Error: AMD IBS doesn't support privilege filtering. Try again without the privilege modifiers (like 'k') at the end. perf lock contention: - Support CSV style output using -x option $ sudo perf lock con -ab -x, sleep 1 # output: contended, total wait, max wait, avg wait, type, caller 19, 194232, 21415, 10222, spinlock, process_one_work+0x1f0 15, 162748, 23843, 10849, rwsem:R, do_user_addr_fault+0x40e 4, 86740, 23415, 21685, rwlock:R, ep_poll_callback+0x2d 1, 84281, 84281, 84281, mutex, iwl_mvm_async_handlers_wk+0x135 8, 67608, 27404, 8451, spinlock, __queue_work+0x174 3, 58616, 31125, 19538, rwsem:W, do_mprotect_pkey+0xff 3, 52953, 21172, 17651, rwlock:W, do_epoll_wait+0x248 2, 30324, 19704, 15162, rwsem:R, do_madvise+0x3ad 1, 24619, 24619, 24619, spinlock, rcu_core+0xd4 - Add --output option to save the data to a file not to be interfered by other debug messages. Test: - Fix event parsing test on ARM where there's no raw PMU nor supports PERF_PMU_CAP_EXTENDED_HW_TYPE. - Update the lock contention test case for CSV output. - Fix a segfault in the daemon command test. Vendor events (JSON): - Add has_event() to check if the given event is available on system at runtime. On Intel machines, some transaction events may not be present when TSC extensions are disabled. - Update Intel event metrics. Misc: - Sort symbols by name using an external array of pointers instead of a rbtree node in the symbol. This will save 16-bytes or 24-bytes per symbol whether the sorting is actually requested or not. - Fix unwinding DWARF callstacks using libdw when --symfs option is used" * tag 'perf-tools-for-v6.5-2-2023-07-06' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next: (38 commits) perf test: Fix event parsing test when PERF_PMU_CAP_EXTENDED_HW_TYPE isn't supported. perf test: Fix event parsing test on Arm perf evsel amd: Fix IBS error message perf: unwind: Fix symfs with libdw perf symbol: Fix uninitialized return value in symbols__find_by_name() perf test: Test perf lock contention CSV output perf lock contention: Add --output option perf lock contention: Add -x option for CSV style output perf lock: Remove stale comments perf vendor events intel: Update tigerlake to 1.13 perf vendor events intel: Update skylakex to 1.31 perf vendor events intel: Update skylake to 57 perf vendor events intel: Update sapphirerapids to 1.14 perf vendor events intel: Update icelakex to 1.21 perf vendor events intel: Update icelake to 1.19 perf vendor events intel: Update cascadelakex to 1.19 perf vendor events intel: Update meteorlake to 1.03 perf vendor events intel: Add rocketlake events/metrics perf vendor metrics intel: Make transaction metrics conditional perf jevents: Support for has_event function ...
2023-07-08Merge tag 'bitmap-6.5-rc1' of https://github.com/norov/linuxLinus Torvalds
Pull bitmap updates from Yury Norov: "Fixes for different bitmap pieces: - lib/test_bitmap: increment failure counter properly The tests that don't use expect_eq() macro to determine that a test is failured must increment failed_tests explicitly. - lib/bitmap: drop optimization of bitmap_{from,to}_arr64 bitmap_{from,to}_arr64() optimization is overly optimistic on 32-bit LE architectures when it's wired to bitmap_copy_clear_tail(). - nodemask: Drop duplicate check in for_each_node_mask() As the return value type of first_node() became unsigned, the node >= 0 became unnecessary. - cpumask: fix function description kernel-doc notation - MAINTAINERS: Add bits.h and bitfield.h to the BITMAP API record Add linux/bits.h and linux/bitfield.h for visibility" * tag 'bitmap-6.5-rc1' of https://github.com/norov/linux: MAINTAINERS: Add bitfield.h to the BITMAP API record MAINTAINERS: Add bits.h to the BITMAP API record cpumask: fix function description kernel-doc notation nodemask: Drop duplicate check in for_each_node_mask() lib/bitmap: drop optimization of bitmap_{from,to}_arr64 lib/test_bitmap: increment failure counter properly
2023-07-08lib: dhry: fix sleeping allocations inside non-preemptable sectionGeert Uytterhoeven
The Smatch static checker reports the following warnings: lib/dhry_run.c:38 dhry_benchmark() warn: sleeping in atomic context lib/dhry_run.c:43 dhry_benchmark() warn: sleeping in atomic context Indeed, dhry() does sleeping allocations inside the non-preemptable section delimited by get_cpu()/put_cpu(). Fix this by using atomic allocations instead. Add error handling, as atomic these allocations may fail. Link: https://lkml.kernel.org/r/bac6d517818a7cd8efe217c1ad649fffab9cc371.1688568764.git.geert+renesas@glider.be Fixes: 13684e966d46283e ("lib: dhry: fix unstable smp_processor_id(_) usage") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/r/0469eb3a-02eb-4b41-b189-de20b931fa56@moroto.mountain Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-07-08kasan, slub: fix HW_TAGS zeroing with slub_debugAndrey Konovalov
Commit 946fa0dbf2d8 ("mm/slub: extend redzone check to extra allocated kmalloc space than requested") added precise kmalloc redzone poisoning to the slub_debug functionality. However, this commit didn't account for HW_TAGS KASAN fully initializing the object via its built-in memory initialization feature. Even though HW_TAGS KASAN memory initialization contains special memory initialization handling for when slub_debug is enabled, it does not account for in-object slub_debug redzones. As a result, HW_TAGS KASAN can overwrite these redzones and cause false-positive slub_debug reports. To fix the issue, avoid HW_TAGS KASAN memory initialization when slub_debug is enabled altogether. Implement this by moving the __slub_debug_enabled check to slab_post_alloc_hook. Common slab code seems like a more appropriate place for a slub_debug check anyway. Link: https://lkml.kernel.org/r/678ac92ab790dba9198f9ca14f405651b97c8502.1688561016.git.andreyknvl@google.com Fixes: 946fa0dbf2d8 ("mm/slub: extend redzone check to extra allocated kmalloc space than requested") Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reported-by: Will Deacon <will@kernel.org> Acked-by: Marco Elver <elver@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Feng Tang <feng.tang@intel.com> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: kasan-dev@googlegroups.com Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Collingbourne <pcc@google.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-07-08kasan: fix type cast in memory_is_poisoned_nAndrey Konovalov
Commit bb6e04a173f0 ("kasan: use internal prototypes matching gcc-13 builtins") introduced a bug into the memory_is_poisoned_n implementation: it effectively removed the cast to a signed integer type after applying KASAN_GRANULE_MASK. As a result, KASAN started failing to properly check memset, memcpy, and other similar functions. Fix the bug by adding the cast back (through an additional signed integer variable to make the code more readable). Link: https://lkml.kernel.org/r/8c9e0251c2b8b81016255709d4ec42942dcaf018.1688431866.git.andreyknvl@google.com Fixes: bb6e04a173f0 ("kasan: use internal prototypes matching gcc-13 builtins") Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Marco Elver <elver@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-07-08mailmap: add entries for Heiko StuebnerHeiko Stuebner
I am going to lose my vrull.eu address at the end of july, and while adding it to mailmap I also realised that there are more old addresses from me dangling, so update .mailmap for all of them. Link: https://lkml.kernel.org/r/20230704163919.1136784-3-heiko@sntech.de Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-07-08mailmap: update manpage linkHeiko Stuebner
Patch series "Update .mailmap for my work address and fix manpage". While updating mailmap for the going-away address, I also found that on current systems the manpage linked from the header comment changed. And in fact it looks like the git mailmap feature got its own manpage. This patch (of 2): On recent systems the git-shortlog manpage only tells people to See gitmailmap(5) So instead of sending people on a scavenger hunt, put that info into the header directly. Though keep the old reference around for older systems. Link: https://lkml.kernel.org/r/20230704163919.1136784-1-heiko@sntech.de Link: https://lkml.kernel.org/r/20230704163919.1136784-2-heiko@sntech.de Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-07-08bootmem: remove the vmemmap pages from kmemleak in free_bootmem_pageLiu Shixin
commit dd0ff4d12dd2 ("bootmem: remove the vmemmap pages from kmemleak in put_page_bootmem") fix an overlaps existing problem of kmemleak. But the problem still existed when HAVE_BOOTMEM_INFO_NODE is disabled, because in this case, free_bootmem_page() will call free_reserved_page() directly. Fix the problem by adding kmemleak_free_part() in free_bootmem_page() when HAVE_BOOTMEM_INFO_NODE is disabled. Link: https://lkml.kernel.org/r/20230704101942.2819426-1-liushixin2@huawei.com Fixes: f41f2ed43ca5 ("mm: hugetlb: free the vmemmap pages associated with each HugeTLB page") Signed-off-by: Liu Shixin <liushixin2@huawei.com> Acked-by: Muchun Song <songmuchun@bytedance.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-07-08MAINTAINERS: add linux-next infoRandy Dunlap
Add linux-next info to MAINTAINERS for ease of finding this data. Link: https://lkml.kernel.org/r/20230704054410.12527-1-rdunlap@infradead.org Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-07-08mailmap: add Markus Schneider-PargmannMarkus Schneider-Pargmann
Add my old mail address and update my name. Link: https://lkml.kernel.org/r/20230628081341.3470229-1-msp@baylibre.com Signed-off-by: Markus Schneider-Pargmann <msp@baylibre.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-07-08writeback: account the number of pages written backMatthew Wilcox (Oracle)
nr_to_write is a count of pages, so we need to decrease it by the number of pages in the folio we just wrote, not by 1. Most callers specify either LONG_MAX or 1, so are unaffected, but writeback_sb_inodes() might end up writing 512x as many pages as it asked for. Dave added: : XFS is the only filesystem this would affect, right? AFAIA, nothing : else enables large folios and uses writeback through : write_cache_pages() at this point... : : In which case, I'd be surprised if much difference, if any, gets : noticed by anyone. Link: https://lkml.kernel.org/r/20230628185548.981888-1-willy@infradead.org Fixes: 793917d997df ("mm/readahead: Add large folio readahead") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.cz> Cc: Dave Chinner <david@fromorbit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-07-08mm: call arch_swap_restore() from do_swap_page()Peter Collingbourne
Commit c145e0b47c77 ("mm: streamline COW logic in do_swap_page()") moved the call to swap_free() before the call to set_pte_at(), which meant that the MTE tags could end up being freed before set_pte_at() had a chance to restore them. Fix it by adding a call to the arch_swap_restore() hook before the call to swap_free(). Link: https://lkml.kernel.org/r/20230523004312.1807357-2-pcc@google.com Link: https://linux-review.googlesource.com/id/I6470efa669e8bd2f841049b8c61020c510678965 Fixes: c145e0b47c77 ("mm: streamline COW logic in do_swap_page()") Signed-off-by: Peter Collingbourne <pcc@google.com> Reported-by: Qun-wei Lin <Qun-wei.Lin@mediatek.com> Closes: https://lore.kernel.org/all/5050805753ac469e8d727c797c2218a9d780d434.camel@mediatek.com/ Acked-by: David Hildenbrand <david@redhat.com> Acked-by: "Huang, Ying" <ying.huang@intel.com> Reviewed-by: Steven Price <steven.price@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: <stable@vger.kernel.org> [6.1+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-07-08squashfs: fix cache race with migrationVincent Whitchurch
Migration replaces the page in the mapping before copying the contents and the flags over from the old page, so check that the page in the page cache is really up to date before using it. Without this, stressing squashfs reads with parallel compaction sometimes results in squashfs reporting data corruption. Link: https://lkml.kernel.org/r/20230629-squashfs-cache-migration-v1-1-d50ebe55099d@axis.com Fixes: e994f5b677ee ("squashfs: cache partial compressed blocks") Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Phillip Lougher <phillip@squashfs.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>