path: root/arch
2023-08-25  Merge branch 'for-next/misc' into for-next/core  [Will Deacon]
* for-next/misc:
    arm64/sysreg: refactor deprecated strncpy
    arm64: sysreg: Generate C compiler warnings on {read,write}_sysreg_s arguments
    arm64: sdei: abort running SDEI handlers during crash
    arm64: Explicitly include correct DT includes
    arm64/Kconfig: Sort the RCpc feature under the ARMv8.3 features menu
    arm64: vdso: remove two .altinstructions related symbols
    arm64/ptrace: Clean up error handling path in sve_set_common()
2023-08-25  Merge branch 'for-next/entry' into for-next/core  [Will Deacon]
* for-next/entry:
    arm64: syscall: unmask DAIF earlier for SVCs
2023-08-25  x86/sev: Make enc_dec_hypercall() accept a size instead of npages  [Steve Rutherford]
enc_dec_hypercall() accepted a page count instead of a size, which forced its callers to round up. As a result, non-page aligned vaddrs caused pages to be spuriously marked as decrypted via the encryption status hypercall, which in turn caused consistent corruption of pages during live migration. Live migration requires accurate encryption status information to avoid migrating pages from the wrong perspective. Fixes: 064ce6c550a0 ("mm: x86: Invoke hypercall when page encryption status is changed") Signed-off-by: Steve Rutherford <srutherford@google.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com> Tested-by: Ben Hillier <bhillier@google.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230824223731.2055016-1-srutherford@google.com
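The bug class is easy to demonstrate with plain arithmetic. Below is a minimal, self-contained user-space C demo (values illustrative, not from the patch) showing why a page count derived from the size alone disagrees with the pages an unaligned range actually spans, which is why the interface now takes a size:

    #include <stdio.h>

    #define PAGE_SIZE 4096UL
    #define PAGE_MASK (~(PAGE_SIZE - 1))

    int main(void)
    {
        /* a 0x200-byte range starting 0x100 bytes before a page boundary */
        unsigned long vaddr = 0x100ff00UL, size = 0x200UL;

        /* old interface: the caller derives a page count from the size alone */
        unsigned long npages = (size + PAGE_SIZE - 1) / PAGE_SIZE;

        /* pages the range actually spans once alignment is considered */
        unsigned long spanned = (((vaddr + size - 1) & PAGE_MASK)
                                 - (vaddr & PAGE_MASK)) / PAGE_SIZE + 1;

        printf("npages from size: %lu, pages actually spanned: %lu\n",
               npages, spanned);   /* prints 1 vs 2 */
        return 0;
    }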
2023-08-25  x86/hyperv: Remove duplicate include  [Jiapeng Chong]
./arch/x86/hyperv/ivm.c: asm/sev.h is included more than once. ./arch/x86/hyperv/ivm.c: asm/coco.h is included more than once. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=6212 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Reviewed-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/20230824080352.98945-1-jiapeng.chong@linux.alibaba.com
2023-08-25  x86/hyperv: Move the code in ivm.c around to avoid unnecessary ifdef's  [Dexuan Cui]
Group the code this way so that we can avoid too many ifdef's:

  - Data only used in an SNP VM with the paravisor
  - Functions only used in an SNP VM with the paravisor
  - Data only used in an SNP VM without the paravisor
  - Functions only used in an SNP VM without the paravisor
  - Functions only used in a TDX VM, with and without the paravisor
  - Functions used in an SNP or TDX VM, when the paravisor is present
  - Functions always used, even in a regular non-CoCo VM

No functional change. Signed-off-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/20230824080712.30327-11-decui@microsoft.com
2023-08-25  x86/hyperv: Remove hv_isolation_type_en_snp  [Dexuan Cui]
In ms_hyperv_init_platform(), do not distinguish between a SNP VM with the paravisor and a SNP VM without the paravisor. Replace hv_isolation_type_en_snp() with !ms_hyperv.paravisor_present && hv_isolation_type_snp(). The hv_isolation_type_en_snp() in drivers/hv/hv.c and drivers/hv/hv_common.c can be changed to hv_isolation_type_snp() since we know !ms_hyperv.paravisor_present is true there. Signed-off-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/20230824080712.30327-10-decui@microsoft.com
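For readability, the replacement predicate can be pictured as a helper; the name below is hypothetical, since the patch open-codes the check at each call site:

    /* Hypothetical wrapper; the patch open-codes this check at call sites. */
    static inline bool hv_snp_without_paravisor(void)
    {
        return !ms_hyperv.paravisor_present && hv_isolation_type_snp();
    }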
2023-08-25  x86/hyperv: Use TDX GHCI to access some MSRs in a TDX VM with the paravisor  [Dexuan Cui]
When the paravisor is present, a SNP VM must use GHCB to access some special MSRs, including HV_X64_MSR_GUEST_OS_ID and some SynIC MSRs. Similarly, when the paravisor is present, a TDX VM must use TDX GHCI to access the same MSRs. Implement hv_tdx_msr_write() and hv_tdx_msr_read(), and use the helper functions hv_ivm_msr_read() and hv_ivm_msr_write() to access the MSRs in a unified way for SNP/TDX VMs with the paravisor. Do not export hv_tdx_msr_write() and hv_tdx_msr_read(), because we never really used hv_ghcb_msr_write() and hv_ghcb_msr_read() in any module. Update arch/x86/include/asm/mshyperv.h so that the kernel can still build if CONFIG_AMD_MEM_ENCRYPT or CONFIG_INTEL_TDX_GUEST is not set, or neither is set. Signed-off-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/20230824080712.30327-9-decui@microsoft.com
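A sketch of how the unified accessor plausibly dispatches, with the shape assumed from the commit text rather than copied from the patch:

    /* Assumed dispatch shape for the unified accessor (sketch only). */
    void hv_ivm_msr_write(u64 msr, u64 value)
    {
        if (!ms_hyperv.paravisor_present)
            return;

        if (hv_isolation_type_tdx())
            hv_tdx_msr_write(msr, value);   /* TDX GHCI path */
        else if (hv_isolation_type_snp())
            hv_ghcb_msr_write(msr, value);  /* SNP GHCB path */
    }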
2023-08-25  Drivers: hv: vmbus: Bring the post_msg_page back for TDX VMs with the paravisor  [Dexuan Cui]
The post_msg_page was removed in commit 9a6b1a170ca8 ("Drivers: hv: vmbus: Remove the per-CPU post_msg_page"). However, it turns out that we need to bring it back, but only for a TDX VM with the paravisor: in such a VM, the hyperv_pcpu_input_arg is not decrypted, but the HVCALL_POST_MESSAGE in such a VM needs a decrypted page as the hypercall input page: see the comments in hyperv_init() for a detailed explanation. Except for HVCALL_POST_MESSAGE and HVCALL_SIGNAL_EVENT, the other hypercalls in a TDX VM with the paravisor still use hv_hypercall_pg and must use the hyperv_pcpu_input_arg (which is encrypted in such a VM), when a hypercall input page is used. Signed-off-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/20230824080712.30327-8-decui@microsoft.com
2023-08-25  x86/hyperv: Introduce a global variable hyperv_paravisor_present  [Dexuan Cui]
The new variable hyperv_paravisor_present is set only when the VM is a SNP/TDX VM with the paravisor running: see ms_hyperv_init_platform(). We introduce hyperv_paravisor_present because we can not use ms_hyperv.paravisor_present in arch/x86/include/asm/mshyperv.h: struct ms_hyperv_info is defined in include/asm-generic/mshyperv.h, which is included at the end of arch/x86/include/asm/mshyperv.h, but at the beginning of arch/x86/include/asm/mshyperv.h, we would already need to use struct ms_hyperv_info in hv_do_hypercall(). We use hyperv_paravisor_present only in include/asm-generic/mshyperv.h, and use ms_hyperv.paravisor_present elsewhere. In the future, we'll introduce a hypercall function structure for different VM types, and at boot time, the right function pointers would be written into the structure so that runtime testing of TDX vs. SNP vs. normal will be avoided and hyperv_paravisor_present will no longer be needed. Call hv_vtom_init() when it's a VBS VM or when ms_hyperv.paravisor_present is true, i.e. the VM is a SNP VM or TDX VM with the paravisor. Enhance hv_vtom_init() for a TDX VM with the paravisor. In hv_common_cpu_init(), don't decrypt the hyperv_pcpu_input_arg for a TDX VM with the paravisor, just like we don't decrypt the page for a SNP VM with the paravisor. Signed-off-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/20230824080712.30327-7-decui@microsoft.com
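A minimal sketch of the include-ordering constraint, assuming the test sits near the top of hv_do_hypercall(); the body here is illustrative, not the patch:

    /* Plain global: usable before struct ms_hyperv_info is defined. */
    extern bool hyperv_paravisor_present;

    static inline u64 hv_do_hypercall(u64 control, void *input, void *output)
    {
        /* Fully enlightened TDX guest: GHCI, no hypercall page. */
        if (hv_isolation_type_tdx() && !hyperv_paravisor_present)
            return hv_tdx_hypercall(control, input, output); /* assumed signature */

        /* ... otherwise the usual hv_hypercall_pg path ... */
        return 0;
    }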
2023-08-25  x86/hyperv: Fix serial console interrupts for fully enlightened TDX guests  [Dexuan Cui]
When a fully enlightened TDX guest runs on Hyper-V, the UEFI firmware sets the HW_REDUCED flag and consequently ttyS0 interrupts can't work. Fix the issue by overriding x86_init.acpi.reduced_hw_early_init(). Reviewed-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/20230824080712.30327-5-decui@microsoft.com
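A hedged sketch of the override; the hook normally points at a routine that skips legacy PIC setup on HW_REDUCED platforms, so an empty function keeps ttyS0 interrupts working. The setup-function name is illustrative:

    /* Empty hook: keep the legacy PIC/IRQ setup despite HW_REDUCED. */
    static void __init reduced_hw_init(void) { }

    static void __init hv_tdx_acpi_fixup(void)   /* hypothetical caller */
    {
        x86_init.acpi.reduced_hw_early_init = reduced_hw_init;
    }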
2023-08-25  Drivers: hv: vmbus: Support fully enlightened TDX guests  [Dexuan Cui]
Add Hyper-V specific code so that a fully enlightened TDX guest (i.e. without the paravisor) can run on Hyper-V:

  - Don't use hv_vp_assist_page; use GHCI instead.
  - Don't try to use the unsupported HV_REGISTER_CRASH_CTL.
  - Don't trust (use) Hyper-V's TLB-flushing hypercalls.
  - Don't use lazy EOI.
  - Share the SynIC Event/Message pages with the hypervisor.
  - Don't use the Hyper-V TSC page for now, because non-trivial work is required to share the page with the hypervisor.

Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/20230824080712.30327-4-decui@microsoft.com
2023-08-25  x86/hyperv: Support hypercalls for fully enlightened TDX guests  [Dexuan Cui]
A fully enlightened TDX guest on Hyper-V (i.e. without the paravisor) only uses the GHCI call rather than hv_hypercall_pg. Do not initialize hypercall_pg for such a guest. In hv_common_cpu_init(), the hyperv_pcpu_input_arg page needs to be decrypted in such a guest. Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/20230824080712.30327-3-decui@microsoft.com
2023-08-25  x86/hyperv: Add hv_isolation_type_tdx() to detect TDX guests  [Dexuan Cui]
No logic change to SNP/VBS guests. hv_isolation_type_tdx() will be used to instruct a TDX guest on Hyper-V to do some TDX-specific operations, e.g. for a fully enlightened TDX guest (i.e. without the paravisor), hv_do_hypercall() should use __tdx_hypercall() and such a guest on Hyper-V should handle the Hyper-V Event/Message/Monitor pages specially. Reviewed-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/20230824080712.30327-2-decui@microsoft.com
2023-08-24  x86/crash: optimize CPU changes  [Eric DeVolder]
crash_prepare_elf64_headers() writes into the elfcorehdr an ELF PT_NOTE for all possible CPUs. As such, subsequent changes to CPUs (ie. hot un/plug, online/offline) do not need to rewrite the elfcorehdr. The kimage->file_mode term covers kdump images loaded via the kexec_file_load() syscall. Since crash_prepare_elf64_headers() wrote the initial elfcorehdr, no update to the elfcorehdr is needed for CPU changes. The kimage->elfcorehdr_updated term covers kdump images loaded via the kexec_load() syscall. At least one memory or CPU change must occur to cause crash_prepare_elf64_headers() to rewrite the elfcorehdr. Afterwards, no update to the elfcorehdr is needed for CPU changes. This code is intentionally *NOT* hoisted into crash_handle_hotplug_event() as it would prevent the arch-specific handler from running for CPU changes. This would break PPC, for example, which needs to update other information besides the elfcorehdr, on CPU changes. Link: https://lkml.kernel.org/r/20230814214446.6659-9-eric.devolder@oracle.com Signed-off-by: Eric DeVolder <eric.devolder@oracle.com> Reviewed-by: Sourabh Jain <sourabhjain@linux.ibm.com> Acked-by: Hari Bathini <hbathini@linux.ibm.com> Acked-by: Baoquan He <bhe@redhat.com> Cc: Akhil Raj <lf32.dev@gmail.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Borislav Petkov (AMD) <bp@alien8.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dave Young <dyoung@redhat.com> Cc: David Hildenbrand <david@redhat.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Mimi Zohar <zohar@linux.ibm.com> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Sean Christopherson <seanjc@google.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Weißschuh <linux@weissschuh.net> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
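The short-circuit can be sketched as follows; the two kimage fields are named in the commit text, while the hp_action constants and the surrounding code are assumed from the hotplug series:

    /* Sketch: skip the elfcorehdr rewrite for CPU-only changes. */
    if ((image->file_mode || image->elfcorehdr_updated) &&
        (hp_action == KEXEC_CRASH_HP_ADD_CPU ||
         hp_action == KEXEC_CRASH_HP_REMOVE_CPU))
        return; /* elfcorehdr already lists all possible CPUs */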
2023-08-24  crash: hotplug support for kexec_load()  [Eric DeVolder]
The hotplug support for kexec_load() requires changes to the userspace kexec-tools and a little extra help from the kernel. Given a kdump capture kernel loaded via kexec_load(), and a subsequent hotplug event, the crash hotplug handler finds the elfcorehdr and rewrites it to reflect the hotplug change. That is the desired outcome; however, at kernel panic time, the purgatory integrity check fails (because the elfcorehdr changed), the capture kernel does not boot, and no vmcore is generated. Therefore, the userspace kexec-tools/kexec must indicate to the kernel that the elfcorehdr can be modified (because the kexec excluded the elfcorehdr from the digest, and sized the elfcorehdr memory buffer appropriately).

To facilitate hotplug support with kexec_load():

  - a new kexec flag KEXEC_UPDATE_ELFCOREHDR indicates that it is safe for the kernel to modify the kexec_load()'d elfcorehdr
  - the /sys/kernel/crash_elfcorehdr_size node communicates the preferred size of the elfcorehdr memory buffer
  - the sysfs crash_hotplug nodes (ie. /sys/devices/system/[cpu|memory]/crash_hotplug) dynamically take into account kexec_file_load() vs kexec_load() and KEXEC_UPDATE_ELFCOREHDR. This is critical so that the udev rule processing of crash_hotplug is all that is needed to determine if the userspace unload-then-load of the kdump image is to be skipped, or not. See the userspace sketch after this section.

The proposed udev rule change looks like:

  # The kernel updates the crash elfcorehdr for CPU and memory changes
  SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
  SUBSYSTEM=="memory", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"

The table below indicates the behavior of kexec_load()'d kdump image updates (with the new udev crash_hotplug rule in place):

         |    Kexec
  Kernel |  Old |  New
  -------+------+------
   Old   |   a  |   a
   New   |   a  |   b

where kexec 'old' and 'new' delineate whether kexec-tools has the needed modifications for the crash hotplug feature, and kernel 'old' and 'new' delineate whether the kernel supports this crash hotplug feature.

Behavior 'a' indicates the unload-then-reload of the entire kdump image. For the kexec 'old' column, the unload-then-reload occurs due to the missing flag KEXEC_UPDATE_ELFCOREHDR. An 'old' kernel (with 'new' kexec) does not present the crash_hotplug sysfs node, which leads to the unload-then-reload of the kdump image. Behavior 'b' indicates the desired optimized behavior of the kernel directly modifying the elfcorehdr and avoiding the unload-then-reload of the kdump image.

If the udev rule is not updated with the crash_hotplug node check, then no matter which combination of kernel and kexec is new or old, the kdump image continues to be unloaded-then-reloaded on hotplug changes.

To fully support the crash hotplug feature, there needs to be a rollout of kernel, kexec-tools and udev rule changes. However, the order of the rollout of these pieces does not matter; kexec_load()'d kdump images still function for hotplug as-is.

Link: https://lkml.kernel.org/r/20230814214446.6659-7-eric.devolder@oracle.com Signed-off-by: Eric DeVolder <eric.devolder@oracle.com> Suggested-by: Hari Bathini <hbathini@linux.ibm.com> Acked-by: Hari Bathini <hbathini@linux.ibm.com> Acked-by: Baoquan He <bhe@redhat.com> Cc: Akhil Raj <lf32.dev@gmail.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Borislav Petkov (AMD) <bp@alien8.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dave Young <dyoung@redhat.com> Cc: David Hildenbrand <david@redhat.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Mimi Zohar <zohar@linux.ibm.com> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Sean Christopherson <seanjc@google.com> Cc: Sourabh Jain <sourabhjain@linux.ibm.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Weißschuh <linux@weissschuh.net> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
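On the userspace side, a kexec-tools-style loader opts in roughly as sketched below; entry, nr_segments and segments come from the loader's usual setup and are not shown:

    #include <linux/kexec.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /* Sketch: pass the new flag so the kernel may rewrite the elfcorehdr. */
    long ret = syscall(SYS_kexec_load, entry, nr_segments, segments,
                       KEXEC_ON_CRASH | KEXEC_UPDATE_ELFCOREHDR);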
2023-08-24  x86/crash: add x86 crash hotplug support  [Eric DeVolder]
When CPU or memory is hot un/plugged, or off/onlined, the crash elfcorehdr, which describes the CPUs and memory in the system, must also be updated. A new elfcorehdr is generated from the available CPUs and memory and replaces the existing elfcorehdr. The segment containing the elfcorehdr is identified at run-time in crash_core:crash_handle_hotplug_event(). No modifications to purgatory (see 'kexec: exclude elfcorehdr from the segment digest') or boot_params (as the elfcorehdr= capture kernel command line parameter pointer remains unchanged and correct) are needed, just elfcorehdr. For kexec_file_load(), the elfcorehdr segment size is based on NR_CPUS and CRASH_MAX_MEMORY_RANGES in order to accommodate a growing number of CPU and memory resources. For kexec_load(), the userspace kexec utility needs to size the elfcorehdr segment in the same/similar manner. To accommodate kexec_load() syscall in the absence of kexec_file_load() syscall support, prepare_elf_headers() and dependents are moved outside of CONFIG_KEXEC_FILE. [eric.devolder@oracle.com: correct unused function build error] Link: https://lkml.kernel.org/r/20230821182644.2143-1-eric.devolder@oracle.com Link: https://lkml.kernel.org/r/20230814214446.6659-6-eric.devolder@oracle.com Signed-off-by: Eric DeVolder <eric.devolder@oracle.com> Reviewed-by: Sourabh Jain <sourabhjain@linux.ibm.com> Acked-by: Hari Bathini <hbathini@linux.ibm.com> Acked-by: Baoquan He <bhe@redhat.com> Cc: Akhil Raj <lf32.dev@gmail.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Borislav Petkov (AMD) <bp@alien8.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dave Young <dyoung@redhat.com> Cc: David Hildenbrand <david@redhat.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Mimi Zohar <zohar@linux.ibm.com> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Sean Christopherson <seanjc@google.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Weißschuh <linux@weissschuh.net> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24  nios2: fix flush_dcache_page() for usage from irq context  [Helge Deller]
Since at least kernel 6.1, flush_dcache_page() is called with IRQs disabled, e.g. from aio_complete(). But the current implementation for flush_dcache_page() on NIOS2 unintentionally re-enables IRQs, which may lead to deadlocks. Fix it by using xa_lock_irqsave() and xa_unlock_irqrestore() for the flush_dcache_mmap_*lock() macros instead. Link: https://lkml.kernel.org/r/ZOTF5WWURQNH9+iw@p100 Signed-off-by: Helge Deller <deller@gmx.de> Cc: Dinh Nguyen <dinguyen@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
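A sketch of the fix as described, with IRQ-state-preserving lock helpers; the macro names follow the commit's intent and may differ from the final patch:

    /* Preserve the caller's IRQ state instead of blindly re-enabling IRQs. */
    #define flush_dcache_mmap_lock_irqsave(mapping, flags) \
        xa_lock_irqsave(&(mapping)->i_pages, flags)
    #define flush_dcache_mmap_unlock_irqrestore(mapping, flags) \
        xa_unlock_irqrestore(&(mapping)->i_pages, flags)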
2023-08-24  mm/swap: stop using page->private on tail pages for THP_SWAP  [David Hildenbrand]
Patch series "mm/swap: stop using page->private on tail pages for THP_SWAP + cleanups". This series stops using page->private on tail pages for THP_SWAP, replaces folio->private by folio->swap for swapcache folios, and starts using "new_folio" for tail pages that we are splitting to remove the usage of page->private for swapcache handling completely. This patch (of 4): Let's stop using page->private on tail pages, making it possible to just unconditionally reuse that field in the tail pages of large folios. The remaining usage of the private field for THP_SWAP is in the THP splitting code (mm/huge_memory.c), that we'll handle separately later. Update the THP_SWAP documentation and sanity checks in mm_types.h and __split_huge_page_tail(). [david@redhat.com: stop using page->private on tail pages for THP_SWAP] Link: https://lkml.kernel.org/r/6f0a82a3-6948-20d9-580b-be1dbf415701@redhat.com Link: https://lkml.kernel.org/r/20230821160849.531668-1-david@redhat.com Link: https://lkml.kernel.org/r/20230821160849.531668-2-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64] Reviewed-by: Yosry Ahmed <yosryahmed@google.com> Cc: Dan Streetman <ddstreet@ieee.org> Cc: Hugh Dickins <hughd@google.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Peter Xu <peterx@redhat.com> Cc: Seth Jennings <sjenning@redhat.com> Cc: Vitaly Wool <vitaly.wool@konsulko.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24  arm64: mm: use ptep_clear() instead of pte_clear() in clear_flush()  [Qi Zheng]
In clear_flush(), the original pte may be a present entry, so we should use ptep_clear() to let page_table_check track the pte clearing operation, otherwise it may cause false positive in subsequent set_pte_at(). Link: https://lkml.kernel.org/r/20230810093241.1181142-1-qi.zheng@linux.dev Fixes: 42b2547137f5 ("arm64/mm: enable ARCH_SUPPORTS_PAGE_TABLE_CHECK") Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Acked-by: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
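A simplified sketch of arm64's clear_flush() after the change (the real function lives in arch/arm64/mm/hugetlbpage.c; details here are abridged):

    static void clear_flush(struct mm_struct *mm, unsigned long addr,
                            pte_t *ptep, unsigned long pgsize,
                            unsigned long ncontig)
    {
        struct vm_area_struct vma = TLB_FLUSH_VMA(mm, 0);
        unsigned long i, saddr = addr;

        for (i = 0; i < ncontig; i++, addr += pgsize, ptep++)
            ptep_clear(mm, addr, ptep); /* was pte_clear(); now visible
                                         * to page_table_check */

        flush_tlb_range(&vma, saddr, addr);
    }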
2023-08-24  mm: rationalise flush_icache_pages() and flush_icache_page()  [Matthew Wilcox (Oracle)]
Move the default (no-op) implementation of flush_icache_pages() to <linux/cacheflush.h> from <asm-generic/cacheflush.h>. Hoist the per-architecture flush_icache_page() wrappers into a single definition in <linux/cacheflush.h>. Link: https://lkml.kernel.org/r/20230802151406.3735276-32-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24  xtensa: implement the new page table range API  [Matthew Wilcox (Oracle)]
Add PFN_PTE_SHIFT, update_mmu_cache_range(), flush_dcache_folio() and flush_icache_pages(). Link: https://lkml.kernel.org/r/20230802151406.3735276-30-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24  x86: implement the new page table range API  [Matthew Wilcox (Oracle)]
Add PFN_PTE_SHIFT and a noop update_mmu_cache_range(). Link: https://lkml.kernel.org/r/20230802151406.3735276-29-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24  um: implement the new page table range API  [Matthew Wilcox (Oracle)]
Add PFN_PTE_SHIFT and update_mmu_cache_range(). Link: https://lkml.kernel.org/r/20230802151406.3735276-28-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Richard Weinberger <richard@nod.at> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24  sparc64: implement the new page table range API  [Matthew Wilcox (Oracle)]
Add set_ptes(), update_mmu_cache_range(), flush_dcache_folio() and flush_icache_pages(). Convert the PG_dcache_dirty flag from being per-page to per-folio. Link: https://lkml.kernel.org/r/20230802151406.3735276-27-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24  sparc32: implement the new page table range API  [Matthew Wilcox (Oracle)]
Add PFN_PTE_SHIFT, update_mmu_cache_range(), flush_dcache_folio() and flush_icache_pages(). Link: https://lkml.kernel.org/r/20230802151406.3735276-26-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24  sh: implement the new page table range API  [Matthew Wilcox (Oracle)]
Add PFN_PTE_SHIFT, update_mmu_cache_range(), flush_dcache_folio() and flush_icache_pages(). Change the PG_dcache_clean flag from being per-page to per-folio. Flush the entire folio containing the pages in flush_icache_pages() for ease of implementation. Link: https://lkml.kernel.org/r/20230802151406.3735276-25-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24  s390: implement the new page table range API  [Matthew Wilcox (Oracle)]
Add set_ptes() and update_mmu_cache_range(). Link: https://lkml.kernel.org/r/20230802151406.3735276-24-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24  riscv: implement the new page table range API  [Matthew Wilcox (Oracle)]
Add set_ptes(), update_mmu_cache_range() and flush_dcache_folio(). Change the PG_dcache_clean flag from being per-page to per-folio. Link: https://lkml.kernel.org/r/20230802151406.3735276-23-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24  powerpc: implement the new page table range API  [Matthew Wilcox (Oracle)]
Add set_ptes(), update_mmu_cache_range() and flush_dcache_folio(). Change the PG_arch_1 (aka PG_dcache_dirty) flag from being per-page to per-folio. [willy@infradead.org: re-export flush_dcache_icache_folio()] Link: https://lkml.kernel.org/r/ZMx1daYwvD9EM7Cv@casper.infradead.org Link: https://lkml.kernel.org/r/20230802151406.3735276-22-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24  parisc: implement the new page table range API  [Matthew Wilcox (Oracle)]
Add set_ptes(), update_mmu_cache_range(), flush_dcache_folio() and flush_icache_pages(). Change the PG_arch_1 (aka PG_dcache_dirty) flag from being per-page to per-folio. Link: https://lkml.kernel.org/r/20230802151406.3735276-21-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Helge Deller <deller@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24  openrisc: implement the new page table range API  [Matthew Wilcox (Oracle)]
Add PFN_PTE_SHIFT, update_mmu_cache_range() and flush_dcache_folio(). Change the PG_arch_1 (aka PG_dcache_dirty) flag from being per-page to per-folio. Link: https://lkml.kernel.org/r/20230802151406.3735276-20-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Jonas Bonn <jonas@southpole.se> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Stafford Horne <shorne@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24  nios2: implement the new page table range API  [Matthew Wilcox (Oracle)]
Add set_ptes(), update_mmu_cache_range(), flush_icache_pages() and flush_dcache_folio(). Change the PG_arch_1 (aka PG_dcache_dirty) flag from being per-page to per-folio. Link: https://lkml.kernel.org/r/20230802151406.3735276-19-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Dinh Nguyen <dinguyen@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24  mips: implement the new page table range API  [Matthew Wilcox (Oracle)]
Rename _PFN_SHIFT to PFN_PTE_SHIFT. Convert a few places to call set_pte() instead of set_pte_at(). Add set_ptes(), update_mmu_cache_range(), flush_icache_pages() and flush_dcache_folio(). Change the PG_arch_1 (aka PG_dcache_dirty) flag from being per-page to per-folio. Link: https://lkml.kernel.org/r/20230802151406.3735276-18-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24  microblaze: implement the new page table range API  [Matthew Wilcox (Oracle)]
Rename PFN_SHIFT_OFFSET to PFN_PTE_SHIFT. Change the calling convention for set_pte() to be the same as other architectures. Add update_mmu_cache_range(), flush_icache_pages() and flush_dcache_folio(). [arnd@arndb.de: mark flush_dcache_folio() inline] Link: https://lkml.kernel.org/r/20230810141947.1236730-9-arnd@kernel.org Link: https://lkml.kernel.org/r/20230802151406.3735276-17-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Michal Simek <monstr@monstr.eu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24  m68k: implement the new page table range API  [Matthew Wilcox (Oracle)]
Add PFN_PTE_SHIFT, update_mmu_cache_range(), flush_icache_pages() and flush_dcache_folio(). Link: https://lkml.kernel.org/r/20230802151406.3735276-16-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24  loongarch: implement the new page table range API  [Matthew Wilcox (Oracle)]
Add update_mmu_cache_range() and change _PFN_SHIFT to PFN_PTE_SHIFT. It would probably be more efficient to implement update_mmu_cache_range() by flushing the entire folio instead of calling __update_tlb() N times, but I'll leave that for someone who understands the architecture better. Link: https://lkml.kernel.org/r/20230802151406.3735276-15-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24  ia64: implement the new page table range API  [Matthew Wilcox (Oracle)]
Add PFN_PTE_SHIFT, update_mmu_cache_range() and flush_dcache_folio(). Change the PG_arch_1 (aka PG_dcache_clean) flag from being per-page to per-folio, which makes arch_dma_mark_clean() and mark_clean() a little more exciting. [willy@infradead.org: fix folio_size() handling] Link: https://lkml.kernel.org/r/ZNPlOCe8F+nrzPxr@casper.infradead.org Link: https://lkml.kernel.org/r/20230802151406.3735276-14-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24  hexagon: implement the new page table range API  [Matthew Wilcox (Oracle)]
Add PFN_PTE_SHIFT and update_mmu_cache_range(). Link: https://lkml.kernel.org/r/20230802151406.3735276-13-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Brian Cain <bcain@quicinc.com> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24  csky: implement the new page table range API  [Matthew Wilcox (Oracle)]
Add PFN_PTE_SHIFT, update_mmu_cache_range() and flush_dcache_folio(). Change the PG_dcache_clean flag from being per-page to per-folio. Link: https://lkml.kernel.org/r/20230802151406.3735276-12-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Guo Ren <guoren@kernel.org> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24  arm64: implement the new page table range API  [Matthew Wilcox (Oracle)]
Add set_ptes(), update_mmu_cache_range() and flush_dcache_folio(). Change the PG_dcache_clean flag from being per-page to per-folio. Link: https://lkml.kernel.org/r/20230802151406.3735276-11-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
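For architectures that don't provide their own set_ptes(), the series adds a generic fallback whose shape is roughly the sketch below (simplified; arm64 implements its own version and layers cache maintenance on top). All nr entries share one PMD and one folio, so the PFN simply advances by one per PTE:

    static inline void set_ptes(struct mm_struct *mm, unsigned long addr,
                                pte_t *ptep, pte_t pte, unsigned int nr)
    {
        for (;;) {
            set_pte(ptep, pte);
            if (--nr == 0)
                break;
            ptep++;
            pte = __pte(pte_val(pte) + (1UL << PFN_PTE_SHIFT));
        }
    }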
2023-08-24  arm: implement the new page table range API  [Matthew Wilcox (Oracle)]
Add set_ptes(), update_mmu_cache_range(), flush_dcache_folio() and flush_icache_pages(). Change the PG_dcache_clean flag from being per-page to per-folio, which makes __dma_page_dev_to_cpu() a bit more exciting. Also add flush_cache_pages(), even though this isn't used by generic code (yet?). [m.szyprowski@samsung.com: fix potential endless loop in __dma_page_dev_to_cpu()] Link: https://lkml.kernel.org/r/20230809172737.3574190-1-m.szyprowski@samsung.com [willy@infradead.org: fix folio conversion in __dma_page_dev_to_cpu()] Link: https://lkml.kernel.org/r/20230823191852.1556561-1-willy@infradead.org Link: https://lkml.kernel.org/r/20230802151406.3735276-10-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24  arc: implement the new page table range API  [Matthew Wilcox (Oracle)]
Add PFN_PTE_SHIFT, update_mmu_cache_range(), flush_dcache_folio() and flush_icache_pages(). Change the PG_dc_clean flag from being per-page to per-folio (which means it cannot always be set as we don't know that all pages in this folio were cleaned). Enhance the internal flush routines to take the number of pages to flush. Link: https://lkml.kernel.org/r/20230802151406.3735276-9-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Vineet Gupta <vgupta@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24  alpha: implement the new page table range API  [Matthew Wilcox (Oracle)]
Add PFN_PTE_SHIFT, update_mmu_cache_range() and flush_icache_pages(). Link: https://lkml.kernel.org/r/20230802151406.3735276-8-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Richard Henderson <richard.henderson@linaro.org> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24  mm: convert page_table_check_pte_set() to page_table_check_ptes_set()  [Matthew Wilcox (Oracle)]
Tell the page table check how many PTEs & PFNs we want it to check. Link: https://lkml.kernel.org/r/20230802151406.3735276-3-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Acked-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24  minmax: add in_range() macro  [Matthew Wilcox (Oracle)]
Patch series "New page table range API", v6.

This patchset changes the API used by the MM to set up page table entries. The four APIs are:

  set_ptes(mm, addr, ptep, pte, nr)
  update_mmu_cache_range(vma, addr, ptep, nr)
  flush_dcache_folio(folio)
  flush_icache_pages(vma, page, nr)

flush_dcache_folio() isn't technically new, but no architecture implemented it, so I've done that for them. The old APIs remain around but are mostly implemented by calling the new interfaces.

The new APIs are based around setting up N page table entries at once. The N entries belong to the same PMD, the same folio and the same VMA, so ptep++ is a legitimate operation, and locking is taken care of for you. Some architectures can do a better job of it than just a loop, but I have hesitated to make too deep a change to architectures I don't understand well.

One thing I have changed in every architecture is that PG_arch_1 is now a per-folio bit instead of a per-page bit when used for dcache clean/dirty tracking. This was something that would have to happen eventually, and it makes sense to do it now rather than iterate over every page involved in a cache flush and figure out if it needs to happen.

The point of all this is better performance, and Fengwei Yin has measured improvement on x86. I suspect you'll see improvement on your architecture too. Try the new will-it-scale test mentioned here: https://lore.kernel.org/linux-mm/20230206140639.538867-5-fengwei.yin@intel.com/ You'll need to run it on an XFS filesystem and have CONFIG_TRANSPARENT_HUGEPAGE set.

This patchset is the basis for much of the anonymous large folio work being done by Ryan, so it's received quite a lot of testing over the last few months.

This patch (of 38):

Determine if a value lies within a range more efficiently (subtraction + comparison vs two comparisons and an AND). It also has useful (under some circumstances) behaviour if the range exceeds the maximum value of the type.

Convert all the conflicting definitions of in_range() within the kernel; some can use the generic definition while others need their own definition.

Link: https://lkml.kernel.org/r/20230802151406.3735276-1-willy@infradead.org Link: https://lkml.kernel.org/r/20230802151406.3735276-2-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
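The generic definition amounts to the sketch below; with unsigned arithmetic, if val < start then (val - start) wraps to a huge value, so a single comparison rejects it, whereas the naive form needs val >= start && val < start + len (two comparisons plus an AND, and start + len can overflow):

    #include <stdbool.h>

    static inline bool in_range(unsigned long val, unsigned long start,
                                unsigned long len)
    {
        return val - start < len;
    }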
2023-08-24  mm: drop per-VMA lock when returning VM_FAULT_RETRY or VM_FAULT_COMPLETED  [Suren Baghdasaryan]
handle_mm_fault returning VM_FAULT_RETRY or VM_FAULT_COMPLETED means mmap_lock has been released. However with per-VMA locks behavior is different and the caller should still release it. To make the rules consistent for the caller, drop the per-VMA lock when returning VM_FAULT_RETRY or VM_FAULT_COMPLETED. Currently the only path returning VM_FAULT_RETRY under per-VMA locks is do_swap_page and no path returns VM_FAULT_COMPLETED for now. [willy@infradead.org: fix riscv] Link: https://lkml.kernel.org/r/CAJuCfpE6GWEx1rPBmNpUfoD5o-gNFz9-UFywzCE2PbEGBiVz7g@mail.gmail.com Link: https://lkml.kernel.org/r/20230630211957.1341547-4-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Acked-by: Peter Xu <peterx@redhat.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Hillf Danton <hdanton@sina.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jan Kara <jack@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Laurent Dufour <ldufour@linux.ibm.com> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Michel Lespinasse <michel@lespinasse.org> Cc: Minchan Kim <minchan@google.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Punit Agrawal <punit.agrawal@bytedance.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Yu Zhao <yuzhao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
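The caller-side rule can be sketched from the arch fault-handler pattern (per-architecture details vary):

    vma = lock_vma_under_rcu(mm, address);
    if (vma) {
        fault = handle_mm_fault(vma, address,
                                flags | FAULT_FLAG_VMA_LOCK, regs);
        /* RETRY/COMPLETED now imply the per-VMA lock was already dropped */
        if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
            vma_end_read(vma);
    }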
2023-08-25  powerpc/mpc5xxx: Add missing fwnode_handle_put()  [Liang He]
In mpc5xxx_fwnode_get_bus_frequency(), we should call fwnode_handle_put() when breaking out of the fwnode_for_each_parent_node() iteration, as the iterator automatically takes a reference on each parent node and releases it only on the next iteration. Fixes: de06fba62af6 ("powerpc/mpc5xxx: Switch mpc5xxx_get_bus_frequency() to use fwnode") Signed-off-by: Liang He <windhl@126.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230322030423.1855440-1-windhl@126.com
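The leak pattern being fixed looks roughly like this; the property lookup is illustrative:

    /* The iterator takes a reference on each parent it yields, so a
     * 'break' must drop the reference still held. */
    fwnode_for_each_parent_node(fwnode, parent) {
        if (!fwnode_property_read_u32(parent, "bus-frequency", &bus_freq)) {
            fwnode_handle_put(parent);  /* the previously missing put */
            break;
        }
    }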
2023-08-25  powerpc/config: Disable SLAB_DEBUG_ON in skiroot  [Joel Stanley]
In 5.10, commit 5e84dd547bce ("powerpc/configs/skiroot: Enable some more hardening options") set SLUB_DEBUG_ON. When 5.14 came around, commit 792702911f58 ("slub: force on no_hash_pointers when slub_debug is enabled") forced all pointers to be printed unhashed when SLUB_DEBUG_ON is set. That was fine, but in 5.12 commit 5ead723a20e0 ("lib/vsprintf: no_hash_pointers prints all addresses as unhashed") had added a warning at boot. Disable SLUB_DEBUG_ON, as we don't want the nasty warning. We have CONFIG_EXPERT, so SLUB_DEBUG stays enabled. We do lose the settings in DEBUG_DEFAULT_FLAGS, but it's not clear that these should have been always-on anyway. Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230705023056.16273-1-joel@jms.id.au
2023-08-25  powerpc/pseries: Remove unused hcall tracing instruction  [Nicholas Piggin]
When JUMP_LABEL=n, the tracepoint refcount test in the pre-call stores the refcount value to the stack, so the same value can be used for the post-call (presumably to avoid racing with the value concurrently changing). On little-endian (ELFv2) that might have just worked by luck, because 32(r1) is STK_PARAM(R3) there and so the value save gets clobbered by the tracing code when it's non-zero, but fortunately r3 is the hcall number and 0 is an invalid hcall number so it should get clobbered by another non-zero value. In any case, commit cc1adb5f32557 ("powerpc/pseries: Use jump labels for hcall tracepoints") removed the code that actually used the value stored, so now it's just dead code. It's fragile to be storing to the stack like this, and confusing. Better remove it. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230509091600.70994-2-npiggin@gmail.com
2023-08-25  powerpc/pseries: Fix hcall tracepoints with JUMP_LABEL=n  [Nicholas Piggin]
With JUMP_LABEL=n, hcall_tracepoint_refcount's address is being tested instead of its value. This results in the tracing slowpath always being taken unnecessarily. Fixes: 9a10ccb29c0a2 ("powerpc/pseries: move hcall_tracepoint_refcount out of .toc") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230509091600.70994-1-npiggin@gmail.com
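A C analogy of the bug (the real test lives in the hcall entry assembly); with JUMP_LABEL=n the tracepoint gate is a plain counter:

    extern long hcall_tracepoint_refcount;

    if (&hcall_tracepoint_refcount) /* buggy: tests the address, always true */
        hcall_do_trace();           /* hypothetical slowpath */

    if (hcall_tracepoint_refcount)  /* fixed: tests the value */
        hcall_do_trace();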