summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2024-11-05arm64: pgtable: Warn unexpected pmdp_test_and_clear_young()Yicong Yang
Young bit operation on PMD table entry is only supported if FEAT_HAFT enabled system wide. Add a warning for notifying the misbehaviour. Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20241102104235.62560-6-yangyicong@huawei.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-11-05arm64: Enable ARCH_HAS_NONLEAF_PMD_YOUNGYicong Yang
With the support of FEAT_HAFT, the NONLEAF_PMD_YOUNG can be enabled on arm64 since the hardware is capable of updating the AF flag for PMD table descriptor. Since the AF bit of the table descriptor shares the same bit position in block descriptors, we only need to implement arch_has_hw_nonleaf_pmd_young() and select related configs. The related pmd_young test/update operations keeps the same with and already implemented for transparent page support. Currently ARCH_HAS_NONLEAF_PMD_YOUNG is used to improve the efficiency of lru-gen aging. Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20241102104235.62560-5-yangyicong@huawei.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-11-05arm64: Add support for FEAT_HAFTYicong Yang
Armv8.9/v9.4 introduces the feature Hardware managed Access Flag for Table descriptors (FEAT_HAFT). The feature is indicated by ID_AA64MMFR1_EL1.HAFDBS == 0b0011 and can be enabled by TCR2_EL1.HAFT so it has a dependency on FEAT_TCR2. Adds the Kconfig for FEAT_HAFT and support detecting and enabling the feature. The feature is enabled in __cpu_setup() before MMU on just like HA. A CPU capability is added to notify the user of the feature. Add definition of P{G,4,U,M}D_TABLE_AF bit and set the AF bit when creating the page table, which will save the hardware from having to update them at runtime. This will be ignored if FEAT_HAFT is not enabled. The AF bit of table descriptors cannot be managed by the software per spec, unlike the HA. So this should be used only if it's supported system wide by system_supports_haft(). Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20241102104235.62560-4-yangyicong@huawei.com Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> [catalin.marinas@arm.com: added the ID check back to __cpu_setup in case of future CPU errata] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-11-05sysfs: treewide: constify attribute callback of bin_attribute::mmap()Thomas Weißschuh
The mmap() callbacks should not modify the struct bin_attribute passed as argument. Enforce this by marking the argument as const. As there are not many callback implementers perform this change throughout the tree at once. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Andrew Donnellan <ajd@linux.ibm.com> # ocxl Acked-by: Krzysztof Wilczyński <kw@linux.com> Link: https://lore.kernel.org/r/20241103-sysfs-const-bin_attr-v2-6-71110628844c@weissschuh.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-05arm64/ptdump: Test both PTE_TABLE_BIT and PTE_VALID for block mappingsAnshuman Khandual
Test both PTE_TABLE_BIT and PTE_VALID for block mappings, similar to KVM S2 ptdump. This ensures consistency in identifying block mappings, both in the S1 and the S2 page tables. Besides being kernel page tables, there will not be any unmapped (!PTE_VALID) block mappings. Cc: Will Deacon <will@kernel.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20241105044154.4064181-1-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-11-05perf/x86/intel: Do not enable large PEBS for events with aux actions or aux ↵Adrian Hunter
sampling Events with aux actions or aux sampling expect the PMI to coincide with the event, which does not happen for large PEBS, so do not enable large PEBS in that case. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Link: https://lkml.kernel.org/r/20241022155920.17511-5-adrian.hunter@intel.com
2024-11-05perf/x86/intel/pt: Add support for pause / resumeAdrian Hunter
Prevent tracing to start if aux_paused. Implement support for PERF_EF_PAUSE / PERF_EF_RESUME. When aux_paused, stop tracing. When not aux_paused, only start tracing if it isn't currently meant to be stopped. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Link: https://lkml.kernel.org/r/20241022155920.17511-4-adrian.hunter@intel.com
2024-11-05perf/x86/intel/pt: Fix buffer full but size is 0 caseAdrian Hunter
If the trace data buffer becomes full, a truncated flag [T] is reported in PERF_RECORD_AUX. In some cases, the size reported is 0, even though data must have been added to make the buffer full. That happens when the buffer fills up from empty to full before the Intel PT driver has updated the buffer position. Then the driver calculates the new buffer position before calculating the data size. If the old and new positions are the same, the data size is reported as 0, even though it is really the whole buffer size. Fix by detecting when the buffer position is wrapped, and adjust the data size calculation accordingly. Example Use a very small buffer size (8K) and observe the size of truncated [T] data. Before the fix, it is possible to see records of 0 size. Before: $ perf record -m,8K -e intel_pt// uname Linux [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 0.105 MB perf.data ] $ perf script -D --no-itrace | grep AUX | grep -F '[T]' Warning: AUX data lost 2 times out of 3! 5 19462712368111 0x19710 [0x40]: PERF_RECORD_AUX offset: 0 size: 0 flags: 0x1 [T] 5 19462712700046 0x19ba8 [0x40]: PERF_RECORD_AUX offset: 0x170 size: 0xe90 flags: 0x1 [T] After: $ perf record -m,8K -e intel_pt// uname Linux [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 0.040 MB perf.data ] $ perf script -D --no-itrace | grep AUX | grep -F '[T]' Warning: AUX data lost 2 times out of 3! 1 113720802995 0x4948 [0x40]: PERF_RECORD_AUX offset: 0 size: 0x2000 flags: 0x1 [T] 1 113720979812 0x6b10 [0x40]: PERF_RECORD_AUX offset: 0x2000 size: 0x2000 flags: 0x1 [T] Fixes: 52ca9ced3f70 ("perf/x86/intel/pt: Add Intel PT PMU driver") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20241022155920.17511-2-adrian.hunter@intel.com
2024-11-05riscv: add PREEMPT_LAZY supportJisheng Zhang
riscv has switched to GENERIC_ENTRY, so adding PREEMPT_LAZY is as simple as adding TIF_NEED_RESCHED_LAZY related definitions and enabling ARCH_HAS_PREEMPT_LAZY. [bigeasy: Replace old PREEMPT_AUTO bits with new PREEMPT_LAZY ] Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Palmer Dabbelt <palmer@rivosinc.com> Link: https://lkml.kernel.org/r/20241021151257.102296-4-bigeasy@linutronix.de
2024-11-05sched, x86: Enable Lazy preemptionPeter Zijlstra
Add the TIF bit and select the Kconfig symbol to make it go. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://lkml.kernel.org/r/20241007075055.555778919@infradead.org
2024-11-05seqlock, treewide: Switch to non-raw seqcount_latch interfaceMarco Elver
Switch all instrumentable users of the seqcount_latch interface over to the non-raw interface. Co-developed-by: "Peter Zijlstra (Intel)" <peterz@infradead.org> Signed-off-by: "Peter Zijlstra (Intel)" <peterz@infradead.org> Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20241104161910.780003-5-elver@google.com
2024-11-05locking/atomic/x86: Use ALT_OUTPUT_SP() for __arch_{,try_}cmpxchg64_emu()Uros Bizjak
x86_32 __arch_{,try_}cmpxchg64_emu()() macros use CALL instruction inside asm statement. Use ALT_OUTPUT_SP() macro to add required dependence on %esp register. Fixes: 79e1dd05d1a2 ("x86: Provide an alternative() based cmpxchg64()") Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20241103160954.3329-2-ubizjak@gmail.com
2024-11-05locking/atomic/x86: Use ALT_OUTPUT_SP() for __alternative_atomic64()Uros Bizjak
CONFIG_X86_CMPXCHG64 variant of x86_32 __alternative_atomic64() macro uses CALL instruction inside asm statement. Use ALT_OUTPUT_SP() macro to add required dependence on %esp register. Fixes: 819165fb34b9 ("x86: Adjust asm constraints in atomic64 wrappers") Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20241103160954.3329-1-ubizjak@gmail.com
2024-11-05ARM: dts: microchip: sam9x75_curiosity: add sam9x75 curiosity boardVarshini Rajendran
Add device tree file for sam9x75 curiosity board. Signed-off-by: Varshini Rajendran <varshini.rajendran@microchip.com> Reviewed-by: Claudiu Beznea <claudiu.beznea@tuxon.dev> Reviewed-by: Hari Prasath Gujulan Elango <hari.prasathge@microchip.com> Link: https://lore.kernel.org/r/20241010120444.93252-1-varshini.rajendran@microchip.com Signed-off-by: Claudiu Beznea <claudiu.beznea@tuxon.dev>
2024-11-05ARM: dts: at91: sam9x7: add device tree for SoCVarshini Rajendran
Add device tree file for SAM9X7 SoC family. Co-developed-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: Varshini Rajendran <varshini.rajendran@microchip.com> Reviewed-by: Claudiu Beznea <claudiu.beznea@tuxon.dev> Link: https://lore.kernel.org/r/20241010120432.93151-1-varshini.rajendran@microchip.com Signed-off-by: Claudiu Beznea <claudiu.beznea@tuxon.dev>
2024-11-05arm64/mm: Sanity check PTE address before runtime P4D/PUD foldingArd Biesheuvel
The runtime P4D/PUD folding logic assumes that the respective pgd_t* and p4d_t* arguments are pointers into actual page tables that are part of the hierarchy being operated on. This may not always be the case, and we have been bitten once by this already [0], where the argument was actually a stack variable, and in this case, the logic does not work at all. So let's add a VM_BUG_ON() for each case, to ensure that the address of the provided page table entry is consistent with the address being translated. [0] https://lore.kernel.org/all/20240725090345.28461-1-will@kernel.org/T/#u Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20241105093919.1312049-2-ardb+git@google.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-11-05arm64: setup: name 'tcr2' registerYicong Yang
TCR2_EL1 introduced some additional controls besides TCR_EL1. Currently only PIE is supported and enabled by writing TCR2_EL1 directly if PIE detected. Introduce a named register 'tcr2' just like 'tcr' we've already had. It'll be initialized to 0 and updated if certain feature detected and needs to be enabled. Touch the TCR2_EL1 registers at last with the updated 'tcr2' value if FEAT_TCR2 supported by checking ID_AA64MMFR3_EL1.TCRX. Then we can extend the support of other features controlled by TCR2_EL1. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20241102104235.62560-3-yangyicong@huawei.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-11-05arm64/sysreg: Update ID_AA64MMFR1_EL1 registerYicong Yang
Update ID_AA64MMFR1_EL1 register fields definition per DDI0601 (ID092424) 2024-09. ID_AA64MMFR1_EL1.ETS adds definition for FEAT_ETS2 and FEAT_ETS3. ID_AA64MMFR1_EL1.HAFDBS adds definition for FEAT_HAFT and FEAT_HDBSS. Reviewed-by: Mark Brown <broonie@kernel.org> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20241102104235.62560-2-yangyicong@huawei.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-11-05KVM: PPC: Book3S HV: Add Power11 capability support for Nested PAPR guestsAmit Machhiwal
The Power11 architected and raw mode support in Linux was merged in commit c2ed087ed35c ("powerpc: Add Power11 architected and raw mode"), and the corresponding support in QEMU is pending in [1], which is currently in its V6. Currently, booting a KVM guest inside a pseries LPAR (Logical Partition) on a kernel without P11 support results the guest boot in a Power10 compatibility mode (i.e., with logical PVR of Power10). However, booting a KVM guest on a kernel with P11 support causes the following boot crash. On a Power11 LPAR, the Power Hypervisor (L0) returns a support for both Power10 and Power11 capabilities through H_GUEST_GET_CAPABILITIES hcall. However, KVM currently supports only Power10 capabilities, resulting in only Power10 capabilities being set as "nested capabilities" via an H_GUEST_SET_CAPABILITIES hcall. In the guest entry path, gs_msg_ops_kvmhv_nestedv2_config_fill_info() is called by kvmhv_nestedv2_flush_vcpu() to fill the GSB (Guest State Buffer) elements. The arch_compat is set to the logical PVR of Power11, followed by an H_GUEST_SET_STATE hcall. This hcall returns H_INVALID_ELEMENT_VALUE as a return code when setting a Power11 logical PVR, as only Power10 capabilities were communicated as supported between PHYP and KVM, utimately resulting in the KVM guest boot crash. KVM: unknown exit, hardware reason ffffffffffffffea NIP 000000007daf97e0 LR 000000007daf1aec CTR 000000007daf1ab4 XER 0000000020040000 CPU#0 MSR 8000000000103000 HID0 0000000000000000 HF 6c002000 iidx 3 didx 3 TB 00000000 00000000 DECR 0 GPR00 8000000000003000 000000007e580e20 000000007db26700 0000000000000000 GPR04 00000000041a0c80 000000007df7f000 0000000000200000 000000007df7f000 GPR08 000000007db6d5d8 000000007e65fa90 000000007db6d5d0 0000000000003000 GPR12 8000000000000001 0000000000000000 0000000000000000 0000000000000000 GPR16 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20 0000000000000000 0000000000000000 0000000000000000 000000007db21a30 GPR24 000000007db65000 0000000000000000 0000000000000000 0000000000000003 GPR28 000000007db6d5e0 000000007db22220 000000007daf27ac 000000007db75000 CR 20000404 [ E - - - - G - G ] RES 000@ffffffffffffffff SRR0 000000007daf97e0 SRR1 8000000000102000 PVR 0000000000820200 VRSAVE 0000000000000000 SPRG0 0000000000000000 SPRG1 000000000000ff20 SPRG2 0000000000000000 SPRG3 0000000000000000 SPRG4 0000000000000000 SPRG5 0000000000000000 SPRG6 0000000000000000 SPRG7 0000000000000000 CFAR 0000000000000000 LPCR 0000000000020400 PTCR 0000000000000000 DAR 0000000000000000 DSISR 0000000000000000 Fix this by adding the Power11 capability support and the required plumbing in place. Note: * Booting a Power11 KVM nested PAPR guest requires [1] in QEMU. [1] https://lore.kernel.org/all/20240731055022.696051-1-adityag@linux.ibm.com/ Signed-off-by: Amit Machhiwal <amachhiw@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://patch.msgid.link/20241028101622.741573-1-amachhiw@linux.ibm.com
2024-11-05powerpc: Use str_enabled_disabled() helper functionThorsten Blum
Remove hard-coded strings by using the str_enabled_disabled() helper function. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://patch.msgid.link/20241027222219.1173-2-thorsten.blum@linux.dev
2024-11-05powerpc/modules: start/end_opd are only needed for ABI v1Michael Ellerman
The start_opd/end_opd members of struct mod_arch_specific are only needed for kernels built using ELF ABI v1. Guard them with an ifdef to save a little bit of space on ELF ABI v2 kernels. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://patch.msgid.link/20240812063312.730496-1-mpe@ellerman.id.au
2024-11-05powerpc/ps3: replace open-coded sysfs_emit functionPaulo Miguel Almeida
sysfs_emit() helper function should be used when formatting the value to be returned to user space. This patch replaces open-coded sysfs_emit() in sysfs .show() callbacks Link: https://github.com/KSPP/linux/issues/105 Signed-off-by: Paulo Miguel Almeida <paulo.miguel.almeida.rodenas@gmail.com> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Geoff Levand <geoff@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://patch.msgid.link/ZxMV3YvSulJFZ8rk@mail.google.com
2024-11-05riscv: kvm: Fix out-of-bounds array accessBjörn Töpel
In kvm_riscv_vcpu_sbi_init() the entry->ext_idx can contain an out-of-bound index. This is used as a special marker for the base extensions, that cannot be disabled. However, when traversing the extensions, that special marker is not checked prior indexing the array. Add an out-of-bounds check to the function. Fixes: 56d8a385b605 ("RISC-V: KVM: Allow some SBI extensions to be disabled by default") Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20241104191503.74725-1-bjorn@kernel.org Signed-off-by: Anup Patel <anup@brainfault.org>
2024-11-05RISC-V: KVM: Fix APLIC in_clrip and clripnum write emulationYong-Xuan Wang
In the section "4.7 Precise effects on interrupt-pending bits" of the RISC-V AIA specification defines that: "If the source mode is Level1 or Level0 and the interrupt domain is configured in MSI delivery mode (domaincfg.DM = 1): The pending bit is cleared whenever the rectified input value is low, when the interrupt is forwarded by MSI, or by a relevant write to an in_clrip register or to clripnum." Update the aplic_write_pending() to match the spec. Fixes: d8dd9f113e16 ("RISC-V: KVM: Fix APLIC setipnum_le/be write emulation") Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com> Reviewed-by: Vincent Chen <vincent.chen@sifive.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20241029085542.30541-1-yongxuan.wang@sifive.com Signed-off-by: Anup Patel <anup@brainfault.org>
2024-11-04KVM: SVM: Propagate error from snp_guest_req_init() to userspaceSean Christopherson
If snp_guest_req_init() fails, return the provided error code up the stack to userspace, e.g. so that userspace can log that KVM_SEV_INIT2 failed, as opposed to some random operation later in VM setup failing because SNP wasn't actually enabled for the VM. Note, KVM itself doesn't consult the return value from __sev_guest_init(), i.e. the fallout is purely that userspace may be confused. Fixes: 88caf544c930 ("KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/r/202410192220.MeTyHPxI-lkp@intel.com Link: https://lore.kernel.org/r/20241031203214.1585751-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-11-04KVM: nVMX: Treat vpid01 as current if L2 is active, but with VPID disabledSean Christopherson
When getting the current VPID, e.g. to emulate a guest TLB flush, return vpid01 if L2 is running but with VPID disabled, i.e. if VPID is disabled in vmcs12. Architecturally, if VPID is disabled, then the guest and host effectively share VPID=0. KVM emulates this behavior by using vpid01 when running an L2 with VPID disabled (see prepare_vmcs02_early_rare()), and so KVM must also treat vpid01 as the current VPID while L2 is active. Unconditionally treating vpid02 as the current VPID when L2 is active causes KVM to flush TLB entries for vpid02 instead of vpid01, which results in TLB entries from L1 being incorrectly preserved across nested VM-Enter to L2 (L2=>L1 isn't problematic, because the TLB flush after nested VM-Exit flushes vpid01). The bug manifests as failures in the vmx_apicv_test KVM-Unit-Test, as KVM incorrectly retains TLB entries for the APIC-access page across a nested VM-Enter. Opportunisticaly add comments at various touchpoints to explain the architectural requirements, and also why KVM uses vpid01 instead of vpid02. All credit goes to Chao, who root caused the issue and identified the fix. Link: https://lore.kernel.org/all/ZwzczkIlYGX+QXJz@intel.com Fixes: 2b4a5a5d5688 ("KVM: nVMX: Flush current VPID (L1 vs. L2) for KVM_REQ_TLB_FLUSH_GUEST") Cc: stable@vger.kernel.org Cc: Like Xu <like.xu.linux@gmail.com> Debugged-by: Chao Gao <chao.gao@intel.com> Reviewed-by: Chao Gao <chao.gao@intel.com> Tested-by: Chao Gao <chao.gao@intel.com> Link: https://lore.kernel.org/r/20241031202011.1580522-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-11-04KVM: x86: Short-circuit all of kvm_apic_set_base() if MSR value is unchangedSean Christopherson
Do nothing in all of kvm_apic_set_base(), not just __kvm_apic_set_base(), if the incoming MSR value is the same as the current value. Validating the mode transitions is obviously unnecessary, and rejecting the write is pointless if the vCPU already has an invalid value, e.g. if userspace is doing weird things and modified guest CPUID after setting MSR_IA32_APICBASE. Bailing early avoids kvm_recalculate_apic_map()'s slow path in the rare scenario where the map is DIRTY due to some other vCPU dirtying the map, in which case it's the other vCPU/task's responsibility to recalculate the map. Note, kvm_lapic_reset() calls __kvm_apic_set_base() only when emulating RESET, in which case the old value is guaranteed to be zero, and the new value is guaranteed to be non-zero. I.e. all callers of __kvm_apic_set_base() effectively pre-check for the MSR value actually changing. Don't bother keeping the check in __kvm_apic_set_base(), as no additional callers are expected, and implying that the MSR might already be non-zero at the time of kvm_lapic_reset() could confuse readers. Link: https://lore.kernel.org/r/20241101183555.1794700-10-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-11-04KVM: x86: Unpack msr_data structure prior to calling kvm_apic_set_base()Sean Christopherson
Pass in the new value and "host initiated" as separate parameters to kvm_apic_set_base(), as forcing the KVM_SET_SREGS path to declare and fill an msr_data structure is awkward and kludgy, e.g. __set_sregs_common() doesn't even bother to set the proper MSR index. No functional change intended. Suggested-by: Kai Huang <kai.huang@intel.com> Reviewed-by: Kai Huang <kai.huang@intel.com> Link: https://lore.kernel.org/r/20241101183555.1794700-9-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-11-04KVM: x86: Make kvm_recalculate_apic_map() local to lapic.cSean Christopherson
Make kvm_recalculate_apic_map() local to lapic.c now that all external callers are gone. No functional change intended. Reviewed-by: Kai Huang <kai.huang@intel.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20241009181742.1128779-8-seanjc@google.com Link: https://lore.kernel.org/r/20241101183555.1794700-8-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-11-04KVM: x86: Rename APIC base setters to better capture their relationshipSean Christopherson
Rename kvm_set_apic_base() and kvm_lapic_set_base() to kvm_apic_set_base() and __kvm_apic_set_base() respectively to capture that the underscores version is a "special" variant (it exists purely to avoid recalculating the optimized map multiple times when stuffing the RESET value). Opportunistically add a comment explaining why kvm_lapic_reset() uses the inner helper. Note, KVM deliberately invokes kvm_arch_vcpu_create() while kvm->lock is NOT held so that vCPU setup isn't serialized if userspace is creating multiple/all vCPUs in parallel. I.e. triggering an extra recalculation is not limited to theoretical/rare edge cases, and so is worth avoiding. No functional change intended. Reviewed-by: Kai Huang <kai.huang@intel.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20241009181742.1128779-7-seanjc@google.com Link: https://lore.kernel.org/r/20241101183555.1794700-7-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-11-04KVM: x86: Move kvm_set_apic_base() implementation to lapic.c (from x86.c)Sean Christopherson
Move kvm_set_apic_base() to lapic.c so that the bulk of KVM's local APIC code resides in lapic.c, regardless of whether or not KVM is emulating the local APIC in-kernel. This will also allow making various helpers visible only to lapic.c. No functional change intended. Reviewed-by: Kai Huang <kai.huang@intel.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20241009181742.1128779-6-seanjc@google.com Link: https://lore.kernel.org/r/20241101183555.1794700-6-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-11-04KVM: x86: Inline kvm_get_apic_mode() in lapic.hSean Christopherson
Inline kvm_get_apic_mode() in lapic.h to avoid a CALL+RET as well as an export. The underlying kvm_apic_mode() helper is public information, i.e. there is no state/information that needs to be hidden from vendor modules. No functional change intended. Reviewed-by: Kai Huang <kai.huang@intel.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20241009181742.1128779-5-seanjc@google.com Link: https://lore.kernel.org/r/20241101183555.1794700-5-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-11-04KVM: x86: Get vcpu->arch.apic_base directly and drop kvm_get_apic_base()Sean Christopherson
Access KVM's emulated APIC base MSR value directly instead of bouncing through a helper, as there is no reason to add a layer of indirection, and there are other MSRs with a "set" but no "get", e.g. EFER. No functional change intended. Reviewed-by: Kai Huang <kai.huang@intel.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20241009181742.1128779-4-seanjc@google.com Link: https://lore.kernel.org/r/20241101183555.1794700-4-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-11-04KVM: x86: Drop superfluous kvm_lapic_set_base() call when setting APIC stateSean Christopherson
Now that kvm_lapic_set_base() does nothing if the "new" APIC base MSR is the same as the current value, drop the kvm_lapic_set_base() call in the KVM_SET_LAPIC flow that passes in the current value, as it too does nothing. Note, the purpose of invoking kvm_lapic_set_base() was purely to set apic->base_address (see commit 5dbc8f3fed0b ("KVM: use kvm_lapic_set_base() to change apic_base")). And there is no evidence that explicitly setting apic->base_address in KVM_SET_LAPIC ever had any functional impact; even in the original commit 96ad2cc61324 ("KVM: in-kernel LAPIC save and restore support"), all flows that set apic_base also set apic->base_address to the same address. E.g. svm_create_vcpu() did open code a write to apic_base, svm->vcpu.apic_base = 0xfee00000 | MSR_IA32_APICBASE_ENABLE; but it also called kvm_create_lapic() when irqchip_in_kernel() is true. Reviewed-by: Kai Huang <kai.huang@intel.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20241009181742.1128779-3-seanjc@google.com Link: https://lore.kernel.org/r/20241101183555.1794700-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-11-04KVM: x86: Short-circuit all kvm_lapic_set_base() if MSR value isn't changingSean Christopherson
Do nothing in kvm_lapic_set_base() if the APIC base MSR value is the same as the current value. All flows except the handling of the base address explicitly take effect if and only if relevant bits are changing. For the base address, invoking kvm_lapic_set_base() before KVM initializes the base to APIC_DEFAULT_PHYS_BASE during vCPU RESET would be a KVM bug, i.e. KVM _must_ initialize apic->base_address before exposing the vCPU (to userspace or KVM at-large). Note, the inhibit is intended to be set if the base address is _changed_ from the default, i.e. is also covered by the RESET behavior. Reviewed-by: Kai Huang <kai.huang@intel.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20241009181742.1128779-2-seanjc@google.com Link: https://lore.kernel.org/r/20241101183555.1794700-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-11-04KVM: x86/mmu: Drop per-VM zapped_obsolete_pages listVipin Sharma
Drop the per-VM zapped_obsolete_pages list now that the usage from the defunct mmu_shrinker is gone, and instead use a local list to track pages in kvm_zap_obsolete_pages(), the sole remaining user of zapped_obsolete_pages. Opportunistically add an assertion to verify and document that slots_lock must be held, i.e. that there can only be one active instance of kvm_zap_obsolete_pages() at any given time, and by doing so also prove that using a local list instead of a per-VM list doesn't change any functionality (beyond trivialities like list initialization). Signed-off-by: Vipin Sharma <vipinsh@google.com> Link: https://lore.kernel.org/r/20241101201437.1604321-2-vipinsh@google.com [sean: split to separate patch, write changelog] Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-11-04KVM: x86/mmu: Remove KVM's MMU shrinkerVipin Sharma
Remove KVM's MMU shrinker and (almost) all of its related code, as the current implementation is very disruptive to VMs (if it ever runs), without providing any meaningful benefit[1]. Alternatively, KVM could repurpose its shrinker, e.g. to reclaim pages from the per-vCPU caches[2], but given that no one has complained about lack of TDP MMU support for the shrinker in the 3+ years since the TDP MMU was enabled by default, it's safe to say that there is likely no real use case for initiating reclaim of KVM's page tables from the shrinker. And while clever/cute, reclaiming the per-vCPU caches doesn't scale the same way that reclaiming in-use page table pages does. E.g. the amount of memory being used by a VM doesn't always directly correlate with the number vCPUs, and even when it does, reclaiming a few pages from per-vCPU caches likely won't make much of a dent in the VM's total memory usage, especially for VMs with huge amounts of memory. Lastly, if it turns out that there is a strong use case for dropping the per-vCPU caches, re-introducing the shrinker registration is trivial compared to the complexity of actually reclaiming pages from the caches. [1] https://lore.kernel.org/lkml/Y45dldZnI6OIf+a5@google.com [2] https://lore.kernel.org/kvm/20241004195540.210396-3-vipinsh@google.com Suggested-by: Sean Christopherson <seanjc@google.com> Suggested-by: David Matlack <dmatlack@google.com> Signed-off-by: Vipin Sharma <vipinsh@google.com> Link: https://lore.kernel.org/r/20241101201437.1604321-2-vipinsh@google.com [sean: keep zapped_obsolete_pages for now, massage changelog] Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-11-04KVM: x86/mmu: WARN if huge page recovery triggered during dirty loggingDavid Matlack
WARN and bail out of recover_huge_pages_range() if dirty logging is enabled. KVM shouldn't be recovering huge pages during dirty logging anyway, since KVM needs to track writes at 4KiB. However it's not out of the possibility that that changes in the future. If KVM wants to recover huge pages during dirty logging, make_huge_spte() must be updated to write-protect the new huge page mapping. Otherwise, writes through the newly recovered huge page mapping will not be tracked. Note that this potential risk did not exist back when KVM zapped to recover huge page mappings, since subsequent accesses would just be faulted in at PG_LEVEL_4K if dirty logging was enabled. Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20240823235648.3236880-7-dmatlack@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-11-04KVM: x86/mmu: Rename make_huge_page_split_spte() to make_small_spte()David Matlack
Rename make_huge_page_split_spte() to make_small_spte(). This ensures that the usage of "small_spte" and "huge_spte" are consistent between make_huge_spte() and make_small_spte(). This should also reduce some confusion as make_huge_page_split_spte() almost reads like it will create a huge SPTE, when in fact it is creating a small SPTE to split the huge SPTE. No functional change intended. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20240823235648.3236880-6-dmatlack@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-11-04KVM: x86/mmu: Recover TDP MMU huge page mappings in-place instead of zappingDavid Matlack
Recover TDP MMU huge page mappings in-place instead of zapping them when dirty logging is disabled, and rename functions that recover huge page mappings when dirty logging is disabled to move away from the "zap collapsible spte" terminology. Before KVM flushes TLBs, guest accesses may be translated through either the (stale) small SPTE or the (new) huge SPTE. This is already possible when KVM is doing eager page splitting (where TLB flushes are also batched), and when vCPUs are faulting in huge mappings (where TLBs are flushed after the new huge SPTE is installed). Recovering huge pages reduces the number of page faults when dirty logging is disabled: $ perf stat -e kvm:kvm_page_fault -- ./dirty_log_perf_test -s anonymous_hugetlb_2mb -v 64 -e -b 4g Before: 393,599 kvm:kvm_page_fault After: 262,575 kvm:kvm_page_fault vCPU throughput and the latency of disabling dirty-logging are about equal compared to zapping, but avoiding faults can be beneficial to remove vCPU jitter in extreme scenarios. Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20240823235648.3236880-5-dmatlack@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-11-04KVM: x86/mmu: Refactor TDP MMU iter need resched checkSean Christopherson
Refactor the TDP MMU iterator "need resched" checks into a helper function so they can be called from a different code path in a subsequent commit. No functional change intended. Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20240823235648.3236880-4-dmatlack@google.com [sean: rebase on a swapped order of checks] Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-11-04KVM: x86/mmu: Demote the WARN on yielded in xxx_cond_resched() to ↵Sean Christopherson
KVM_MMU_WARN_ON Convert the WARN in tdp_mmu_iter_cond_resched() that the iterator hasn't already yielded to a KVM_MMU_WARN_ON() so the code is compiled out for production kernels (assuming production kernels disable KVM_PROVE_MMU). Checking for a needed reschedule is a hot path, and KVM sanity checks iter->yielded in several other less-hot paths, i.e. the odds of KVM not flagging that something went sideways are quite low. Furthermore, the odds of KVM not noticing *and* the WARN detecting something worth investigating are even lower. Link: https://lore.kernel.org/r/20241031170633.1502783-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-11-04KVM: x86/mmu: Check yielded_gfn for forward progress iff resched is neededSean Christopherson
Swap the order of the checks in tdp_mmu_iter_cond_resched() so that KVM checks to see if a resched is needed _before_ checking to see if yielding must be disallowed to guarantee forward progress. Iterating over TDP MMU SPTEs is a hot path, e.g. tearing down a root can touch millions of SPTEs, and not needing to reschedule is by far the common case. On the other hand, disallowing yielding because forward progress has not been made is a very rare case. Returning early for the common case (no resched), effectively reduces the number of checks from 2 to 1 for the common case, and should make the code slightly more predictable for the CPU. To resolve a weird conundrum where the forward progress check currently returns false, but the need resched check subtly returns iter->yielded, which _should_ be false (enforced by a WARN), return false unconditionally (which might also help make the sequence more predictable). If KVM has a bug where iter->yielded is left danging, continuing to yield is neither right nor wrong, it was simply an artifact of how the original code was written. Unconditionally returning false when yielding is unnecessary or unwanted will also allow extracting the "should resched" logic to a separate helper in a future patch. Cc: David Matlack <dmatlack@google.com> Reviewed-by: James Houghton <jthoughton@google.com> Link: https://lore.kernel.org/r/20241031170633.1502783-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-11-04Merge tag 'arm-fixes-6.12-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull SoC fixes from Arnd Bergmann: "Where the last set of fixes was mostly drivers, this time the devicetree changes all come at once, targeting mostly the Rockchips, Qualcomm and NXP platforms. The Qualcomm bugfixes target the Snapdragon X Elite laptops, specifically problems with PCIe and NVMe support to improve reliability, and a boot regresion on msm8939. Also for Snapdragon platforms, there are a number of correctness changes in the several platform specific device drivers, but none of these are as impactful. On the NXP i.MX platform, the fixes are all for 64-bit i.MX8 variants, correcting individual entries in the devicetree that were incorrect and causing the media, video, mmc and spi drivers to misbehave in minor ways. The Arm SCMI firmware driver gets fixes for a use-after-free bug and for correctly parsing firmware information. On the RISC-V side, there are three minor devicetree fixes for starfive and sophgo, again addressing only minor mistakes. One device driver patch fixes a problem with spurious interrupt handling" * tag 'arm-fixes-6.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (63 commits) firmware: arm_scmi: Use vendor string in max-rx-timeout-ms dt-bindings: firmware: arm,scmi: Add missing vendor string riscv: dts: Replace deprecated snps,nr-gpios property for snps,dw-apb-gpio-port devices arm64: dts: rockchip: Correct GPIO polarity on brcm BT nodes arm64: dts: rockchip: Drop invalid clock-names from es8388 codec nodes ARM: dts: rockchip: Fix the realtek audio codec on rk3036-kylin ARM: dts: rockchip: Fix the spi controller on rk3036 ARM: dts: rockchip: drop grf reference from rk3036 hdmi ARM: dts: rockchip: fix rk3036 acodec node arm64: dts: rockchip: remove orphaned pinctrl-names from pinephone pro soc: qcom: pmic_glink: Handle GLINK intent allocation rejections rpmsg: glink: Handle rejected intent request better arm64: dts: qcom: x1e80100: fix PCIe5 interconnect arm64: dts: qcom: x1e80100: fix PCIe4 interconnect arm64: dts: qcom: x1e80100: Fix up BAR spaces MAINTAINERS: invert Misc RISC-V SoC Support's pattern soc: qcom: socinfo: fix revision check in qcom_socinfo_probe() arm64: dts: qcom: x1e80100-qcp: fix nvme regulator boot glitch arm64: dts: qcom: x1e80100-microsoft-romulus: fix nvme regulator boot glitch arm64: dts: qcom: x1e80100-yoga-slim7x: fix nvme regulator boot glitch ...
2024-11-04jump_label: adjust inline asm to be consistentAlice Ryhl
To avoid duplication of inline asm between C and Rust, we need to import the inline asm from the relevant `jump_label.h` header into Rust. To make that easier, this patch updates the header files to expose the inline asm via a new ARCH_STATIC_BRANCH_ASM macro. The header files are all updated to define a ARCH_STATIC_BRANCH_ASM that takes the same arguments in a consistent order so that Rust can use the same logic for every architecture. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Jason Baron <jbaron@akamai.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Wedson Almeida Filho <wedsonaf@gmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Gary Guo <gary@garyguo.net> Cc: " =?utf-8?q?Bj=C3=B6rn_Roy_Baron?= " <bjorn3_gh@protonmail.com> Cc: Benno Lossin <benno.lossin@proton.me> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Uros Bizjak <ubizjak@gmail.com> Cc: Will Deacon <will@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Fuad Tabba <tabba@google.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Anup Patel <apatel@ventanamicro.com> Cc: Andrew Jones <ajones@ventanamicro.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Conor Dooley <conor.dooley@microchip.com> Cc: Samuel Holland <samuel.holland@sifive.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Bibo Mao <maobibo@loongson.cn> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Tianrui Zhao <zhaotianrui@loongson.cn> Cc: Palmer Dabbelt <palmer@rivosinc.com> Link: https://lore.kernel.org/20241030-tracepoint-v12-4-eec7f0f8ad22@google.com Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Co-developed-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Palmer Dabbelt <palmer@rivosinc.com> # RISC-V Signed-off-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-11-04ARM: dts: omap4-kc1: fix twl6030 power nodeAndreas Kemnade
dtbs_check was moaning about twl6030-power, use the standard property instead. Apparently that twl6030 power snippet slipped in without the corresponding driver. Now it is handled by the standard property. CC: Paul Kocialkowski <contact@paulk.fr> Signed-off-by: Andreas Kemnade <andreas@kemnade.info> Reviewed-by: Paul Kocialkowski <contact@paulk.fr> Link: https://lore.kernel.org/r/20240915193527.1071792-1-andreas@kemnade.info Signed-off-by: Kevin Hilman <khilman@baylibre.com>
2024-11-04ARM: dts: am335x-bone-common: Increase MDIO reset deassert delay to 50msGeert Uytterhoeven
Commit b9bf5612610aa7e3 ("ARM: dts: am335x-bone-common: Increase MDIO reset deassert time") already increased the MDIO reset deassert delay from 6.5 to 13 ms, but this may still cause Ethernet PHY probe failures: SMSC LAN8710/LAN8720 4a101000.mdio:00: probe with driver SMSC LAN8710/LAN8720 failed with error -5 On BeagleBone Black Rev. C3, ETH_RESETn is controlled by an open-drain AND gate. It is pulled high by a 10K resistor, and has a 4.7µF capacitor to ground, giving an RC time constant of 47ms. As it takes 0.7RC to charge the capacitor above the threshold voltage of a CMOS input (VDD/2), the delay should be at least 33ms. Considering the typical tolerance of 20% on capacitors, 40ms would be safer. Add an additional safety margin and settle for 50ms. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Roger Quadros <rogerq@kernel.org> Link: https://lore.kernel.org/r/9002a58daa1b2983f39815b748ee9d2f8dcc4829.1730366936.git.geert+renesas@glider.be Signed-off-by: Kevin Hilman <khilman@baylibre.com>
2024-11-04arm64: signal: Remove unused macroKevin Brodsky
Commit 33f082614c34 ("arm64: signal: Allow expansion of the signal frame") introduced the BASE_SIGFRAME_SIZE macro but it has apparently never been used; just remove it. Reviewed-by: Dave Martin <Dave.Martin@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Link: https://lore.kernel.org/r/20241029144539.111155-4-kevin.brodsky@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-11-04arm64: signal: Remove unnecessary check when saving POE stateKevin Brodsky
The POE frame record is allocated unconditionally if POE is supported. If the allocation fails, a SIGSEGV is delivered before setup_sigframe() can be reached. As a result there is no need to consider poe_offset before saving POR_EL0; just remove that check. This is in line with other frame records (FPMR, TPIDR2). Reviewed-by: Mark Brown <broonie@kernel.org> Reviewed-by: Dave Martin <Dave.Martin@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Link: https://lore.kernel.org/r/20241029144539.111155-3-kevin.brodsky@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-11-04arm64/mm: Drop setting PTE_TYPE_PAGE in pte_mkcont()Anshuman Khandual
PTE_TYPE_PAGE bits were being set in pte_mkcont() because PTE_TABLE_BIT was being cleared in pte_mkhuge(). But after arch_make_huge_pte() modification in commit f8192813dcbe ("arm64/mm: Re-organize arch_make_huge_pte()"), which dropped pte_mkhuge() completely, setting back PTE_TYPE_PAGE bits is no longer necessary. Change pte_mkcont() to only set PTE_CONT. Cc: Will Deacon <will@kernel.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20241104041617.3804617-1-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>