KVM: arm64: Avoid soft lockups due to I-cache maintenance

Gavin reports of soft lockups on his Ampere Altra Max machine when backing KVM guests with hugetlb pages. Upon further investigation, it was found that the system is unable to keep up with parallel I-cache invalidations done by KVM's stage-2 fault handler. This is ultimately an implementation problem. I-cache maintenance instructions are available at EL0, so nothing stops a malicious userspace from hammering a system with CMOs and cause it to fall over. "Fixing" this problem in KVM is nothing more than slapping a bandage over a much deeper problem. Anyway, the kernel already has a heuristic for limiting TLB invalidations to avoid soft lockups. Reuse that logic to limit I-cache CMOs done by KVM to map executable pages on systems without FEAT_DIC. While at it, restructure __invalidate_icache_guest_page() to improve readability and squeeze our new condition into the existing branching structure. Link: https://lore.kernel.org/kvmarm/20230904072826.1468907-1-gshan@redhat.com/ Reviewed-by: Gavin Shan <gshan@redhat.com> Tested-by: Gavin Shan <gshan@redhat.com> Link: https://lore.kernel.org/r/20230920080133.944717-3-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
author: Oliver Upton <oliver.upton@linux.dev> 2023-09-20 08:01:33 +0000
committer: Oliver Upton <oliver.upton@linux.dev> 2023-09-22 17:55:05 +0000
commit: 909b583f81b5bb5a398d4580543f59b908a86ccc (patch)
tree: eb57b723c2583c80f2f2911793f935591144e57f
parent: ec1c3b9ff16082f880b304be40992568f4eee6a7 (diff)
1 files changed, 31 insertions, 6 deletions
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 96a80e8f6226..a425ecdd7be0 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -224,16 +224,41 @@ static inline void __clean_dcache_guest_page(void *va, size_t size)
 	kvm_flush_dcache_to_poc(va, size);
 }
 
+static inline size_t __invalidate_icache_max_range(void)
+{
+	u8 iminline;
+	u64 ctr;
+
+	asm volatile(ALTERNATIVE_CB("movz %0, #0\n"
+				    "movk %0, #0, lsl #16\n"
+				    "movk %0, #0, lsl #32\n"
+				    "movk %0, #0, lsl #48\n",
+				    ARM64_ALWAYS_SYSTEM,
+				    kvm_compute_final_ctr_el0)
+		     : "=r" (ctr));
+
+	iminline = SYS_FIELD_GET(CTR_EL0, IminLine, ctr) + 2;
+	return MAX_DVM_OPS << iminline;
+}
+
 static inline void __invalidate_icache_guest_page(void *va, size_t size)
 {
-	if (icache_is_aliasing()) {
-		/* any kind of VIPT cache */
+	/*
+	 * VPIPT I-cache maintenance must be done from EL2. See comment in the
+	 * nVHE flavor of __kvm_tlb_flush_vmid_ipa().
+	 */
+	if (icache_is_vpipt() && read_sysreg(CurrentEL) != CurrentEL_EL2)
+		return;
+
+	/*
+	 * Blow the whole I-cache if it is aliasing (i.e. VIPT) or the
+	 * invalidation range exceeds our arbitrary limit on invadations by
+	 * cache line.
+	 */
+	if (icache_is_aliasing() || size > __invalidate_icache_max_range())
 		icache_inval_all_pou();
-	} else if (read_sysreg(CurrentEL) != CurrentEL_EL1 ||
-		   !icache_is_vpipt()) {
-		/* PIPT or VPIPT at EL2 (see comment in __kvm_tlb_flush_vmid_ipa) */
+	else
 		icache_inval_pou((unsigned long)va, (unsigned long)va + size);
-	}
 }
 
 void kvm_set_way_flush(struct kvm_vcpu *vcpu);
author	Oliver Upton <oliver.upton@linux.dev>	2023-09-20 08:01:33 +0000
committer	Oliver Upton <oliver.upton@linux.dev>	2023-09-22 17:55:05 +0000
commit	909b583f81b5bb5a398d4580543f59b908a86ccc (patch)
tree	eb57b723c2583c80f2f2911793f935591144e57f
parent	ec1c3b9ff16082f880b304be40992568f4eee6a7 (diff)