summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-06-03KVM: SVM: Consider NUMA affinity when allocating per-CPU save_areaLi RongQing
save_area of per-CPU svm_data are dominantly accessed from their own local CPUs, so allocate them node-local for performance reason so rename __snp_safe_alloc_page as snp_safe_alloc_page_node which accepts numa node id as input parameter, svm_cpu_init call it with node id switched from cpu id Signed-off-by: Li RongQing <lirongqing@baidu.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lore.kernel.org/r/20240520120858.13117-4-lirongqing@baidu.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-03KVM: SVM: not account memory allocation for per-CPU svm_dataLi RongQing
The allocation for the per-CPU save area in svm_cpu_init shouldn't be accounted, So introduce __snp_safe_alloc_page helper, which has gfp flag as input, svm_cpu_init calls __snp_safe_alloc_page with GFP_KERNEL, snp_safe_alloc_page calls __snp_safe_alloc_page with GFP_KERNEL_ACCOUNT as input Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Li RongQing <lirongqing@baidu.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lore.kernel.org/r/20240520120858.13117-3-lirongqing@baidu.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-03KVM: SVM: remove useless input parameter in snp_safe_alloc_pageLi RongQing
The input parameter 'vcpu' in snp_safe_alloc_page is not used. Therefore, remove it. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Li RongQing <lirongqing@baidu.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lore.kernel.org/r/20240520120858.13117-2-lirongqing@baidu.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-03Merge tag 'renesas-pinctrl-fixes-for-v6.10-tag1' of ↵Linus Walleij
git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-drivers into fixes pinctrl: renesas: Fixes for v6.10 - Fix PREEMPT_RT build failure on RZ/G2L. Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2024-06-03Merge tag 'cxl-fixes-6.10-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull cxl fixes from Dave Jiang: - Compile fix for cxl-test from missing linux/vmalloc.h - Fix for memregion leaks in devm_cxl_add_region() * tag 'cxl-fixes-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl/region: Fix memregion leaks in devm_cxl_add_region() cxl/test: Add missing vmalloc.h for tools/testing/cxl/test/mem.c
2024-06-03KVM: x86/pmu: Manipulate FIXED_CTR_CTRL MSR with macrosDapeng Mi
Magic numbers are used to manipulate the bit fields of FIXED_CTR_CTRL MSR. This makes reading code become difficult, so use pre-defined macros to replace these magic numbers. Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20240430005239.13527-3-dapeng1.mi@linux.intel.com [sean: drop unnecessary curly braces] Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-03KVM: x86/pmu: Change ambiguous _mask suffix to _rsvd in kvm_pmuDapeng Mi
Several '_mask' suffixed variables such as, global_ctrl_mask, are defined in kvm_pmu structure. However the _mask suffix is ambiguous and misleading since it's not a real mask with positive logic. On the contrary it represents the reserved bits of corresponding MSRs and these bits should not be accessed. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://lore.kernel.org/r/20240430005239.13527-2-dapeng1.mi@linux.intel.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-03KVM: x86/mmu: Only allocate shadowed translation cache for sp->role.level <= ↵Hou Wenlong
KVM_MAX_HUGEPAGE_LEVEL Only the indirect SP with sp->role.level <= KVM_MAX_HUGEPAGE_LEVEL might have leaf gptes, so allocation of shadowed translation cache is needed only for it. Then, it can use sp->shadowed_translation to determine whether to use the information in the shadowed translation cache or not. Also, extend the WARN in FNAME(sync_spte)() to ensure that this won't break shadow_mmu_get_sp_for_split(). Suggested-by: Lai Jiangshan <jiangshan.ljs@antgroup.com> Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com> Link: https://lore.kernel.org/r/5b0cda8a7456cda476b14fca36414a56f921dd52.1715398655.git.houwenlong.hwl@antgroup.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-03ima: fix wrong zero-assignment during securityfs dentry removeEnrico Bravi
In case of error during ima_fs_init() all the dentry already created are removed. {ascii, binary}_securityfs_measurement_lists are freed calling for each array the remove_securityfs_measurement_lists(). This function, at the end, assigns to zero the securityfs_measurement_list_count. This causes during the second call of remove_securityfs_measurement_lists() to leave the dentry of the array pending, not removing them correctly, because the securityfs_measurement_list_count is already zero. Move the securityfs_measurement_list_count = 0 after the two remove_securityfs_measurement_lists() calls to correctly remove all the dentry already allocated. Fixes: 9fa8e7625008 ("ima: add crypto agility support for template-hash algorithm") Signed-off-by: Enrico Bravi <enrico.bravi@polito.it> Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2024-06-03iio: inkern: fix channel read regressionJohan Hovold
A recent "cleanup" broke IIO channel read outs and thereby thermal mitigation on the Lenovo ThinkPad X13s by returning zero instead of the expected IIO value type in iio_read_channel_processed_scale(): thermal thermal_zone12: failed to read out thermal zone (-22) Fixes: 3092bde731ca ("iio: inkern: move to the cleanup.h magic") Cc: Nuno Sa <nuno.sa@analog.com> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Link: https://lore.kernel.org/r/20240530074416.13697-1-johan+linaro@kernel.org Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-06-03iio: imu: inv_mpu6050: stabilized timestamping in interruptJean-Baptiste Maneyrol
Use IRQ ONESHOT flag to ensure the timestamp is not updated in the hard handler during the thread handler. And use a fixed value of 1 sample that correspond to this first timestamp. This way we can ensure the timestamp is always corresponding to the value used by the timestamping mechanism. Otherwise, it is possible that between FIFO count read and FIFO processing the timestamp is overwritten in the hard handler. Fixes: 111e1abd0045 ("iio: imu: inv_mpu6050: use the common inv_sensors timestamp module") Cc: stable@vger.kernel.org Signed-off-by: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol@tdk.com> Link: https://lore.kernel.org/r/20240527150117.608792-1-inv.git-commit@tdk.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-06-03iio: adc: ad7173: Fix sampling frequency settingDumitru Ceclan
This patch fixes two issues regarding the sampling frequency setting: -The attribute was set as per device, not per channel. As such, when setting the sampling frequency, the configuration was always done for the slot 0, and the correct configuration was applied on the next channel configuration call by the LRU mechanism. -The LRU implementation does not take into account external settings of the slot registers. When setting the sampling frequency directly to a slot register in write_raw(), there is no guarantee that other channels were not also using that slot and now incorrectly retain their config as live. Set the sampling frequency attribute as separate in the channel templates. Do not set the sampling directly to the slot register in write_raw(), just mark the config as not live and let the LRU mechanism handle it. As the reg variable is no longer used, remove it. Fixes: 76a1e6a42802 ("iio: adc: ad7173: add AD7173 driver") Signed-off-by: Dumitru Ceclan <dumitru.ceclan@analog.com> Link: https://lore.kernel.org/r/20240530-ad7173-fixes-v3-5-b85f33079e18@analog.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-06-03iio: adc: ad7173: Clear append status bitDumitru Ceclan
The previous value of the append status bit was not cleared before setting the new value. This caused the bit to remain set after enabling buffered mode for multiple channels and not permit further buffered reads from a single channel after the fact. Fixes: 76a1e6a42802 ("iio: adc: ad7173: add AD7173 driver") Signed-off-by: Dumitru Ceclan <dumitru.ceclan@analog.com> Link: https://lore.kernel.org/r/20240530-ad7173-fixes-v3-4-b85f33079e18@analog.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-06-03KVM: x86: invalid_list not used anymore in mmu_shrink_scanLiang Chen
'invalid_list' is now gathered in KVM_MMU_ZAP_OLDEST_MMU_PAGES. Signed-off-by: Liang Chen <liangchen.linux@gmail.com> Link: https://lore.kernel.org/r/20240509044710.18788-1-liangchen.linux@gmail.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-03tools headers uapi: Sync linux/stat.h with the kernel sources to pick ↵Arnaldo Carvalho de Melo
STATX_SUBVOL To pick the changes from: 2a82bb02941fb53d ("statx: stx_subvol") This silences this perf build warning: Warning: Kernel ABI header differences: diff -u tools/include/uapi/linux/stat.h include/uapi/linux/stat.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/ZlnK2Fmx_gahzwZI@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-06-03Merge branch 'kvm-6.11-sev-snp' into HEADPaolo Bonzini
Pull base x86 KVM support for running SEV-SNP guests from Michael Roth: * add some basic infrastructure and introduces a new KVM_X86_SNP_VM vm_type to handle differences versus the existing KVM_X86_SEV_VM and KVM_X86_SEV_ES_VM types. * implement the KVM API to handle the creation of a cryptographic launch context, encrypt/measure the initial image into guest memory, and finalize it before launching it. * implement handling for various guest-generated events such as page state changes, onlining of additional vCPUs, etc. * implement the gmem/mmu hooks needed to prepare gmem-allocated pages before mapping them into guest private memory ranges as well as cleaning them up prior to returning them to the host for use as normal memory. Because those cleanup hooks supplant certain activities like issuing WBINVDs during KVM MMU invalidations, avoid duplicating that work to avoid unecessary overhead. This merge leaves out support support for attestation guest requests and for loading the signing keys to be used for attestation requests.
2024-06-03Merge tag 'kvm-riscv-fixes-6.10-1' of https://github.com/kvm-riscv/linux ↵Paolo Bonzini
into HEAD KVM/riscv fixes for 6.10, take #1 - No need to use mask when hart-index-bits is 0 - Fix incorrect reg_subtype labels in kvm_riscv_vcpu_set_reg_isa_ext()
2024-06-03Merge branch 'kvm-fixes-6.10-1' into HEADPaolo Bonzini
* Fixes and debugging help for the #VE sanity check. Also disable it by default, even for CONFIG_DEBUG_KERNEL, because it was found to trigger spuriously (most likely a processor erratum as the exact symptoms vary by generation). * Avoid WARN() when two NMIs arrive simultaneously during an NMI-disabled situation (GIF=0 or interrupt shadow) when the processor supports virtual NMI. While generally KVM will not request an NMI window when virtual NMIs are supported, in this case it *does* have to single-step over the interrupt shadow or enable the STGI intercept, in order to deliver the latched second NMI. * Drop support for hand tuning APIC timer advancement from userspace. Since we have adaptive tuning, and it has proved to work well, drop the module parameter for manual configuration and with it a few stupid bugs that it had.
2024-06-03Merge branch 'kvm-fixes-6.10-1' into HEADPaolo Bonzini
* Fixes and debugging help for the #VE sanity check. Also disable it by default, even for CONFIG_DEBUG_KERNEL, because it was found to trigger spuriously (most likely a processor erratum as the exact symptoms vary by generation). * Avoid WARN() when two NMIs arrive simultaneously during an NMI-disabled situation (GIF=0 or interrupt shadow) when the processor supports virtual NMI. While generally KVM will not request an NMI window when virtual NMIs are supported, in this case it *does* have to single-step over the interrupt shadow or enable the STGI intercept, in order to deliver the latched second NMI. * Drop support for hand tuning APIC timer advancement from userspace. Since we have adaptive tuning, and it has proved to work well, drop the module parameter for manual configuration and with it a few stupid bugs that it had.
2024-06-03KVM: x86: Drop support for hand tuning APIC timer advancement from userspaceSean Christopherson
Remove support for specifying a static local APIC timer advancement value, and instead present a read-only boolean parameter to let userspace enable or disable KVM's dynamic APIC timer advancement. Realistically, it's all but impossible for userspace to specify an advancement that is more precise than what KVM's adaptive tuning can provide. E.g. a static value needs to be tuned for the exact hardware and kernel, and if KVM is using hrtimers, likely requires additional tuning for the exact configuration of the entire system. Dropping support for a userspace provided value also fixes several flaws in the interface. E.g. KVM interprets a negative value other than -1 as a large advancement, toggling between a negative and positive value yields unpredictable behavior as vCPUs will switch from dynamic to static advancement, changing the advancement in the middle of VM creation can result in different values for vCPUs within a VM, etc. Those flaws are mostly fixable, but there's almost no justification for taking on yet more complexity (it's minimal complexity, but still non-zero). The only arguments against using KVM's adaptive tuning is if a setup needs a higher maximum, or if the adjustments are too reactive, but those are arguments for letting userspace control the absolute max advancement and the granularity of each adjustment, e.g. similar to how KVM provides knobs for halt polling. Link: https://lore.kernel.org/all/20240520115334.852510-1-zhoushuling@huawei.com Cc: Shuling Zhou <zhoushuling@huawei.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20240522010304.1650603-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-03KVM: SEV-ES: Delegate LBR virtualization to the processorRavi Bangoria
As documented in APM[1], LBR Virtualization must be enabled for SEV-ES guests. Although KVM currently enforces LBRV for SEV-ES guests, there are multiple issues with it: o MSR_IA32_DEBUGCTLMSR is still intercepted. Since MSR_IA32_DEBUGCTLMSR interception is used to dynamically toggle LBRV for performance reasons, this can be fatal for SEV-ES guests. For ex SEV-ES guest on Zen3: [guest ~]# wrmsr 0x1d9 0x4 KVM: entry failed, hardware error 0xffffffff EAX=00000004 EBX=00000000 ECX=000001d9 EDX=00000000 Fix this by never intercepting MSR_IA32_DEBUGCTLMSR for SEV-ES guests. No additional save/restore logic is required since MSR_IA32_DEBUGCTLMSR is of swap type A. o KVM will disable LBRV if userspace sets MSR_IA32_DEBUGCTLMSR before the VMSA is encrypted. Fix this by moving LBRV enablement code post VMSA encryption. [1]: AMD64 Architecture Programmer's Manual Pub. 40332, Rev. 4.07 - June 2023, Vol 2, 15.35.2 Enabling SEV-ES. https://bugzilla.kernel.org/attachment.cgi?id=304653 Fixes: 376c6d285017 ("KVM: SVM: Provide support for SEV-ES vCPU creation/loading") Co-developed-by: Nikunj A Dadhania <nikunj@amd.com> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Message-ID: <20240531044644.768-4-ravi.bangoria@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-03KVM: SEV-ES: Disallow SEV-ES guests when X86_FEATURE_LBRV is absentRavi Bangoria
As documented in APM[1], LBR Virtualization must be enabled for SEV-ES guests. So, prevent SEV-ES guests when LBRV support is missing. [1]: AMD64 Architecture Programmer's Manual Pub. 40332, Rev. 4.07 - June 2023, Vol 2, 15.35.2 Enabling SEV-ES. https://bugzilla.kernel.org/attachment.cgi?id=304653 Fixes: 376c6d285017 ("KVM: SVM: Provide support for SEV-ES vCPU creation/loading") Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Message-ID: <20240531044644.768-3-ravi.bangoria@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-03KVM: SEV-ES: Prevent MSR access post VMSA encryptionNikunj A Dadhania
KVM currently allows userspace to read/write MSRs even after the VMSA is encrypted. This can cause unintentional issues if MSR access has side- effects. For ex, while migrating a guest, userspace could attempt to migrate MSR_IA32_DEBUGCTLMSR and end up unintentionally disabling LBRV on the target. Fix this by preventing access to those MSRs which are context switched via the VMSA, once the VMSA is encrypted. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Message-ID: <20240531044644.768-2-ravi.bangoria@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-03KVM: SVM: Remove the need to trigger an UNBLOCK event on AP creationTom Lendacky
All SNP APs are initially started using the APIC INIT/SIPI sequence in the guest. This sequence moves the AP MP state from KVM_MP_STATE_UNINITIALIZED to KVM_MP_STATE_RUNNABLE, so there is no need to attempt the UNBLOCK. As it is, the UNBLOCK support in SVM is only enabled when AVIC is enabled. When AVIC is disabled, AP creation is still successful. Remove the KVM_REQ_UNBLOCK request from the AP creation code and revert the changes to the vcpu_unblocking() kvm_x86_ops path. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-03KVM: SEV: Don't WARN() if RMP lookup fails when invalidating gmem pagesPaolo Bonzini
The hook only handles cleanup work specific to SNP, e.g. RMP table entries and flushing caches for encrypted guest memory. When run on a non-SNP-enabled host (currently only possible using KVM_X86_SW_PROTECTED_VM, e.g. via KVM selftests), the callback is a noop and will WARN due to the RMP table not being present. It's actually expected in this case that the RMP table wouldn't be present and that the hook should be a noop, so drop the WARN_ONCE(). Reported-by: Sean Christopherson <seanjc@google.com> Closes: https://lore.kernel.org/kvm/ZkU3_y0UoPk5yAeK@google.com/ Fixes: 8eb01900b018 ("KVM: SEV: Implement gmem hook for invalidating private pages") Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-03KVM: SEV: Automatically switch reclaimed pages to sharedMichael Roth
Currently there's a consistent pattern of always calling host_rmp_make_shared() immediately after snp_page_reclaim(), so go ahead and handle it automatically as part of snp_page_reclaim(). Also rename it to kvm_rmp_make_shared() to more easily distinguish it as a KVM-specific variant of the more generic rmp_make_shared() helper. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-03Merge tag 'loongarch-fixes-6.10-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch fixes from Huacai Chen: "Some bootloader interface fixes, a dts fix, and a trivial cleanup" * tag 'loongarch-fixes-6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: LoongArch: Fix GMAC's phy-mode definitions in dts LoongArch: Override higher address bits in JUMP_VIRT_ADDR LoongArch: Fix entry point in kernel image header LoongArch: Add all CPUs enabled by fdt to NUMA node 0 LoongArch: Fix built-in DTB detection LoongArch: Remove CONFIG_ACPI_TABLE_UPGRADE in platform_init()
2024-06-03irqchip/gic-v3-its: Fix potential race condition in its_vlpi_prop_update()Hagar Hemdan
its_vlpi_prop_update() calls lpi_write_config() which obtains the mapping information for a VLPI without lock held. So it could race with its_vlpi_unmap(). Since all calls from its_irq_set_vcpu_affinity() require the same lock to be held, hoist the locking there instead of sprinkling the locking all over the place. This bug was discovered using Coverity Static Analysis Security Testing (SAST) by Synopsys, Inc. [ tglx: Use guard() instead of goto ] Fixes: 015ec0386ab6 ("irqchip/gic-v3-its: Add VLPI configuration handling") Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Hagar Hemdan <hagarhem@amazon.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240531162144.28650-1-hagarhem@amazon.com
2024-06-03bpf: Fix a potential use-after-free in bpf_link_free()Cong Wang
After commit 1a80dbcb2dba, bpf_link can be freed by link->ops->dealloc_deferred, but the code still tests and uses link->ops->dealloc afterward, which leads to a use-after-free as reported by syzbot. Actually, one of them should be sufficient, so just call one of them instead of both. Also add a WARN_ON() in case of any problematic implementation. Fixes: 1a80dbcb2dba ("bpf: support deferring bpf_link dealloc to after RCU grace period") Reported-by: syzbot+1989ee16d94720836244@syzkaller.appspotmail.com Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20240602182703.207276-1-xiyou.wangcong@gmail.com
2024-06-03KVM: VMX: Switch to new Intel CPU model infrastructureTony Luck
Use x86_vfm (vendor, family, module) to detect CPUs that are affected by PERF_GLOBAL_CTRL bugs instead of manually checking the family and model. The new VFM infrastructure encodes all information in one handy location. No functional change intended. Signed-off-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20240520224620.9480-10-tony.luck@intel.com [sean: massage changelog] Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-03KVM: x86/pmu: Switch to new Intel CPU model definesTony Luck
Use X86_MATCH_VFM(), which does Vendor checking in addition to Family and Model checking, to do FMS-based detection of PEBS features. No functional change intended. Signed-off-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20240520224620.9480-9-tony.luck@intel.com [sean: massage changelog] Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-03KVM: x86: Remove IA32_PERF_GLOBAL_OVF_CTRL from KVM_GET_MSR_INDEX_LISTJim Mattson
This MSR reads as 0, and any host-initiated writes are ignored, so there's no reason to enumerate it in KVM_GET_MSR_INDEX_LIST. Signed-off-by: Jim Mattson <jmattson@google.com> Link: https://lore.kernel.org/r/20231113184854.2344416-1-jmattson@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-03cpufreq: intel_pstate: Fix unchecked HWP MSR accessSrinivas Pandruvada
Fix unchecked MSR access error for processors with no HWP support. On such processors, maximum frequency can be changed by the system firmware using ACPI event ACPI_PROCESSOR_NOTIFY_HIGEST_PERF_CHANGED. This results in accessing HWP MSR 0x771. Call Trace: <TASK> generic_exec_single+0x58/0x120 smp_call_function_single+0xbf/0x110 rdmsrl_on_cpu+0x46/0x60 intel_pstate_get_hwp_cap+0x1b/0x70 intel_pstate_update_limits+0x2a/0x60 acpi_processor_notify+0xb7/0x140 acpi_ev_notify_dispatch+0x3b/0x60 HWP MSR 0x771 can be only read on a CPU which supports HWP and enabled. Hence intel_pstate_get_hwp_cap() can only be called when hwp_active is true. Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Closes: https://lore.kernel.org/linux-pm/20240529155740.Hq2Hw7be@linutronix.de/ Fixes: e8217b4bece3 ("cpufreq: intel_pstate: Update the maximum CPU frequency consistently") Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-03KVM: x86: Move shadow_phys_bits into "kvm_host", as "maxphyaddr"Sean Christopherson
Move shadow_phys_bits into "struct kvm_host_values", i.e. into KVM's global "kvm_host" variable, so that it is automatically exported for use in vendor modules. Rename the variable/field to maxphyaddr to more clearly capture what value it holds, now that it's used outside of the MMU (and because the "shadow" part is more than a bit misleading as the variable is not at all unique to shadow paging). Recomputing the raw/true host.MAXPHYADDR on every use can be subtly expensive, e.g. it will incur a VM-Exit on the CPUID if KVM is running as a nested hypervisor. Vendor code already has access to the information, e.g. by directly doing CPUID or by invoking kvm_get_shadow_phys_bits(), so there's no tangible benefit to making it MMU-only. Link: https://lore.kernel.org/r/20240423221521.2923759-5-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-03KVM: x86/mmu: Snapshot shadow_phys_bits when kvm.ko is loadedSean Christopherson
Snapshot shadow_phys_bits when kvm.ko is loaded, not when a vendor module is loaded, to guard against usage of shadow_phys_bits before it is initialized. The computation isn't vendor specific in any way, i.e. there there is no reason to wait to snapshot the value until a vendor module is loaded, nor is there any reason to recompute the value every time a vendor module is loaded. Opportunistically convert it from "read mostly" to "read-only after init". Link: https://lore.kernel.org/r/20240423221521.2923759-4-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-03KVM: SVM: Use KVM's snapshot of the host's XCR0 for SEV-ES host stateSean Christopherson
Use KVM's snapshot of the host's XCR0 when stuffing SEV-ES host state instead of reading XCR0 from hardware. XCR0 is only written during boot, i.e. won't change while KVM is running (and KVM at large is hosed if that doesn't hold true). Link: https://lore.kernel.org/r/20240423221521.2923759-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-03KVM: x86: Add a struct to consolidate host values, e.g. EFER, XCR0, etc...Sean Christopherson
Add "struct kvm_host_values kvm_host" to hold the various host values that KVM snapshots during initialization. Bundling the host values into a single struct simplifies adding new MSRs and other features with host state/values that KVM cares about, and provides a one-stop shop. E.g. adding a new value requires one line, whereas tracking each value individual often requires three: declaration, definition, and export. No functional change intended. Link: https://lore.kernel.org/r/20240423221521.2923759-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-03KVM: selftests: remove unused struct 'memslot_antagonist_args'Dr. David Alan Gilbert
'memslot_antagonist_args' is unused since the original commit f73a3446252e ("KVM: selftests: Add memslot modification stress test"). Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20240602235529.228204-1-linux@treblig.org Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-03KVM: fix documentation rendering for KVM_CAP_VM_MOVE_ENC_CONTEXT_FROMJulian Stecklina
The documentation for KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM doesn't use the correct keyword formatting, which breaks rendering on https://www.kernel.org/doc/html/latest/virt/kvm/api.html. Signed-off-by: Julian Stecklina <julian.stecklina@cyberus-technology.de> Link: https://lore.kernel.org/r/20240520143220.340737-1-julian.stecklina@cyberus-technology.de Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-03Revert "KVM: async_pf: avoid recursive flushing of work items"Sean Christopherson
Now that KVM does NOT gift async #PF workers a "struct kvm" reference, don't bother skipping "done" workers when flushing/canceling queued workers, as the deadlock that was being fudged around can no longer occur. When workers, i.e. async_pf_execute(), were gifted a referenced, it was possible for a worker to put the last reference and trigger VM destruction, i.e. trigger flushing of a workqueue from a worker in said workqueue. Note, there is no actual lock, the deadlock was that a worker will be stuck waiting for itself (the workqueue code simulates a lock/unlock via lock_map_{acquire,release}()). Skipping "done" workers isn't problematic per se, but using work->vcpu as a "done" flag is confusing, e.g. it's not clear that async_pf.lock is acquired to protect the work->vcpu, NOT the processing of async_pf.queue (which is protected by vcpu->mutex). This reverts commit 22583f0d9c85e60c9860bc8a0ebff59fe08be6d7. Suggested-by: Xu Yilun <yilun.xu@linux.intel.com> Link: https://lore.kernel.org/r/20240423191649.2885257-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-03KVM: Update halt polling documentation to note that KVM has 4 module paramsParshuram Sangle
Update KVM's halt-polling documentation to correclty reflect that KVM has 4 relevant module params, not 3 params. Co-developed-by: Rajendran Jaishankar <jaishankar.rajendran@intel.com> Signed-off-by: Rajendran Jaishankar <jaishankar.rajendran@intel.com> Signed-off-by: Parshuram Sangle <parshuram.sangle@intel.com> Link: https://lore.kernel.org/r/20231102154628.2120-3-parshuram.sangle@intel.com [sean: drop unrelated and misleading doc update] Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-03KVM: Enable halt polling shrink parameter by defaultParshuram Sangle
Default halt_poll_ns_shrink value of 0 always resets polling interval to 0 on an un-successful poll where vcpu wakeup is not received. This is mostly to avoid pointless polling for more number of shorter intervals. But disabled shrink assumes vcpu wakeup is less likely to be received in subsequent shorter polling intervals. Another side effect of 0 shrink value is that, even on a successful poll if total block time was greater than current polling interval, the polling interval starts over from 0 instead of shrinking by a factor. Enabling shrink with value of 2 allows the polling interval to gradually decrement in case of un-successful poll events as well. This gives a fair chance for successful polling events in subsequent polling intervals rather than resetting it to 0 and starting over from grow_start. Below kvm stat log snippet shows interleaved growth and shrinking of polling interval: 87162647182125: kvm_halt_poll_ns: vcpu 0: halt_poll_ns 10000 (grow 0) 87162647637763: kvm_halt_poll_ns: vcpu 0: halt_poll_ns 20000 (grow 10000) 87162649627943: kvm_halt_poll_ns: vcpu 0: halt_poll_ns 40000 (grow 20000) 87162650892407: kvm_halt_poll_ns: vcpu 0: halt_poll_ns 20000 (shrink 40000) 87162651540378: kvm_halt_poll_ns: vcpu 0: halt_poll_ns 40000 (grow 20000) 87162652276768: kvm_halt_poll_ns: vcpu 0: halt_poll_ns 20000 (shrink 40000) 87162652515037: kvm_halt_poll_ns: vcpu 0: halt_poll_ns 40000 (grow 20000) 87162653383787: kvm_halt_poll_ns: vcpu 0: halt_poll_ns 20000 (shrink 40000) 87162653627670: kvm_halt_poll_ns: vcpu 0: halt_poll_ns 10000 (shrink 20000) 87162653796321: kvm_halt_poll_ns: vcpu 0: halt_poll_ns 20000 (grow 10000) 87162656171645: kvm_halt_poll_ns: vcpu 0: halt_poll_ns 10000 (shrink 20000) 87162661607487: kvm_halt_poll_ns: vcpu 0: halt_poll_ns 0 (shrink 10000) Having both grow and shrink enabled creates a balance in polling interval growth and shrink behavior. Tests show improved successful polling attempt ratio which contribute to VM performance. Power penalty is quite negligible as shrunk polling intervals create bursts of very short durations. Performance assessment results show 3-6% improvements in CPU+GPU, Memory and Storage Android VM workloads whereas 5-9% improvement in average FPS of gaming VM workloads. Power penalty is below 1% where host OS is either idle or running a native workload having 2 VMs enabled. CPU/GPU intensive gaming workloads as well do not show any increased power overhead with shrink enabled. Co-developed-by: Rajendran Jaishankar <jaishankar.rajendran@intel.com> Signed-off-by: Rajendran Jaishankar <jaishankar.rajendran@intel.com> Signed-off-by: Parshuram Sangle <parshuram.sangle@intel.com> Link: https://lore.kernel.org/r/20231102154628.2120-2-parshuram.sangle@intel.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-03KVM: Unexport kvm_debugfs_dirBorislav Petkov
After faf01aef0570 ("KVM: PPC: Merge powerpc's debugfs entry content into generic entry") kvm_debugfs_dir is not used anywhere else outside of kvm_main.c Unexport it and make it static. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240515150804.9354-1-bp@kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-03x86/kexec: Fix bug with call depth trackingDavid Kaplan
The call to cc_platform_has() triggers a fault and system crash if call depth tracking is active because the GS segment has been reset by load_segments() and GS_BASE is now 0 but call depth tracking uses per-CPU variables to operate. Call cc_platform_has() earlier in the function when GS is still valid. [ bp: Massage. ] Fixes: 5d8213864ade ("x86/retbleed: Add SKL return thunk") Signed-off-by: David Kaplan <david.kaplan@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Cc: <stable@kernel.org> Link: https://lore.kernel.org/r/20240603083036.637-1-bp@kernel.org
2024-06-03bpf, devmap: Remove unnecessary if check in for loopThorsten Blum
The iterator variable dst cannot be NULL and the if check can be removed. Remove it and fix the following Coccinelle/coccicheck warning reported by itnull.cocci: ERROR: iterator variable bound on line 762 cannot be NULL Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20240529101900.103913-2-thorsten.blum@toblux.com
2024-06-03ASoC: rt722-sdca-sdw: add silence detection register as volatileJack Yu
Including silence detection register as volatile. Signed-off-by: Jack Yu <jack.yu@realtek.com> Link: https://msgid.link/r/c66a6bd6d220426793096b42baf85437@realtek.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-06-03ASoC: fsl: add missing MODULE_DESCRIPTION() macroJeff Johnson
make allmodconfig && make W=1 C=1 reports: WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/fsl/imx-pcm-dma.o Add the missing invocation of the MODULE_DESCRIPTION() macro. Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Link: https://msgid.link/r/20240602-md-snd-fsl-imx-pcm-dma-v1-1-e7efc33c6bf3@quicinc.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-06-03ASoC: mxs: add missing MODULE_DESCRIPTION() macroJeff Johnson
make allmodconfig && make W=1 C=1 reports: WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/mxs/snd-soc-mxs-pcm.o Add the missing invocation of the MODULE_DESCRIPTION() macro. Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Link: https://msgid.link/r/20240602-md-snd-soc-mxs-pcm-v1-1-1e663d11328d@quicinc.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-06-03MAINTAINERS: copy linux-arm-msm for sound/qcom changesDmitry Baryshkov
Not having linux-arm-msm@ in cc for audio-related changes for Qualcomm platforms means that interested parties can easily miss the patches. Add corresponding L: entry so that linux-arm-msm ML gets CC'ed for audio patches too. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Link: https://msgid.link/r/20240531-asoc-qcom-cc-lamsm-v1-1-f026ad618496@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
2024-06-03ASoC: SOF: Intel: hda-dai: remove skip_tlv labelBard Liao
We just return 0 after the skip_tlv label. No need to use a label. Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com> Link: https://msgid.link/r/20240603073224.14726-3-peter.ujfalusi@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>