summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-06-24ASoC: SOF: Intel: hda-loader: Make sure that the fw load sequence is followedPeter Ujfalusi
The hda_dsp_enable_core() is powering up _and_ unstall the core in one call while the first step of the firmware loading must not unstall the core. The core can be unstalled only after the set cpb_cfp and the configuration of the IPC register for the ROM_CONTROL message. Complements: 2a68ff846164 ("ASoC: SOF: Intel: hda: Revisit IMR boot sequence") Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/20220609085949.29062-3-peter.ujfalusi@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-24ASoC: SOF: Intel: hda-dsp: Expose hda_dsp_core_power_up()Peter Ujfalusi
The hda_dsp_core_power_up() needs to be exposed so that it can be used in hda-loader.c to correct the boot flow. The first step must not unstall the core, it should only power up the core(s). Add sanity check for the core_mask while exposing it to be safe. Complements: 2a68ff846164 ("ASoC: SOF: Intel: hda: Revisit IMR boot sequence") Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/20220609085949.29062-2-peter.ujfalusi@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-24ASoC: ak4613: cares Simple-Audio-Card case for TDMKuninori Morimoto
Renesas is the only user of ak4613 on upstream for now, and commit f28dbaa958fbd8 ("ASoC: ak4613: add TDM256 support") added TDM256 support. Renesas tested part of it, because of board connection. It was assuming ak4613 is probed via Audio-Graph-Card, but it might be probed via Simple-Audio-Card either. It will indicates WARNING in such case. This patch fixup it. Reported-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/87h74v29f7.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-24ASoC: codecs: rt700/rt711/rt711-sdca: resume bus/codec in .set_jack_detectPierre-Louis Bossart
The .set_jack_detect() codec component callback is invoked during card registration, which happens when the machine driver is probed. The issue is that this callback can race with the bus suspend/resume, and IO timeouts can happen. This can be reproduced very easily if the machine driver is 'blacklisted' and manually probed after the bus suspends. The bus and codec need to be re-initialized using pm_runtime helpers. Previous contributions tried to make sure accesses to the bus during the .set_jack_detect() component callback only happen when the bus is active. This was done by changing the regcache status on a component remove. This is however a layering violation, the regcache status should only be modified on device probe, suspend and resume. The component probe/remove should not modify how the device regcache is handled. This solution also didn't handle all the possible race conditions, and the RT700 headset codec was not handled. This patch tries to resume the codec device before handling the jack initializations. In case the codec has not yet been initialized, pm_runtime may not be enabled yet, so we don't squelch the -EACCES error code and only stop the jack information. When the codec reports as attached, the jack initialization will proceed as usual. BugLink: https://github.com/thesofproject/linux/issues/3643 Fixes: 7ad4d237e7c4a ('ASoC: rt711-sdca: Add RT711 SDCA vendor-specific driver') Fixes: 899b12542b089 ('ASoC: rt711: add snd_soc_component remove callback') Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Rander Wang <rander.wang@intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://lore.kernel.org/r/20220606203752.144159-8-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-24ASoC: codecs: rt700/rt711/rt711-sdca: initialize workqueues in probePierre-Louis Bossart
The workqueues are initialized in the io_init functions, which isn't quite right. In some tests, this leads to warnings throw from __queue_delayed_work() WARN_ON_FUNCTION_MISMATCH(timer->function, delayed_work_timer_fn); Move all the initializations to the probe functions. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Rander Wang <rander.wang@intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://lore.kernel.org/r/20220606203752.144159-7-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-24ASoC: rt7*-sdw: harden jack_detect_handlerPierre-Louis Bossart
Realtek headset codec drivers typically check if the card is instantiated before proceeding with the jack detection. The rt700, rt711 and rt711-sdca are however missing a check on the card pointer, which can lead to NULL dereferences encountered in driver bind/unbind tests. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Rander Wang <rander.wang@intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://lore.kernel.org/r/20220606203752.144159-6-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-24ASoC: rt711: fix calibrate mutex initializationPierre-Louis Bossart
Follow the same flow as rt711-sdca and initialize all mutexes at probe time. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Rander Wang <rander.wang@intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://lore.kernel.org/r/20220606203752.144159-5-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-24ASoC: Intel: sof_sdw: handle errors on card registrationPierre-Louis Bossart
If the card registration fails, typically because of deferred probes, the device properties added for headset codecs are not removed, which leads to kernel oopses in driver bind/unbind tests. We already clean-up the device properties when the card is removed, this code can be moved as a helper and called upon card registration errors. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Rander Wang <rander.wang@intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://lore.kernel.org/r/20220606203752.144159-4-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-24ASoC: rt711-sdca-sdw: fix calibrate mutex initializationPierre-Louis Bossart
In codec driver bind/unbind test, the following warning is thrown: DEBUG_LOCKS_WARN_ON(lock->magic != lock) ... [ 699.182495] rt711_sdca_jack_init+0x1b/0x1d0 [snd_soc_rt711_sdca] [ 699.182498] rt711_sdca_set_jack_detect+0x3b/0x90 [snd_soc_rt711_sdca] [ 699.182500] snd_soc_component_set_jack+0x24/0x50 [snd_soc_core] A quick check in the code shows that the 'calibrate_mutex' used by this driver are not initialized at probe time. Moving the initialization to the probe removes the issue. BugLink: https://github.com/thesofproject/linux/issues/3644 Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Rander Wang <rander.wang@intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://lore.kernel.org/r/20220606203752.144159-3-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-24ASoC: Realtek/Maxim SoundWire codecs: disable pm_runtime on removePierre-Louis Bossart
When binding/unbinding codec drivers, the following warnings are thrown: [ 107.266879] rt715-sdca sdw:3:025d:0714:01: Unbalanced pm_runtime_enable! [ 306.879700] rt711-sdca sdw:0:025d:0711:01: Unbalanced pm_runtime_enable! Add a remove callback for all Realtek/Maxim SoundWire codecs and remove this warning. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Rander Wang <rander.wang@intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://lore.kernel.org/r/20220606203752.144159-2-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-06-24Merge tag 'memory-controller-drv-fixes-5.19' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-mem-ctrl into arm/fixes Memory controller drivers - fixes for v5.19 Broken in current cycle: 1. OMAP GPMC: fix Kconfig dependency for OMAP_GPMC, so it will not be visible for everyone (driver is specific to OMAP). Broken before: 1. Mediatek SMI: fix missing put_device() in error paths. 2. Exynos DMC: fix OF node leaks in error paths. * tag 'memory-controller-drv-fixes-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-mem-ctrl: memory: samsung: exynos5422-dmc: Fix refcount leak in of_get_dram_timings memory: mtk-smi: add missing put_device() call in mtk_smi_device_link_common memory: omap-gpmc: OMAP_GPMC should depend on ARCH_OMAP2PLUS || ARCH_KEYSTONE || ARCH_K3 Link: https://lore.kernel.org/r/20220624081819.33617-1-krzysztof.kozlowski@linaro.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-06-24Merge tag 'samsung-fixes-5.19' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux into arm/fixes Samsung fixes for v5.19 Both fixes are for issues present before v5.19 merge window: 1. Correct UART clocks on Exynos7885. Although the initial, fixed DTS commit is from v5.18, the issue will be exposed with a upcoming fix to Exynos7885 clock driver, so we need to correct the DTS earlier. 2. Fix theoretical OF node leak in Exynos machine code. * tag 'samsung-fixes-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux: ARM: exynos: Fix refcount leak in exynos_map_pmu arm64: dts: exynos: Correct UART clocks on Exynos7885 Link: https://lore.kernel.org/r/20220624080423.31427-1-krzysztof.kozlowski@linaro.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-06-24Merge tag 'arm-soc/for-5.19/devicetree-fixes' of ↵Arnd Bergmann
https://github.com/Broadcom/stblinux into arm/fixes This pull request contains Broadcom ARM-SoC Device Tree fixes for 5.19, please pull the following: - Stefan fixes the Raspberry Pi 400 GPIO expander line names to match that of the downstream Raspberry Pi Linux tree * tag 'arm-soc/for-5.19/devicetree-fixes' of https://github.com/Broadcom/stblinux: ARM: dts: bcm2711-rpi-400: Fix GPIO line names Link: https://lore.kernel.org/r/20220620170745.2485199-1-f.fainelli@gmail.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-06-24Merge tag 'ti-k3-dt-fixes-for-v5.19' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/ti/linux into arm/fixes Devicetree fixes for TI K3 platforms for v5.19 Critical fixes for the following: * Boot failure on J721s2 (overlap GIC memory map) * AM64 boot fails on highspeed cards (SoC characterization updates) * tag 'ti-k3-dt-fixes-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ti/linux: arm64: dts: ti: k3-am64-main: Remove support for HS400 speed mode arm64: dts: ti: k3-j721s2: Fix overlapping GICD memory region Link: https://lore.kernel.org/r/20220618031627.xxvscc22c6doaa3t@kahuna Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-06-24arm: mach-spear: Add missing of_node_put() in time.cLiang He
In spear_setup_of_timer(), of_find_matching_node() will return a node pointer with refcount incrementd. We should use of_node_put() in each fail path or when it is not used anymore. Signed-off-by: Liang He <windhl@126.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lore.kernel.org/r/20220616093027.3984903-1-windhl@126.com' Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-06-24ARM: cns3xxx: Fix refcount leak in cns3xxx_initMiaoqian Lin
of_find_compatible_node() returns a node pointer with refcount incremented, we should use of_node_put() on it when done. Add missing of_node_put() to avoid refcount leak. Fixes: 415f59142d9d ("ARM: cns3xxx: initial DT support") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Acked-by: Krzysztof Halasa <khalasa@piap.pl> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-06-24MAINTAINERS: Update email addressThara Gopinath
Update my email address in the MAINTAINERS file as the current one will stop functioning in a while. Signed-off-by: Thara Gopinath <thara.gopinath@linaro.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-06-24cifs: avoid deadlocks while updating ifaceShyam Prasad N
We use cifs_tcp_ses_lock to protect a lot of things. Not only does it protect the lists of connections, sessions, tree connects, open file lists, etc., we also use it to protect some fields in each of it's entries. In this case, cifs_mark_ses_for_reconnect takes the cifs_tcp_ses_lock to traverse the lists, and then calls cifs_update_iface. However, that can end up calling cifs_put_tcp_session, which picks up the same lock again. Avoid this by taking a ref for the session, drop the lock, and then call update iface. Also, in cifs_update_iface, avoid nested locking of iface_lock and chan_lock, as much as possible. When unavoidable, we need to pick iface_lock first. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-06-24MAINTAINERS: Add new IOMMU development mailing listJoerg Roedel
The IOMMU mailing list will move from lists.linux-foundation.org to lists.linux.dev. The hard switch of the archive will happen on July 5th, but add the new list now already so that people start using the list when sending patches. After July 5th the old list will disappear. Cc: stable@vger.kernel.org Signed-off-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20220624125139.412-1-joro@8bytes.org Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-06-24usb: chipidea: udc: check request status before setting device addressXu Yang
The complete() function may be called even though request is not completed. In this case, it's necessary to check request status so as not to set device address wrongly. Fixes: 10775eb17bee ("usb: chipidea: udc: update gadget states according to ch9") cc: <stable@vger.kernel.org> Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Link: https://lore.kernel.org/r/20220623030242.41796-1-xu.yang_2@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-24USB: gadget: Fix double-free bug in raw_gadget driverAlan Stern
Re-reading a recently merged fix to the raw_gadget driver showed that it inadvertently introduced a double-free bug in a failure pathway. If raw_ioctl_init() encounters an error after the driver ID number has been allocated, it deallocates the ID number before returning. But when dev_free() runs later on, it will then try to deallocate the ID number a second time. Closely related to this issue is another error in the recent fix: The ID number is stored in the raw_dev structure before the code checks to see whether the structure has already been initialized, in which case the new ID number would overwrite the earlier value. The solution to both bugs is to keep the new ID number in a local variable, and store it in the raw_dev structure only after the check for prior initialization. No errors can occur after that point, so the double-free will never happen. Fixes: f2d8c2606825 ("usb: gadget: Fix non-unique driver names in raw-gadget driver") CC: Andrey Konovalov <andreyknvl@gmail.com> CC: <stable@vger.kernel.org> Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Link: https://lore.kernel.org/r/YrMrRw5AyIZghN0v@rowland.harvard.edu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-24ALSA: usb-audio: Workarounds for Behringer UMC 204/404 HDTakashi Iwai
Both Behringer UMC 202 HD and 404 HD need explicit quirks to enable the implicit feedback mode and start the playback stream primarily. The former seems fixing the stuttering and the latter is required for a playback-only case. Note that the "clock source 41 is not valid" error message still appears even after this fix, but it should be only once at probe. The reason of the error is still unknown, but this seems to be mostly harmless as it's a one-off error and the driver retires the clock setup and it succeeds afterwards. BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=215934 Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20220624101132.14528-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-06-24crypto: ccp - Fix device IRQ counting by using platform_irq_count()Tom Lendacky
The ccp driver loops through the platform device resources array to get the IRQ count for the device. With commit a1a2b7125e10 ("of/platform: Drop static setup of IRQ resource from DT core"), the IRQ resources are no longer stored in the platform device resource array. As a result, the IRQ count is now always zero. This causes the driver to issue a second call to platform_get_irq(), which fails if the IRQ count is really 1, causing the loading of the driver to fail. Replace looping through the resources array to count the number of IRQs with a call to platform_irq_count(). Fixes: a1a2b7125e10 ("of/platform: Drop static setup of IRQ resource from DT core") Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-06-24KVM: selftests: Enhance handling WRMSR ICR register in x2APIC modeZeng Guang
Hardware would directly write x2APIC ICR register instead of software emulation in some circumstances, e.g when Intel IPI virtualization is enabled. This behavior requires normal reserved bits checking to ensure them input as zero, otherwise it will cause #GP. So we need mask out those reserved bits from the data written to vICR register. Remove Delivery Status bit emulation in test case as this flag is invalid and not needed in x2APIC mode. KVM may ignore clearing it during interrupt dispatch which will lead to fake test failure. Opportunistically correct vector number for test sending IPI to non-existent vCPUs. Signed-off-by: Zeng Guang <guang.zeng@intel.com> Message-Id: <20220623094511.26066-1-guang.zeng@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: selftests: Add a self test for CMCI and UCNA emulations.Jue Wang
This patch add a self test that verifies user space can inject UnCorrectable No Action required (UCNA) memory errors to the guest. It also verifies that incorrectly configured MSRs for Corrected Machine Check Interrupt (CMCI) emulation will result in #GP. Signed-off-by: Jue Wang <juew@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220610171134.772566-9-juew@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: x86: Enable CMCI capability by default and handle injected UCNA errorsJue Wang
This patch enables MCG_CMCI_P by default in kvm_mce_cap_supported. It reuses ioctl KVM_X86_SET_MCE to implement injection of UnCorrectable No Action required (UCNA) errors, signaled via Corrected Machine Check Interrupt (CMCI). Neither of the CMCI and UCNA emulations depends on hardware. Signed-off-by: Jue Wang <juew@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220610171134.772566-8-juew@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: x86: Add emulation for MSR_IA32_MCx_CTL2 MSRs.Jue Wang
This patch adds the emulation of IA32_MCi_CTL2 registers to KVM. A separate mci_ctl2_banks array is used to keep the existing mce_banks register layout intact. In Machine Check Architecture, in addition to MCG_CMCI_P, bit 30 of the per-bank register IA32_MCi_CTL2 controls whether Corrected Machine Check error reporting is enabled. Signed-off-by: Jue Wang <juew@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220610171134.772566-7-juew@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: x86: Use kcalloc to allocate the mce_banks array.Jue Wang
This patch updates the allocation of mce_banks with the array allocation API (kcalloc) as a precedent for the later mci_ctl2_banks to implement per-bank control of Corrected Machine Check Interrupt (CMCI). Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Jue Wang <juew@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220610171134.772566-6-juew@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: x86: Add Corrected Machine Check Interrupt (CMCI) emulation to lapic.Jue Wang
This patch calculates the number of lvt entries as part of KVM_X86_MCE_SETUP conditioned on the presence of MCG_CMCI_P bit in MCG_CAP and stores result in kvm_lapic. It translats from APIC_LVTx register to index in lapic_lvt_entry enum. It extends the APIC_LVTx macro as well as other lapic write/reset handling etc to support Corrected Machine Check Interrupt. Signed-off-by: Jue Wang <juew@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220610171134.772566-5-juew@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: x86: Add APIC_LVTx() macro.Jue Wang
An APIC_LVTx macro is introduced to calcualte the APIC_LVTx register offset based on the index in the lapic_lvt_entry enum. Later patches will extend the APIC_LVTx macro to support the APIC_LVTCMCI register in order to implement Corrected Machine Check Interrupt signaling. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Jue Wang <juew@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220610171134.772566-4-juew@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: x86/mmu: Avoid unnecessary flush on eager page splitPaolo Bonzini
The TLB flush before installing the newly-populated lower level page table is unnecessary if the lower-level page table maps the huge page identically. KVM knows it is if it did not reuse an existing shadow page table, tell drop_large_spte() to skip the flush in that case. Extracted from a patch by David Matlack. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: x86: Fill apic_lvt_mask with enums / explicit entries.Jue Wang
This patch defines a lapic_lvt_entry enum used as explicit indices to the apic_lvt_mask array. In later patches a LVT_CMCI will be added to implement the Corrected Machine Check Interrupt signaling. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Jue Wang <juew@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220610171134.772566-3-juew@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: x86: Make APIC_VERSION capture only the magic 0x14UL.Jue Wang
Refactor APIC_VERSION so that the maximum number of LVT entries is inserted at runtime rather than compile time. This will be used in a subsequent commit to expose the LVT CMCI Register to VMs that support Corrected Machine Check error counting/signaling (IA32_MCG_CAP.MCG_CMCI_P=1). Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Jue Wang <juew@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220610171134.772566-2-juew@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: x86/mmu: Extend Eager Page Splitting to nested MMUsDavid Matlack
Add support for Eager Page Splitting pages that are mapped by nested MMUs. Walk through the rmap first splitting all 1GiB pages to 2MiB pages, and then splitting all 2MiB pages to 4KiB pages. Note, Eager Page Splitting is limited to nested MMUs as a policy rather than due to any technical reason (the sp->role.guest_mode check could just be deleted and Eager Page Splitting would work correctly for all shadow MMU pages). There is really no reason to support Eager Page Splitting for tdp_mmu=N, since such support will eventually be phased out, and there is no current use case supporting Eager Page Splitting on hosts where TDP is either disabled or unavailable in hardware. Furthermore, future improvements to nested MMU scalability may diverge the code from the legacy shadow paging implementation. These improvements will be simpler to make if Eager Page Splitting does not have to worry about legacy shadow paging. Splitting huge pages mapped by nested MMUs requires dealing with some extra complexity beyond that of the TDP MMU: (1) The shadow MMU has a limit on the number of shadow pages that are allowed to be allocated. So, as a policy, Eager Page Splitting refuses to split if there are KVM_MIN_FREE_MMU_PAGES or fewer pages available. (2) Splitting a huge page may end up re-using an existing lower level shadow page tables. This is unlike the TDP MMU which always allocates new shadow page tables when splitting. (3) When installing the lower level SPTEs, they must be added to the rmap which may require allocating additional pte_list_desc structs. Case (2) is especially interesting since it may require a TLB flush, unlike the TDP MMU which can fully split huge pages without any TLB flushes. Specifically, an existing lower level page table may point to even lower level page tables that are not fully populated, effectively unmapping a portion of the huge page, which requires a flush. As of this commit, a flush is always done always after dropping the huge page and before installing the lower level page table. This TLB flush could instead be delayed until the MMU lock is about to be dropped, which would batch flushes for multiple splits. However these flushes should be rare in practice (a huge page must be aliased in multiple SPTEs and have been split for NX Huge Pages in only some of them). Flushing immediately is simpler to plumb and also reduces the chances of tripping over a CPU bug (e.g. see iTLB multihit). [ This commit is based off of the original implementation of Eager Page Splitting from Peter in Google's kernel from 2016. ] Suggested-by: Peter Feiner <pfeiner@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220516232138.1783324-23-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: Allow for different capacities in kvm_mmu_memory_cache structsDavid Matlack
Allow the capacity of the kvm_mmu_memory_cache struct to be chosen at declaration time rather than being fixed for all declarations. This will be used in a follow-up commit to declare an cache in x86 with a capacity of 512+ objects without having to increase the capacity of all caches in KVM. This change requires each cache now specify its capacity at runtime, since the cache struct itself no longer has a fixed capacity known at compile time. To protect against someone accidentally defining a kvm_mmu_memory_cache struct directly (without the extra storage), this commit includes a WARN_ON() in kvm_mmu_topup_memory_cache(). In order to support different capacities, this commit changes the objects pointer array to be dynamically allocated the first time the cache is topped-up. While here, opportunistically clean up the stack-allocated kvm_mmu_memory_cache structs in riscv and arm64 to use designated initializers. No functional change intended. Reviewed-by: Marc Zyngier <maz@kernel.org> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220516232138.1783324-22-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: x86/mmu: pull call to drop_large_spte() into __link_shadow_page()Paolo Bonzini
Before allocating a child shadow page table, all callers check whether the parent already points to a huge page and, if so, they drop that SPTE. This is done by drop_large_spte(). However, dropping the large SPTE is really only necessary before the sp is installed. While the sp is returned by kvm_mmu_get_child_sp(), installing it happens later in __link_shadow_page(). Move the call there instead of having it in each and every caller. To ensure that the shadow page is not linked twice if it was present, do _not_ opportunistically make kvm_mmu_get_child_sp() idempotent: instead, return an error value if the shadow page already existed. This is a bit more verbose, but clearer than NULL. Finally, now that the drop_large_spte() name is not taken anymore, remove the two underscores in front of __drop_large_spte(). Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: x86/mmu: Zap collapsible SPTEs in shadow MMU at all possible levelsDavid Matlack
Currently KVM only zaps collapsible 4KiB SPTEs in the shadow MMU. This is fine for now since KVM never creates intermediate huge pages during dirty logging. In other words, KVM always replaces 1GiB pages directly with 4KiB pages, so there is no reason to look for collapsible 2MiB pages. However, this will stop being true once the shadow MMU participates in eager page splitting. During eager page splitting, each 1GiB is first split into 2MiB pages and then those are split into 4KiB pages. The intermediate 2MiB pages may be left behind if an error condition causes eager page splitting to bail early. No functional change intended. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220516232138.1783324-20-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: x86/mmu: Extend make_huge_page_split_spte() for the shadow MMUDavid Matlack
Currently make_huge_page_split_spte() assumes execute permissions can be granted to any 4K SPTE when splitting huge pages. This is true for the TDP MMU but is not necessarily true for the shadow MMU, since KVM may be shadowing a non-executable huge page. To fix this, pass in the role of the child shadow page where the huge page will be split and derive the execution permission from that. This is correct because huge pages are always split with direct shadow page and thus the shadow page role contains the correct access permissions. No functional change intended. Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220516232138.1783324-19-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: x86/mmu: Cache the access bits of shadowed translationsDavid Matlack
Splitting huge pages requires allocating/finding shadow pages to replace the huge page. Shadow pages are keyed, in part, off the guest access permissions they are shadowing. For fully direct MMUs, there is no shadowing so the access bits in the shadow page role are always ACC_ALL. But during shadow paging, the guest can enforce whatever access permissions it wants. In particular, eager page splitting needs to know the permissions to use for the subpages, but KVM cannot retrieve them from the guest page tables because eager page splitting does not have a vCPU. Fortunately, the guest access permissions are easy to cache whenever page faults or FNAME(sync_page) update the shadow page tables; this is an extension of the existing cache of the shadowed GFNs in the gfns array of the shadow page. The access bits only take up 3 bits, which leaves 61 bits left over for gfns, which is more than enough. Now that the gfns array caches more information than just GFNs, rename it to shadowed_translation. While here, preemptively fix up the WARN_ON() that detects gfn mismatches in direct SPs. The WARN_ON() was paired with a pr_err_ratelimited(), which means that users could sometimes see the WARN without the accompanying error message. Fix this by outputting the error message as part of the WARN splat, and opportunistically make them WARN_ONCE() because if these ever fire, they are all but guaranteed to fire a lot and will bring down the kernel. Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220516232138.1783324-18-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: x86/mmu: Update page stats in __rmap_add()David Matlack
Update the page stats in __rmap_add() rather than at the call site. This will avoid having to manually update page stats when splitting huge pages in a subsequent commit. No functional change intended. Reviewed-by: Ben Gardon <bgardon@google.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220516232138.1783324-17-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: x86/mmu: Decouple rmap_add() and link_shadow_page() from kvm_vcpuDavid Matlack
Allow adding new entries to the rmap and linking shadow pages without a struct kvm_vcpu pointer by moving the implementation of rmap_add() and link_shadow_page() into inner helper functions. No functional change intended. Reviewed-by: Ben Gardon <bgardon@google.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220516232138.1783324-16-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: x86/mmu: Pass const memslot to rmap_add()David Matlack
Constify rmap_add()'s @slot parameter; it is simply passed on to gfn_to_rmap(), which takes a const memslot. No functional change intended. Reviewed-by: Ben Gardon <bgardon@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220516232138.1783324-15-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: x86/mmu: Allow NULL @vcpu in kvm_mmu_find_shadow_page()David Matlack
Allow @vcpu to be NULL in kvm_mmu_find_shadow_page() (and its only caller __kvm_mmu_get_shadow_page()). @vcpu is only required to sync indirect shadow pages, so it's safe to pass in NULL when looking up direct shadow pages. This will be used for doing eager page splitting, which allocates direct shadow pages from the context of a VM ioctl without access to a vCPU pointer. Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220516232138.1783324-14-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: x86/mmu: Pass kvm pointer separately from vcpu to ↵David Matlack
kvm_mmu_find_shadow_page() Get the kvm pointer from the caller, rather than deriving it from vcpu->kvm, and plumb the kvm pointer all the way from kvm_mmu_get_shadow_page(). With this change in place, the vcpu pointer is only needed to sync indirect shadow pages. In other words, __kvm_mmu_get_shadow_page() can now be used to get *direct* shadow pages without a vcpu pointer. This enables eager page splitting, which needs to allocate direct shadow pages during VM ioctls. No functional change intended. Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220516232138.1783324-13-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: x86/mmu: Replace vcpu with kvm in kvm_mmu_alloc_shadow_page()David Matlack
The vcpu pointer in kvm_mmu_alloc_shadow_page() is only used to get the kvm pointer. So drop the vcpu pointer and just pass in the kvm pointer. No functional change intended. Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220516232138.1783324-12-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: x86/mmu: Pass memory caches to allocate SPs separatelyDavid Matlack
Refactor kvm_mmu_alloc_shadow_page() to receive the caches from which it will allocate the various pieces of memory for shadow pages as a parameter, rather than deriving them from the vcpu pointer. This will be useful in a future commit where shadow pages are allocated during VM ioctls for eager page splitting, and thus will use a different set of caches. Preemptively pull the caches out all the way to kvm_mmu_get_shadow_page() since eager page splitting will not be calling kvm_mmu_alloc_shadow_page() directly. No functional change intended. Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220516232138.1783324-11-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: x86/mmu: Move guest PT write-protection to account_shadowed()David Matlack
Move the code that write-protects newly-shadowed guest page tables into account_shadowed(). This avoids a extra gfn-to-memslot lookup and is a more logical place for this code to live. But most importantly, this reduces kvm_mmu_alloc_shadow_page()'s reliance on having a struct kvm_vcpu pointer, which will be necessary when creating new shadow pages during VM ioctls for eager page splitting. Note, it is safe to drop the role.level == PG_LEVEL_4K check since account_shadowed() returns early if role.level > PG_LEVEL_4K. No functional change intended. Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220516232138.1783324-10-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: x86/mmu: Rename shadow MMU functions that deal with shadow pagesDavid Matlack
Rename 2 functions: kvm_mmu_get_page() -> kvm_mmu_get_shadow_page() kvm_mmu_free_page() -> kvm_mmu_free_shadow_page() This change makes it clear that these functions deal with shadow pages rather than struct pages. It also aligns these functions with the naming scheme for kvm_mmu_find_shadow_page() and kvm_mmu_alloc_shadow_page(). Prefer "shadow_page" over the shorter "sp" since these are core functions and the line lengths aren't terrible. No functional change intended. Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220516232138.1783324-9-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: x86/mmu: Consolidate shadow page allocation and initializationDavid Matlack
Consolidate kvm_mmu_alloc_page() and kvm_mmu_alloc_shadow_page() under the latter so that all shadow page allocation and initialization happens in one place. No functional change intended. Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220516232138.1783324-8-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24KVM: x86/mmu: Decompose kvm_mmu_get_page() into separate functionsDavid Matlack
Decompose kvm_mmu_get_page() into separate helper functions to increase readability and prepare for allocating shadow pages without a vcpu pointer. Specifically, pull the guts of kvm_mmu_get_page() into 2 helper functions: kvm_mmu_find_shadow_page() - Walks the page hash checking for any existing mmu pages that match the given gfn and role. kvm_mmu_alloc_shadow_page() Allocates and initializes an entirely new kvm_mmu_page. This currently requries a vcpu pointer for allocation and looking up the memslot but that will be removed in a future commit. No functional change intended. Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220516232138.1783324-7-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>