Invalidate all MMUs' roles after a CPUID update to force reinitialization
of the MMU context/helpers. Despite the efforts of commit de3ccd26fafc
("KVM: MMU: record maximum physical address width in kvm_mmu_extended_role"),
there are still a handful of CPUID-based properties that affect MMU
behavior but are not incorporated into mmu_role. E.g. 1gb hugepage
support, AMD vs. Intel handling of bit 8, and SEV's C-Bit location all
factor into the guest's reserved PTE bits.
The obvious alternative would be to add all such properties to mmu_role,
but doing so provides no benefit over simply forcing a reinitialization
on every CPUID update, as setting guest CPUID is a rare operation.
Note, reinitializing all MMUs after a CPUID update does not fix all of
KVM's woes. Specifically, kvm_mmu_page_role doesn't track the CPUID
properties, which means that a vCPU can reuse shadow pages that should
not exist for the new vCPU model, e.g. that map GPAs that are now illegal
(due to MAXPHYADDR changes) or that set bits that are now reserved
(PAGE_SIZE for 1gb pages), etc...
Tracking the relevant CPUID properties in kvm_mmu_page_role would address
the majority of problems, but fully tracking that much state in the
shadow page role comes with an unpalatable cost as it would require a
non-trivial increase in KVM's memory footprint. The GBPAGES case is even
worse, as neither Intel nor AMD provides a way to disable 1gb hugepage
support in the hardware page walker, i.e. it's a virtualization hole that
can't be closed when using TDP.
In other words, resetting the MMU after a CPUID update is largely a
superficial fix. But, it will allow reverting the tracking of MAXPHYADDR
in the mmu_role, and that case in particular needs to mostly work because
KVM's shadow_root_level depends on guest MAXPHYADDR when 5-level paging
is supported. For cases where KVM botches guest behavior, the damage is
limited to that guest. But for the shadow_root_level, a misconfigured
MMU can cause KVM to incorrectly access memory, e.g. due to walking off
the end of its shadow page tables.
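For illustration, a minimal sketch of the approach (field and function
names follow KVM's existing MMU role plumbing and are assumed here, not
quoted from the patch):

    static void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
    {
        /*
         * Invalidate every cached role so the next reset is guaranteed
         * to rebuild each MMU from the new CPUID state, even if the
         * recomputed role happens to match the stale one.
         */
        vcpu->arch.root_mmu.mmu_role.ext.valid = 0;
        vcpu->arch.guest_mmu.mmu_role.ext.valid = 0;
        vcpu->arch.nested_mmu.mmu_role.ext.valid = 0;
        kvm_mmu_reset_context(vcpu);
    }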
Fixes: 7dcd57552008 ("x86/kvm/mmu: check if tdp/shadow MMU reconfiguration is needed")
Cc: Yu Zhang <yu.c.zhang@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210622175739.3610207-7-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Restore CR4.LA57 to the mmu_role to fix an amusing edge case with nested
virtualization. When KVM (L0) is using TDP, CR4.LA57 is not reflected in
mmu_role.base.level because that tracks the shadow root level, i.e. TDP
level. Normally, this is not an issue because LA57 can't be toggled
while long mode is active, i.e. the guest has to first disable paging,
then toggle LA57, then re-enable paging, thus ensuring an MMU
reinitialization.
But if L1 is crafty, it can load a new CR4 on VM-Exit and toggle LA57
without having to bounce through an unpaged section. L1 can also load a
new CR3 on exit, i.e. it doesn't even need to play crazy paging games, a
single entry PML5 is sufficient. Such shenanigans are only problematic
if L0 and L1 use TDP, otherwise L1 and L2 share an MMU that gets
reinitialized on nested VM-Enter/VM-Exit due to mmu_role.base.guest_mode.
Note, in the L2 case with nested TDP, even though L1 can switch between
L2s with different LA57 settings, thus bypassing the paging requirement,
in that case KVM's nested_mmu will track LA57 in base.level.
This reverts commit 8053f924cad30bf9f9a24e02b6c8ddfabf5202ea.
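An illustrative fragment of the restored tracking (the cr4_la57 field
comes from the reverted commit; the surrounding code is paraphrased):

    union kvm_mmu_extended_role ext = {0};

    /*
     * Capture LA57 in the extended role: with TDP, base.level holds the
     * TDP root level rather than the guest's paging level, so a CR4.LA57
     * toggle on VM-Exit would otherwise leave the role unchanged and
     * skip MMU reinitialization.
     */
    ext.cr4_la57 = !!kvm_read_cr4_bits(vcpu, X86_CR4_LA57);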
Fixes: 8053f924cad3 ("KVM: x86/mmu: Drop kvm_mmu_extended_role.cr4_la57 hack")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210622175739.3610207-6-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Use the MMU's role to get its effective SMEP value when injecting a fault
into the guest. When walking L1's (nested) NPT while L2 is active, vCPU
state will reflect L2, whereas NPT uses the host's (L1 in this case) CR0,
CR4, EFER, etc... If L1 and L2 have different settings for SMEP and
L1 does not have EFER.NX=1, this can result in an incorrect PFEC.FETCH
when injecting #NPF.
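Roughly, the fetch error code should be derived from the MMU being
walked rather than from current vCPU state; a sketch, assuming the
extended role's cr4_smep bit:

    /*
     * When walking L1's NPT with L2 active, mmu reflects L1's paging
     * configuration, whereas vcpu->arch reflects L2.
     */
    if (fetch_fault && (mmu->nx || mmu->mmu_role.ext.cr4_smep))
        errcode |= PFERR_FETCH_MASK;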
Fixes: e57d4a356ad3 ("KVM: Add instruction fetch checking when walking guest page table")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210622175739.3610207-5-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Reset the MMU context at vCPU INIT (and RESET for good measure) if CR0.PG
was set prior to INIT. Simply re-initializing the current MMU is not
sufficient as the current root HPA may not be usable in the new context.
E.g. if TDP is disabled and INIT arrives while the vCPU is in long mode,
KVM will fail to switch to the 32-bit pae_root and bomb on the next
VM-Enter due to running with a 64-bit CR3 in 32-bit mode.
This bug was papered over in both VMX and SVM, but still managed to rear
its head in the MMU role on VMX. Because EFER.LMA=1 requires CR0.PG=1,
kvm_calc_shadow_mmu_root_page_role() checks for EFER.LMA without first
checking CR0.PG. VMX's RESET/INIT flow writes CR0 before EFER, and so
an INIT with the vCPU in 64-bit mode will cause the hack-a-fix to
generate the wrong MMU role.
In VMX, the INIT issue is specific to running without unrestricted guest
since unrestricted guest is available if and only if EPT is enabled.
Commit 8668a3c468ed ("KVM: VMX: Reset mmu context when entering real
mode") resolved the issue by forcing a reset when entering emulated real
mode.
In SVM, commit ebae871a509d ("kvm: svm: reset mmu on VCPU reset") forced
an MMU reset on every INIT to work around the flaw in common x86. Note, at
the time the bug was fixed, the SVM problem was exacerbated by a complete
lack of a CR4 update.
The vendor resets will be reverted in future patches, primarily to aid
bisection in case there are non-INIT flows that rely on the existing VMX
logic.
Because CR0.PG is unconditionally cleared on INIT, and because CR0.WP and
all CR4/EFER paging bits are ignored if CR0.PG=0, simply checking that
CR0.PG was '1' prior to INIT/RESET is sufficient to detect a required MMU
context reset.
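A sketch of the resulting common-x86 check (placement and variable
naming assumed; this is not the literal patch):

    unsigned long old_cr0 = kvm_read_cr0(vcpu);

    /*
     * Runs after the architectural register state has been reset:
     * CR0.PG is unconditionally cleared by INIT, and CR0.WP plus all
     * CR4/EFER paging bits are ignored while CR0.PG=0, so checking the
     * pre-INIT value of CR0.PG alone is sufficient.
     */
    if (old_cr0 & X86_CR0_PG)
        kvm_mmu_reset_context(vcpu);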
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210622175739.3610207-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Mark NX as being used for all non-nested shadow MMUs, as KVM will set the
NX bit for huge SPTEs if the iTLB multi-hit mitigation is enabled.
Checking the mitigation itself is not sufficient as it can be toggled on
at any time and KVM doesn't reset MMU contexts when that happens. KVM
could reset the contexts, but that would require purging all SPTEs in all
MMUs, for no real benefit. And, KVM already forces EFER.NX=1 when TDP is
disabled (for WP=0, SMEP=1, NX=0), so technically NX is never reserved
for shadow MMUs.
Fixes: b8e8c8303ff2 ("kvm: mmu: ITLB_MULTIHIT mitigation")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210622175739.3610207-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Remove a misguided WARN that attempts to detect the scenario where using
a special A/D tracking flag will set reserved bits on a non-MMIO spte.
The WARN triggers false positives when using EPT with 32-bit KVM because
of the !64-bit clause, which is just flat out wrong. The whole A/D
tracking goo is specific to EPT, and one of the big selling points of EPT
is that EPT is decoupled from the host's native paging mode.
Drop the WARN instead of trying to salvage the check. Keeping a check
specific to A/D tracking bits would essentially regurgitate the same code
that led to KVM needing the tracking bits in the first place.
A better approach would be to add a generic WARN on reserved bits being
set, which would naturally cover the A/D tracking bits, work for all
flavors of paging, and be self-documenting to some extent.
Fixes: 8a406c89532c ("KVM: x86/mmu: Rename and document A/D scheme for TDP SPTEs")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210622175739.3610207-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
To remove code duplication, use the binary stats descriptors in the
implementation of the debugfs interface for statistics. This unifies
the definition of statistics for the binary and debugfs interfaces.
Signed-off-by: Jing Zhang <jingzhangos@google.com>
Message-Id: <20210618222709.1858088-8-jingzhangos@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add selftest to check KVM stats descriptors validity.
Reviewed-by: David Matlack <dmatlack@google.com>
Reviewed-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Tested-by: Fuad Tabba <tabba@google.com> #arm64
Signed-off-by: Jing Zhang <jingzhangos@google.com>
Message-Id: <20210618222709.1858088-7-jingzhangos@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
This new API provides a file descriptor for every VM and VCPU to read
KVM statistics data in binary format.
It is meant to provide a lightweight, flexible, scalable and efficient
lock-free solution for user space telemetry applications to pull the
statistics data periodically for large scale systems. The pulling
frequency could be as high as a few times per second.
The statistics descriptors are defined by KVM in the kernel and can be
used by userspace to discover VM/VCPU statistics during the one-time
setup stage.
The statistics data itself can then be read out periodically by the
userspace telemetry application without any extra parsing or setup
effort.
There are a few existing interface protocols and definitions, but none
of them fulfils all the requirements this interface addresses:
1. During high-frequency periodic stats reading, there should be no
overhead beyond the stats data read itself.
2. Support stats annotation, like type (cumulative, instantaneous,
peak, histogram, etc) and unit (counter, time, size, cycles, etc).
3. Reading the stats data should be free of locks/synchronization.
Consistency between all the stats is not required, since not all of
them can be read at exactly the same time anyway; what matters is the
change or trend of the data. The lock-free approach is not just about
efficiency and scalability, but also about accuracy and usability of
the stats data: if all stats readings were protected by a global lock
and one VCPU died with that lock held, every stats read would block
and the stats themselves could no longer tell us which VCPU had died.
4. The stats reading workload can be handed over to an unprivileged
process.
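As a usage sketch, a self-contained userspace reader might look like the
following (assuming <linux/kvm.h> exports KVM_GET_STATS_FD and the
kvm_stats_header/kvm_stats_desc layouts added by this series; error
handling is minimal):

    #include <linux/kvm.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/ioctl.h>
    #include <unistd.h>

    /* Dump every stat exposed by a KVM VM or VCPU file descriptor. */
    static void dump_kvm_stats(int kvm_obj_fd)
    {
        struct kvm_stats_header hdr;
        size_t desc_sz;
        char *descs;
        uint32_t i;
        int stats_fd;

        stats_fd = ioctl(kvm_obj_fd, KVM_GET_STATS_FD, NULL);
        if (stats_fd < 0)
            return;
        if (pread(stats_fd, &hdr, sizeof(hdr), 0) != sizeof(hdr))
            goto out;

        /* One-time setup: each descriptor is sizeof(desc) + name_size bytes. */
        desc_sz = sizeof(struct kvm_stats_desc) + hdr.name_size;
        descs = malloc(desc_sz * hdr.num_desc);
        if (!descs ||
            pread(stats_fd, descs, desc_sz * hdr.num_desc, hdr.desc_offset) < 0)
            goto out_free;

        /* Periodic pull: only the data block needs to be re-read. */
        for (i = 0; i < hdr.num_desc; i++) {
            struct kvm_stats_desc *d = (void *)(descs + i * desc_sz);
            uint64_t val = 0;

            /* First value only; d->size gives the number of u64 elements. */
            if (pread(stats_fd, &val, sizeof(val),
                      hdr.data_offset + d->offset) == sizeof(val))
                printf("%s: %llu\n", d->name, (unsigned long long)val);
        }
    out_free:
        free(descs);
    out:
        close(stats_fd);
    }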
Reviewed-by: David Matlack <dmatlack@google.com>
Reviewed-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Jing Zhang <jingzhangos@google.com>
Message-Id: <20210618222709.1858088-6-jingzhangos@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add a VCPU ioctl to get a statistics file descriptor by which a read
functionality is provided for userspace to read out VCPU stats header,
descriptors and data.
Define VCPU statistics descriptors and header for all architectures.
Reviewed-by: David Matlack <dmatlack@google.com>
Reviewed-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com> #arm64
Signed-off-by: Jing Zhang <jingzhangos@google.com>
Message-Id: <20210618222709.1858088-5-jingzhangos@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add a VM ioctl to get a statistics file descriptor by which a read
functionality is provided for userspace to read out VM stats header,
descriptors and data.
Define VM statistics descriptors and header for all architectures.
Reviewed-by: David Matlack <dmatlack@google.com>
Reviewed-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com> #arm64
Signed-off-by: Jing Zhang <jingzhangos@google.com>
Message-Id: <20210618222709.1858088-4-jingzhangos@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Pull drm fixes from Dave Airlie:
"This is a bit bigger than I'd like at this stage, and I guess last
week was extra quiet, but it's mostly one fix across three drivers to
wait for buffer move pinning to complete.
There was one locking change that got reverted so it's just noise.
Otherwise the amdgpu/nouveau changes are for known regressions, and
otherwise it's just misc changes in kmb/atmel/vc4 drivers.
Summary:
core:
- auth locking change + brown paper bag revert
radeon/nouveau/amdgpu/ttm:
- wait for BO to be pinned after moving it (same fix in three
drivers)
amdgpu:
- Revert GFX9/10 doorbell fixes, we just end up trading one bug for
another
- Potential memory corruption fix in framebuffer handling
nouveau:
- fix regression checking dma addresses
kmb:
- error return fix
atmel-hlcdc:
- fix kernel warnings at boot
- enable async flips
vc4:
- fix CPU hang due to power management"
* tag 'drm-fixes-2021-06-25' of git://anongit.freedesktop.org/drm/drm:
drm/nouveau: fix dma_address check for CPU/GPU sync
drm/kmb: Fix error return code in kmb_hw_init()
drm/amdgpu: wait for moving fence after pinning
drm/radeon: wait for moving fence after pinning
drm/nouveau: wait for moving fence after pinning v2
Revert "drm: add a locked version of drm_is_current_master"
Revert "drm/amdgpu/gfx9: fix the doorbell missing when in CGPG issue."
Revert "drm/amdgpu/gfx10: enlarge CP_MEC_DOORBELL_RANGE_UPPER to cover full doorbell."
drm/amdgpu: Call drm_framebuffer_init last for framebuffer init
drm: add a locked version of drm_is_current_master
drm/atmel-hlcdc: Allow async page flips
drm/panel: ld9040: reference spi_device_id table
drm: atmel_hlcdc: Enable the crtc vblank prior to crtc usage.
drm/vc4: hdmi: Make sure the controller is powered in detect
drm/vc4: hdmi: Move the HSM clock enable to runtime_pm
|
|
The direction of the pipe argument must match the request-type direction
bit or control requests may fail depending on the host-controller-driver
implementation.
Control transfers without a data stage are treated as OUT requests by
the USB stack and should be using usb_sndctrlpipe(). Failing to do so
will now trigger a warning.
Fix the OSIFI2C_SET_BIT_RATE and OSIFI2C_STOP requests which erroneously
used the osif_usb_read() helper and set the IN direction bit.
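An illustrative corrected call for such a data-less (OUT) request, with
the device, request and timeout parameters as placeholders:

    /*
     * No data stage means an OUT transfer: the pipe direction must match
     * the USB_DIR_OUT bit in the request type.
     */
    ret = usb_control_msg(udev, usb_sndctrlpipe(udev, 0), cmd,
                          USB_TYPE_VENDOR | USB_RECIP_INTERFACE | USB_DIR_OUT,
                          value, index, NULL, 0, USB_CTRL_SET_TIMEOUT);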
Reported-by: syzbot+9d7dadd15b8819d73f41@syzkaller.appspotmail.com
Fixes: 83e53a8f120f ("i2c: Add bus driver for for OSIF USB i2c device.")
Cc: stable@vger.kernel.org # 3.14
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
A DMA address check for nouveau, an error code return fix for kmb, fixes
to wait for a moving fence after pinning the BO for amdgpu, nouveau and
radeon, a crtc and async page flip fix for atmel-hlcdc and a cpu hang
fix for vc4.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210624190353.wyizoil3wqrrxz5d@gilmour
|
|
Fix Sparse warnings:
drivers/i2c/i2c-dev.c:546:19: warning: incorrect type in assignment (different address spaces)
drivers/i2c/i2c-dev.c:549:53: warning: incorrect type in argument 2 (different address spaces)
compat_ptr() returns a pointer tagged __user which gets assigned to a
pointer missing the __user annotation. The same pointer is passed to
copy_from_user() as an argument where it is expected to have the __user
annotation. Fix both by adding the __user annotation to the pointer.
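The pattern, roughly (function and variable names are illustrative):

    static int example_compat_copy(compat_uptr_t uptr32, void *dst, size_t len)
    {
        /*
         * compat_ptr() returns a __user-tagged pointer; the local variable
         * must carry the same annotation so that passing it on to
         * copy_from_user() type-checks cleanly under sparse.
         */
        void __user *datap = compat_ptr(uptr32);

        return copy_from_user(dst, datap, len) ? -EFAULT : 0;
    }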
Fixes: 7d5cb45655f2 ("i2c compat ioctls: move to ->compat_ioctl()")
Signed-off-by: Andreas Hecht <andreas.e.hecht@gmail.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
|
|
Commit 61ca49a9105f ("libceph: don't set global_id until we get an
auth ticket") delayed the setting of global_id too much. It is set
only after all tickets are received, but in pre-nautilus clusters an
auth ticket and the service tickets are obtained in separate steps
(for a total of three MAuth replies). When the service tickets are
requested, global_id is used to build an authorizer; if global_id is
still 0 we never get them and fail to establish the session.
Move the setting of global_id into protocol implementations. This
way global_id can be set exactly when an auth ticket is received, not
sooner nor later.
Fixes: 61ca49a9105f ("libceph: don't set global_id until we get an auth ticket")
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
|
|
There is no result to pass in the msgr2 case because authentication
failures are reported through the auth_bad_method frame, and in the
MAuth case an error is returned immediately.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
|
|
There is an assignment of ancillary->mode to itself, which looks
dubious since the preceding comment states that the speed and mode
are taken over from the SPI main device, indicating that
ancillary->mode should be assigned the value of spi->mode.
Fix this.
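That is, the assignment should read:

    /* take over the SPI mode (and speed) from the main device */
    ancillary->mode = spi->mode;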
Addresses-Coverity: ("Self assignment")
Fixes: 0c79378c0199 ("spi: add ancillary device support")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20210623172300.161484-1-colin.king@canonical.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fix from Ulf Hansson:
"Use memcpy_to/fromio for dram-access-quirk in the meson-gx host
driver"
* tag 'mmc-v5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: meson-gx: use memcpy_to/fromio for dram-access-quirk
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull sigqueue cache fix from Ingo Molnar:
"Fix a memory leak in the recently introduced sigqueue cache"
* tag 'core-urgent-2021-06-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
signal: Prevent sigqueue caching after task got released
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Ingo Molnar:
"A last minute cgroup bandwidth scheduling fix for a recently
introduced logic fail which triggered a kernel warning by LTP's
cfs_bandwidth01 test"
* tag 'sched-urgent-2021-06-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: Ensure that the CFS parent is added after unthrottling
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 perf fix from Ingo Molnar:
"An LBR buffer fix for code that probably only worked accidentally"
* tag 'perf-urgent-2021-06-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel/lbr: Zero the xstate buffer on allocation
|
|
It's possible to create a region which maps valid but non-refcounted
pages (e.g., tail pages of non-compound higher order allocations). These
host pages can then be returned by the gfn_to_page, gfn_to_pfn, etc.
family of APIs, which take a reference to the page, raising its
refcount from 0 to 1.
When the reference is dropped, this will free the page incorrectly.
Fix this by only taking a reference on valid pages if it was non-zero,
which indicates it is participating in normal refcounting (and can be
released with put_page).
This addresses CVE-2021-22543.
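A sketch of the check, with a hypothetical helper name:

    static bool kvm_try_get_pfn(kvm_pfn_t pfn)
    {
        if (kvm_is_reserved_pfn(pfn))
            return true;

        /*
         * Only elevate the refcount if it is already non-zero, i.e. the
         * page participates in normal refcounting and can be released
         * with put_page().  Tail pages of non-compound higher-order
         * allocations start out at zero and must not be pinned this way.
         */
        return get_page_unless_zero(pfn_to_page(pfn));
    }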
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
This commit defines the API for userspace and prepares the common
functionality to support per-VM/VCPU binary stats data readings.
KVM stats are currently only accessible through debugfs, which has some
shortcomings this series is meant to fix:
1. The current debugfs stats solution in KVM can be disabled
when kernel Lockdown mode is enabled, which is a potential
risk for production.
2. The current debugfs stats solution in KVM is organized as "one
stat per file"; it is good for debugging, but not efficient
for production.
3. Stats read/clear in the current debugfs solution in KVM are
protected by the global kvm_lock.
Besides that, there are some other benefits with this change:
1. All KVM VM/VCPU stats can be read out in bulk with one copy
to userspace.
2. A schema is used to describe KVM statistics. From userspace's
perspective, the KVM statistics are self-describing.
3. With the fd-based solution, a separate telemetry process can
read KVM stats in a less privileged environment.
4. After the initial setup of reading in the stats descriptors, a
telemetry process only needs to read the stats data itself; no
further parsing or setup is needed.
Reviewed-by: David Matlack <dmatlack@google.com>
Reviewed-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com> #arm64
Signed-off-by: Jing Zhang <jingzhangos@google.com>
Message-Id: <20210618222709.1858088-3-jingzhangos@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Generic KVM stats are those collected in architecture independent code
or those supported by all architectures; put all generic statistics in
a separate structure. This ensures that they are defined the same way
in the statistics API which is being added, removing duplication among
different architectures in the declaration of the descriptors.
No functional change intended.
Reviewed-by: David Matlack <dmatlack@google.com>
Reviewed-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Signed-off-by: Jing Zhang <jingzhangos@google.com>
Message-Id: <20210618222709.1858088-2-jingzhangos@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Treat a NULL shadow page in the "is a TDP MMU" check as valid, non-TDP
root. KVM uses a "direct" PAE paging MMU when TDP is disabled and the
guest is running with paging disabled. In that case, root_hpa points at
the pae_root page (of which only 32 bytes are used), not a standard
shadow page, and the WARN fires (a lot).
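A sketch of the adjusted check, treating a NULL shadow page as a valid
non-TDP root (helper names as in KVM's MMU internals, usage assumed):

    struct kvm_mmu_page *sp = to_shadow_page(mmu->root_hpa);

    /*
     * With TDP disabled and an unpaged guest, root_hpa points at the
     * special pae_root page rather than a shadow page, so a NULL result
     * simply means "not a TDP MMU root".
     */
    return sp && is_tdp_mmu_page(sp) && sp->root_count;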
Fixes: 0b873fd7fb53 ("KVM: x86/mmu: Remove redundant is_tdp_mmu_enabled check")
Cc: David Matlack <dmatlack@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210622072454.3449146-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add an x86-only test to verify that x86's MMU reacts to CPUID updates
that impact the MMU. KVM has had multiple bugs where it fails to
reconfigure the MMU after the guest's vCPU model changes.
Sadly, this test is effectively limited to shadow paging because the
hardware page walk handler doesn't support software disabling of GBPAGES
support, and KVM doesn't manually walk the GVA->GPA on faults for
performance reasons (doing so would largely defeat the benefits of TDP).
Don't require !TDP for the tests as there is still value in running the
tests with TDP, even though the tests will fail (barring KVM hacks).
E.g. KVM should not completely explode if MAXPHYADDR results in KVM using
4-level vs. 5-level paging for the guest.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210622200529.3650424-20-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add x86-64 hugepage support in the form of a x86-only variant of
virt_pg_map() that takes an explicit page size. To keep things simple,
follow the existing logic for 4k pages and disallow creating a hugepage
if the upper-level entry is present, even if the desired pfn matches.
Opportunistically fix a double "beyond beyond" reported by checkpatch.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210622200529.3650424-19-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
In preparation for adding hugepage support, replace "pageMapL4Entry",
"pageDirectoryPointerEntry", and "pageDirectoryEntry" with a common
"pageUpperEntry", and add a helper to create an upper level entry. All
upper level entries have the same layout, using unique structs provides
minimal value and requires a non-trivial amount of code duplication.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210622200529.3650424-18-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add a helper to retrieve a PTE pointer given a PFN, address, and level
in preparation for adding hugepage support.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210622200529.3650424-17-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Rename the "address" field to "pfn" in x86's page table structs to match
reality.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210622200529.3650424-16-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add a helper to allocate a page for use in constructing the guest's page
tables. All architectures have identical address and memslot
requirements (which appear to be arbitrary anyways).
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210622200529.3650424-15-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Drop the EPTP memslot param from all EPT helpers and shove the hardcoded
'0' down to the vm_phy_page_alloc() calls.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210622200529.3650424-14-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Drop the memslot param from virt_pg_map() and virt_map() and shove the
hardcoded '0' down to the vm_phy_page_alloc() calls.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210622200529.3650424-13-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Drop the memslot param(s) from vm_vaddr_alloc() now that all callers
directly specify '0' as the memslot. Drop the memslot param from
virt_pgd_alloc() as well since vm_vaddr_alloc() is its only user.
I.e. shove the hardcoded '0' down to the vm_phy_pages_alloc() calls.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool fixes from Ingo Molnar:
"Address a number of objtool warnings that got reported.
No change in behavior intended, but code generation might be impacted
by commit 1f008d46f124 ("x86: Always inline task_size_max()")"
* tag 'objtool-urgent-2021-06-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/lockdep: Improve noinstr vs errors
x86: Always inline task_size_max()
x86/xen: Fix noinstr fail in exc_xen_unknown_trap()
x86/xen: Fix noinstr fail in xen_pv_evtchn_do_upcall()
x86/entry: Fix noinstr fail in __do_fast_syscall_32()
objtool/x86: Ignore __x86_indirect_alt_* symbols
|
|
Support the set_trips() callback of the thermal device ops. This allows a
HWMON device to promptly notify the thermal core about temperature
changes, which is very handy to have where the HWMON sensor is used by a
CPU thermal zone that performs passive cooling and emergency shutdown on
overheat. The thermal core will be able to react faster to temperature
changes.
The set_trips() callback is entirely optional. If the HWMON sensor
doesn't support setting thermal trips, then the callback is a no-op; the
dummy callback has no effect on the thermal core. The temperature trips
either complement the temperature polling mechanism of the thermal core
or replace the polling if the sensor can set trips and polling is
disabled for a particular device in the device tree.
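One possible shape for the hwmon core wiring, shown here with a no-op
trip setter (names are modeled on the existing hwmon thermal glue and
not taken from this patch):

    static int hwmon_thermal_set_trips(void *data, int low, int high)
    {
        /*
         * A sensor without programmable trip points simply accepts the
         * requested window; the thermal core then keeps (or falls back
         * to) temperature polling.
         */
        return 0;
    }

    static const struct thermal_zone_of_device_ops hwmon_thermal_ops = {
        .get_temp  = hwmon_thermal_get_temp,
        .set_trips = hwmon_thermal_set_trips,
    };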
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Link: https://lore.kernel.org/r/20210623042231.16008-3-digetx@gmail.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
The min/max/crit and all other temperature values that are passed to
the driver are unlimited and value that is close to INT_MIN results in
integer underflow of the temperature calculations made by the driver
for LM99 sensor. Temperature hysteresis is among those values that need
to be limited, but limiting of hysteresis is independent from the sensor
version. Add the missing limits.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Link: https://lore.kernel.org/r/20210623042231.16008-2-digetx@gmail.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
Setting a page flag without holding a reference to the page
is living dangerously. In the tag-writing path, we drop the
reference to the page by calling kvm_release_pfn_dirty(),
and only then set the PG_mte_tagged bit.
It would be safer to do it the other way round.
Fixes: f0376edb1ddca ("KVM: arm64: Add ioctl to fetch/store tags in a guest")
Cc: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/87k0mjidwb.wl-maz@kernel.org
|
|
AGP for example doesn't have a dma_address array.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210614110517.1624-1-christian.koenig@amd.com
|
|
Optimise SVE switching for CPUs with 128-bit implementations.
* for-next/sve:
arm64/sve: Skip flushing Z registers with 128 bit vectors
arm64/sve: Use the sve_flush macros in sve_load_from_fpsimd_state()
arm64/sve: Split _sve_flush macro into separate Z and predicate flushes
|
|
Add support for versions 1.2 and 1.3 of the SMC calling convention.
* for-next/smccc:
arm64: smccc: Support SMCCC v1.3 SVE register saving hint
arm64: smccc: Add support for SMCCCv1.2 extended input/output registers
|
|
Fix output format from SVE selftest.
* for-next/selftests:
kselftest/arm64: Add missing newline to SVE test skipping output
|
|
Allow Pointer Authentication to be configured independently for kernel
and userspace.
* for-next/ptrauth:
arm64: Conditionally configure PTR_AUTH key of the kernel.
arm64: Add ARM64_PTR_AUTH_KERNEL config option
|
|
PMU driver cleanups for managing IRQ affinity and exposing event
attributes via sysfs.
* for-next/perf: (36 commits)
drivers/perf: fix the missed ida_simple_remove() in ddr_perf_probe()
perf/arm-cmn: Fix invalid pointer when access dtc object sharing the same IRQ number
arm64: perf: Simplify EVENT ATTR macro in perf_event.c
drivers/perf: Simplify EVENT ATTR macro in fsl_imx8_ddr_perf.c
drivers/perf: Simplify EVENT ATTR macro in xgene_pmu.c
drivers/perf: Simplify EVENT ATTR macro in qcom_l3_pmu.c
drivers/perf: Simplify EVENT ATTR macro in qcom_l2_pmu.c
drivers/perf: Simplify EVENT ATTR macro in SMMU PMU driver
perf: Add EVENT_ATTR_ID to simplify event attributes
perf/smmuv3: Don't trample existing events with global filter
perf/hisi: Constify static attribute_group structs
perf: qcom: Remove redundant dev_err call in qcom_l3_cache_pmu_probe()
drivers/perf: hisi: Fix data source control
arm64: perf: Add more support on caps under sysfs
perf: qcom_l2_pmu: move to use request_irq by IRQF_NO_AUTOEN flag
arm_pmu: move to use request_irq by IRQF_NO_AUTOEN flag
perf: arm_spe: use DEVICE_ATTR_RO macro
perf: xgene_pmu: use DEVICE_ATTR_RO macro
perf: qcom: use DEVICE_ATTR_RO macro
perf: arm_pmu: use DEVICE_ATTR_RO macro
...
|
|
KASAN optimisations for the hardware tagging (MTE) implementation.
* for-next/mte:
kasan: disable freed user page poisoning with HW tags
arm64: mte: handle tags zeroing at page allocation time
kasan: use separate (un)poison implementation for integrated init
mm: arch: remove indirection level in alloc_zeroed_user_highpage_movable()
kasan: speed up mte_set_mem_tag_range
|
|
Lots of cleanup to our various page-table definitions, but also some
non-critical fixes and removal of some unnecessary memory types. The
most interesting change here is the reduction of ARCH_DMA_MINALIGN back
to 64 bytes, since we're not aware of any machines that need a higher
value with the way the code is structured (only needed for non-coherent
DMA).
* for-next/mm:
arm64: tlb: fix the TTL value of tlb_get_level
arm64/mm: Rename ARM64_SWAPPER_USES_SECTION_MAPS
arm64: head: fix code comments in set_cpu_boot_mode_flag
arm64: mm: drop unused __pa(__idmap_text_start)
arm64: mm: fix the count comments in compute_indices
arm64/mm: Fix ttbr0 values stored in struct thread_info for software-pan
arm64: mm: Pass original fault address to handle_mm_fault()
arm64/mm: Drop SECTION_[SHIFT|SIZE|MASK]
arm64/mm: Use CONT_PMD_SHIFT for ARM64_MEMSTART_SHIFT
arm64/mm: Drop SWAPPER_INIT_MAP_SIZE
arm64: mm: decode xFSC in mem_abort_decode()
arm64: mm: Add is_el1_data_abort() helper
arm64: cache: Lower ARCH_DMA_MINALIGN to 64 (L1_CACHE_BYTES)
arm64: mm: Remove unused support for Normal-WT memory type
arm64: acpi: Map EFI_MEMORY_WT memory as Normal-NC
arm64: mm: Remove unused support for Device-GRE memory type
arm64: mm: Use better bitmap_zalloc()
arm64/mm: Make vmemmap_free() available only with CONFIG_MEMORY_HOTPLUG
arm64/mm: Remove [PUD|PMD]_TABLE_BIT from [pud|pmd]_bad()
arm64/mm: Validate CONFIG_PGTABLE_LEVELS
|
|
Reduce loglevel of useless print during CPU offlining.
* for-next/misc:
arm64: smp: Bump debugging information print down to KERN_DEBUG
|
|
Optimise out-of-line KASAN checking when using software tagging.
* for-next/kasan:
kasan: arm64: support specialized outlined tag mismatch checks
|
|
Refactoring of our instruction decoding routines and addition of some
missing encodings.
* for-next/insn:
arm64: insn: avoid circular include dependency
arm64: insn: move AARCH64_INSN_SIZE into <asm/insn.h>
arm64: insn: decouple patching from insn code
arm64: insn: Add load/store decoding helpers
arm64: insn: Add some opcodes to instruction decoder
arm64: insn: Add barrier encodings
arm64: insn: Add SVE instruction class
arm64: Move instruction encoder/decoder under lib/
arm64: Move aarch32 condition check functions
arm64: Move patching utilities out of instruction encoding/decoding
|