summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-01-26ceph: set pool_ns in new inode layout for async createsJeff Layton
Dan reported that he was unable to write to files that had been asynchronously created when the client's OSD caps are restricted to a particular namespace. The issue is that the layout for the new inode is only partially being filled. Ensure that we populate the pool_ns_data and pool_ns_len in the iinfo before calling ceph_fill_inode. Cc: stable@vger.kernel.org URL: https://tracker.ceph.com/issues/54013 Fixes: 9a8d03ca2e2c ("ceph: attempt to do async create when possible") Reported-by: Dan van der Ster <dan@vanderster.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-01-26ceph: properly put ceph_string reference after async create attemptJeff Layton
The reference acquired by try_prep_async_create is currently leaked. Ensure we put it. Cc: stable@vger.kernel.org Fixes: 9a8d03ca2e2c ("ceph: attempt to do async create when possible") Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-01-26ceph: put the requests/sessions when it fails to alloc memoryXiubo Li
When failing to allocate the sessions memory we should make sure the req1 and req2 and the sessions get put. And also in case the max_sessions decreased so when kreallocate the new memory some sessions maybe missed being put. And if the max_sessions is 0 krealloc will return ZERO_SIZE_PTR, which will lead to a distinct access fault. URL: https://tracker.ceph.com/issues/53819 Fixes: e1a4541ec0b9 ("ceph: flush the mdlog before waiting on unsafe reqs") Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-01-26selftests: kvm: move vm_xsave_req_perm call to amx_testPaolo Bonzini
There is no need for tests other than amx_test to enable dynamic xsave states. Remove the call to vm_xsave_req_perm from generic code, and move it inside the test. While at it, allow customizing the bit that is requested, so that future tests can use it differently. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26KVM: x86: Sync the states size with the XCR0/IA32_XSS at, any timeLike Xu
XCR0 is reset to 1 by RESET but not INIT and IA32_XSS is zeroed by both RESET and INIT. The kvm_set_msr_common()'s handling of MSR_IA32_XSS also needs to update kvm_update_cpuid_runtime(). In the above cases, the size in bytes of the XSAVE area containing all states enabled by XCR0 or (XCRO | IA32_XSS) needs to be updated. For simplicity and consistency, existing helpers are used to write values and call kvm_update_cpuid_runtime(), and it's not exactly a fast path. Fixes: a554d207dc46 ("KVM: X86: Processor States following Reset or INIT") Cc: stable@vger.kernel.org Signed-off-by: Like Xu <likexu@tencent.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220126172226.2298529-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26KVM: x86: Update vCPU's runtime CPUID on write to MSR_IA32_XSSLike Xu
Do a runtime CPUID update for a vCPU if MSR_IA32_XSS is written, as the size in bytes of the XSAVE area is affected by the states enabled in XSS. Fixes: 203000993de5 ("kvm: vmx: add MSR logic for XSAVES") Cc: stable@vger.kernel.org Signed-off-by: Like Xu <likexu@tencent.com> [sean: split out as a separate patch, adjust Fixes tag] Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220126172226.2298529-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26KVM: x86: Keep MSR_IA32_XSS unchanged for INITXiaoyao Li
It has been corrected from SDM version 075 that MSR_IA32_XSS is reset to zero on Power up and Reset but keeps unchanged on INIT. Fixes: a554d207dc46 ("KVM: X86: Processor States following Reset or INIT") Cc: stable@vger.kernel.org Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220126172226.2298529-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26s390/hypfs: include z/VM guests with access control group setVasily Gorbik
Currently if z/VM guest is allowed to retrieve hypervisor performance data globally for all guests (privilege class B) the query is formed in a way to include all guests but the group name is left empty. This leads to that z/VM guests which have access control group set not being included in the results (even local vm). Change the query group identifier from empty to "any" to retrieve information about all guests from any groups (or without a group set). Cc: stable@vger.kernel.org Fixes: 31cb4bd31a48 ("[S390] Hypervisor filesystem (s390_hypfs) for z/VM") Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-01-26KVM: x86: Free kvm_cpuid_entry2 array on post-KVM_RUN KVM_SET_CPUID{,2}Sean Christopherson
Free the "struct kvm_cpuid_entry2" array on successful post-KVM_RUN KVM_SET_CPUID{,2} to fix a memory leak, the callers of kvm_set_cpuid() free the array only on failure. BUG: memory leak unreferenced object 0xffff88810963a800 (size 2048): comm "syz-executor025", pid 3610, jiffies 4294944928 (age 8.080s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 0d 00 00 00 ................ 47 65 6e 75 6e 74 65 6c 69 6e 65 49 00 00 00 00 GenuntelineI.... backtrace: [<ffffffff814948ee>] kmalloc_node include/linux/slab.h:604 [inline] [<ffffffff814948ee>] kvmalloc_node+0x3e/0x100 mm/util.c:580 [<ffffffff814950f2>] kvmalloc include/linux/slab.h:732 [inline] [<ffffffff814950f2>] vmemdup_user+0x22/0x100 mm/util.c:199 [<ffffffff8109f5ff>] kvm_vcpu_ioctl_set_cpuid2+0x8f/0xf0 arch/x86/kvm/cpuid.c:423 [<ffffffff810711b9>] kvm_arch_vcpu_ioctl+0xb99/0x1e60 arch/x86/kvm/x86.c:5251 [<ffffffff8103e92d>] kvm_vcpu_ioctl+0x4ad/0x950 arch/x86/kvm/../../../virt/kvm/kvm_main.c:4066 [<ffffffff815afacc>] vfs_ioctl fs/ioctl.c:51 [inline] [<ffffffff815afacc>] __do_sys_ioctl fs/ioctl.c:874 [inline] [<ffffffff815afacc>] __se_sys_ioctl fs/ioctl.c:860 [inline] [<ffffffff815afacc>] __x64_sys_ioctl+0xfc/0x140 fs/ioctl.c:860 [<ffffffff844a3335>] do_syscall_x64 arch/x86/entry/common.c:50 [inline] [<ffffffff844a3335>] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 [<ffffffff84600068>] entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes: c6617c61e8fe ("KVM: x86: Partially allow KVM_SET_CPUID{,2} after KVM_RUN") Cc: stable@vger.kernel.org Reported-by: syzbot+be576ad7655690586eec@syzkaller.appspotmail.com Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220125210445.2053429-1-seanjc@google.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26KVM: nVMX: WARN on any attempt to allocate shadow VMCS for vmcs02Sean Christopherson
WARN if KVM attempts to allocate a shadow VMCS for vmcs02. KVM emulates VMCS shadowing but doesn't virtualize it, i.e. KVM should never allocate a "real" shadow VMCS for L2. The previous code WARNed but continued anyway with the allocation, presumably in an attempt to avoid NULL pointer dereference. However, alloc_vmcs (and hence alloc_shadow_vmcs) can fail, and indeed the sole caller does: if (enable_shadow_vmcs && !alloc_shadow_vmcs(vcpu)) goto out_shadow_vmcs; which makes it not a useful attempt. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220125220527.2093146-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26KVM: selftests: Don't skip L2's VMCALL in SMM test for SVM guestSean Christopherson
Don't skip the vmcall() in l2_guest_code() prior to re-entering L2, doing so will result in L2 running to completion, popping '0' off the stack for RET, jumping to address '0', and ultimately dying with a triple fault shutdown. It's not at all obvious why the test re-enters L2 and re-executes VMCALL, but presumably it serves a purpose. The VMX path doesn't skip vmcall(), and the test can't possibly have passed on SVM, so just do what VMX does. Fixes: d951b2210c1a ("KVM: selftests: smm_test: Test SMM enter from L2") Cc: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220125221725.2101126-1-seanjc@google.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Tested-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26KVM: x86: Check .flags in kvm_cpuid_check_equal() tooVitaly Kuznetsov
kvm_cpuid_check_equal() checks for the (full) equality of the supplied CPUID data so .flags need to be checked too. Reported-by: Sean Christopherson <seanjc@google.com> Fixes: c6617c61e8fe ("KVM: x86: Partially allow KVM_SET_CPUID{,2} after KVM_RUN") Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220126131804.2839410-1-vkuznets@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26KVM: x86: Forcibly leave nested virt when SMM state is toggledSean Christopherson
Forcibly leave nested virtualization operation if userspace toggles SMM state via KVM_SET_VCPU_EVENTS or KVM_SYNC_X86_EVENTS. If userspace forces the vCPU out of SMM while it's post-VMXON and then injects an SMI, vmx_enter_smm() will overwrite vmx->nested.smm.vmxon and end up with both vmxon=false and smm.vmxon=false, but all other nVMX state allocated. Don't attempt to gracefully handle the transition as (a) most transitions are nonsencial, e.g. forcing SMM while L2 is running, (b) there isn't sufficient information to handle all transitions, e.g. SVM wants access to the SMRAM save state, and (c) KVM_SET_VCPU_EVENTS must precede KVM_SET_NESTED_STATE during state restore as the latter disallows putting the vCPU into L2 if SMM is active, and disallows tagging the vCPU as being post-VMXON in SMM if SMM is not active. Abuse of KVM_SET_VCPU_EVENTS manifests as a WARN and memory leak in nVMX due to failure to free vmcs01's shadow VMCS, but the bug goes far beyond just a memory leak, e.g. toggling SMM on while L2 is active puts the vCPU in an architecturally impossible state. WARNING: CPU: 0 PID: 3606 at free_loaded_vmcs arch/x86/kvm/vmx/vmx.c:2665 [inline] WARNING: CPU: 0 PID: 3606 at free_loaded_vmcs+0x158/0x1a0 arch/x86/kvm/vmx/vmx.c:2656 Modules linked in: CPU: 1 PID: 3606 Comm: syz-executor725 Not tainted 5.17.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:free_loaded_vmcs arch/x86/kvm/vmx/vmx.c:2665 [inline] RIP: 0010:free_loaded_vmcs+0x158/0x1a0 arch/x86/kvm/vmx/vmx.c:2656 Code: <0f> 0b eb b3 e8 8f 4d 9f 00 e9 f7 fe ff ff 48 89 df e8 92 4d 9f 00 Call Trace: <TASK> kvm_arch_vcpu_destroy+0x72/0x2f0 arch/x86/kvm/x86.c:11123 kvm_vcpu_destroy arch/x86/kvm/../../../virt/kvm/kvm_main.c:441 [inline] kvm_destroy_vcpus+0x11f/0x290 arch/x86/kvm/../../../virt/kvm/kvm_main.c:460 kvm_free_vcpus arch/x86/kvm/x86.c:11564 [inline] kvm_arch_destroy_vm+0x2e8/0x470 arch/x86/kvm/x86.c:11676 kvm_destroy_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:1217 [inline] kvm_put_kvm+0x4fa/0xb00 arch/x86/kvm/../../../virt/kvm/kvm_main.c:1250 kvm_vm_release+0x3f/0x50 arch/x86/kvm/../../../virt/kvm/kvm_main.c:1273 __fput+0x286/0x9f0 fs/file_table.c:311 task_work_run+0xdd/0x1a0 kernel/task_work.c:164 exit_task_work include/linux/task_work.h:32 [inline] do_exit+0xb29/0x2a30 kernel/exit.c:806 do_group_exit+0xd2/0x2f0 kernel/exit.c:935 get_signal+0x4b0/0x28c0 kernel/signal.c:2862 arch_do_signal_or_restart+0x2a9/0x1c40 arch/x86/kernel/signal.c:868 handle_signal_work kernel/entry/common.c:148 [inline] exit_to_user_mode_loop kernel/entry/common.c:172 [inline] exit_to_user_mode_prepare+0x17d/0x290 kernel/entry/common.c:207 __syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline] syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:300 do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86 entry_SYSCALL_64_after_hwframe+0x44/0xae </TASK> Cc: stable@vger.kernel.org Reported-by: syzbot+8112db3ab20e70d50c31@syzkaller.appspotmail.com Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220125220358.2091737-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26KVM: SVM: drop unnecessary code in svm_hv_vmcb_dirty_nested_enlightenments()Vitaly Kuznetsov
Commit 3fa5e8fd0a0e4 ("KVM: SVM: delay svm_vcpu_init_msrpm after svm->vmcb is initialized") re-arranged svm_vcpu_init_msrpm() call in svm_create_vcpu(), thus making the comment about vmcb being NULL obsolete. Drop it. While on it, drop superfluous vmcb_is_clean() check: vmcb_mark_dirty() is a bit flip, an extra check is unlikely to bring any performance gain. Drop now-unneeded vmcb_is_clean() helper as well. Fixes: 3fa5e8fd0a0e4 ("KVM: SVM: delay svm_vcpu_init_msrpm after svm->vmcb is initialized") Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20211220152139.418372-2-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26KVM: SVM: hyper-v: Enable Enlightened MSR-Bitmap support for realVitaly Kuznetsov
Commit c4327f15dfc7 ("KVM: SVM: hyper-v: Enlightened MSR-Bitmap support") introduced enlightened MSR-Bitmap support for KVM-on-Hyper-V but it didn't actually enable the support. Similar to enlightened NPT TLB flush and direct TLB flush features, the guest (KVM) has to tell L0 (Hyper-V) that it's using the feature by setting the appropriate feature fit in VMCB control area (sw reserved fields). Fixes: c4327f15dfc7 ("KVM: SVM: hyper-v: Enlightened MSR-Bitmap support") Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20211220152139.418372-3-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26KVM: SVM: Don't kill SEV guest if SMAP erratum triggers in usermodeSean Christopherson
Inject a #GP instead of synthesizing triple fault to try to avoid killing the guest if emulation of an SEV guest fails due to encountering the SMAP erratum. The injected #GP may still be fatal to the guest, e.g. if the userspace process is providing critical functionality, but KVM should make every attempt to keep the guest alive. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Message-Id: <20220120010719.711476-10-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26KVM: SVM: Don't apply SEV+SMAP workaround on code fetch or PT accessSean Christopherson
Resume the guest instead of synthesizing a triple fault shutdown if the instruction bytes buffer is empty due to the #NPF being on the code fetch itself or on a page table access. The SMAP errata applies if and only if the code fetch was successful and ucode's subsequent data read from the code page encountered a SMAP violation. In practice, the guest is likely hosed either way, but crashing the guest on a code fetch to emulated MMIO is technically wrong according to the behavior described in the APM. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Message-Id: <20220120010719.711476-9-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26KVM: SVM: Inject #UD on attempted emulation for SEV guest w/o insn bufferSean Christopherson
Inject #UD if KVM attempts emulation for an SEV guests without an insn buffer and instruction decoding is required. The previous behavior of allowing emulation if there is no insn buffer is undesirable as doing so means KVM is reading guest private memory and thus decoding cyphertext, i.e. is emulating garbage. The check was previously necessary as the emulation type was not provided, i.e. SVM needed to allow emulation to handle completion of emulation after exiting to userspace to handle I/O. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Message-Id: <20220120010719.711476-8-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26KVM: SVM: WARN if KVM attempts emulation on #UD or #GP for SEV guestsSean Christopherson
WARN if KVM attempts to emulate in response to #UD or #GP for SEV guests, i.e. if KVM intercepts #UD or #GP, as emulation on any fault except #NPF is impossible since KVM cannot read guest private memory to get the code stream, and the CPU's DecodeAssists feature only provides the instruction bytes on #NPF. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Message-Id: <20220120010719.711476-7-seanjc@google.com> [Warn on EMULTYPE_TRAP_UD_FORCED according to Liam Merwick's review. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26KVM: x86: Pass emulation type to can_emulate_instruction()Sean Christopherson
Pass the emulation type to kvm_x86_ops.can_emulate_insutrction() so that a future commit can harden KVM's SEV support to WARN on emulation scenarios that should never happen. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Message-Id: <20220120010719.711476-6-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26KVM: SVM: Explicitly require DECODEASSISTS to enable SEV supportSean Christopherson
Add a sanity check on DECODEASSIST being support if SEV is supported, as KVM cannot read guest private memory and thus relies on the CPU to provide the instruction byte stream on #NPF for emulation. The intent of the check is to document the dependency, it should never fail in practice as producing hardware that supports SEV but not DECODEASSISTS would be non-sensical. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Message-Id: <20220120010719.711476-5-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26KVM: SVM: Don't intercept #GP for SEV guestsSean Christopherson
Never intercept #GP for SEV guests as reading SEV guest private memory will return cyphertext, i.e. emulating on #GP can't work as intended. Cc: stable@vger.kernel.org Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Message-Id: <20220120010719.711476-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26Revert "KVM: SVM: avoid infinite loop on NPF from bad address"Sean Christopherson
Revert a completely broken check on an "invalid" RIP in SVM's workaround for the DecodeAssists SMAP errata. kvm_vcpu_gfn_to_memslot() obviously expects a gfn, i.e. operates in the guest physical address space, whereas RIP is a virtual (not even linear) address. The "fix" worked for the problematic KVM selftest because the test identity mapped RIP. Fully revert the hack instead of trying to translate RIP to a GPA, as the non-SEV case is now handled earlier, and KVM cannot access guest page tables to translate RIP. This reverts commit e72436bc3a5206f95bb384e741154166ddb3202e. Fixes: e72436bc3a52 ("KVM: SVM: avoid infinite loop on NPF from bad address") Reported-by: Liam Merwick <liam.merwick@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Message-Id: <20220120010719.711476-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26KVM: SVM: Never reject emulation due to SMAP errata for !SEV guestsSean Christopherson
Always signal that emulation is possible for !SEV guests regardless of whether or not the CPU provided a valid instruction byte stream. KVM can read all guest state (memory and registers) for !SEV guests, i.e. can fetch the code stream from memory even if the CPU failed to do so because of the SMAP errata. Fixes: 05d5a4863525 ("KVM: SVM: Workaround errata#1096 (insn_len maybe zero on SMAP violation)") Cc: stable@vger.kernel.org Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Message-Id: <20220120010719.711476-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26KVM: x86: nSVM: skip eax alignment check for non-SVM instructionsDenis Valeev
The bug occurs on #GP triggered by VMware backdoor when eax value is unaligned. eax alignment check should not be applied to non-SVM instructions because it leads to incorrect omission of the instructions emulation. Apply the alignment check only to SVM instructions to fix. Fixes: d1cba6c92237 ("KVM: x86: nSVM: test eax for 4K alignment for GP errata workaround") Signed-off-by: Denis Valeev <lemniscattaden@gmail.com> Message-Id: <Yexlhaoe1Fscm59u@q> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26KVM: x86/cpuid: Exclude unpermitted xfeatures sizes at KVM_GET_SUPPORTED_CPUIDLike Xu
With the help of xstate_get_guest_group_perm(), KVM can exclude unpermitted xfeatures in cpuid.0xd.0.eax, in which case the corresponding xfeatures sizes should also be matched to the permitted xfeatures. To fix this inconsistency, the permitted_xcr0 and permitted_xss are defined consistently, which implies 'supported' plus certain permissions for this task, and it also fixes cpuid.0xd.1.ebx and later leaf-by-leaf queries. Fixes: 445ecdf79be0 ("kvm: x86: Exclude unpermitted xfeatures at KVM_GET_SUPPORTED_CPUID") Signed-off-by: Like Xu <likexu@tencent.com> Message-Id: <20220125115223.33707-1-likexu@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26KVM: LAPIC: Also cancel preemption timer during SET_LAPICWanpeng Li
The below warning is splatting during guest reboot. ------------[ cut here ]------------ WARNING: CPU: 0 PID: 1931 at arch/x86/kvm/x86.c:10322 kvm_arch_vcpu_ioctl_run+0x874/0x880 [kvm] CPU: 0 PID: 1931 Comm: qemu-system-x86 Tainted: G I 5.17.0-rc1+ #5 RIP: 0010:kvm_arch_vcpu_ioctl_run+0x874/0x880 [kvm] Call Trace: <TASK> kvm_vcpu_ioctl+0x279/0x710 [kvm] __x64_sys_ioctl+0x83/0xb0 do_syscall_64+0x3b/0xc0 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7fd39797350b This can be triggered by not exposing tsc-deadline mode and doing a reboot in the guest. The lapic_shutdown() function which is called in sys_reboot path will not disarm the flying timer, it just masks LVTT. lapic_shutdown() clears APIC state w/ LVT_MASKED and timer-mode bit is 0, this can trigger timer-mode switch between tsc-deadline and oneshot/periodic, which can result in preemption timer be cancelled in apic_update_lvtt(). However, We can't depend on this when not exposing tsc-deadline mode and oneshot/periodic modes emulated by preemption timer. Qemu will synchronise states around reset, let's cancel preemption timer under KVM_SET_LAPIC. Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Message-Id: <1643102220-35667-1-git-send-email-wanpengli@tencent.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26KVM: VMX: Remove vmcs_config.orderJim Mattson
The maximum size of a VMCS (or VMXON region) is 4096. By definition, these are order 0 allocations. Signed-off-by: Jim Mattson <jmattson@google.com> Message-Id: <20220125004359.147600-1-jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26Merge branch 'lan966x-fixes'David S. Miller
Horatiu Vultur says: ==================== net: lan966x: Fixes for sleep in atomic context This patch series contains 2 fixes for lan966x that is sleeping in atomic context. The first patch fixes the injection of the frames while the second one fixes the updating of the MAC table. v1->v2: - correct the fix tag in the second patch, it was using the wrong sha. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-26net: lan966x: Fix sleep in atomic context when updating MAC tableHoratiu Vultur
The function lan966x_mac_wait_for_completion is used to poll the status of the MAC table using the function readx_poll_timeout. The problem with this function is that is called also from atomic context. Therefore update the function to use readx_poll_timeout_atomic. Fixes: e18aba8941b40b ("net: lan966x: add mactable support") Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-26net: lan966x: Fix sleep in atomic context when injecting framesHoratiu Vultur
On lan966x, when injecting a frame it was polling the register QS_INJ_STATUS to see if it can continue with the injection of the frame. The problem was that it was using readx_poll_timeout which could sleep in atomic context. This patch fixes this issue by using readx_poll_timeout_atomic. Fixes: d28d6d2e37d10d ("net: lan966x: add port module support") Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-26Merge branch 'dev_addr-const-fixes'David S. Miller
Jakub Kicinski says: ==================== ethernet: fix some esoteric drivers after netdev->dev_addr constification Looking at recent fixes for drivers which don't get included with allmodconfig builds I thought it's worth grepping for more instances of: dev->dev_addr\[.*\] = This set contains the fixes. v2: add last 3 patches which fix drivers for the RiscPC ARM platform. Thanks to Arnd Bergmann for explaining how to build test that. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-26ethernet: seeq/ether3: don't write directly to netdev->dev_addrJakub Kicinski
netdev->dev_addr is const now. Compile tested rpc_defconfig w/ GCC 8.5. Fixes: adeef3e32146 ("net: constify netdev->dev_addr") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-26ethernet: 8390/etherh: don't write directly to netdev->dev_addrJakub Kicinski
netdev->dev_addr is const now. Compile tested rpc_defconfig w/ GCC 8.5. Fixes: adeef3e32146 ("net: constify netdev->dev_addr") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-26ethernet: i825xx: don't write directly to netdev->dev_addrJakub Kicinski
netdev->dev_addr is const now. Compile tested rpc_defconfig w/ GCC 8.5. Fixes: adeef3e32146 ("net: constify netdev->dev_addr") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-26ethernet: broadcom/sb1250-mac: don't write directly to netdev->dev_addrJakub Kicinski
netdev->dev_addr is const now. Compile tested bigsur_defconfig and sb1250_swarm_defconfig. Fixes: adeef3e32146 ("net: constify netdev->dev_addr") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-26ethernet: tundra: don't write directly to netdev->dev_addrJakub Kicinski
netdev->dev_addr is const now. Maintain the questionable offsetting in ndo_set_mac_address. Compile tested holly_defconfig and mpc7448_hpc2_defconfig. Fixes: adeef3e32146 ("net: constify netdev->dev_addr") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-26ethernet: 3com/typhoon: don't write directly to netdev->dev_addrJakub Kicinski
This driver casts off the const and writes directly to netdev->dev_addr. This will result in a MAC address tree corruption and a warning. Compile tested ppc6xx_defconfig. Fixes: adeef3e32146 ("net: constify netdev->dev_addr") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-26drm/privacy-screen: honor acpi=off in detect_thinkpad_privacy_screenTong Zhang
when acpi=off is provided in bootarg, kernel crash with [ 1.252739] BUG: kernel NULL pointer dereference, address: 0000000000000018 [ 1.258308] Call Trace: [ 1.258490] ? acpi_walk_namespace+0x147/0x147 [ 1.258770] acpi_get_devices+0xe4/0x137 [ 1.258921] ? drm_core_init+0xc0/0xc0 [drm] [ 1.259108] detect_thinkpad_privacy_screen+0x5e/0xa8 [drm] [ 1.259337] drm_privacy_screen_lookup_init+0xe/0xe85 [drm] The reason is that acpi_walk_namespace expects acpi related stuff initialized but in fact it wouldn't when acpi is set to off. In this case we should honor acpi=off in detect_thinkpad_privacy_screen(). Signed-off-by: Tong Zhang <ztong0001@gmail.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220123091004.763775-1-ztong0001@gmail.com
2022-01-26Revert "drm/ast: Support 1600x900 with 108MHz PCLK"Dave Airlie
This reverts commit 9bb7b689274b67ecb3641e399e76f84adc627df1. This caused a regression reported to Red Hat. Fixes: 9bb7b689274b ("drm/ast: Support 1600x900 with 108MHz PCLK") Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20220120040527.552068-1-airlied@gmail.com
2022-01-26Merge drm/drm-fixes into drm-misc-fixesThomas Zimmermann
Backmerging drm/drm-fixes into drm-misc-fixes for v5.17-rc1. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2022-01-25sch_htb: Fail on unsupported parameters when offload is requestedMaxim Mikityanskiy
The current implementation of HTB offload doesn't support some parameters. Instead of ignoring them, actively return the EINVAL error when they are set to non-defaults. As this patch goes to stable, the driver API is not changed here. If future drivers support more offload parameters, the checks can be moved to the driver side. Note that the buffer and cbuffer parameters are also not supported, but the tc userspace tool assigns some default values derived from rate and ceil, and identifying these defaults in sch_htb would be unreliable, so they are still ignored. Fixes: d03b195b5aa0 ("sch_htb: Hierarchical QoS hardware offload") Reported-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20220125100654.424570-1-maximmi@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-25drm/amdgpu/display: Remove t_srx_delay_us.Bas Nieuwenhuizen
Unused. Convert the divisions into asserts on the divisor, to debug why it is zero. The divide by zero is suspected of causing kernel panics. While I have no idea where the zero is coming from I think this patch is a positive either way. Cc: stable@vger.kernel.org Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25drm/amd/display: Wrap dcn301_calculate_wm_and_dlg for FPU.Bas Nieuwenhuizen
Mirrors the logic for dcn30. Cue lots of WARNs and some kernel panics without this fix. Cc: stable@vger.kernel.org Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25drm/amd/display: Fix FP start/end for dcn30_internal_validate_bw.Bas Nieuwenhuizen
It calls populate_dml_pipes which uses doubles to initialize the scale_ratio_depth params. Mirrors the dcn20 logic. Cc: stable@vger.kernel.org Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25drm/amd/display/dc/calcs/dce_calcs: Fix a memleak in calculate_bandwidth()Zhou Qingyang
In calculate_bandwidth(), the tag free_sclk and free_yclk are reversed, which could lead to a memory leak of yclk. Fix this bug by changing the location of free_sclk and free_yclk. This bug was found by a static analyzer. Builds with 'make allyesconfig' show no new warnings, and our static analyzer no longer warns about this code. Fixes: 2be8989d0fc2 ("drm/amd/display/dc/calcs/dce_calcs: Move some large variables from the stack to the heap") Signed-off-by: Zhou Qingyang <zhou1615@umn.edu> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25drm/amdgpu/display: use msleep rather than udelay for long delaysAlex Deucher
Some architectures (e.g., ARM) throw an compilation error if the udelay is too long. In general udelays of longer than 2000us are not recommended on any architecture. Switch to msleep in these cases. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25drm/amdgpu/display: adjust msleep limit in dp_wait_for_training_aux_rd_intervalAlex Deucher
Some architectures (e.g., ARM) have relatively low udelay limits. On most architectures, anything longer than 2000us is not recommended. Change the check to align with other similar checks in DC. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25drm/amdgpu: filter out radeon secondary ids as wellAlex Deucher
Older radeon boards (r2xx-r5xx) had secondary PCI functions which we solely there for supporting multi-head on OSs with special requirements. Add them to the unsupported list as well so we don't attempt to bind to them. The driver would fail to bind to them anyway, but this does so in a cleaner way that should not confuse the user. Cc: stable@vger.kernel.org Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-25drm/amd/display: change FIFO reset condition to embedded display onlyZhan Liu
[Why] FIFO reset is only necessary for fast boot sequence, where otg is disabled and dig fe is enabled when changing dispclk. Fast boot is only enabled on embedded displays. [How] Change FIFO reset condition to "embedded display only". Signed-off-by: Zhan Liu <zhan.liu@amd.com> Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>