linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2022-01-26	KVM: nVMX: WARN on any attempt to allocate shadow VMCS for vmcs02	Sean Christopherson
	WARN if KVM attempts to allocate a shadow VMCS for vmcs02. KVM emulates VMCS shadowing but doesn't virtualize it, i.e. KVM should never allocate a "real" shadow VMCS for L2. The previous code WARNed but continued anyway with the allocation, presumably in an attempt to avoid NULL pointer dereference. However, alloc_vmcs (and hence alloc_shadow_vmcs) can fail, and indeed the sole caller does: if (enable_shadow_vmcs && !alloc_shadow_vmcs(vcpu)) goto out_shadow_vmcs; which makes it not a useful attempt. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220125220527.2093146-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26	KVM: selftests: Don't skip L2's VMCALL in SMM test for SVM guest	Sean Christopherson
	Don't skip the vmcall() in l2_guest_code() prior to re-entering L2, doing so will result in L2 running to completion, popping '0' off the stack for RET, jumping to address '0', and ultimately dying with a triple fault shutdown. It's not at all obvious why the test re-enters L2 and re-executes VMCALL, but presumably it serves a purpose. The VMX path doesn't skip vmcall(), and the test can't possibly have passed on SVM, so just do what VMX does. Fixes: d951b2210c1a ("KVM: selftests: smm_test: Test SMM enter from L2") Cc: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220125221725.2101126-1-seanjc@google.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Tested-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26	KVM: x86: Check .flags in kvm_cpuid_check_equal() too	Vitaly Kuznetsov
	kvm_cpuid_check_equal() checks for the (full) equality of the supplied CPUID data so .flags need to be checked too. Reported-by: Sean Christopherson <seanjc@google.com> Fixes: c6617c61e8fe ("KVM: x86: Partially allow KVM_SET_CPUID{,2} after KVM_RUN") Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220126131804.2839410-1-vkuznets@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26	KVM: x86: Forcibly leave nested virt when SMM state is toggled	Sean Christopherson
	Forcibly leave nested virtualization operation if userspace toggles SMM state via KVM_SET_VCPU_EVENTS or KVM_SYNC_X86_EVENTS. If userspace forces the vCPU out of SMM while it's post-VMXON and then injects an SMI, vmx_enter_smm() will overwrite vmx->nested.smm.vmxon and end up with both vmxon=false and smm.vmxon=false, but all other nVMX state allocated. Don't attempt to gracefully handle the transition as (a) most transitions are nonsencial, e.g. forcing SMM while L2 is running, (b) there isn't sufficient information to handle all transitions, e.g. SVM wants access to the SMRAM save state, and (c) KVM_SET_VCPU_EVENTS must precede KVM_SET_NESTED_STATE during state restore as the latter disallows putting the vCPU into L2 if SMM is active, and disallows tagging the vCPU as being post-VMXON in SMM if SMM is not active. Abuse of KVM_SET_VCPU_EVENTS manifests as a WARN and memory leak in nVMX due to failure to free vmcs01's shadow VMCS, but the bug goes far beyond just a memory leak, e.g. toggling SMM on while L2 is active puts the vCPU in an architecturally impossible state. WARNING: CPU: 0 PID: 3606 at free_loaded_vmcs arch/x86/kvm/vmx/vmx.c:2665 [inline] WARNING: CPU: 0 PID: 3606 at free_loaded_vmcs+0x158/0x1a0 arch/x86/kvm/vmx/vmx.c:2656 Modules linked in: CPU: 1 PID: 3606 Comm: syz-executor725 Not tainted 5.17.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:free_loaded_vmcs arch/x86/kvm/vmx/vmx.c:2665 [inline] RIP: 0010:free_loaded_vmcs+0x158/0x1a0 arch/x86/kvm/vmx/vmx.c:2656 Code: <0f> 0b eb b3 e8 8f 4d 9f 00 e9 f7 fe ff ff 48 89 df e8 92 4d 9f 00 Call Trace: <TASK> kvm_arch_vcpu_destroy+0x72/0x2f0 arch/x86/kvm/x86.c:11123 kvm_vcpu_destroy arch/x86/kvm/../../../virt/kvm/kvm_main.c:441 [inline] kvm_destroy_vcpus+0x11f/0x290 arch/x86/kvm/../../../virt/kvm/kvm_main.c:460 kvm_free_vcpus arch/x86/kvm/x86.c:11564 [inline] kvm_arch_destroy_vm+0x2e8/0x470 arch/x86/kvm/x86.c:11676 kvm_destroy_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:1217 [inline] kvm_put_kvm+0x4fa/0xb00 arch/x86/kvm/../../../virt/kvm/kvm_main.c:1250 kvm_vm_release+0x3f/0x50 arch/x86/kvm/../../../virt/kvm/kvm_main.c:1273 __fput+0x286/0x9f0 fs/file_table.c:311 task_work_run+0xdd/0x1a0 kernel/task_work.c:164 exit_task_work include/linux/task_work.h:32 [inline] do_exit+0xb29/0x2a30 kernel/exit.c:806 do_group_exit+0xd2/0x2f0 kernel/exit.c:935 get_signal+0x4b0/0x28c0 kernel/signal.c:2862 arch_do_signal_or_restart+0x2a9/0x1c40 arch/x86/kernel/signal.c:868 handle_signal_work kernel/entry/common.c:148 [inline] exit_to_user_mode_loop kernel/entry/common.c:172 [inline] exit_to_user_mode_prepare+0x17d/0x290 kernel/entry/common.c:207 __syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline] syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:300 do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86 entry_SYSCALL_64_after_hwframe+0x44/0xae </TASK> Cc: stable@vger.kernel.org Reported-by: syzbot+8112db3ab20e70d50c31@syzkaller.appspotmail.com Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220125220358.2091737-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26	KVM: SVM: drop unnecessary code in svm_hv_vmcb_dirty_nested_enlightenments()	Vitaly Kuznetsov
	Commit 3fa5e8fd0a0e4 ("KVM: SVM: delay svm_vcpu_init_msrpm after svm->vmcb is initialized") re-arranged svm_vcpu_init_msrpm() call in svm_create_vcpu(), thus making the comment about vmcb being NULL obsolete. Drop it. While on it, drop superfluous vmcb_is_clean() check: vmcb_mark_dirty() is a bit flip, an extra check is unlikely to bring any performance gain. Drop now-unneeded vmcb_is_clean() helper as well. Fixes: 3fa5e8fd0a0e4 ("KVM: SVM: delay svm_vcpu_init_msrpm after svm->vmcb is initialized") Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20211220152139.418372-2-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26	KVM: SVM: hyper-v: Enable Enlightened MSR-Bitmap support for real	Vitaly Kuznetsov
	Commit c4327f15dfc7 ("KVM: SVM: hyper-v: Enlightened MSR-Bitmap support") introduced enlightened MSR-Bitmap support for KVM-on-Hyper-V but it didn't actually enable the support. Similar to enlightened NPT TLB flush and direct TLB flush features, the guest (KVM) has to tell L0 (Hyper-V) that it's using the feature by setting the appropriate feature fit in VMCB control area (sw reserved fields). Fixes: c4327f15dfc7 ("KVM: SVM: hyper-v: Enlightened MSR-Bitmap support") Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20211220152139.418372-3-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26	KVM: SVM: Don't kill SEV guest if SMAP erratum triggers in usermode	Sean Christopherson
	Inject a #GP instead of synthesizing triple fault to try to avoid killing the guest if emulation of an SEV guest fails due to encountering the SMAP erratum. The injected #GP may still be fatal to the guest, e.g. if the userspace process is providing critical functionality, but KVM should make every attempt to keep the guest alive. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Message-Id: <20220120010719.711476-10-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26	KVM: SVM: Don't apply SEV+SMAP workaround on code fetch or PT access	Sean Christopherson
	Resume the guest instead of synthesizing a triple fault shutdown if the instruction bytes buffer is empty due to the #NPF being on the code fetch itself or on a page table access. The SMAP errata applies if and only if the code fetch was successful and ucode's subsequent data read from the code page encountered a SMAP violation. In practice, the guest is likely hosed either way, but crashing the guest on a code fetch to emulated MMIO is technically wrong according to the behavior described in the APM. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Message-Id: <20220120010719.711476-9-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26	KVM: SVM: Inject #UD on attempted emulation for SEV guest w/o insn buffer	Sean Christopherson
	Inject #UD if KVM attempts emulation for an SEV guests without an insn buffer and instruction decoding is required. The previous behavior of allowing emulation if there is no insn buffer is undesirable as doing so means KVM is reading guest private memory and thus decoding cyphertext, i.e. is emulating garbage. The check was previously necessary as the emulation type was not provided, i.e. SVM needed to allow emulation to handle completion of emulation after exiting to userspace to handle I/O. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Message-Id: <20220120010719.711476-8-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26	KVM: SVM: WARN if KVM attempts emulation on #UD or #GP for SEV guests	Sean Christopherson
	WARN if KVM attempts to emulate in response to #UD or #GP for SEV guests, i.e. if KVM intercepts #UD or #GP, as emulation on any fault except #NPF is impossible since KVM cannot read guest private memory to get the code stream, and the CPU's DecodeAssists feature only provides the instruction bytes on #NPF. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Message-Id: <20220120010719.711476-7-seanjc@google.com> [Warn on EMULTYPE_TRAP_UD_FORCED according to Liam Merwick's review. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26	KVM: x86: Pass emulation type to can_emulate_instruction()	Sean Christopherson
	Pass the emulation type to kvm_x86_ops.can_emulate_insutrction() so that a future commit can harden KVM's SEV support to WARN on emulation scenarios that should never happen. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Message-Id: <20220120010719.711476-6-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26	KVM: SVM: Explicitly require DECODEASSISTS to enable SEV support	Sean Christopherson
	Add a sanity check on DECODEASSIST being support if SEV is supported, as KVM cannot read guest private memory and thus relies on the CPU to provide the instruction byte stream on #NPF for emulation. The intent of the check is to document the dependency, it should never fail in practice as producing hardware that supports SEV but not DECODEASSISTS would be non-sensical. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Message-Id: <20220120010719.711476-5-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26	KVM: SVM: Don't intercept #GP for SEV guests	Sean Christopherson
	Never intercept #GP for SEV guests as reading SEV guest private memory will return cyphertext, i.e. emulating on #GP can't work as intended. Cc: stable@vger.kernel.org Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Message-Id: <20220120010719.711476-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26	Revert "KVM: SVM: avoid infinite loop on NPF from bad address"	Sean Christopherson
	Revert a completely broken check on an "invalid" RIP in SVM's workaround for the DecodeAssists SMAP errata. kvm_vcpu_gfn_to_memslot() obviously expects a gfn, i.e. operates in the guest physical address space, whereas RIP is a virtual (not even linear) address. The "fix" worked for the problematic KVM selftest because the test identity mapped RIP. Fully revert the hack instead of trying to translate RIP to a GPA, as the non-SEV case is now handled earlier, and KVM cannot access guest page tables to translate RIP. This reverts commit e72436bc3a5206f95bb384e741154166ddb3202e. Fixes: e72436bc3a52 ("KVM: SVM: avoid infinite loop on NPF from bad address") Reported-by: Liam Merwick <liam.merwick@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Message-Id: <20220120010719.711476-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26	KVM: SVM: Never reject emulation due to SMAP errata for !SEV guests	Sean Christopherson
	Always signal that emulation is possible for !SEV guests regardless of whether or not the CPU provided a valid instruction byte stream. KVM can read all guest state (memory and registers) for !SEV guests, i.e. can fetch the code stream from memory even if the CPU failed to do so because of the SMAP errata. Fixes: 05d5a4863525 ("KVM: SVM: Workaround errata#1096 (insn_len maybe zero on SMAP violation)") Cc: stable@vger.kernel.org Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Message-Id: <20220120010719.711476-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26	KVM: x86: nSVM: skip eax alignment check for non-SVM instructions	Denis Valeev
	The bug occurs on #GP triggered by VMware backdoor when eax value is unaligned. eax alignment check should not be applied to non-SVM instructions because it leads to incorrect omission of the instructions emulation. Apply the alignment check only to SVM instructions to fix. Fixes: d1cba6c92237 ("KVM: x86: nSVM: test eax for 4K alignment for GP errata workaround") Signed-off-by: Denis Valeev <lemniscattaden@gmail.com> Message-Id: <Yexlhaoe1Fscm59u@q> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26	KVM: x86/cpuid: Exclude unpermitted xfeatures sizes at KVM_GET_SUPPORTED_CPUID	Like Xu
	With the help of xstate_get_guest_group_perm(), KVM can exclude unpermitted xfeatures in cpuid.0xd.0.eax, in which case the corresponding xfeatures sizes should also be matched to the permitted xfeatures. To fix this inconsistency, the permitted_xcr0 and permitted_xss are defined consistently, which implies 'supported' plus certain permissions for this task, and it also fixes cpuid.0xd.1.ebx and later leaf-by-leaf queries. Fixes: 445ecdf79be0 ("kvm: x86: Exclude unpermitted xfeatures at KVM_GET_SUPPORTED_CPUID") Signed-off-by: Like Xu <likexu@tencent.com> Message-Id: <20220125115223.33707-1-likexu@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26	KVM: LAPIC: Also cancel preemption timer during SET_LAPIC	Wanpeng Li
	The below warning is splatting during guest reboot. ------------[ cut here ]------------ WARNING: CPU: 0 PID: 1931 at arch/x86/kvm/x86.c:10322 kvm_arch_vcpu_ioctl_run+0x874/0x880 [kvm] CPU: 0 PID: 1931 Comm: qemu-system-x86 Tainted: G I 5.17.0-rc1+ #5 RIP: 0010:kvm_arch_vcpu_ioctl_run+0x874/0x880 [kvm] Call Trace: <TASK> kvm_vcpu_ioctl+0x279/0x710 [kvm] __x64_sys_ioctl+0x83/0xb0 do_syscall_64+0x3b/0xc0 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7fd39797350b This can be triggered by not exposing tsc-deadline mode and doing a reboot in the guest. The lapic_shutdown() function which is called in sys_reboot path will not disarm the flying timer, it just masks LVTT. lapic_shutdown() clears APIC state w/ LVT_MASKED and timer-mode bit is 0, this can trigger timer-mode switch between tsc-deadline and oneshot/periodic, which can result in preemption timer be cancelled in apic_update_lvtt(). However, We can't depend on this when not exposing tsc-deadline mode and oneshot/periodic modes emulated by preemption timer. Qemu will synchronise states around reset, let's cancel preemption timer under KVM_SET_LAPIC. Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Message-Id: <1643102220-35667-1-git-send-email-wanpengli@tencent.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26	KVM: VMX: Remove vmcs_config.order	Jim Mattson
	The maximum size of a VMCS (or VMXON region) is 4096. By definition, these are order 0 allocations. Signed-off-by: Jim Mattson <jmattson@google.com> Message-Id: <20220125004359.147600-1-jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-26	cpuset: Fix the bug that subpart_cpus updated wrongly in update_cpumask()	Tianchen Ding
	subparts_cpus should be limited as a subset of cpus_allowed, but it is updated wrongly by using cpumask_andnot(). Use cpumask_and() instead to fix it. Fixes: ee8dde0cd2ce ("cpuset: Add new v2 cpuset.sched.partition flag") Signed-off-by: Tianchen Ding <dtcccc@linux.alibaba.com> Reviewed-by: Waiman Long <longman@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2022-01-26	PCI/sysfs: Find shadow ROM before static attribute initialization	Bjorn Helgaas
	Ville reported that the sysfs "rom" file for VGA devices disappeared after 527139d738d7 ("PCI/sysfs: Convert "rom" to static attribute"). Prior to 527139d738d7, FINAL fixups, including pci_fixup_video() where we find shadow ROMs, were run before pci_create_sysfs_dev_files() created the sysfs "rom" file. After 527139d738d7, "rom" is a static attribute and is created before FINAL fixups are run, so we didn't create "rom" files for shadow ROMs: acpi_pci_root_add ... pci_scan_single_device pci_device_add pci_fixup_video # <-- new HEADER fixup device_add ... if (grp->is_visible()) pci_dev_rom_attr_is_visible # after 527139d738d7 pci_bus_add_devices pci_bus_add_device pci_fixup_device(pci_fixup_final) pci_fixup_video # <-- previous FINAL fixup pci_create_sysfs_dev_files if (pci_resource_len(pdev, PCI_ROM_RESOURCE)) sysfs_create_bin_file("rom") # before 527139d738d7 Change pci_fixup_video() to be a HEADER fixup so it runs before sysfs static attributes are initialized. Rename the Loongson pci_fixup_radeon() to pci_fixup_video() and make its dmesg logging identical to the others since it is doing the same job. Link: https://lore.kernel.org/r/YbxqIyrkv3GhZVxx@intel.com Fixes: 527139d738d7 ("PCI/sysfs: Convert "rom" to static attribute") Link: https://lore.kernel.org/r/20220126154001.16895-1-helgaas@kernel.org Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v5.13+ Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Jiaxun Yang <jiaxun.yang@flygoat.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Krzysztof Wilczyński <kw@linux.com>
2022-01-26	spi: uniphier: fix reference count leak in uniphier_spi_probe()	Xin Xiong
	The issue happens in several error paths in uniphier_spi_probe(). When either dma_get_slave_caps() or devm_spi_register_master() returns an error code, the function forgets to decrease the refcount of both `dma_rx` and `dma_tx` objects, which may lead to refcount leaks. Fix it by decrementing the reference count of specific objects in those error paths. Signed-off-by: Xin Xiong <xiongx18@fudan.edu.cn> Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Reviewed-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Fixes: 28d1dddc59f6 ("spi: uniphier: Add DMA transfer mode support") Link: https://lore.kernel.org/r/20220125101214.35677-1-xiongx18@fudan.edu.cn Signed-off-by: Mark Brown <broonie@kernel.org>
2022-01-26	Merge branch 'lan966x-fixes'	David S. Miller
	Horatiu Vultur says: ==================== net: lan966x: Fixes for sleep in atomic context This patch series contains 2 fixes for lan966x that is sleeping in atomic context. The first patch fixes the injection of the frames while the second one fixes the updating of the MAC table. v1->v2: - correct the fix tag in the second patch, it was using the wrong sha. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-26	net: lan966x: Fix sleep in atomic context when updating MAC table	Horatiu Vultur
	The function lan966x_mac_wait_for_completion is used to poll the status of the MAC table using the function readx_poll_timeout. The problem with this function is that is called also from atomic context. Therefore update the function to use readx_poll_timeout_atomic. Fixes: e18aba8941b40b ("net: lan966x: add mactable support") Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-26	net: lan966x: Fix sleep in atomic context when injecting frames	Horatiu Vultur
	On lan966x, when injecting a frame it was polling the register QS_INJ_STATUS to see if it can continue with the injection of the frame. The problem was that it was using readx_poll_timeout which could sleep in atomic context. This patch fixes this issue by using readx_poll_timeout_atomic. Fixes: d28d6d2e37d10d ("net: lan966x: add port module support") Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-26	Merge branch 'dev_addr-const-fixes'	David S. Miller
	Jakub Kicinski says: ==================== ethernet: fix some esoteric drivers after netdev->dev_addr constification Looking at recent fixes for drivers which don't get included with allmodconfig builds I thought it's worth grepping for more instances of: dev->dev_addr\[.*\] = This set contains the fixes. v2: add last 3 patches which fix drivers for the RiscPC ARM platform. Thanks to Arnd Bergmann for explaining how to build test that. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-26	ethernet: seeq/ether3: don't write directly to netdev->dev_addr	Jakub Kicinski
	netdev->dev_addr is const now. Compile tested rpc_defconfig w/ GCC 8.5. Fixes: adeef3e32146 ("net: constify netdev->dev_addr") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-26	ethernet: 8390/etherh: don't write directly to netdev->dev_addr	Jakub Kicinski
	netdev->dev_addr is const now. Compile tested rpc_defconfig w/ GCC 8.5. Fixes: adeef3e32146 ("net: constify netdev->dev_addr") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-26	ethernet: i825xx: don't write directly to netdev->dev_addr	Jakub Kicinski
	netdev->dev_addr is const now. Compile tested rpc_defconfig w/ GCC 8.5. Fixes: adeef3e32146 ("net: constify netdev->dev_addr") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-26	ethernet: broadcom/sb1250-mac: don't write directly to netdev->dev_addr	Jakub Kicinski
	netdev->dev_addr is const now. Compile tested bigsur_defconfig and sb1250_swarm_defconfig. Fixes: adeef3e32146 ("net: constify netdev->dev_addr") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-26	ethernet: tundra: don't write directly to netdev->dev_addr	Jakub Kicinski
	netdev->dev_addr is const now. Maintain the questionable offsetting in ndo_set_mac_address. Compile tested holly_defconfig and mpc7448_hpc2_defconfig. Fixes: adeef3e32146 ("net: constify netdev->dev_addr") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-26	ethernet: 3com/typhoon: don't write directly to netdev->dev_addr	Jakub Kicinski
	This driver casts off the const and writes directly to netdev->dev_addr. This will result in a MAC address tree corruption and a warning. Compile tested ppc6xx_defconfig. Fixes: adeef3e32146 ("net: constify netdev->dev_addr") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-26	ALSA: hda: Fix UAF of leds class devs at unbinding	Takashi Iwai
	The LED class devices that are created by HD-audio codec drivers are registered via devm_led_classdev_register() and associated with the HD-audio codec device. Unfortunately, it turned out that the devres release doesn't work for this case; namely, since the codec resource release happens before the devm call chain, it triggers a NULL dereference or a UAF for a stale set_brightness_delay callback. For fixing the bug, this patch changes the LED class device register and unregister in a manual manner without devres, keeping the instances in hda_gen_spec. Reported-by: Alexander Sergeyev <sergeev917@gmail.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20220111195229.a77wrpjclqwrx4bx@localhost.localdomain Link: https://lore.kernel.org/r/20220126145011.16728-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-01-26	drm/privacy-screen: honor acpi=off in detect_thinkpad_privacy_screen	Tong Zhang
	when acpi=off is provided in bootarg, kernel crash with [ 1.252739] BUG: kernel NULL pointer dereference, address: 0000000000000018 [ 1.258308] Call Trace: [ 1.258490] ? acpi_walk_namespace+0x147/0x147 [ 1.258770] acpi_get_devices+0xe4/0x137 [ 1.258921] ? drm_core_init+0xc0/0xc0 [drm] [ 1.259108] detect_thinkpad_privacy_screen+0x5e/0xa8 [drm] [ 1.259337] drm_privacy_screen_lookup_init+0xe/0xe85 [drm] The reason is that acpi_walk_namespace expects acpi related stuff initialized but in fact it wouldn't when acpi is set to off. In this case we should honor acpi=off in detect_thinkpad_privacy_screen(). Signed-off-by: Tong Zhang <ztong0001@gmail.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220123091004.763775-1-ztong0001@gmail.com
2022-01-26	perf/core: Fix cgroup event list management	Namhyung Kim
	The active cgroup events are managed in the per-cpu cgrp_cpuctx_list. This list is only accessed from current cpu and not protected by any locks. But from the commit ef54c1a476ae ("perf: Rework perf_event_exit_event()"), it's possible to access (actually modify) the list from another cpu. In the perf_remove_from_context(), it can remove an event from the context without an IPI when the context is not active. This is not safe with cgroup events which can have some active events in the context even if ctx->is_active is 0 at the moment. The target cpu might be in the middle of list iteration at the same time. If the event is enabled when it's about to be closed, it might call perf_cgroup_event_disable() and list_del() with the cgrp_cpuctx_list on a different cpu. This resulted in a crash due to an invalid list pointer access during the cgroup list traversal on the cpu which the event belongs to. Let's fallback to IPI to access the cgrp_cpuctx_list from that cpu. Similarly, perf_install_in_context() should use IPI for the cgroup events too. Fixes: ef54c1a476ae ("perf: Rework perf_event_exit_event()") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20220124195808.2252071-1-namhyung@kernel.org
2022-01-26	perf: Always wake the parent event	James Clark
	When using per-process mode and event inheritance is set to true, forked processes will create a new perf events via inherit_event() -> perf_event_alloc(). But these events will not have ring buffers assigned to them. Any call to wakeup will be dropped if it's called on an event with no ring buffer assigned because that's the object that holds the wakeup list. If the child event is disabled due to a call to perf_aux_output_begin() or perf_aux_output_end(), the wakeup is dropped leaving userspace hanging forever on the poll. Normally the event is explicitly re-enabled by userspace after it wakes up to read the aux data, but in this case it does not get woken up so the event remains disabled. This can be reproduced when using Arm SPE and 'stress' which forks once before running the workload. By looking at the list of aux buffers read, it's apparent that they stop after the fork: perf record -e arm_spe// -vvv -- stress -c 1 With this patch applied they continue to be printed. This behaviour doesn't happen when using systemwide or per-cpu mode. Reported-by: Ruben Ayrapetyan <Ruben.Ayrapetyan@arm.com> Signed-off-by: James Clark <james.clark@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20211206113840.130802-2-james.clark@arm.com
2022-01-26	serial: mcf: use helpers in mcf_tx_chars()	Jiri Slaby
	Use uart_circ_empty() instead of open-coding it via xmit->head & tail. Use preexisting mcf_stop_tx() to avoid stop-tx code duplication. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Link: https://lore.kernel.org/r/20220124071430.14907-12-jslaby@suse.cz Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-01-26	serial: fsl_linflexuart: don't call uart_write_wakeup() twice	Jiri Slaby
	linflex_txint() calls linflex_transmit_buffer() which calls uart_write_wakeup(). So there is no point to repeat it in linflex_txint() again -- remove it. Cc: Stefan-gabriel Mirea <stefan-gabriel.mirea@nxp.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz> Link: https://lore.kernel.org/r/20220124071430.14907-10-jslaby@suse.cz Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-01-26	serial: fsl_linflexuart: deduplicate character sending	Jiri Slaby
	Introduce a new linflex_put_char() helper to send a character. And use it on both places this code was duplicated. Cc: Stefan-gabriel Mirea <stefan-gabriel.mirea@nxp.com> Signed-off-by: Jiri Slaby <jslaby@suse.cz> Link: https://lore.kernel.org/r/20220124071430.14907-9-jslaby@suse.cz Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-01-26	mxser: switch from xmit_buf to kfifo	Jiri Slaby
	Use kfifo for xmit buffer handling. The change is mostly straightforward. It saves complexity both on the stuffing side (mxser_write() and mxser_put_char()) and pulling side (mxser_transmit_chars()). In fact, the loop in mxser_write() can be completely deleted as the wrap of the buffer is taken care of in the kfifo code now. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Link: https://lore.kernel.org/r/20220124071430.14907-8-jslaby@suse.cz Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-01-26	mxser: use tty_port xmit_buf helpers	Jiri Slaby
	For the mxser driver to use kfifo, use tty_port_alloc_xmit_buf() and tty_port_free_xmit_buf() helpers in activate/shutdown, respectively. As these calls have to be done in a non-atomic context, we have to move them outside spinlock and make sure irq is really stopped after we write to the ISR register. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Link: https://lore.kernel.org/r/20220124071430.14907-7-jslaby@suse.cz Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-01-26	mxser: fix xmit_buf leak in activate when LSR == 0xff	Jiri Slaby
	When LSR is 0xff in ->activate() (rather unlike), we return an error. Provided ->shutdown() is not called when ->activate() fails, nothing actually frees the buffer in this case. Fix this by properly freeing the buffer in a designated label. We jump there also from the "!info->type" if now too. Fixes: 6769140d3047 ("tty: mxser: use the tty_port_open method") Signed-off-by: Jiri Slaby <jslaby@suse.cz> Link: https://lore.kernel.org/r/20220124071430.14907-6-jslaby@suse.cz Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-01-26	tty: tty_port_open, document shutdown vs failed activate	Jiri Slaby
	Add a note that ->shutdown is not called when ->activate fails. Just so we are clear. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Link: https://lore.kernel.org/r/20220124071430.14907-5-jslaby@suse.cz Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-01-26	tty: add kfifo to tty_port	Jiri Slaby
	Define a kfifo inside struct tty_port. We use DECLARE_KFIFO_PTR and let the preexisting tty_port::xmit_buf be also the buffer for the kfifo. And handle the initialization/decomissioning along with xmit_buf, i.e. in tty_port_alloc_xmit_buf() and tty_port_free_xmit_buf(), respectively. This allows for kfifo use in drivers which opt-in, while others still may use the old xmit_buf. mxser will be the first user in the next few patches. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Link: https://lore.kernel.org/r/20220124071430.14907-4-jslaby@suse.cz Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-01-26	serial: atmel_serial: include circ_buf.h	Jiri Slaby
	atmel_uart_port::rx_ring is defined as struct circ_buf, but circ_buf.h is not included explicitly in atmel_serial.c. It is included only implicitly via serial_core.h. Fix this as serial_core.h might not include that header in the future. Signed-off-by:Jiri Slaby <jslaby@suse.cz> Cc: Richard Genoud <richard.genoud@gmail.com> Cc: Nicolas Ferre <nicolas.ferre@microchip.com> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Ludovic Desroches <ludovic.desroches@microchip.com> Acked-by: Richard Genoud <richard.genoud@gmail.com> Signed-off-by: Richard Genoud <richard.genoud@gmail.com> Link: https://lore.kernel.org/r/20220124071430.14907-3-jslaby@suse.cz Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-01-26	serial: core: clean up EXPORT_SYMBOLs	Jiri Slaby
	Some EXPORT_SYMBOLs are grouped at one location. Some follow functions they export, but a newline is present before them. Fix all these and move them where they belong. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Link: https://lore.kernel.org/r/20220124071430.14907-2-jslaby@suse.cz Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-01-26	tty: serial: fsl_lpuart: count tty buffer overruns	Sherry Sun
	Added support for counting the tty buffer overruns in fsl_lpuart driver like other uart drivers. Reviewed-by: Jiri Slaby <jirislaby@kernel.org> Signed-off-by: Sherry Sun <sherry.sun@nxp.com> Link: https://lore.kernel.org/r/20220111085130.5817-1-sherry.sun@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-01-26	serial: imx: reduce RX interrupt frequency	Tomasz Moń
	Triggering RX interrupt for every byte defeats the purpose of aging timer and leads to interrupt storm at high baud rates. The interrupt storm can starve line discipline worker and prevent tty throttling, rendering hardware/software flow control useless. Increase receiver trigger level to 8 to increase the minimum period between RX interrupts to 8 characters time. The tradeoff is increased latency. Aging timer resets with every received character. Worst case scenario happens when RX data intercharacter delay is slightly less than the aging timer timeout (8 characters time). The upper bound of the time a character can wait in RxFIFO before interrupt is raised is: (RXTL - 1) * (8 character time timeout + received 1 character time) Usually the data is received in frames, with low intercharacter delay. In such case the latency increase is 8 characters time at the end of the frame with probability (RXTL - 1) / RXTL. Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Tomasz Moń <tomasz.mon@camlingroup.com> Link: https://lore.kernel.org/r/20220117060417.624613-1-tomasz.mon@camlingroup.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-01-26	tty: serial: max3100: Remove redundant 'flush_workqueue()' calls	Xu Wang
	'destroy_workqueue()' already drains the queue before destroying it, so there is no need to flush it explicitly. Remove the redundant 'flush_workqueue()' calls. Signed-off-by: Xu Wang <vulab@iscas.ac.cn> Link: https://lore.kernel.org/r/20220114085156.43041-1-vulab@iscas.ac.cn Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-01-26	serial: core: Initialize rs485 RTS polarity already on probe	Lukas Wunner
	RTS polarity of rs485-enabled ports is currently initialized on uart open via: tty_port_open() tty_port_block_til_ready() tty_port_raise_dtr_rts() # if (C_BAUD(tty)) uart_dtr_rts() uart_port_dtr_rts() There's at least three problems here: First, if no baud rate is set, RTS polarity is not initialized. That's the right thing to do for rs232, but not for rs485, which requires that RTS is deasserted unconditionally. Second, if the DeviceTree property "linux,rs485-enabled-at-boot-time" is present, RTS should be deasserted as early as possible, i.e. on probe. Otherwise it may remain asserted until first open. Third, even though RTS is deasserted on open and close, it may subsequently be asserted by uart_throttle(), uart_unthrottle() or uart_set_termios() because those functions aren't rs485-aware. (Only uart_tiocmset() is.) To address these issues, move RTS initialization from uart_port_dtr_rts() to uart_configure_port(). Prevent subsequent modification of RTS polarity by moving the existing rs485 check from uart_tiocmget() to uart_update_mctrl(). That way, RTS is initialized on probe and then remains unmodified unless the uart transmits data. If rs485 is enabled at runtime (instead of at boot) through a TIOCSRS485 ioctl(), RTS is initialized by the uart driver's ->rs485_config() callback and then likewise remains unmodified. The PL011 driver initializes RTS on uart open and prevents subsequent modification in its ->set_mctrl() callback. That code is obsoleted by the present commit, so drop it. Cc: Jan Kiszka <jan.kiszka@siemens.com> Cc: Su Bao Cheng <baocheng.su@siemens.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Link: https://lore.kernel.org/r/2d2acaf3a69e89b7bf687c912022b11fd29dfa1e.1642909284.git.lukas@wunner.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>