summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-11-30mips, bpf: Fix reference to non-existing Kconfig symbolJohan Almbladh
The Kconfig symbol for R10000 ll/sc errata workaround in the MIPS JIT was misspelled, causing the workaround to not take effect when enabled. Fixes: 72570224bb8f ("mips, bpf: Add JIT workarounds for CPU errata") Reported-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211130160824.3781635-1-johan.almbladh@anyfinetworks.com
2021-11-30platform/x86: amd-pmc: Fix s2idle failures on certain AMD laptopsFabrizio Bertocci
On some AMD hardware laptops, the system fails communicating with the PMC when entering s2idle and the machine is battery powered. Hardware description: HP Pavilion Aero Laptop 13-be0097nr CPU: AMD Ryzen 7 5800U with Radeon Graphics GPU: 03:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:1638] (rev c1) Detailed description of the problem (and investigation) here: https://gitlab.freedesktop.org/drm/amd/-/issues/1799 Patch is a single line: reduce the polling delay in half, from 100uSec to 50uSec when waiting for a change in state from the PMC after a write command operation. After changing the delay, I did not see a single failure on this machine (I have this fix for now more than one week and s2idle worked every single time on battery power). Cc: stable@vger.kernel.org Acked-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Signed-off-by: Fabrizio Bertocci <fabriziobertocci@gmail.com> Link: https://lore.kernel.org/r/CADtzkx7TdfbwtaVEXUdD6YXPey52E-nZVQNs+Z41DTx7gqMqtw@mail.gmail.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-11-30bpf: Make sure bpf_disable_instrumentation() is safe vs preemption.Sebastian Andrzej Siewior
The initial implementation of migrate_disable() for mainline was a wrapper around preempt_disable(). RT kernels substituted this with a real migrate disable implementation. Later on mainline gained true migrate disable support, but neither documentation nor affected code were updated. Remove stale comments claiming that migrate_disable() is PREEMPT_RT only. Don't use __this_cpu_inc() in the !PREEMPT_RT path because preemption is not disabled and the RMW operation can be preempted. Fixes: 74d862b682f51 ("sched: Make migrate_disable/enable() independent of RT") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211127163200.10466-3-bigeasy@linutronix.de
2021-11-30Documentation/locking/locktypes: Update migrate_disable() bits.Sebastian Andrzej Siewior
The initial implementation of migrate_disable() for mainline was a wrapper around preempt_disable(). RT kernels substituted this with a real migrate disable implementation. Later on mainline gained true migrate disable support, but the documentation was not updated. Update the documentation, remove the claims about migrate_disable() mapping to preempt_disable() on non-PREEMPT_RT kernels. Fixes: 74d862b682f51 ("sched: Make migrate_disable/enable() independent of RT") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211127163200.10466-2-bigeasy@linutronix.de
2021-11-30ALSA: hda/hdmi: fix HDA codec entry table order for ADL-PKai Vehmanen
Keep the HDA_CODEC_ENTRY entries sorted by the codec VID. ADL-P is the only misplaced Intel HDMI codec. Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Link: https://lore.kernel.org/r/20211130124732.696896-2-kai.vehmanen@linux.intel.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-11-30ALSA: hda: Add Intel DG2 PCI ID and HDMI codec vidKai Vehmanen
Add HD Audio PCI ID and HDMI codec vendor ID for Intel DG2. Reviewed-by: Uma Shankar <uma.shankar@intel.com> Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Link: https://lore.kernel.org/r/20211130124732.696896-1-kai.vehmanen@linux.intel.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-11-30KVM: fix avic_set_running for preemptable kernelsPaolo Bonzini
avic_set_running() passes the current CPU to avic_vcpu_load(), albeit via vcpu->cpu rather than smp_processor_id(). If the thread is migrated while avic_set_running runs, the call to avic_vcpu_load() can use a stale value for the processor id. Avoid this by blocking preemption over the entire execution of avic_set_running(). Reported-by: Sean Christopherson <seanjc@google.com> Fixes: 8221c1370056 ("svm: Manage vcpu load/unload when enable AVIC") Cc: stable@vger.kernel.org Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30KVM: VMX: clear vmx_x86_ops.sync_pir_to_irr if APICv is disabledPaolo Bonzini
There is nothing to synchronize if APICv is disabled, since neither other vCPUs nor assigned devices can set PIR.ON. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30MAINTAINERS: s390/net: add Alexandra and Wenjia as maintainerKarsten Graul
Add Alexandra and Wenjia as maintainers for drivers/s390/net and iucv. Also, remove myself as maintainer for these areas. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Acked-by: Alexandra Winter <wintera@linux.ibm.com> Acked-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-30dpaa2-eth: destroy workqueue at the end of remove functionDongliang Mu
The commit c55211892f46 ("dpaa2-eth: support PTP Sync packet one-step timestamping") forgets to destroy workqueue at the end of remove function. Fix this by adding destroy_workqueue before fsl_mc_portal_free and free_netdev. Fixes: c55211892f46 ("dpaa2-eth: support PTP Sync packet one-step timestamping") Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-30ice: xsk: clear status_error0 for each allocated descMaciej Fijalkowski
Fix a bug in which the receiving of packets can stop in the zero-copy driver. Ice HW ignores 3 lower bits from QRX_TAIL register, which means that tail is bumped only on intervals of 8. Currently with XSK RX batching in place, ice_alloc_rx_bufs_zc() clears the status_error0 only of the last descriptor that has been allocated/taken from the XSK buffer pool. status_error0 includes DD bit that is looked upon by the ice_clean_rx_irq_zc() to tell if a descriptor can be processed. The bug can be triggered when driver updates the ntu but not the QRX_TAIL, so HW wouldn't have a chance to write to the ready descriptors. Later on driver moves the ntc to the mentioned set of descriptors and interprets them as a ready to be processed, since corresponding DD bits were not cleared nor any writeback has happened that would clear it. This can then lead to ntc == ntu case which means that ring is empty and no further packet processing. Fix the XSK traffic hang that can be observed when l2fwd scenario from xdpsock is used by making sure that status_error0 is cleared for each descriptor that is fed to HW and therefore we are sure that driver will not processed non-valid DD bits. This will also prevent the driver from processing the descriptors that were allocated in favor of the previously processed ones, but writeback didn't happen yet. Fixes: db804cfc21e9 ("ice: Use the xsk batched rx allocation interface") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-30net: marvell: mvpp2: Fix the computation of shared CPUsChristophe JAILLET
'bitmap_fill()' fills a bitmap one 'long' at a time. It is likely that an exact number of bits is expected. Use 'bitmap_set()' instead in order not to set unexpected bits. Fixes: e531f76757eb ("net: mvpp2: handle cases where more CPUs are available than s/w threads") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-30i2c: stm32f7: use proper DMAENGINE API for terminationAlain Volmat
dmaengine_terminate_all() is deprecated in favor of explicitly saying if it should be sync or async. Here, we use dmaengine_terminate_sync in i2c_xfer and i2c_smbus_xfer handlers and rely on dmaengine_terminate_async within interrupt handlers (transmission error cases). dmaengine_synchronize is added within i2c_xfer and i2c_smbus_xfer handler to finalize terminate started in interrupt handlers. Signed-off-by: Alain Volmat <alain.volmat@foss.st.com> Reviewed-by: Pierre-Yves MORDRET <pierre-yves.mordret@foss.st.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2021-11-30i2c: stm32f7: stop dma transfer in case of NACKAlain Volmat
In case of receiving a NACK, the dma transfer should be stopped to avoid feeding data into the FIFO. Also ensure to properly return the proper error code and avoid waiting for the end of the dma completion in case of error happening during the transmission. Fixes: 7ecc8cfde553 ("i2c: i2c-stm32f7: Add DMA support") Signed-off-by: Alain Volmat <alain.volmat@foss.st.com> Reviewed-by: Pierre-Yves MORDRET <pierre-yves.mordret@foss.st.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2021-11-30i2c: stm32f7: recover the bus on access timeoutAlain Volmat
When getting an access timeout, ensure that the bus is in a proper state prior to returning the error. Fixes: aeb068c57214 ("i2c: i2c-stm32f7: add driver") Signed-off-by: Alain Volmat <alain.volmat@foss.st.com> Reviewed-by: Pierre-Yves MORDRET <pierre-yves.mordret@foss.st.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2021-11-30KVM: SEV: accept signals in sev_lock_two_vmsPaolo Bonzini
Generally, kvm->lock is not taken for a long time, but sev_lock_two_vms is different: it takes vCPU locks inside, so userspace can hold it back just by calling a vCPU ioctl. Play it safe and use mutex_lock_killable. Message-Id: <20211123005036.2954379-13-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30KVM: SEV: do not take kvm->lock when destroyingPaolo Bonzini
Taking the lock is useless since there are no other references, and there are already accesses (e.g. to sev->enc_context_owner) that do not take it. So get rid of it. Reviewed-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211123005036.2954379-12-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30KVM: SEV: Prohibit migration of a VM that has mirrorsPaolo Bonzini
VMs that mirror an encryption context rely on the owner to keep the ASID allocated. Performing a KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM would cause a dangling ASID: 1. copy context from A to B (gets ref to A) 2. move context from A to L (moves ASID from A to L) 3. close L (releases ASID from L, B still references it) The right way to do the handoff instead is to create a fresh mirror VM on the destination first: 1. copy context from A to B (gets ref to A) [later] 2. close B (releases ref to A) 3. move context from A to L (moves ASID from A to L) 4. copy context from L to M So, catch the situation by adding a count of how many VMs are mirroring this one's encryption context. Fixes: 0b020f5af092 ("KVM: SEV: Add support for SEV-ES intra host migration") Message-Id: <20211123005036.2954379-11-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30KVM: SEV: Do COPY_ENC_CONTEXT_FROM with both VMs lockedPaolo Bonzini
Now that we have a facility to lock two VMs with deadlock protection, use it for the creation of mirror VMs as well. One of COPY_ENC_CONTEXT_FROM(dst, src) and COPY_ENC_CONTEXT_FROM(src, dst) would always fail, so the combination is nonsensical and it is okay to return -EBUSY if it is attempted. This sidesteps the question of what happens if a VM is MOVE_ENC_CONTEXT_FROM'd at the same time as it is COPY_ENC_CONTEXT_FROM'd: the locking prevents that from happening. Cc: Peter Gonda <pgonda@google.com> Cc: Sean Christopherson <seanjc@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211123005036.2954379-10-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30selftests: sev_migrate_tests: add tests for KVM_CAP_VM_COPY_ENC_CONTEXT_FROMPaolo Bonzini
I am putting the tests in sev_migrate_tests because the failure conditions are very similar and some of the setup code can be reused, too. The tests cover both successful creation of a mirror VM, and error conditions. Cc: Peter Gonda <pgonda@google.com> Cc: Sean Christopherson <seanjc@google.com> Message-Id: <20211123005036.2954379-9-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30KVM: SEV: move mirror status to destination of KVM_CAP_VM_MOVE_ENC_CONTEXT_FROMPaolo Bonzini
Allow intra-host migration of a mirror VM; the destination VM will be a mirror of the same ASID as the source. Fixes: b56639318bb2 ("KVM: SEV: Add support for SEV intra host migration") Reviewed-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211123005036.2954379-8-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30KVM: SEV: initialize regions_list of a mirror VMPaolo Bonzini
This was broken before the introduction of KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM, but technically harmless because the region list was unused for a mirror VM. However, it is untidy and it now causes a NULL pointer access when attempting to move the encryption context of a mirror VM. Fixes: 54526d1fd593 ("KVM: x86: Support KVM VMs sharing SEV context") Message-Id: <20211123005036.2954379-7-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30KVM: SEV: cleanup locking for KVM_CAP_VM_MOVE_ENC_CONTEXT_FROMPaolo Bonzini
Encapsulate the handling of the migration_in_progress flag for both VMs in two functions sev_lock_two_vms and sev_unlock_two_vms. It does not matter if KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM locks the destination struct kvm a bit later, and this change 1) keeps the cleanup chain of labels smaller 2) makes it possible for KVM_CAP_VM_COPY_ENC_CONTEXT_FROM to reuse the logic. Cc: Peter Gonda <pgonda@google.com> Cc: Sean Christopherson <seanjc@google.com> Message-Id: <20211123005036.2954379-6-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30KVM: SEV: do not use list_replace_init on an empty listPaolo Bonzini
list_replace_init cannot be used if the source is an empty list, because "new->next->prev = new" will overwrite "old->next": new old prev = new, next = new prev = old, next = old new->next = old->next prev = new, next = old prev = old, next = old new->next->prev = new prev = new, next = old prev = old, next = new new->prev = old->prev prev = old, next = old prev = old, next = old new->next->prev = new prev = old, next = old prev = new, next = new The desired outcome instead would be to leave both old and new the same as they were (two empty circular lists). Use list_cut_before, which already has the necessary check and is documented to discard the previous contents of the list that will hold the result. Fixes: b56639318bb2 ("KVM: SEV: Add support for SEV intra host migration") Reviewed-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211123005036.2954379-5-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30KVM: x86: Use a stable condition around all VT-d PI pathsPaolo Bonzini
Currently, checks for whether VT-d PI can be used refer to the current status of the feature in the current vCPU; or they more or less pick vCPU 0 in case a specific vCPU is not available. However, these checks do not attempt to synchronize with changes to the IRTE. In particular, there is no path that updates the IRTE when APICv is re-activated on vCPU 0; and there is no path to wakeup a CPU that has APICv disabled, if the wakeup occurs because of an IRTE that points to a posted interrupt. To fix this, always go through the VT-d PI path as long as there are assigned devices and APICv is available on both the host and the VM side. Since the relevant condition was copied over three times, take the hint and factor it into a separate function. Suggested-by: Sean Christopherson <seanjc@google.com> Cc: stable@vger.kernel.org Reviewed-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: David Matlack <dmatlack@google.com> Message-Id: <20211123004311.2954158-5-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30KVM: x86: check PIR even for vCPUs with disabled APICvPaolo Bonzini
The IRTE for an assigned device can trigger a POSTED_INTR_VECTOR even if APICv is disabled on the vCPU that receives it. In that case, the interrupt will just cause a vmexit and leave the ON bit set together with the PIR bit corresponding to the interrupt. Right now, the interrupt would not be delivered until APICv is re-enabled. However, fixing this is just a matter of always doing the PIR->IRR synchronization, even if the vCPU has temporarily disabled APICv. This is not a problem for performance, or if anything it is an improvement. First, in the common case where vcpu->arch.apicv_active is true, one fewer check has to be performed. Second, static_call_cond will elide the function call if APICv is not present or disabled. Finally, in the case for AMD hardware we can remove the sync_pir_to_irr callback: it is only needed for apic_has_interrupt_for_ppr, and that function already has a fallback for !APICv. Cc: stable@vger.kernel.org Co-developed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: David Matlack <dmatlack@google.com> Message-Id: <20211123004311.2954158-4-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30KVM: VMX: prepare sync_pir_to_irr for running with APICv disabledPaolo Bonzini
If APICv is disabled for this vCPU, assigned devices may still attempt to post interrupts. In that case, we need to cancel the vmentry and deliver the interrupt with KVM_REQ_EVENT. Extend the existing code that handles injection of L1 interrupts into L2 to cover this case as well. vmx_hwapic_irr_update is only called when APICv is active so it would be confusing to add a check for vcpu->arch.apicv_active in there. Instead, just use vmx_set_rvi directly in vmx_sync_pir_to_irr. Cc: stable@vger.kernel.org Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: David Matlack <dmatlack@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211123004311.2954158-3-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30KVM: selftests: page_table_test: fix calculation of guest_test_phys_memMaciej S. Szmigiero
A kvm_page_table_test run with its default settings fails on VMX due to memory region add failure: > ==== Test Assertion Failure ==== > lib/kvm_util.c:952: ret == 0 > pid=10538 tid=10538 errno=17 - File exists > 1 0x00000000004057d1: vm_userspace_mem_region_add at kvm_util.c:947 > 2 0x0000000000401ee9: pre_init_before_test at kvm_page_table_test.c:302 > 3 (inlined by) run_test at kvm_page_table_test.c:374 > 4 0x0000000000409754: for_each_guest_mode at guest_modes.c:53 > 5 0x0000000000401860: main at kvm_page_table_test.c:500 > 6 0x00007f82ae2d8554: ?? ??:0 > 7 0x0000000000401894: _start at ??:? > KVM_SET_USER_MEMORY_REGION IOCTL failed, > rc: -1 errno: 17 > slot: 1 flags: 0x0 > guest_phys_addr: 0xc0000000 size: 0x40000000 This is because the memory range that this test is trying to add (0x0c0000000 - 0x100000000) conflicts with LAPIC mapping at 0x0fee00000. Looking at the code it seems that guest_test_*phys*_mem variable gets mistakenly overwritten with guest_test_*virt*_mem while trying to adjust the former for alignment. With the correct variable adjusted this test runs successfully. Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Message-Id: <52e487458c3172923549bbcf9dfccfbe6faea60b.1637940473.git.maciej.szmigiero@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30KVM: x86/mmu: Handle "default" period when selectively waking kthreadSean Christopherson
Account for the '0' being a default, "let KVM choose" period, when determining whether or not the recovery worker needs to be awakened in response to userspace reducing the period. Failure to do so results in the worker not being awakened properly, e.g. when changing the period from '0' to any small-ish value. Fixes: 4dfe4f40d845 ("kvm: x86: mmu: Make NX huge page recovery period configurable") Cc: stable@vger.kernel.org Cc: Junaid Shahid <junaids@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211120015706.3830341-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30KVM: MMU: shadow nested paging does not have PKUPaolo Bonzini
Initialize the mask for PKU permissions as if CR4.PKE=0, avoiding incorrect interpretations of the nested hypervisor's page tables. Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30KVM: x86/mmu: Remove spurious TLB flushes in TDP MMU zap collapsible pathSean Christopherson
Drop the "flush" param and return values to/from the TDP MMU's helper for zapping collapsible SPTEs. Because the helper runs with mmu_lock held for read, not write, it uses tdp_mmu_zap_spte_atomic(), and the atomic zap handles the necessary remote TLB flush. Similarly, because mmu_lock is dropped and re-acquired between zapping legacy MMUs and zapping TDP MMUs, kvm_mmu_zap_collapsible_sptes() must handle remote TLB flushes from the legacy MMU before calling into the TDP MMU. Fixes: e2209710ccc5d ("KVM: x86/mmu: Skip rmap operations if rmaps not allocated") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211120045046.3940942-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30KVM: x86/mmu: Use yield-safe TDP MMU root iter in MMU notifier unmappingSean Christopherson
Use the yield-safe variant of the TDP MMU iterator when handling an unmapping event from the MMU notifier, as most occurences of the event allow yielding. Fixes: e1eed5847b09 ("KVM: x86/mmu: Allow yielding during MMU notifier unmap/zap, if possible") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211120015008.3780032-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-29net: mscc: ocelot: fix missing unlock on error in ocelot_hwstamp_set()Wei Yongjun
Add the missing mutex_unlock before return from function ocelot_hwstamp_set() in the ocelot_setup_ptp_traps() error handling case. Fixes: 96ca08c05838 ("net: mscc: ocelot: set up traps for PTP packets") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20211129151652.1165433-1-weiyongjun1@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-29Merge tag 'rxrpc-fixes-20211129' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs David Howells says: ==================== rxrpc: Leak fixes Here are a couple of fixes for leaks in AF_RXRPC: (1) Fix a leak of rxrpc_peer structs in rxrpc_look_up_bundle(). (2) Fix a leak of rxrpc_local structs in rxrpc_lookup_peer(). * tag 'rxrpc-fixes-20211129' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: rxrpc: Fix rxrpc_local leak in rxrpc_lookup_peer() rxrpc: Fix rxrpc_peer leak in rxrpc_look_up_bundle() ==================== Link: https://lore.kernel.org/r/163820097905.226370.17234085194655347888.stgit@warthog.procyon.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-29Merge branch 'wireguard-siphash-patches-for-5-16-rc6'Jakub Kicinski
Jason A. Donenfeld says: ==================== wireguard/siphash patches for 5.16-rc Here's quite a largeish set of stable patches I've had queued up and testing for a number of months now: - Patch (1) squelches a sparse warning by fixing an annotation. - Patches (2), (3), and (5) are minor improvements and fixes to the test suite. - Patch (4) is part of a tree-wide cleanup to have module-specific init and exit functions. - Patch (6) fixes a an issue with dangling dst references, by having a function to release references immediately rather than deferring, and adds an associated test case to prevent this from regressing. - Patches (7) and (8) help mitigate somewhat a potential DoS on the ingress path due to the use of skb_list's locking hitting contention on multiple cores by switching to using a ring buffer and dropping packets on contention rather than locking up another core spinning. - Patch (9) switches kvzalloc to kvcalloc for better form. - Patch (10) fixes alignment traps in siphash with clang-13 (and maybe other compilers) on armv6, by switching to using the unaligned functions by default instead of the aligned functions by default. ==================== Link: https://lore.kernel.org/r/20211129153929.3457-1-Jason@zx2c4.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-29siphash: use _unaligned version by defaultArnd Bergmann
On ARM v6 and later, we define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS because the ordinary load/store instructions (ldr, ldrh, ldrb) can tolerate any misalignment of the memory address. However, load/store double and load/store multiple instructions (ldrd, ldm) may still only be used on memory addresses that are 32-bit aligned, and so we have to use the CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS macro with care, or we may end up with a severe performance hit due to alignment traps that require fixups by the kernel. Testing shows that this currently happens with clang-13 but not gcc-11. In theory, any compiler version can produce this bug or other problems, as we are dealing with undefined behavior in C99 even on architectures that support this in hardware, see also https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100363. Fortunately, the get_unaligned() accessors do the right thing: when building for ARMv6 or later, the compiler will emit unaligned accesses using the ordinary load/store instructions (but avoid the ones that require 32-bit alignment). When building for older ARM, those accessors will emit the appropriate sequence of ldrb/mov/orr instructions. And on architectures that can truly tolerate any kind of misalignment, the get_unaligned() accessors resolve to the leXX_to_cpup accessors that operate on aligned addresses. Since the compiler will in fact emit ldrd or ldm instructions when building this code for ARM v6 or later, the solution is to use the unaligned accessors unconditionally on architectures where this is known to be fast. The _aligned version of the hash function is however still needed to get the best performance on architectures that cannot do any unaligned access in hardware. This new version avoids the undefined behavior and should produce the fastest hash on all architectures we support. Link: https://lore.kernel.org/linux-arm-kernel/20181008211554.5355-4-ard.biesheuvel@linaro.org/ Link: https://lore.kernel.org/linux-crypto/CAK8P3a2KfmmGDbVHULWevB0hv71P2oi2ZCHEAqT=8dQfa0=cqQ@mail.gmail.com/ Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Fixes: 2c956a60778c ("siphash: add cryptographically secure PRF") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-29wireguard: ratelimiter: use kvcalloc() instead of kvzalloc()Gustavo A. R. Silva
Use 2-factor argument form kvcalloc() instead of kvzalloc(). Link: https://github.com/KSPP/linux/issues/162 Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> [Jason: Gustavo's link above is for KSPP, but this isn't actually a security fix, as table_size is bounded to 8192 anyway, and gcc realizes this, so the codegen comes out to be about the same.] Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-29wireguard: receive: drop handshakes if queue lock is contendedJason A. Donenfeld
If we're being delivered packets from multiple CPUs so quickly that the ring lock is contended for CPU tries, then it's safe to assume that the queue is near capacity anyway, so just drop the packet rather than spinning. This helps deal with multicore DoS that can interfere with data path performance. It _still_ does not completely fix the issue, but it again chips away at it. Reported-by: Streun Fabio <fstreun@student.ethz.ch> Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-29wireguard: receive: use ring buffer for incoming handshakesJason A. Donenfeld
Apparently the spinlock on incoming_handshake's skb_queue is highly contended, and a torrent of handshake or cookie packets can bring the data plane to its knees, simply by virtue of enqueueing the handshake packets to be processed asynchronously. So, we try switching this to a ring buffer to hopefully have less lock contention. This alleviates the problem somewhat, though it still isn't perfect, so future patches will have to improve this further. However, it at least doesn't completely diminish the data plane. Reported-by: Streun Fabio <fstreun@student.ethz.ch> Reported-by: Joel Wanner <joel.wanner@inf.ethz.ch> Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-29wireguard: device: reset peer src endpoint when netns exitsJason A. Donenfeld
Each peer's endpoint contains a dst_cache entry that takes a reference to another netdev. When the containing namespace exits, we take down the socket and prevent future sockets from being created (by setting creating_net to NULL), which removes that potential reference on the netns. However, it doesn't release references to the netns that a netdev cached in dst_cache might be taking, so the netns still might fail to exit. Since the socket is gimped anyway, we can simply clear all the dst_caches (by way of clearing the endpoint src), which will release all references. However, the current dst_cache_reset function only releases those references lazily. But it turns out that all of our usages of wg_socket_clear_peer_endpoint_src are called from contexts that are not exactly high-speed or bottle-necked. For example, when there's connection difficulty, or when userspace is reconfiguring the interface. And in particular for this patch, when the netns is exiting. So for those cases, it makes more sense to call dst_release immediately. For that, we add a small helper function to dst_cache. This patch also adds a test to netns.sh from Hangbin Liu to ensure this doesn't regress. Tested-by: Hangbin Liu <liuhangbin@gmail.com> Reported-by: Xiumei Mu <xmu@redhat.com> Cc: Toke Høiland-Jørgensen <toke@redhat.com> Cc: Paolo Abeni <pabeni@redhat.com> Fixes: 900575aa33a3 ("wireguard: device: avoid circular netns references") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-29wireguard: selftests: rename DEBUG_PI_LIST to DEBUG_PLISTLi Zhijian
DEBUG_PI_LIST was renamed to DEBUG_PLIST since 8e18faeac3 ("lib/plist: rename DEBUG_PI_LIST to DEBUG_PLIST"). Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Fixes: 8e18faeac3e4 ("lib/plist: rename DEBUG_PI_LIST to DEBUG_PLIST") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-29wireguard: main: rename 'mod_init' & 'mod_exit' functions to be module-specificRandy Dunlap
Rename module_init & module_exit functions that are named "mod_init" and "mod_exit" so that they are unique in both the System.map file and in initcall_debug output instead of showing up as almost anonymous "mod_init". This is helpful for debugging and in determining how long certain module_init calls take to execute. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-29wireguard: selftests: actually test for routing loopsJason A. Donenfeld
We previously removed the restriction on looping to self, and then added a test to make sure the kernel didn't blow up during a routing loop. The kernel didn't blow up, thankfully, but on certain architectures where skb fragmentation is easier, such as ppc64, the skbs weren't actually being discarded after a few rounds through. But the test wasn't catching this. So actually test explicitly for massive increases in tx to see if we have a routing loop. Note that the actual loop problem will need to be addressed in a different commit. Fixes: b673e24aad36 ("wireguard: socket: remove errant restriction on looping to self") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-29wireguard: selftests: increase default dmesg log sizeJason A. Donenfeld
The selftests currently parse the kernel log at the end to track potential memory leaks. With these tests now reading off the end of the buffer, due to recent optimizations, some creation messages were lost, making the tests think that there was a free without an alloc. Fix this by increasing the kernel log size. Fixes: 24b70eeeb4f4 ("wireguard: use synchronize_net rather than synchronize_rcu") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-29wireguard: allowedips: add missing __rcu annotation to satisfy sparseJason A. Donenfeld
A __rcu annotation got lost during refactoring, which caused sparse to become enraged. Fixes: bf7b042dc62a ("wireguard: allowedips: free empty intermediate nodes when removing single node") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-29scsi: ufs: ufs-pci: Add support for Intel ADLAdrian Hunter
Add PCI ID and callbacks to support Intel Alder Lake. Link: https://lore.kernel.org/r/20211124204218.1784559-1-adrian.hunter@intel.com Cc: stable@vger.kernel.org # v5.15+ Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-11-29Documentation: Add minimum pahole versionArnaldo Carvalho de Melo
A report was made in https://github.com/acmel/dwarves/issues/26 about pahole not being listed in the process/changes.rst file as being needed for building the kernel, address that. Link: https://github.com/acmel/dwarves/issues/26 Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/YZPQ6+u2wTHRfR+W@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/YZfzQ0DvHD5o26Bt@kernel.org Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-11-29Documentation/process: fix self referenceErik Ekman
Instead link to the device tree document with the same name. Signed-off-by: Erik Ekman <erik@kryo.se> Link: https://lore.kernel.org/r/20211119200758.642474-1-erik@kryo.se Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-11-29docs: admin-guide/blockdev: Remove digraph of node-statesAkira Yokosawa
While node-states-8.dot has two digraphs, the dot(1) command can not properly handle multiple graphs in a DOT file and the kernel-doc page at https://www.kernel.org/doc/html/latest/admin-guide/blockdev/drbd/figures.html fails to render the graphs. It turned out that the digraph of node_states can be removed. Quote from Joel's reflection: On reflection, the digraph node_states can be removed entirely. It is too basic to contain any useful information. In addition it references "ioctl_set_state". The ioctl configuration interface for DRBD has long been removed. In fact, it was never in the upstream version of DRBD. Remove node_states and rename the DOT file peer_states-8.dot. Suggested-by: Joel Colledge <joel.colledge@linbit.com> Acked-by: Joel Colledge <joel.colledge@linbit.com> Signed-off-by: Akira Yokosawa <akiyks@gmail.com> Cc: Philipp Reisner <philipp.reisner@linbit.com> Cc: Lars Ellenberg <lars.ellenberg@linbit.com> Link: https://lore.kernel.org/r/7df04f45-8746-e666-1a9d-a998f1ab1f91@gmail.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-11-29docs: conf.py: fix support for Readthedocs v 1.0.0Mauro Carvalho Chehab
As described at: https://stackoverflow.com/questions/23211695/modifying-content-width-of-the-sphinx-theme-read-the-docs since Sphinx 1.8, the standard way to setup a custom theme is to use html_css_files. While using html_context is OK with RTD 0.5.2, it doesn't work with 1.0.0, causing the theme to not load, producing a very weird html. Tested with: - Sphinx 1.7.9 + sphinx-rtd-theme 0.5.2 - Sphinx 2.4.4 + sphinx-rtd-theme 0.5.2 - Sphinx 2.4.4 + sphinx-rtd-theme 1.0.0 - Sphinx 4.3.0 + sphinx-rtd-theme 1.0.0 Reported-by: Hans Verkuil <hverkuil@xs4all.nl> Tested-by: Hans Verkuil <hverkuil@xs4all.nl> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Tested-by: Akira Yokosawa <akiyks@gmail.com> Link: https://lore.kernel.org/r/80009f0d17ea0840d81e7e16fff6e7677919fdfc.1638004294.git.mchehab+huawei@kernel.org Signed-off-by: Jonathan Corbet <corbet@lwn.net>