summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-11-08KVM: selftests: Remove address rounding in guest codeBen Gardon
Rounding the address the guest writes to a host page boundary will only have an effect if the host page size is larger than the guest page size, but in that case the guest write would still go to the same host page. There's no reason to round the address down, so remove the rounding to simplify the demand paging test. This series was tested by running the following invocations on an Intel Skylake machine: dirty_log_perf_test -b 20m -i 100 -v 64 dirty_log_perf_test -b 20g -i 5 -v 4 dirty_log_perf_test -b 4g -i 5 -v 32 demand_paging_test -b 20m -v 64 demand_paging_test -b 20g -v 4 demand_paging_test -b 4g -v 32 All behaved as expected. Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20201027233733.1484855-3-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08KVM: selftests: Factor code out of demand_paging_testBen Gardon
Much of the code in demand_paging_test can be reused by other, similar multi-vCPU-memory-touching-perfromance-tests. Factor that common code out for reuse. No functional change expected. This series was tested by running the following invocations on an Intel Skylake machine: dirty_log_perf_test -b 20m -i 100 -v 64 dirty_log_perf_test -b 20g -i 5 -v 4 dirty_log_perf_test -b 4g -i 5 -v 32 demand_paging_test -b 20m -v 64 demand_paging_test -b 20g -v 4 demand_paging_test -b 4g -v 32 All behaved as expected. Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20201027233733.1484855-2-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08KVM: selftests: Use a single binary for dirty/clear log testPeter Xu
Remove the clear_dirty_log test, instead merge it into the existing dirty_log_test. It should be cleaner to use this single binary to do both tests, also it's a preparation for the upcoming dirty ring test. The default behavior will run all the modes in sequence. Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20201001012233.6013-1-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08KVM: selftests: Always clear dirty bitmap after iterationPeter Xu
We used not to clear the dirty bitmap before because KVM_GET_DIRTY_LOG would overwrite it the next time it copies the dirty log onto it. In the upcoming dirty ring tests we'll start to fetch dirty pages from a ring buffer, so no one is going to clear the dirty bitmap for us. Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20201001012228.5916-1-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08KVM: selftests: Add blessed SVE registers to get-reg-listAndrew Jones
Add support for the SVE registers to get-reg-list and create a new test, get-reg-list-sve, which tests them when running on a machine with SVE support. Signed-off-by: Andrew Jones <drjones@redhat.com> Message-Id: <20201029201703.102716-5-drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08KVM: selftests: Add aarch64 get-reg-list testAndrew Jones
Check for KVM_GET_REG_LIST regressions. The blessed list was created by running on v4.15 with the --core-reg-fixup option. The following script was also used in order to annotate system registers with their names when possible. When new system registers are added the names can just be added manually using the same grep. while read reg; do if [[ ! $reg =~ ARM64_SYS_REG ]]; then printf "\t$reg\n" continue fi encoding=$(echo "$reg" | sed "s/ARM64_SYS_REG(//;s/),//") if ! name=$(grep "$encoding" ../../../../arch/arm64/include/asm/sysreg.h); then printf "\t$reg\n" continue fi name=$(echo "$name" | sed "s/.*SYS_//;s/[\t ]*sys_reg($encoding)$//") printf "\t$reg\t/* $name */\n" done < <(aarch64/get-reg-list --core-reg-fixup --list) Signed-off-by: Andrew Jones <drjones@redhat.com> Message-Id: <20201029201703.102716-3-drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08selftests: kvm: test enforcement of paravirtual cpuid featuresOliver Upton
Add a set of tests that ensure the guest cannot access paravirtual msrs and hypercalls that have been disabled in the KVM_CPUID_FEATURES leaf. Expect a #GP in the case of msr accesses and -KVM_ENOSYS from hypercalls. Cc: Jim Mattson <jmattson@google.com> Signed-off-by: Oliver Upton <oupton@google.com> Reviewed-by: Peter Shier <pshier@google.com> Reviewed-by: Aaron Lewis <aaronlewis@google.com> Message-Id: <20201027231044.655110-7-oupton@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08selftests: kvm: Add exception handling to selftestsAaron Lewis
Add the infrastructure needed to enable exception handling in selftests. This allows any of the exception and interrupt vectors to be overridden in the guest. Signed-off-by: Aaron Lewis <aaronlewis@google.com> Reviewed-by: Alexander Graf <graf@amazon.com> Message-Id: <20201012194716.3950330-4-aaronlewis@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08selftests: kvm: Clear uc so UCALL_NONE is being properly reportedAaron Lewis
Ensure the out value 'uc' in get_ucall() is properly reporting UCALL_NONE if the call fails. The return value will be correctly reported, however, the out parameter 'uc' will not be. Clear the struct to ensure the correct value is being reported in the out parameter. Signed-off-by: Aaron Lewis <aaronlewis@google.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Alexander Graf <graf@amazon.com> Message-Id: <20201012194716.3950330-3-aaronlewis@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08selftests: kvm: Fix the segment descriptor layout to match the actual layoutAaron Lewis
Fix the layout of 'struct desc64' to match the layout described in the SDM Vol 3, Chapter 3 "Protected-Mode Memory Management", section 3.4.5 "Segment Descriptors", Figure 3-8 "Segment Descriptor". The test added later in this series relies on this and crashes if this layout is not correct. Signed-off-by: Aaron Lewis <aaronlewis@google.com> Reviewed-by: Alexander Graf <graf@amazon.com> Message-Id: <20201012194716.3950330-2-aaronlewis@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08KVM: x86: handle MSR_IA32_DEBUGCTLMSR with report_ignored_msrsPankaj Gupta
Windows2016 guest tries to enable LBR by setting the corresponding bits in MSR_IA32_DEBUGCTLMSR. KVM does not emulate MSR_IA32_DEBUGCTLMSR and spams the host kernel logs with error messages like: kvm [...]: vcpu1, guest rIP: 0xfffff800a8b687d3 kvm_set_msr_common: MSR_IA32_DEBUGCTLMSR 0x1, nop" This patch fixes this by enabling error logging only with 'report_ignored_msrs=1'. Signed-off-by: Pankaj Gupta <pankaj.gupta@cloud.ionos.com> Message-Id: <20201105153932.24316-1-pankaj.gupta.linux@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08kvm: x86: request masterclock update any time guest uses different msrOliver Upton
Commit 5b9bb0ebbcdc ("kvm: x86: encapsulate wrmsr(MSR_KVM_SYSTEM_TIME) emulation in helper fn", 2020-10-21) subtly changed the behavior of guest writes to MSR_KVM_SYSTEM_TIME(_NEW). Restore the previous behavior; update the masterclock any time the guest uses a different msr than before. Fixes: 5b9bb0ebbcdc ("kvm: x86: encapsulate wrmsr(MSR_KVM_SYSTEM_TIME) emulation in helper fn", 2020-10-21) Signed-off-by: Oliver Upton <oupton@google.com> Reviewed-by: Peter Shier <pshier@google.com> Message-Id: <20201027231044.655110-6-oupton@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08kvm: x86: ensure pv_cpuid.features is initialized when enabling capOliver Upton
Make the paravirtual cpuid enforcement mechanism idempotent to ioctl() ordering by updating pv_cpuid.features whenever userspace requests the capability. Extract this update out of kvm_update_cpuid_runtime() into a new helper function and move its other call site into kvm_vcpu_after_set_cpuid() where it more likely belongs. Fixes: 66570e966dd9 ("kvm: x86: only provide PV features if enabled in guest's CPUID") Signed-off-by: Oliver Upton <oupton@google.com> Reviewed-by: Peter Shier <pshier@google.com> Message-Id: <20201027231044.655110-5-oupton@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08kvm: x86: reads of restricted pv msrs should also result in #GPOliver Upton
commit 66570e966dd9 ("kvm: x86: only provide PV features if enabled in guest's CPUID") only protects against disallowed guest writes to KVM paravirtual msrs, leaving msr reads unchecked. Fix this by enforcing KVM_CPUID_FEATURES for msr reads as well. Fixes: 66570e966dd9 ("kvm: x86: only provide PV features if enabled in guest's CPUID") Signed-off-by: Oliver Upton <oupton@google.com> Reviewed-by: Peter Shier <pshier@google.com> Message-Id: <20201027231044.655110-4-oupton@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08KVM: x86: use positive error values for msr emulation that causes #GPMaxim Levitsky
Recent introduction of the userspace msr filtering added code that uses negative error codes for cases that result in either #GP delivery to the guest, or handled by the userspace msr filtering. This breaks an assumption that a negative error code returned from the msr emulation code is a semi-fatal error which should be returned to userspace via KVM_RUN ioctl and usually kill the guest. Fix this by reusing the already existing KVM_MSR_RET_INVALID error code, and by adding a new KVM_MSR_RET_FILTERED error code for the userspace filtered msrs. Fixes: 291f35fb2c1d1 ("KVM: x86: report negative values from wrmsr emulation to userspace") Reported-by: Qian Cai <cai@redhat.com> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20201101115523.115780-1-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08KVM: Documentation: Update entry for KVM_CAP_ENFORCE_PV_CPUIDPeter Xu
Should be squashed into 66570e966dd9cb4f. Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20201023183358.50607-3-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08KVM: Documentation: Update entry for KVM_X86_SET_MSR_FILTERPeter Xu
It should be an accident when rebase, since we've already have section 8.25 (which is KVM_CAP_S390_DIAG318). Fix the number. Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20201001012044.5151-2-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08KVM: x86/mmu: fix counting of rmap entries in pte_list_addLi RongQing
Fix an off-by-one style bug in pte_list_add() where it failed to account the last full set of SPTEs, i.e. when desc->sptes is full and desc->more is NULL. Merge the two "PTE_LIST_EXT-1" checks as part of the fix to avoid an extra comparison. Signed-off-by: Li RongQing <lirongqing@baidu.com> Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <1601196297-24104-1-git-send-email-lirongqing@baidu.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-08Merge tag 'kvmarm-fixes-5.10-2' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for v5.10, take #2 - Fix compilation error when PMD and PUD are folded - Fix regresssion of the RAZ behaviour of ID_AA64ZFR0_EL1
2020-11-07stmmac: intel: change all EHL/TGL to auto detect phy addrVoon Weifeng
Set all EHL/TGL phy_addr to -1 so that the driver will automatically detect it at run-time by probing all the possible 32 addresses. Signed-off-by: Voon Weifeng <weifeng.voon@intel.com> Signed-off-by: Wong Vee Khee <vee.khee.wong@intel.com> Link: https://lore.kernel.org/r/20201106094341.4241-1-vee.khee.wong@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07net: usb: fix spelling typo in cdc_ncm.cWang Qing
Actually, withing should be within. Signed-off-by: Wang Qing <wangqing@vivo.com> Link: https://lore.kernel.org/r/1604649025-22559-1-git-send-email-wangqing@vivo.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07net: core: fix spelling typo in flow_dissector.cWang Qing
withing should be within. Signed-off-by: Wang Qing <wangqing@vivo.com> Link: https://lore.kernel.org/r/1604650310-30432-1-git-send-email-wangqing@vivo.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07Merge branch 'net-ipa-constrain-gsi-interrupts'Jakub Kicinski
Alex Elder says: ==================== net: ipa: constrain GSI interrupts The goal of this series is to more tightly control when GSI interrupts are enabled. This is a long-ish series, so I'll describe it in parts. The first patch is actually unrelated... I forgot to include it in my previous series (which exposed the GSI layer to the IPA version). It is a trivial comments-only update patch. The second patch defers registering the GSI interrupt handler until *after* all of the resources that handler touches have been initialized. In practice, we don't see this interrupt that early, but this precludes an obvious problem. The next two patches are simple changes. The first just trivially renames a field. The second switches from using constant mask values to using an enumerated type of bit positions to represent each GSI interrupt type. The rest implement the "real work." First, all interrupts are disabled at initialization time. Next, we keep track of a bitmask of enabled GSI interrupt types, updating it each time we enable or disable one of them. From there we have a set of patches that one-by-one enable each interrupt type only during the period it is required. This includes allowing a channel to generate IEOB interrupts only when it has been enabled. And finally, the last patch simplifies some code now that all GSI interrupt types are handled uniformly. ==================== Link: https://lore.kernel.org/r/20201105181407.8006-1-elder@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07net: ipa: pass a value to gsi_irq_type_update()Alex Elder
Now that all of the GSI interrupts are handled uniformly, change gsi_irq_type_update() so it takes a value. Have the function assign that value to the cached mask of enabled GSI IRQ types before writing it to hardware. Note that gsi_irq_teardown() will only be called after gsi_irq_disable(), so it's not necessary for the former to disable all IRQ types. Get rid of that. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07net: ipa: only enable GSI general IRQs when neededAlex Elder
Most GSI general errors are unrecoverable without a full reset. Despite that, we want to receive these errors so we can at least report what happened before whatever undefined behavior ensues. Explicitly disable all such interrupts in gsi_irq_setup(), then enable those we want in gsi_irq_enable(). List the interrupt types we are interested in (everything but breakpoint) explicitly rather than using GSI_CNTXT_GSI_IRQ_ALL, and remove that symbol's definition. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07net: ipa: explicitly disallow inter-EE interruptsAlex Elder
It is possible for other execution environments (EEs, like the modem) to request changes to local (AP) channel or event ring state. We do not support this feature. In gsi_irq_setup(), explicitly zero the mask that defines which channels are permitted to generate inter-EE channel state change interrupts. Do the same for the event ring mask. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07net: ipa: only enable GSI IEOB IRQs when neededAlex Elder
A GSI channel must be started in order to use it to perform a transfer data (or command) transaction. And the only time we'll see an IEOB interrupt is if we send a transaction to a started channel. Therefore we do not need to have the IEOB interrupt type enabled until at least one channel has been started. And once the last started channel has been stopped, we can disable the IEOB interrupt type again. We already enable the IEOB interrupt for a particular channel only when it is started. Extend that by having the IEOB interrupt *type* be enabled only when at least one channel is in STARTED state. Disallow all channels from triggering the IEOB interrupt in gsi_irq_setup(). We only enable an channel's interrupt when needed, so there is no longer any need to zero the channel mask in gsi_irq_disable(). Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07net: ipa: only enable generic command completion IRQ when neededAlex Elder
The completion of a generic EE GSI command is signaled by a global interrupt of type GP_INT1. The only other used type for a global interrupt is a hardware error report. First, disallow all global interrupt types in gsi_irq_setup(). We want to know about hardware errors, so re-enable the interrupt type in gsi_irq_enable(), to allow hardware errors to be reported. Disable that interrupt type again in gsi_irq_disable(). We only issue generic EE commands one at a time, and there's no reason to keep the completion interrupt enabled when no generic EE command is pending. We furthermore have no need to enable the GP_INT2 or GP_INT3 interrupt types (which aren't used). The change in gsi_irq_enable() makes GSI_CNTXT_GLOB_IRQ_ALL unused, so get rid of it. Have gsi_generic_command() enable the GP_INT1 interrupt type (in addition to the ERROR_INT type) only while a generic command is pending. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07net: ipa: only enable GSI event control IRQs when neededAlex Elder
A GSI event ring causes an event control interrupt to fire whenever its state changes (between NOT_ALLOCATED and ALLOCATED). No event ring should ever change state except when we request it to. Currently, we permit *all* events rings to generate event control interrupts--even those that are never used. And we enable event control interrupts essentially at all times, from setup to teardown. Instead, only enable the event control interrupt type for the duration of an event ring command, and when doing so, only allow the event ring being operated upon to cause the interrupt to fire. Disallow all event rings from issuing the event control interrupt in gsi_irq_setup(). Because an event ring's interrupt is only enabled when needed, there is no longer any need to zero the event channel mask in gsi_irq_disable(). Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07net: ipa: only enable GSI channel control IRQs when neededAlex Elder
A GSI channel causes a channel control interrupt to fire whenever its state changes (between NOT_ALLOCATED, ALLOCATED, STARTED, etc.). We do not support inter-EE channel commands (initiated by other EEs), so no channel should ever change state except when we request it to. Currently, we permit *all* channels to generate channel control interrupts--even those that are never used. And we enable channel control interrupts essentially at all times, from setup to teardown. Instead, disable all channel control interrupts initially in gsi_irq_setup(), and only enable the channel control interrupt type for the duration of a channel command. When doing so, only allow the channel being operated upon to cause the interrupt to fire. Because a channel's interrupt is now enabled only when needed (one channel at a time), there is no longer any need to zero the channel mask in gsi_irq_disable(). Add new gsi_irq_type_enable() and gsi_irq_type_disable() as helper functions to control whether a given GSI interrupt type is enabled. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07net: ipa: cache last-saved GSI IRQ enabled typeAlex Elder
Keep track of the set of GSI interrupt types that are currently enabled by recording the mask value to write (or last written) to the TYPE_IRQ_MSK register. Create a new helper function gsi_irq_type_update() to handle actually writing the register. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07net: ipa: disable all GSI interrupt types initiallyAlex Elder
Introduce gsi_irq_setup() and gsi_irq_teardown() to disable all GSI interrupts when first setting up GSI hardware, and to clean things up when we're done. Re-enable all GSI interrupt types in gsi_irq_enable(), but do so only after each of the type-specific interrupt masks has been configured. Similarly, disable all interrupt types in gsi_irq_disable()--first--before zeroing out the type-specific masks. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07net: ipa: define GSI interrupt types with an enumAlex Elder
Define the GSI interrupt types with an enumerated type whose values are the bit positions representing each interrupt type. Include a short comment describing how each interrupt type is used. Build up the enabled interrupt mask explicitly in gsi_irq_enable(), and get rid of the definition of GSI_CNTXT_TYPE_IRQ_MSK_ALL. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07net: ipa: rename gsi->event_enable_bitmapAlex Elder
Rename the "event_enable_bitmap" field of the GSI structure to be "ieob_enabled_bitmap". An upcoming patch will cache the last value stored for another interrupt mask and this is a more direct naming convention to follow. Add a few comments to explain the bitmap fields in the GSI structure. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07net: ipa: request GSI IRQ laterAlex Elder
Introduce gsi_irq_init() and gsi_irq_exit(), to encapsulate looking up the GSI IRQ and registering its handler. Call gsi_irq_init() a little later in gsi_init(), and initialize the completion earlier. The IRQ handler accesses both the GSI virtual memory pointer and the completion, and this way these things will have been initialized before the gsi_irq() can ever be called. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07net: ipa: refer to IPA versions, not GSIAlex Elder
The GSI code is now exposed to IPA version numbers, and we handle version-specific behavior based on the IPA version. Modify some comments that talk about GSI versions so they reference IPA versions instead. Correct version number errors in a couple of these comments. The (comment) mapping between IPA and GSI versions in the definition of the ipa_version enumerated type remains. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07net: x25_asy: Delete the x25_asy driverXie He
This driver transports LAPB (X.25 link layer) frames over TTY links. I can safely say that this driver has no actual user because it was not working at all until: commit 8fdcabeac398 ("drivers/net/wan/x25_asy: Fix to make it work") The code in its current state still has problems: 1. The uses of "struct x25_asy" in x25_asy_unesc (when receiving) and in x25_asy_write_wakeup (when sending) are not protected by locks against x25_asy_change_mtu's changing of the transmitting/receiving buffers. Also, all "netif_running" checks in this driver are not protected by locks against the ndo_stop function. 2. The driver stops all TTY read/write when the netif is down. I think this is not right because this may cause the last outgoing frame before the netif goes down to be incompletely transmitted, and the first incoming frame after the netif goes up to be incompletely received. And there may also be other problems. I was planning to fix these problems but after recent discussions about deleting other old networking code, I think we may just delete this driver, too. Signed-off-by: Xie He <xie.he.0141@gmail.com> Acked-by: Martin Schiller <ms@dev.tdt.de> Link: https://lore.kernel.org/r/20201105073434.429307-1-xie.he.0141@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07Merge tag 'block-5.10-2020-11-07' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: - NVMe pull request from Christoph: - revert a nvme_queue size optimization (Keith Bush) - fabrics timeout races fixes (Chao Leng and Sagi Grimberg)" - null_blk zone locking fix (Damien) * tag 'block-5.10-2020-11-07' of git://git.kernel.dk/linux-block: null_blk: Fix scheduling in atomic with zoned mode nvme-tcp: avoid repeated request completion nvme-rdma: avoid repeated request completion nvme-tcp: avoid race between time out and tear down nvme-rdma: avoid race between time out and tear down nvme: introduce nvme_sync_io_queues Revert "nvme-pci: remove last_sq_tail"
2020-11-07Merge tag 'io_uring-5.10-2020-11-07' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull io_uring fixes from Jens Axboe: "A set of fixes for io_uring: - SQPOLL cancelation fixes - Two fixes for the io_identity COW - Cancelation overflow fix (Pavel) - Drain request cancelation fix (Pavel) - Link timeout race fix (Pavel)" * tag 'io_uring-5.10-2020-11-07' of git://git.kernel.dk/linux-block: io_uring: fix link lookup racing with link timeout io_uring: use correct pointer for io_uring_show_cred() io_uring: don't forget to task-cancel drained reqs io_uring: fix overflowed cancel w/ linked ->files io_uring: drop req/tctx io_identity separately io_uring: ensure consistent view of original task ->mm from SQPOLL io_uring: properly handle SQPOLL request cancelations io-wq: cancel request if it's asking for files and we don't have them
2020-11-07net: macb: fix NULL dereference due to no pcs_config methodParshuram Thombare
This patch fixes NULL pointer dereference due to NULL pcs_config in pcs_ops. Fixes: e4e143e26ce8 ("net: macb: add support for high speed interface") Reported-by: Nicolas Ferre <Nicolas.Ferre@microchip.com> Link: https://lore.kernel.org/netdev/2db854c7-9ffb-328a-f346-f68982723d29@microchip.com/ Signed-off-by: Parshuram Thombare <pthombar@cadence.com> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Link: https://lore.kernel.org/r/1604599113-2488-1-git-send-email-pthombar@cadence.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07ptp: idt82p33: optimize _idt82p33_adjfineMin Li
Use div_s64 so that the neg_adj is not needed. Signed-off-by: Min Li <min.li.xe@renesas.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Link: https://lore.kernel.org/r/1604634729-24960-3-git-send-email-min.li.xe@renesas.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07ptp: idt82p33: use i2c_master_send for bus writeMin Li
Refactor idt82p33_xfer and use i2c_master_send for write operation. Because some I2C controllers are only working with single-burst write transaction. Signed-off-by: Min Li <min.li.xe@renesas.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Link: https://lore.kernel.org/r/1604634729-24960-2-git-send-email-min.li.xe@renesas.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07ptp: idt82p33: add adjphase supportMin Li
Add idt82p33_adjphase() to support PHC write phase mode. Signed-off-by: Min Li <min.li.xe@renesas.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Link: https://lore.kernel.org/r/1604634729-24960-1-git-send-email-min.li.xe@renesas.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07futex: Handle transient "ownerless" rtmutex state correctlyMike Galbraith
Gratian managed to trigger the BUG_ON(!newowner) in fixup_pi_state_owner(). This is one possible chain of events leading to this: Task Prio Operation T1 120 lock(F) T2 120 lock(F) -> blocks (top waiter) T3 50 (RT) lock(F) -> boosts T1 and blocks (new top waiter) XX timeout/ -> wakes T2 signal T1 50 unlock(F) -> wakes T3 (rtmutex->owner == NULL, waiter bit is set) T2 120 cleanup -> try_to_take_mutex() fails because T3 is the top waiter and the lower priority T2 cannot steal the lock. -> fixup_pi_state_owner() sees newowner == NULL -> BUG_ON() The comment states that this is invalid and rt_mutex_real_owner() must return a non NULL owner when the trylock failed, but in case of a queued and woken up waiter rt_mutex_real_owner() == NULL is a valid transient state. The higher priority waiter has simply not yet managed to take over the rtmutex. The BUG_ON() is therefore wrong and this is just another retry condition in fixup_pi_state_owner(). Drop the locks, so that T3 can make progress, and then try the fixup again. Gratian provided a great analysis, traces and a reproducer. The analysis is to the point, but it confused the hell out of that tglx dude who had to page in all the futex horrors again. Condensed version is above. [ tglx: Wrote comment and changelog ] Fixes: c1e2f0eaf015 ("futex: Avoid violating the 10th rule of futex") Reported-by: Gratian Crisan <gratian.crisan@ni.com> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/87a6w6x7bb.fsf@ni.com Link: https://lore.kernel.org/r/87sg9pkvf7.fsf@nanos.tec.linutronix.de
2020-11-07net: marvell: prestera: fix compilation with CONFIG_BRIDGE=mVadym Kochan
With CONFIG_BRIDGE=m the compilation fails: ld: drivers/net/ethernet/marvell/prestera/prestera_switchdev.o: in function `prestera_bridge_port_event': prestera_switchdev.c:(.text+0x2ebd): undefined reference to `br_vlan_enabled' in case the driver is statically enabled. Fix it by adding 'BRIDGE || BRIDGE=n' dependency. Fixes: e1189d9a5fbe ("net: marvell: prestera: Add Switchdev driver implementation") Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Vadym Kochan <vadym.kochan@plvision.eu> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Link: https://lore.kernel.org/r/20201106161128.24069-1-vadym.kochan@plvision.eu Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07Merge tag 'mlx5-fixes-2020-11-03' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2020-11-03 v1->v2: - Fix fixes line tag in patch #1 - Toss ktls refcount leak fix, Maxim will look further into the root cause. - Toss eswitch chain 0 prio patch, until we determine if it is needed for -rc and net. * tag 'mlx5-fixes-2020-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5e: Fix incorrect access of RCU-protected xdp_prog net/mlx5e: Fix VXLAN synchronization after function reload net/mlx5: E-switch, Avoid extack error log for disabled vport net/mlx5: Fix deletion of duplicate rules net/mlx5e: Use spin_lock_bh for async_icosq_lock net/mlx5e: Protect encap route dev from concurrent release net/mlx5e: Fix modify header actions memory leak ==================== Link: https://lore.kernel.org/r/20201105202129.23644-1-saeedm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07net: macvlan: remove redundant initialization in macvlan_dev_netpoll_setupMenglong Dong
The initialization for err with 0 seems useless, as it is soon updated with -ENOMEM. So, we can remove it. Changes since v1: -Keep -ENOMEM still. Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn> Link: https://lore.kernel.org/r/1604541244-3241-1-git-send-email-dong.menglong@zte.com.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07r8169: disable hw csum for short packets on all chip versionsHeiner Kallweit
RTL8125B has same or similar short packet hw padding bug as RTL8168evl. The main workaround has been extended accordingly, however we have to disable also hw checksumming for short packets on affected new chip versions. Instead of checking for an affected chip version let's simply disable hw checksumming for short packets in general. v2: - remove the version checks and disable short packet hw csum in general - reflect this in commit title and message Fixes: 0439297be951 ("r8169: add support for RTL8125B") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/7fbb35f0-e244-ef65-aa55-3872d7d38698@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07r8169: fix potential skb double free in an error pathHeiner Kallweit
The caller of rtl8169_tso_csum_v2() frees the skb if false is returned. eth_skb_pad() internally frees the skb on error what would result in a double free. Therefore use __skb_put_padto() directly and instruct it to not free the skb on error. Fixes: b423e9ae49d7 ("r8169: fix offloaded tx checksum for small packets.") Reported-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/f7e68191-acff-9ded-4263-c016428a8762@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07cxgb4: Fix the -Wmisleading-indentation warningKaixu Xia
Fix the gcc warning: drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c:2673:9: warning: this 'for' clause does not guard... [-Wmisleading-indentation] 2673 | for (i = 0; i < n; ++i) \ Reported-by: Tosk Robot <tencent_os_robot@tencent.com> Signed-off-by: Kaixu Xia <kaixuxia@tencent.com> Link: https://lore.kernel.org/r/1604467444-23043-1-git-send-email-kaixuxia@tencent.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>