linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2021-06-24	KVM: selftests: Use "standard" min virtual address for Hyper-V pages	Sean Christopherson
	Use the de facto standard minimum virtual address for Hyper-V's hcall params page. It's the allocator's job to not double-allocate memory, i.e. there's no reason to force different regions for the params vs. hcall page. This will allow adding a page allocation helper with a "standard" minimum address. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210622200529.3650424-6-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24	KVM: selftests: Unconditionally use memslot 0 for x86's GDT/TSS setup	Sean Christopherson
	Refactor x86's GDT/TSS allocations to for memslot '0' at its vm_addr_alloc() call sites instead of passing in '0' from on high. This is a step toward using a common helper for allocating pages. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210622200529.3650424-5-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24	KVM: selftests: Unconditionally use memslot 0 when loading elf binary	Sean Christopherson
	Use memslot '0' for all vm_vaddr_alloc() calls when loading the test binary. This is the first step toward adding a helper to handle page allocations with a default value for the target memslot. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210622200529.3650424-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24	KVM: selftests: Zero out the correct page in the Hyper-V features test	Sean Christopherson
	Fix an apparent copy-paste goof in hyperv_features where hcall_page (which is two pages, so technically just the first page) gets zeroed twice, and hcall_params gets zeroed none times. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210622200529.3650424-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24	KVM: selftests: Remove errant asm/barrier.h include to fix arm64 build	Sean Christopherson
	Drop an unnecessary include of asm/barrier.h from dirty_log_test.c to allow the test to build on arm64. arm64, s390, and x86 all build cleanly without the include (PPC and MIPS aren't supported in KVM's selftests). arm64's barrier.h includes linux/kasan-checks.h, which is not copied into tools/. In file included from ../../../../tools/include/asm/barrier.h:8, from dirty_log_test.c:19: .../arm64/include/asm/barrier.h:12:10: fatal error: linux/kasan-checks.h: No such file or directory 12 \| #include <linux/kasan-checks.h> \| ^~~~~~~~~~~~~~~~~~~~~~ compilation terminated. Fixes: 84292e565951 ("KVM: selftests: Add dirty ring buffer test") Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210622200529.3650424-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24	KVM: nVMX: Handle split-lock #AC exceptions that happen in L2	Sean Christopherson
	Mark #ACs that won't be reinjected to the guest as wanted by L0 so that KVM handles split-lock #AC from L2 instead of forwarding the exception to L1. Split-lock #AC isn't yet virtualized, i.e. L1 will treat it like a regular #AC and do the wrong thing, e.g. reinject it into L2. Fixes: e6f8b6c12f03 ("KVM: VMX: Extend VMXs #AC interceptor to handle split lock #AC in guest") Cc: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210622172244.3561540-1-seanjc@google.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24	KVM: x86/mmu: Fix uninitialized boolean variable flush	Colin Ian King
	In the case where kvm_memslots_have_rmaps(kvm) is false the boolean variable flush is not set and is uninitialized. If is_tdp_mmu_enabled(kvm) is true then the call to kvm_tdp_mmu_zap_collapsible_sptes passes the uninitialized value of flush into the call. Fix this by initializing flush to false. Addresses-Coverity: ("Uninitialized scalar variable") Fixes: e2209710ccc5 ("KVM: x86/mmu: Skip rmap operations if rmaps not allocated") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210622150912.23429-1-colin.king@canonical.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24	KVM: selftests: fix triple fault if ept=0 in dirty_log_test	Hou Wenlong
	Commit 22f232d134e1 ("KVM: selftests: x86: Set supported CPUIDs on default VM") moved vcpu_set_cpuid into vm_create_with_vcpus, but dirty_log_test doesn't use it to create vm. So vcpu's CPUIDs is not set, the guest's pa_bits in kvm would be smaller than the value queried by userspace. However, the dirty track memory slot is in the highest GPA, the reserved bits in gpte would be set with wrong pa_bits. For shadow paging, page fault would fail in permission_fault and be injected into guest. Since guest doesn't have idt, it finally leads to vm_exit for triple fault. Move vcpu_set_cpuid into vm_vcpu_add_default to set supported CPUIDs on default vcpu, since almost all tests need it. Fixes: 22f232d134e1 ("KVM: selftests: x86: Set supported CPUIDs on default VM") Signed-off-by: Hou Wenlong <houwenlong93@linux.alibaba.com> Message-Id: <411ea2173f89abce56fc1fca5af913ed9c5a89c9.1624351343.git.houwenlong93@linux.alibaba.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24	KVM: x86: Print CPU of last attempted VM-entry when dumping VMCS/VMCB	Jim Mattson
	Failed VM-entry is often due to a faulty core. To help identify bad cores, print the id of the last logical processor that attempted VM-entry whenever dumping a VMCS or VMCB. Signed-off-by: Jim Mattson <jmattson@google.com> Message-Id: <20210621221648.1833148-1-jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24	x86/resctrl: Fix kernel-doc in internal.h	Fabio M. De Francesco
	Add description of undocumented parameters. Issues detected by scripts/kernel-doc. Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lkml.kernel.org/r/20210618223206.29539-1-fmdefrancesco@gmail.com
2021-06-24	x86/resctrl: Fix kernel-doc in pseudo_lock.c	Fabio M. De Francesco
	Add undocumented parameters detected by scripts/kernel-doc. Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lkml.kernel.org/r/20210616181530.4094-1-fmdefrancesco@gmail.com
2021-06-24	KVM: selftests: Fix mapping length truncation in m{,un}map()	Zenghui Yu
	max_mem_slots is now declared as uint32_t. The result of (0x200000 * 32767) is unexpectedly truncated to be 0xffe00000, whilst we actually need to allocate about, 63GB. Cast max_mem_slots to size_t in both mmap() and munmap() to fix the length truncation. We'll otherwise see the failure on arm64 thanks to the access_ok() checking in __kvm_set_memory_region(), as the unmapped VA happen to go beyond the task's allowed address space. # ./set_memory_region_test Allowed number of memory slots: 32767 Adding slots 0..32766, each memory region with 2048K size ==== Test Assertion Failure ==== set_memory_region_test.c:391: ret == 0 pid=94861 tid=94861 errno=22 - Invalid argument 1 0x00000000004015a7: test_add_max_memory_regions at set_memory_region_test.c:389 2 (inlined by) main at set_memory_region_test.c:426 3 0x0000ffffb8e67bdf: ?? ??:0 4 0x00000000004016db: _start at :? KVM_SET_USER_MEMORY_REGION IOCTL failed, rc: -1 errno: 22 slot: 2615 Fixes: 3bf0fcd75434 ("KVM: selftests: Speed up set_memory_region_test") Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Message-Id: <20210624070931.565-1-yuzenghui@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24	sched/doc: Update the CPU capacity asymmetry bits	Beata Michalska
	Update the documentation bits referring to capacity aware scheduling with regards to newly introduced SD_ASYM_CPUCAPACITY_FULL sched_domain flag. Signed-off-by: Beata Michalska <beata.michalska@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Link: https://lore.kernel.org/r/20210603140627.8409-4-beata.michalska@arm.com
2021-06-24	sched/topology: Rework CPU capacity asymmetry detection	Beata Michalska
	Currently the CPU capacity asymmetry detection, performed through asym_cpu_capacity_level, tries to identify the lowest topology level at which the highest CPU capacity is being observed, not necessarily finding the level at which all possible capacity values are visible to all CPUs, which might be bit problematic for some possible/valid asymmetric topologies i.e.: DIE [ ] MC [ ][ ] CPU [0] [1] [2] [3] [4] [5] [6] [7] Capacity \|.....\| \|.....\| \|.....\| \|.....\| L M B B Where: arch_scale_cpu_capacity(L) = 512 arch_scale_cpu_capacity(M) = 871 arch_scale_cpu_capacity(B) = 1024 In this particular case, the asymmetric topology level will point at MC, as all possible CPU masks for that level do cover the CPU with the highest capacity. It will work just fine for the first cluster, not so much for the second one though (consider the find_energy_efficient_cpu which might end up attempting the energy aware wake-up for a domain that does not see any asymmetry at all) Rework the way the capacity asymmetry levels are being detected, allowing to point to the lowest topology level (for a given CPU), where full set of available CPU capacities is visible to all CPUs within given domain. As a result, the per-cpu sd_asym_cpucapacity might differ across the domains. This will have an impact on EAS wake-up placement in a way that it might see different range of CPUs to be considered, depending on the given current and target CPUs. Additionally, those levels, where any range of asymmetry (not necessarily full) is being detected will get identified as well. The selected asymmetric topology level will be denoted by SD_ASYM_CPUCAPACITY_FULL sched domain flag whereas the 'sub-levels' would receive the already used SD_ASYM_CPUCAPACITY flag. This allows maintaining the current behaviour for asymmetric topologies, with misfit migration operating correctly on lower levels, if applicable, as any asymmetry is enough to trigger the misfit migration. The logic there relies on the SD_ASYM_CPUCAPACITY flag and does not relate to the full asymmetry level denoted by the sd_asym_cpucapacity pointer. Detecting the CPU capacity asymmetry is being based on a set of available CPU capacities for all possible CPUs. This data is being generated upon init and updated once CPU topology changes are being detected (through arch_update_cpu_topology). As such, any changes to identified CPU capacities (like initializing cpufreq) need to be explicitly advertised by corresponding archs to trigger rebuilding the data. Additional -dflags- parameter, used when building sched domains, has been removed as well, as the asymmetry flags are now being set directly in sd_init. Suggested-by: Peter Zijlstra <peterz@infradead.org> Suggested-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Beata Michalska <beata.michalska@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Tested-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lore.kernel.org/r/20210603140627.8409-3-beata.michalska@arm.com
2021-06-24	sched/core: Introduce SD_ASYM_CPUCAPACITY_FULL sched_domain flag	Beata Michalska
	Introducing new, complementary to SD_ASYM_CPUCAPACITY, sched_domain topology flag, to distinguish between shed_domains where any CPU capacity asymmetry is detected (SD_ASYM_CPUCAPACITY) and ones where a full set of CPU capacities is visible to all domain members (SD_ASYM_CPUCAPACITY_FULL). With the distinction between full and partial CPU capacity asymmetry, brought in by the newly introduced flag, the scope of the original SD_ASYM_CPUCAPACITY flag gets shifted, still maintaining the existing behaviour when one is detected on a given sched domain, allowing misfit migrations within sched domains that do not observe full range of CPU capacities but still do have members with different capacity values. It loses though it's meaning when it comes to the lowest CPU asymmetry sched_domain level per-cpu pointer, which is to be now denoted by SD_ASYM_CPUCAPACITY_FULL flag. Signed-off-by: Beata Michalska <beata.michalska@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Link: https://lore.kernel.org/r/20210603140627.8409-2-beata.michalska@arm.com
2021-06-24	psi: Fix race between psi_trigger_create/destroy	Zhaoyang Huang
	Race detected between psi_trigger_destroy/create as shown below, which cause panic by accessing invalid psi_system->poll_wait->wait_queue_entry and psi_system->poll_timer->entry->next. Under this modification, the race window is removed by initialising poll_wait and poll_timer in group_init which are executed only once at beginning. psi_trigger_destroy() psi_trigger_create() mutex_lock(trigger_lock); rcu_assign_pointer(poll_task, NULL); mutex_unlock(trigger_lock); mutex_lock(trigger_lock); if (!rcu_access_pointer(group->poll_task)) { timer_setup(poll_timer, poll_timer_fn, 0); rcu_assign_pointer(poll_task, task); } mutex_unlock(trigger_lock); synchronize_rcu(); del_timer_sync(poll_timer); <-- poll_timer has been reinitialized by psi_trigger_create() So, trigger_lock/RCU correctly protects destruction of group->poll_task but misses this race affecting poll_timer and poll_wait. Fixes: 461daba06bdc ("psi: eliminate kthread_worker from psi trigger scheduling mechanism") Co-developed-by: ziwei.dai <ziwei.dai@unisoc.com> Signed-off-by: ziwei.dai <ziwei.dai@unisoc.com> Co-developed-by: ke.wang <ke.wang@unisoc.com> Signed-off-by: ke.wang <ke.wang@unisoc.com> Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Link: https://lkml.kernel.org/r/1623371374-15664-1-git-send-email-huangzhaoyang@gmail.com
2021-06-24	sched/fair: Introduce the burstable CFS controller	Huaixin Chang
	The CFS bandwidth controller limits CPU requests of a task group to quota during each period. However, parallel workloads might be bursty so that they get throttled even when their average utilization is under quota. And they are latency sensitive at the same time so that throttling them is undesired. We borrow time now against our future underrun, at the cost of increased interference against the other system users. All nicely bounded. Traditional (UP-EDF) bandwidth control is something like: (U = \Sum u_i) <= 1 This guaranteeds both that every deadline is met and that the system is stable. After all, if U were > 1, then for every second of walltime, we'd have to run more than a second of program time, and obviously miss our deadline, but the next deadline will be further out still, there is never time to catch up, unbounded fail. This work observes that a workload doesn't always executes the full quota; this enables one to describe u_i as a statistical distribution. For example, have u_i = {x,e}_i, where x is the p(95) and x+e p(100) (the traditional WCET). This effectively allows u to be smaller, increasing the efficiency (we can pack more tasks in the system), but at the cost of missing deadlines when all the odds line up. However, it does maintain stability, since every overrun must be paired with an underrun as long as our x is above the average. That is, suppose we have 2 tasks, both specify a p(95) value, then we have a p(95)p(95) = 90.25% chance both tasks are within their quota and everything is good. At the same time we have a p(5)p(5) = 0.25% chance both tasks will exceed their quota at the same time (guaranteed deadline fail). Somewhere in between there's a threshold where one exceeds and the other doesn't underrun enough to compensate; this depends on the specific CDFs. At the same time, we can say that the worst case deadline miss, will be \Sum e_i; that is, there is a bounded tardiness (under the assumption that x+e is indeed WCET). The benefit of burst is seen when testing with schbench. Default value of kernel.sched_cfs_bandwidth_slice_us(5ms) and CONFIG_HZ(1000) is used. mkdir /sys/fs/cgroup/cpu/test echo $$ > /sys/fs/cgroup/cpu/test/cgroup.procs echo 100000 > /sys/fs/cgroup/cpu/test/cpu.cfs_quota_us echo 100000 > /sys/fs/cgroup/cpu/test/cpu.cfs_burst_us ./schbench -m 1 -t 3 -r 20 -c 80000 -R 10 The average CPU usage is at 80%. I run this for 10 times, and got long tail latency for 6 times and got throttled for 8 times. Tail latencies are shown below, and it wasn't the worst case. Latency percentiles (usec) 50.0000th: 19872 75.0000th: 21344 90.0000th: 22176 95.0000th: 22496 99.0000th: 22752 99.5000th: 22752 99.9000th: 22752 min=0, max=22727 rps: 9.90 p95 (usec) 22496 p99 (usec) 22752 p95/cputime 28.12% p99/cputime 28.44% The interferenece when using burst is valued by the possibilities for missing the deadline and the average WCET. Test results showed that when there many cgroups or CPU is under utilized, the interference is limited. More details are shown in: https://lore.kernel.org/lkml/5371BD36-55AE-4F71-B9D7-B86DC32E3D2B@linux.alibaba.com/ Co-developed-by: Shanpei Chen <shanpeic@linux.alibaba.com> Signed-off-by: Shanpei Chen <shanpeic@linux.alibaba.com> Co-developed-by: Tianchen Ding <dtcccc@linux.alibaba.com> Signed-off-by: Tianchen Ding <dtcccc@linux.alibaba.com> Signed-off-by: Huaixin Chang <changhuaixin@linux.alibaba.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Ben Segall <bsegall@google.com> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20210621092800.23714-2-changhuaixin@linux.alibaba.com
2021-06-24	Merge tag 'amd-drm-fixes-5.13-2021-06-21' of ↵	Dave Airlie
	https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-5.13-2021-06-21: amdgpu: - Revert GFX9, 10 doorbell fixes, we just end up trading one bug for another - Potential memory corruption fix in framebuffer handling Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210621214132.4004-1-alexander.deucher@amd.com
2021-06-24	crypto: sl3516 - depends on HAS_IOMEM	Corentin Labbe
	The sl3516 driver need to depend on HAS_IOMEM. This fixes a build error: ERROR: modpost: "devm_platform_ioremap_resource" [drivers/crypto/gemini/sl3516-ce.ko] undefined! Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-06-24	objtool: Don't make .altinstructions writable	Josh Poimboeuf
	When objtool creates the .altinstructions section, it sets the SHF_WRITE flag to make the section writable -- unless the section had already been previously created by the kernel. The mismatch between kernel-created and objtool-created section flags can cause failures with external tooling (kpatch-build). And the section doesn't need to be writable anyway. Make the section flags consistent with the kernel's. Fixes: 9bc0bb50727c ("objtool/x86: Rewrite retpoline thunk calls") Reported-by: Joe Lawrence <joe.lawrence@redhat.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/6c284ae89717889ea136f9f0064d914cd8329d31.1624462939.git.jpoimboe@redhat.com
2021-06-24	crypto: hisilicon/qm - implement for querying hardware tasks status.	Wenkai Lin
	This patch adds a function hisi_qm_is_q_updated to check if the task is ready in hardware queue when user polls an UACCE queue.This prevents users from repeatedly querying whether the accelerator has completed tasks, which wastes CPU resources. Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-06-24	crypto: sl3516 - Fix build warning without CONFIG_PM	YueHaibing
	drivers/crypto/gemini/sl3516-ce-core.c:345:12: warning: ‘sl3516_ce_pm_resume’ defined but not used [-Wunused-function] static int sl3516_ce_pm_resume(struct device *dev) ^~~~~~~~~~~~~~~~~~~ The driver needs PM, otherwise clock and resets are never set. So make it depends on PM to fix this warning. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Suggested-by: LABBE Corentin <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-06-24	MAINTAINERS: update caam crypto driver maintainers list	Horia Geantă
	Aymen steps down as caam maintainer, being replaced by Pankaj. Signed-off-by: Horia Geantă <horia.geanta@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-06-24	crypto: nx - Fix numerous sparse byte-order warnings	Herbert Xu
	The nx driver started out its life as a BE-only driver. However, somewhere along the way LE support was partially added. This never seems to have been extended all the way but it does trigger numerous warnings during build. This patch fixes all those warnings, but it doesn't mean that the driver will work on LE. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-06-24	crypto: nx - Fix RCU warning in nx842_OF_upd_status	Herbert Xu
	The function nx842_OF_upd_status triggers a sparse RCU warning when it directly dereferences the RCU-protected devdata. This appears to be an accident as there was another variable of the same name that was passed in from the caller. After it was removed (because the main purpose of using it, to update the status member was itself removed) the global variable unintenionally stood in as its replacement. This patch restores the devdata parameter. Fixes: 90fd73f912f0 ("crypto: nx - remove pSeries NX 'status' field") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-06-24	crypto: api - Move crypto attr definitions out of crypto.h	Herbert Xu
	The definitions for crypto_attr-related types and enums are not needed by most Crypto API users. This patch moves them out of crypto.h and into algapi.h/internal.h depending on the extent of their use. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-06-24	crypto: nx - Fix memcpy() over-reading in nonce	Kees Cook
	Fix typo in memcpy() where size should be CTR_RFC3686_NONCE_SIZE. Fixes: 030f4e968741 ("crypto: nx - Fix reentrancy bugs") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-06-24	crypto: hisilicon/sec - Fix spelling mistake "fallbcak" -> "fallback"	Colin Ian King
	There is a spelling mistake in a dev_err message. Fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-06-24	crypto: sa2ul - Remove unused auth_len variable	Herbert Xu
	This patch removes the unused auth_len variable from sa_aead_dma_in_callback. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-06-24	crypto: sl3516 - fix duplicated inclusion	kernel test robot
	drivers/crypto/gemini/sl3516-ce-cipher.c: linux/io.h is included more than once. Generated by: scripts/checkincludes.pl Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: kernel test robot <lkp@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-06-24	perf/x86/intel/lbr: Zero the xstate buffer on allocation	Thomas Gleixner
	XRSTORS requires a valid xstate buffer to work correctly. XSAVES does not guarantee to write a fully valid buffer according to the SDM: "XSAVES does not write to any parts of the XSAVE header other than the XSTATE_BV and XCOMP_BV fields." XRSTORS triggers a #GP: "If bytes 63:16 of the XSAVE header are not all zero." It's dubious at best how this can work at all when the buffer is not zeroed before use. Allocate the buffers with __GFP_ZERO to prevent XRSTORS failure. Fixes: ce711ea3cab9 ("perf/x86/intel/lbr: Support XSAVES/XRSTORS for LBR context switch") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/87wnr0wo2z.ffs@nanos.tec.linutronix.de
2021-06-24	can: peak_pciefd: pucan_handle_status(): fix a potential starvation issue in ↵	Stephane Grosjean
	TX path Rather than just indicating that transmission can start, this patch requires the explicit flushing of the network TX queue when the driver is informed by the device that it can transmit, next to its configuration. In this way, if frames have already been written by the application, they will actually be transmitted. Fixes: ffd137f7043c ("can: peak/pcie_fd: remove useless code when interface starts") Link: https://lore.kernel.org/r/20210623142600.149904-1-s.grosjean@peak-system.com Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-06-24	can: j1939: j1939_sk_setsockopt(): prevent allocation of j1939 filter for ↵	Norbert Slusarek
	optlen == 0 If optval != NULL and optlen == 0 are specified for SO_J1939_FILTER in j1939_sk_setsockopt(), memdup_sockptr() will return ZERO_PTR for 0 size allocation. The new filter will be mistakenly assigned ZERO_PTR. This patch checks for optlen != 0 and filter will be assigned NULL in case of optlen == 0. Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol") Link: https://lore.kernel.org/r/20210620123842.117975-1-nslusarek@gmx.net Signed-off-by: Norbert Slusarek <nslusarek@gmx.net> Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-06-24	PM / devfreq: passive: Fix get_target_freq when not using required-opp	Chanwoo Choi
	The 86ad9a24f21e ("PM / devfreq: Add required OPPs support to passive governor") supported the required-opp property for using devfreq passive governor. But, 86ad9a24f21e has caused the problem on use-case when required-opp is not used such as exynos-bus.c devfreq driver. So that fix the get_target_freq of passive governor for supporting the case of when required-opp is not used. Fixes: 86ad9a24f21e ("PM / devfreq: Add required OPPs support to passive governor") Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
2021-06-23	cifs: missing null pointer check in cifs_mount	Steve French
	We weren't checking if tcon is null before setting dfs path, although we check for null tcon in an earlier assignment statement. Addresses-Coverity: 1476411 ("Dereference after null check") Signed-off-by: Steve French <stfrench@microsoft.com>
2021-06-23	smb3: fix possible access to uninitialized pointer to DACL	Steve French
	dacl_ptr can be null so we must check for it everywhere it is used in build_sec_desc. Addresses-Coverity: 1475598 ("Explicit null dereference") Signed-off-by: Steve French <stfrench@microsoft.com>
2021-06-23	cifs: missing null check for newinode pointer	Steve French
	in cifs_do_create we check if newinode is valid before referencing it but are missing the check in one place in fs/cifs/dir.c Addresses-Coverity: 1357292 ("Dereference after null check") Acked-by: Sachin Prabhu <sprabhu@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-06-23	Merge branch 'devlink-rate-limit-fixes'	David S. Miller
	Dmytro Linkin says: ==================== Fixes for devlink rate objects API Patch #1 fixes not decreased refcount of parent node for destroyed leaf object. Patch #2 fixes incorect eswitch mode check. Patch #3 protects list traversing with a lock. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-23	devlink: Protect rate list with lock while switching modes	Dmytro Linkin
	Devlink eswitch set command doesn't hold devlink->lock, which makes possible race condition between rate list traversing and others devlink rate KAPI calls, like devlink_rate_nodes_destroy(). Hold devlink lock while traversing the list. Fixes: a8ecb93ef03d ("devlink: Introduce rate nodes") Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-23	devlink: Remove eswitch mode check for mode set call	Dmytro Linkin
	When eswitch is disabled, querying its current mode results in error. Due to this when trying to set the eswitch mode for mlx5 devices, it fails to set the eswitch switchdev mode. Hence remove such check. Fixes: a8ecb93ef03d ("devlink: Introduce rate nodes") Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-23	devlink: Decrease refcnt of parent rate object on leaf destroy	Dmytro Linkin
	Port functions, like SFs, can be deleted by the user when its leaf rate object has parent node. In such case node refcnt won't be decreased which blocks the node from deletion later. Do simple refcnt decrease, since driver in cleanup stage. This: 1) assumes that driver took proper internal parent unset action; 2) allows to avoid nested callbacks call and deadlock. Fixes: d75559845078 ("devlink: Allow setting parent node of rate objects") Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-23	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf	David S. Miller
	Daniel Borkmann says: ==================== pull-request: bpf 2021-06-23 The following pull-request contains BPF updates for your net tree. We've added 14 non-merge commits during the last 6 day(s) which contain a total of 13 files changed, 137 insertions(+), 64 deletions(-). Note that when you merge net into net-next, there is a small merge conflict between 9f2470fbc4cb ("skmsg: Improve udp_bpf_recvmsg() accuracy") from bpf with c49661aa6f70 ("skmsg: Remove unused parameters of sk_msg_wait_data()") from net-next. Resolution is to: i) net/ipv4/udp_bpf.c: take udp_msg_wait_data() and remove err parameter from the function, ii) net/ipv4/tcp_bpf.c: take tcp_msg_wait_data() and remove err parameter from the function, iii) for net/core/skmsg.c and include/linux/skmsg.h: remove the sk_msg_wait_data() implementation and its prototype in header. The main changes are: 1) Fix BPF poke descriptor adjustments after insn rewrite, from John Fastabend. 2) Fix regression when using BPF_OBJ_GET with non-O_RDWR flags, from Maciej Żenczykowski. 3) Various bug and error handling fixes for UDP-related sock_map, from Cong Wang. 4) Fix patching of vmlinux BTF IDs with correct endianness, from Tony Ambardar. 5) Two fixes for TX descriptor validation in AF_XDP, from Magnus Karlsson. 6) Fix overflow in size calculation for bpf_map_area_alloc(), from Bui Quang Minh. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-23	ipv6: exthdrs: do not blindly use init_net	Eric Dumazet
	I see no reason why max_dst_opts_cnt and max_hbh_opts_cnt are fetched from the initial net namespace. The other sysctls (max_dst_opts_len & max_hbh_opts_len) are in fact already using the current ns. Note: it is not clear why ipv6_destopt_rcv() use two ways to get to the netns : 1) dev_net(dst->dev) Originally used to increment IPSTATS_MIB_INHDRERRORS 2) dev_net(skb->dev) Tom used this variant in his patch. Maybe this calls to use ipv6_skb_net() instead ? Fixes: 47d3d7ac656a ("ipv6: Implement limits on Hop-by-Hop and Destination options") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <tom@quantonium.net> Cc: Coco Li <lixiaoyan@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-23	virtio_net: Use virtio_find_vqs_ctx() helper	Xianting Tian
	virtio_find_vqs_ctx() is defined but never be called currently, it is the right place to use it. Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-23	net/tls: Remove the __TLS_DEC_STATS() macro.	Kuniyuki Iwashima
	The commit d26b698dd3cd ("net/tls: add skeleton of MIB statistics") introduced __TLS_DEC_STATS(), but it is not used and __SNMP_DEC_STATS() is not defined also. Let's remove it. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-23	net: bcmgenet: Fix attaching to PYH failed on RPi 4B	Jian-Hong Pan
	The Broadcom UniMAC MDIO bus from mdio-bcm-unimac module comes too late. So, GENET cannot find the ethernet PHY on UniMAC MDIO bus. This leads GENET fail to attach the PHY as following log: bcmgenet fd580000.ethernet: GENET 5.0 EPHY: 0x0000 ... could not attach to PHY bcmgenet fd580000.ethernet eth0: failed to connect to PHY uart-pl011 fe201000.serial: no DMA platform data libphy: bcmgenet MII bus: probed ... unimac-mdio unimac-mdio.-19: Broadcom UniMAC MDIO bus This patch adds the soft dependency to load mdio-bcm-unimac module before genet module to avoid the issue. Fixes: 9a4e79697009 ("net: bcmgenet: utilize generic Broadcom UniMAC MDIO controller driver") Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=213485 Signed-off-by: Jian-Hong Pan <jhp@endlessos.org> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-23	bonding: allow nesting of bonding device	Di Zhu
	The commit 3c9ef511b9fa ("bonding: avoid adding slave device with IFF_MASTER flag") fix a crash when add slave device with IFF_MASTER, but it rejects the scenario of nested bonding device. As Eric Dumazet described: since there indeed is a usage scenario about nesting bonding, we should not break it. So we add a new judgment condition to allow nesting of bonding device. Fixes: 3c9ef511b9fa ("bonding: avoid adding slave device with IFF_MASTER flag") Suggested-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: Di Zhu <zhudi21@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-23	tcp: Add stats for socket migration.	Kuniyuki Iwashima
	This commit adds two stats for the socket migration feature to evaluate the effectiveness: LINUX_MIB_TCPMIGRATEREQ(SUCCESS\|FAILURE). If the migration fails because of the own_req race in receiving ACK and sending SYN+ACK paths, we do not increment the failure stat. Then another CPU is responsible for the req. Link: https://lore.kernel.org/bpf/CAK6E8=cgFKuGecTzSCSQ8z3YJ_163C0uwO9yRvfDSE7vOe9mJA@mail.gmail.com/ Suggested-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-23	ibmveth: Set CHECKSUM_PARTIAL if NULL TCP CSUM.	David Wilder
	TCP checksums on received packets may be set to NULL by the sender if CSO is enabled. The hypervisor flags these packets as check-sum-ok and the skb is then flagged CHECKSUM_UNNECESSARY. If these packets are then forwarded the sender will not request CSO due to the CHECKSUM_UNNECESSARY flag. The result is a TCP packet sent with a bad checksum. This change sets up CHECKSUM_PARTIAL on these packets causing the sender to correctly request CSUM offload. Signed-off-by: David Wilder <dwilder@us.ibm.com> Reviewed-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com> Tested-by: Cristobal Forno <cforno12@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-23	Merge tag 'mlx5-net-next-2021-06-22' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-net-next-2021-06-22 1) Various minor cleanups and fixes from net-next branch 2) Optimize mlx5 feature check on tx and a fix to allow Vxlan with Ipsec offloads ==================== Signed-off-by: David S. Miller <davem@davemloft.net>