git.armlinux.org.uk/linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2021-04-01	drm/msm: Set drvdata to NULL when msm_drm_init() fails	Stephen Boyd
	We should set the platform device's driver data to NULL here so that code doesn't assume the struct drm_device pointer is valid when it could have been destroyed. The lifetime of this pointer is managed by a kref but when msm_drm_init() fails we call drm_dev_put() on the pointer which will free the pointer's memory. This driver uses the component model, so there's sort of two "probes" in this file, one for the platform device i.e. msm_pdev_probe() and one for the component i.e. msm_drm_bind(). The msm_drm_bind() code is using the platform device's driver data to store struct drm_device so the two functions are intertwined. This relationship becomes a problem for msm_pdev_shutdown() when it tests the NULL-ness of the pointer to see if it should call drm_atomic_helper_shutdown(). The NULL test is a proxy check for if the pointer has been freed by kref_put(). If the drm_device has been destroyed, then we shouldn't call the shutdown helper, and we know that is the case if msm_drm_init() failed, therefore set the driver data to NULL so that this pointer liveness is tracked properly. Fixes: 9d5cbf5fe46e ("drm/msm: add shutdown support for display platform_driver") Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Cc: Fabio Estevam <festevam@gmail.com> Cc: Krishna Manikandan <mkrishn@codeaurora.org> Signed-off-by: Stephen Boyd <swboyd@chromium.org> Message-Id: <20210325212822.3663144-1-swboyd@chromium.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-04-01	Merge tag 'drm-fixes-2021-04-02' of git://anongit.freedesktop.org/drm/drm	Linus Torvalds
	Pull drm fixes from Dave Airlie: "Things have settled down in time for Easter, a random smattering of small fixes across a few drivers. I'm guessing though there might be some i915 and misc fixes out there I haven't gotten yet, but since today is a public holiday here, I'm sending this early so I can have the day off, I'll see if more requests come in and decide what to do with them later. amdgpu: - Polaris idle power fix - VM fix - Vangogh S3 fix - Fixes for non-4K page sizes amdkfd: - dqm fence memory corruption fix tegra: - lockdep warning fix - runtine PM reference fix - display controller fix - PLL Fix imx: - memory leak in error path fix - LDB driver channel registration fix - oob array warning in LDB driver exynos - unused header file removal" * tag 'drm-fixes-2021-04-02' of git://anongit.freedesktop.org/drm/drm: drm/amdgpu: check alignment on CPU page for bo map drm/amdgpu: Set a suitable dev_info.gart_page_size drm/amdgpu/vangogh: don't check for dpm in is_dpm_running when in suspend drm/amdkfd: dqm fence memory corruption drm/tegra: sor: Grab runtime PM reference across reset drm/tegra: dc: Restore coupling of display controllers gpu: host1x: Use different lock classes for each client drm/tegra: dc: Don't set PLL clock to 0Hz drm/amdgpu: fix offset calculation in amdgpu_vm_bo_clear_mappings() drm/amd/pm: no need to force MCLK to highest when no display connected drm/exynos/decon5433: Remove the unused include statements drm/imx: imx-ldb: fix out of bounds array access warning drm/imx: imx-ldb: Register LDB channel1 when it is the only channel to be used drm/imx: fix memory leak when fails to init
2021-04-02	Merge tag 'imx-drm-fixes-2021-04-01' of ↵	Dave Airlie
	git://git.pengutronix.de/git/pza/linux into drm-fixes drm/imx: imx-drm-core and imx-ldb fixes Fix a memory leak in an error path during DRM device initialization, fix the LDB driver to register channel 1 even if channel 0 is unused, and fix an out of bounds array access warning in the LDB driver. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Philipp Zabel <p.zabel@pengutronix.de> Link: https://patchwork.freedesktop.org/patch/msgid/20210401092235.GA13586@pengutronix.de
2021-04-02	Merge tag 'drm/tegra/for-5.12-rc6' of ↵	Dave Airlie
	ssh://git.freedesktop.org/git/tegra/linux into drm-fixes drm/tegra: Fixes for v5.12-rc6 This contains a couple of fixes for various issues such as lockdep warnings, runtime PM references, coupled display controllers and misconfigured PLLs. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thierry Reding <thierry.reding@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210401163352.3348296-1-thierry.reding@gmail.com
2021-04-01	RDMA/rtrs-clt: Close rtrs client conn before destroying rtrs clt session files	Md Haris Iqbal
	KASAN detected the following BUG: BUG: KASAN: use-after-free in rtrs_clt_update_wc_stats+0x41/0x100 [rtrs_client] Read of size 8 at addr ffff88bf2fb4adc0 by task swapper/0/0 CPU: 0 PID: 0 Comm: swapper/0 Tainted: G O 5.4.84-pserver #5.4.84-1+feature+linux+5.4.y+dbg+20201216.1319+b6b887b~deb10 Hardware name: Supermicro H8QG6/H8QG6, BIOS 3.00 09/04/2012 Call Trace: <IRQ> dump_stack+0x96/0xe0 print_address_description.constprop.4+0x1f/0x300 ? irq_work_claim+0x2e/0x50 __kasan_report.cold.8+0x78/0x92 ? rtrs_clt_update_wc_stats+0x41/0x100 [rtrs_client] kasan_report+0x10/0x20 rtrs_clt_update_wc_stats+0x41/0x100 [rtrs_client] rtrs_clt_rdma_done+0xb1/0x760 [rtrs_client] ? lockdep_hardirqs_on+0x1a8/0x290 ? process_io_rsp+0xb0/0xb0 [rtrs_client] ? mlx4_ib_destroy_cq+0x100/0x100 [mlx4_ib] ? add_interrupt_randomness+0x1a2/0x340 __ib_process_cq+0x97/0x100 [ib_core] ib_poll_handler+0x41/0xb0 [ib_core] irq_poll_softirq+0xe0/0x260 __do_softirq+0x127/0x672 irq_exit+0xd1/0xe0 do_IRQ+0xa3/0x1d0 common_interrupt+0xf/0xf </IRQ> RIP: 0010:cpuidle_enter_state+0xea/0x780 Code: 31 ff e8 99 48 47 ff 80 7c 24 08 00 74 12 9c 58 f6 c4 02 0f 85 53 05 00 00 31 ff e8 b0 6f 53 ff e8 ab 4f 5e ff fb 8b 44 24 04 <85> c0 0f 89 f3 01 00 00 48 8d 7b 14 e8 65 1e 77 ff c7 43 14 00 00 RSP: 0018:ffffffffab007d58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffca RAX: 0000000000000002 RBX: ffff88b803d69800 RCX: ffffffffa91a8298 RDX: 0000000000000007 RSI: dffffc0000000000 RDI: ffffffffab021414 RBP: ffffffffab6329e0 R08: 0000000000000002 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002 R13: 000000bf39d82466 R14: ffffffffab632aa0 R15: ffffffffab632ae0 ? lockdep_hardirqs_on+0x1a8/0x290 ? cpuidle_enter_state+0xe5/0x780 cpuidle_enter+0x3c/0x60 do_idle+0x2fb/0x390 ? arch_cpu_idle_exit+0x40/0x40 ? schedule+0x94/0x120 cpu_startup_entry+0x19/0x1b start_kernel+0x5da/0x61b ? thread_stack_cache_init+0x6/0x6 ? load_ucode_amd_bsp+0x6f/0xc4 ? init_amd_microcode+0xa6/0xa6 ? x86_family+0x5/0x20 ? load_ucode_bsp+0x182/0x1fd secondary_startup_64+0xa4/0xb0 Allocated by task 5730: save_stack+0x19/0x80 __kasan_kmalloc.constprop.9+0xc1/0xd0 kmem_cache_alloc_trace+0x15b/0x350 alloc_sess+0xf4/0x570 [rtrs_client] rtrs_clt_open+0x3b4/0x780 [rtrs_client] find_and_get_or_create_sess+0x649/0x9d0 [rnbd_client] rnbd_clt_map_device+0xd7/0xf50 [rnbd_client] rnbd_clt_map_device_store+0x4ee/0x970 [rnbd_client] kernfs_fop_write+0x141/0x240 vfs_write+0xf3/0x280 ksys_write+0xba/0x150 do_syscall_64+0x68/0x270 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 5822: save_stack+0x19/0x80 __kasan_slab_free+0x125/0x170 kfree+0xe7/0x3f0 kobject_put+0xd3/0x240 rtrs_clt_destroy_sess_files+0x3f/0x60 [rtrs_client] rtrs_clt_close+0x3c/0x80 [rtrs_client] close_rtrs+0x45/0x80 [rnbd_client] rnbd_client_exit+0x10f/0x2bd [rnbd_client] __x64_sys_delete_module+0x27b/0x340 do_syscall_64+0x68/0x270 entry_SYSCALL_64_after_hwframe+0x49/0xbe When rtrs_clt_close is triggered, it iterates over all the present rtrs_clt_sess and triggers close on them. However, the call to rtrs_clt_destroy_sess_files is done before the rtrs_clt_close_conns. This is incorrect since during the initialization phase we allocate rtrs_clt_sess first, and then we go ahead and create rtrs_clt_con for it. If we free the rtrs_clt_sess structure before closing the rtrs_clt_con, it may so happen that an inflight IO completion would trigger the function rtrs_clt_rdma_done, which would lead to the above UAF case. Hence close the rtrs_clt_con connections first, and then trigger the destruction of session files. Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality") Link: https://lore.kernel.org/r/20210325153308.1214057-12-gi-oh.kim@ionos.com Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01	i40e: Fix display statistics for veb_tc	Eryk Rybak
	If veb-stats was enabled, the ethtool stats triggered a warning due to invalid size: 'unexpected stat size for veb.tc_%u_tx_packets'. This was due to an incorrect structure definition for the statistics. Structures and functions have been improved in line with requirements for the presentation of statistics, in particular for the functions: 'i40e_add_ethtool_stats' and 'i40e_add_stat_strings'. Fixes: 1510ae0be2a4 ("i40e: convert VEB TC stats to use an i40e_stats array") Signed-off-by: Eryk Rybak <eryk.roch.rybak@intel.com> Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-01	i40e: fix receiving of single packets in xsk zero-copy mode	Magnus Karlsson
	Fix so that single packets are received immediately instead of in batches of 8. If you sent 1 pps to a system, you received 8 packets every 8 seconds instead of 1 packet every second. The problem behind this was that the work_done reporting from the Tx part of the driver was broken. The work_done reporting in i40e controls not only the reporting back to the napi logic but also the setting of the interrupt throttling logic. When Tx or Rx reports that it has more to do, interrupts are throttled or coalesced and when they both report that they are done, interrupts are armed right away. If the wrong work_done value is returned, the logic will start to throttle interrupts in a situation where it should have just enabled them. This leads to the undesired batching behavior seen in user-space. Fix this by returning the correct boolean value from the Tx xsk zero-copy path. Return true if there is nothing to do or if we got fewer packets to process than we asked for. Return false if we got as many packets as the budget since there might be more packets we can process. Fixes: 3106c580fb7c ("i40e: Use batched xsk Tx interfaces to increase performance") Reported-by: Sreedevi Joshi <sreedevi.joshi@intel.com> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-01	i40e: Fix inconsistent indenting	Arkadiusz Kubalewski
	Fixed new static analysis findings: "warn: inconsistent indenting" - introduced lately, reported with lkp and smatch build. Fixes: 4b208eaa8078 ("i40e: Add init and default config of software based DCB") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-01	ARM: mvebu: avoid clang -Wtautological-constant warning	Arnd Bergmann
	Clang warns about the comparison when using a 32-bit phys_addr_t: drivers/bus/mvebu-mbus.c:621:17: error: result of comparison of constant 4294967296 with expression of type 'phys_addr_t' (aka 'unsigned int') is always false [-Werror,-Wtautological-constant-out-of-range-compare] if (reg_start >= 0x100000000ULL) Add a cast to shut up the warning. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20210323131952.2835509-1-arnd@kernel.org' Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-04-01	soc/fsl: qbman: fix conflicting alignment attributes	Arnd Bergmann
	When building with W=1, gcc points out that the __packed attribute on struct qm_eqcr_entry conflicts with the 8-byte alignment attribute on struct qm_fd inside it: drivers/soc/fsl/qbman/qman.c:189:1: error: alignment 1 of 'struct qm_eqcr_entry' is less than 8 [-Werror=packed-not-aligned] I assume that the alignment attribute is the correct one, and that qm_eqcr_entry cannot actually be unaligned in memory, so add the same alignment on the outer struct. Fixes: c535e923bb97 ("soc/fsl: Introduce DPAA 1.x QMan device driver") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20210323131530.2619900-1-arnd@kernel.org' Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-04-01	null_blk: fix command timeout completion handling	Damien Le Moal
	Memory backed or zoned null block devices may generate actual request timeout errors due to the submission path being blocked on memory allocation or zone locking. Unlike fake timeouts or injected timeouts, the request submission path will call blk_mq_complete_request() or blk_mq_end_request() for these real timeout errors, causing a double completion and use after free situation as the block layer timeout handler executes blk_mq_rq_timed_out() and __blk_mq_free_request() in blk_mq_check_expired(). This problem often triggers a NULL pointer dereference such as: BUG: kernel NULL pointer dereference, address: 0000000000000050 RIP: 0010:blk_mq_sched_mark_restart_hctx+0x5/0x20 ... Call Trace: dd_finish_request+0x56/0x80 blk_mq_free_request+0x37/0x130 null_handle_cmd+0xbf/0x250 [null_blk] ? null_queue_rq+0x67/0xd0 [null_blk] blk_mq_dispatch_rq_list+0x122/0x850 __blk_mq_do_dispatch_sched+0xbb/0x2c0 __blk_mq_sched_dispatch_requests+0x13d/0x190 blk_mq_sched_dispatch_requests+0x30/0x60 __blk_mq_run_hw_queue+0x49/0x90 process_one_work+0x26c/0x580 worker_thread+0x55/0x3c0 ? process_one_work+0x580/0x580 kthread+0x134/0x150 ? kthread_create_worker_on_cpu+0x70/0x70 ret_from_fork+0x1f/0x30 This problem very often triggers when running the full btrfs xfstests on a memory-backed zoned null block device in a VM with limited amount of memory. Avoid this by executing blk_mq_complete_request() in null_timeout_rq() only for commands that are marked for a fake timeout completion using the fake_timeout boolean in struct null_cmd. For timeout errors injected through debugfs, the timeout handler will execute blk_mq_complete_request()i as before. This is safe as the submission path does not execute complete requests in this case. In null_timeout_rq(), also make sure to set the command error field to BLK_STS_TIMEOUT and to propagate this error through to the request completion. Reported-by: Johannes Thumshirn <Johannes.Thumshirn@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Tested-by: Johannes Thumshirn <Johannes.Thumshirn@wdc.com> Reviewed-by: Johannes Thumshirn <Johannes.Thumshirn@wdc.com> Link: https://lore.kernel.org/r/20210331225244.126426-1-damien.lemoal@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-01	ACPI: processor: Fix CPU0 wakeup in acpi_idle_play_dead()	Vitaly Kuznetsov
	Commit 496121c02127 ("ACPI: processor: idle: Allow probing on platforms with one ACPI C-state") broke CPU0 hotplug on certain systems, e.g. I'm observing the following on AWS Nitro (e.g r5b.xlarge but other instance types are affected as well): # echo 0 > /sys/devices/system/cpu/cpu0/online # echo 1 > /sys/devices/system/cpu/cpu0/online <10 seconds delay> -bash: echo: write error: Input/output error In fact, the above mentioned commit only revealed the problem and did not introduce it. On x86, to wakeup CPU an NMI is being used and hlt_play_dead()/mwait_play_dead() loops are prepared to handle it: /* * If NMI wants to wake up CPU0, start CPU0. */ if (wakeup_cpu0()) start_cpu0(); cpuidle_play_dead() -> acpi_idle_play_dead() (which is now being called on systems where it wasn't called before the above mentioned commit) serves the same purpose but it doesn't have a path for CPU0. What happens now on wakeup is: - NMI is sent to CPU0 - wakeup_cpu0_nmi() works as expected - we get back to while (1) loop in acpi_idle_play_dead() - safe_halt() puts CPU0 to sleep again. The straightforward/minimal fix is add the special handling for CPU0 on x86 and that's what the patch is doing. Fixes: 496121c02127 ("ACPI: processor: idle: Allow probing on platforms with one ACPI C-state") Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: 5.10+ <stable@vger.kernel.org> # 5.10+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-04-01	Merge tag 'amd-drm-fixes-5.12-2021-03-31' of ↵	Dave Airlie
	https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-5.12-2021-03-31: amdgpu: - Polaris idle power fix - VM fix - Vangogh S3 fix - Fixes for non-4K page sizes amdkfd: - dqm fence memory corruption fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210401020057.17831-1-alexander.deucher@amd.com
2021-04-01	Merge tag 'exynos-drm-fixes-for-v5.12-rc6' of ↵	Dave Airlie
	git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-fixes Just one cleanup which drops of_gpio.h inclusion. - This header file isn't used anymore so drop it. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Inki Dae <inki.dae@samsung.com> Link: https://patchwork.freedesktop.org/patch/msgid/1617016858-14081-1-git-send-email-inki.dae@samsung.com
2021-03-31	drm/amdgpu: check alignment on CPU page for bo map	Xℹ Ruoyao
	The page table of AMDGPU requires an alignment to CPU page so we should check ioctl parameters for it. Return -EINVAL if some parameter is unaligned to CPU page, instead of corrupt the page table sliently. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Xi Ruoyao <xry111@mengyan1223.wang> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-03-31	drm/amdgpu: Set a suitable dev_info.gart_page_size	Huacai Chen
	In Mesa, dev_info.gart_page_size is used for alignment and it was set to AMDGPU_GPU_PAGE_SIZE(4KB). However, the page table of AMDGPU driver requires an alignment on CPU pages. So, for non-4KB page system, gart_page_size should be max_t(u32, PAGE_SIZE, AMDGPU_GPU_PAGE_SIZE). Signed-off-by: Rui Wang <wangr@lemote.com> Signed-off-by: Huacai Chen <chenhc@lemote.com> Link: https://github.com/loongson-community/linux-stable/commit/caa9c0a1 [Xi: rebased for drm-next, use max_t for checkpatch, and reworded commit message.] Signed-off-by: Xi Ruoyao <xry111@mengyan1223.wang> BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1549 Tested-by: Dan Horák <dan@danny.cz> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-03-31	drm/amdgpu/vangogh: don't check for dpm in is_dpm_running when in suspend	Alex Deucher
	Do the same thing we do for Renoir. We can check, but since the sbios has started DPM, it will always return true which causes the driver to skip some of the SMU init when it shouldn't. Reviewed-by: Zhan Liu <zhan.liu@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-03-31	drm/amdkfd: dqm fence memory corruption	Qu Huang
	Amdgpu driver uses 4-byte data type as DQM fence memory, and transmits GPU address of fence memory to microcode through query status PM4 message. However, query status PM4 message definition and microcode processing are all processed according to 8 bytes. Fence memory only allocates 4 bytes of memory, but microcode does write 8 bytes of memory, so there is a memory corruption. Changes since v1: * Change dqm->fence_addr as a u64 pointer to fix this issue, also fix up query_status and amdkfd_fence_wait_timeout function uses 64 bit fence value to make them consistent. Signed-off-by: Qu Huang <jinsdb@126.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-03-31	net/mlx5e: Guarantee room for XSK wakeup NOP on async ICOSQ	Tariq Toukan
	XSK wakeup flow triggers an IRQ by posting a NOP WQE and hitting the doorbell on the async ICOSQ. It maintains its state so that it doesn't issue another NOP WQE if it has an outstanding one already. For this flow to work properly, the NOP post must not fail. Make sure to reserve room for the NOP WQE in all WQE posts to the async ICOSQ. Fixes: 8d94b590f1e4 ("net/mlx5e: Turn XSK ICOSQ into a general asynchronous one") Fixes: 1182f3659357 ("net/mlx5e: kTLS, Add kTLS RX HW offload support") Fixes: 0419d8c9d8f8 ("net/mlx5e: kTLS, Add kTLS RX resync support") Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-03-31	net/mlx5e: Consider geneve_opts for encap contexts	Dima Chumak
	Current algorithm for encap keys is legacy from initial vxlan implementation and doesn't take into account all possible fields of a tunnel. For example, for a Geneve tunnel, which may have additional TLV options, they are ignored when comparing encap keys and a rule can be attached to an incorrect encap entry. Fix that by introducing encap_info_equal() operation in struct mlx5e_tc_tunnel. Geneve tunnel type uses custom implementation, which extends generic algorithm and considers options if they are set. Fixes: 7f1a546e3222 ("net/mlx5e: Consider tunnel type for encap contexts") Signed-off-by: Dima Chumak <dchumak@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-03-31	net/mlx5: Don't request more than supported EQs	Daniel Jurgens
	Calculating the number of compeltion EQs based on the number of available IRQ vectors doesn't work now that all async EQs share one IRQ. Thus the max number of EQs can be exceeded on systems with more than approximately 256 CPUs. Take this into account when calculating the number of available completion EQs. Fixes: 81bfa206032a ("net/mlx5: Use a single IRQ for all async EQs") Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-03-31	net/mlx5e: kTLS, Fix RX counters atomicity	Tariq Toukan
	Some TLS RX counters increment per socket/connection, and are not protected against parallel modifications from several cores. Switch them to atomic counters by taking them out of the RQ stats into the global atomic TLS stats. In this patch, we touch 'rx_tls_ctx/del' that count the number of device-offloaded RX TLS connections added/deleted. These counters are updated in the add/del callbacks, out of the fast data-path. This change is not needed for counters that increment only in NAPI context, as they are protected by the NAPI mechanism. Keep them as tls_* counters under 'struct mlx5e_rq_stats'. Fixes: 76c1e1ac2aae ("net/mlx5e: kTLS, Add kTLS RX stats") Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-03-31	net/mlx5e: kTLS, Fix TX counters atomicity	Tariq Toukan
	Some TLS TX counters increment per socket/connection, and are not protected against parallel modifications from several cores. Switch them to atomic counters by taking them out of the SQ stats into the global atomic TLS stats. In this patch, we touch a single counter 'tx_tls_ctx' that counts the number of device-offloaded TX TLS connections added. Now that this counter can be increased without the for having the SQ context in hand, move it to the mlx5e_ktls_add_tx() callback where it really belongs, out of the fast data-path. This change is not needed for counters that increment only in NAPI context or under the TX lock, as they are already protected. Keep them as tls_* counters under 'struct mlx5e_sq_stats'. Fixes: d2ead1f360e8 ("net/mlx5e: Add kTLS TX HW offload support") Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-03-31	net/mlx5: E-switch, Create vport miss group only if src rewrite is supported	Maor Dickman
	Create send to vport miss group was added in order to support traffic recirculation to root table with metadata source rewrite. This group is created also in case source rewrite isn't supported. Fixed by creating send to vport miss group only if source rewrite is supported by FW. Fixes: 8e404fefa58b ("net/mlx5e: Match recirculated packet miss in slow table using reg_c1") Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-03-31	net/mlx5e: Fix ethtool indication of connector type	Aya Levin
	Use connector_type read from PTYS register when it's valid, based on corresponding capability bit. Fixes: 5b4793f81745 ("net/mlx5e: Add support for reading connector type from PTYS") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-03-31	net/mlx5: Delete auxiliary bus driver eth-rep first	Maor Dickman
	Delete auxiliary bus drivers flow deletes the eth driver first and then the eth-reps driver but eth-reps devices resources are depend on eth device. Fixed by changing the delete order of auxiliary bus drivers to delete the eth-rep driver first and after it the eth driver. Fixes: 601c10c89cbb ("net/mlx5: Delete custom device management logic") Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-03-31	net/mlx5e: Fix mapping of ct_label zero	Ariel Levkovich
	ct_label 0 is a default label each flow has and therefore there can be rules that match on ct_label=0 without a prior rule that set the ct_label to this value. The ct_label value is not used directly in the HW rules and instead it is mapped to some id within a defined range and this id is used to set and match the metadata register which carries the ct_label. If we have a rule that matches on ct_label=0, the hw rule will perform matching on a value that is != 0 because of the mapping from label to id. Since the metadata register default value is 0 and it was never set before to anything else by an action that sets the ct_label, there will always be a mismatch between that register and the value in the rule. To support such rule, a forced mapping of ct_label 0 to id=0 is done so that it will match the metadata register default value of 0. Fixes: 54b154ecfb8c ("net/mlx5e: CT: Map 128 bits labels to 32 bit map ID") Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-03-31	gpio: sysfs: Obey valid_mask	Matti Vaittinen
	Do not allow exporting GPIOs which are set invalid by the driver's valid mask. Fixes: 726cb3ba4969 ("gpiolib: Support 'gpio-reserved-ranges' property") Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
2021-03-31	Merge tag 'pinctrl-v5.12-2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: "Some overly ripe fixes for the v5.12 kernel. I should have sent earlier but had my head stuck in GDB. All are driver fixes: - Fix up some Intel GPIO base calculations. - Fix a register offset in the Microchip driver. - Fix suspend/resume bug in the Rockchip driver. - Default pull up strength in the Qualcomm LPASS driver. - Fix two pingroup offsets in the Qualcomm SC7280 driver. - Fix SDC1 register offset in the Qualcomm SC7280 driver. - Fix a nasty string concatenation in the Qualcomm SDX55 driver. - Check the REVID register to see if the device is real or virtualized during virtualization in the Intel driver" * tag 'pinctrl-v5.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: intel: check REVID register value for device presence pinctrl: qcom: fix unintentional string concatenation pinctrl: qcom: sc7280: Fix SDC1_RCLK configurations pinctrl: qcom: sc7280: Fix SDC_QDSD_PINGROUP and UFS_RESET offsets pinctrl: qcom: lpass lpi: use default pullup/strength values pinctrl: rockchip: fix restore error in resume pinctrl: microchip-sgpio: Fix wrong register offset for IRQ trigger pinctrl: intel: Show the GPIO base calculation explicitly
2021-03-31	Merge tag 'mute-led-rework' of ↵	Mark Brown
	https://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound into asoc-5.13 ALSA: control - add generic LED API This patchset tries to resolve the diversity in the audio LED control among the ALSA drivers. A new control layer registration is introduced which allows to run additional operations on top of the elementary ALSA sound controls. A new control access group (three bits in the access flags) was introduced to carry the LED group information for the sound controls. The low-level sound drivers can just mark those controls using this access group. This information is not exported to the user space, but user space can manage the LED sound control associations through sysfs (last patch) per Mark's request. It makes things fully configurable in the kernel and user space (UCM). The actual state ('route') evaluation is really easy (the minimal value check for all channels / controls / cards). If there's more complicated logic for a given hardware, the card driver may eventually export a new read-only sound control for the LED group and do the logic itself. The new LED trigger control code is completely separated and possibly optional (there's no symbol dependency). The full code separation allows eventually to move this LED trigger control to the user space in future. Actually it replaces the already present functionality in the kernel space (HDA drivers) and allows a quick adoption for the recent hardware (ASoC codecs including SoundWire). snd_ctl_led 24576 0 The sound driver implementation is really easy: 1) call snd_ctl_led_request() when control LED layer should be automatically activated / it calls module_request("snd-ctl-led") on demand / 2) mark all related kcontrols with SNDRV_CTL_ELEM_ACCESS_SPK_LED or SNDRV_CTL_ELEM_ACCESS_MIC_LED Link: https://lore.kernel.org/r/20210317172945.842280-1-perex@perex.cz Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-03-31	i2c: hix5hd2: use the correct HiSilicon copyright	Hao Fang
	s/Hisilicon/HiSilicon/g. It should use capital S, according to https://www.hisilicon.com/en/terms-of-use. Signed-off-by: Hao Fang <fanghao11@huawei.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2021-03-31	i2c: stm32f4: Mundane typo fix	Bhaskar Chowdhury
	s/postion/position/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Alain Volmat <alain.volmat@foss.st.com> Reviewed-by: Pierre-Yves MORDRET <pierre-yves.mordret@foss.st.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2021-03-31	I2C: JZ4780: Fix bug for Ingenic X1000.	周琰杰 (Zhou Yanjie)
	Only send "X1000_I2C_DC_STOP" when last byte, or it will cause error when I2C write operation which should look like this: device_addr + w, reg_addr, data; But without this patch, it looks like this: device_addr + w, reg_addr, device_addr + w, data; Fixes: 21575a7a8d4c ("I2C: JZ4780: Add support for the X1000.") Reported-by: 杨文龙 (Yang Wenlong) <ywltyut@sina.cn> Tested-by: 杨文龙 (Yang Wenlong) <ywltyut@sina.cn> Signed-off-by: 周琰杰 (Zhou Yanjie) <zhouyanjie@wanyeetech.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2021-03-30	net: phy: broadcom: Only advertise EEE for supported modes	Florian Fainelli
	We should not be advertising EEE for modes that we do not support, correct that oversight by looking at the PHY device supported linkmodes. Fixes: 99cec8a4dda2 ("net: phy: broadcom: Allow enabling or disabling of EEE") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-30	nfp: flower: ignore duplicate merge hints from FW	Yinjun Zhang
	A merge hint message needs some time to process before the merged flow actually reaches the firmware, during which we may get duplicate merge hints if there're more than one packet that hit the pre-merged flow. And processing duplicate merge hints will cost extra host_ctx's which are a limited resource. Avoid the duplicate merge by using hash table to store the sub_flows to be merged. Fixes: 8af56f40e53b ("nfp: flower: offload merge flows") Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-30	ACPI: scan: Fix _STA getting called on devices with unmet dependencies	Hans de Goede
	Commit 71da201f38df ("ACPI: scan: Defer enumeration of devices with _DEP lists") dropped the following 2 lines from acpi_init_device_object(): /* Assume there are unmet deps until acpi_device_dep_initialize() runs */ device->dep_unmet = 1; Leaving the initial value of dep_unmet at the 0 from the kzalloc(). This causes the acpi_bus_get_status() call in acpi_add_single_object() to actually call _STA, even though there maybe unmet deps, leading to errors like these: [ 0.123579] ACPI Error: No handler for Region [ECRM] (00000000ba9edc4c) [GenericSerialBus] (20170831/evregion-166) [ 0.123601] ACPI Error: Region GenericSerialBus (ID=9) has no handler (20170831/exfldio-299) [ 0.123618] ACPI Error: Method parse/execution failed \_SB.I2C1.BAT1._STA, AE_NOT_EXIST (20170831/psparse-550) Fix this by re-adding the dep_unmet = 1 initialization to acpi_init_device_object() and modifying acpi_bus_check_add() to make sure that dep_unmet always gets setup there, overriding the initial 1 value. This re-fixes the issue initially fixed by commit 63347db0affa ("ACPI / scan: Use acpi_bus_get_status() to initialize ACPI_TYPE_DEVICE devs"), which introduced the removed "device->dep_unmet = 1;" statement. This issue was noticed; and the fix tested on a Dell Venue 10 Pro 5055. Fixes: 71da201f38df ("ACPI: scan: Defer enumeration of devices with _DEP lists") Suggested-by: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Cc: 5.11+ <stable@vger.kernel.org> # 5.11+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-03-30	drm/tegra: sor: Grab runtime PM reference across reset	Thierry Reding
	The SOR resets are exclusively shared with the SOR power domain. This means that exclusive access can only be granted temporarily and in order for that to work, a rigorous sequence must be observed. To ensure that a single consumer gets exclusive access to a reset, each consumer must implement a rigorous protocol using the reset_control_acquire() and reset_control_release() functions. However, these functions alone don't provide any guarantees at the system level. Drivers need to ensure that the only a single consumer has access to the reset at the same time. In order for the SOR to be able to exclusively access its reset, it must therefore ensure that the SOR power domain is not powered off by holding on to a runtime PM reference to that power domain across the reset assert/deassert operation. This used to work fine by accident, but was revealed when recently more devices started to rely on the SOR power domain. Fixes: 11c632e1cfd3 ("drm/tegra: sor: Implement acquire/release for reset") Reported-by: Jonathan Hunter <jonathanh@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2021-03-30	drm/tegra: dc: Restore coupling of display controllers	Thierry Reding
	Coupling of display controllers used to rely on runtime PM to take the companion controller out of reset. Commit fd67e9c6ed5a ("drm/tegra: Do not implement runtime PM") accidentally broke this when runtime PM was removed. Restore this functionality by reusing the hierarchical host1x client suspend/resume infrastructure that's similar to runtime PM and which perfectly fits this use-case. Fixes: fd67e9c6ed5a ("drm/tegra: Do not implement runtime PM") Reported-by: Dmitry Osipenko <digetx@gmail.com> Reported-by: Paul Fertser <fercerpav@gmail.com> Tested-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2021-03-30	gpu: host1x: Use different lock classes for each client	Mikko Perttunen
	To avoid false lockdep warnings, give each client lock a different lock class, passed from the initialization site by macro. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2021-03-30	drm/tegra: dc: Don't set PLL clock to 0Hz	Dmitry Osipenko
	RGB output doesn't allow to change parent clock rate of the display and PCLK rate is set to 0Hz in this case. The tegra_dc_commit_state() shall not set the display clock to 0Hz since this change propagates to the parent clock. The DISP clock is defined as a NODIV clock by the tegra-clk driver and all NODIV clocks use the CLK_SET_RATE_PARENT flag. This bug stayed unnoticed because by default PLLP is used as the parent clock for the display controller and PLLP silently skips the erroneous 0Hz rate changes because it always has active child clocks that don't permit rate changes. The PLLP isn't acceptable for some devices that we want to upstream (like Samsung Galaxy Tab and ASUS TF700T) due to a display panel clock rate requirements that can't be fulfilled by using PLLP and then the bug pops up in this case since parent clock is set to 0Hz, killing the display output. Don't touch DC clock if pclk=0 in order to fix the problem. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2021-03-30	Merge tag 'vfio-v5.12-rc6' of git://github.com/awilliam/linux-vfio	Linus Torvalds
	Pull VFIO fixes from Alex Williamson: - Fix pfnmap batch carryover (Daniel Jordan) - Fix nvlink Kconfig dependency (Jason Gunthorpe) * tag 'vfio-v5.12-rc6' of git://github.com/awilliam/linux-vfio: vfio/nvlink: Add missing SPAPR_TCE_IOMMU depends vfio/type1: Empty batch for pfnmap pages
2021-03-30	Merge tag 'for-linus-5.12b-rc6-tag' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fix from Juergen Gross: "One Xen related security fix (XSA-371)" * tag 'for-linus-5.12b-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen-blkback: don't leak persistent grants from xen_blkbk_map()
2021-03-30	thunderbolt: Fix off by one in tb_port_find_retimer()	Dan Carpenter
	This array uses 1-based indexing so it corrupts memory one element beyond of the array. Fix it by making the array one element larger. Fixes: dacb12877d92 ("thunderbolt: Add support for on-board retimers") Cc: stable@vger.kernel.org Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2021-03-30	thunderbolt: Fix a leak in tb_retimer_add()	Dan Carpenter
	After the device_register() succeeds, then the correct way to clean up is to call device_unregister(). The unregister calls both device_del() and device_put(). Since this code was only device_del() it results in a memory leak. Fixes: dacb12877d92 ("thunderbolt: Add support for on-board retimers") Cc: stable@vger.kernel.org Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2021-03-29	scsi: iscsi: Fix race condition between login and sync thread	Gulam Mohamed
	A kernel panic was observed due to a timing issue between the sync thread and the initiator processing a login response from the target. The session reopen can be invoked both from the session sync thread when iscsid restarts and from iscsid through the error handler. Before the initiator receives the response to a login, another reopen request can be sent from the error handler/sync session. When the initial login response is subsequently processed, the connection has been closed and the socket has been released. To fix this a new connection state, ISCSI_CONN_BOUND, is added: - Set the connection state value to ISCSI_CONN_DOWN upon iscsi_if_ep_disconnect() and iscsi_if_stop_conn() - Set the connection state to the newly created value ISCSI_CONN_BOUND after bind connection (transport->bind_conn()) - In iscsi_set_param(), return -ENOTCONN if the connection state is not either ISCSI_CONN_BOUND or ISCSI_CONN_UP Link: https://lore.kernel.org/r/20210325093248.284678-1-gulam.mohamed@oracle.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Gulam Mohamed <gulam.mohamed@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> index 91074fd97f64..f4bf62b007a0 100644
2021-03-29	ethernet/netronome/nfp: Fix a use after free in nfp_bpf_ctrl_msg_rx	Lv Yunlong
	In nfp_bpf_ctrl_msg_rx, if nfp_ccm_get_type(skb) == NFP_CCM_TYPE_BPF_BPF_EVENT is true, the skb will be freed. But the skb is still used by nfp_ccm_rx(&bpf->ccm, skb). My patch adds a return when the skb was freed. Fixes: bcf0cafab44fd ("nfp: split out common control message handling code") Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-29	Merge branch '100GbE' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2021-03-29 This series contains updates to ice driver only. Ani does not fail on link/PHY errors during probe as this is not a fatal error to prevent the user from remedying the problem. He also corrects checking Wake on LAN support to be port number, not PF ID. Fabio increases the AdminQ timeout as some commands can take longer than the current value. Chinh fixes iSCSI to use be able to use port 860 by using information from DCBx and other QoS configuration info. Krzysztof fixes a possible race between ice_open() and ice_stop(). Bruce corrects the ordering of arguments in a memory allocation call. Dave removes DCBNL device reset bit which is blocking changes coming from DCBNL interface. Jacek adds error handling for filter allocation failure. Robert ensures memory is freed if VSI filter list issues are encountered. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-29	cxgb4: avoid collecting SGE_QBASE regs during traffic	Rahul Lakkireddy
	Accessing SGE_QBASE_MAP[0-3] and SGE_QBASE_INDEX registers can lead to SGE missing doorbells under heavy traffic. So, only collect them when adapter is idle. Also update the regdump range to skip collecting these registers. Fixes: 80a95a80d358 ("cxgb4: collect SGE PF/VF queue map") Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-30	Merge tag 'intel-pinctrl-v5.12-3' of ↵	Linus Walleij
	gitolite.kernel.org:pub/scm/linux/kernel/git/pinctrl/intel into fixes intel-pinctrl for v5.12-3 * Check if device is present, which is not the case in Xen The following is an automated git shortlog grouped by driver: intel: - check REVID register value for device presence
2021-03-29	clk: qcom: camcc: Update the clock ops for the SC7180	Taniya Das
	Some of the RCGs could be always ON from the XO source and could be used as the clock on signal for the GDSC to be operational. In the cases where the GDSCs are parked at different source with the source clock disabled, it could lead to the GDSC to be stuck at ON/OFF during gdsc disable/enable. Thus park the RCGs at XO during clock disable and update the rcg_ops to use the shared_ops. Fixes: 15d09e830bbc ("clk: qcom: camcc: Add camera clock controller driver for SC7180") Signed-off-by: Taniya Das <tdas@codeaurora.org> Link: https://lore.kernel.org/r/1616809265-11912-1-git-send-email-tdas@codeaurora.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>