linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2023-08-15	drm/amdkfd: fix double assign skip process context clear	Jonathan Kim
	Remove redundant assignment when skipping process ctx clear. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-15	drm/amd/display: Update replay for clk_mgr optimizations	Bhawanpreet Lakha
	Add Replay calls to clk_mgr updates (just like PSR) Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-15	drm/amdgpu: Fix identifier names to function definition arguments in atom.h	Srinivasan Shanmugam
	Fixes the following: WARNING: function definition argument 'struct card_info ' should also have an identifier name WARNING: function definition argument 'uint32_t' should also have an identifier name WARNING: function definition argument 'void ' should also have an identifier name WARNING: function definition argument 'struct atom_context ' should also have an identifier name WARNING: function definition argument 'int' should also have an identifier name WARNING: function definition argument 'uint32_t ' should also have an identifier name WARNING: Unnecessary space before function pointer name ERROR: space prohibited after that '*' (ctx:BxW) CHECK: Prefer kernel type 'u32' over 'uint32_t' Cc: Guchun Chen <guchun.chen@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-15	drm/amdgpu/pm: fix throttle_status for other than MP1 11.0.7	Umio Yasuno
	Use the right metrics table version based on the firmware. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2720 Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Umio Yasuno <coelacanth_dream@protonmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-15	drm/amdgpu: mode1 reset needs to recover mp1 for mp0 v13_0_10	YiPeng Chai
	Mode1 reset needs to recover mp1 in fatal error case for mp0 v13_0_10. v2: Define a macro to wrap psp function calls. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-15	drm/amd/pm: avoid driver getting empty metrics table for the first time	Yang Wang
	add metrics.AccumulationCouter check to avoid driver getting an empty metrics data since metrics table not updated completely in pmfw side. Signed-off-by: Yang Wang <KevinYang.Wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Tested-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-15	drm/amdkfd: Use memdup_user() rather than duplicating its implementation	Atul Raut
	To prevent its redundant implementation and streamline code, use memdup_user. This fixes warnings reported by Coccinelle: ./drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c:2811:13-20: WARNING opportunity for memdup_user Signed-off-by: Atul Raut <rauji.raut@gmail.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-15	drm/amdgpu: Remove unnecessary ras cap check	Hawking Zhang
	RAS global isr will only be invoked by hardware interrupt. Don't need to query ras capability in isr In addition, amdgpu_ras_interrupt_fatal_error_handler ensures the isr won't be called from guest linux side by accident. The RAS cap check in isr that introduced to fix sriov crash is not needed any more Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-15	drm/amdkfd: fix build failure without CONFIG_DYNAMIC_DEBUG	Arnd Bergmann
	When CONFIG_DYNAMIC_DEBUG is disabled altogether, calling _dynamic_func_call_no_desc() does not work: drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_svm.c: In function 'svm_range_set_attr': drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_svm.c:52:9: error: implicit declaration of function '_dynamic_func_call_no_desc' [-Werror=implicit-function-declaration] 52 \| _dynamic_func_call_no_desc("svm_range_dump", svm_range_debug_dump, svms) \| ^~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_svm.c:3564:9: note: in expansion of macro 'dynamic_svm_range_dump' 3564 \| dynamic_svm_range_dump(svms); \| ^~~~~~~~~~~~~~~~~~~~~~ Add a compile-time conditional in addition to the runtime check. Fixes: 8923137dbe4b ("drm/amdkfd: avoid svm dump when dynamic debug disabled") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-15	genetlink: use attrs from struct genl_info	Jakub Kicinski
	Since dumps carry struct genl_info now, use the attrs pointer from genl_info and remove the one in struct genl_dumpit_info. Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20230814214723.2924989-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-15	list: Introduce CONFIG_LIST_HARDENED	Marco Elver
	Numerous production kernel configs (see [1, 2]) are choosing to enable CONFIG_DEBUG_LIST, which is also being recommended by KSPP for hardened configs [3]. The motivation behind this is that the option can be used as a security hardening feature (e.g. CVE-2019-2215 and CVE-2019-2025 are mitigated by the option [4]). The feature has never been designed with performance in mind, yet common list manipulation is happening across hot paths all over the kernel. Introduce CONFIG_LIST_HARDENED, which performs list pointer checking inline, and only upon list corruption calls the reporting slow path. To generate optimal machine code with CONFIG_LIST_HARDENED: 1. Elide checking for pointer values which upon dereference would result in an immediate access fault (i.e. minimal hardening checks). The trade-off is lower-quality error reports. 2. Use the __preserve_most function attribute (available with Clang, but not yet with GCC) to minimize the code footprint for calling the reporting slow path. As a result, function size of callers is reduced by avoiding saving registers before calling the rarely called reporting slow path. Note that all TUs in lib/Makefile already disable function tracing, including list_debug.c, and __preserve_most's implied notrace has no effect in this case. 3. Because the inline checks are a subset of the full set of checks in __list_*_valid_or_report(), always return false if the inline checks failed. This avoids redundant compare and conditional branch right after return from the slow path. As a side-effect of the checks being inline, if the compiler can prove some condition to always be true, it can completely elide some checks. Since DEBUG_LIST is functionally a superset of LIST_HARDENED, the Kconfig variables are changed to reflect that: DEBUG_LIST selects LIST_HARDENED, whereas LIST_HARDENED itself has no dependency on DEBUG_LIST. Running netperf with CONFIG_LIST_HARDENED (using a Clang compiler with "preserve_most") shows throughput improvements, in my case of ~7% on average (up to 20-30% on some test cases). Link: https://r.android.com/1266735 [1] Link: https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/blob/main/config [2] Link: https://kernsec.org/wiki/index.php/Kernel_Self_Protection_Project/Recommended_Settings [3] Link: https://googleprojectzero.blogspot.com/2019/11/bad-binder-android-in-wild-exploit.html [4] Signed-off-by: Marco Elver <elver@google.com> Link: https://lore.kernel.org/r/20230811151847.1594958-3-elver@google.com Signed-off-by: Kees Cook <keescook@chromium.org>
2023-08-15	genetlink: remove userhdr from struct genl_info	Jakub Kicinski
	Only three families use info->userhdr today and going forward we discourage using fixed headers in new families. So having the pointer to user header in struct genl_info is an overkill. Compute the header pointer at runtime. Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Aaron Conole <aconole@redhat.com> Link: https://lore.kernel.org/r/20230814214723.2924989-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-15	drm/amdgpu: disable mcbp if parameter zero is set	Jiadong Zhu
	The parameter amdgpu_mcbp shall have priority against the default value calculated from the chip version. User could disable mcbp by setting the parameter mcbp as zero. v2: do not trigger preemption in sw ring muxer when mcbp is disabled. Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-15	drm/radeon: Fix multiple line dereference in 'atom_iio_execute'	Srinivasan Shanmugam
	Fixes the following: WARNING: Avoid multiple line dereference - prefer 'ctx->io_attr' + ((ctx-> + io_attr >> CU8(base + 2)) & (0xFFFFFFFF >> (32 - Cc: Guchun Chen <guchun.chen@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-15	drm/amd/pm: Add vclk and dclk sysnode for GC 9.4.3	Asad Kamal
	Expose sysfs vclck and dclk entries for GC version 9.4.3 Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-15	drm/amd/pm: disallow the fan setting if there is no fan on smu 13.0.0	Kenneth Feng
	drm/amd/pm: disallow the fan setting if there is no fan on smu 13.0.0 V2: depend on pm.no_fan to check Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-15	drm/amdkfd: Add missing tba_hi programming on aldebaran	Jay Cornwall
	Previously asymptomatic because high 32 bits were zero. Fixes: 96c211f1f9ef ("drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole") Signed-off-by: Jay Cornwall <jay.cornwall@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-15	drm/amdgpu: Fix missing comment for mb() in 'amdgpu_device_aper_access'	Srinivasan Shanmugam
	This patch adds the missing code comment for memory barrier WARNING: memory barrier without comment + mb(); WARNING: memory barrier without comment + mb(); Cc: Guchun Chen <guchun.chen@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-15	drm/amd/display: Add Replay supported/enabled checks	Bhawanpreet Lakha
	- Add checks for Cursor update and dirty rects (sending updates to dmub) - Add checks for dc_notify_vsync, and fbc and subvp Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-15	fbdev: goldfishfb: Do not check 0 for platform_get_irq()	Zhu Wang
	Since platform_get_irq() never returned zero, so it need not to check whether it returned zero, and we use the return error code of platform_get_irq() to replace the current return error code. Please refer to the commit a85a6c86c25b ("driver core: platform: Clarify that IRQ 0 is invalid") to get that platform_get_irq() never returned zero. Signed-off-by: Zhu Wang <wangzhu9@huawei.com> Signed-off-by: Helge Deller <deller@gmx.de>
2023-08-15	fbdev: atmel_lcdfb: Remove redundant of_match_ptr()	Ruan Jinjie
	The driver depends on CONFIG_OF, it is not necessary to use of_match_ptr() here. Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com> Signed-off-by: Helge Deller <deller@gmx.de>
2023-08-15	ipmi_si: fix -Wvoid-pointer-to-enum-cast warning	Justin Stitt
	With W=1 we see the following warning: \| drivers/char/ipmi/ipmi_si_platform.c:272:15: error: \ \| cast to smaller integer type 'enum si_type' from \ \| 'const void *' [-Werror,-Wvoid-pointer-to-enum-cast] \| 272 \| io.si_type = (enum si_type) match->data; \| \| ^~~~~~~~~~~~~~~~~~~~~~~~~~ This is due to the fact that the `si_type` enum members are int-width and a cast from pointer-width down to int will cause truncation and possible data loss. Although in this case `si_type` has only a few enumerated fields and thus there is likely no data loss occurring. Nonetheless, this patch is necessary to the goal of promoting this warning out of W=1. Link: https://github.com/ClangBuiltLinux/linux/issues/1902 Link: https://lore.kernel.org/llvm/202308081000.tTL1ElTr-lkp@intel.com/ Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Justin Stitt <justinstitt@google.com> Message-Id: <20230809-cbl-1902-v1-1-92def12d1dea@google.com> Signed-off-by: Corey Minyard <minyard@acm.org>
2023-08-15	Merge tag 'regulator-fix-v6.5-rc6' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fixes from Mark Brown: "Two small driver specific fixes: one incorrect definition for one of the Qualcomm regulators and better handling of poorly formed DTs in the DA9063 driver" * tag 'regulator-fix-v6.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: qcom-rpmh: Fix LDO 12 regulator for PM8550 regulator: da9063: better fix null deref with partial DT
2023-08-15	spi: tegra114: Remove unnecessary NULL-pointer checks	Alexander Danilenko
	cs_setup, cs_hold and cs_inactive points to fields of spi_device struct, so there is no sense in checking them for NULL. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 04e6bb0d6bb1 ("spi: modify set_cs_timing parameter") Signed-off-by: Alexander Danilenko <al.b.danilenko@gmail.com> Link: https://lore.kernel.org/r/20230815092058.4083-1-al.b.danilenko@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-08-15	RDMA/bnxt_re: Add support for dmabuf pinned memory regions	Saravanan Vajravel
	Support the new verb which indicates dmabuf support. bnxt doesn't support ODP. So use the pinned version of the dmabuf APIs to enable bnxt_re devices to work as dmabuf importer. Link: https://lore.kernel.org/r/1690790473-25850-2-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-08-15	gpio: sim: replace memmove() + strstrip() with skip_spaces() + strim()	Bartosz Golaszewski
	Turns out we can avoid the memmove() by using skip_spaces() and strim(). We did that in gpio-consumer, let's do it in gpio-sim. Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2023-08-15	Input: goodix - add support for ACPI ID GDX9110	Felix Engelhardt
	The Goodix touchscreen controller with ACPI ID GDX9110 was not recognized by the goodix driver. This patch adds this ID to the list of supported IDs, allowing the driver to be used with this device. The change will allow Linux to be used on ~1 million tablet devices used in Kenyan primary schools. Signed-off-by: Felix Engelhardt <felix.engelhardt@eidu.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20230807124723.382899-1-felix.engelhardt@eidu.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2023-08-15	firmware: cs_dsp: Fix new control name check	Vlad Karpovich
	Before adding a new FW control, its name is checked against existing controls list. But the string length in strncmp used to compare controls names is taken from the list, so if beginnings of the controls are matching, then the new control is not created. For example, if CAL_R control already exists, CAL_R_SELECTED is not created. The fix is to compare string lengths as well. Fixes: 6477960755fb ("ASoC: wm_adsp: Move check for control existence") Signed-off-by: Vlad Karpovich <vkarpovi@opensource.cirrus.com> Link: https://lore.kernel.org/r/20230815172908.3454056-1-vkarpovi@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-08-15	leds: pca995x: Fix MODULE_DEVICE_TABLE for OF	Marek Vasut
	Fix copy-paste error in MODULE_DEVICE_TABLE() for the OF table, use the 'of' first parameter instead of duplicate 'i2c'. Fixes: ee4e80b2962e ("leds: pca995x: Add support for PCA995X chips") Signed-off-by: Marek Vasut <marex@denx.de> Link: https://lore.kernel.org/r/20230809125314.531806-1-marex@denx.de Signed-off-by: Lee Jones <lee@kernel.org>
2023-08-15	drm/msm/adreno: Add missing MODULE_FIRMWARE macros	Juerg Haefliger
	The driver references some firmware files that don't have corresponding MODULE_FIRMWARE macros and thus won't be listed via modinfo. Fix that. Signed-off-by: Juerg Haefliger <juerg.haefliger@canonical.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Patchwork: https://patchwork.freedesktop.org/patch/543290/ [rob: drop a690_gmu.bin as a690 is using same fw as a660 now] Signed-off-by: Rob Clark <robdclark@chromium.org>
2023-08-15	drm/msm/gpu: Push gpu lock down past runpm	Rob Clark
	Avoid holding gpu lock when calling runpm, to avoid this lockdep splat: ====================================================== WARNING: possible circular locking dependency detected 6.4.3-debug+ #14 Not tainted ------------------------------------------------------ ring0/373 is trying to acquire lock: ffffffead86efb98 (prepare_lock){+.+.}-{3:3}, at: clk_prepare_lock+0x70/0x98 but task is already holding lock: ffffff809cd19170 (&gpu->lock){+.+.}-{3:3}, at: msm_job_run+0x7c/0x128 [msm] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #4 (&gpu->lock){+.+.}-{3:3}: __mutex_lock+0xc8/0x388 mutex_lock_nested+0x2c/0x38 msm_job_run+0x7c/0x128 [msm] drm_sched_main+0x264/0x354 [gpu_sched] kthread+0xf0/0x100 ret_from_fork+0x10/0x20 -> #3 (dma_fence_map){++++}-{0:0}: __dma_fence_might_wait+0x74/0xc0 dma_resv_lockdep+0x1f0/0x2e8 do_one_initcall+0xb4/0x214 kernel_init_freeable+0x338/0x33c kernel_init+0x30/0x134 ret_from_fork+0x10/0x20 -> #2 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}: fs_reclaim_acquire+0x7c/0x9c slab_pre_alloc_hook.constprop.0+0x40/0x250 __kmem_cache_alloc_node+0x60/0x18c kmalloc_node_trace+0x40/0x84 alloc_worker+0x2c/0x64 init_rescuer+0x34/0xe0 workqueue_init+0x168/0x1fc kernel_init_freeable+0x15c/0x33c kernel_init+0x30/0x134 ret_from_fork+0x10/0x20 -> #1 (fs_reclaim){+.+.}-{0:0}: __fs_reclaim_acquire+0x3c/0x48 fs_reclaim_acquire+0x50/0x9c slab_pre_alloc_hook.constprop.0+0x40/0x250 __kmem_cache_alloc_node+0x60/0x18c kmalloc_trace+0x44/0x88 clk_rcg2_dfs_determine_rate+0x60/0x214 clk_core_determine_round_nolock+0xb8/0xf0 clk_core_round_rate_nolock+0x84/0x118 clk_core_round_rate_nolock+0xd8/0x118 clk_round_rate+0x6c/0xd0 geni_se_clk_tbl_get+0x78/0xc0 geni_se_clk_freq_match+0x44/0xe4 get_spi_clk_cfg+0x50/0xf4 geni_spi_set_clock_and_bw+0x54/0x104 spi_geni_prepare_message+0x130/0x174 __spi_pump_transfer_message+0x200/0x4d8 __spi_sync+0x13c/0x23c spi_sync_locked+0x18/0x24 do_cros_ec_pkt_xfer_spi+0x124/0x3f0 cros_ec_xfer_high_pri_work+0x28/0x3c kthread_worker_fn+0x14c/0x27c kthread+0xf0/0x100 ret_from_fork+0x10/0x20 -> #0 (prepare_lock){+.+.}-{3:3}: __lock_acquire+0xdf8/0x109c lock_acquire+0x234/0x284 __mutex_lock+0xc8/0x388 mutex_lock_nested+0x2c/0x38 clk_prepare_lock+0x70/0x98 clk_prepare+0x24/0x50 clk_bulk_prepare+0x50/0x9c a6xx_gmu_resume+0x94/0x800 [msm] a6xx_gmu_pm_resume+0x38/0x158 [msm] adreno_runtime_resume+0x2c/0x38 [msm] pm_generic_runtime_resume+0x30/0x44 __rpm_callback+0x4c/0x134 rpm_callback+0x78/0x7c rpm_resume+0x3a4/0x46c __pm_runtime_resume+0x78/0xbc pm_runtime_get_sync.isra.0+0x14/0x20 [msm] msm_gpu_submit+0x4c/0x12c [msm] msm_job_run+0x88/0x128 [msm] drm_sched_main+0x264/0x354 [gpu_sched] kthread+0xf0/0x100 ret_from_fork+0x10/0x20 other info that might help us debug this: Chain exists of: prepare_lock --> dma_fence_map --> &gpu->lock Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&gpu->lock); lock(dma_fence_map); lock(&gpu->lock); lock(prepare_lock); * DEADLOCK * 2 locks held by ring0/373: #0: ffffffead875ae50 (dma_fence_map){++++}-{0:0}, at: drm_sched_main+0x54/0x354 [gpu_sched] #1: ffffff809cd19170 (&gpu->lock){+.+.}-{3:3}, at: msm_job_run+0x7c/0x128 [msm] stack backtrace: CPU: 2 PID: 373 Comm: ring0 Not tainted 6.4.3-debug+ #14 Hardware name: Google Villager (rev1+) with LTE (DT) Call trace: dump_backtrace+0xb4/0xf0 show_stack+0x20/0x30 dump_stack_lvl+0x60/0x84 dump_stack+0x18/0x24 print_circular_bug+0x1cc/0x234 check_noncircular+0x78/0xac __lock_acquire+0xdf8/0x109c lock_acquire+0x234/0x284 __mutex_lock+0xc8/0x388 mutex_lock_nested+0x2c/0x38 clk_prepare_lock+0x70/0x98 clk_prepare+0x24/0x50 clk_bulk_prepare+0x50/0x9c a6xx_gmu_resume+0x94/0x800 [msm] a6xx_gmu_pm_resume+0x38/0x158 [msm] adreno_runtime_resume+0x2c/0x38 [msm] pm_generic_runtime_resume+0x30/0x44 __rpm_callback+0x4c/0x134 rpm_callback+0x78/0x7c rpm_resume+0x3a4/0x46c __pm_runtime_resume+0x78/0xbc pm_runtime_get_sync.isra.0+0x14/0x20 [msm] msm_gpu_submit+0x4c/0x12c [msm] msm_job_run+0x88/0x128 [msm] drm_sched_main+0x264/0x354 [gpu_sched] kthread+0xf0/0x100 ret_from_fork+0x10/0x20 Signed-off-by: Rob Clark <robdclark@chromium.org> Patchwork: https://patchwork.freedesktop.org/patch/552298/
2023-08-15	md/raid5-cache: fix null-ptr-deref for r5l_flush_stripe_to_raid()	Yu Kuai
	r5l_flush_stripe_to_raid() will check if the list 'flushing_ios' is empty, and then submit 'flush_bio', however, r5l_log_flush_endio() is clearing the list first and then clear the bio, which will cause null-ptr-deref: T1: submit flush io raid5d handle_active_stripes r5l_flush_stripe_to_raid // list is empty // add 'io_end_ios' to the list bio_init submit_bio // io1 T2: io1 is done r5l_log_flush_endio list_splice_tail_init // clear the list T3: submit new flush io ... r5l_flush_stripe_to_raid // list is empty // add 'io_end_ios' to the list bio_init bio_uninit // clear bio->bi_blkg submit_bio // null-ptr-deref Fix this problem by clearing bio before clearing the list in r5l_log_flush_endio(). Fixes: 0dd00cba99c3 ("raid5-cache: fully initialize flush_bio when needed") Reported-and-tested-by: Corey Hickey <bugfood-ml@fatooh.org> Closes: https://lore.kernel.org/all/cddd7213-3dfd-4ab7-a3ac-edd54d74a626@fatooh.org/ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Song Liu <song@kernel.org>
2023-08-15	md: Hold mddev->reconfig_mutex when trying to get mddev->sync_thread	Li Lingfeng
	Commit ba9d9f1a707f ("Revert "md: unlock mddev before reap sync_thread in action_store"") removed the scenario of calling md_unregister_thread() without holding mddev->reconfig_mutex, so add a lock holding check before acquiring mddev->sync_thread by passing mdev to md_unregister_thread(). Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20230803071711.2546560-1-lilingfeng@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org>
2023-08-15	md/raid10: fix a 'conf->barrier' leakage in raid10_takeover()	Yu Kuai
	After commit b39f35ebe86d ("md: don't quiesce in mddev_suspend()"), 'conf->barrier' will be leaked in the case that raid10 takeover raid0: level_store pers->takeover -> raid10_takeover raid10_takeover_raid0 WRITE_ONCE(conf->barrier, 1) mddev_suspend // still raid0 mddev->pers = pers // switch to raid10 mddev_resume // resume without suspend After the above commit, mddev_resume() will not decrease 'conf->barrier' that is set in raid10_takeover_raid0(). Fix this problem by not setting 'conf->barrier' in raid10_takeover_raid0(). By the way, this problem is found while I'm trying to make mddev_suspend/resume() to be independent from raid personalities. raid10 is the only personality to use reference count in the quiesce() callback and this problem is only related to raid10. Fixes: b39f35ebe86d ("md: don't quiesce in mddev_suspend()") Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Link: https://lore.kernel.org/r/20230731022800.1424902-1-yukuai1@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org>
2023-08-15	md: raid1: fix potential OOB in raid1_remove_disk()	Zhang Shurong
	If rddev->raid_disk is greater than mddev->raid_disks, there will be an out-of-bounds in raid1_remove_disk(). We have already found similar reports as follows: 1) commit d17f744e883b ("md-raid10: fix KASAN warning") 2) commit 1ebc2cec0b7d ("dm raid: fix KASAN warning in raid5_remove_disk") Fix this bug by checking whether the "number" variable is valid. Signed-off-by: Zhang Shurong <zhang_shurong@foxmail.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/tencent_0D24426FAC6A21B69AC0C03CE4143A508F09@qq.com Signed-off-by: Song Liu <song@kernel.org>
2023-08-15	md/raid5-cache: fix a deadlock in r5l_exit_log()	Yu Kuai
	Commit b13015af94cf ("md/raid5-cache: Clear conf->log after finishing work") introduce a new problem: // caller hold reconfig_mutex r5l_exit_log flush_work(&log->disable_writeback_work) r5c_disable_writeback_async wait_event /* * conf->log is not NULL, and mddev_trylock() * will fail, wait_event() can never pass. */ conf->log = NULL Fix this problem by setting 'config->log' to NULL before wake_up() as it used to be, so that wait_event() from r5c_disable_writeback_async() can exist. In the meantime, move forward md_unregister_thread() so that null-ptr-deref this commit fixed can still be fixed. Fixes: b13015af94cf ("md/raid5-cache: Clear conf->log after finishing work") Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20230708091727.1417894-1-yukuai1@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org>
2023-08-15	clk: qcom: gcc-ipq4019: add missing networking resets	Robert Marko
	IPQ4019 has more networking related resets that will be required for future wired networking support, so lets add them. Signed-off-by: Robert Marko <robert.marko@sartura.hr> Link: https://lore.kernel.org/r/20230814104119.96858-2-robert.marko@sartura.hr Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2023-08-15	pinctrl: intel: Switch to use exported namespace	Andy Shevchenko
	We already have a few symbols exported in the namespace. Let's do the same for others (except PM for now). Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2023-08-15	ublk: Switch to memdup_user_nul() helper	Ruan Jinjie
	Use memdup_user_nul() helper instead of open-coding to simplify the code. Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20230815114815.1551171-1-ruanjinjie@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-08-15	accel/qaic: Clean up integer overflow checking in map_user_pages()	Dan Carpenter
	The encode_dma() function has some validation on in_trans->size but it would be more clear to move those checks to find_and_map_user_pages(). The encode_dma() had two checks: if (in_trans->addr + in_trans->size < in_trans->addr \|\| !in_trans->size) return -EINVAL; The in_trans->addr variable is the starting address. The in_trans->size variable is the total size of the transfer. The transfer can occur in parts and the resources->xferred_dma_size tracks how many bytes we have already transferred. This patch introduces a new variable "remaining" which represents the amount we want to transfer (in_trans->size) minus the amount we have already transferred (resources->xferred_dma_size). I have modified the check for if in_trans->size is zero to instead check if in_trans->size is less than resources->xferred_dma_size. If we have already transferred more bytes than in_trans->size then there are negative bytes remaining which doesn't make sense. If there are zero bytes remaining to be copied, just return success. The check in encode_dma() checked that "addr + size" could not overflow and barring a driver bug that should work, but it's easier to check if we do this in parts. First check that "in_trans->addr + resources->xferred_dma_size" is safe. Then check that "xfer_start_addr + remaining" is safe. My final concern was that we are dealing with u64 values but on 32bit systems the kmalloc() function will truncate the sizes to 32 bits. So I calculated "total = in_trans->size + offset_in_page(xfer_start_addr);" and returned -EINVAL if it were >= SIZE_MAX. This will not affect 64bit systems. Fixes: 129776ac2e38 ("accel/qaic: Add control path") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com> Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Link: https://patchwork.freedesktop.org/patch/msgid/24d3348b-25ac-4c1b-b171-9dae7c43e4e0@moroto.mountain
2023-08-15	accel/qaic: Fix slicing memory leak	Pranjal Ramajor Asha Kanojiya
	The temporary buffer storing slicing configuration data from user is only freed on error. This is a memory leak. Free the buffer unconditionally. Fixes: ff13be830333 ("accel/qaic: Add datapath") Signed-off-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com> Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230802145937.14827-1-quic_jhugo@quicinc.com
2023-08-15	Merge patch series "Reuse common functions from pinctrl-intel"	Andy Shevchenko
	Raag Jadav <raag.jadav@intel.com> says: This series exports common pinctrl functions that are used across Intel specific platform drivers to PINCTRL_INTEL namespace and reuses them into Baytrail, Cherryview and Lynxpoint drivers. This helps reduce their code and memory footprint. X86 kernels are fairly unikernels such that pinctrl-intel driver is enabled by most Linux distributions and most Intel specific platform drivers (inside drivers/pinctrl/intel) depend on it. The only exception to this is Lynxpoint. But taking into account its fairly old age, it wouldn't suffer much from pinctrl-intel dependency. bloat-o-meter: ============== Intel: add/remove: 17/10 grow/shrink: 0/0 up/down: 375/-319 (56) Total: Before=9598, After=9654, chg +0.58% Baytrail: add/remove: 1/6 grow/shrink: 0/2 up/down: 41/-441 (-400) Total: Before=16538, After=16138, chg -2.42% Cherryview: add/remove: 1/6 grow/shrink: 2/0 up/down: 90/-272 (-182) Total: Before=18133, After=17951, chg -1.00% Lynxpoint: add/remove: 1/6 grow/shrink: 0/1 up/down: 24/-354 (-330) Total: Before=7836, After=7506, chg -4.21% Link: https://lore.kernel.org/r/20230814060311.15945-1-raag.jadav@intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2023-08-15	pinctrl: lynxpoint: reuse common functions from pinctrl-intel	Raag Jadav
	Reuse common functions from pinctrl-intel driver. While at it, select pinctrl-intel for Intel Lynxpoint driver. Signed-off-by: Raag Jadav <raag.jadav@intel.com> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Link: https://lore.kernel.org/r/20230814060311.15945-5-raag.jadav@intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2023-08-15	pinctrl: cherryview: reuse common functions from pinctrl-intel	Raag Jadav
	Reuse common functions from pinctrl-intel driver. Signed-off-by: Raag Jadav <raag.jadav@intel.com> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Link: https://lore.kernel.org/r/20230814060311.15945-4-raag.jadav@intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2023-08-15	pinctrl: baytrail: reuse common functions from pinctrl-intel	Raag Jadav
	Reuse common functions from pinctrl-intel driver. Signed-off-by: Raag Jadav <raag.jadav@intel.com> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Link: https://lore.kernel.org/r/20230814060311.15945-3-raag.jadav@intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2023-08-15	pinctrl: intel: export common pinctrl functions	Raag Jadav
	Export common pinctrl functions that are used across Intel specific platform drivers, so that they can be reused. Signed-off-by: Raag Jadav <raag.jadav@intel.com> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Link: https://lore.kernel.org/r/20230814060311.15945-2-raag.jadav@intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2023-08-15	Merge patch series "Introduce Intel Tangier pinctrl driver"	Andy Shevchenko
	Raag Jadav <raag.jadav@intel.com> says: Merrifield and Moorefield pinctrl driver implementations are similar in terms of how they access the hardware. We can consolidate their pinctrl functionalities into a common library driver. This patch set introduces: 1. Intel Tangier driver that supports the common pinctrl functionalities for Merrifield and Moorefield platforms. 2. Intel Tangier adaptation for Merrifield pinctrl driver. 3. Intel Tangier adaptation for Moorefield pinctrl driver. Tested on Intel Edison platform. No deviation observed in the contents of below entries before and after this patchset. - /proc/interrupts - /sys/kernel/debug/gpio - /sys/kernel/debug/pinctrl/*/pins Link: https://lore.kernel.org/r/20230814054033.12004-1-raag.jadav@intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2023-08-15	pinctrl: moorefield: Adapt to Intel Tangier driver	Raag Jadav
	Make use of Intel Tangier as a library driver for Moorefield. Signed-off-by: Raag Jadav <raag.jadav@intel.com> Link: https://lore.kernel.org/r/20230814054033.12004-4-raag.jadav@intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2023-08-15	pinctrl: merrifield: Adapt to Intel Tangier driver	Raag Jadav
	Make use of Intel Tangier as a library driver for Merrifield. Signed-off-by: Raag Jadav <raag.jadav@intel.com> Link: https://lore.kernel.org/r/20230814054033.12004-3-raag.jadav@intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2023-08-15	pinctrl: tangier: Introduce Intel Tangier driver	Raag Jadav
	Intel Tangier implements the common pinctrl functionalities for Merrifield and Moorefield platforms. Signed-off-by: Raag Jadav <raag.jadav@intel.com> Link: https://lore.kernel.org/r/20230814054033.12004-2-raag.jadav@intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>