linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2025-08-26	drm/xe/vm: Don't pin the vm_resv during validation	Thomas Hellström
	The pinning has the odd side-effect that unlocking any resv during validation triggers an "unlocking pinned lock" warning. Cc: Matthew Brost <matthew.brost@intel.com> Fixes: 5cc3325584c4 ("drm/xe: Rework eviction rejection of bound external bos") Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250821143045.106005-2-thomas.hellstrom@linux.intel.com (cherry picked from commit 0a51bf3e54dd8b77e6f1febbbb66def0660862d2) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-08-26	drm/xe/xe_sync: avoid race during ufence signaling	Zbigniew Kempczyński
	Marking ufence as signalled after copy_to_user() is too late. Worker thread which signals ufence by memory write might be raced with another userspace vm-bind call. In map/unmap scenario unmap may still see ufence is not signalled causing -EBUSY. Change the order of marking / write to user-fence fixes this issue. Fixes: 977e5b82e090 ("drm/xe: Expose user fence from xe_sync_entry") Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5536 Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250820083903.2109891-2-zbigniew.kempczynski@intel.com (cherry picked from commit 8ae04fe9ffc93d6bc3bc63ac08375427d69cee06) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-08-26	PM: sleep: annotate RCU list iterations	Johannes Berg
	These iterations require the read lock, otherwise RCU lockdep will splat: ============================= WARNING: suspicious RCU usage 6.17.0-rc3-00014-g31419c045d64 #6 Tainted: G O ----------------------------- drivers/base/power/main.c:1333 RCU-list traversed in non-reader section!! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 5 locks held by rtcwake/547: #0: 00000000643ab418 (sb_writers#6){.+.+}-{0:0}, at: file_start_write+0x2b/0x3a #1: 0000000067a0ca88 (&of->mutex#2){+.+.}-{4:4}, at: kernfs_fop_write_iter+0x181/0x24b #2: 00000000631eac40 (kn->active#3){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x191/0x24b #3: 00000000609a1308 (system_transition_mutex){+.+.}-{4:4}, at: pm_suspend+0xaf/0x30b #4: 0000000060c0fdb0 (device_links_srcu){.+.+}-{0:0}, at: device_links_read_lock+0x75/0x98 stack backtrace: CPU: 0 UID: 0 PID: 547 Comm: rtcwake Tainted: G O 6.17.0-rc3-00014-g31419c045d64 #6 VOLUNTARY Tainted: [O]=OOT_MODULE Stack: 223721b3a80 6089eac6 00000001 00000001 ffffff00 6089eac6 00000535 6086e528 721b3ac0 6003c294 00000000 60031fc0 Call Trace: [<600407ed>] show_stack+0x10e/0x127 [<6003c294>] dump_stack_lvl+0x77/0xc6 [<6003c2fd>] dump_stack+0x1a/0x20 [<600bc2f8>] lockdep_rcu_suspicious+0x116/0x13e [<603d8ea1>] dpm_async_suspend_superior+0x117/0x17e [<603d980f>] device_suspend+0x528/0x541 [<603da24b>] dpm_suspend+0x1a2/0x267 [<603da837>] dpm_suspend_start+0x5d/0x72 [<600ca0c9>] suspend_devices_and_enter+0xab/0x736 [...] Add the fourth argument to the iteration to annotate this and avoid the splat. Fixes: 06799631d522 ("PM: sleep: Make async suspend handle suppliers like parents") Fixes: ed18738fff02 ("PM: sleep: Make async resume handle consumers like children") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://patch.msgid.link/20250826134348.aba79f6e6299.I9ecf55da46ccf33778f2c018a82e1819d815b348@changeid Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-08-26	efi: stmm: Drop unneeded null pointer check	Jan Kiszka
	The API documenation of setup_mm_hdr does not mention that dptr can be NULL, this is a local function, and no caller passes NULL. So drop the unneeded check. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Acked-by: Sumit Garg <sumit.garg@oss.qualcomm.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-08-26	efi: stmm: Drop unused EFI error from setup_mm_hdr arguments	Jan Kiszka
	No caller ever evaluates what we return in 'ret'. They only use the return code of the function. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Acked-by: Sumit Garg <sumit.garg@oss.qualcomm.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-08-26	efi: stmm: Do not return EFI_OUT_OF_RESOURCES on internal errors	Jan Kiszka
	When we are low on memory or when the internal API is violated, we cannot return EFI_OUT_OF_RESOURCES. According to the UEFI standard, that error code is either related to persistent storage used for the variable or even not foreseen as possible error (GetVariable e.g.). Use the not fully accurate but compliant error code EFI_DEVICE_ERROR in those cases. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Acked-by: Sumit Garg <sumit.garg@oss.qualcomm.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-08-26	efi: stmm: Fix incorrect buffer allocation method	Jan Kiszka
	The communication buffer allocated by setup_mm_hdr() is later on passed to tee_shm_register_kernel_buf(). The latter expects those buffers to be contiguous pages, but setup_mm_hdr() just uses kmalloc(). That can cause various corruptions or BUGs, specifically since commit 9aec2fb0fd5e ("slab: allocate frozen pages"), though it was broken before as well. Fix this by using alloc_pages_exact() instead of kmalloc(). Fixes: c44b6be62e8d ("efi: Add tee-based EFI variable driver") Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Acked-by: Sumit Garg <sumit.garg@oss.qualcomm.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-08-26	HID: quirks: add support for Legion Go dual dinput modes	Antheas Kapenekakis
	The Legion Go features detachable controllers which support a dual dinput mode. In this mode, the controllers appear under a single HID device with two applications. Currently, both controllers appear under the same event device, causing their controls to be mixed up. This patch separates the two so that they can be used independently. In addition, the latest firmware update for the Legion Go swaps the IDs to the ones used by the Legion Go 2, so add those IDs as well. [jkosina@suse.com: improved shortlog] Signed-off-by: Antheas Kapenekakis <lkml@antheas.dev> Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-08-26	HID: elecom: add support for ELECOM M-DT2DRBK	Martin Hilgendorf
	The DT2DRBK trackball has 8 buttons, but the report descriptor only specifies 5. This patch adds the device ID and performs a similar fixup as for other ELECOM devices to enable the remaining 3 buttons. Signed-off-by: Martin Hilgendorf <martin.hilgendorf@posteo.de> Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-08-26	MAINTAINERS: Change Altera-PIO driver maintainer	Adrian Ng Ho Yin
	Update Altera-PIO Driver maintainer from <mun.yew.tham@intel.com> to <adrianhoyin.ng@altera.com> as Mun Yew is no longer with Altera. Signed-off-by: Adrian Ng Ho Yin <adrianhoyin.ng@altera.com> Acked-by: Mun Yew Tham <mun.yew.tham@intel.com> Link: https://lore.kernel.org/r/20250825071637.40441-1-adrianhoyin.ng@altera.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-08-26	gpio: timberdale: fix off-by-one in IRQ type boundary check	Junjie Cao
	timbgpio_irq_type() currently accepts offset == ngpio, violating gpiolib's [0..ngpio-1] contract. This can lead to undefined behavior when computing '1 << offset', and it is also inconsistent with users that iterate with for_each_set_bit(..., ngpio). Tighten the upper bound to reject offset == ngpio. No functional change for in-range offsets. Signed-off-by: Junjie Cao <junjie.cao@intel.com> Link: https://lore.kernel.org/r/20250825090850.127163-1-junjie.cao@intel.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-08-26	xfs: do not propagate ENODATA disk errors into xattr code	Eric Sandeen
	ENODATA (aka ENOATTR) has a very specific meaning in the xfs xattr code; namely, that the requested attribute name could not be found. However, a medium error from disk may also return ENODATA. At best, this medium error may escape to userspace as "attribute not found" when in fact it's an IO (disk) error. At worst, we may oops in xfs_attr_leaf_get() when we do: error = xfs_attr_leaf_hasname(args, &bp); if (error == -ENOATTR) { xfs_trans_brelse(args->trans, bp); return error; } because an ENODATA/ENOATTR error from disk leaves us with a null bp, and the xfs_trans_brelse will then null-deref it. As discussed on the list, we really need to modify the lower level IO functions to trap all disk errors and ensure that we don't let unique errors like this leak up into higher xfs functions - many like this should be remapped to EIO. However, this patch directly addresses a reported bug in the xattr code, and should be safe to backport to stable kernels. A larger-scope patch to handle more unique errors at lower levels can follow later. (Note, prior to 07120f1abdff we did not oops, but we did return the wrong error code to userspace.) Signed-off-by: Eric Sandeen <sandeen@redhat.com> Fixes: 07120f1abdff ("xfs: Add xfs_has_attr and subroutines") Cc: stable@vger.kernel.org # v5.9+ Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-08-26	sched/deadline: Don't count nr_running for dl_server proxy tasks	Yicong Yang
	On CPU offline the kernel stalled with below call trace: INFO: task kworker/0:1:11 blocked for more than 120 seconds. cpuhp hold the cpu hotplug lock endless and stalled vmstat_shepherd. This is because we count nr_running twice on cpuhp enqueuing and failed the wait condition of cpuhp: enqueue_task_fair() // pick cpuhp from idle, rq->nr_running = 0 dl_server_start() [...] add_nr_running() // rq->nr_running = 1 add_nr_running() // rq->nr_running = 2 [switch to cpuhp, waiting on balance_hotplug_wait()] rcuwait_wait_event(rq->nr_running == 1 && ...) // failed, rq->nr_running=2 schedule() // wait again It doesn't make sense to count the dl_server towards runnable tasks, since it runs other tasks. Fixes: 63ba8422f876 ("sched/deadline: Introduce deadline servers") Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20250627035420.37712-1-yangyicong@huawei.com
2025-08-26	sched/deadline: Fix RT task potential starvation when expiry time passed	kuyo chang
	[Symptom] The fair server mechanism, which is intended to prevent fair starvation when higher-priority tasks monopolize the CPU. Specifically, RT tasks on the runqueue may not be scheduled as expected. [Analysis] The log "sched: DL replenish lagged too much" triggered. By memory dump of dl_server: curr = 0xFFFFFF80D6A0AC00 ( dl_server = 0xFFFFFF83CD5B1470( dl_runtime = 0x02FAF080, dl_deadline = 0x3B9ACA00, dl_period = 0x3B9ACA00, dl_bw = 0xCCCC, dl_density = 0xCCCC, runtime = 0x02FAF080, deadline = 0x0000082031EB0E80, flags = 0x0, dl_throttled = 0x0, dl_yielded = 0x0, dl_non_contending = 0x0, dl_overrun = 0x0, dl_server = 0x1, dl_server_active = 0x1, dl_defer = 0x1, dl_defer_armed = 0x0, dl_defer_running = 0x1, dl_timer = ( node = ( expires = 0x000008199756E700), _softexpires = 0x000008199756E700, function = 0xFFFFFFDB9AF44D30 = dl_task_timer, base = 0xFFFFFF83CD5A12C0, state = 0x0, is_rel = 0x0, is_soft = 0x0, clock_update_flags = 0x4, clock = 0x000008204A496900, - The timer expiration time (rq->curr->dl_server->dl_timer->expires) is already in the past, indicating the timer has expired. - The timer state (rq->curr->dl_server->dl_timer->state) is 0. [Suspected Root Cause] The relevant code flow in the throttle path of update_curr_dl_se() as follows: dequeue_dl_entity(dl_se, 0); // the DL entity is dequeued if (unlikely(is_dl_boosted(dl_se) \|\| !start_dl_timer(dl_se))) { if (dl_server(dl_se)) // timer registration fails enqueue_dl_entity(dl_se, ENQUEUE_REPLENISH);//enqueue immediately ... } The failure of `start_dl_timer` is caused by attempting to register a timer with an expiration time that is already in the past. When this situation persists, the code repeatedly re-enqueues the DL entity without properly replenishing or restarting the timer, resulting in RT task may not be scheduled as expected. [Proposed Solution]: Instead of immediately re-enqueuing the DL entity on timer registration failure, this change ensures the DL entity is properly replenished and the timer is restarted, preventing RT potential starvation. Fixes: 63ba8422f876 ("sched/deadline: Introduce deadline servers") Signed-off-by: kuyo chang <kuyo.chang@mediatek.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Closes: https://lore.kernel.org/CAMuHMdXn4z1pioTtBGMfQM0jsLviqS2jwysaWXpoLxWYoGa82w@mail.gmail.com Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Jiri Slaby <jirislaby@kernel.org> Tested-by: Diederik de Haas <didi.debian@cknow.org> Link: https://lkml.kernel.org/r/20250615131129.954975-1-kuyo.chang@mediatek.com
2025-08-26	sched/deadline: Always stop dl-server before changing parameters	Juri Lelli
	Commit cccb45d7c4295 ("sched/deadline: Less agressive dl_server handling") reduced dl-server overhead by delaying disabling servers only after there are no fair task around for a whole period, which means that deadline entities are not dequeued right away on a server stop event. However, the delay opens up a window in which a request for changing server parameters can break per-runqueue running_bw tracking, as reported by Yuri. Close the problematic window by unconditionally calling dl_server_stop() before applying the new parameters (ensuring deadline entities go through an actual dequeue). Fixes: cccb45d7c4295 ("sched/deadline: Less agressive dl_server handling") Reported-by: Yuri Andriaccio <yurand2000@gmail.com> Signed-off-by: Juri Lelli <juri.lelli@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <vschneid@redhat.com> Link: https://lore.kernel.org/r/20250721-upstream-fix-dlserver-lessaggressive-b4-v1-1-4ebc10c87e40@redhat.com
2025-08-26	sched/deadline: Fix dl_server_stopped()	Huacai Chen
	Commit cccb45d7c429 ("sched/deadline: Less agressive dl_server handling") introduces dl_server_stopped(). But it is obvious that dl_server_stopped() should return true if dl_se->dl_server_active is 0. Fixes: cccb45d7c429 ("sched/deadline: Less agressive dl_server handling") Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20250809130419.1980742-1-chenhuacai@loongson.cn
2025-08-26	Revert "drm/tegra: Use dma_buf from GEM object instance"	Thomas Zimmermann
	This reverts commit 482c7e296edc0f594e8869a789a40be53c49bd6a. The dma_buf field in struct drm_gem_object is not stable over the object instance's lifetime. The field becomes NULL when user space releases the final GEM handle on the buffer object. This resulted in a NULL-pointer deref. Workarounds in commit 5307dce878d4 ("drm/gem: Acquire references on GEM handles for framebuffers") and commit f6bfc9afc751 ("drm/framebuffer: Acquire internal references on GEM handles") only solved the problem partially. They especially don't work for buffer objects without a DRM framebuffer associated. Hence, this revert to going back to using .import_attach->dmabuf. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch> Link: https://lore.kernel.org/r/20250715084549.41473-1-tzimmermann@suse.de
2025-08-26	memblock: fix kernel-doc for MEMBLOCK_RSRV_NOINIT	Mike Rapoport (Microsoft)
	The kernel-doc description of MEMBLOCK_RSRV_NOINIT and memblock_reserved_mark_noinit() do not accurately describe their functionality. Expand their kernel doc to make it clear that the user of MEMBLOCK_RSRV_NOINIT is responsible to properly initialize the struct pages for such regions and add more details about effects of using this flag. Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/f8140a17-c4ec-489b-b314-d45abe48bf36@redhat.com Link: https://lore.kernel.org/r/20250826071947.1949725-1-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
2025-08-26	virtio_net: adjust the execution order of function `virtnet_close` during freeze	Junnan Wu
	"Use after free" issue appears in suspend once race occurs when napi poll scheduls after `netif_device_detach` and before napi disables. For details, during suspend flow of virtio-net, the tx queue state is set to "__QUEUE_STATE_DRV_XOFF" by CPU-A. And at some coincidental times, if a TCP connection is still working, CPU-B does `virtnet_poll` before napi disable. In this flow, the state "__QUEUE_STATE_DRV_XOFF" of tx queue will be cleared. This is not the normal process it expects. After that, CPU-A continues to close driver then virtqueue is removed. Sequence likes below: -------------------------------------------------------------------------- CPU-A CPU-B ----- ----- suspend is called A TCP based on virtio-net still work virtnet_freeze \|- virtnet_freeze_down \| \|- netif_device_detach \| \| \|- netif_tx_stop_all_queues \| \| \|- netif_tx_stop_queue \| \| \|- set_bit \| \| (__QUEUE_STATE_DRV_XOFF,...) \| \| softirq rasied \| \| \|- net_rx_action \| \| \|- napi_poll \| \| \|- virtnet_poll \| \| \|- virtnet_poll_cleantx \| \| \|- netif_tx_wake_queue \| \| \|- test_and_clear_bit \| \| (__QUEUE_STATE_DRV_XOFF,...) \| \|- virtnet_close \| \|- virtnet_disable_queue_pair \| \|- virtnet_napi_tx_disable \|- remove_vq_common -------------------------------------------------------------------------- When TCP delayack timer is up, a cpu gets softirq and irq handler `tcp_delack_timer_handler` will be called, which will finally call `start_xmit` in virtio net driver. Then the access to tx virtq will cause panic. The root cause of this issue is that napi tx is not disable before `netif_tx_stop_queue`, once `virnet_poll` schedules in such coincidental time, the tx queue state will be cleared. To solve this issue, adjusts the order of function `virtnet_close` in `virtnet_freeze_down`. Co-developed-by: Ying Xu <ying123.xu@samsung.com> Signed-off-by: Ying Xu <ying123.xu@samsung.com> Signed-off-by: Junnan Wu <junnan01.wu@samsung.com> Message-Id: <20250812090817.3463403-1-junnan01.wu@samsung.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-08-26	virtio_input: Improve freeze handling	Ying Gao
	When executing suspend to ram, if lacking the operations to reset device and free unused buffers before deleting a vq, resource leaks and inconsistent device status will appear. According to chapter "3.3.1 Driver Requirements: Device Cleanup:" of virtio-specification: Driver MUST ensure a virtqueue isn’t live (by device reset) before removing exposed buffers. Therefore, modify the virtinput_freeze function to reset the device and delete the unused buffers before deleting the virtqueue, just like virtinput_remove does. Co-developed-by: Ying Xu <ying123.xu@samsung.com> Signed-off-by: Ying Xu <ying123.xu@samsung.com> Co-developed-by: Junnan Wu <junnan01.wu@samsung.com> Signed-off-by: Junnan Wu <junnan01.wu@samsung.com> Signed-off-by: Ying Gao <ying01.gao@samsung.com> Message-Id: <20250812095118.3622717-1-ying01.gao@samsung.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-08-26	vhost: Fix ioctl # for VHOST_[GS]ET_FORK_FROM_OWNER	Namhyung Kim
	The VHOST_[GS]ET_FEATURES_ARRAY ioctl already took 0x83 and it would result in a build error when the vhost uapi header is used for perf tool build like below. In file included from trace/beauty/ioctl.c:93: tools/perf/trace/beauty/generated/ioctl/vhost_virtio_ioctl_array.c: In function ‘ioctl__scnprintf_vhost_virtio_cmd’: tools/perf/trace/beauty/generated/ioctl/vhost_virtio_ioctl_array.c:36:18: error: initialized field overwritten [-Werror=override-init] 36 \| [0x83] = "SET_FORK_FROM_OWNER", \| ^~~~~~~~~~~~~~~~~~~~~ tools/perf/trace/beauty/generated/ioctl/vhost_virtio_ioctl_array.c:36:18: note: (near initialization for ‘vhost_virtio_ioctl_cmds[131]’) Fixes: 7d9896e9f6d02d8a ("vhost: Reintroduce kthread API and add mode selection") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Message-Id: <20250819063958.833770-1-namhyung@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com>
2025-08-26	Revert "virtio: reject shm region if length is zero"	Igor Torrente
	The commit 206cc44588f7 ("virtio: reject shm region if length is zero") breaks the Virtio-gpu `host_visible` feature. As you can see in the snippet below, host_visible_region is zero because of the `kzalloc`. It's using the `vm_get_shm_region` (drivers/virtio/virtio_mmio.c:536) to read the `addr` and `len` from qemu/crosvm. ``` drivers/gpu/drm/virtio/virtgpu_kms.c 132 vgdev = drmm_kzalloc(dev, sizeof(struct virtio_gpu_device), GFP_KERNEL); [...] 177 if (virtio_get_shm_region(vgdev->vdev, &vgdev->host_visible_region, 178 VIRTIO_GPU_SHM_ID_HOST_VISIBLE)) { ``` Now it always fails. To fix, revert the offending commit. Fixes: 206cc44588f7 ("virtio: reject shm region if length is zero") Signed-off-by: Igor Torrente <igor.torrente@collabora.com> Message-Id: <20250807124145.81816-1-igor.torrente@collabora.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-08-26	vhost/net: Protect ubufs with rcu read lock in vhost_net_ubuf_put()	Nikolay Kuratov
	When operating on struct vhost_net_ubuf_ref, the following execution sequence is theoretically possible: CPU0 is finalizing DMA operation CPU1 is doing VHOST_NET_SET_BACKEND // ubufs->refcount == 2 vhost_net_ubuf_put() vhost_net_ubuf_put_wait_and_free(oldubufs) vhost_net_ubuf_put_and_wait() vhost_net_ubuf_put() int r = atomic_sub_return(1, &ubufs->refcount); // r = 1 int r = atomic_sub_return(1, &ubufs->refcount); // r = 0 wait_event(ubufs->wait, !atomic_read(&ubufs->refcount)); // no wait occurs here because condition is already true kfree(ubufs); if (unlikely(!r)) wake_up(&ubufs->wait); // use-after-free This leads to use-after-free on ubufs access. This happens because CPU1 skips waiting for wake_up() when refcount is already zero. To prevent that use a read-side RCU critical section in vhost_net_ubuf_put(), as suggested by Hillf Danton. For this lock to take effect, free ubufs with kfree_rcu(). Cc: stable@vger.kernel.org Fixes: 0ad8b480d6ee9 ("vhost: fix ref cnt checking deadlock") Reported-by: Andrey Ryabinin <arbn@yandex-team.com> Suggested-by: Hillf Danton <hdanton@sina.com> Signed-off-by: Nikolay Kuratov <kniv@yandex-team.ru> Message-Id: <20250805130917.727332-1-kniv@yandex-team.ru> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-08-26	virtio_pci: Fix misleading comment for queue vector	Liming Wu
	This patch fixes misleading comments in both legacy and modern virtio-pci device implementations. The comments previously referred to the "config vector" for parameters and return values of the `vp_legacy_queue_vector()` and `vp_modern_queue_vector()` functions, which is incorrect. Signed-off-by: Liming Wu <liming.wu@jaguarmicro.com> Message-Id: <20250731092757.1000-1-liming.wu@jaguarmicro.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-08-25	Merge tag 'devicetree-fixes-for-6.17-1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree fixes from Rob Herring: - Fix a memory leak for of_pci_add_properties() failure case. Then fix the introduced UAF. - Add missing IORESOURCE_MEM flag on of_reserved_mem_region_to_resource() - Add already in use vendor prefix "eswin" - Clarify "of of" comment in of_match_device(). After many years of drive-by patches dropping the 2nd "of" (which referred to OpenFirmware), a correct patch finally arrived * tag 'devicetree-fixes-for-6.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: of: dynamic: Fix use after free in of_changeset_add_prop_helper() dt-bindings: vendor-prefixes: add eswin of: reserved_mem: Add missing IORESOURCE_MEM flag on resources of: dynamic: Fix memleak when of_pci_add_properties() failed of: Clarify OF device context in of_match_device() comment
2025-08-25	net: dlink: fix multicast stats being counted incorrectly	Yeounsu Moon
	`McstFramesRcvdOk` counts the number of received multicast packets, and it reports the value correctly. However, reading `McstFramesRcvdOk` clears the register to zero. As a result, the driver was reporting only the packets since the last read, instead of the accumulated total. Fix this by updating the multicast statistics accumulatively instaed of instantaneously. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Tested-on: D-Link DGE-550T Rev-A3 Signed-off-by: Yeounsu Moon <yyyynoom@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20250823182927.6063-3-yyyynoom@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-25	mISDN: hfcpci: Fix warning when deleting uninitialized timer	Vladimir Riabchun
	With CONFIG_DEBUG_OBJECTS_TIMERS unloading hfcpci module leads to the following splat: [ 250.215892] ODEBUG: assert_init not available (active state 0) object: ffffffffc01a3dc0 object type: timer_list hint: 0x0 [ 250.217520] WARNING: CPU: 0 PID: 233 at lib/debugobjects.c:612 debug_print_object+0x1b6/0x2c0 [ 250.218775] Modules linked in: hfcpci(-) mISDN_core [ 250.219537] CPU: 0 UID: 0 PID: 233 Comm: rmmod Not tainted 6.17.0-rc2-g6f713187ac98 #2 PREEMPT(voluntary) [ 250.220940] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 250.222377] RIP: 0010:debug_print_object+0x1b6/0x2c0 [ 250.223131] Code: fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 75 4f 41 56 48 8b 14 dd a0 4e 01 9f 48 89 ee 48 c7 c7 20 46 01 9f e8 cb 84d [ 250.225805] RSP: 0018:ffff888015ea7c08 EFLAGS: 00010286 [ 250.226608] RAX: 0000000000000000 RBX: 0000000000000005 RCX: ffffffff9be93a95 [ 250.227708] RDX: 1ffff1100d945138 RSI: 0000000000000008 RDI: ffff88806ca289c0 [ 250.228993] RBP: ffffffff9f014a00 R08: 0000000000000001 R09: ffffed1002bd4f39 [ 250.230043] R10: ffff888015ea79cf R11: 0000000000000001 R12: 0000000000000001 [ 250.231185] R13: ffffffff9eea0520 R14: 0000000000000000 R15: ffff888015ea7cc8 [ 250.232454] FS: 00007f3208f01540(0000) GS:ffff8880caf5a000(0000) knlGS:0000000000000000 [ 250.233851] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 250.234856] CR2: 00007f32090a7421 CR3: 0000000004d63000 CR4: 00000000000006f0 [ 250.236117] Call Trace: [ 250.236599] <TASK> [ 250.236967] ? trace_irq_enable.constprop.0+0xd4/0x130 [ 250.237920] debug_object_assert_init+0x1f6/0x310 [ 250.238762] ? __pfx_debug_object_assert_init+0x10/0x10 [ 250.239658] ? __lock_acquire+0xdea/0x1c70 [ 250.240369] __try_to_del_timer_sync+0x69/0x140 [ 250.241172] ? __pfx___try_to_del_timer_sync+0x10/0x10 [ 250.242058] ? __timer_delete_sync+0xc6/0x120 [ 250.242842] ? lock_acquire+0x30/0x80 [ 250.243474] ? __timer_delete_sync+0xc6/0x120 [ 250.244262] __timer_delete_sync+0x98/0x120 [ 250.245015] HFC_cleanup+0x10/0x20 [hfcpci] [ 250.245704] __do_sys_delete_module+0x348/0x510 [ 250.246461] ? __pfx___do_sys_delete_module+0x10/0x10 [ 250.247338] do_syscall_64+0xc1/0x360 [ 250.247924] entry_SYSCALL_64_after_hwframe+0x77/0x7f Fix this by initializing hfc_tl timer with DEFINE_TIMER macro. Also, use mod_timer instead of manual timeout update. Fixes: 87c5fa1bb426 ("mISDN: Add different different timer settings for hfc-pci") Fixes: 175302f6b79e ("mISDN: hfcpci: Fix use-after-free bug in hfcpci_softirq") Signed-off-by: Vladimir Riabchun <ferr.lambarginio@gmail.com> Link: https://patch.msgid.link/aKiy2D_LiWpQ5kXq@vova-pc Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-25	Octeontx2-af: Fix NIX X2P calibration failures	Hariprasad Kelam
	Before configuring the NIX block, the AF driver initiates the "NIX block X2P bus calibration" and verifies that NIX interfaces such as CGX and LBK are active and functioning correctly. On few silicon variants(CNF10KA and CNF10KB), X2P calibration failures have been observed on some CGX blocks that are not mapped to the NIX block. Since both NIX-mapped and non-NIX-mapped CGX blocks share the same VENDOR,DEVICE,SUBSYS_DEVID, it's not possible to skip probe based on these parameters. This patch introuduces "is_cgx_mapped_to_nix" API to detect and skip probe of non NIX mapped CGX blocks. Fixes: aba53d5dbcea ("octeontx2-af: NIX block admin queue init") Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Link: https://patch.msgid.link/20250822105805.2236528-1-hkelam@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-26	ata: ahci: Allow ignoring the external/hotplug capability of ports	Damien Le Moal
	Commit 4edf1505b76d ("ata: ahci: Disallow LPM policy control for external ports") introduced disabling link power management (LPM) for ports that are advertized as external/hotplug capable. This is necessary to force the maximum power policy (ATA_LPM_MAX_POWER) onto the port link to ensure that the hotplug capability of the port is functional. However, doing so blindly for all ports can prevent systems from going into a low power state, even if the external/hotplug ports on the system are unused. E.g., a laptop may see the internal SATA slot of a docking station as an external hotplug capable port, and in such case, the user may prefer to not use the port and to favor instead enabling LPM to allow the laptop to transition to low power states. Since there is no easy method to automatically detect such choice, introduce the new mask_port_ext module parameter to allow a user to ignore the external/hotplug capability of a port. The format for this parameter value is identical to the format used for the mask_port_map parameter: a mask can be defined for all AHCI adapters of a system or for a particular adapters identified with their PCI IDs (bus:dev.func format). The function ahci_get_port_map_mask() is renamed to ahci_get_port_mask() and modified to return a mask, either for the port map mask of an adapter (to ignore ports) or for the external/hotplug capability of an adapter. Differentiation between map_port_mask and map_port_ext_mask is done by passing the parameter string to ahci_get_port_mask() as a second argument. To be consistent with this change, the function ahci_apply_port_map_mask() is renamed ahci_port_mask() and changed to return a mask value. The mask for the external/hotplug capability for an adapter, if defined by the map_port_ext_mask parameter, is stored in the new field mask_port_ext of struct ahci_host_priv. ahci_mark_external_port() is modified to not set the ATA_PFLAG_EXTERNAL flag for a port if hpriv->mask_port_ext includes the number of the port. In such case, an information message is printed to notify that the external/hotplug capability is being ignored. Reported-by: Dieter Mummenschanz <dmummenschanz@web.de> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220465 Fixes: 4edf1505b76d ("ata: ahci: Disallow LPM policy control for external ports") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Tested-by: Dieter Mummenschanz <dmummenschanz@web.de>
2025-08-25	perf symbol: Add blocking argument to filename__read_build_id	Ian Rogers
	When synthesizing build-ids, for build ID mmap2 events, they will be added for data mmaps if -d/--data is specified. The files opened for their build IDs may block on the open causing perf to hang during synthesis. There is some robustness in existing calls to filename__read_build_id by checking the file path is to a regular file, which unfortunately fails for symlinks. Rather than adding more is_regular_file calls, switch filename__read_build_id to take a "block" argument and specify O_NONBLOCK when this is false. The existing is_regular_file checking callers and the event synthesis callers are made to pass false and thereby avoiding the hang. Fixes: 53b00ff358dc ("perf record: Make --buildid-mmap the default") Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250823000024.724394-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-08-25	perf symbol-minimal: Fix ehdr reading in filename__read_build_id	Ian Rogers
	The e_ident is part of the ehdr and so reading it a second time would mean the read ehdr was displaced by 16-bytes. Switch from stdio to open/read/lseek syscalls for similarity with the symbol-elf version of the function and so that later changes can alter then open flags. Fixes: fef8f648bb47 ("perf symbol: Fix use-after-free in filename__read_build_id") Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250823000024.724394-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-08-25	soc: qcom: use no-UBWC config for MSM8956/76	Dmitry Baryshkov
	Both MSM8956 and MSM8976 have MDSS 1.11 which doesn't support UBWC (although they also have Adreno 510, which might support UBWC). Disable UBWC support for those platforms. Fixes: 1924272b9ce1 ("soc: qcom: Add UBWC config provider") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Patchwork: https://patchwork.freedesktop.org/patch/668503/ Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2025-08-25	soc: qcom: add configuration for MSM8929	Dmitry Baryshkov
	MSM8929 is similar to MSM8939, it doesn't support UBWC. Provide no-UBWC config for the platform. Fixes: 197713d0cf01 ("soc: qcom: ubwc: provide no-UBWC configuration") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Patchwork: https://patchwork.freedesktop.org/patch/668502/ Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2025-08-25	soc: qcom: ubwc: add more missing platforms	Dmitry Baryshkov
	Add UBWC configuration for SDA660 (modem-less variant of SDM660), SDM450 (similar to MSM8953), SDM632 (similar to MSM8953) and SM7325 (similar to SC7280). Fixes: 1924272b9ce1 ("soc: qcom: Add UBWC config provider") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Patchwork: https://patchwork.freedesktop.org/patch/668501/ Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2025-08-25	soc: qcom: ubwc: use no-uwbc config for MSM8917	Dmitry Baryshkov
	MSM8917 has MDSS 1.15 and Adreno 308, neither of which support UBWC. Change UBWC configuration to point out that UBWC is not supported on this platform. Fixes: 1924272b9ce1 ("soc: qcom: Add UBWC config provider") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Patchwork: https://patchwork.freedesktop.org/patch/668500/ Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2025-08-25	drm/msm/dpu: Add a null ptr check for dpu_encoder_needs_modeset	Chenyuan Yang
	The drm_atomic_get_new_connector_state() can return NULL if the connector is not part of the atomic state. Add a check to prevent a NULL pointer dereference. This follows the same pattern used in dpu_encoder_update_topology() within the same file, which checks for NULL before using conn_state. Signed-off-by: Chenyuan Yang <chenyuan0y@gmail.com> Fixes: 1ce69c265a53 ("drm/msm/dpu: move resource allocation to CRTC") Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Patchwork: https://patchwork.freedesktop.org/patch/665188/ Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2025-08-25	dt-bindings: display/msm: qcom,mdp5: drop lut clock	Dmitry Baryshkov
	None of MDP5 platforms have a LUT clock on the display-controller, it was added by the mistake. Drop it, fixing DT warnings on MSM8976 / MSM8956 platforms. Technically it's an ABI break, but no other platforms are affected. Fixes: 385c8ac763b3 ("dt-bindings: display/msm: convert MDP5 schema to YAML format") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Acked-by: Rob Herring (Arm) <robh@kernel.org> Patchwork: https://patchwork.freedesktop.org/patch/667822/ Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2025-08-25	drm/gpuvm: fix various typos in .c and .h gpuvm file	Alice Ryhl
	After working with this code for a while, I came across several typos. This patch fixes them. Signed-off-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20250825-gpuvm-typo-fix-v1-1-14e9e78e28e6@google.com Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-08-25	ixgbe: fix ixgbe_orom_civd_info struct layout	Jedrzej Jagielski
	The current layout of struct ixgbe_orom_civd_info causes incorrect data storage due to compiler-inserted padding. This results in issues when writing OROM data into the structure. Add the __packed attribute to ensure the structure layout matches the expected binary format without padding. Fixes: 70db0788a262 ("ixgbe: read the OROM version information") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-08-25	ice: fix incorrect counter for buffer allocation failures	Michal Kubiak
	Currently, the driver increments `alloc_page_failed` when buffer allocation fails in `ice_clean_rx_irq()`. However, this counter is intended for page allocation failures, not buffer allocation issues. This patch corrects the counter by incrementing `alloc_buf_failed` instead, ensuring accurate statistics reporting for buffer allocation failures. Fixes: 2fba7dc5157b ("ice: Add support for XDP multi-buffer on Rx side") Reported-by: Jacob Keller <jacob.e.keller@intel.com> Suggested-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Priya Singh <priyax.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-08-25	ice: use fixed adapter index for E825C embedded devices	Jacob Keller
	The ice_adapter structure is used by the ice driver to connect multiple physical functions of a device in software. It was introduced by commit 0e2bddf9e5f9 ("ice: add ice_adapter for shared data across PFs on the same NIC") and is primarily used for PTP support, as well as for handling certain cross-PF synchronization. The original design of ice_adapter used PCI address information to determine which devices should be connected. This was extended to support E825C devices by commit fdb7f54700b1 ("ice: Initial support for E825C hardware in ice_adapter"), which used the device ID for E825C devices instead of the PCI address. Later, commit 0093cb194a75 ("ice: use DSN instead of PCI BDF for ice_adapter index") replaced the use of Bus/Device/Function addressing with use of the device serial number. E825C devices may appear in "Dual NAC" configuration which has multiple physical devices tied to the same clock source and which need to use the same ice_adapter. Unfortunately, each "NAC" has its own NVM which has its own unique Device Serial Number. Thus, use of the DSN for connecting ice_adapter does not work properly. It "worked" in the pre-production systems because the DSN was not initialized on the test NVMs and all the NACs had the same zero'd serial number. Since we cannot rely on the DSN, lets fall back to the logic in the original E825C support which used the device ID. This is safe for E825C only because of the embedded nature of the device. It isn't a discreet adapter that can be plugged into an arbitrary system. All E825C devices on a given system are connected to the same clock source and need to be configured through the same PTP clock. To make this separation clear, reserve bit 63 of the 64-bit index values as a "fixed index" indicator. Always clear this bit when using the device serial number as an index. For E825C, use a fixed value defined as the 0x579C E825C backplane device ID bitwise ORed with the fixed index indicator. This is slightly different than the original logic of just using the device ID directly. Doing so prevents a potential issue with systems where only one of the NACs is connected with an external PHY over SGMII. In that case, one NAC would have the E825C_SGMII device ID, but the other would not. Separate the determination of the full 64-bit index from the 32-bit reduction logic. Provide both ice_adapter_index() and a wrapping ice_adapter_xa_index() which handles reducing the index to a long on 32-bit systems. As before, cache the full index value in the adapter structure to warn about collisions. This fixes issues with E825C not initializing PTP on both NACs, due to failure to connect the appropriate devices to the same ice_adapter. Fixes: 0093cb194a75 ("ice: use DSN instead of PCI BDF for ice_adapter index") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-08-25	ice: don't leave device non-functional if Tx scheduler config fails	Jacob Keller
	The ice_cfg_tx_topo function attempts to apply Tx scheduler topology configuration based on NVM parameters, selecting either a 5 or 9 layer topology. As part of this flow, the driver acquires the "Global Configuration Lock", which is a hardware resource associated with programming the DDP package to the device. This "lock" is implemented by firmware as a way to guarantee that only one PF can program the DDP for a device. Unlike a traditional lock, once a PF has acquired this lock, no other PF will be able to acquire it again (including that PF) until a CORER of the device. Future requests to acquire the lock report that global configuration has already completed. The following flow is used to program the Tx topology: * Read the DDP package for scheduler configuration data * Acquire the global configuration lock * Program Tx scheduler topology according to DDP package data * Trigger a CORER which clears the global configuration lock This is followed by the flow for programming the DDP package: * Acquire the global configuration lock (again) * Download the DDP package to the device * Release the global configuration lock. However, if configuration of the Tx topology fails, (i.e. ice_get_set_tx_topo returns an error code), the driver exits ice_cfg_tx_topo() immediately, and fails to trigger CORER. While the global configuration lock is held, the firmware rejects most AdminQ commands, as it is waiting for the DDP package download (or Tx scheduler topology programming) to occur. The current driver flows assume that the global configuration lock has been reset by CORER after programming the Tx topology. Thus, the same PF attempts to acquire the global lock again, and fails. This results in the driver reporting "an unknown error occurred when loading the DDP package". It then attempts to enter safe mode, but ultimately fails to finish ice_probe() since nearly all AdminQ command report error codes, and the driver stops loading the device at some point during its initialization. The only currently known way that ice_get_set_tx_topo() can fail is with certain older DDP packages which contain invalid topology configuration, on firmware versions which strictly validate this data. The most recent releases of the DDP have resolved the invalid data. However, it is still poor practice to essentially brick the device, and prevent access to the device even through safe mode or recovery mode. It is also plausible that this command could fail for some other reason in the future. We cannot simply release the global lock after a failed call to ice_get_set_tx_topo(). Releasing the lock indicates to firmware that global configuration (downloading of the DDP) has completed. Future attempts by this or other PFs to load the DDP will fail with a report that the DDP package has already been downloaded. Then, PFs will enter safe mode as they realize that the package on the device does not meet the minimum version requirement to load. The reported error messages are confusing, as they indicate the version of the default "safe mode" package in the NVM, rather than the version of the file loaded from /lib/firmware. Instead, we need to trigger CORER to clear global configuration. This is the lowest level of hardware reset which clears the global configuration lock and related state. It also clears any already downloaded DDP. Crucially, it does not clear the Tx scheduler topology configuration. Refactor ice_cfg_tx_topo() to always trigger a CORER after acquiring the global lock, regardless of success or failure of the topology configuration. We need to re-initialize the HW structure when we trigger the CORER. Thus, it makes sense for this to be the responsibility of ice_cfg_tx_topo() rather than its caller, ice_init_tx_topology(). This avoids needless re-initialization in cases where we don't attempt to update the Tx scheduler topology, such as if it has already been programmed. There is one catch: failure to re-initialize the HW struct should stop ice_probe(). If this function fails, we won't have a valid HW structure and cannot ensure the device is functioning properly. To handle this, ensure ice_cfg_tx_topo() returns a limited set of error codes. Set aside one specifically, -ENODEV, to indicate that the ice_init_tx_topology() should fail and stop probe. Other error codes indicate failure to apply the Tx scheduler topology. This is treated as a non-fatal error, with an informational message informing the system administrator that the updated Tx topology did not apply. This allows the device to load and function with the default Tx scheduler topology, rather than failing to load entirely. Note that this use of CORER will not result in loops with future PFs attempting to also load the invalid Tx topology configuration. The first PF will acquire the global configuration lock as part of programming the DDP. Each PF after this will attempt to acquire the global lock as part of programming the Tx topology, and will fail with the indication from firmware that global configuration is already complete. Tx scheduler topology configuration is only performed during driver init (probe or devlink reload) and not during cleanup for a CORER that happens after probe completes. Fixes: 91427e6d9030 ("ice: Support 5 layer topology") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-08-25	ice: fix NULL pointer dereference in ice_unplug_aux_dev() on reset	Emil Tantilov
	Issuing a reset when the driver is loaded without RDMA support, will results in a crash as it attempts to remove RDMA's non-existent auxbus device: echo 1 > /sys/class/net/<if>/device/reset BUG: kernel NULL pointer dereference, address: 0000000000000008 ... RIP: 0010:ice_unplug_aux_dev+0x29/0x70 [ice] ... Call Trace: <TASK> ice_prepare_for_reset+0x77/0x260 [ice] pci_dev_save_and_disable+0x2c/0x70 pci_reset_function+0x88/0x130 reset_store+0x5a/0xa0 kernfs_fop_write_iter+0x15e/0x210 vfs_write+0x273/0x520 ksys_write+0x6b/0xe0 do_syscall_64+0x79/0x3b0 entry_SYSCALL_64_after_hwframe+0x76/0x7e ice_unplug_aux_dev() checks pf->cdev_info->adev for NULL pointer, but pf->cdev_info will also be NULL, leading to the deref in the trace above. Introduce a flag to be set when the creation of the auxbus device is successful, to avoid multiple NULL pointer checks in ice_unplug_aux_dev(). Fixes: c24a65b6a27c7 ("iidc/ice/irdma: Update IDC to support multiple consumers") Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-08-25	drm/nouveau: remove unused memory target test	Timur Tabi
	The memory target check is a hold-over from a refactor. It's harmless but distracting, so just remove it. Fixes: 2541626cfb79 ("drm/nouveau/acr: use common falcon HS FW code for ACR FWs") Signed-off-by: Timur Tabi <ttabi@nvidia.com> Link: https://lore.kernel.org/r/20250813001004.2986092-3-ttabi@nvidia.com Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-08-25	drm/nouveau: remove unused increment in gm200_flcn_pio_imem_wr	Timur Tabi
	The 'tag' parameter is passed by value and is not actually used after being incremented, so remove the increment. It's the function that calls gm200_flcn_pio_imem_wr that is supposed to (and does) increment 'tag'. Fixes: 0e44c2170876 ("drm/nouveau/flcn: new code to load+boot simple HS FWs (VPR scrubber)") Reviewed-by: Philipp Stanner <phasta@kernel.org> Signed-off-by: Timur Tabi <ttabi@nvidia.com> Link: https://lore.kernel.org/r/20250813001004.2986092-2-ttabi@nvidia.com Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-08-25	drm/nouveau: fix error path in nvkm_gsp_fwsec_v2	Timur Tabi
	Function nvkm_gsp_fwsec_v2() sets 'ret' if the kmemdup() call fails, but it never uses or returns 'ret' after that point. We always need to release the firmware regardless, so do that and then check for error. Fixes: 176fdcbddfd2 ("drm/nouveau/gsp/r535: add support for booting GSP-RM") Cc: stable@vger.kernel.org # v6.7+ Signed-off-by: Timur Tabi <ttabi@nvidia.com> Link: https://lore.kernel.org/r/20250813001004.2986092-1-ttabi@nvidia.com Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-08-25	Merge tag 'pinctrl-v6.17-2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: - Module macro parameter fix for the meson driver so that it actually modprobes - ACPI quirk for the ASUS ProArt PX13 - Build dependency for the STMFX driver - Proper return value for the pinconf callbacks in the Airhoa driver * tag 'pinctrl-v6.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: airoha: Fix return value in pinconf callbacks pinctrl: STMFX: add missing HAS_IOMEM dependency gpiolib: acpi: Add quirk for ASUS ProArt PX13 pinctrl: meson: Fix typo in device table macro
2025-08-25	smb3 client: fix return code mapping of remap_file_range	Steve French
	We were returning -EOPNOTSUPP for various remap_file_range cases but for some of these the copy_file_range_syscall() requires -EINVAL to be returned (e.g. where source and target file ranges overlap when source and target are the same file). This fixes xfstest generic/157 which was expecting EINVAL for that (and also e.g. for when the src offset is beyond end of file). Cc: stable@vger.kernel.org Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-08-25	loop: fix zero sized loop for block special file	Yu Kuai
	By default, /dev/sda is block special file from devtmpfs, getattr will return file size as zero, causing loop failed for raw block device. We can add bdev_statx() to return device size, however this may introduce changes that are not acknowledged by user. Fix this problem by reverting changes for block special file, file mapping host is set to bdev inode while opening, and use i_size_read() directly to get device size. Fixes: 47b71abd5846 ("loop: use vfs_getattr_nosec for accurate file size") Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202508200409.b2459c02-lkp@intel.com Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250825093205.3684121-1-yukuai1@huaweicloud.com [axboe: fix spelling error] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-25	ARM: stacktrace: include asm/sections.h in asm/stacktrace.h	Arnd Bergmann
	The recent kstack erase changes appear to have uncovered an existing issue with a missing header inclusion: In file included from drivers/misc/lkdtm/kstack_erase.c:12: In file included from include/linux/kstack_erase.h:16: arch/arm/include/asm/stacktrace.h:48:21: error: call to undeclared function 'in_entry_text'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] 48 \| frame->ex_frame = in_entry_text(frame->pc); \| ^ Include asm/sections.h here so the compiler can see the in_entry_text() declaration. Fixes: 752ec621ef5c ("ARM: 9234/1: stacktrace: Avoid duplicate saving of exception PC value") Cc: Kees Cook <kees@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20250807071902.4077714-1-arnd@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>