summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-07-01sched/core: Fix migrate_swap() vs. hotplugPeter Zijlstra
On Mon, Jun 02, 2025 at 03:22:13PM +0800, Kuyo Chang wrote: > So, the potential race scenario is: > > CPU0 CPU1 > // doing migrate_swap(cpu0/cpu1) > stop_two_cpus() > ... > // doing _cpu_down() > sched_cpu_deactivate() > set_cpu_active(cpu, false); > balance_push_set(cpu, true); > cpu_stop_queue_two_works > __cpu_stop_queue_work(stopper1,...); > __cpu_stop_queue_work(stopper2,..); > stop_cpus_in_progress -> true > preempt_enable(); > ... > 1st balance_push > stop_one_cpu_nowait > cpu_stop_queue_work > __cpu_stop_queue_work > list_add_tail -> 1st add push_work > wake_up_q(&wakeq); -> "wakeq is empty. > This implies that the stopper is at wakeq@migrate_swap." > preempt_disable > wake_up_q(&wakeq); > wake_up_process // wakeup migrate/0 > try_to_wake_up > ttwu_queue > ttwu_queue_cond ->meet below case > if (cpu == smp_processor_id()) > return false; > ttwu_do_activate > //migrate/0 wakeup done > wake_up_process // wakeup migrate/1 > try_to_wake_up > ttwu_queue > ttwu_queue_cond > ttwu_queue_wakelist > __ttwu_queue_wakelist > __smp_call_single_queue > preempt_enable(); > > 2nd balance_push > stop_one_cpu_nowait > cpu_stop_queue_work > __cpu_stop_queue_work > list_add_tail -> 2nd add push_work, so the double list add is detected > ... > ... > cpu1 get ipi, do sched_ttwu_pending, wakeup migrate/1 > So this balance_push() is part of schedule(), and schedule() is supposed to switch to stopper task, but because of this race condition, stopper task is stuck in WAKING state and not actually visible to be picked. Therefore CPU1 can do another schedule() and end up doing another balance_push() even though the last one hasn't been done yet. This is a confluence of fail, where both wake_q and ttwu_wakelist can cause crucial wakeups to be delayed, resulting in the malfunction of balance_push. Since there is only a single stopper thread to be woken, the wake_q doesn't really add anything here, and can be removed in favour of direct wakeups of the stopper thread. Then add a clause to ttwu_queue_cond() to ensure the stopper threads are never queued / delayed. Of all 3 moving parts, the last addition was the balance_push() machinery, so pick that as the point the bug was introduced. Fixes: 2558aacff858 ("sched/hotplug: Ensure only per-cpu kthreads run during hotplug") Reported-by: Kuyo Chang <kuyo.chang@mediatek.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Kuyo Chang <kuyo.chang@mediatek.com> Link: https://lkml.kernel.org/r/20250605100009.GO39944@noisy.programming.kicks-ass.net
2025-07-01sched: Fix preemption string of preempt_dynamic_noneThomas Weißschuh
Zero is a valid value for "preempt_dynamic_mode", namely "preempt_dynamic_none". Fix the off-by-one in preempt_model_str(), so that "preempty_dynamic_none" is correctly formatted as PREEMPT(none) instead of PREEMPT(undef). Fixes: 8bdc5daaa01e ("sched: Add a generic function to return the preemption string") Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Tested-by: Shrikanth Hegde <sshegde@linux.ibm.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20250626-preempt-str-none-v2-1-526213b70a89@linutronix.de
2025-07-01nvme-multipath: fix suspicious RCU usage warningGeliang Tang
When I run the NVME over TCP test in virtme-ng, I get the following "suspicious RCU usage" warning in nvme_mpath_add_sysfs_link(): ''' [ 5.024557][ T44] nvmet: Created nvm controller 1 for subsystem nqn.2025-06.org.nvmexpress.mptcp for NQN nqn.2014-08.org.nvmexpress:uuid:f7f6b5e0-ff97-4894-98ac-c85309e0bc77. [ 5.027401][ T183] nvme nvme0: creating 2 I/O queues. [ 5.029017][ T183] nvme nvme0: mapped 2/0/0 default/read/poll queues. [ 5.032587][ T183] nvme nvme0: new ctrl: NQN "nqn.2025-06.org.nvmexpress.mptcp", addr 127.0.0.1:4420, hostnqn: nqn.2014-08.org.nvmexpress:uuid:f7f6b5e0-ff97-4894-98ac-c85309e0bc77 [ 5.042214][ T25] [ 5.042440][ T25] ============================= [ 5.042579][ T25] WARNING: suspicious RCU usage [ 5.042705][ T25] 6.16.0-rc3+ #23 Not tainted [ 5.042812][ T25] ----------------------------- [ 5.042934][ T25] drivers/nvme/host/multipath.c:1203 RCU-list traversed in non-reader section!! [ 5.043111][ T25] [ 5.043111][ T25] other info that might help us debug this: [ 5.043111][ T25] [ 5.043341][ T25] [ 5.043341][ T25] rcu_scheduler_active = 2, debug_locks = 1 [ 5.043502][ T25] 3 locks held by kworker/u9:0/25: [ 5.043615][ T25] #0: ffff888008730948 ((wq_completion)async){+.+.}-{0:0}, at: process_one_work+0x7ed/0x1350 [ 5.043830][ T25] #1: ffffc900001afd40 ((work_completion)(&entry->work)){+.+.}-{0:0}, at: process_one_work+0xcf3/0x1350 [ 5.044084][ T25] #2: ffff888013ee0020 (&head->srcu){.+.+}-{0:0}, at: nvme_mpath_add_sysfs_link.part.0+0xb4/0x3a0 [ 5.044300][ T25] [ 5.044300][ T25] stack backtrace: [ 5.044439][ T25] CPU: 0 UID: 0 PID: 25 Comm: kworker/u9:0 Not tainted 6.16.0-rc3+ #23 PREEMPT(full) [ 5.044441][ T25] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 5.044442][ T25] Workqueue: async async_run_entry_fn [ 5.044445][ T25] Call Trace: [ 5.044446][ T25] <TASK> [ 5.044449][ T25] dump_stack_lvl+0x6f/0xb0 [ 5.044453][ T25] lockdep_rcu_suspicious.cold+0x4f/0xb1 [ 5.044457][ T25] nvme_mpath_add_sysfs_link.part.0+0x2fb/0x3a0 [ 5.044459][ T25] ? queue_work_on+0x90/0xf0 [ 5.044461][ T25] ? lockdep_hardirqs_on+0x78/0x110 [ 5.044466][ T25] nvme_mpath_set_live+0x1e9/0x4f0 [ 5.044470][ T25] nvme_mpath_add_disk+0x240/0x2f0 [ 5.044472][ T25] ? __pfx_nvme_mpath_add_disk+0x10/0x10 [ 5.044475][ T25] ? add_disk_fwnode+0x361/0x580 [ 5.044480][ T25] nvme_alloc_ns+0x81c/0x17c0 [ 5.044483][ T25] ? kasan_quarantine_put+0x104/0x240 [ 5.044487][ T25] ? __pfx_nvme_alloc_ns+0x10/0x10 [ 5.044495][ T25] ? __pfx_nvme_find_get_ns+0x10/0x10 [ 5.044496][ T25] ? rcu_read_lock_any_held+0x45/0xa0 [ 5.044498][ T25] ? validate_chain+0x232/0x4f0 [ 5.044503][ T25] nvme_scan_ns+0x4c8/0x810 [ 5.044506][ T25] ? __pfx_nvme_scan_ns+0x10/0x10 [ 5.044508][ T25] ? find_held_lock+0x2b/0x80 [ 5.044512][ T25] ? ktime_get+0x16d/0x220 [ 5.044517][ T25] ? kvm_clock_get_cycles+0x18/0x30 [ 5.044520][ T25] ? __pfx_nvme_scan_ns_async+0x10/0x10 [ 5.044522][ T25] async_run_entry_fn+0x97/0x560 [ 5.044523][ T25] ? rcu_is_watching+0x12/0xc0 [ 5.044526][ T25] process_one_work+0xd3c/0x1350 [ 5.044532][ T25] ? __pfx_process_one_work+0x10/0x10 [ 5.044536][ T25] ? assign_work+0x16c/0x240 [ 5.044539][ T25] worker_thread+0x4da/0xd50 [ 5.044545][ T25] ? __pfx_worker_thread+0x10/0x10 [ 5.044546][ T25] kthread+0x356/0x5c0 [ 5.044548][ T25] ? __pfx_kthread+0x10/0x10 [ 5.044549][ T25] ? ret_from_fork+0x1b/0x2e0 [ 5.044552][ T25] ? __lock_release.isra.0+0x5d/0x180 [ 5.044553][ T25] ? ret_from_fork+0x1b/0x2e0 [ 5.044555][ T25] ? rcu_is_watching+0x12/0xc0 [ 5.044557][ T25] ? __pfx_kthread+0x10/0x10 [ 5.044559][ T25] ret_from_fork+0x218/0x2e0 [ 5.044561][ T25] ? __pfx_kthread+0x10/0x10 [ 5.044562][ T25] ret_from_fork_asm+0x1a/0x30 [ 5.044570][ T25] </TASK> ''' This patch uses sleepable RCU version of helper list_for_each_entry_srcu() instead of list_for_each_entry_rcu() to fix it. Fixes: 4dbd2b2ebe4c ("nvme-multipath: Add visibility for round-robin io-policy") Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-07-01drm/i915/gsc: mei interrupt top half should be in irq disabled contextJunxiao Chang
MEI GSC interrupt comes from i915. It has top half and bottom half. Top half is called from i915 interrupt handler. It should be in irq disabled context. With RT kernel, by default i915 IRQ handler is in threaded IRQ. MEI GSC top half might be in threaded IRQ context. generic_handle_irq_safe API could be called from either IRQ or process context, it disables local IRQ then calls MEI GSC interrupt top half. This change fixes A380/A770 GPU boot hang issue with RT kernel. Fixes: 1e3dc1d8622b ("drm/i915/gsc: add gsc as a mei auxiliary device") Tested-by: Furong Zhou <furong.zhou@intel.com> Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Junxiao Chang <junxiao.chang@intel.com> Link: https://lore.kernel.org/r/20250425151108.643649-1-junxiao.chang@intel.com Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit dccf655f69002d496a527ba441b4f008aa5bebbf) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2025-07-01drm/i915/gt: Fix timeline left held on VMA alloc errorJanusz Krzysztofik
The following error has been reported sporadically by CI when a test unbinds the i915 driver on a ring submission platform: <4> [239.330153] ------------[ cut here ]------------ <4> [239.330166] i915 0000:00:02.0: [drm] drm_WARN_ON(dev_priv->mm.shrink_count) <4> [239.330196] WARNING: CPU: 1 PID: 18570 at drivers/gpu/drm/i915/i915_gem.c:1309 i915_gem_cleanup_early+0x13e/0x150 [i915] ... <4> [239.330640] RIP: 0010:i915_gem_cleanup_early+0x13e/0x150 [i915] ... <4> [239.330942] Call Trace: <4> [239.330944] <TASK> <4> [239.330949] i915_driver_late_release+0x2b/0xa0 [i915] <4> [239.331202] i915_driver_release+0x86/0xa0 [i915] <4> [239.331482] devm_drm_dev_init_release+0x61/0x90 <4> [239.331494] devm_action_release+0x15/0x30 <4> [239.331504] release_nodes+0x3d/0x120 <4> [239.331517] devres_release_all+0x96/0xd0 <4> [239.331533] device_unbind_cleanup+0x12/0x80 <4> [239.331543] device_release_driver_internal+0x23a/0x280 <4> [239.331550] ? bus_find_device+0xa5/0xe0 <4> [239.331563] device_driver_detach+0x14/0x20 ... <4> [357.719679] ---[ end trace 0000000000000000 ]--- If the test also unloads the i915 module then that's followed with: <3> [357.787478] ============================================================================= <3> [357.788006] BUG i915_vma (Tainted: G U W N ): Objects remaining on __kmem_cache_shutdown() <3> [357.788031] ----------------------------------------------------------------------------- <3> [357.788204] Object 0xffff888109e7f480 @offset=29824 <3> [357.788670] Allocated in i915_vma_instance+0xee/0xc10 [i915] age=292729 cpu=4 pid=2244 <4> [357.788994] i915_vma_instance+0xee/0xc10 [i915] <4> [357.789290] init_status_page+0x7b/0x420 [i915] <4> [357.789532] intel_engines_init+0x1d8/0x980 [i915] <4> [357.789772] intel_gt_init+0x175/0x450 [i915] <4> [357.790014] i915_gem_init+0x113/0x340 [i915] <4> [357.790281] i915_driver_probe+0x847/0xed0 [i915] <4> [357.790504] i915_pci_probe+0xe6/0x220 [i915] ... Closer analysis of CI results history has revealed a dependency of the error on a few IGT tests, namely: - igt@api_intel_allocator@fork-simple-stress-signal, - igt@api_intel_allocator@two-level-inception-interruptible, - igt@gem_linear_blits@interruptible, - igt@prime_mmap_coherency@ioctl-errors, which invisibly trigger the issue, then exhibited with first driver unbind attempt. All of the above tests perform actions which are actively interrupted with signals. Further debugging has allowed to narrow that scope down to DRM_IOCTL_I915_GEM_EXECBUFFER2, and ring_context_alloc(), specific to ring submission, in particular. If successful then that function, or its execlists or GuC submission equivalent, is supposed to be called only once per GEM context engine, followed by raise of a flag that prevents the function from being called again. The function is expected to unwind its internal errors itself, so it may be safely called once more after it returns an error. In case of ring submission, the function first gets a reference to the engine's legacy timeline and then allocates a VMA. If the VMA allocation fails, e.g. when i915_vma_instance() called from inside is interrupted with a signal, then ring_context_alloc() fails, leaving the timeline held referenced. On next I915_GEM_EXECBUFFER2 IOCTL, another reference to the timeline is got, and only that last one is put on successful completion. As a consequence, the legacy timeline, with its underlying engine status page's VMA object, is still held and not released on driver unbind. Get the legacy timeline only after successful allocation of the context engine's VMA. v2: Add a note on other submission methods (Krzysztof Karas): Both execlists and GuC submission use lrc_alloc() which seems free from a similar issue. Fixes: 75d0a7f31eec ("drm/i915: Lift timeline into intel_context") Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12061 Cc: Chris Wilson <chris.p.wilson@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Krzysztof Karas <krzysztof.karas@intel.com> Reviewed-by: Sebastian Brzezinka <sebastian.brzezinka@intel.com> Reviewed-by: Krzysztof Niemiec <krzysztof.niemiec@intel.com> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Reviewed-by: Nitin Gote <nitin.r.gote@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://lore.kernel.org/r/20250611104352.1014011-2-janusz.krzysztofik@linux.intel.com (cherry picked from commit cc43422b3cc79eacff4c5a8ba0d224688ca9dd4f) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2025-06-30drm/vmwgfx: Fix guests running with TDX/SEVMarko Kiiskila
Commit 81256a50aa0f ("x86/mm: Make memremap(MEMREMAP_WB) map memory as encrypted by default") changed the default behavior of memremap(MEMREMAP_WB) and started mapping memory as encrypted. The driver requires the fifo memory to be decrypted to communicate with the host but was relaying on the old default behavior of memremap(MEMREMAP_WB) and thus broke. Fix it by explicitly specifying the desired behavior and passing MEMREMAP_DEC to memremap. Fixes: 81256a50aa0f ("x86/mm: Make memremap(MEMREMAP_WB) map memory as encrypted by default") Signed-off-by: Marko Kiiskila <marko.kiiskila@broadcom.com> Signed-off-by: Zack Rusin <zack.rusin@broadcom.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Link: https://lore.kernel.org/r/20250618192926.1092450-1-zack.rusin@broadcom.com
2025-06-30Merge tag 'for-net-2025-06-27' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - MGMT: set_mesh: update LE scan interval and window - MGMT: mesh_send: check instances prior disabling advertising - hci_sync: revert some mesh modifications - hci_sync: Set extended advertising data synchronously - hci_sync: Prevent unintended pause by checking if advertising is active * tag 'for-net-2025-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: HCI: Set extended advertising data synchronously Bluetooth: MGMT: mesh_send: check instances prior disabling advertising Bluetooth: MGMT: set_mesh: update LE scan interval and window Bluetooth: hci_sync: revert some mesh modifications Bluetooth: Prevent unintended pause by checking if advertising is active ==================== Link: https://patch.msgid.link/20250627181601.520435-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-30MAINTAINERS: adjust file entry after renaming rzv2h-gbeth dtbLukas Bulwahn
Commit d53320aeef18 ("dt-bindings: net: Rename renesas,r9a09g057-gbeth.yaml") renames the net devicetree binding renesas,r9a09g057-gbeth.yaml to renesas,rzv2h-gbeth.yaml, but misses to adjust the file entry in the RENESAS RZ/V2H(P) DWMAC GBETH GLUE LAYER DRIVER section in MAINTAINERS. Adjust the file entry after this file renaming. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@redhat.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Link: https://patch.msgid.link/20250627134453.51780-1-lukas.bulwahn@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-30net: usb: lan78xx: fix WARN in __netif_napi_del_locked on disconnectOleksij Rempel
Remove redundant netif_napi_del() call from disconnect path. A WARN may be triggered in __netif_napi_del_locked() during USB device disconnect: WARNING: CPU: 0 PID: 11 at net/core/dev.c:7417 __netif_napi_del_locked+0x2b4/0x350 This happens because netif_napi_del() is called in the disconnect path while NAPI is still enabled. However, it is not necessary to call netif_napi_del() explicitly, since unregister_netdev() will handle NAPI teardown automatically and safely. Removing the redundant call avoids triggering the warning. Full trace: lan78xx 1-1:1.0 enu1: Failed to read register index 0x000000c4. ret = -ENODEV lan78xx 1-1:1.0 enu1: Failed to set MAC down with error -ENODEV lan78xx 1-1:1.0 enu1: Link is Down lan78xx 1-1:1.0 enu1: Failed to read register index 0x00000120. ret = -ENODEV ------------[ cut here ]------------ WARNING: CPU: 0 PID: 11 at net/core/dev.c:7417 __netif_napi_del_locked+0x2b4/0x350 Modules linked in: flexcan can_dev fuse CPU: 0 UID: 0 PID: 11 Comm: kworker/0:1 Not tainted 6.16.0-rc2-00624-ge926949dab03 #9 PREEMPT Hardware name: SKOV IMX8MP CPU revC - bd500 (DT) Workqueue: usb_hub_wq hub_event pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : __netif_napi_del_locked+0x2b4/0x350 lr : __netif_napi_del_locked+0x7c/0x350 sp : ffffffc085b673c0 x29: ffffffc085b673c0 x28: ffffff800b7f2000 x27: ffffff800b7f20d8 x26: ffffff80110bcf58 x25: ffffff80110bd978 x24: 1ffffff0022179eb x23: ffffff80110bc000 x22: ffffff800b7f5000 x21: ffffff80110bc000 x20: ffffff80110bcf38 x19: ffffff80110bcf28 x18: dfffffc000000000 x17: ffffffc081578940 x16: ffffffc08284cee0 x15: 0000000000000028 x14: 0000000000000006 x13: 0000000000040000 x12: ffffffb0022179e8 x11: 1ffffff0022179e7 x10: ffffffb0022179e7 x9 : dfffffc000000000 x8 : 0000004ffdde8619 x7 : ffffff80110bcf3f x6 : 0000000000000001 x5 : ffffff80110bcf38 x4 : ffffff80110bcf38 x3 : 0000000000000000 x2 : 0000000000000000 x1 : 1ffffff0022179e7 x0 : 0000000000000000 Call trace: __netif_napi_del_locked+0x2b4/0x350 (P) lan78xx_disconnect+0xf4/0x360 usb_unbind_interface+0x158/0x718 device_remove+0x100/0x150 device_release_driver_internal+0x308/0x478 device_release_driver+0x1c/0x30 bus_remove_device+0x1a8/0x368 device_del+0x2e0/0x7b0 usb_disable_device+0x244/0x540 usb_disconnect+0x220/0x758 hub_event+0x105c/0x35e0 process_one_work+0x760/0x17b0 worker_thread+0x768/0xce8 kthread+0x3bc/0x690 ret_from_fork+0x10/0x20 irq event stamp: 211604 hardirqs last enabled at (211603): [<ffffffc0828cc9ec>] _raw_spin_unlock_irqrestore+0x84/0x98 hardirqs last disabled at (211604): [<ffffffc0828a9a84>] el1_dbg+0x24/0x80 softirqs last enabled at (211296): [<ffffffc080095f10>] handle_softirqs+0x820/0xbc8 softirqs last disabled at (210993): [<ffffffc080010288>] __do_softirq+0x18/0x20 ---[ end trace 0000000000000000 ]--- lan78xx 1-1:1.0 enu1: failed to kill vid 0081/0 Fixes: ec4c7e12396b ("lan78xx: Introduce NAPI polling support") Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://patch.msgid.link/20250627051346.276029-1-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-30net: txgbe: fix the issue of TX failureJiawen Wu
There is a occasional problem that ping is failed between AML devices. That is because the manual enablement of the security Tx path on the hardware is missing, no matter what its previous state was. Fixes: 6f8b4c01a8cd ("net: txgbe: Implement PHYLINK for AML 25G/10G devices") Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/5BDFB14C57D1C42A+20250626085153.86122-1-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-30Merge tag 'io_uring-6.16-20250630' of git://git.kernel.dk/linuxLinus Torvalds
Pull io_uring fix from Jens Axboe: "Now that anonymous inodes set S_IFREG, this breaks the io_uring read/write retries for short reads/writes. As things like timerfd and eventfd are anon inodes, applications that previously did: unsigned long event_data[2]; io_uring_prep_read(sqe, evfd, event_data, sizeof(event_data), 0); and just got a short read when 1 event was posted, will now wait for the full amount before posting a completion. This caused issues for the ghostty application, making it basically unusable due to excessive buffering" * tag 'io_uring-6.16-20250630' of git://git.kernel.dk/linux: io_uring: gate REQ_F_ISREG on !S_ANON_INODE as well
2025-07-01x86/sev: Use TSC_FACTOR for Secure TSC frequency calculationNikunj A Dadhania
When using Secure TSC, the GUEST_TSC_FREQ MSR reports a frequency based on the nominal P0 frequency, which deviates slightly (typically ~0.2%) from the actual mean TSC frequency due to clocking parameters. Over extended VM uptime, this discrepancy accumulates, causing clock skew between the hypervisor and a SEV-SNP VM, leading to early timer interrupts as perceived by the guest. The guest kernel relies on the reported nominal frequency for TSC-based timekeeping, while the actual frequency set during SNP_LAUNCH_START may differ. This mismatch results in inaccurate time calculations, causing the guest to perceive hrtimers as firing earlier than expected. Utilize the TSC_FACTOR from the SEV firmware's secrets page (see "Secrets Page Format" in the SNP Firmware ABI Specification) to calculate the mean TSC frequency, ensuring accurate timekeeping and mitigating clock skew in SEV-SNP VMs. Use early_ioremap_encrypted() to map the secrets page as ioremap_encrypted() uses kmalloc() which is not available during early TSC initialization and causes a panic. [ bp: Drop the silly dummy var: https://lore.kernel.org/r/20250630192726.GBaGLlHl84xIopx4Pt@fat_crate.local ] Fixes: 73bbf3b0fbba ("x86/tsc: Init the TSC for Secure TSC guests") Signed-off-by: Nikunj A Dadhania <nikunj@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/20250630081858.485187-1-nikunj@amd.com
2025-06-30bcachefs: Fix incorrect transaction restart handlingAlan Huang
Reported-by: syzbot+cc7567f096079cb4146f@syzkaller.appspotmail.com Signed-off-by: Alan Huang <mmpgouride@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-30cifs: all initializations for tcon should happen in tcon_info_allocShyam Prasad N
Today, a few work structs inside tcon are initialized inside cifs_get_tcon and not in tcon_info_alloc. As a result, if a tcon is obtained from tcon_info_alloc, but not called as a part of cifs_get_tcon, we may trip over. Cc: <stable@vger.kernel.org> Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-06-30powercap: intel_rapl: Do not change CLAMPING bit if ENABLE bit cannot be changedZhang Rui
PL1 cannot be disabled on some platforms. The ENABLE bit is still set after software clears it. This behavior leads to a scenario where, upon user request to disable the Power Limit through the powercap sysfs, the ENABLE bit remains set while the CLAMPING bit is inadvertently cleared. According to the Intel Software Developer's Manual, the CLAMPING bit, "When set, allows the processor to go below the OS requested P states in order to maintain the power below specified Platform Power Limit value." Thus this means the system may operate at higher power levels than intended on such platforms. Enhance the code to check ENABLE bit after writing to it, and stop further processing if ENABLE bit cannot be changed. Reported-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Fixes: 2d281d8196e3 ("PowerCap: Introduce Intel RAPL power capping driver") Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Link: https://patch.msgid.link/20250619071340.384782-1-rui.zhang@intel.com [ rjw: Use str_enabled_disabled() instead of open-coded equivalent ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-06-30smb: client: fix warning when reconnecting channelPaulo Alcantara
When reconnecting a channel in smb2_reconnect_server(), a dummy tcon is passed down to smb2_reconnect() with ->query_interface uninitialized, so we can't call queue_delayed_work() on it. Fix the following warning by ensuring that we're queueing the delayed worker from correct tcon. WARNING: CPU: 4 PID: 1126 at kernel/workqueue.c:2498 __queue_delayed_work+0x1d2/0x200 Modules linked in: cifs cifs_arc4 nls_ucs2_utils cifs_md4 [last unloaded: cifs] CPU: 4 UID: 0 PID: 1126 Comm: kworker/4:0 Not tainted 6.16.0-rc3 #5 PREEMPT(voluntary) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-4.fc42 04/01/2014 Workqueue: cifsiod smb2_reconnect_server [cifs] RIP: 0010:__queue_delayed_work+0x1d2/0x200 Code: 41 5e 41 5f e9 7f ee ff ff 90 0f 0b 90 e9 5d ff ff ff bf 02 00 00 00 e8 6c f3 07 00 89 c3 eb bd 90 0f 0b 90 e9 57 f> 0b 90 e9 65 fe ff ff 90 0f 0b 90 e9 72 fe ff ff 90 0f 0b 90 e9 RSP: 0018:ffffc900014afad8 EFLAGS: 00010003 RAX: 0000000000000000 RBX: ffff888124d99988 RCX: ffffffff81399cc1 RDX: dffffc0000000000 RSI: ffff888114326e00 RDI: ffff888124d999f0 RBP: 000000000000ea60 R08: 0000000000000001 R09: ffffed10249b3331 R10: ffff888124d9998f R11: 0000000000000004 R12: 0000000000000040 R13: ffff888114326e00 R14: ffff888124d999d8 R15: ffff888114939020 FS: 0000000000000000(0000) GS:ffff88829f7fe000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffe7a2b4038 CR3: 0000000120a6f000 CR4: 0000000000750ef0 PKRU: 55555554 Call Trace: <TASK> queue_delayed_work_on+0xb4/0xc0 smb2_reconnect+0xb22/0xf50 [cifs] smb2_reconnect_server+0x413/0xd40 [cifs] ? __pfx_smb2_reconnect_server+0x10/0x10 [cifs] ? local_clock_noinstr+0xd/0xd0 ? local_clock+0x15/0x30 ? lock_release+0x29b/0x390 process_one_work+0x4c5/0xa10 ? __pfx_process_one_work+0x10/0x10 ? __list_add_valid_or_report+0x37/0x120 worker_thread+0x2f1/0x5a0 ? __kthread_parkme+0xde/0x100 ? __pfx_worker_thread+0x10/0x10 kthread+0x1fe/0x380 ? kthread+0x10f/0x380 ? __pfx_kthread+0x10/0x10 ? local_clock_noinstr+0xd/0xd0 ? ret_from_fork+0x1b/0x1f0 ? local_clock+0x15/0x30 ? lock_release+0x29b/0x390 ? rcu_is_watching+0x20/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x15b/0x1f0 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> irq event stamp: 1116206 hardirqs last enabled at (1116205): [<ffffffff8143af42>] __up_console_sem+0x52/0x60 hardirqs last disabled at (1116206): [<ffffffff81399f0e>] queue_delayed_work_on+0x6e/0xc0 softirqs last enabled at (1116138): [<ffffffffc04562fd>] __smb_send_rqst+0x42d/0x950 [cifs] softirqs last disabled at (1116136): [<ffffffff823d35e1>] release_sock+0x21/0xf0 Cc: linux-cifs@vger.kernel.org Reported-by: David Howells <dhowells@redhat.com> Fixes: 42ca547b13a2 ("cifs: do not disable interface polling on failure") Reviewed-by: David Howells <dhowells@redhat.com> Tested-by: David Howells <dhowells@redhat.com> Reviewed-by: Shyam Prasad N <nspmangalore@gmail.com> Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Steve French <stfrench@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-06-30drm/amd/display: Don't allow OLED to go down to fully offMario Limonciello
[Why] OLED panels can be fully off, but this behavior is unexpected. [How] Ensure that minimum luminance is at least 1. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4338 Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 51496c7737d06a74b599d0aa7974c3d5a4b1162e)
2025-06-30drm/amd/display: Added case for when RR equals panel's max RR using freesyncHarold Sun
[WHY] Rounding error sometimes occurs when the refresh rate is equal to a panel's max refresh rate, causing HDMI compliance failures. [HOW] Added a case so that we round up to avoid v_total_min to be below a panel's minimum bound. Reviewed-by: Jun Lei <jun.lei@amd.com> Signed-off-by: Harold Sun <Harold.Sun@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit fe7645d22bc0f7c1558296538ec49987bf268ef6)
2025-06-30drm/amdkfd: add hqd_sdma_get_doorbell callbacks for gfx7/8Alex Deucher
These were missed when support was added for other generations. The callbacks are called unconditionally so we need to make sure all generations have them. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4304 Link: https://github.com/ROCm/ROCm/issues/4965 Fixes: bac38ca8c475 ("drm/amdkfd: implement per queue sdma reset for gfx 9.4+") Cc: Jonathan Kim <jonathan.kim@amd.com> Reported-by: Johl Brown <johlbrown@gmail.com> Reviewed-by: Jonathan Kim <jonathan.kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 1e9d17a5dcf1242e9518e461d8e63ad35240e49e) Cc: stable@vger.kernel.org
2025-06-30drm/amdgpu: Fix memory leak in amdgpu_ctx_mgr_entity_finiLin.Cao
patch dd64956685fa ("drm/amdgpu: Remove duplicated "context still alive" check") removed ctx put, which will cause amdgpu_ctx_fini() cannot be called and then cause some finished fence that added by amdgpu_ctx_add_fence() cannot be released and cause memleak. Fixes: dd64956685fa ("drm/amdgpu: Remove duplicated "context still alive" check") Signed-off-by: Lin.Cao <lincao12@amd.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 8cf66089e28108dedd47e6156a48489303cf525c) Cc: stable@vger.kernel.org
2025-06-30drm/amdkfd: Don't call mmput from MMU notifier callbackPhilip Yang
If the process is exiting, the mmput inside mmu notifier callback from compactd or fork or numa balancing could release the last reference of mm struct to call exit_mmap and free_pgtable, this triggers deadlock with below backtrace. The deadlock will leak kfd process as mmu notifier release is not called and cause VRAM leaking. The fix is to take mm reference mmget_non_zero when adding prange to the deferred list to pair with mmput in deferred list work. If prange split and add into pchild list, the pchild work_item.mm is not used, so remove the mm parameter from svm_range_unmap_split and svm_range_add_child. The backtrace of hung task: INFO: task python:348105 blocked for more than 64512 seconds. Call Trace: __schedule+0x1c3/0x550 schedule+0x46/0xb0 rwsem_down_write_slowpath+0x24b/0x4c0 unlink_anon_vmas+0xb1/0x1c0 free_pgtables+0xa9/0x130 exit_mmap+0xbc/0x1a0 mmput+0x5a/0x140 svm_range_cpu_invalidate_pagetables+0x2b/0x40 [amdgpu] mn_itree_invalidate+0x72/0xc0 __mmu_notifier_invalidate_range_start+0x48/0x60 try_to_unmap_one+0x10fa/0x1400 rmap_walk_anon+0x196/0x460 try_to_unmap+0xbb/0x210 migrate_page_unmap+0x54d/0x7e0 migrate_pages_batch+0x1c3/0xae0 migrate_pages_sync+0x98/0x240 migrate_pages+0x25c/0x520 compact_zone+0x29d/0x590 compact_zone_order+0xb6/0xf0 try_to_compact_pages+0xbe/0x220 __alloc_pages_direct_compact+0x96/0x1a0 __alloc_pages_slowpath+0x410/0x930 __alloc_pages_nodemask+0x3a9/0x3e0 do_huge_pmd_anonymous_page+0xd7/0x3e0 __handle_mm_fault+0x5e3/0x5f0 handle_mm_fault+0xf7/0x2e0 hmm_vma_fault.isra.0+0x4d/0xa0 walk_pmd_range.isra.0+0xa8/0x310 walk_pud_range+0x167/0x240 walk_pgd_range+0x55/0x100 __walk_page_range+0x87/0x90 walk_page_range+0xf6/0x160 hmm_range_fault+0x4f/0x90 amdgpu_hmm_range_get_pages+0x123/0x230 [amdgpu] amdgpu_ttm_tt_get_user_pages+0xb1/0x150 [amdgpu] init_user_pages+0xb1/0x2a0 [amdgpu] amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x543/0x7d0 [amdgpu] kfd_ioctl_alloc_memory_of_gpu+0x24c/0x4e0 [amdgpu] kfd_ioctl+0x29d/0x500 [amdgpu] Fixes: fa582c6f3684 ("drm/amdkfd: Use mmget_not_zero in MMU notifier") Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit a29e067bd38946f752b0ef855f3dfff87e77bec7) Cc: stable@vger.kernel.org
2025-06-30i2c/designware: Fix an initialization issueMichael J. Ruhl
The i2c_dw_xfer_init() function requires msgs and msg_write_idx from the dev context to be initialized. amd_i2c_dw_xfer_quirk() inits msgs and msgs_num, but not msg_write_idx. This could allow an out of bounds access (of msgs). Initialize msg_write_idx before calling i2c_dw_xfer_init(). Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Fixes: 17631e8ca2d3 ("i2c: designware: Add driver support for AMD NAVI GPU") Cc: <stable@vger.kernel.org> # v5.13+ Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org> Link: https://lore.kernel.org/r/20250627143511.489570-1-michael.j.ruhl@intel.com
2025-06-30drm/amdgpu: Include sdma_4_4_4.binKent Russell
This got missed during SDMA 4.4.4 support. Fixes: 968e3811c3e8 ("drm/amdgpu: add initial support for sdma444") Signed-off-by: Kent Russell <kent.russell@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 51526efe02714339ed6139f7bc348377d363200a) Cc: stable@vger.kernel.org
2025-06-30amdkfd: MTYPE_UC for ext-coherent system memoryDavid Yat Sin
Set memory mtype to UC host memory when ext-coherent flag is set and memory is registered as a SVM allocation. Reviewed-by: Amber Lin <Amber.Lin@amd.com> Signed-off-by: David Yat Sin <David.YatSin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 5d14fdab4778c29cfd39e62c3ce84d232b4a7d8c)
2025-06-30drm/amdgpu/sdma5.x: suspend KFD queues in ring resetAlex Deucher
SDMA 5.x only supports engine soft reset which resets all queues on the engine. As such, we need to suspend KFD queues around resets like we do for SDMA 4.x. Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 61feed0baa1a0d094af0e07e968b1e6e875f07d0)
2025-06-30Revert "PCI: ecam: Allow cfg->priv to be pre-populated from the root port ↵Marc Zyngier
device" This reverts commit 4900454b4f819e88e9c57ed93542bf9325d7e161. Now that nobody relies of cfg->priv containing anything useful before the .init() callback is used, restore the previous behaviour. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://patch.msgid.link/20250625111806.4153773-4-maz@kernel.org
2025-06-30PCI: host-generic: Set driver_data before calling gen_pci_init()Geert Uytterhoeven
On MicroChip MPFS Icicle: microchip-pcie 2000000000.pcie: host bridge /soc/pcie@2000000000 ranges: microchip-pcie 2000000000.pcie: Parsing ranges property... microchip-pcie 2000000000.pcie: MEM 0x2008000000..0x2087ffffff -> 0x0008000000 Unable to handle kernel NULL pointer dereference at virtual address 0000000000000368 Current swapper/0 pgtable: 4K pagesize, 39-bit VAs, pgdp=0x00000000814f1000 [0000000000000368] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000 Oops [#1] Modules linked in: CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.15.0-rc1-icicle-00003-gafc0a570bb61 #232 NONE Hardware name: Microchip PolarFire-SoC Icicle Kit (DT) [...] [<ffffffff803fb8a4>] plda_pcie_setup_iomems+0xe/0x78 [<ffffffff803fc246>] mc_platform_init+0x80/0x1d2 [<ffffffff803f9c88>] pci_ecam_create+0x104/0x1e2 [<ffffffff8000adbe>] pci_host_common_init+0x120/0x228 [<ffffffff8000af42>] pci_host_common_probe+0x7c/0x8a The initialization of driver_data was moved after the call to gen_pci_init(), while the pci_ecam_ops.init() callback mc_platform_init() expects it has already been initialized. Fix this by moving the initialization of driver_data up. Fixes: afc0a570bb613871 ("PCI: host-generic: Extract an ECAM bridge creation helper from pci_host_common_probe()") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/774290708a6f0f683711914fda110742c18a7fb2.1750787223.git.geert+renesas@glider.be Link: https://patch.msgid.link/20250625111806.4153773-2-maz@kernel.org
2025-06-30PCI: apple: Add tracking of probed root portsMarc Zyngier
The apple driver relies on being able to directly find the matching root port structure from the platform device that represents this port. A previous hack stashed a pointer to the root port structure in the config window private pointer, but that ended up relying on assumptions that break other drivers. Instead, bite the bullet and track the association as part of the driver itself as a list of probed root ports. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://patch.msgid.link/20250625111806.4153773-3-maz@kernel.org
2025-06-30Input: alps - use scnprintf() to suppress truncation warningAndy Shevchenko
The commit 76c968e75715 ("Input: alps - switch to use scnprintf() to suppress truncation warning") converted one place in the driver while the other left untouched. Convert the other place as well. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20250630082245.1416796-1-andriy.shevchenko@linux.intel.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-06-30MAINTAINERS: Add myself as mlx5 core and mlx5e co-maintainerMark Bloch
I have been working on mlx5 related code for several years, contributing features, code reviews, and occasional maintainer tasks when needed. This patch makes my maintainer role official. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://patch.msgid.link/20250627014252.1262592-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-30drm/bridge: aux-hpd-bridge: fix assignment of the of_nodeDmitry Baryshkov
Perform fix similar to the one in the commit 85e444a68126 ("drm/bridge: Fix assignment of the of_node of the parent to aux bridge"). The assignment of the of_node to the aux HPD bridge needs to mark the of_node as reused, otherwise driver core will attempt to bind resources like pinctrl, which is going to fail as corresponding pins are already marked as used by the parent device. Fix that by using the device_set_of_node_from_dev() helper instead of assigning it directly. Fixes: e560518a6c2e ("drm/bridge: implement generic DP HPD bridge") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20250608-fix-aud-hpd-bridge-v1-1-4641a6f8e381@oss.qualcomm.com
2025-06-30drm/bridge: panel: move prepare_prev_first handling to ↵Dmitry Baryshkov
drm_panel_bridge_add_typed The commit 5ea6b1702781 ("drm/panel: Add prepare_prev_first flag to drm_panel") and commit 0974687a19c3 ("drm/bridge: panel: Set pre_enable_prev_first from drmm_panel_bridge_add") added handling of panel's prepare_prev_first to devm_panel_bridge_add() and drmm_panel_bridge_add(). However if the driver calls drm_panel_bridge_add_typed() directly, then the flag won't be handled and thus the drm_bridge.pre_enable_prev_first will not be set. Move prepare_prev_first handling to the drm_panel_bridge_add_typed() so that there is no way to miss the flag. Fixes: 5ea6b1702781 ("drm/panel: Add prepare_prev_first flag to drm_panel") Fixes: 0974687a19c3 ("drm/bridge: panel: Set pre_enable_prev_first from drmm_panel_bridge_add") Reported-by: Svyatoslav Ryhel <clamor95@gmail.com> Closes: https://lore.kernel.org/dri-devel/CAPVz0n3YZass3Bns1m0XrFxtAC0DKbEPiW6vXimQx97G243sXw@mail.gmail.com/ Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20250220-panel_prev_first-v1-1-b9e787825a1a@linaro.org
2025-06-30Merge tag 'rtc-6.16-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux Pull RTC fixes from Alexandre Belloni: "Some fixes for 6.16. The cmos one is important for PREEMPT_RT. I've also added the s5m changes as they had a dependency on the MFD pull request that was included in 6.16-rc1 and we didn't synchronize before the merge window and they won't hurt. - cmos: use spin_lock_irqsave in cmos_interrupt - pcf2127: fix SPI command byte for PCF2131 - s5m: add S2MPG10 support" * tag 'rtc-6.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: rtc: pcf2127: add missing semicolon after statement rtc: pcf2127: fix SPI command byte for PCF2131 rtc: cmos: use spin_lock_irqsave in cmos_interrupt rtc: s5m: replace open-coded read/modify/write registers with regmap helpers rtc: s5m: replace regmap_update_bits with regmap_clear/set_bits rtc: s5m: switch to devm_device_init_wakeup rtc: s5m: fix a typo: peding -> pending rtc: s5m: add support for S2MPG10 RTC rtc: s5m: prepare for external regmap rtc: s5m: cache device type during probe
2025-06-30regulator: mp886x: Fix ID table driver_dataJisheng Zhang
Currently, the driver_data of the i2c ID table is wrong, so it won't work if any mp886x user makes use of the ID table. Fortunately, there's no such user in upstream source code, we can fix the issue by using different ID table entry for mp8867 and mp8869. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Link: https://patch.msgid.link/20250629095918.912-1-jszhang@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>
2025-06-30regulator: sy8824x: Fix ID table driver_dataJisheng Zhang
Currently, the driver_data of the i2c ID table is wrong, so it won't work if any sy8824x user makes use of the ID table. Fortunately, there's no such user in upstream source code, we can fix the issue by using different ID table entry for sy8824c, sy8824e, sy20276 and sy20278. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Link: https://patch.msgid.link/20250629095905.898-1-jszhang@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>
2025-06-30irqchip/irq-msi-lib: Select CONFIG_GENERIC_MSI_IRQNam Cao
irq-msi-lib directly uses struct msi_domain_info and more things which are only available when CONFIG_GENERIC_MSI_IRQ=y. However, there is no dependency specified and CONFIG_IRQ_MSI_LIB can be enabled without CONFIG_GENERIC_MSI_IRQ, which causes the kernel build fail. Make IRQ_MSI_LIB select GENEREIC_MSI_IRQ to prevent that. Fixes: 72e257c6f058 ("irqchip: Provide irq-msi-lib") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/b0c44007f3b7e062228349a2395f8d850050db33.1751277765.git.namcao@linutronix.de Closes: https://lore.kernel.org/oe-kbuild-all/202506282256.cHlEHrdc-lkp@intel.com/
2025-06-30usb: hub: Fix flushing of delayed work used for post resume purposesMathias Nyman
Delayed work that prevents USB3 hubs from runtime-suspending too early needed to be flushed in hub_quiesce() to resolve issues detected on QC SC8280XP CRD board during suspend resume testing. This flushing did however trigger new issues on Raspberry Pi 3B+, which doesn't have USB3 ports, and doesn't queue any post resume delayed work. The flushed 'hub->init_work' item is used for several purposes, and is originally initialized with a 'NULL' work function. The work function is also changed on the fly, which may contribute to the issue. Solve this by creating a dedicated delayed work item for post resume work, and flush that delayed work in hub_quiesce() Cc: stable <stable@kernel.org> Fixes: a49e1e2e785f ("usb: hub: Fix flushing and scheduling of delayed work that tunes runtime pm") Reported-by: Mark Brown <broonie@kernel.org> Closes: https://lore.kernel.org/linux-usb/aF5rNp1l0LWITnEB@finisterre.sirena.org.uk Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Tested-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> # SC8280XP CRD Tested-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20250627164348.3982628-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-30wifi: zd1211rw: Fix potential NULL pointer dereference in zd_mac_tx_to_dev()Daniil Dulov
There is a potential NULL pointer dereference in zd_mac_tx_to_dev(). For example, the following is possible: T0 T1 zd_mac_tx_to_dev() /* len == skb_queue_len(q) */ while (len > ZD_MAC_MAX_ACK_WAITERS) { filter_ack() spin_lock_irqsave(&q->lock, flags); /* position == skb_queue_len(q) */ for (i=1; i<position; i++) skb = __skb_dequeue(q) if (mac->type == NL80211_IFTYPE_AP) skb = __skb_dequeue(q); spin_unlock_irqrestore(&q->lock, flags); skb_dequeue() -> NULL Since there is a small gap between checking skb queue length and skb being unconditionally dequeued in zd_mac_tx_to_dev(), skb_dequeue() can return NULL. Then the pointer is passed to zd_mac_tx_status() where it is dereferenced. In order to avoid potential NULL pointer dereference due to situations like above, check if skb is not NULL before passing it to zd_mac_tx_status(). Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 459c51ad6e1f ("zd1211rw: port to mac80211") Signed-off-by: Daniil Dulov <d.dulov@aladdin.ru> Link: https://patch.msgid.link/20250626114619.172631-1-d.dulov@aladdin.ru Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-06-30wifi: cfg80211: fix S1G beacon head validation in nl80211Lachlan Hodges
S1G beacons contain fixed length optional fields that precede the variable length elements, ensure we take this into account when validating the beacon. This particular case was missed in 1e1f706fc2ce ("wifi: cfg80211/mac80211: correctly parse S1G beacon optional elements"). Fixes: 1d47f1198d58 ("nl80211: correctly validate S1G beacon head") Signed-off-by: Lachlan Hodges <lachlan.hodges@morsemicro.com> Link: https://patch.msgid.link/20250626115118.68660-1-lachlan.hodges@morsemicro.com [shorten/reword subject] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-06-30xfs: add FALLOC_FL_ALLOCATE_RANGE to supported flags maskYouling Tang
Add FALLOC_FL_ALLOCATE_RANGE to the set of supported fallocate flags in XFS_FALLOC_FL_SUPPORTED. This change improves code clarity and maintains by explicitly showing this flag in the supported flags mask. Note that since FALLOC_FL_ALLOCATE_RANGE is defined as 0x00, this addition has no functional modifications. Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Youling Tang <tangyouling@kylinos.cn> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-06-30spi: spi-qpic-snand: avoid memory corruptionMark Brown
Merge series from Gabor Juhos <j4g8y7@gmail.com>: The 'spi-qpic-nand' driver may cause memory corruption under some circumstances. The first patch in the series changes the driver to avoid that, whereas the second adds some sanity checks to the common QPIC code in order to make detecting such errors easier in the future.
2025-06-30drm/ttm: fix error handling in ttm_buffer_object_transferChristian König
Unlocking the resv object was missing in the error path, additionally to that we should move over the resource only after the fence slot was reserved. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Fixes: c8d4c18bfbc4a ("dma-buf/drivers: make reserving a shared slot mandatory v4") Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20250616130726.22863-3-christian.koenig@amd.com
2025-06-30dma-buf: fix timeout handling in dma_resv_wait_timeout v2Christian König
Even the kerneldoc says that with a zero timeout the function should not wait for anything, but still return 1 to indicate that the fences are signaled now. Unfortunately that isn't what was implemented, instead of only returning 1 we also waited for at least one jiffies. Fix that by adjusting the handling to what the function is actually documented to do. v2: improve code readability Reported-by: Marek Olšák <marek.olsak@amd.com> Reported-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20250129105841.1806-1-christian.koenig@amd.com
2025-06-30platform/mellanox: mlxreg-lc: Fix logic error in power state checkAlok Tiwari
Fixes a logic issue in mlxreg_lc_completion_notify() where the intention was to check if MLXREG_LC_POWERED flag is not set before powering on the device. The original code used "state & ~MLXREG_LC_POWERED" to check for the absence of the POWERED bit. However this condition evaluates to true even when other bits are set, leading to potentially incorrect behavior. Corrected the logic to explicitly check for the absence of MLXREG_LC_POWERED using !(state & MLXREG_LC_POWERED). Fixes: 62f9529b8d5c ("platform/mellanox: mlxreg-lc: Add initial support for Nvidia line card devices") Suggested-by: Vadim Pasternak <vadimp@nvidia.com> Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Link: https://lore.kernel.org/r/20250630105812.601014-1-alok.a.tiwari@oracle.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-06-30i2c: Re-enable piix4 driver on non-x86Mario Limonciello
loongarch and mips both support hardware that uses the i2c-piix4 driver. It should not be restricted to x86, so drop the x86 dependency. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Closes: https://lore.kernel.org/all/CAMuHMdVPQLjOox5sMp34Z5MTwKv2WOpHa=MpZr8hWG22fQKcjw@mail.gmail.com/ Fixes: 7e173eb82ae97175 ("i2c: piix4: Make CONFIG_I2C_PIIX4 dependent on CONFIG_X86"). Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Hans de Goede <hansg@kernel.org> Acked-by: Andi Shyti <andi.shyti@kernel.org> Link: https://lore.kernel.org/r/20250610205817.3912944-2-superm1@kernel.org Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-06-30Move FCH header to a location accessible by all archsMario Limonciello
A new header fch.h was created to store registers used by different AMD drivers. This header was included by i2c-piix4 in commit 624b0d5696a8 ("i2c: piix4, x86/platform: Move the SB800 PIIX4 FCH definitions to <asm/amd/fch.h>"). To prevent compile failures on non-x86 archs i2c-piix4 was set to only compile on x86 by commit 7e173eb82ae9717 ("i2c: piix4: Make CONFIG_I2C_PIIX4 dependent on CONFIG_X86"). This was not a good decision because loongarch and mips both actually support i2c-piix4 and set it enabled in the defconfig. Move the header to a location accessible by all architectures. Fixes: 624b0d5696a89 ("i2c: piix4, x86/platform: Move the SB800 PIIX4 FCH definitions to <asm/amd/fch.h>") Suggested-by: Hans de Goede <hansg@kernel.org> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Hans de Goede <hansg@kernel.org> Link: https://lore.kernel.org/r/20250610205817.3912944-1-superm1@kernel.org Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-06-30platform/x86/intel/hid: Add Wildcat Lake supportSrinivas Pandruvada
Add ACPI ID for Wildcat Lake. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://lore.kernel.org/r/20250617232913.3314765-1-srinivas.pandruvada@linux.intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-06-30platform/x86: dell-wmi-sysman: Fix class device unregistrationKurt Borja
Devices under the firmware_attributes_class do not have unique a dev_t. Therefore, device_unregister() should be used instead of device_destroy(), since the latter may match any device with a given dev_t. Fixes: e8a60aa7404b ("platform/x86: Introduce support for Systems Management Driver over WMI for Dell Systems") Signed-off-by: Kurt Borja <kuurtb@gmail.com> Link: https://lore.kernel.org/r/20250625-dest-fix-v1-3-3a0f342312bb@gmail.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-06-30platform/x86: think-lmi: Fix class device unregistrationKurt Borja
Devices under the firmware_attributes_class do not have unique a dev_t. Therefore, device_unregister() should be used instead of device_destroy(), since the latter may match any device with a given dev_t. Fixes: a40cd7ef22fb ("platform/x86: think-lmi: Add WMI interface support on Lenovo platforms") Signed-off-by: Kurt Borja <kuurtb@gmail.com> Link: https://lore.kernel.org/r/20250625-dest-fix-v1-2-3a0f342312bb@gmail.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-06-30platform/x86: hp-bioscfg: Fix class device unregistrationKurt Borja
Devices under the firmware_attributes_class do not have unique a dev_t. Therefore, device_unregister() should be used instead of device_destroy(), since the latter may match any device with a given dev_t. Fixes: a34fc329b189 ("platform/x86: hp-bioscfg: bioscfg") Signed-off-by: Kurt Borja <kuurtb@gmail.com> Link: https://lore.kernel.org/r/20250625-dest-fix-v1-1-3a0f342312bb@gmail.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>