linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2025-08-11	iio: proximity: isl29501: fix buffered read on big-endian systems	David Lechner
	Fix passing a u32 value as a u16 buffer scan item. This works on little- endian systems, but not on big-endian systems. A new local variable is introduced for getting the register value and the array is changed to a struct to make the data layout more explicit rather than just changing the type and having to recalculate the proper length needed for the timestamp. Fixes: 1c28799257bc ("iio: light: isl29501: Add support for the ISL29501 ToF sensor.") Signed-off-by: David Lechner <dlechner@baylibre.com> Link: https://patch.msgid.link/20250722-iio-use-more-iio_declare_buffer_with_ts-7-v2-1-d3ebeb001ed3@baylibre.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-08-11	iio: accel: sca3300: fix uninitialized iio scan data	David Lechner
	Fix potential leak of uninitialized stack data to userspace by ensuring that the `channels` array is zeroed before use. Fixes: edeb67fbbf4b ("iio: accel: sca3300: use IIO_DECLARE_BUFFER_WITH_TS") Signed-off-by: David Lechner <dlechner@baylibre.com> Link: https://patch.msgid.link/20250723-iio-accel-sca3300-fix-uninitialized-iio-scan-data-v1-1-12dbfb3307b7@baylibre.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-08-11	x86/fpu: Fix NULL dereference in avx512_status()	Fushuai Wang
	Problem ------- With CONFIG_X86_DEBUG_FPU enabled, reading /proc/[kthread]/arch_status causes a warning and a NULL pointer dereference. This is because the AVX-512 timestamp code uses x86_task_fpu() but doesn't check it for NULL. CONFIG_X86_DEBUG_FPU addles that function for kernel threads (PF_KTHREAD specifically), making it return NULL. The point of the warning was to ensure that kernel threads only access task->fpu after going through kernel_fpu_begin()/_end(). Note: all kernel tasks exposed in /proc have a valid task->fpu. Solution -------- One option is to silence the warning and check for NULL from x86_task_fpu(). However, that warning is fairly fresh and seems like a defense against misuse of the FPU state in kernel threads. Instead, stop outputting AVX-512_elapsed_ms for kernel threads altogether. The data was garbage anyway because avx512_timestamp is only updated for user threads, not kernel threads. If anyone ever wants to track kernel thread AVX-512 use, they can come back later and do it properly, separate from this bug fix. [ dhansen: mostly rewrite changelog ] Fixes: 22aafe3bcb67 ("x86/fpu: Remove init_task FPU state dependencies, add debugging warning for PF_KTHREAD tasks") Co-developed-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Fushuai Wang <wangfushuai@baidu.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20250811185044.2227268-1-sohil.mehta%40intel.com
2025-08-11	cpufreq: intel_pstate: Support Clearwater Forest OOB mode	Srinivas Pandruvada
	Prevent intel_pstate from loading when OOB (Out Of Band) P-states mode is enabled. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://patch.msgid.link/20250808145122.4057208-1-srinivas.pandruvada@linux.intel.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-08-11	cpuidle: governors: menu: Avoid using invalid recent intervals data	Rafael J. Wysocki
	Marc has reported that commit 85975daeaa4d ("cpuidle: menu: Avoid discarding useful information") caused the number of wakeup interrupts to increase on an idle system [1], which was not expected to happen after merely allowing shallower idle states to be selected by the governor in some cases. However, on the system in question, all of the idle states deeper than WFI are rejected by the driver due to a firmware issue [2]. This causes the governor to only consider the recent interval duriation data corresponding to attempts to enter WFI that are successful and the recent invervals table is filled with values lower than the scheduler tick period. Consequently, the governor predicts an idle duration below the scheduler tick period length and avoids stopping the tick more often which leads to the observed symptom. Address it by modifying the governor to update the recent intervals table also when entering the previously selected idle state fails, so it knows that the short idle intervals might have been the minority had the selected idle states been actually entered every time. Fixes: 85975daeaa4d ("cpuidle: menu: Avoid discarding useful information") Link: https://lore.kernel.org/linux-pm/86o6sv6n94.wl-maz@kernel.org/ [1] Link: https://lore.kernel.org/linux-pm/7ffcb716-9a1b-48c2-aaa4-469d0df7c792@arm.com/ [2] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Christian Loehle <christian.loehle@arm.com> Tested-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Christian Loehle <christian.loehle@arm.com> Link: https://patch.msgid.link/2793874.mvXUDI8C0e@rafael.j.wysocki
2025-08-11	intel_idle: Allow loading ACPI tables for any family	Len Brown
	There is no reason to limit intel_idle's loading of ACPI tables to family 6. Upcoming Intel processors are not in family 6. Below "Fixes" really means "applies cleanly until". That syntax commit didn't change the previous logic, but shows this patch applies back 5-years. Fixes: 4a9f45a0533f ("intel_idle: Convert to new X86 CPU match macros") Signed-off-by: Len Brown <len.brown@intel.com> Link: https://patch.msgid.link/06101aa4fe784e5b0be1cb2c0bdd9afcf16bd9d4.1754681697.git.len.brown@intel.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-08-11	selftests/sched_ext: Remove duplicate sched.h header	Jiapeng Chong
	./tools/testing/selftests/sched_ext/hotplug.c: sched.h is included more than once. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=22941 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Acked-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-08-11	sched/ext: Fix invalid task state transitions on class switch	Andrea Righi
	When enabling a sched_ext scheduler, we may trigger invalid task state transitions, resulting in warnings like the following (which can be easily reproduced by running the hotplug selftest in a loop): sched_ext: Invalid task state transition 0 -> 3 for fish[770] WARNING: CPU: 18 PID: 787 at kernel/sched/ext.c:3862 scx_set_task_state+0x7c/0xc0 ... RIP: 0010:scx_set_task_state+0x7c/0xc0 ... Call Trace: <TASK> scx_enable_task+0x11f/0x2e0 switching_to_scx+0x24/0x110 scx_enable.isra.0+0xd14/0x13d0 bpf_struct_ops_link_create+0x136/0x1a0 __sys_bpf+0x1edd/0x2c30 __x64_sys_bpf+0x21/0x30 do_syscall_64+0xbb/0x370 entry_SYSCALL_64_after_hwframe+0x77/0x7f This happens because we skip initialization for tasks that are already dead (with their usage counter set to zero), but we don't exclude them during the scheduling class transition phase. Fix this by also skipping dead tasks during class swiching, preventing invalid task state transitions. Fixes: a8532fac7b5d2 ("sched_ext: TASK_DEAD tasks must be switched into SCX on ops_enable") Cc: stable@vger.kernel.org # v6.12+ Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-08-11	blk-wbt: doc: Update the doc of the wbt_lat_usec interface	Tang Yizhou
	The symbol wb_window_usec cannot be found. Update the doc to reflect the latest implementation, in other words, the debugfs interface 'curr_win_nsec'. Signed-off-by: Tang Yizhou <yizhou.tang@shopee.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20250727173959.160835-4-yizhou.tang@shopee.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-11	blk-wbt: Eliminate ambiguity in the comments of struct rq_wb	Tang Yizhou
	In the current implementation, the last_issue and last_comp members of struct rq_wb are used only by read requests and not by non-throttled write requests. Therefore, eliminate the ambiguity here. Signed-off-by: Tang Yizhou <yizhou.tang@shopee.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20250727173959.160835-3-yizhou.tang@shopee.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-11	blk-wbt: Optimize wbt_done() for non-throttled writes	Tang Yizhou
	In the current implementation, the sync_cookie and last_cookie members of struct rq_wb are used only by read requests and not by non-throttled write requests. Based on this, we can optimize wbt_done() by removing one if condition check for non-throttled write requests. Signed-off-by: Tang Yizhou <yizhou.tang@shopee.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20250727173959.160835-2-yizhou.tang@shopee.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-11	regulator: dt-bindings: infineon,ir38060: Add Guenter as maintainer from IBM	Krzysztof Kozlowski
	The infineon,ir38060 binding never got maintainer and fake "Not Me" entry have been causing dt_binding_check warnings for 1.5 years now: regulator/infineon,ir38060.yaml: maintainers:0: 'Not Me.' does not match '@' Guenter agreed to keep an eye for this hardware and binding. Cc: Guenter Roeck <linux@roeck-us.net> Cc: Conor Dooley <conor.dooley@microchip.com> Cc: Andrew Jeffery <andrew@codeconstruct.com.au> Cc: Ninad Palsule <ninad@linux.ibm.com> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Acked-by: Guenter Roeck <linux@roeck-us.net> Link: https://patch.msgid.link/20250811141526.168752-2-krzysztof.kozlowski@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
2025-08-11	futex: Use user_write_access_begin/_end() in futex_put_value()	Waiman Long
	Commit cec199c5e39b ("futex: Implement FUTEX2_NUMA") introduced the futex_put_value() helper to write a value to the given user address. However, it uses user_read_access_begin() before the write. For architectures that differentiate between read and write accesses, like PowerPC, futex_put_value() fails with -EFAULT. Fix that by using the user_write_access_begin/user_write_access_end() pair instead. Fixes: cec199c5e39b ("futex: Implement FUTEX2_NUMA") Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20250811141147.322261-1-longman@redhat.com
2025-08-11	MAINTAINERS: entry for DRM GPUVM	Danilo Krummrich
	GPUVM deserves a bit more coordination, also given the upcoming Rust work for GPUVM, hence add a dedicated maintainers entry for DRM GPUVM. Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Alice Ryhl <aliceryhl@google.com> Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Acked-by: Matthew Brost <matthew.brost@intel.com> Acked-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20250808092432.461250-1-dakr@kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-08-11	x86/bugs: Select best SRSO mitigation	David Kaplan
	The SRSO bug can theoretically be used to conduct user->user or guest->guest attacks and requires a mitigation (namely IBPB instead of SBPB on context switch) for these. So mark SRSO as being applicable to the user->user and guest->guest attack vectors. Additionally, SRSO supports multiple mitigations which mitigate different potential attack vectors. Some CPUs are also immune to SRSO from certain attack vectors (like user->kernel). Use the specific attack vectors requiring mitigation to select the best SRSO mitigation to avoid unnecessary performance hits. Signed-off-by: David Kaplan <david.kaplan@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250721160310.1804203-1-david.kaplan@amd.com
2025-08-11	iosys-map: Fix undefined behavior in iosys_map_clear()	Nitin Gote
	The current iosys_map_clear() implementation reads the potentially uninitialized 'is_iomem' boolean field to decide which union member to clear. This causes undefined behavior when called on uninitialized structures, as 'is_iomem' may contain garbage values like 0xFF. UBSAN detects this as: UBSAN: invalid-load in include/linux/iosys-map.h:267 load of value 255 is not a valid value for type '_Bool' Fix by unconditionally clearing the entire structure with memset(), eliminating the need to read uninitialized data and ensuring all fields are set to known good values. Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14639 Fixes: 01fd30da0474 ("dma-buf: Add struct dma-buf-map for storing struct dma_buf.vaddr_ptr") Signed-off-by: Nitin Gote <nitin.r.gote@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://lore.kernel.org/r/20250718105051.2709487-1-nitin.r.gote@intel.com
2025-08-11	drm/tests: Fix drm_test_fb_xrgb8888_to_xrgb2101010() on big-endian	José Expósito
	Fix failures on big-endian architectures on tests cases single_pixel_source_buffer, single_pixel_clip_rectangle, well_known_colors and destination_pitch. Fixes: 15bda1f8de5d ("drm/tests: Add calls to drm_fb_blit() on supported format conversion tests") Signed-off-by: José Expósito <jose.exposito89@gmail.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://lore.kernel.org/r/20250630090054.353246-2-jose.exposito89@gmail.com
2025-08-11	drm/tests: Fix endian warning	José Expósito
	When compiling with sparse enabled, this warning is thrown: warning: incorrect type in argument 2 (different base types) expected restricted __le32 const [usertype] buf got unsigned int [usertype] [assigned] buf Add a cast to fix it. Fixes: 453114319699 ("drm/format-helper: Add KUnit tests for drm_fb_xrgb8888_to_xrgb2101010()") Signed-off-by: José Expósito <jose.exposito89@gmail.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://lore.kernel.org/r/20250630090054.353246-1-jose.exposito89@gmail.com
2025-08-11	Merge drm/drm-fixes into drm-misc-fixes	Thomas Zimmermann
	Updating drm-misc-fixes to the state of v6.17-rc1. Begins a new release cycle. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2025-08-11	Merge tag 'nfsd-6.17-1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - A correctness fix for delegated timestamps - Address an NFSD shutdown hang when LOCALIO is in use - Prevent a remotely exploitable crasher when TLS is in use * tag 'nfsd-6.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: sunrpc: fix handling of server side tls alerts nfsd: avoid ref leak in nfsd_open_local_fh() nfsd: don't set the ctime on delegated atime updates
2025-08-11	ALSA: hda/realtek: Fix headset mic on HONOR BRB-X	Vasiliy Kovalev
	Add a PCI quirk to enable microphone input on the headphone jack on the HONOR BRB-X M1010 laptop. Signed-off-by: Vasiliy Kovalev <kovalev@altlinux.org> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20250811132716.45076-1-kovalev@altlinux.org Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-08-11	module: Rename EXPORT_SYMBOL_GPL_FOR_MODULES to EXPORT_SYMBOL_FOR_MODULES	Vlastimil Babka
	Christoph suggested that the explicit _GPL_ can be dropped from the module namespace export macro, as it's intended for in-tree modules only. It would be possible to restrict it technically, but it was pointed out [2] that some cases of using an out-of-tree build of an in-tree module with the same name are legitimate. But in that case those also have to be GPL anyway so it's unnecessary to spell it out in the macro name. Link: https://lore.kernel.org/all/aFleJN_fE-RbSoFD@infradead.org/ [1] Link: https://lore.kernel.org/all/CAK7LNATRkZHwJGpojCnvdiaoDnP%2BaeUXgdey5sb_8muzdWTMkA@mail.gmail.com/ [2] Suggested-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Shivank Garg <shivankg@amd.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Nicolas Schier <n.schier@avm.de> Reviewed-by: Daniel Gomez <da.gomez@samsung.com> Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Link: https://lore.kernel.org/20250808-export_modules-v4-1-426945bcc5e1@suse.cz Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-08-11	fs: fix incorrect lflags value in the move_mount syscall	Yuntao Wang
	The lflags value used to look up from_path was overwritten by the one used to look up to_path. In other words, from_path was looked up with the wrong lflags value. Fix it. Fixes: f9fde814de37 ("fs: support getname_maybe_null() in move_mount()") Signed-off-by: Yuntao Wang <yuntao.wang@linux.dev> Link: https://lore.kernel.org/20250811052426.129188-1-yuntao.wang@linux.dev [Christian Brauner <brauner@kernel.org>: massage patch] Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-08-11	block: fix kobject double initialization in add_disk	Zheng Qixing
	Device-mapper can call add_disk() multiple times for the same gendisk due to its two-phase creation process (dm create + dm load). This leads to kobject double initialization errors when the underlying iSCSI devices become temporarily unavailable and then reappear. However, if the first add_disk() call fails and is retried, the queue_kobj gets initialized twice, causing: kobject: kobject (ffff88810c27bb90): tried to init an initialized object, something is seriously wrong. Call Trace: <TASK> dump_stack_lvl+0x5b/0x80 kobject_init.cold+0x43/0x51 blk_register_queue+0x46/0x280 add_disk_fwnode+0xb5/0x280 dm_setup_md_queue+0x194/0x1c0 table_load+0x297/0x2d0 ctl_ioctl+0x2a2/0x480 dm_ctl_ioctl+0xe/0x20 __x64_sys_ioctl+0xc7/0x110 do_syscall_64+0x72/0x390 entry_SYSCALL_64_after_hwframe+0x76/0x7e Fix this by separating kobject initialization from sysfs registration: - Initialize queue_kobj early during gendisk allocation - add_disk() only adds the already-initialized kobject to sysfs - del_gendisk() removes from sysfs but doesn't destroy the kobject - Final cleanup happens when the disk is released Fixes: 2bd85221a625 ("block: untangle request_queue refcounting from sysfs") Reported-by: Li Lingfeng <lilingfeng3@huawei.com> Closes: https://lore.kernel.org/all/83591d0b-2467-433c-bce0-5581298eb161@huawei.com/ Signed-off-by: Zheng Qixing <zhengqixing@huawei.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> Link: https://lore.kernel.org/r/20250808053609.3237836-1-zhengqixing@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-11	blk-cgroup: remove redundant __GFP_NOWARN	Qianfeng Rong
	Commit 16f5dfbc851b ("gfp: include __GFP_NOWARN in GFP_NOWAIT") made GFP_NOWAIT implicitly include __GFP_NOWARN. Therefore, explicit __GFP_NOWARN combined with GFP_NOWAIT (e.g., `GFP_NOWAIT \| __GFP_NOWARN`) is now redundant. Let's clean up these redundant flags across subsystems. Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20250809141358.168781-1-rongqianfeng@vivo.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-11	block, bfq: remove redundant __GFP_NOWARN	Qianfeng Rong
	Commit 16f5dfbc851b ("gfp: include __GFP_NOWARN in GFP_NOWAIT") made GFP_NOWAIT implicitly include __GFP_NOWARN. Therefore, explicit __GFP_NOWARN combined with GFP_NOWAIT (e.g., `GFP_NOWAIT \| __GFP_NOWARN`) is now redundant. Let's clean up these redundant flags across subsystems. Reviewed-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com> Link: https://lore.kernel.org/r/20250811081135.374315-1-rongqianfeng@vivo.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-11	ublk: check for unprivileged daemon on each I/O fetch	Caleb Sander Mateos
	Commit ab03a61c6614 ("ublk: have a per-io daemon instead of a per-queue daemon") allowed each ublk I/O to have an independent daemon task. However, nr_privileged_daemon is only computed based on whether the last I/O fetched in each ublk queue has an unprivileged daemon task. Fix this by checking whether every fetched I/O's daemon is privileged. Change nr_privileged_daemon from a count of queues to a boolean indicating whether any I/Os have an unprivileged daemon. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Fixes: ab03a61c6614 ("ublk: have a per-io daemon instead of a per-queue daemon") Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250808155216.296170-1-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-11	ublk: don't quiesce in ublk_ch_release	Uday Shankar
	ublk_ch_release currently quiesces the device's request_queue while setting force_abort/fail_io. This avoids data races by preventing concurrent reads from the I/O path, but is not strictly needed - at this point, canceling is already set and guaranteed to be observed by any concurrently executing I/Os, so they will be handled properly even if the changes to force_abort/fail_io propagate to the I/O path later. Remove the quiesce/unquiesce calls from ublk_ch_release. This makes the writes to force_abort/fail_io concurrent with the reads in the I/O path, so make the accesses atomic. Before this change, the call to blk_mq_quiesce_queue was responsible for most (90%) of the runtime of ublk_ch_release. With that call eliminated, ublk_ch_release runs much faster. Here is a comparison of the total time spent in calls to ublk_ch_release when a server handling 128 devices exits, before and after this change: before: 1.11s after: 0.09s Signed-off-by: Uday Shankar <ushankar@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250808-ublk_quiesce2-v1-1-f87ade33fa3d@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-11	drbd: Remove the open-coded page pool	Philipp Reisner
	If the network stack keeps a reference for too long, DRBD keeps references on a higher number of pages as a consequence. Fix all that by no longer relying on page reference counts dropping to an expected value. Instead, DRBD gives up its reference and lets the system handle everything else. While at it, remove the open-coded custom page pool mechanism and use the page_pool included in the kernel. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> Tested-by: Eric Hagberg <ehagberg@janestreet.com> Link: https://lore.kernel.org/r/20250605103852.23029-1-christoph.boehmwalder@linbit.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-11	selftests/coredump: Remove the read() that fails the test	Nam Cao
	Resolve a conflict between commit 6a68d28066b6 ("selftests/coredump: Fix "socket_detect_userspace_client" test failure") and commit 994dc26302ed ("selftests/coredump: fix build") The first commit adds a read() to wait for write() from another thread to finish. But the second commit removes the write(). Now that the two commits are in the same tree, the read() now gets EOF and the test fails. Remove this read() so that the test passes. Signed-off-by: Nam Cao <namcao@linutronix.de> Link: https://lore.kernel.org/20250811074957.4079616-1-namcao@linutronix.de Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-08-11	fuse: keep inode->i_blkbits constant	Joanne Koong
	With fuse now using iomap for writeback handling, inode blkbits changes are problematic because iomap relies on inode->i_blkbits for its internal bitmap logic. Currently we change inode->i_blkbits in fuse to match the attr->blksize value passed in by the server. This commit keeps inode->i_blkbits constant in fuse. Any attr->blksize values passed in by the server will not update inode->i_blkbits. The client-side behavior for stat is unaffected, stat will still reflect the blocksize passed in by the server. Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Link: https://lore.kernel.org/20250807175015.515192-1-joannelkoong@gmail.com Fixes: ef7e7cbb32 ("fuse: use iomap for writeback") Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-08-11	Merge patch series "open_tree_attr: do not allow id-mapping changes without ↵	Christian Brauner
	OPEN_TREE_CLONE" Aleksa Sarai <cyphar@cyphar.com> says: As described in commit 7a54947e727b ('Merge patch series "fs: allow changing idmappings"'), open_tree_attr(2) was necessary in order to allow for a detached mount to be created and have its idmappings changed without the risk of any racing threads operating on it. For this reason, mount_setattr(2) still does not allow for id-mappings to be changed. However, there was a bug in commit 2462651ffa76 ("fs: allow changing idmappings") which allowed users to bypass this restriction by calling open_tree_attr(2) without OPEN_TREE_CLONE. can_idmap_mount() prevented this bug from allowing an attached mountpoint's id-mapping from being modified (thanks to an is_anon_ns() check), but this still allows for detached (but visible) mounts to have their be id-mapping changed. This risks the same UAF and locking issues as described in the merge commit, and was likely unintentional. For what it's worth, I found this while working on the open_tree_attr(2) man page, and was trying to figure out what open_tree_attr(2)'s behaviour was in the (slightly fruity) ~OPEN_TREE_CLONE case. * patches from https://lore.kernel.org/20250808-open_tree_attr-bugfix-idmap-v1-0-0ec7bc05646c@cyphar.com: selftests/mount_setattr: add smoke tests for open_tree_attr(2) bug open_tree_attr: do not allow id-mapping changes without OPEN_TREE_CLONE Link: https://lore.kernel.org/20250808-open_tree_attr-bugfix-idmap-v1-0-0ec7bc05646c@cyphar.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-08-11	iomap: Fix broken data integrity guarantees for O_SYNC writes	Jan Kara
	Commit d279c80e0bac ("iomap: inline iomap_dio_bio_opflags()") has broken the logic in iomap_dio_bio_iter() in a way that when the device does support FUA (or has no writeback cache) and the direct IO happens to freshly allocated or unwritten extents, we will not issue fsync after completing direct IO O_SYNC / O_DSYNC write because the IOMAP_DIO_WRITE_THROUGH flag stays mistakenly set. Fix the problem by clearing IOMAP_DIO_WRITE_THROUGH whenever we do not perform FUA write as it was originally intended. CC: John Garry <john.g.garry@oracle.com> CC: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Fixes: d279c80e0bac ("iomap: inline iomap_dio_bio_opflags()") CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/20250730102840.20470-2-jack@suse.cz Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-08-11	selftests/mount_setattr: add smoke tests for open_tree_attr(2) bug	Aleksa Sarai
	There appear to be no other open_tree_attr(2) tests at the moment, but as a minimal solution just add some additional checks in the existing MOUNT_ATTR_IDMAP tests to make sure that open_tree_attr(2) cannot be used to bypass the tested restrictions that apply to mount_setattr(2). Signed-off-by: Aleksa Sarai <cyphar@cyphar.com> Link: https://lore.kernel.org/20250808-open_tree_attr-bugfix-idmap-v1-2-0ec7bc05646c@cyphar.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-08-11	open_tree_attr: do not allow id-mapping changes without OPEN_TREE_CLONE	Aleksa Sarai
	As described in commit 7a54947e727b ('Merge patch series "fs: allow changing idmappings"'), open_tree_attr(2) was necessary in order to allow for a detached mount to be created and have its idmappings changed without the risk of any racing threads operating on it. For this reason, mount_setattr(2) still does not allow for id-mappings to be changed. However, there was a bug in commit 2462651ffa76 ("fs: allow changing idmappings") which allowed users to bypass this restriction by calling open_tree_attr(2) without OPEN_TREE_CLONE. can_idmap_mount() prevented this bug from allowing an attached mountpoint's id-mapping from being modified (thanks to an is_anon_ns() check), but this still allows for detached (but visible) mounts to have their be id-mapping changed. This risks the same UAF and locking issues as described in the merge commit, and was likely unintentional. Fixes: 2462651ffa76 ("fs: allow changing idmappings") Cc: stable@vger.kernel.org # v6.15+ Signed-off-by: Aleksa Sarai <cyphar@cyphar.com> Link: https://lore.kernel.org/20250808-open_tree_attr-bugfix-idmap-v1-1-0ec7bc05646c@cyphar.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-08-11	fs: writeback: fix use-after-free in __mark_inode_dirty()	Jiufei Xue
	An use-after-free issue occurred when __mark_inode_dirty() get the bdi_writeback that was in the progress of switching. CPU: 1 PID: 562 Comm: systemd-random- Not tainted 6.6.56-gb4403bd46a8e #1 ...... pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : __mark_inode_dirty+0x124/0x418 lr : __mark_inode_dirty+0x118/0x418 sp : ffffffc08c9dbbc0 ........ Call trace: __mark_inode_dirty+0x124/0x418 generic_update_time+0x4c/0x60 file_modified+0xcc/0xd0 ext4_buffered_write_iter+0x58/0x124 ext4_file_write_iter+0x54/0x704 vfs_write+0x1c0/0x308 ksys_write+0x74/0x10c __arm64_sys_write+0x1c/0x28 invoke_syscall+0x48/0x114 el0_svc_common.constprop.0+0xc0/0xe0 do_el0_svc+0x1c/0x28 el0_svc+0x40/0xe4 el0t_64_sync_handler+0x120/0x12c el0t_64_sync+0x194/0x198 Root cause is: systemd-random-seed kworker ---------------------------------------------------------------------- ___mark_inode_dirty inode_switch_wbs_work_fn spin_lock(&inode->i_lock); inode_attach_wb locked_inode_to_wb_and_lock_list get inode->i_wb spin_unlock(&inode->i_lock); spin_lock(&wb->list_lock) spin_lock(&inode->i_lock) inode_io_list_move_locked spin_unlock(&wb->list_lock) spin_unlock(&inode->i_lock) spin_lock(&old_wb->list_lock) inode_do_switch_wbs spin_lock(&inode->i_lock) inode->i_wb = new_wb spin_unlock(&inode->i_lock) spin_unlock(&old_wb->list_lock) wb_put_many(old_wb, nr_switched) cgwb_release old wb released wb_wakeup_delayed() accesses wb, then trigger the use-after-free issue Fix this race condition by holding inode spinlock until wb_wakeup_delayed() finished. Signed-off-by: Jiufei Xue <jiufei.xue@samsung.com> Link: https://lore.kernel.org/20250728100715.3863241-1-jiufei.xue@samsung.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-08-11	xfs: split xfs_zone_record_blocks	Christoph Hellwig
	xfs_zone_record_blocks not only records successfully written blocks that now back file data, but is also used for blocks speculatively written by garbage collection that were never linked to an inode and instantly become invalid. Split the latter functionality out to be easier to understand. This also make it clear that we don't need to attach the rmap inode to a transaction for the skipped blocks case as we never dirty any peristent data structure. Also make the argument order to xfs_zone_record_blocks a bit more natural. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-08-11	xfs: fix scrub trace with null pointer in quotacheck	Andrey Albershteyn
	The quotacheck doesn't initialize sc->ip. Cc: stable@vger.kernel.org # v6.8 Fixes: 21d7500929c8a0 ("xfs: improve dquot iteration for scrub") Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-08-11	xfs: reject max_atomic_write mount option for no reflink	John Garry
	If the FS has no reflink, then atomic writes greater than 1x block are not supported. As such, for no reflink it is pointless to accept setting max_atomic_write when it cannot be supported, so reject max_atomic_write mount option in this case. It could be still possible to accept max_atomic_write option of size 1x block if HW atomics are supported, so check for this specifically. Fixes: 4528b9052731 ("xfs: allow sysadmins to specify a maximum atomic write limit at mount time") Signed-off-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-08-11	xfs: disallow atomic writes on DAX	John Garry
	Atomic writes are not currently supported for DAX, but two problems exist: - we may go down DAX write path for IOCB_ATOMIC, which does not handle IOCB_ATOMIC properly - we report non-zero atomic write limits in statx (for DAX inodes) We may want atomic writes support on DAX in future, but just disallow for now. For this, ensure when IOCB_ATOMIC is set that we check the write size versus the atomic write min and max before branching off to the DAX write path. This is not strictly required for DAX, as we should not get this far in the write path as FMODE_CAN_ATOMIC_WRITE should not be set. In addition, due to reflink being supported for DAX, we automatically get CoW-based atomic writes support being advertised. Remedy this by disallowing atomic writes for a DAX inode for both sw and hw modes. Reported-by: Darrick J. Wong <djwong@kernel.org> Fixes: 9dffc58f2384 ("xfs: update atomic write limits") Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-08-11	fs/dax: Reject IOCB_ATOMIC in dax_iomap_rw()	John Garry
	The DAX write path does not support IOCB_ATOMIC, so reject it when set. Suggested-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-08-11	xfs: remove XFS_IBULK_SAME_AG	Christoph Hellwig
	Add a new field to struct xfs_ibulk to directly pass XFS_IWALK* flags, and thus remove the need to indirect the SAME_AG flag through XFS_IBULK*. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-08-11	xfs: fully decouple XFS_IBULK* flags from XFS_IWALK* flags	Christoph Hellwig
	Fix up xfs_inumbers to now pass in the XFS_IBULK* flags into the flags argument to xfs_inobt_walk, which expects the XFS_IWALK* flags. Currently passing the wrong flags works for non-debug builds because the only XFS_IWALK* flag has the same encoding as the corresponding XFS_IBULK* flag, but in debug builds it can trigger an assert that no incorrect flag is passed. Instead just extra the relevant flag. Fixes: 5b35d922c52798 ("xfs: Decouple XFS_IBULK flags from XFS_IWALK flags") Cc: <stable@vger.kernel.org> # v5.19 Reported-by: cen zhang <zzzccc427@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-08-11	xfs: fix frozen file system assert in xfs_trans_alloc	Christoph Hellwig
	Commit 83a80e95e797 ("xfs: decouple xfs_trans_alloc_empty from xfs_trans_alloc") move the place of the assert for a frozen file system after the sb_start_intwrite call that ensures it doesn't run on frozen file systems, and thus allows to incorrect trigger it. Fix that by moving it back to where it belongs. Fixes: 83a80e95e797 ("xfs: decouple xfs_trans_alloc_empty from xfs_trans_alloc") Reported-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-08-11	HID: intel-thc-hid: intel-thc: Fix incorrect pointer arithmetic in I2C regs save	Aaron Ma
	Improper use of secondary pointer (&dev->i2c_subip_regs) caused kernel crash and out-of-bounds error: BUG: KASAN: slab-out-of-bounds in _regmap_bulk_read+0x449/0x510 Write of size 4 at addr ffff888136005dc0 by task kworker/u33:5/5107 CPU: 3 UID: 0 PID: 5107 Comm: kworker/u33:5 Not tainted 6.16.0+ #3 PREEMPT(voluntary) Workqueue: async async_run_entry_fn Call Trace: <TASK> dump_stack_lvl+0x76/0xa0 print_report+0xd1/0x660 ? __pfx__raw_spin_lock_irqsave+0x10/0x10 ? kasan_complete_mode_report_info+0x26/0x200 kasan_report+0xe1/0x120 ? _regmap_bulk_read+0x449/0x510 ? _regmap_bulk_read+0x449/0x510 __asan_report_store4_noabort+0x17/0x30 _regmap_bulk_read+0x449/0x510 ? __pfx__regmap_bulk_read+0x10/0x10 regmap_bulk_read+0x270/0x3d0 pio_complete+0x1ee/0x2c0 [intel_thc] ? __pfx_pio_complete+0x10/0x10 [intel_thc] ? __pfx_pio_wait+0x10/0x10 [intel_thc] ? regmap_update_bits_base+0x13b/0x1f0 thc_i2c_subip_pio_read+0x117/0x270 [intel_thc] thc_i2c_subip_regs_save+0xc2/0x140 [intel_thc] ? __pfx_thc_i2c_subip_regs_save+0x10/0x10 [intel_thc] [...] The buggy address belongs to the object at ffff888136005d00 which belongs to the cache kmalloc-rnd-12-192 of size 192 The buggy address is located 0 bytes to the right of allocated 192-byte region [ffff888136005d00, ffff888136005dc0) Replaced with direct array indexing (&dev->i2c_subip_regs[i]) to ensure safe memory access. Fixes: 4228966def884 ("HID: intel-thc-hid: intel-thc: Add THC I2C config interfaces") Signed-off-by: Aaron Ma <aaron.ma@canonical.com> Reviewed-by: Even Xu <even.xu@intel.com> Tested-by: Even Xu <even.xu@intel.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-08-11	HID: intel-thc-hid: intel-quicki2c: Fix ACPI dsd ICRS/ISUB length	Aaron Ma
	The QuickI2C ACPI _DSD methods return ICRS and ISUB data with a trailing byte, making the actual length is one more byte than the structs defined. It caused stack-out-of-bounds and kernel crash: kernel: BUG: KASAN: stack-out-of-bounds in quicki2c_acpi_get_dsd_property.constprop.0+0x111/0x1b0 [intel_quicki2c] kernel: Write of size 12 at addr ffff888106d1f900 by task kworker/u33:2/75 kernel: kernel: CPU: 3 UID: 0 PID: 75 Comm: kworker/u33:2 Not tainted 6.16.0+ #3 PREEMPT(voluntary) kernel: Workqueue: async async_run_entry_fn kernel: Call Trace: kernel: <TASK> kernel: dump_stack_lvl+0x76/0xa0 kernel: print_report+0xd1/0x660 kernel: ? __pfx__raw_spin_lock_irqsave+0x10/0x10 kernel: ? __kasan_slab_free+0x5d/0x80 kernel: ? kasan_addr_to_slab+0xd/0xb0 kernel: kasan_report+0xe1/0x120 kernel: ? quicki2c_acpi_get_dsd_property.constprop.0+0x111/0x1b0 [intel_quicki2c] kernel: ? quicki2c_acpi_get_dsd_property.constprop.0+0x111/0x1b0 [intel_quicki2c] kernel: kasan_check_range+0x11c/0x200 kernel: __asan_memcpy+0x3b/0x80 kernel: quicki2c_acpi_get_dsd_property.constprop.0+0x111/0x1b0 [intel_quicki2c] kernel: ? __pfx_quicki2c_acpi_get_dsd_property.constprop.0+0x10/0x10 [intel_quicki2c] kernel: quicki2c_get_acpi_resources+0x237/0x730 [intel_quicki2c] [...] kernel: </TASK> kernel: kernel: The buggy address belongs to stack of task kworker/u33:2/75 kernel: and is located at offset 48 in frame: kernel: quicki2c_get_acpi_resources+0x0/0x730 [intel_quicki2c] kernel: kernel: This frame has 3 objects: kernel: [32, 36) 'hid_desc_addr' kernel: [48, 59) 'i2c_param' kernel: [80, 224) 'i2c_config' ACPI DSD methods return: \_SB.PC00.THC0.ICRS Buffer 000000003fdc947b 001 Len 0C = 0A 00 80 1A 06 00 00 00 00 00 00 00 \_SB.PC00.THC0.ISUB Buffer 00000000f2fcbdc4 001 Len 91 = 00 00 00 00 00 00 00 00 00 00 00 00 Adding reserved padding to quicki2c_subip_acpi_parameter/config. Fixes: 5282e45ccbfa9 ("HID: intel-thc-hid: intel-quicki2c: Add THC QuickI2C ACPI interfaces") Signed-off-by: Aaron Ma <aaron.ma@canonical.com> Reviewed-by: Even Xu <even.xu@intel.com> Tested-by: Even Xu <even.xu@intel.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-08-11	of: reserved_mem: Restructure call site for dma_contiguous_early_fixup()	Oreoluwa Babatunde
	Restructure the call site for dma_contiguous_early_fixup() to where the reserved_mem nodes are being parsed from the DT so that dma_mmu_remap[] is populated before dma_contiguous_remap() is called. Fixes: 8a6e02d0c00e ("of: reserved_mem: Restructure how the reserved memory regions are processed") Signed-off-by: Oreoluwa Babatunde <oreoluwa.babatunde@oss.qualcomm.com> Tested-by: William Zhang <william.zhang@broadcom.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20250806172421.2748302-1-oreoluwa.babatunde@oss.qualcomm.com
2025-08-11	soc/tegra: pmc: Ensure power-domains are in a known state	Jon Hunter
	After commit 13a4b7fb6260 ("pmdomain: core: Leave powered-on genpds on until late_initcall_sync") was applied, the Tegra210 Jetson TX1 board failed to boot. Looking into this issue, before this commit was applied, if any of the Tegra power-domains were in 'on' state when the kernel booted, they were being turned off by the genpd core before any driver had chance to request them. This was purely by luck and a consequence of the power-domains being turned off earlier during boot. After this commit was applied, any power-domains in the 'on' state are kept on for longer during boot and therefore, may never transitioned to the off state before they are requested/used. The hang on the Tegra210 Jetson TX1 is caused because devices in some power-domains are accessed without the power-domain being turned off and on, indicating that the power-domain is not in a completely on state. >From reviewing the Tegra PMC driver code, if a power-domain is in the 'on' state there is no guarantee that all the necessary clocks associated with the power-domain are on and even if they are they would not have been requested via the clock framework and so could be turned off later. Some power-domains also have a 'clamping' register that needs to be configured as well. In short, if a power-domain is already 'on' it is difficult to know if it has been configured correctly. Given that the power-domains happened to be switched off during boot previously, to ensure that they are in a good known state on boot, fix this by switching off any power-domains that are on initially when registering the power-domains with the genpd framework. Note that commit 05cfb988a4d0 ("soc/tegra: pmc: Initialise resets associated with a power partition") updated the tegra_powergate_of_get_resets() function to pass the 'off' to ensure that the resets for the power-domain are in the correct state on boot. However, now that we may power off a domain on boot, if it is on, it is better to move this logic into the tegra_powergate_add() function so that there is a single place where we are handling the initial state of the power-domain. Fixes: a38045121bf4 ("soc/tegra: pmc: Add generic PM domain support") Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250731121832.213671-1-jonathanh@nvidia.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2025-08-11	ALSA: hda/realtek: Add Framework Laptop 13 (AMD Ryzen AI 300) to quirks	Christopher Eby
	Framework Laptop 13 (AMD Ryzen AI 300) requires the same quirk for headset detection as other Framework 13 models. Signed-off-by: Christopher Eby <kreed@kreed.org> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20250810030006.9060-1-kreed@kreed.org Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-08-11	rcu: Fix racy re-initialization of irq_work causing hangs	Frederic Weisbecker
	RCU re-initializes the deferred QS irq work everytime before attempting to queue it. However there are situations where the irq work is attempted to be queued even though it is already queued. In that case re-initializing messes-up with the irq work queue that is about to be handled. The chances for that to happen are higher when the architecture doesn't support self-IPIs and irq work are then all lazy, such as with the following sequence: 1) rcu_read_unlock() is called when IRQs are disabled and there is a grace period involving blocked tasks on the node. The irq work is then initialized and queued. 2) The related tasks are unblocked and the CPU quiescent state is reported. rdp->defer_qs_iw_pending is reset to DEFER_QS_IDLE, allowing the irq work to be requeued in the future (note the previous one hasn't fired yet). 3) A new grace period starts and the node has blocked tasks. 4) rcu_read_unlock() is called when IRQs are disabled again. The irq work is re-initialized (but it's queued! and its node is cleared) and requeued. Which means it's requeued to itself. 5) The irq work finally fires with the tick. But since it was requeued to itself, it loops and hangs. Fix this with initializing the irq work only once before the CPU boots. Fixes: b41642c87716 ("rcu: Fix rcu_read_unlock() deadloop due to IRQ work") Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202508071303.c1134cce-lkp@intel.com Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Neeraj Upadhyay (AMD) <neeraj.upadhyay@kernel.org>