summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-08-11rust: drm: don't pass the address of drm::Device to drm_dev_put()Danilo Krummrich
In drm_dev_put() call in AlwaysRefCounted::dec_ref() we rely on struct drm_device to be the first field in drm::Device, whereas everywhere else we correctly obtain the address of the actual struct drm_device. Analogous to the from_drm_device() helper, provide the into_drm_device() helper in order to address this. Fixes: 1e4b8896c0f3 ("rust: drm: add device abstraction") Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20250731154919.4132-5-dakr@kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-08-11rust: drm: remove pin annotations from drm::DeviceDanilo Krummrich
The #[pin_data] and #[pin] annotations are not necessary for drm::Device, since we don't use any pin-init macros, but only __pinned_init() on the impl PinInit<T::Data, Error> argument of drm::Device::new(). Fixes: 1e4b8896c0f3 ("rust: drm: add device abstraction") Reviewed-by: Benno Lossin <lossin@kernel.org> Link: https://lore.kernel.org/r/20250731154919.4132-4-dakr@kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-08-11rust: drm: ensure kmalloc() compatible LayoutDanilo Krummrich
drm::Device is allocated through __drm_dev_alloc() (which uses kmalloc()) and the driver private data, <T as drm::Driver>::Data, is initialized in-place. Due to the order of fields in drm::Device pub struct Device<T: drm::Driver> { dev: Opaque<bindings::drm_device>, data: T::Data, } even with an arbitrary large alignment requirement of T::Data it can't happen that the size of Device is smaller than its alignment requirement. However, let's not rely on this subtle circumstance and create a proper kmalloc() compatible Layout. Fixes: 1e4b8896c0f3 ("rust: drm: add device abstraction") Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20250731154919.4132-3-dakr@kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-08-11rust: alloc: replace aligned_size() with Kmalloc::aligned_layout()Danilo Krummrich
aligned_size() dates back to when Rust did support kmalloc() only, but is now used in ReallocFunc::call() and hence for all allocators. However, the additional padding applied by aligned_size() is only required by the kmalloc() allocator backend. Hence, replace aligned_size() with Kmalloc::aligned_layout() and use it for the affected allocators, i.e. kmalloc() and kvmalloc(), only. While at it, make Kmalloc::aligned_layout() public, such that Rust abstractions, which have to call subsystem specific kmalloc() based allocation primitives directly, can make use of it. Fixes: 8a799831fc63 ("rust: alloc: implement `ReallocFunc`") Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20250731154919.4132-2-dakr@kernel.org [ Remove `const` from Kmalloc::aligned_layout(). - Danilo ] Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-08-11iio: imu: inv_icm42600: change invalid data error to -EBUSYJean-Baptiste Maneyrol
Temperature sensor returns the temperature of the mechanical parts of the chip. If both accel and gyro are off, the temperature sensor is also automatically turned off and returns invalid data. In this case, returning -EBUSY error code is better then -EINVAL and indicates userspace that it needs to retry reading temperature in another context. Fixes: bc3eb0207fb5 ("iio: imu: inv_icm42600: add temperature sensor support") Signed-off-by: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol@tdk.com> Cc: stable@vger.kernel.org Reviewed-by: Andy Shevchenko <andy@kernel.org> Reviewed-by: Sean Nyekjaer <sean@geanix.com> Link: https://patch.msgid.link/20250808-inv-icm42600-change-temperature-error-code-v1-1-986fbf63b77d@tdk.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-08-11iio: adc: ad7124: fix channel lookup in syscalib functionsDavid Lechner
Fix possible incorrect channel lookup in the syscalib functions by using the correct channel address instead of the channel number. In the ad7124 driver, the channel field of struct iio_chan_spec is the input pin number of the positive input of the channel. This can be, but is not always the same as the index in the channels array. The correct index in the channels array is stored in the address field (and also scan_index). We use the address field to perform the correct lookup. Fixes: 47036a03a303 ("iio: adc: ad7124: Implement internal calibration at probe time") Signed-off-by: David Lechner <dlechner@baylibre.com> Reviewed-by: Nuno Sá <nuno.sa@analog.com> Link: https://patch.msgid.link/20250726-iio-adc-ad7124-fix-channel-lookup-in-syscalib-v1-1-b9d14bb684af@baylibre.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-08-11iio: temperature: maxim_thermocouple: use DMA-safe buffer for spi_read()David Lechner
Replace using stack-allocated buffers with a DMA-safe buffer for use with spi_read(). This allows the driver to be safely used with DMA-enabled SPI controllers. The buffer array is also converted to a struct with a union to make the usage of the memory in the buffer more clear and ensure proper alignment. Fixes: 1f25ca11d84a ("iio: temperature: add support for Maxim thermocouple chips") Signed-off-by: David Lechner <dlechner@baylibre.com> Reviewed-by: Nuno Sá <nuno.sa@analog.com> Link: https://patch.msgid.link/20250721-iio-use-more-iio_declare_buffer_with_ts-3-v2-1-0c68d41ccf6c@baylibre.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-08-11iio: adc: ad7173: prevent scan if too many setups requestedDavid Lechner
Add a check to ad7173_update_scan_mode() to ensure that we didn't exceed the maximum number of unique channel configurations. In the AD7173 family of chips, there are some chips that have 16 CHANNELx registers but only 8 setups (combination of CONFIGx, FILTERx, GAINx and OFFSETx registers). Since commit 92c247216918 ("iio: adc: ad7173: fix num_slots"), it is possible to have more than 8 channels enabled in a scan at the same time, so it is possible to get a bad configuration when more than 8 channels are using unique configurations. This happens because the algorithm to allocate the setup slots only takes into account which slot has been least recently used and doesn't know about the maximum number of slots available. Since the algorithm to allocate the setup slots is quite complex, it is simpler to check after the fact if the current state is valid or not. So this patch adds a check in ad7173_update_scan_mode() after setting up all of the configurations to make sure that the actual setup still matches the requested setup for each enabled channel. If not, we prevent the scan from being enabled and return an error. The setup comparison in ad7173_setup_equal() is refactored to a separate function since we need to call it in two places now. Fixes: 92c247216918 ("iio: adc: ad7173: fix num_slots") Signed-off-by: David Lechner <dlechner@baylibre.com> Link: https://patch.msgid.link/20250722-iio-adc-ad7173-fix-setup-use-limits-v2-1-8e96bdb72a9c@baylibre.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-08-11iio: proximity: isl29501: fix buffered read on big-endian systemsDavid Lechner
Fix passing a u32 value as a u16 buffer scan item. This works on little- endian systems, but not on big-endian systems. A new local variable is introduced for getting the register value and the array is changed to a struct to make the data layout more explicit rather than just changing the type and having to recalculate the proper length needed for the timestamp. Fixes: 1c28799257bc ("iio: light: isl29501: Add support for the ISL29501 ToF sensor.") Signed-off-by: David Lechner <dlechner@baylibre.com> Link: https://patch.msgid.link/20250722-iio-use-more-iio_declare_buffer_with_ts-7-v2-1-d3ebeb001ed3@baylibre.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-08-11iio: accel: sca3300: fix uninitialized iio scan dataDavid Lechner
Fix potential leak of uninitialized stack data to userspace by ensuring that the `channels` array is zeroed before use. Fixes: edeb67fbbf4b ("iio: accel: sca3300: use IIO_DECLARE_BUFFER_WITH_TS") Signed-off-by: David Lechner <dlechner@baylibre.com> Link: https://patch.msgid.link/20250723-iio-accel-sca3300-fix-uninitialized-iio-scan-data-v1-1-12dbfb3307b7@baylibre.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-08-11x86/fpu: Fix NULL dereference in avx512_status()Fushuai Wang
Problem ------- With CONFIG_X86_DEBUG_FPU enabled, reading /proc/[kthread]/arch_status causes a warning and a NULL pointer dereference. This is because the AVX-512 timestamp code uses x86_task_fpu() but doesn't check it for NULL. CONFIG_X86_DEBUG_FPU addles that function for kernel threads (PF_KTHREAD specifically), making it return NULL. The point of the warning was to ensure that kernel threads only access task->fpu after going through kernel_fpu_begin()/_end(). Note: all kernel tasks exposed in /proc have a valid task->fpu. Solution -------- One option is to silence the warning and check for NULL from x86_task_fpu(). However, that warning is fairly fresh and seems like a defense against misuse of the FPU state in kernel threads. Instead, stop outputting AVX-512_elapsed_ms for kernel threads altogether. The data was garbage anyway because avx512_timestamp is only updated for user threads, not kernel threads. If anyone ever wants to track kernel thread AVX-512 use, they can come back later and do it properly, separate from this bug fix. [ dhansen: mostly rewrite changelog ] Fixes: 22aafe3bcb67 ("x86/fpu: Remove init_task FPU state dependencies, add debugging warning for PF_KTHREAD tasks") Co-developed-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Fushuai Wang <wangfushuai@baidu.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20250811185044.2227268-1-sohil.mehta%40intel.com
2025-08-11cpufreq: intel_pstate: Support Clearwater Forest OOB modeSrinivas Pandruvada
Prevent intel_pstate from loading when OOB (Out Of Band) P-states mode is enabled. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://patch.msgid.link/20250808145122.4057208-1-srinivas.pandruvada@linux.intel.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-08-11cpuidle: governors: menu: Avoid using invalid recent intervals dataRafael J. Wysocki
Marc has reported that commit 85975daeaa4d ("cpuidle: menu: Avoid discarding useful information") caused the number of wakeup interrupts to increase on an idle system [1], which was not expected to happen after merely allowing shallower idle states to be selected by the governor in some cases. However, on the system in question, all of the idle states deeper than WFI are rejected by the driver due to a firmware issue [2]. This causes the governor to only consider the recent interval duriation data corresponding to attempts to enter WFI that are successful and the recent invervals table is filled with values lower than the scheduler tick period. Consequently, the governor predicts an idle duration below the scheduler tick period length and avoids stopping the tick more often which leads to the observed symptom. Address it by modifying the governor to update the recent intervals table also when entering the previously selected idle state fails, so it knows that the short idle intervals might have been the minority had the selected idle states been actually entered every time. Fixes: 85975daeaa4d ("cpuidle: menu: Avoid discarding useful information") Link: https://lore.kernel.org/linux-pm/86o6sv6n94.wl-maz@kernel.org/ [1] Link: https://lore.kernel.org/linux-pm/7ffcb716-9a1b-48c2-aaa4-469d0df7c792@arm.com/ [2] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Christian Loehle <christian.loehle@arm.com> Tested-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Christian Loehle <christian.loehle@arm.com> Link: https://patch.msgid.link/2793874.mvXUDI8C0e@rafael.j.wysocki
2025-08-11intel_idle: Allow loading ACPI tables for any familyLen Brown
There is no reason to limit intel_idle's loading of ACPI tables to family 6. Upcoming Intel processors are not in family 6. Below "Fixes" really means "applies cleanly until". That syntax commit didn't change the previous logic, but shows this patch applies back 5-years. Fixes: 4a9f45a0533f ("intel_idle: Convert to new X86 CPU match macros") Signed-off-by: Len Brown <len.brown@intel.com> Link: https://patch.msgid.link/06101aa4fe784e5b0be1cb2c0bdd9afcf16bd9d4.1754681697.git.len.brown@intel.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-08-11selftests/sched_ext: Remove duplicate sched.h headerJiapeng Chong
./tools/testing/selftests/sched_ext/hotplug.c: sched.h is included more than once. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=22941 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Acked-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-08-11sched/ext: Fix invalid task state transitions on class switchAndrea Righi
When enabling a sched_ext scheduler, we may trigger invalid task state transitions, resulting in warnings like the following (which can be easily reproduced by running the hotplug selftest in a loop): sched_ext: Invalid task state transition 0 -> 3 for fish[770] WARNING: CPU: 18 PID: 787 at kernel/sched/ext.c:3862 scx_set_task_state+0x7c/0xc0 ... RIP: 0010:scx_set_task_state+0x7c/0xc0 ... Call Trace: <TASK> scx_enable_task+0x11f/0x2e0 switching_to_scx+0x24/0x110 scx_enable.isra.0+0xd14/0x13d0 bpf_struct_ops_link_create+0x136/0x1a0 __sys_bpf+0x1edd/0x2c30 __x64_sys_bpf+0x21/0x30 do_syscall_64+0xbb/0x370 entry_SYSCALL_64_after_hwframe+0x77/0x7f This happens because we skip initialization for tasks that are already dead (with their usage counter set to zero), but we don't exclude them during the scheduling class transition phase. Fix this by also skipping dead tasks during class swiching, preventing invalid task state transitions. Fixes: a8532fac7b5d2 ("sched_ext: TASK_DEAD tasks must be switched into SCX on ops_enable") Cc: stable@vger.kernel.org # v6.12+ Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-08-11blk-wbt: doc: Update the doc of the wbt_lat_usec interfaceTang Yizhou
The symbol wb_window_usec cannot be found. Update the doc to reflect the latest implementation, in other words, the debugfs interface 'curr_win_nsec'. Signed-off-by: Tang Yizhou <yizhou.tang@shopee.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20250727173959.160835-4-yizhou.tang@shopee.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-11blk-wbt: Eliminate ambiguity in the comments of struct rq_wbTang Yizhou
In the current implementation, the last_issue and last_comp members of struct rq_wb are used only by read requests and not by non-throttled write requests. Therefore, eliminate the ambiguity here. Signed-off-by: Tang Yizhou <yizhou.tang@shopee.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20250727173959.160835-3-yizhou.tang@shopee.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-11blk-wbt: Optimize wbt_done() for non-throttled writesTang Yizhou
In the current implementation, the sync_cookie and last_cookie members of struct rq_wb are used only by read requests and not by non-throttled write requests. Based on this, we can optimize wbt_done() by removing one if condition check for non-throttled write requests. Signed-off-by: Tang Yizhou <yizhou.tang@shopee.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20250727173959.160835-2-yizhou.tang@shopee.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-11regulator: dt-bindings: infineon,ir38060: Add Guenter as maintainer from IBMKrzysztof Kozlowski
The infineon,ir38060 binding never got maintainer and fake "Not Me" entry have been causing dt_binding_check warnings for 1.5 years now: regulator/infineon,ir38060.yaml: maintainers:0: 'Not Me.' does not match '@' Guenter agreed to keep an eye for this hardware and binding. Cc: Guenter Roeck <linux@roeck-us.net> Cc: Conor Dooley <conor.dooley@microchip.com> Cc: Andrew Jeffery <andrew@codeconstruct.com.au> Cc: Ninad Palsule <ninad@linux.ibm.com> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Acked-by: Guenter Roeck <linux@roeck-us.net> Link: https://patch.msgid.link/20250811141526.168752-2-krzysztof.kozlowski@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
2025-08-11futex: Use user_write_access_begin/_end() in futex_put_value()Waiman Long
Commit cec199c5e39b ("futex: Implement FUTEX2_NUMA") introduced the futex_put_value() helper to write a value to the given user address. However, it uses user_read_access_begin() before the write. For architectures that differentiate between read and write accesses, like PowerPC, futex_put_value() fails with -EFAULT. Fix that by using the user_write_access_begin/user_write_access_end() pair instead. Fixes: cec199c5e39b ("futex: Implement FUTEX2_NUMA") Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20250811141147.322261-1-longman@redhat.com
2025-08-11MAINTAINERS: entry for DRM GPUVMDanilo Krummrich
GPUVM deserves a bit more coordination, also given the upcoming Rust work for GPUVM, hence add a dedicated maintainers entry for DRM GPUVM. Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Alice Ryhl <aliceryhl@google.com> Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Acked-by: Matthew Brost <matthew.brost@intel.com> Acked-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20250808092432.461250-1-dakr@kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-08-11x86/bugs: Select best SRSO mitigationDavid Kaplan
The SRSO bug can theoretically be used to conduct user->user or guest->guest attacks and requires a mitigation (namely IBPB instead of SBPB on context switch) for these. So mark SRSO as being applicable to the user->user and guest->guest attack vectors. Additionally, SRSO supports multiple mitigations which mitigate different potential attack vectors. Some CPUs are also immune to SRSO from certain attack vectors (like user->kernel). Use the specific attack vectors requiring mitigation to select the best SRSO mitigation to avoid unnecessary performance hits. Signed-off-by: David Kaplan <david.kaplan@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250721160310.1804203-1-david.kaplan@amd.com
2025-08-11iosys-map: Fix undefined behavior in iosys_map_clear()Nitin Gote
The current iosys_map_clear() implementation reads the potentially uninitialized 'is_iomem' boolean field to decide which union member to clear. This causes undefined behavior when called on uninitialized structures, as 'is_iomem' may contain garbage values like 0xFF. UBSAN detects this as: UBSAN: invalid-load in include/linux/iosys-map.h:267 load of value 255 is not a valid value for type '_Bool' Fix by unconditionally clearing the entire structure with memset(), eliminating the need to read uninitialized data and ensuring all fields are set to known good values. Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14639 Fixes: 01fd30da0474 ("dma-buf: Add struct dma-buf-map for storing struct dma_buf.vaddr_ptr") Signed-off-by: Nitin Gote <nitin.r.gote@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://lore.kernel.org/r/20250718105051.2709487-1-nitin.r.gote@intel.com
2025-08-11drm/tests: Fix drm_test_fb_xrgb8888_to_xrgb2101010() on big-endianJosé Expósito
Fix failures on big-endian architectures on tests cases single_pixel_source_buffer, single_pixel_clip_rectangle, well_known_colors and destination_pitch. Fixes: 15bda1f8de5d ("drm/tests: Add calls to drm_fb_blit() on supported format conversion tests") Signed-off-by: José Expósito <jose.exposito89@gmail.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://lore.kernel.org/r/20250630090054.353246-2-jose.exposito89@gmail.com
2025-08-11drm/tests: Fix endian warningJosé Expósito
When compiling with sparse enabled, this warning is thrown: warning: incorrect type in argument 2 (different base types) expected restricted __le32 const [usertype] *buf got unsigned int [usertype] *[assigned] buf Add a cast to fix it. Fixes: 453114319699 ("drm/format-helper: Add KUnit tests for drm_fb_xrgb8888_to_xrgb2101010()") Signed-off-by: José Expósito <jose.exposito89@gmail.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://lore.kernel.org/r/20250630090054.353246-1-jose.exposito89@gmail.com
2025-08-11Merge drm/drm-fixes into drm-misc-fixesThomas Zimmermann
Updating drm-misc-fixes to the state of v6.17-rc1. Begins a new release cycle. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2025-08-11Merge tag 'nfsd-6.17-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - A correctness fix for delegated timestamps - Address an NFSD shutdown hang when LOCALIO is in use - Prevent a remotely exploitable crasher when TLS is in use * tag 'nfsd-6.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: sunrpc: fix handling of server side tls alerts nfsd: avoid ref leak in nfsd_open_local_fh() nfsd: don't set the ctime on delegated atime updates
2025-08-11ALSA: hda/realtek: Fix headset mic on HONOR BRB-XVasiliy Kovalev
Add a PCI quirk to enable microphone input on the headphone jack on the HONOR BRB-X M1010 laptop. Signed-off-by: Vasiliy Kovalev <kovalev@altlinux.org> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20250811132716.45076-1-kovalev@altlinux.org Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-08-11module: Rename EXPORT_SYMBOL_GPL_FOR_MODULES to EXPORT_SYMBOL_FOR_MODULESVlastimil Babka
Christoph suggested that the explicit _GPL_ can be dropped from the module namespace export macro, as it's intended for in-tree modules only. It would be possible to restrict it technically, but it was pointed out [2] that some cases of using an out-of-tree build of an in-tree module with the same name are legitimate. But in that case those also have to be GPL anyway so it's unnecessary to spell it out in the macro name. Link: https://lore.kernel.org/all/aFleJN_fE-RbSoFD@infradead.org/ [1] Link: https://lore.kernel.org/all/CAK7LNATRkZHwJGpojCnvdiaoDnP%2BaeUXgdey5sb_8muzdWTMkA@mail.gmail.com/ [2] Suggested-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Shivank Garg <shivankg@amd.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Nicolas Schier <n.schier@avm.de> Reviewed-by: Daniel Gomez <da.gomez@samsung.com> Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Link: https://lore.kernel.org/20250808-export_modules-v4-1-426945bcc5e1@suse.cz Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-08-11fs: fix incorrect lflags value in the move_mount syscallYuntao Wang
The lflags value used to look up from_path was overwritten by the one used to look up to_path. In other words, from_path was looked up with the wrong lflags value. Fix it. Fixes: f9fde814de37 ("fs: support getname_maybe_null() in move_mount()") Signed-off-by: Yuntao Wang <yuntao.wang@linux.dev> Link: https://lore.kernel.org/20250811052426.129188-1-yuntao.wang@linux.dev [Christian Brauner <brauner@kernel.org>: massage patch] Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-08-11block: fix kobject double initialization in add_diskZheng Qixing
Device-mapper can call add_disk() multiple times for the same gendisk due to its two-phase creation process (dm create + dm load). This leads to kobject double initialization errors when the underlying iSCSI devices become temporarily unavailable and then reappear. However, if the first add_disk() call fails and is retried, the queue_kobj gets initialized twice, causing: kobject: kobject (ffff88810c27bb90): tried to init an initialized object, something is seriously wrong. Call Trace: <TASK> dump_stack_lvl+0x5b/0x80 kobject_init.cold+0x43/0x51 blk_register_queue+0x46/0x280 add_disk_fwnode+0xb5/0x280 dm_setup_md_queue+0x194/0x1c0 table_load+0x297/0x2d0 ctl_ioctl+0x2a2/0x480 dm_ctl_ioctl+0xe/0x20 __x64_sys_ioctl+0xc7/0x110 do_syscall_64+0x72/0x390 entry_SYSCALL_64_after_hwframe+0x76/0x7e Fix this by separating kobject initialization from sysfs registration: - Initialize queue_kobj early during gendisk allocation - add_disk() only adds the already-initialized kobject to sysfs - del_gendisk() removes from sysfs but doesn't destroy the kobject - Final cleanup happens when the disk is released Fixes: 2bd85221a625 ("block: untangle request_queue refcounting from sysfs") Reported-by: Li Lingfeng <lilingfeng3@huawei.com> Closes: https://lore.kernel.org/all/83591d0b-2467-433c-bce0-5581298eb161@huawei.com/ Signed-off-by: Zheng Qixing <zhengqixing@huawei.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> Link: https://lore.kernel.org/r/20250808053609.3237836-1-zhengqixing@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-11blk-cgroup: remove redundant __GFP_NOWARNQianfeng Rong
Commit 16f5dfbc851b ("gfp: include __GFP_NOWARN in GFP_NOWAIT") made GFP_NOWAIT implicitly include __GFP_NOWARN. Therefore, explicit __GFP_NOWARN combined with GFP_NOWAIT (e.g., `GFP_NOWAIT | __GFP_NOWARN`) is now redundant. Let's clean up these redundant flags across subsystems. Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20250809141358.168781-1-rongqianfeng@vivo.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-11block, bfq: remove redundant __GFP_NOWARNQianfeng Rong
Commit 16f5dfbc851b ("gfp: include __GFP_NOWARN in GFP_NOWAIT") made GFP_NOWAIT implicitly include __GFP_NOWARN. Therefore, explicit __GFP_NOWARN combined with GFP_NOWAIT (e.g., `GFP_NOWAIT | __GFP_NOWARN`) is now redundant. Let's clean up these redundant flags across subsystems. Reviewed-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com> Link: https://lore.kernel.org/r/20250811081135.374315-1-rongqianfeng@vivo.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-11ublk: check for unprivileged daemon on each I/O fetchCaleb Sander Mateos
Commit ab03a61c6614 ("ublk: have a per-io daemon instead of a per-queue daemon") allowed each ublk I/O to have an independent daemon task. However, nr_privileged_daemon is only computed based on whether the last I/O fetched in each ublk queue has an unprivileged daemon task. Fix this by checking whether every fetched I/O's daemon is privileged. Change nr_privileged_daemon from a count of queues to a boolean indicating whether any I/Os have an unprivileged daemon. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Fixes: ab03a61c6614 ("ublk: have a per-io daemon instead of a per-queue daemon") Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250808155216.296170-1-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-11ublk: don't quiesce in ublk_ch_releaseUday Shankar
ublk_ch_release currently quiesces the device's request_queue while setting force_abort/fail_io. This avoids data races by preventing concurrent reads from the I/O path, but is not strictly needed - at this point, canceling is already set and guaranteed to be observed by any concurrently executing I/Os, so they will be handled properly even if the changes to force_abort/fail_io propagate to the I/O path later. Remove the quiesce/unquiesce calls from ublk_ch_release. This makes the writes to force_abort/fail_io concurrent with the reads in the I/O path, so make the accesses atomic. Before this change, the call to blk_mq_quiesce_queue was responsible for most (90%) of the runtime of ublk_ch_release. With that call eliminated, ublk_ch_release runs much faster. Here is a comparison of the total time spent in calls to ublk_ch_release when a server handling 128 devices exits, before and after this change: before: 1.11s after: 0.09s Signed-off-by: Uday Shankar <ushankar@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250808-ublk_quiesce2-v1-1-f87ade33fa3d@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-11drbd: Remove the open-coded page poolPhilipp Reisner
If the network stack keeps a reference for too long, DRBD keeps references on a higher number of pages as a consequence. Fix all that by no longer relying on page reference counts dropping to an expected value. Instead, DRBD gives up its reference and lets the system handle everything else. While at it, remove the open-coded custom page pool mechanism and use the page_pool included in the kernel. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> Tested-by: Eric Hagberg <ehagberg@janestreet.com> Link: https://lore.kernel.org/r/20250605103852.23029-1-christoph.boehmwalder@linbit.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-11selftests/coredump: Remove the read() that fails the testNam Cao
Resolve a conflict between commit 6a68d28066b6 ("selftests/coredump: Fix "socket_detect_userspace_client" test failure") and commit 994dc26302ed ("selftests/coredump: fix build") The first commit adds a read() to wait for write() from another thread to finish. But the second commit removes the write(). Now that the two commits are in the same tree, the read() now gets EOF and the test fails. Remove this read() so that the test passes. Signed-off-by: Nam Cao <namcao@linutronix.de> Link: https://lore.kernel.org/20250811074957.4079616-1-namcao@linutronix.de Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-08-11fuse: keep inode->i_blkbits constantJoanne Koong
With fuse now using iomap for writeback handling, inode blkbits changes are problematic because iomap relies on inode->i_blkbits for its internal bitmap logic. Currently we change inode->i_blkbits in fuse to match the attr->blksize value passed in by the server. This commit keeps inode->i_blkbits constant in fuse. Any attr->blksize values passed in by the server will not update inode->i_blkbits. The client-side behavior for stat is unaffected, stat will still reflect the blocksize passed in by the server. Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Link: https://lore.kernel.org/20250807175015.515192-1-joannelkoong@gmail.com Fixes: ef7e7cbb32 ("fuse: use iomap for writeback") Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-08-11Merge patch series "open_tree_attr: do not allow id-mapping changes without ↵Christian Brauner
OPEN_TREE_CLONE" Aleksa Sarai <cyphar@cyphar.com> says: As described in commit 7a54947e727b ('Merge patch series "fs: allow changing idmappings"'), open_tree_attr(2) was necessary in order to allow for a detached mount to be created and have its idmappings changed without the risk of any racing threads operating on it. For this reason, mount_setattr(2) still does not allow for id-mappings to be changed. However, there was a bug in commit 2462651ffa76 ("fs: allow changing idmappings") which allowed users to bypass this restriction by calling open_tree_attr(2) *without* OPEN_TREE_CLONE. can_idmap_mount() prevented this bug from allowing an attached mountpoint's id-mapping from being modified (thanks to an is_anon_ns() check), but this still allows for detached (but visible) mounts to have their be id-mapping changed. This risks the same UAF and locking issues as described in the merge commit, and was likely unintentional. For what it's worth, I found this while working on the open_tree_attr(2) man page, and was trying to figure out what open_tree_attr(2)'s behaviour was in the (slightly fruity) ~OPEN_TREE_CLONE case. * patches from https://lore.kernel.org/20250808-open_tree_attr-bugfix-idmap-v1-0-0ec7bc05646c@cyphar.com: selftests/mount_setattr: add smoke tests for open_tree_attr(2) bug open_tree_attr: do not allow id-mapping changes without OPEN_TREE_CLONE Link: https://lore.kernel.org/20250808-open_tree_attr-bugfix-idmap-v1-0-0ec7bc05646c@cyphar.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-08-11iomap: Fix broken data integrity guarantees for O_SYNC writesJan Kara
Commit d279c80e0bac ("iomap: inline iomap_dio_bio_opflags()") has broken the logic in iomap_dio_bio_iter() in a way that when the device does support FUA (or has no writeback cache) and the direct IO happens to freshly allocated or unwritten extents, we will *not* issue fsync after completing direct IO O_SYNC / O_DSYNC write because the IOMAP_DIO_WRITE_THROUGH flag stays mistakenly set. Fix the problem by clearing IOMAP_DIO_WRITE_THROUGH whenever we do not perform FUA write as it was originally intended. CC: John Garry <john.g.garry@oracle.com> CC: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Fixes: d279c80e0bac ("iomap: inline iomap_dio_bio_opflags()") CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/20250730102840.20470-2-jack@suse.cz Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-08-11selftests/mount_setattr: add smoke tests for open_tree_attr(2) bugAleksa Sarai
There appear to be no other open_tree_attr(2) tests at the moment, but as a minimal solution just add some additional checks in the existing MOUNT_ATTR_IDMAP tests to make sure that open_tree_attr(2) cannot be used to bypass the tested restrictions that apply to mount_setattr(2). Signed-off-by: Aleksa Sarai <cyphar@cyphar.com> Link: https://lore.kernel.org/20250808-open_tree_attr-bugfix-idmap-v1-2-0ec7bc05646c@cyphar.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-08-11open_tree_attr: do not allow id-mapping changes without OPEN_TREE_CLONEAleksa Sarai
As described in commit 7a54947e727b ('Merge patch series "fs: allow changing idmappings"'), open_tree_attr(2) was necessary in order to allow for a detached mount to be created and have its idmappings changed without the risk of any racing threads operating on it. For this reason, mount_setattr(2) still does not allow for id-mappings to be changed. However, there was a bug in commit 2462651ffa76 ("fs: allow changing idmappings") which allowed users to bypass this restriction by calling open_tree_attr(2) *without* OPEN_TREE_CLONE. can_idmap_mount() prevented this bug from allowing an attached mountpoint's id-mapping from being modified (thanks to an is_anon_ns() check), but this still allows for detached (but visible) mounts to have their be id-mapping changed. This risks the same UAF and locking issues as described in the merge commit, and was likely unintentional. Fixes: 2462651ffa76 ("fs: allow changing idmappings") Cc: stable@vger.kernel.org # v6.15+ Signed-off-by: Aleksa Sarai <cyphar@cyphar.com> Link: https://lore.kernel.org/20250808-open_tree_attr-bugfix-idmap-v1-1-0ec7bc05646c@cyphar.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-08-11fs: writeback: fix use-after-free in __mark_inode_dirty()Jiufei Xue
An use-after-free issue occurred when __mark_inode_dirty() get the bdi_writeback that was in the progress of switching. CPU: 1 PID: 562 Comm: systemd-random- Not tainted 6.6.56-gb4403bd46a8e #1 ...... pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : __mark_inode_dirty+0x124/0x418 lr : __mark_inode_dirty+0x118/0x418 sp : ffffffc08c9dbbc0 ........ Call trace: __mark_inode_dirty+0x124/0x418 generic_update_time+0x4c/0x60 file_modified+0xcc/0xd0 ext4_buffered_write_iter+0x58/0x124 ext4_file_write_iter+0x54/0x704 vfs_write+0x1c0/0x308 ksys_write+0x74/0x10c __arm64_sys_write+0x1c/0x28 invoke_syscall+0x48/0x114 el0_svc_common.constprop.0+0xc0/0xe0 do_el0_svc+0x1c/0x28 el0_svc+0x40/0xe4 el0t_64_sync_handler+0x120/0x12c el0t_64_sync+0x194/0x198 Root cause is: systemd-random-seed kworker ---------------------------------------------------------------------- ___mark_inode_dirty inode_switch_wbs_work_fn spin_lock(&inode->i_lock); inode_attach_wb locked_inode_to_wb_and_lock_list get inode->i_wb spin_unlock(&inode->i_lock); spin_lock(&wb->list_lock) spin_lock(&inode->i_lock) inode_io_list_move_locked spin_unlock(&wb->list_lock) spin_unlock(&inode->i_lock) spin_lock(&old_wb->list_lock) inode_do_switch_wbs spin_lock(&inode->i_lock) inode->i_wb = new_wb spin_unlock(&inode->i_lock) spin_unlock(&old_wb->list_lock) wb_put_many(old_wb, nr_switched) cgwb_release old wb released wb_wakeup_delayed() accesses wb, then trigger the use-after-free issue Fix this race condition by holding inode spinlock until wb_wakeup_delayed() finished. Signed-off-by: Jiufei Xue <jiufei.xue@samsung.com> Link: https://lore.kernel.org/20250728100715.3863241-1-jiufei.xue@samsung.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-08-11xfs: split xfs_zone_record_blocksChristoph Hellwig
xfs_zone_record_blocks not only records successfully written blocks that now back file data, but is also used for blocks speculatively written by garbage collection that were never linked to an inode and instantly become invalid. Split the latter functionality out to be easier to understand. This also make it clear that we don't need to attach the rmap inode to a transaction for the skipped blocks case as we never dirty any peristent data structure. Also make the argument order to xfs_zone_record_blocks a bit more natural. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-08-11xfs: fix scrub trace with null pointer in quotacheckAndrey Albershteyn
The quotacheck doesn't initialize sc->ip. Cc: stable@vger.kernel.org # v6.8 Fixes: 21d7500929c8a0 ("xfs: improve dquot iteration for scrub") Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-08-11xfs: reject max_atomic_write mount option for no reflinkJohn Garry
If the FS has no reflink, then atomic writes greater than 1x block are not supported. As such, for no reflink it is pointless to accept setting max_atomic_write when it cannot be supported, so reject max_atomic_write mount option in this case. It could be still possible to accept max_atomic_write option of size 1x block if HW atomics are supported, so check for this specifically. Fixes: 4528b9052731 ("xfs: allow sysadmins to specify a maximum atomic write limit at mount time") Signed-off-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-08-11xfs: disallow atomic writes on DAXJohn Garry
Atomic writes are not currently supported for DAX, but two problems exist: - we may go down DAX write path for IOCB_ATOMIC, which does not handle IOCB_ATOMIC properly - we report non-zero atomic write limits in statx (for DAX inodes) We may want atomic writes support on DAX in future, but just disallow for now. For this, ensure when IOCB_ATOMIC is set that we check the write size versus the atomic write min and max before branching off to the DAX write path. This is not strictly required for DAX, as we should not get this far in the write path as FMODE_CAN_ATOMIC_WRITE should not be set. In addition, due to reflink being supported for DAX, we automatically get CoW-based atomic writes support being advertised. Remedy this by disallowing atomic writes for a DAX inode for both sw and hw modes. Reported-by: Darrick J. Wong <djwong@kernel.org> Fixes: 9dffc58f2384 ("xfs: update atomic write limits") Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-08-11fs/dax: Reject IOCB_ATOMIC in dax_iomap_rw()John Garry
The DAX write path does not support IOCB_ATOMIC, so reject it when set. Suggested-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-08-11xfs: remove XFS_IBULK_SAME_AGChristoph Hellwig
Add a new field to struct xfs_ibulk to directly pass XFS_IWALK* flags, and thus remove the need to indirect the SAME_AG flag through XFS_IBULK*. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>