path: root/drivers/block
2025-03-28  ublk: add segment parameter  (Ming Lei)
IO splitting is usually bad in the io_uring world: it triggers -EAGAIN, and IO handling may have to fall back to io-wq, which hurts performance. ublk recently started to support zero copy; to avoid unnecessary IO splitting, the ublk driver's segment limit should be aligned with the backend device's segment limit. Another reason is that io_buffer_register_bvec() needs to allocate bvecs, whose count matches the ublk request's segment count, so a big memory allocation can be avoided by setting a reasonable max_segments limit. So add a segment parameter that gives the ublk server a chance to align the segment limit with its backend, while keeping the limit reasonable from the implementation viewpoint. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250327095123.179113-7-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
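For reference, a minimal sketch of what such a UAPI parameter block can look like; the field names and layout here are assumptions, not necessarily the merged definition:

    /* sketch: segment limits a ublk server can set before START_DEV,
     * copied from the limits its backend device reports
     */
    struct ublk_param_segment {
    	__u64	seg_boundary_mask;	/* power-of-2 minus 1 */
    	__u32	max_segment_size;	/* in bytes */
    	__u16	max_segments;
    	__u8	pad[2];
    };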
2025-03-28  ublk: call io_uring_cmd_to_pdu to get uring_cmd pdu  (Ming Lei)
Call io_uring_cmd_to_pdu() to get the uring_cmd pdu; one big benefit is the automatic build-time check of the pdu size. Suggested-by: Uday Shankar <ushankar@purestorage.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://lore.kernel.org/r/20250327095123.179113-6-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
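A sketch of the resulting accessor: the helper casts the uring_cmd pdu area to the driver's pdu type, and fails the build if the type doesn't fit in that area. The surrounding function is an assumption:

    static inline struct ublk_uring_cmd_pdu *ublk_get_uring_cmd_pdu(
    		struct io_uring_cmd *ioucmd)
    {
    	/* size of struct ublk_uring_cmd_pdu is checked at build time */
    	return io_uring_cmd_to_pdu(ioucmd, struct ublk_uring_cmd_pdu);
    }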
2025-03-28  ublk: add helper of ublk_need_map_io()  (Ming Lei)
ublk_need_map_io() is more readable. Reviewed-by: Caleb Sander Mateos <csander@purestorage.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250327095123.179113-5-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
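A sketch of what such a predicate can look like, assuming a ublk_support_user_copy()-style helper already exists; the exact body is an assumption:

    /* does the driver need to copy data between request pages and the
     * userspace buffer for this queue?
     */
    static inline bool ublk_need_map_io(const struct ublk_queue *ubq)
    {
    	return !ublk_support_user_copy(ubq);
    }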
2025-03-28  ublk: remove two unused fields from 'struct ublk_queue'  (Ming Lei)
Remove two unused fields (`io_addr` & `max_io_sz`) from `struct ublk_queue`. Reviewed-by: Caleb Sander Mateos <csander@purestorage.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250327095123.179113-4-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-28  ublk: comment on ubq->canceling handling in ublk_queue_rq()  (Ming Lei)
In ublk_queue_rq(), ubq->canceling has to be handled after ->fail_io and ->force_abort are dealt with; otherwise the request may not be failed when deleting the disk. Add a comment on this usage. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250327095123.179113-3-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-28  ublk: make sure ubq->canceling is set when queue is frozen  (Ming Lei)
The ublk driver now depends on `ubq->canceling` to decide whether a request can be dispatched via uring_cmd and io_uring_cmd_complete_in_task(). Once ubq->canceling is set, the uring_cmd can be completed via ublk_cancel_cmd() and io_uring_cmd_done(). So set ubq->canceling while the queue is frozen; this makes sure the flag can be observed reliably from ublk_queue_rq() and avoids a use-after-free on the uring_cmd. Fixes: 216c8f5ef0f2 ("ublk: replace monitor with cancelable uring_cmd") Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250327095123.179113-2-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-26  Merge tag 'for-6.15/block-20250322' of git://git.kernel.dk/linux  (Linus Torvalds)
Pull block updates from Jens Axboe:

 - Fixes for integrity handling

 - NVMe pull request via Keith:
     - Secure concatenation for TCP transport (Hannes)
     - Multipath sysfs visibility (Nilay)
     - Various cleanups (Qasim, Baruch, Wang, Chen, Mike, Damien, Li)
     - Correct use of 64-bit BARs for pci-epf target (Niklas)
     - Socket fix for selinux when used in containers (Peijie)

 - MD pull request via Yu:
     - fix recovery can preempt resync (Li Nan)
     - fix md-bitmap IO limit (Su Yue)
     - fix raid10 discard with REQ_NOWAIT (Xiao Ni)
     - fix raid1 memory leak (Zheng Qixing)
     - fix mddev uaf (Yu Kuai)
     - fix raid1,raid10 IO flags (Yu Kuai)
     - some refactor and cleanup (Yu Kuai)

 - Series cleaning up and fixing bugs in the bad block handling code

 - Improve support for write failure simulation in null_blk

 - Various lock ordering fixes

 - Fixes for locking for debugfs attributes

 - Various ublk related fixes and improvements

 - Cleanups for blk-rq-qos wait handling

 - blk-throttle fixes

 - Fixes for loop dio and sync handling

 - Fixes and cleanups for the auto-PI code

 - Block side support for hardware encryption keys in blk-crypto

 - Various cleanups and fixes

* tag 'for-6.15/block-20250322' of git://git.kernel.dk/linux: (105 commits)
  nvmet: replace max(a, min(b, c)) by clamp(val, lo, hi)
  nvme-tcp: fix selinux denied when calling sock_sendmsg
  nvmet: pci-epf: Always configure BAR0 as 64-bit
  nvmet: Remove duplicate uuid_copy
  nvme: zns: Simplify nvme_zone_parse_entry()
  nvmet: pci-epf: Remove redundant 'flush_workqueue()' calls
  nvmet-fc: Remove unused functions
  nvme-pci: remove stale comment
  nvme-fc: Utilise min3() to simplify queue count calculation
  nvme-multipath: Add visibility for queue-depth io-policy
  nvme-multipath: Add visibility for numa io-policy
  nvme-multipath: Add visibility for round-robin io-policy
  nvmet: add tls_concat and tls_key debugfs entries
  nvmet-tcp: support secure channel concatenation
  nvmet: Add 'sq' argument to alloc_ctrl_args
  nvme-fabrics: reset admin connection for secure concatenation
  nvme-tcp: request secure channel concatenation
  nvme-keyring: add nvme_tls_psk_refresh()
  nvme: add nvme_auth_derive_tls_psk()
  nvme: add nvme_auth_generate_digest()
  ...
2025-03-26  Merge tag 'for-6.15/io_uring-20250322' of git://git.kernel.dk/linux  (Linus Torvalds)
Pull io_uring updates from Jens Axboe:
 "This is the first of the io_uring pull requests for the 6.15 merge
  window, there will be others once the net tree has gone in. This
  contains:

   - Cleanup and unification of cancelation handling across various
     request types.

   - Improvement for bundles, supporting them both for incrementally
     consumed buffers, and for non-multishot requests.

   - Enable toggling of using iowait while waiting on io_uring events or
     not. Unfortunately this is still tied with CPU frequency boosting
     on short waits, as the scheduler side has not been very receptive
     to splitting the (useless) iowait stat from the cpufreq implied
     boost.

   - Add support for kbuf nodes, enabling zero-copy support for the ublk
     block driver.

   - Various cleanups for resource node handling.

   - Series greatly cleaning up the legacy provided (non-ring based)
     buffers. For years, we've been pushing the ring provided buffers as
     the way to go, and that is what people have been using. Reduce the
     complexity and code associated with legacy provided buffers.

   - Series cleaning up the compat handling.

   - Series improving and cleaning up the recvmsg/sendmsg iovec and msg
     handling.

   - Series of cleanups for io-wq.

   - Start adding a bunch of selftests. The liburing repository generally
     carries feature and regression tests for everything, but at least
     for ublk initially, we'll try and go the route of having it in
     selftests as well. We'll see how this goes, might decide to migrate
     more tests this way in the future.

   - Various little cleanups and fixes"

* tag 'for-6.15/io_uring-20250322' of git://git.kernel.dk/linux: (108 commits)
  selftests: ublk: add stripe target
  selftests: ublk: simplify loop io completion
  selftests: ublk: enable zero copy for null target
  selftests: ublk: prepare for supporting stripe target
  selftests: ublk: move common code into common.c
  selftests: ublk: increase max buffer size to 1MB
  selftests: ublk: add single sqe allocator helper
  selftests: ublk: add generic_01 for verifying sequential IO order
  selftests: ublk: fix starting ublk device
  io_uring: enable toggle of iowait usage when waiting on CQEs
  selftests: ublk: fix write cache implementation
  selftests: ublk: add variable for user to not show test result
  selftests: ublk: don't show `modprobe` failure
  selftests: ublk: add one dependency header
  io_uring/kbuf: enable bundles for incrementally consumed buffers
  Revert "io_uring/rsrc: simplify the bvec iter count calculation"
  selftests: ublk: improve test usability
  selftests: ublk: add stress test for covering IO vs. killing ublk server
  selftests: ublk: add one stress test for covering IO vs. removing device
  selftests: ublk: load/unload ublk_drv when preparing & cleaning up tests
  ...
2025-03-25  Merge tag 'timers-cleanups-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip  (Linus Torvalds)
Pull timer cleanups from Thomas Gleixner:
 "A treewide hrtimer timer cleanup

  hrtimers are initialized with hrtimer_init() and a subsequent store to
  the callback pointer. This turned out to be suboptimal for the upcoming
  Rust integration and is obviously a silly implementation to begin with.

  This cleanup replaces the

      hrtimer_init(T);
      T->function = cb;

  sequence with

      hrtimer_setup(T, cb);

  The conversion was done with Coccinelle and a few manual fixups.

  Once the conversion has completely landed in mainline, hrtimer_init()
  will be removed and the hrtimer::function becomes a private member"

* tag 'timers-cleanups-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (100 commits)
  wifi: rt2x00: Switch to use hrtimer_update_function()
  io_uring: Use helper function hrtimer_update_function()
  serial: xilinx_uartps: Use helper function hrtimer_update_function()
  ASoC: fsl: imx-pcm-fiq: Switch to use hrtimer_setup()
  RDMA: Switch to use hrtimer_setup()
  virtio: mem: Switch to use hrtimer_setup()
  drm/vmwgfx: Switch to use hrtimer_setup()
  drm/xe/oa: Switch to use hrtimer_setup()
  drm/vkms: Switch to use hrtimer_setup()
  drm/msm: Switch to use hrtimer_setup()
  drm/i915/request: Switch to use hrtimer_setup()
  drm/i915/uncore: Switch to use hrtimer_setup()
  drm/i915/pmu: Switch to use hrtimer_setup()
  drm/i915/perf: Switch to use hrtimer_setup()
  drm/i915/gvt: Switch to use hrtimer_setup()
  drm/i915/huc: Switch to use hrtimer_setup()
  drm/amdgpu: Switch to use hrtimer_setup()
  stm class: heartbeat: Switch to use hrtimer_setup()
  i2c: Switch to use hrtimer_setup()
  iio: Switch to use hrtimer_setup()
  ...
2025-03-19  ublk: remove io_cmds list in ublk_queue  (Uday Shankar)
The current I/O dispatch mechanism - queueing I/O by adding it to the io_cmds list (and poking task_work as needed), then dispatching it in ublk server task context by reversing io_cmds and completing the io_uring command associated with each one - was introduced by commit 7d4a93176e014 ("ublk_drv: don't forward io commands in reserve order") to ensure that the ublk server received I/O in the same order that the block layer submitted it to ublk_drv. This mechanism was only needed for the "raw" task_work submission mechanism, since the io_uring task work wrapper maintains FIFO ordering (using quite a similar mechanism in fact). The "raw" task_work submission mechanism is no longer supported in ublk_drv as of commit 29dc5d06613f2 ("ublk: kill queuing request by task_work_add"), so the explicit llist/reversal is no longer needed - it just duplicates logic already present in the underlying io_uring APIs. Remove it. Signed-off-by: Uday Shankar <ushankar@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250318-ublk_io_cmds-v1-1-c1bb74798fef@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
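For context, a sketch of the removed dispatch pattern; the local names (ublk_rq_data, __ublk_rq_task_work) follow the old driver but are assumptions here:

    /* drain the lock-free list, then restore FIFO order before
     * completing the uring_cmd for each queued request
     */
    struct llist_node *io_cmds = llist_del_all(&ubq->io_cmds);
    struct ublk_rq_data *data, *tmp;

    io_cmds = llist_reverse_order(io_cmds);
    llist_for_each_entry_safe(data, tmp, io_cmds, node)
    	__ublk_rq_task_work(blk_mq_rq_from_pdu(data), issue_flags);

The io_uring task-work wrapper already delivers work items in FIFO order, which is why this explicit reversal became redundant.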
2025-03-18  loop: move vfs_fsync() out of loop_update_dio()  (Ming Lei)
If vfs_fsync() is called with the queue frozen, the queue freeze lock may become connected with FS internal locks, and a lockdep warning can be triggered because the queue freeze lock is connected with too many global or sub-system locks. Fix the warning by moving vfs_fsync() out of loop_update_dio():
 - vfs_fsync() is only needed when switching to dio
 - only loop_change_fd() and loop_configure() may switch from buffered IO to direct IO, so call vfs_fsync() directly there. This is safe because either the loop device is unbound, or the new file isn't attached yet
 - for the other two cases, set_status and set_block_size, direct IO can only be switched off, so there is no need to call vfs_fsync()
Cc: Christoph Hellwig <hch@infradead.org> Reported-by: Kun Hu <huk23@m.fudan.edu.cn> Reported-by: Jiaji Qin <jjtan24@m.fudan.edu.cn> Closes: https://lore.kernel.org/linux-block/359BC288-B0B1-4815-9F01-3A349B12E816@m.fudan.edu.cn/T/#u Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250318072955.3893805-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
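A sketch of the resulting call site; the local variable and label names are assumptions. The point is that the flush happens before any queue freeze, and only on the buffered-to-dio transition:

    /* in loop_configure()/loop_change_fd(): flush the page cache of
     * the backing file before switching the device to direct I/O,
     * outside of any queue freeze
     */
    if (use_dio) {
    	error = vfs_fsync(file, 0);
    	if (error)
    		goto out_putf;
    }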
2025-03-16  zram: add might_sleep to zcomp API  (Sergey Senozhatsky)
Explicitly state that zcomp compress/decompress must be called from non-atomic context. Link: https://lkml.kernel.org/r/20250303022425.285971-20-senozhatsky@chromium.org Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Hillf Danton <hdanton@sina.com> Cc: Kairui Song <ryncsn@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Yosry Ahmed <yosry.ahmed@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-16  zram: do not leak page on writeback_store error path  (Sergey Senozhatsky)
Ensure the page used for local object data is freed on the error-out path. Link: https://lkml.kernel.org/r/20250303022425.285971-19-senozhatsky@chromium.org Fixes: 330edc2bc059 ("zram: rework writeback target selection strategy") Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Hillf Danton <hdanton@sina.com> Cc: Kairui Song <ryncsn@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Yosry Ahmed <yosry.ahmed@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-16  zram: do not leak page on recompress_store error path  (Sergey Senozhatsky)
Ensure the page used for local object data is freed on the error-out path. Link: https://lkml.kernel.org/r/20250303022425.285971-18-senozhatsky@chromium.org Fixes: 3f909a60cec1 ("zram: rework recompress target selection strategy") Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Hillf Danton <hdanton@sina.com> Cc: Kairui Song <ryncsn@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Yosry Ahmed <yosry.ahmed@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-16  zram: permit reclaim in zstd custom allocator  (Sergey Senozhatsky)
When configured with pre-trained compression/decompression dictionary support, zstd requires a custom memory allocator, which it calls internally from its compression/decompression routines. That used to mean allocation from atomic context (either under entry spin-lock, or per-CPU local-lock, or both). Now, with non-atomic zram read()/write(), those limitations are relaxed and we can allow direct and indirect reclaim. Link: https://lkml.kernel.org/r/20250303022425.285971-17-senozhatsky@chromium.org Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Hillf Danton <hdanton@sina.com> Cc: Kairui Song <ryncsn@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Yosry Ahmed <yosry.ahmed@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-16  zram: switch to new zsmalloc object mapping API  (Sergey Senozhatsky)
Use the new read/write zsmalloc object API. For cases when an RO-mapped object spans two physical pages (which requires a temp buffer), compression streams now carry around one extra physical page. Link: https://lkml.kernel.org/r/20250303022425.285971-16-senozhatsky@chromium.org Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Hillf Danton <hdanton@sina.com> Cc: Kairui Song <ryncsn@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Yosry Ahmed <yosry.ahmed@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-16  zram: move post-processing target allocation  (Sergey Senozhatsky)
Allocate the post-processing target in place_pp_slot(). This simplifies the scan_slots_for_writeback() and scan_slots_for_recompress() loops because we don't need to track pps pointer state anymore. Previously we had to explicitly NULL the pointer if it had been added to a post-processing bucket, or otherwise reuse the previously allocated pointer, and make sure we didn't leak the memory in the end. We are also fine doing GFP_NOIO allocation, as post-processing can be called under memory pressure, so we'd better pick as many slots as we can as soon as we can and start post-processing them, possibly saving memory. Allocation failure there is not fatal; we will post-process whatever we put into the buckets on previous iterations. Link: https://lkml.kernel.org/r/20250303022425.285971-12-senozhatsky@chromium.org Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Hillf Danton <hdanton@sina.com> Cc: Kairui Song <ryncsn@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Yosry Ahmed <yosry.ahmed@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-16  zram: rework recompression loop  (Sergey Senozhatsky)
This reworks recompression loop handling:
 - set a rule that stream-put NULLs the stream pointer. If the loop returns with a non-NULL stream then it's a successful recompression; otherwise the stream should always be NULL.
 - do not count the number of recompressions. Mark an object as incompressible as soon as the algorithm with the highest priority failed to compress that object.
 - count compression errors as resource usage. Even if compression has failed, we still need to bump the num_recomp_pages counter.
Link: https://lkml.kernel.org/r/20250303022425.285971-11-senozhatsky@chromium.org Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Hillf Danton <hdanton@sina.com> Cc: Kairui Song <ryncsn@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Yosry Ahmed <yosry.ahmed@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-16  zram: filter out recomp targets based on priority  (Sergey Senozhatsky)
Do not select for post-processing slots that are already compressed with the same or a higher priority compression algorithm. This should save some memory, as previously we would still put those entries into the corresponding post-processing buckets and filter them out later in recompress_slot(). Link: https://lkml.kernel.org/r/20250303022425.285971-10-senozhatsky@chromium.org Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Hillf Danton <hdanton@sina.com> Cc: Kairui Song <ryncsn@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Yosry Ahmed <yosry.ahmed@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-16  zram: limit max recompress prio to num_active_comps  (Sergey Senozhatsky)
Use the actual number of algorithms zram was configured with instead of the theoretical limit of ZRAM_MAX_COMPS. Also make sure that the min prio is not above the max prio. Link: https://lkml.kernel.org/r/20250303022425.285971-9-senozhatsky@chromium.org Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Hillf Danton <hdanton@sina.com> Cc: Kairui Song <ryncsn@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Yosry Ahmed <yosry.ahmed@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-16  zram: remove writestall zram_stats member  (Sergey Senozhatsky)
There is no zsmalloc handle allocation slow path now and writestall is not possible any longer. Remove it from zram_stats. Link: https://lkml.kernel.org/r/20250303022425.285971-8-senozhatsky@chromium.org Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Hillf Danton <hdanton@sina.com> Cc: Kairui Song <ryncsn@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Yosry Ahmed <yosry.ahmed@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-16  zram: add GFP_NOWARN to incompressible zsmalloc handle allocation  (Sergey Senozhatsky)
We normally use __GFP_NOWARN for zsmalloc handle allocations; add it to the write_incompressible_page() allocation too. Link: https://lkml.kernel.org/r/20250303022425.285971-7-senozhatsky@chromium.org Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Hillf Danton <hdanton@sina.com> Cc: Kairui Song <ryncsn@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Yosry Ahmed <yosry.ahmed@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-16  zram: remove second stage of handle allocation  (Sergey Senozhatsky)
Previously zram write() was atomic, which required us to pass __GFP_KSWAPD_RECLAIM to zsmalloc handle allocation on the fast path and attempt a slow-path allocation (with recompression) if the fast path failed. Since we are not in atomic context anymore, we can permit direct reclaim during handle allocation and hence can have a single allocation path. Because there is no slow path anymore, we don't unlock the per-CPU stream (and don't lose compressed data), which means there is no need to recompress now (this should reduce CPU and battery usage). Link: https://lkml.kernel.org/r/20250303022425.285971-6-senozhatsky@chromium.org Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Hillf Danton <hdanton@sina.com> Cc: Kairui Song <ryncsn@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Yosry Ahmed <yosry.ahmed@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-16  zram: remove max_comp_streams device attr  (Sergey Senozhatsky)
The max_comp_streams device attribute has been defunct since May 2016, when zram switched to per-CPU compression streams; remove it. Link: https://lkml.kernel.org/r/20250303022425.285971-5-senozhatsky@chromium.org Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Hillf Danton <hdanton@sina.com> Cc: Kairui Song <ryncsn@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Yosry Ahmed <yosry.ahmed@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-16  zram: remove unused crypto include  (Sergey Senozhatsky)
We stopped using the crypto API (for the time being), so remove its include and replace CRYPTO_MAX_ALG_NAME with a local define. Link: https://lkml.kernel.org/r/20250303022425.285971-4-senozhatsky@chromium.org Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Hillf Danton <hdanton@sina.com> Cc: Kairui Song <ryncsn@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Yosry Ahmed <yosry.ahmed@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-16  zram: permit preemption with active compression stream  (Sergey Senozhatsky)
Currently, per-CPU stream access is done from a non-preemptible (atomic) section, which imposes the same atomicity requirements on compression backends as the entry spin-lock does, and makes it impossible to use algorithms that can schedule/wait/sleep during compression and decompression. Switch to a preemptible per-CPU model, similar to the one used in zswap. Instead of a per-CPU local lock, each stream carries a mutex which is locked throughout the entire time zram uses it for compression or decompression, so that the cpu-dead event waits for zram to stop using a particular per-CPU stream and release it. Link: https://lkml.kernel.org/r/20250303022425.285971-3-senozhatsky@chromium.org Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Suggested-by: Yosry Ahmed <yosry.ahmed@linux.dev> Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev> Cc: Hillf Danton <hdanton@sina.com> Cc: Kairui Song <ryncsn@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-16  zram: sleepable entry locking  (Sergey Senozhatsky)
Patch series "zsmalloc/zram: there be preemption", v10.

Currently zram runs compression and decompression in non-preemptible sections, e.g.

    zcomp_stream_get()      // grabs CPU local lock
    zcomp_compress()

or

    zram_slot_lock()        // grabs entry spin-lock
    zcomp_stream_get()      // grabs CPU local lock
    zs_map_object()         // grabs rwlock and CPU local lock
    zcomp_decompress()

This is potentially a little troublesome for a number of reasons. For instance, it makes it impossible to use async compression algorithms or/and H/W compression algorithms, which can wait for OP completion or resource availability. It also restricts what compression algorithms can do internally; for example, zstd can allocate internal state memory for C/D dictionaries:

    do_fsync()
      do_writepages()
        zram_bio_write()
          zram_write_page()              // becomes non-preemptible
            zcomp_compress()
              zstd_compress()
                ZSTD_compress_usingCDict()
                  ZSTD_compressBegin_usingCDict_internal()
                    ZSTD_resetCCtx_usingCDict()
                      ZSTD_resetCCtx_internal()
                        zstd_custom_alloc()  // memory allocation

Not to mention that the system can be configured to maximize compression ratio at a cost of CPU/HW time (e.g. lz4hc or deflate with a very high compression level), so zram can stay in a non-preemptible section (even under spin-lock or/and rwlock) for an extended period of time.

Aside from compression algorithms, this also restricts what zram itself can do. One particular example is zsmalloc handle allocation in zram_write_page(), which has an optimistic allocation (disallowing direct reclaim) and a pessimistic fallback path, which then forces zram to compress the page one more time.

This series changes zram to not directly impose atomicity restrictions on compression algorithms (and on itself), which makes zram write() fully preemptible; zram read(), sadly, is not always preemptible yet. There are still indirect atomicity restrictions imposed by zsmalloc. One notable example is the object mapping API, which returns with:
 a) local CPU lock held
 b) zspage rwlock held

First, zsmalloc's zspage lock is converted from an rwlock to a special type of RW-lookalike lock with some extra guarantees/features. Second, a new handle mapping is introduced which doesn't use per-CPU buffers (and hence no local CPU lock), does fewer memcpy() calls, but requires users to provide a pointer to a temp buffer for object copy-in (when needed). Third, zram is converted to the new zsmalloc mapping API and thus zram read() becomes preemptible.

This patch (of 19):

Concurrent modification of meta table entries is currently handled by a per-entry spin-lock. This has a number of shortcomings.

First, it imposes atomicity requirements on compression backends. zram can call both zcomp_compress() and zcomp_decompress() under entry spin-lock, which implies that we can use only compression algorithms that don't schedule/sleep/wait during compression and decompression. This, for instance, makes it impossible to use some of the ASYNC compression algorithm (H/W compression, etc.) implementations.

Second, it can potentially trigger watchdogs. For example, entry re-compression with secondary algorithms is performed under entry spin-lock. Given that we chain secondary compression algorithms, and that some of them can be configured for best compression ratio (and worst compression speed), zram can stay under spin-lock for quite some time.

Having a per-entry mutex (or, for instance, a rw-semaphore) would significantly increase the sizeof() of each entry and hence of the meta table. Therefore entry locking returns to bit locking, as before; however, this time it is also preempt-rt friendly, because it waits on the bit instead of spinning on it. Lock owners are also now permitted to schedule, which is a first step on the path to making zram non-atomic.

Link: https://lkml.kernel.org/r/20250303022425.285971-1-senozhatsky@chromium.org
Link: https://lkml.kernel.org/r/20250303022425.285971-2-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Kairui Song <ryncsn@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Yosry Ahmed <yosry.ahmed@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
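A sketch of a sleepable bit lock of this kind, built on the generic wait-on-bit API; the flag and field names are assumptions:

    static void zram_slot_lock(struct zram *zram, u32 index)
    {
    	unsigned long *flags = &zram->table[index].flags;

    	/* sleep, instead of spinning, until the lock bit is clear */
    	wait_on_bit_lock(flags, ZRAM_ENTRY_LOCK, TASK_UNINTERRUPTIBLE);
    }

    static void zram_slot_unlock(struct zram *zram, u32 index)
    {
    	unsigned long *flags = &zram->table[index].flags;

    	clear_bit_unlock(ZRAM_ENTRY_LOCK, flags);
    	wake_up_bit(flags, ZRAM_ENTRY_LOCK);
    }

Unlike a per-entry mutex, this costs only one bit in the existing flags word, which is why it keeps the meta table size unchanged.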
2025-03-14  Merge tag 'block-6.14-20250313' of git://git.kernel.dk/linux  (Linus Torvalds)
Pull block fixes from Jens Axboe:

 - NVMe pull request via Keith:
     - Concurrent pci error and hotplug handling fix (Keith)
     - Endpoint function fixes (Damien)

 - Fix for a regression introduced in this cycle with error checking for
   batched request completions (Shin'ichiro)

* tag 'block-6.14-20250313' of git://git.kernel.dk/linux:
  block: change blk_mq_add_to_batch() third argument type to bool
  nvme: move error logging from nvme_end_req() to __nvme_end_req()
  nvmet: pci-epf: Do not add an IRQ vector if not needed
  nvmet: pci-epf: Set NVMET_PCI_EPF_Q_LIVE when a queue is fully created
  nvme-pci: fix stuck reset on concurrent DPC and HP
2025-03-13  block: remove unused 'q' parameter in __blk_rq_map_sg()  (Anuj Gupta)
The request_queue param is no longer used by blk_rq_map_sg() and __blk_rq_map_sg(). Remove it. Signed-off-by: Anuj Gupta <anuj20.g@samsung.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250313035322.243239-1-anuj20.g@samsung.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-12  block: change blk_mq_add_to_batch() third argument type to bool  (Shin'ichiro Kawasaki)
Commit 1f47ed294a2b ("block: cleanup and fix batch completion adding conditions") modified the evaluation criteria for the third argument, 'ioerror', in the blk_mq_add_to_batch() function. Initially, the function had checked if 'ioerror' equals zero. Following the commit, it started checking for negative error values, with the presumption that such values, for instance -EIO, would be passed in. However, blk_mq_add_to_batch() callers do not pass negative error values. Instead, they pass status codes defined in various ways:
 - NVMe PCI and Apple drivers pass the NVMe status code
 - the virtio_blk driver passes the virtblk request header status byte
 - the null_blk driver passes blk_status_t
These codes are either zero or positive, so the revised check fails to function as intended. Specifically, with the NVMe PCI driver, this modification led to the failure of the blktests test case nvme/039. In this test scenario, errors are artificially injected into the NVMe driver, resulting in positive NVMe status codes being passed to blk_mq_add_to_batch(), which unexpectedly processes the failed I/O in a batch. Hence the failure. To correct the ioerror check within blk_mq_add_to_batch(), make all callers uniformly pass the argument as a boolean. Modify the callers to check their specific status codes and pass the boolean value 'is_error'. Also describe the arguments of blk_mq_add_to_batch() as kerneldoc. Fixes: 1f47ed294a2b ("block: cleanup and fix batch completion adding conditions") Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Link: https://lore.kernel.org/r/20250311104359.1767728-3-shinichiro.kawasaki@wdc.com [axboe: fold in documentation update] Signed-off-by: Jens Axboe <axboe@kernel.dk>
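A sketch of an adapted call site in the NVMe PCI completion path; this mirrors the described change, though the exact surrounding code is an assumption:

    /* pass an explicit "is this an error?" boolean instead of the
     * driver-specific status value, which is never negative
     */
    if (!blk_mq_add_to_batch(req, iob,
    			     nvme_req(req)->status != NVME_SC_SUCCESS,
    			     nvme_pci_complete_batch))
    	nvme_pci_complete_rq(req);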
2025-03-10  rust: module: introduce `authors` key  (Guilherme Giacomo Simoes)
In the `module!` macro, the `author` field is currently of type `String`. Since modules can have multiple authors, this limitation prevents specifying more than one. Add an `authors` field as `Option<Vec<String>>` to allow creating modules with multiple authors, and change the documentation and all current users to use it. Eventually, the single `author` field may be removed.

[ The `modinfo` key needs to still be `author`; otherwise, tooling may not work properly, e.g.:

      $ modinfo --author samples/rust/rust_print.ko
      Rust for Linux Contributors

  I have also kept the original `author` field (undocumented), so that we can drop it more easily in a kernel cycle or two. - Miguel ]

Suggested-by: Miguel Ojeda <ojeda@kernel.org> Link: https://github.com/Rust-for-Linux/linux/issues/244 Reviewed-by: Charalampos Mitrodimas <charmitro@posteo.net> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Andreas Hindborg <a.hindborg@kernel.org> Signed-off-by: Guilherme Giacomo Simoes <trintaeoitogc@gmail.com> Link: https://lore.kernel.org/r/20250309175712.845622-2-trintaeoitogc@gmail.com [ Fixed `modinfo` key. Kept `author` field. Reworded message accordingly. Updated my email. - Miguel ] Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2025-03-07  Merge tag 'block-6.14-20250306' of git://git.kernel.dk/linux  (Linus Torvalds)
Pull block fixes from Jens Axboe:

 - NVMe pull request via Keith:
     - TCP use after free fix on polling (Sagi)
     - Controller memory buffer cleanup fixes (Icenowy)
     - Free leaking requests on bad user passthrough commands (Keith)
     - TCP error message fix (Maurizio)
     - TCP corruption fix on partial PDU (Maurizio)
     - TCP memory ordering fix for weakly ordered archs (Meir)
     - Type coercion fix on message error for TCP (Dan)

 - Name the RQF flags enum, fixing issues with anon enums and BPF import
   of it

 - ublk parameter setting fix

 - GPT partition 7-bit conversion fix

* tag 'block-6.14-20250306' of git://git.kernel.dk/linux:
  block: Name the RQF flags enum
  nvme-tcp: fix signedness bug in nvme_tcp_init_connection()
  block: fix conversion of GPT partition name to 7-bit
  ublk: set_params: properly check if parameters can be applied
  nvmet-tcp: Fix a possible sporadic response drops in weakly ordered arch
  nvme-tcp: fix potential memory corruption in nvme_tcp_recv_pdu()
  nvme-tcp: Fix a C2HTermReq error message
  nvmet: remove old function prototype
  nvme-ioctl: fix leaked requests on mapping error
  nvme-pci: skip CMB blocks incompatible with PCI P2P DMA
  nvme-pci: clean up CMBMSC when registering CMB fails
  nvme-tcp: fix possible UAF in nvme_tcp_poll
2025-03-06  badblocks: use sector_t instead of int to avoid truncation of badblocks length  (Zheng Qixing)
There is a truncation-of-badblocks-length issue when setting badblocks as follows:

    echo "2055 4294967299" > bad_blocks
    cat bad_blocks
    2055 3

(4294967299 is 2^32 + 3, so storing the length in a 32-bit int truncates it to 3.) Change the 'sectors' argument type from 'int' to 'sector_t'. This avoids truncation of the badblocks length for large sector counts by replacing 'int' with 'sector_t' (u64), enabling proper handling of larger disk sizes and ensuring compatibility with 64-bit sector addressing. Fixes: 9e0e252a048b ("badblocks: Add core badblock management code") Signed-off-by: Zheng Qixing <zhengqixing@huawei.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Acked-by: Coly Li <colyli@kernel.org> Link: https://lore.kernel.org/r/20250227075507.151331-13-zhengqixing@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-06  badblocks: return boolean from badblocks_set() and badblocks_clear()  (Zheng Qixing)
Change the return type of badblocks_set() and badblocks_clear() from int to bool, indicating success or failure. Specifically:
 - the _badblocks_set() and _badblocks_clear() functions now return true for success and false for failure
 - all calls to these functions are updated to handle the new boolean return type
 - this change improves code clarity and ensures more consistent handling of success and failure states
Signed-off-by: Zheng Qixing <zhengqixing@huawei.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Acked-by: Coly Li <colyli@kernel.org> Acked-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20250227075507.151331-11-zhengqixing@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-05  ublk: set_params: properly check if parameters can be applied  (Uday Shankar)
The parameters set by the set_params call are only applied to the block device in the start_dev call. So if a device has already been started, a subsequently issued set_params on that device will not have the desired effect, and should return an error. There is an existing check for this - set_params fails on devices in the LIVE state. But this check is not sufficient to cover the recovery case. In this case, the device will be in the QUIESCED or FAIL_IO states, so set_params will succeed. But this success is misleading, because the parameters will not be applied, since the device has already been started (by a previous ublk server). The bit UB_STATE_USED is set on completion of the start_dev; use it to detect and fail set_params commands which arrive too late to be applied (after start_dev). Signed-off-by: Uday Shankar <ushankar@purestorage.com> Fixes: 0aa73170eba5 ("ublk_drv: add SET_PARAMS/GET_PARAMS control command") Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250304-set_params-v1-1-17b5e0887606@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
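A sketch of the check, assuming names used elsewhere in ublk_drv; the error code here is an assumption:

    /* reject SET_PARAMS once the device has been started, including a
     * device being recovered by a new ublk server: UB_STATE_USED is set
     * at the end of start_dev, which is exactly the point after which
     * parameters can no longer take effect
     */
    if (test_bit(UB_STATE_USED, &ub->state))
    	return -EACCES;

Checking UB_STATE_USED instead of the LIVE state covers the recovery case, where the device sits in QUIESCED or FAIL_IO but has already been started once.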
2025-03-04  ublk: enforce ublks_max only for unprivileged devices  (Uday Shankar)
Commit 403ebc877832 ("ublk_drv: add module parameter of ublks_max for limiting max allowed ublk dev") claimed ublks_max was added to prevent a DoS situation with an untrusted user creating too many ublk devices. If that's the case, ublks_max should only restrict the number of unprivileged ublk devices in the system. Enforce the limit only for unprivileged ublk devices, and rename variables accordingly. Leave the external-facing parameter name unchanged, since changing it may break systems which use it (but still update its documentation to reflect its new meaning). As a result of this change, in a system where there are only normal (non-unprivileged) devices, the maximum number of such devices is increased to 1 << MINORBITS, or 1048576. That ought to be enough for anyone, right? Signed-off-by: Uday Shankar <ushankar@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250228-ublks_max-v1-1-04b7379190c0@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-04  loop: Remove struct loop_func_table  (Zhu Yanjun)
The struct was introduced in commit 754d96798fab ("loop: remove loop.h") but is no longer used, so remove it. No functional changes. Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250227163343.55952-1-yanjun.zhu@linux.dev Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-04  ublk: don't cast registered buffer index to int  (Caleb Sander Mateos)
io_buffer_register_bvec() takes index as an unsigned int argument, but ublk_register_io_buf() casts ub_cmd->addr (a u64) to int. Remove the misleading cast and instead pass index as an unsigned value to ublk_register_io_buf() and ublk_unregister_io_buf(). Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250301190317.950208-1-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-03  ublk: add DMA alignment limit  (Ming Lei)
The in-tree ublk driver doesn't need a DMA alignment limit, because there is one data copy between request pages and the userspace buffer. However, ublk is going to support zero copy, and then a DMA alignment limit is required: the same IO buffer is forwarded to the backend, which may have a specific buffer DMA alignment limit, so the limit has to be exposed from the frontend driver to the client application. Cc: Keith Busch <kbusch@kernel.org> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250227103707.2640014-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-03  null_blk: do partial IO for bad blocks  (Shin'ichiro Kawasaki)
The current null_blk implementation checks if any bad blocks exist in the target blocks of each IO. If so, the IO fails and no data is transferred for any of the IO's target blocks. However, when real storage devices have bad blocks, they may transfer data partially, up to the first bad block (e.g., SAS drives). Especially when the IO is a write operation, such partial IO leaves partially written data on the device. To simulate such partial IO using null_blk, introduce the new parameter 'badblocks_partial_io'. When this parameter is set, null_handle_badblocks() returns the number of sectors for the partial IO via its third pointer argument. Pass the returned number of sectors to the following calls to null_handle_memory_backed() in null_process_cmd() and null_zone_write(). Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Link: https://lore.kernel.org/r/20250226100613.1622564-6-shinichiro.kawasaki@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
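A sketch of the reworked helper; the out-argument name is an assumption:

    /* returns BLK_STS_OK if no target block is bad; otherwise reports
     * via *nr_ok how many sectors precede the first bad block, so the
     * caller can transfer them when badblocks_partial_io is set
     */
    static blk_status_t null_handle_badblocks(struct nullb_cmd *cmd,
    					      sector_t sector,
    					      unsigned int nr_sectors,
    					      unsigned int *nr_ok)
    {
    	struct badblocks *bb = &cmd->nq->dev->badblocks;
    	sector_t first_bad;
    	int bad_sectors;

    	if (!badblocks_check(bb, sector, nr_sectors, &first_bad,
    			     &bad_sectors))
    		return BLK_STS_OK;

    	*nr_ok = first_bad > sector ? first_bad - sector : 0;
    	return BLK_STS_IOERR;
    }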
2025-03-03  null_blk: pass transfer size to null_handle_rq()  (Shin'ichiro Kawasaki)
In preparation for supporting partial data transfer, add a new argument to null_handle_rq() to pass the number of sectors to transfer. While at it, rename the function from null_handle_rq to null_handle_data_transfer. This commit does not change the behavior. Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Link: https://lore.kernel.org/r/20250226100613.1622564-5-shinichiro.kawasaki@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-03  null_blk: replace null_process_cmd() call in null_zone_write()  (Shin'ichiro Kawasaki)
In preparation for supporting partial data transfer due to badblocks, replace the null_process_cmd() call in null_zone_write() with equivalent calls to null_handle_badblocks() and null_handle_memory_backed(). This commit does not change behavior. It will enable null_handle_badblocks() to return the size of the partial data transfer in the following commit, allowing null_zone_write() to move write pointers appropriately. Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Link: https://lore.kernel.org/r/20250226100613.1622564-4-shinichiro.kawasaki@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-03  null_blk: introduce badblocks_once parameter  (Shin'ichiro Kawasaki)
When IO errors happen on real storage devices, IOs repeated to the same target range can succeed thanks to recovery features of the devices, such as reserved block assignment. To simulate such IO errors and recoveries, introduce the new badblocks_once parameter. When this parameter is set to 1, the specified badblocks are cleared after the first IO error, so that the next IO to the same blocks succeeds. Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Link: https://lore.kernel.org/r/20250226100613.1622564-3-shinichiro.kawasaki@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-03  null_blk: generate null_blk configfs features string  (Shin'ichiro Kawasaki)
The null_blk configfs file 'features' provides a string that lists available null_blk features for userspace programs to reference. The string is defined as a long constant in the code, which tends to be forgotten when features are updated. It also causes checkpatch.pl to report "WARNING: quoted string split across lines". To avoid these drawbacks, generate the feature string on the fly: refer to the ca_name field of each element in the nullb_device_attrs table and concatenate them into the given buffer. Also, sort the nullb_device_attrs table elements in alphabetical order. Of note is that the feature "index" was missing before this commit; this commit adds it to the generated string. Suggested-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Link: https://lore.kernel.org/r/20250226100613.1622564-2-shinichiro.kawasaki@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
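A sketch of generating the string from the attribute table, assuming a NULL-terminated configfs attribute array and the usual show-callback shape:

    static ssize_t memb_group_features_show(struct config_item *item,
    					    char *page)
    {
    	ssize_t len = 0;
    	int i;

    	/* concatenate the ca_name of every device attribute, so the
    	 * list can never drift out of sync with the attributes
    	 */
    	for (i = 0; nullb_device_attrs[i]; i++)
    		len += sysfs_emit_at(page, len, "%s%s", i ? "," : "",
    				     nullb_device_attrs[i]->ca_name);
    	len += sysfs_emit_at(page, len, "\n");
    	return len;
    }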
2025-03-03  ublk: complete command synchronously on error  (Caleb Sander Mateos)
In case of an error, ublk's ->uring_cmd() functions currently return -EIOCBQUEUED and immediately call io_uring_cmd_done(). -EIOCBQUEUED and io_uring_cmd_done() are intended for asynchronous completions. For synchronous completions, the ->uring_cmd() function can just return the negative return code directly. This skips io_uring_cmd_del_cancelable() and avoids deferring the completion to task work. So return the error code directly from __ublk_ch_uring_cmd() and ublk_ctrl_uring_cmd(). Update ublk_ch_uring_cmd_cb(), which currently ignores the return value from __ublk_ch_uring_cmd(), to call io_uring_cmd_done() for synchronous completions. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20250225212456.2902549-1-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
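A sketch of the updated task-work callback; it reflects the described behavior, though the exact body is an assumption:

    static void ublk_ch_uring_cmd_cb(struct io_uring_cmd *cmd,
    				     unsigned int issue_flags)
    {
    	int ret = __ublk_ch_uring_cmd(cmd, issue_flags);

    	/* anything other than -EIOCBQUEUED is a synchronous completion,
    	 * so post the CQE here instead of ignoring the return value
    	 */
    	if (ret != -EIOCBQUEUED)
    		io_uring_cmd_done(cmd, ret, 0, issue_flags);
    }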
2025-02-28  io_uring/ublk: report error when unregister operation fails  (Caleb Sander Mateos)
Indicate to userspace applications if a UBLK_IO_UNREGISTER_IO_BUF command specifies an invalid buffer index by returning an error code. Return -EINVAL if no buffer is registered with the given index, and -EBUSY if the registered buffer is not a kernel bvec. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://lore.kernel.org/r/20250228231432.642417-1-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-28  ublk: zc register/unregister bvec  (Keith Busch)
Provide new operations for the user to request mapping an active request into an io_uring instance's buf_table. The user has to provide the index at which it wants to install the buffer. A reference count is taken on the request to ensure it can't be completed while it is active in a ring's buf_table. Signed-off-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20250227223916.143006-6-kbusch@meta.com Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
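A sketch of the register path, assuming an io_buffer_register_bvec() signature along these lines and ublk request-reference helpers (__ublk_check_and_get_req, ublk_put_req_ref, ublk_io_release); all of these names are assumptions:

    static int ublk_register_io_buf(struct io_uring_cmd *cmd,
    				    struct ublk_queue *ubq, unsigned tag,
    				    unsigned int index,
    				    unsigned int issue_flags)
    {
    	struct request *req;
    	int ret;

    	req = __ublk_check_and_get_req(ubq->dev, ubq, tag, 0);
    	if (!req)
    		return -EINVAL;

    	/* the reference taken above keeps the request alive while its
    	 * bvecs sit in the ring's buf_table; it is dropped in the
    	 * release callback once the buffer is unregistered
    	 */
    	ret = io_buffer_register_bvec(cmd, req, ublk_io_release, index,
    				      issue_flags);
    	if (ret)
    		ublk_put_req_ref(ubq, req);
    	return ret;
    }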
2025-02-24  loop: take the file system minimum dio alignment into account  (Christoph Hellwig)
The loop driver currently uses the logical block size of the underlying bdev as the lower bound for the loop device block size. While this works for many cases, it fails for file systems made up of multiple devices with different logical block sizes (e.g. XFS with an RT device that has a larger logical block size), or when the file system doesn't support direct I/O writes at the sector size granularity (e.g. because it does out-of-place writes with a file system block size larger than the sector size). Fix this by querying the minimum direct I/O alignment from statx when available. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20250131120120.1315125-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
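A sketch of the query, with the helper name and fallback as assumptions; the STATX_DIOALIGN result mask is what indicates the file system reported its direct I/O alignment:

    static unsigned int loop_query_min_dio_size(struct loop_device *lo)
    {
    	struct file *file = lo->lo_backing_file;
    	struct kstat st;

    	/* prefer the file system's advertised direct I/O alignment;
    	 * fall back when STATX_DIOALIGN isn't reported
    	 */
    	if (!vfs_getattr(&file->f_path, &st, STATX_DIOALIGN, 0) &&
    	    (st.result_mask & STATX_DIOALIGN))
    		return max(st.dio_mem_align, st.dio_offset_align);
    	return SECTOR_SIZE;
    }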
2025-02-24  loop: check LO_FLAGS_DIRECT_IO in loop_default_blocksize  (Christoph Hellwig)
We can't go below the minimum direct I/O size, regardless of whether direct I/O was enabled by passing in an O_DIRECT file descriptor or via the explicit flag. Now that LO_FLAGS_DIRECT_IO is set earlier, right after assigning a backing file, loop_default_blocksize can check it instead of the O_DIRECT flag to handle both conditions. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20250131120120.1315125-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-24  loop: set LO_FLAGS_DIRECT_IO in loop_assign_backing_file  (Christoph Hellwig)
Assigning LO_FLAGS_DIRECT_IO from the O_DIRECT flag is related to assigning a new backing file. Move the assignment in preparation for using the flag more, and earlier. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20250131120120.1315125-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>