author | Linus Torvalds <torvalds@linux-foundation.org> | 2024-09-16 12:13:31 +0200
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2024-09-16 12:13:31 +0200
commit | 35219bc5c71f4197c8bd10297597de797c1eece5 (patch)
tree | 2448156135b78f54cd341a8457ccd84a371ddac7 /fs/netfs/read_pgpriv2.c
parent | 9020d0d844ad58a051f90b1e5b82ba34123925b9 (diff)
parent | 4b40d43d9f951d87ae8dc414c2ef5ae50303a266 (diff)
Merge tag 'vfs-6.12.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull netfs updates from Christian Brauner:
"This contains the work to improve read/write performance for the new
netfs library.
The main performance enhancing changes are:
- Define a structure, struct folio_queue, and a new iterator type,
ITER_FOLIOQ, to hold a buffer as a replacement for ITER_XARRAY. See
that patch for questions about naming and form.
ITER_FOLIOQ is provided as a replacement for ITER_XARRAY. The
problem with an xarray is that accessing it requires the use of a
lock (typically the RCU read lock) - and this means that we can't
supply iterate_and_advance() with a step function that might sleep
(crypto for example) without having to drop the lock between pages.
ITER_FOLIOQ is the iterator for a chain of folio_queue structs,
where each folio_queue holds a small list of folios. A folio_queue
struct is a simpler structure than xarray and is not subject to
concurrent manipulation by the VM. folio_queue is used rather than
a bvec[] as it can form lists of indefinite size, adding to one end
and removing from the other on the fly (see the sketch after this
list).
- Provide a copy_folio_from_iter() wrapper.
- Make cifs RDMA support ITER_FOLIOQ.
- Use folio queues in the write-side helpers instead of xarrays.
- Add a function to reset the iterator in a subrequest.
- Simplify the write-side helpers to use sheaves to skip gaps rather
than trying to work out where gaps are.
- In afs, make the read subrequests asynchronous, putting them into
work items to allow the next patch to do progressive
unlocking/reading.
- Overhaul the read-side helpers to improve performance.
- Fix the caching of a partial block at the end of a file.
- Allow a store to be cancelled.
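The shape of that buffer is easier to see in miniature. Below is a minimal
userspace C sketch of the folio_queue idea; the names, the segment size, and
the single mark bitmap are illustrative stand-ins, not the kernel's actual
definitions (the real struct folio_queue carries three mark bitmaps, one of
which the PG_private_2 code in the diff below uses):

	/*
	 * Illustrative-only sketch of a folio_queue-like rolling buffer:
	 * a chain of small fixed-size segments that grows at the tail and
	 * is retired from the head on the fly, which a flat bvec[] cannot
	 * do and an xarray can only do under a lock.
	 */
	#include <stdio.h>
	#include <stdlib.h>

	#define FQ_SLOTS 8		/* slots per segment (illustrative) */

	struct fq_segment {
		void *slots[FQ_SLOTS];	/* stands in for struct folio * */
		unsigned long marks;	/* per-slot flag bits (kernel has three sets) */
		unsigned int nr;	/* slots filled so far */
		struct fq_segment *next;
	};

	struct fq {
		struct fq_segment *head;	/* consumer end */
		struct fq_segment *tail;	/* producer end */
		unsigned int head_slot;		/* next slot to consume in *head */
	};

	/* Append an item, chaining on a fresh segment when the tail is full. */
	static int fq_append(struct fq *q, void *item)
	{
		struct fq_segment *seg = q->tail;

		if (!seg || seg->nr == FQ_SLOTS) {
			seg = calloc(1, sizeof(*seg));
			if (!seg)
				return -1;
			if (q->tail)
				q->tail->next = seg;
			else
				q->head = seg;
			q->tail = seg;
		}
		seg->slots[seg->nr++] = item;
		return 0;
	}

	/* Consume from the head, freeing each segment once it is fully
	 * drained and the chain has moved past it (never the tail that the
	 * producer may still be filling). */
	static void *fq_pop(struct fq *q)
	{
		struct fq_segment *seg = q->head;

		if (seg && q->head_slot == FQ_SLOTS && seg->next) {
			q->head = seg->next;
			free(seg);
			seg = q->head;
			q->head_slot = 0;
		}
		if (!seg || q->head_slot >= seg->nr)
			return NULL;
		return seg->slots[q->head_slot++];
	}

	int main(void)
	{
		struct fq q = { NULL, NULL, 0 };
		int values[20];

		for (int i = 0; i < 20; i++) {
			values[i] = i;
			fq_append(&q, &values[i]);
		}
		for (void *p; (p = fq_pop(&q)); )
			printf("%d ", *(int *)p);
		putchar('\n');
		return 0;
	}

The point of the shape: the producer only touches the tail and the consumer
only touches the head, so the buffer can grow and shrink indefinitely without
the locked lookup an xarray requires.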
Then some changes for cifs to make it use folio queues instead of
xarrays for crypto bufferage:
- Use raw iteration functions rather than manually coding iteration
when hashing data.
- Switch to using folio_queue for crypto buffers.
- Remove the xarray bits.
Make some adjustments to the /proc/fs/netfs/stats file such that:
- All the netfs stats lines currently begin 'Netfs:'; change this
prefix to something a bit more useful.
- Add a couple of stats counters to track the numbers of skips and
waits on the per-inode writeback serialisation lock to make it
easier to check for this as a source of performance loss.
Miscellaneous work:
- Ensure that the sb_writers lock is taken around
vfs_{set,remove}xattr() in the cachefiles code.
- Reduce the number of conditional branches in netfs_perform_write().
- Move the CIFS_INO_MODIFIED_ATTR flag to the netfs_inode struct and
remove cifs_post_modify().
- Move the max_len/max_nr_segs members from netfs_io_subrequest to
netfs_io_request as they're only needed for one subreq at a time.
- Add an 'unknown' source value for tracing purposes.
- Remove NETFS_COPY_TO_CACHE as it's no longer used.
- Set the request work function up front at allocation time.
- Use bh-disabling spinlocks for rreq->lock as cachefiles completion
may be run from block-filesystem DIO completion in softirq context
(see the locking sketch after this message).
- Remove fs/netfs/io.c"
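On the rreq->lock item above: if a spinlock can be taken from softirq context
(as cachefiles completion can be here), process-context holders must disable
bottom halves, or the softirq can interrupt the holder on the same CPU and
spin on the lock forever. A minimal sketch of the pattern, using the real
spin_lock_bh() API but a hypothetical example lock and functions:

	#include <linux/spinlock.h>

	static DEFINE_SPINLOCK(example_lock);

	/* Process context (e.g. issuing I/O): block softirqs while the
	 * lock is held so a completion softirq can't deadlock against us
	 * on this CPU. */
	static void example_process_side(void)
	{
		spin_lock_bh(&example_lock);
		/* ... manipulate shared request state ... */
		spin_unlock_bh(&example_lock);
	}

	/* Softirq context (e.g. DIO completion): a plain spin_lock is
	 * sufficient because softirqs don't nest on top of themselves on
	 * one CPU. */
	static void example_softirq_side(void)
	{
		spin_lock(&example_lock);
		/* ... complete the request ... */
		spin_unlock(&example_lock);
	}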
* tag 'vfs-6.12.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (25 commits)
docs: filesystems: corrected grammar of netfs page
cifs: Don't support ITER_XARRAY
cifs: Switch crypto buffer to use a folio_queue rather than an xarray
cifs: Use iterate_and_advance*() routines directly for hashing
netfs: Cancel dirty folios that have no storage destination
cachefiles, netfs: Fix write to partial block at EOF
netfs: Remove fs/netfs/io.c
netfs: Speed up buffered reading
afs: Make read subreqs async
netfs: Simplify the writeback code
netfs: Provide an iterator-reset function
netfs: Use new folio_queue data type and iterator instead of xarray iter
cifs: Provide the capability to extract from ITER_FOLIOQ to RDMA SGEs
iov_iter: Provide copy_folio_from_iter()
mm: Define struct folio_queue and ITER_FOLIOQ to handle a sequence of folios
netfs: Use bh-disabling spinlocks for rreq->lock
netfs: Set the request work function upon allocation
netfs: Remove NETFS_COPY_TO_CACHE
netfs: Reserve netfs_sreq_source 0 as unset/unknown
netfs: Move max_len/max_nr_segs from netfs_io_subrequest to netfs_io_stream
...
Diffstat (limited to 'fs/netfs/read_pgpriv2.c')
-rw-r--r-- | fs/netfs/read_pgpriv2.c | 264
1 file changed, 264 insertions, 0 deletions
diff --git a/fs/netfs/read_pgpriv2.c b/fs/netfs/read_pgpriv2.c
new file mode 100644
index 000000000000..ba5af89d37fa
--- /dev/null
+++ b/fs/netfs/read_pgpriv2.c
@@ -0,0 +1,264 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Read with PG_private_2 [DEPRECATED].
+ *
+ * Copyright (C) 2024 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ */
+
+#include <linux/export.h>
+#include <linux/fs.h>
+#include <linux/mm.h>
+#include <linux/pagemap.h>
+#include <linux/slab.h>
+#include <linux/task_io_accounting_ops.h>
+#include "internal.h"
+
+/*
+ * [DEPRECATED] Mark page as requiring copy-to-cache using PG_private_2. The
+ * third mark in the folio queue is used to indicate that this folio needs
+ * writing.
+ */
+void netfs_pgpriv2_mark_copy_to_cache(struct netfs_io_subrequest *subreq,
+				      struct netfs_io_request *rreq,
+				      struct folio_queue *folioq,
+				      int slot)
+{
+	struct folio *folio = folioq_folio(folioq, slot);
+
+	trace_netfs_folio(folio, netfs_folio_trace_copy_to_cache);
+	folio_start_private_2(folio);
+	folioq_mark3(folioq, slot);
+}
+
+/*
+ * [DEPRECATED] Cancel PG_private_2 on all marked folios in the event of an
+ * unrecoverable error.
+ */
+static void netfs_pgpriv2_cancel(struct folio_queue *folioq)
+{
+	struct folio *folio;
+	int slot;
+
+	while (folioq) {
+		if (!folioq->marks3) {
+			folioq = folioq->next;
+			continue;
+		}
+
+		slot = __ffs(folioq->marks3);
+		folio = folioq_folio(folioq, slot);
+
+		trace_netfs_folio(folio, netfs_folio_trace_cancel_copy);
+		folio_end_private_2(folio);
+		folioq_unmark3(folioq, slot);
+	}
+}
+
+/*
+ * [DEPRECATED] Copy a folio to the cache with PG_private_2 set.
+ */
+static int netfs_pgpriv2_copy_folio(struct netfs_io_request *wreq, struct folio *folio)
+{
+	struct netfs_io_stream *cache = &wreq->io_streams[1];
+	size_t fsize = folio_size(folio), flen = fsize;
+	loff_t fpos = folio_pos(folio), i_size;
+	bool to_eof = false;
+
+	_enter("");
+
+	/* netfs_perform_write() may shift i_size around the page or from out
+	 * of the page to beyond it, but cannot move i_size into or through the
+	 * page since we have it locked.
+	 */
+	i_size = i_size_read(wreq->inode);
+
+	if (fpos >= i_size) {
+		/* mmap beyond eof. */
+		_debug("beyond eof");
+		folio_end_private_2(folio);
+		return 0;
+	}
+
+	if (fpos + fsize > wreq->i_size)
+		wreq->i_size = i_size;
+
+	if (flen > i_size - fpos) {
+		flen = i_size - fpos;
+		to_eof = true;
+	} else if (flen == i_size - fpos) {
+		to_eof = true;
+	}
+
+	_debug("folio %zx %zx", flen, fsize);
+
+	trace_netfs_folio(folio, netfs_folio_trace_store_copy);
+
+	/* Attach the folio to the rolling buffer. */
+	if (netfs_buffer_append_folio(wreq, folio, false) < 0)
+		return -ENOMEM;
+
+	cache->submit_extendable_to = fsize;
+	cache->submit_off = 0;
+	cache->submit_len = flen;
+
+	/* Attach the folio to one or more subrequests.  For a big folio, we
+	 * could end up with thousands of subrequests if the wsize is small -
+	 * but we might need to wait during the creation of subrequests for
+	 * network resources (eg. SMB credits).
+	 */
+	do {
+		ssize_t part;
+
+		wreq->io_iter.iov_offset = cache->submit_off;
+
+		atomic64_set(&wreq->issued_to, fpos + cache->submit_off);
+		cache->submit_extendable_to = fsize - cache->submit_off;
+		part = netfs_advance_write(wreq, cache, fpos + cache->submit_off,
+					   cache->submit_len, to_eof);
+		cache->submit_off += part;
+		if (part > cache->submit_len)
+			cache->submit_len = 0;
+		else
+			cache->submit_len -= part;
+	} while (cache->submit_len > 0);
+
+	wreq->io_iter.iov_offset = 0;
+	iov_iter_advance(&wreq->io_iter, fsize);
+	atomic64_set(&wreq->issued_to, fpos + fsize);
+
+	if (flen < fsize)
+		netfs_issue_write(wreq, cache);
+
+	_leave(" = 0");
+	return 0;
+}
+
+/*
+ * [DEPRECATED] Go through the buffer and write any folios that are marked with
+ * the third mark to the cache.
+ */
+void netfs_pgpriv2_write_to_the_cache(struct netfs_io_request *rreq)
+{
+	struct netfs_io_request *wreq;
+	struct folio_queue *folioq;
+	struct folio *folio;
+	int error = 0;
+	int slot = 0;
+
+	_enter("");
+
+	if (!fscache_resources_valid(&rreq->cache_resources))
+		goto couldnt_start;
+
+	/* Need the first folio to be able to set up the op. */
+	for (folioq = rreq->buffer; folioq; folioq = folioq->next) {
+		if (folioq->marks3) {
+			slot = __ffs(folioq->marks3);
+			break;
+		}
+	}
+	if (!folioq)
+		return;
+	folio = folioq_folio(folioq, slot);
+
+	wreq = netfs_create_write_req(rreq->mapping, NULL, folio_pos(folio),
+				      NETFS_PGPRIV2_COPY_TO_CACHE);
+	if (IS_ERR(wreq)) {
+		kleave(" [create %ld]", PTR_ERR(wreq));
+		goto couldnt_start;
+	}
+
+	trace_netfs_write(wreq, netfs_write_trace_copy_to_cache);
+	netfs_stat(&netfs_n_wh_copy_to_cache);
+
+	for (;;) {
+		error = netfs_pgpriv2_copy_folio(wreq, folio);
+		if (error < 0)
+			break;
+
+		folioq_unmark3(folioq, slot);
+		if (!folioq->marks3) {
+			folioq = folioq->next;
+			if (!folioq)
+				break;
+		}
+
+		slot = __ffs(folioq->marks3);
+		folio = folioq_folio(folioq, slot);
+	}
+
+	netfs_issue_write(wreq, &wreq->io_streams[1]);
+	smp_wmb(); /* Write lists before ALL_QUEUED. */
+	set_bit(NETFS_RREQ_ALL_QUEUED, &wreq->flags);
+
+	netfs_put_request(wreq, false, netfs_rreq_trace_put_return);
+	_leave(" = %d", error);
+couldnt_start:
+	netfs_pgpriv2_cancel(rreq->buffer);
+}
+
+/*
+ * [DEPRECATED] Remove the PG_private_2 mark from any folios we've finished
+ * copying.
+ */
+bool netfs_pgpriv2_unlock_copied_folios(struct netfs_io_request *wreq)
+{
+	struct folio_queue *folioq = wreq->buffer;
+	unsigned long long collected_to = wreq->collected_to;
+	unsigned int slot = wreq->buffer_head_slot;
+	bool made_progress = false;
+
+	if (slot >= folioq_nr_slots(folioq)) {
+		folioq = netfs_delete_buffer_head(wreq);
+		slot = 0;
+	}
+
+	for (;;) {
+		struct folio *folio;
+		unsigned long long fpos, fend;
+		size_t fsize, flen;
+
+		folio = folioq_folio(folioq, slot);
+		if (WARN_ONCE(!folio_test_private_2(folio),
+			      "R=%08x: folio %lx is not marked private_2\n",
+			      wreq->debug_id, folio->index))
+			trace_netfs_folio(folio, netfs_folio_trace_not_under_wback);
+
+		fpos = folio_pos(folio);
+		fsize = folio_size(folio);
+		flen = fsize;
+
+		fend = min_t(unsigned long long, fpos + flen, wreq->i_size);
+
+		trace_netfs_collect_folio(wreq, folio, fend, collected_to);
+
+		/* Unlock any folio we've transferred all of. */
+		if (collected_to < fend)
+			break;
+
+		trace_netfs_folio(folio, netfs_folio_trace_end_copy);
+		folio_end_private_2(folio);
+		wreq->cleaned_to = fpos + fsize;
+		made_progress = true;
+
+		/* Clean up the head folioq.  If we clear an entire folioq, then
+		 * we can get rid of it provided it's not also the tail folioq
+		 * being filled by the issuer.
+		 */
+		folioq_clear(folioq, slot);
+		slot++;
+		if (slot >= folioq_nr_slots(folioq)) {
+			if (READ_ONCE(wreq->buffer_tail) == folioq)
+				break;
+			folioq = netfs_delete_buffer_head(wreq);
+			slot = 0;
+		}
+
+		if (fpos + fsize >= collected_to)
+			break;
+	}
+
+	wreq->buffer = folioq;
+	wreq->buffer_head_slot = slot;
+	return made_progress;
+}