Age | Commit message (Collapse) | Author |
|
The f_frsize maybe changed in the quota size is less than the defualt
4MB.
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
When the quota is approaching we need to notify it to the MDS as
soon as possible, or the client could write to the directory more
than expected.
This will flush the dirty caps without delaying after each write,
though this couldn't prevent the real size of a directory exceed
the quota but could prevent it as soon as possible.
Link: https://tracker.ceph.com/issues/56180
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Luís Henriques <lhenriques@suse.de>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
If the 'i_inline_version' is 1, that means the file is just new
created and there shouldn't have any inline data in it, we should
skip retrieving the inline data from MDS.
This also could help reduce possiblity of dead lock issue introduce
by the inline data and Fcr caps.
Gradually we will remove the inline feature from kclient after ceph's
scrub too have support to unline the inline data, currently this
could help reduce the teuthology test failures.
This is possiblly could also fix a bug that for some old clients if
they couldn't explictly uninline the inline data when writing, the
inline version will keep as 1 always. We may always reading non-exist
data from inline data.
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
For async create we will always try to choose the auth MDS of frag
the dentry belonged to of the parent directory to send the request
and ususally this works fine, but if the MDS migrated the directory
to another MDS before it could be handled the request will be
forwarded. And then the auth cap will be changed.
We need to update the auth cap in this case before the request is
forwarded.
Link: https://tracker.ceph.com/issues/55857
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Willy requested that we change this back to warning on folio->private
being non-NULl. He's trying to kill off the PG_private flag, and so we'd
like to catch where it's non-NULL.
Add a VM_WARN_ON_FOLIO (since it doesn't exist yet) and change over to
using that instead of VM_BUG_ON_FOLIO along with testing the ->private
pointer.
[ xiubli: define VM_WARN_ON_FOLIO macro in case DEBUG_VM is disabled
reported by kernel test robot <lkp@intel.com> ]
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
"was_async" is a bit misleadingly named. It's supposed to indicate
whether it's safe to call blocking operations from the context you're
calling it from, but it sounds like it's asking whether this was done
via async operation. For ceph, this it's always called from kernel
thread context so it should be safe to set this to false.
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
There's no reason we need to lock the inode for write in order to handle
an llseek. I suspect this should have been dropped in 2013 when we
stopped doing vmtruncate in llseek.
With that gone, ceph_llseek is functionally equivalent to
generic_file_llseek, so just call that after getting the size.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Luís Henriques <lhenriques@suse.de>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
When handle_cap_grant is called on an IMPORT op, then the snap_rwsem is
held and the function is expected to release it before returning. It
currently fails to do that in all cases which could lead to a deadlock.
Fixes: 6f05b30ea063 ("ceph: reset i_requested_max_size if file write is not wanted")
Link: https://tracker.ceph.com/issues/55857
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Luís Henriques <lhenriques@suse.de>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
The MDS tries to enforce a limit on the total key/values in extended
attributes. However, this limit is enforced only if doing a synchronous
operation (MDS_OP_SETXATTR) -- if we're buffering the xattrs, the MDS
doesn't have a chance to enforce these limits.
This patch adds support for decoding the xattrs maximum size setting that is
distributed in the mdsmap. Then, when setting an xattr, the kernel client
will revert to do a synchronous operation if that maximum size is exceeded.
While there, fix a dout() that would trigger a printk warning:
[ 98.718078] ------------[ cut here ]------------
[ 98.719012] precision 65536 too large
[ 98.719039] WARNING: CPU: 1 PID: 3755 at lib/vsprintf.c:2703 vsnprintf+0x5e3/0x600
...
Link: https://tracker.ceph.com/issues/55725
Signed-off-by: Luís Henriques <lhenriques@suse.de>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
And for the 'Xs' caps for getxattr we will also choose the auth MDS,
because the MDS side code is buggy due to setxattr won't notify the
replica MDSes when the values changed and the replica MDS will return
the old values. Though we will fix it in MDS code, but this still
makes sense for old ceph.
Link: https://tracker.ceph.com/issues/55331
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
If the connection was accidently closed due to the socket issue or
something else the clients will try to open the opened sessions, the
MDSes will send the session open reply one more time if the clients
support the notify feature.
When the clients retry to open the sessions the s_seq will be 0 as
default, we need to update it anyway.
Link: https://tracker.ceph.com/issues/53911
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
In async unlink case the kclient won't wait for the first reply
from MDS and just drop all the links and unhash the dentry and then
succeeds immediately.
For any new create/link/rename,etc requests followed by using the
same file names we must wait for the first reply of the inflight
unlink request, or the MDS possibly will fail these following
requests with -EEXIST if the inflight async unlink request was
delayed for some reasons.
And the worst case is that for the none async openc request it will
successfully open the file if the CDentry hasn't been unlinked yet,
but later the previous delayed async unlink request will remove the
CDenty. That means the just created file is possiblly deleted later
by accident.
We need to wait for the inflight async unlink requests to finish
when creating new files/directories by using the same file names.
Link: https://tracker.ceph.com/issues/55332
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Compare dentry name with case-exact name, return true if names
are same, or false.
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
This macro was added but never be used. And check the ceph code
there has another CEPHFS_FEATURES_MDS_REQUIRED but always be empty.
We should clean up all this related code, which make no sense but
introducing confusion.
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Luís Henriques <lhenriques@suse.de>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Feature bits have to be encoded into the correct locations. This hasn't
been an issue so far because the only hole in the feature bits was in bit
10 (CEPHFS_FEATURE_RECLAIM_CLIENT), which is located in the 2nd byte. When
adding more bits that go beyond the this 2nd byte, the bug will show up.
[xiubli: remove incorrect comment for CEPHFS_FEATURES_CLIENT_SUPPORTED]
Fixes: 9ba1e224538a ("ceph: allocate the correct amount of extra bytes for the session features")
Signed-off-by: Luís Henriques <lhenriques@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Most filesystems just call fscrypt_set_context on new inodes, which
usually causes a setxattr. That's a bit late for ceph, which can send
along a full set of attributes with the create request.
Doing so allows it to avoid race windows that where the new inode could
be seen by other clients without the crypto context attached. It also
avoids the separate round trip to the server.
Refactor the fscrypt code a bit to allow us to create a new crypto
context, attach it to the inode, and write it to the buffer, but without
calling set_context on it. ceph can later use this to marshal the
context into the attributes we send along with the create request.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Acked-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
For ceph, we want to use our own scheme for handling filenames that are
are longer than NAME_MAX after encryption and Base64 encoding. This
allows us to have a consistent view of the encrypted filenames for
clients that don't support fscrypt and clients that do but that don't
have the key.
Currently, fs/crypto only supports encrypting filenames using
fscrypt_setup_filename, but that also handles encoding nokey names. Ceph
can't use that because it handles nokey names in a different way.
Export fscrypt_fname_encrypt. Rename fscrypt_fname_encrypted_size to
__fscrypt_fname_encrypted_size and add a new wrapper called
fscrypt_fname_encrypted_size that takes an inode argument rather than a
pointer to a fscrypt_policy union.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Acked-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
inode_insert5 currently looks at I_CREATING to decide whether to insert
the inode into the sb list. This test is a bit ambiguous, as I_CREATING
state is not directly related to that list.
This test is also problematic for some upcoming ceph changes to add
fscrypt support. We need to be able to allocate an inode using new_inode
and insert it into the hash later iff we end up using it, and doing that
now means that we double add it and corrupt the list.
What we really want to know in this test is whether the inode is already
in its superblock list, and then add it if it isn't. Have it test for
list_empty instead and ensure that we always initialize the list by
doing it in inode_init_once. It's only ever removed from the list with
list_del_init, so that should be sufficient.
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt
Pull fsverity update from Eric Biggers:
"Just a small documentation update to mention the btrfs support"
* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
fs-verity: mention btrfs support
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull execve updates from Kees Cook:
- Allow unsharing time namespace on vfork+exec (Andrei Vagin)
- Replace usage of deprecated kmap APIs (Fabio M. De Francesco)
- Fix spelling mistake (Zhang Jiaming)
* tag 'execve-v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
exec: Call kmap_local_page() in copy_string_kernel()
exec: Fix a spelling mistake
selftests/timens: add a test for vfork+exit
fs/exec: allow to unshare a time namespace on vfork+exec
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull pstore updates from Kees Cook:
- Migrate to modern acomp crypto interface (Ard Biesheuvel)
- Use better return type for "rcnt" (Dan Carpenter)
* tag 'pstore-v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
pstore/zone: cleanup "rcnt" type
pstore: migrate to crypto acomp interface
|
|
Pull block updates from Jens Axboe:
- Improve the type checking of request flags (Bart)
- Ensure queue mapping for a single queues always picks the right queue
(Bart)
- Sanitize the io priority handling (Jan)
- rq-qos race fix (Jinke)
- Reserved tags handling improvements (John)
- Separate memory alignment from file/disk offset aligment for O_DIRECT
(Keith)
- Add new ublk driver, userspace block driver using io_uring for
communication with the userspace backend (Ming)
- Use try_cmpxchg() to cleanup the code in various spots (Uros)
- Finally remove bdevname() (Christoph)
- Clean up the zoned device handling (Christoph)
- Clean up independent access range support (Christoph)
- Clean up and improve block sysfs handling (Christoph)
- Clean up and improve teardown of block devices.
This turns the usual two step process into something that is simpler
to implement and handle in block drivers (Christoph)
- Clean up chunk size handling (Christoph)
- Misc cleanups and fixes (Bart, Bo, Dan, GuoYong, Jason, Keith, Liu,
Ming, Sebastian, Yang, Ying)
* tag 'for-5.20/block-2022-07-29' of git://git.kernel.dk/linux-block: (178 commits)
ublk_drv: fix double shift bug
ublk_drv: make sure that correct flags(features) returned to userspace
ublk_drv: fix error handling of ublk_add_dev
ublk_drv: fix lockdep warning
block: remove __blk_get_queue
block: call blk_mq_exit_queue from disk_release for never added disks
blk-mq: fix error handling in __blk_mq_alloc_disk
ublk: defer disk allocation
ublk: rewrite ublk_ctrl_get_queue_affinity to not rely on hctx->cpumask
ublk: fold __ublk_create_dev into ublk_ctrl_add_dev
ublk: cleanup ublk_ctrl_uring_cmd
ublk: simplify ublk_ch_open and ublk_ch_release
ublk: remove the empty open and release block device operations
ublk: remove UBLK_IO_F_PREFLUSH
ublk: add a MAINTAINERS entry
block: don't allow the same type rq_qos add more than once
mmc: fix disk/queue leak in case of adding disk failure
ublk_drv: fix an IS_ERR() vs NULL check
ublk: remove UBLK_IO_F_INTEGRITY
ublk_drv: remove unneeded semicolon
...
|
|
git://git.kernel.dk/linux-block
Pull io_uring buffered writes support from Jens Axboe:
"This contains support for buffered writes, specifically for XFS. btrfs
is in progress, will be coming in the next release.
io_uring does support buffered writes on any file type, but since the
buffered write path just always -EAGAIN (or -EOPNOTSUPP) any attempt
to do so if IOCB_NOWAIT is set, any buffered write will effectively be
handled by io-wq offload. This isn't very efficient, and we even have
specific code in io-wq to serialize buffered writes to the same inode
to avoid further inefficiencies with thread offload.
This is particularly sad since most buffered writes don't block, they
simply copy data to a page and dirty it. With this pull request, we
can handle buffered writes a lot more effiently.
If balance_dirty_pages() needs to block, we back off on writes as
indicated.
This improves buffered write support by 2-3x.
Jan Kara helped with the mm bits for this, and Stefan handled the
fs/iomap/xfs/io_uring parts of it"
* tag 'for-5.20/io_uring-buffered-writes-2022-07-29' of git://git.kernel.dk/linux-block:
mm: honor FGP_NOWAIT for page cache page allocation
xfs: Add async buffered write support
xfs: Specify lockmode when calling xfs_ilock_for_iomap()
io_uring: Add tracepoint for short writes
io_uring: fix issue with io_write() not always undoing sb_start_write()
io_uring: Add support for async buffered writes
fs: Add async write file modification handling.
fs: Split off inode_needs_update_time and __file_update_time
fs: add __remove_file_privs() with flags parameter
fs: add a FMODE_BUF_WASYNC flags for f_mode
iomap: Return -EAGAIN from iomap_write_iter()
iomap: Add async buffered write support
iomap: Add flags parameter to iomap_page_create()
mm: Add balance_dirty_pages_ratelimited_flags() function
mm: Move updates of dirty_exceeded into one place
mm: Move starting of background writeback into the main balancing loop
|
|
Pull io_uring updates from Jens Axboe:
- As per (valid) complaint in the last merge window, fs/io_uring.c has
grown quite large these days. io_uring isn't really tied to fs
either, as it supports a wide variety of functionality outside of
that.
Move the code to io_uring/ and split it into files that either
implement a specific request type, and split some code into helpers
as well. The code is organized a lot better like this, and io_uring.c
is now < 4K LOC (me).
- Deprecate the epoll_ctl opcode. It'll still work, just trigger a
warning once if used. If we don't get any complaints on this, and I
don't expect any, then we can fully remove it in a future release
(me).
- Improve the cancel hash locking (Hao)
- kbuf cleanups (Hao)
- Efficiency improvements to the task_work handling (Dylan, Pavel)
- Provided buffer improvements (Dylan)
- Add support for recv/recvmsg multishot support. This is similar to
the accept (or poll) support for have for multishot, where a single
SQE can trigger everytime data is received. For applications that
expect to do more than a few receives on an instantiated socket, this
greatly improves efficiency (Dylan).
- Efficiency improvements for poll handling (Pavel)
- Poll cancelation improvements (Pavel)
- Allow specifiying a range for direct descriptor allocations (Pavel)
- Cleanup the cqe32 handling (Pavel)
- Move io_uring types to greatly cleanup the tracing (Pavel)
- Tons of great code cleanups and improvements (Pavel)
- Add a way to do sync cancelations rather than through the sqe -> cqe
interface, as that's a lot easier to use for some use cases (me).
- Add support to IORING_OP_MSG_RING for sending direct descriptors to a
different ring. This avoids the usually problematic SCM case, as we
disallow those. (me)
- Make the per-command alloc cache we use for apoll generic, place
limits on it, and use it for netmsg as well (me).
- Various cleanups (me, Michal, Gustavo, Uros)
* tag 'for-5.20/io_uring-2022-07-29' of git://git.kernel.dk/linux-block: (172 commits)
io_uring: ensure REQ_F_ISREG is set async offload
net: fix compat pointer in get_compat_msghdr()
io_uring: Don't require reinitable percpu_ref
io_uring: fix types in io_recvmsg_multishot_overflow
io_uring: Use atomic_long_try_cmpxchg in __io_account_mem
io_uring: support multishot in recvmsg
net: copy from user before calling __get_compat_msghdr
net: copy from user before calling __copy_msghdr
io_uring: support 0 length iov in buffer select in compat
io_uring: fix multishot ending when not polled
io_uring: add netmsg cache
io_uring: impose max limit on apoll cache
io_uring: add abstraction around apoll cache
io_uring: move apoll cache to poll.c
io_uring: consolidate hash_locked io-wq handling
io_uring: clear REQ_F_HASH_LOCKED on hash removal
io_uring: don't race double poll setting REQ_F_ASYNC_DATA
io_uring: don't miss setting REQ_F_DOUBLE_POLL
io_uring: disable multishot recvmsg
io_uring: only trace one of complete or overflow
...
|
|
If someone cancels the open RPC call, then we must not try to free
either the open slot or the layoutget operation arguments, since they
are likely still in use by the hung RPC call.
Fixes: 6949493884fe ("NFSv4: Don't hold the layoutget locks across multiple RPC calls")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
It is not safe to call filemap_fdatawrite_range() from
nfs_async_write_reschedule_io(), since we're often calling from a page
reclaim context. Just let fsync() redrive the writeback for us.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
Reference-putting functions should not access the object being put after
decrementing the refcount unless they reduce the refcount to zero.
Fix a couple of instances of this in afs by copying the information to be
logged by tracepoint to local variables before doing the decrement.
[Fixed a bit in afs_put_server() that I'd missed but Marc caught]
Fixes: 341f741f04be ("afs: Refcount the afs_call struct")
Fixes: 452181936931 ("afs: Trace afs_server usage")
Fixes: 977e5f8ed0ab ("afs: Split the usage count on struct afs_server")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/165911278430.3745403.16526310736054780645.stgit@warthog.procyon.org.uk/ # v1
|
|
Use refcount_t rather than atomic_t in afs to make use of the count
checking facilities provided.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/165911277768.3745403.423349776836296452.stgit@warthog.procyon.org.uk/ # v1
|
|
No one calls mpage_writepages with a NULL get_block paramter, so remove
support for that case.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
|
|
All callers of mpage_writepage use block_write_full_page as their
->writepage implementation when called from mpage_writepages
(although for ntfs3 this is obsfucated a bit).
Just call block_write_full_page directly instead of going through
the ->writepage indirection.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
|
|
All callers are gone, so remove the now dead code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
|
|
The nobh mode is an obscure feature to save lowlevel for large memory
32-bit configurations while trading for much slower performance and
has been long obsolete. Switch to the regular buffer head based helpers
instead.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
|
|
The nobh mode is an obscure feature to save lowlevel for large memory
32-bit configurations while trading for much slower performance and
has been long obsolete. Remove it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
|
|
Handle the resident case with an explicit generic_writepages call instead
of using the obscure overload that makes mpage_writepages with a NULL
get_block do the same thing.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
|
|
This involves converting migrate_huge_page_move_mapping(). We also need a
folio variant of hugetlb_set_page_subpool(), but that's for a later patch.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
|
|
Use a folio throughout this function.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
filemap_migrate_folio() fits f2fs's needs perfectly.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Chao Yu <chao@kernel.org>
|
|
filemap_migrate_folio() is a little more general than ubifs really needs,
but it's better to share the code.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
|
|
Use filemap_migrate_folio() to do the bulk of the work, and then copy
the ordered flag across if needed.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: David Sterba <dsterba@suse.com>
|
|
There is nothing iomap-specific about iomap_migratepage(), and it fits
a pattern used by several other filesystems, so move it to mm/migrate.c,
convert it to be filemap_migrate_folio() and convert the iomap filesystems
to use it.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
|
Convert all callers to pass a folio. Most have the folio
already available. Switch all users from aops->migratepage to
aops->migrate_folio. Also turn the documentation into kerneldoc.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: David Sterba <dsterba@suse.com>
|
|
Use a folio throughout this function. migrate_page() will be converted
later.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
Use a folio throughout this function. migrate_page() will be converted
later.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: David Sterba <dsterba@suse.com>
|
|
Use a folio throughout __buffer_migrate_folio(), add kernel-doc for
buffer_migrate_folio() and buffer_migrate_folio_norefs(), move their
declarations to buffer.h and switch all filesystems that have wired
them up.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
Remove a call to read_mapping_page().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
|
|
Use folio_put_refs() to perform only one atomic operation instead of two.
The other changes are straightforward conversions from page APIs to
their folio equivalents.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
|
|
Use the folio API throughout. There are a few places where we convert
back to a page to call into the rest of the filesystem, so folio usage
needs to be pushed down to those functions later.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
|
|
Reorganise the file to remove the forward declaration.
Use folios throughout vxfs_immed_read_folio().
Use memcpy_to_page() instead of an open-coded kmap()/kunmap().
Remove flush_dcache_page() as this is embedded in memcpy_to_page().
Use folio_pos() instead of opencoding it.
Handle multi-page folios.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
|
|
This is a straightforward conversion from the page APIs to the folio
APIs. Symlinks are not allowed to be larger than PAGE_SIZE, so there
is little work to do here.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
|