summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2025-01-09bcachefs: Don't set btree_path to updtodate if we don't fillKent Overstreet
This fixes various locking asserts, and a null ptr deref in bch2_btree_iter_peek_path(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09bcachefs: __bch2_btree_pos_to_text()Kent Overstreet
Factor out a version of bch2_btree_pos_to_text() that doesn't take a pointer to a in-memory btree node, to be used for btree node scrub. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09bcachefs: printbuf_reset() handles tabstopsKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09bcachefs: Silence read-only errors when deleting snapshotsKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09bcachefs: Dropped superblock write is no longer a fatal errorKent Overstreet
Just emit a warning if errors=continue or fix_safe. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09bcachefs: bch2_trans_node_drop()Kent Overstreet
Factor out a small common helper. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09bcachefs: bch2_trans_unlock_write()Kent Overstreet
New helper for dropping all write locks; which is distinct from the helper the transaction commit path uses, which is faster and only touches updates. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09bcachefs: btree_node_unlock() can now drop write locksKent Overstreet
Prep work for reworking btree node locking during interior btree updates. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09bcachefs: six locks: write locks can now be held recursivelyKent Overstreet
This is needed for the interior update locking rework, where we'll be holding node write locks for the duration of the update - which is needed for synchronizing with online check_allocations. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09bcachefs: bch2_fs_btree_gc_init()Kent Overstreet
Now returns errors, prep work for check_allocations_done_lock Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09bcachefs: Assert that btree write buffer only touches the right btreesKent Overstreet
More asserts, more better. Also, clean up the per-btree flags a bit. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09bcachefs: bch2_inum_path() now crosses subvolumes correctlyKent Overstreet
The dirent that points to a subvolume root is in the parent subvolume. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09bcachefs: bch2_inum_path() no longer returns an error for disconnected inumsKent Overstreet
bch2_inum_path() should work even if the filesystem is corrupted - we don't want it to cause fsck to fail. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09bcachefs: btree_path_very_locks(): verify lock seqKent Overstreet
If the btree_path's lock seq is wrong, the next bch2_trans_relock() operation is guaranteed to fail and we take an unnecessary transaction restart. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09bcachefs: fix bch2_btree_key_cache_drop()Kent Overstreet
When evicting, we shouldn't leave a pointer to the key cache entry lying around - that screws up btree path asserts we're adding. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09bcachefs: bch2_btree_node_write_trans()Kent Overstreet
Avoiding screwing up path->lock_seq. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09bcachefs: Fixes for snapshot_tree.master_subvolKent Overstreet
Ensure that snapshot_tree.master_subvol is cleared when we delete the master subvolume in a tree of snapshots, and allow for snapshot trees that don't have a master subvolume in fsck. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09bcachefs: Don't rely on snapshot_tree.master_subvol for reattachingKent Overstreet
Previously, fsck used the snapshot tree's master subvol for finding the root inode number - but the master subvol might have been deleting, and setting a new one should be a user operation; meaning we can't rely on it existing. Fortunately, for finding the root inode number in a tree of snapshots, finding any associated subvolume works. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09bcachefs: bch2_kvmalloc()Kent Overstreet
Add a version of kvmalloc() that doesn't have the INT_MAX limit; large filesystems do hit this. We'll want to get rid of the in-memory bucket gens array, but we're not there quite yet. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09bcachefs: Fix assert for online fsckKent Overstreet
We can't check if we're racing with fsck ending until mark_lock is held. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09bcachefs: Handle -BCH_ERR_need_mark_replicas in gcKent Overstreet
Locking considerations (possibly no longer relevant?) mean that when an accounting update needs a new superblock replicas entry to be created, it's deferred to the transaction commit error path. But accounting updates for gc/fcsk aren't done from the transaction commit path - so we need to handle -BCH_ERR_btree_insert_need_mark_replicas locally. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09bcachefs: Write lock btree node in key cache fillsKent Overstreet
this addresses a key cache coherency bug Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09bcachefs: kill __bch2_btree_iter_flags()Kent Overstreet
bch2_btree_iter_flags() now takes a level parameter; this fixes a bug where using a node iterator on a leaf wouldn't set BTREE_ITER_with_key_cache, leading to fun cache coherency bugs. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09bcachefs: Drop redundant "read error" call from btree_gcKent Overstreet
The btree node read error path already calls topology error, so this is entirely redundant, and we're not specific enough about our error codes - this was triggering for bucket_ref_update() errors. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09bcachefs: Drop racy warningKent Overstreet
Checking for writing past i_size after unlocking the folio and clearing the dirty bit is racy, and we already check it at the start. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09bcachefs: better check_bp_exists() error messageKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09bcachefs: add counter_flags for countersHongbo Li
In bcachefs, io_read and io_write counter record the amount of data which has been read and written. They increase in unit of sector, so to display correctly, they need to be shifted to the left by the size of a sector. Other counters like io_move, move_extent_{read, write, finish} also have this problem. In order to support different unit, we add extra column to mark the counter type by using TYPE_COUNTER and TYPE_SECTORS in BCH_PERSISTENT_COUNTERS(). Fixes: 1c6fdbd8f246 ("bcachefs: Initial commit") Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09bcachefs: bcachefs_metadata_version_autofix_errorsKent Overstreet
It's time to make self healing the default: change the error action for old filesystems to fix_safe, matching the default for current filesystems. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09bcachefs: bcachefs_metadata_version_persistent_inode_cursorsKent Overstreet
Persistent cursors for inode allocation. A free inodes btree would add substantial overhead to inode allocation and freeing - a "next num to allocate" cursor is always going to be faster. We just need it to be persistent, to avoid scanning the inodes btree from the start on startup. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09Merge tag '6.13-rc6-ksmbd-server-fixes' of git://git.samba.org/ksmbdLinus Torvalds
Pull smb server fixes from Steve French: "Four ksmbd server fixes, most also for stable: - fix for reporting special file type more accurately when POSIX extensions negotiated - minor cleanup - fix possible incorrect creation path when dirname is not present. In some cases, Windows apps create files without checking if they exist. - fix potential NULL pointer dereference sending interim response" * tag '6.13-rc6-ksmbd-server-fixes' of git://git.samba.org/ksmbd: ksmbd: Implement new SMB3 POSIX type ksmbd: fix unexpectedly changed path in ksmbd_vfs_kern_path_locked ksmbd: Remove unneeded if check in ksmbd_rdma_capable_netdev() ksmbd: fix a missing return value check bug
2025-01-09Merge tag 'for-6.13-rc6-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "A few more fixes. Besides the one-liners in Btrfs there's fix to the io_uring and encoded read integration (added in this development cycle). The update to io_uring provides more space for the ongoing command that is then used in Btrfs to handle some cases. - io_uring and encoded read: - provide stable storage for io_uring command data - make a copy of encoded read ioctl call, reuse that in case the call would block and will be called again - properly initialize zlib context for hardware compression on s390 - fix max extent size calculation on filesystems with non-zoned devices - fix crash in scrub on crafted image due to invalid extent tree" * tag 'for-6.13-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: zlib: fix avail_in bytes for s390 zlib HW compression path btrfs: zoned: calculate max_extent_size properly on non-zoned setup btrfs: avoid NULL pointer dereference if no valid extent tree btrfs: don't read from userspace twice in btrfs_uring_encoded_read() io_uring: add io_uring_cmd_get_async_data helper io_uring/cmd: add per-op data to struct io_uring_cmd_data io_uring/cmd: rename struct uring_cache to io_uring_cmd_data
2025-01-09afs: Fix merge preference rule failure conditionLizhi Xu
syzbot reported a lock held when returning to userspace[1]. This is because if argc is less than 0 and the function returns directly, the held inode lock is not released. Fix this by store the error in ret and jump to done to clean up instead of returning directly. [dh: Modified Lizhi Xu's original patch to make it honour the error code from afs_split_string()] [1] WARNING: lock held when returning to user space! 6.13.0-rc3-syzkaller-00209-g499551201b5f #0 Not tainted ------------------------------------------------ syz-executor133/5823 is leaving the kernel with locks still held! 1 lock held by syz-executor133/5823: #0: ffff888071cffc00 (&sb->s_type->i_mutex_key#9){++++}-{4:4}, at: inode_lock include/linux/fs.h:818 [inline] #0: ffff888071cffc00 (&sb->s_type->i_mutex_key#9){++++}-{4:4}, at: afs_proc_addr_prefs_write+0x2bb/0x14e0 fs/afs/addr_prefs.c:388 Reported-by: syzbot+76f33569875eb708e575@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=76f33569875eb708e575 Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com> Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20241226012616.2348907-1-lizhi.xu@windriver.com/ Link: https://lore.kernel.org/r/529850.1736261552@warthog.procyon.org.uk Tested-by: syzbot+76f33569875eb708e575@syzkaller.appspotmail.com cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-09netfs: Fix read-retry for fs with no ->prepare_read()David Howells
Fix netfslib's read-retry to only call ->prepare_read() in the backing filesystem such a function is provided. We can get to this point if a there's an active cache as failed reads from the cache need negotiating with the server instead. Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading") Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/529329.1736261010@warthog.procyon.org.uk cc: Jeff Layton <jlayton@kernel.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-09netfs: Fix kernel async DIODavid Howells
Netfslib needs to be able to handle kernel-initiated asynchronous DIO that is supplied with a bio_vec[] array. Currently, because of the async flag, this gets passed to netfs_extract_user_iter() which throws a warning and fails because it only handles IOVEC and UBUF iterators. This can be triggered through a combination of cifs and a loopback blockdev with something like: mount //my/cifs/share /foo dd if=/dev/zero of=/foo/m0 bs=4K count=1K losetup --sector-size 4096 --direct-io=on /dev/loop2046 /foo/m0 echo hello >/dev/loop2046 This causes the following to appear in syslog: WARNING: CPU: 2 PID: 109 at fs/netfs/iterator.c:50 netfs_extract_user_iter+0x170/0x250 [netfs] and the write to fail. Fix this by removing the check in netfs_unbuffered_write_iter_locked() that causes async kernel DIO writes to be handled as userspace writes. Note that this change relies on the kernel caller maintaining the existence of the bio_vec array (or kvec[] or folio_queue) until the op is complete. Fixes: 153a9961b551 ("netfs: Implement unbuffered/DIO write support") Reported-by: Nicolas Baranger <nicolas.baranger@3xo.fr> Closes: https://lore.kernel.org/r/fedd8a40d54b2969097ffa4507979858@3xo.fr/ Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/608725.1736275167@warthog.procyon.org.uk Tested-by: Nicolas Baranger <nicolas.baranger@3xo.fr> Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> cc: Steve French <smfrench@gmail.com> cc: Jeff Layton <jlayton@kernel.org> cc: netfs@lists.linux.dev cc: linux-cifs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-09Merge tag 'vfs-6.14-rc7.mount.fixes'Christian Brauner
Bring in the fix for the mount namespace rbtree. It is used as the base for the vfs mount work for this cycle and so shouldn't be applied directly. Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-09fs: remove useless lockdep assertionChristian Brauner
mnt_ns_release() can run asynchronously via call_rcu() so hitting that lockdep assertion means someone else already grabbed the mnt_ns_tree_lock and causes a false positive. That assertion has likely always been wrong. call_rcu() just makes it more likely to hit. Link: https://lore.kernel.org/r/Z2PlT5rcRTIhCpft@ly-workstation Link: https://lore.kernel.org/r/20241219-darben-quietschen-b6e1d80327bb@brauner Reported-by: Lai, Yi <yi1.lai@linux.intel.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-09fs: use xarray for old mount idChristian Brauner
While the ida does use the xarray internally we can use it explicitly which allows us to increment the unique mount id under the xa lock. This allows us to remove the atomic as we're now allocating both ids in one go. Link: https://lore.kernel.org/r/20241217-erhielten-regung-44bb1604ca8f@brauner Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-09fs: cache first and last mountChristian Brauner
Speed up listmount() by caching the first and last node making retrieval of the first and last mount of each mount namespace O(1). Link: https://lore.kernel.org/r/20241215-vfs-6-14-mount-work-v1-2-fd55922c4af8@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-09fs: simplify rwlock to spinlockChristian Brauner
We're not taking the read_lock() anymore now that all lookup is lockless. Just use a simple spinlock. Link: https://lore.kernel.org/r/20241213-work-mount-rbtree-lockless-v3-6-6e3cdaf9b280@kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-09fs: lockless mntns lookup for nsfsChristian Brauner
We already made the rbtree lookup lockless for the simple lookup case. However, walking the list of mount namespaces via nsfs still happens with taking the read lock blocking concurrent additions of new mount namespaces pointlessly. Plus, such additions are rare anyway so allow lockless lookup of the previous and next mount namespace by keeping a separate list. This also allows to make some things simpler in the code. Link: https://lore.kernel.org/r/20241213-work-mount-rbtree-lockless-v3-5-6e3cdaf9b280@kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-09fs: lockless mntns rbtree lookupChristian Brauner
Currently we use a read-write lock but for the simple search case we can make this lockless. Creating a new mount namespace is a rather rare event compared with querying mounts in a foreign mount namespace. Once this is picked up by e.g., systemd to list mounts in another mount in it's isolated services or in containers this will be used a lot so this seems worthwhile doing. Link: https://lore.kernel.org/r/20241213-work-mount-rbtree-lockless-v3-3-6e3cdaf9b280@kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-09fs: add mount namespace to rbtree lateChristian Brauner
There's no point doing that under the namespace semaphore it just gives the false impression that it protects the mount namespace rbtree and it simply doesn't. Link: https://lore.kernel.org/r/20241213-work-mount-rbtree-lockless-v3-2-6e3cdaf9b280@kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-09mount: remove inlude/nospec.h includeChristian Brauner
It's not needed, so remove it. Link: https://lore.kernel.org/r/20241213-work-mount-rbtree-lockless-v3-1-6e3cdaf9b280@kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-09fs: prepend statmount.mnt_opts string with security_sb_mnt_opts()Jeff Layton
Currently these mount options aren't accessible via statmount(). The read handler for /proc/#/mountinfo calls security_sb_show_options() to emit the security options after emitting superblock flag options, but before calling sb->s_op->show_options. Have statmount_mnt_opts() call security_sb_show_options() before calling ->show_options. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20241115-statmount-v2-2-cd29aeff9cbb@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-09fs: kill MNT_ONRBChristian Brauner
Move mnt->mnt_node into the union with mnt->mnt_rcu and mnt->mnt_llist instead of keeping it with mnt->mnt_list. This allows us to use RB_CLEAR_NODE(&mnt->mnt_node) in umount_tree() as well as list_empty(&mnt->mnt_node). That in turn allows us to remove MNT_ONRB. This also fixes the bug reported in [1] where seemingly MNT_ONRB wasn't set in @mnt->mnt_flags even though the mount was present in the mount rbtree of the mount namespace. The root cause is the following race. When a btrfs subvolume is mounted a temporary mount is created: btrfs_get_tree_subvol() { mnt = fc_mount() // Register the newly allocated mount with sb->mounts: lock_mount_hash(); list_add_tail(&mnt->mnt_instance, &mnt->mnt.mnt_sb->s_mounts); unlock_mount_hash(); } and registered on sb->s_mounts. Later it is added to an anonymous mount namespace via mount_subvol(): -> mount_subvol() -> mount_subtree() -> alloc_mnt_ns() mnt_add_to_ns() vfs_path_lookup() put_mnt_ns() The mnt_add_to_ns() call raises MNT_ONRB in @mnt->mnt_flags. If someone concurrently does a ro remount: reconfigure_super() -> sb_prepare_remount_readonly() { list_for_each_entry(mnt, &sb->s_mounts, mnt_instance) { } all mounts registered in sb->s_mounts are visited and first MNT_WRITE_HOLD is raised, then MNT_READONLY is raised, and finally MNT_WRITE_HOLD is removed again. The flag modification for MNT_WRITE_HOLD/MNT_READONLY and MNT_ONRB race so MNT_ONRB might be lost. Fixes: 2eea9ce4310d ("mounts: keep list of mounts in an rbtree") Cc: <stable@kernel.org> # v6.8+ Link: https://lore.kernel.org/r/20241215-vfs-6-14-mount-work-v1-1-fd55922c4af8@kernel.org Link: https://lore.kernel.org/r/ec6784ed-8722-4695-980a-4400d4e7bd1a@gmx.com [1] Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-09xfs: report larger dio alignment for COW inodesChristoph Hellwig
For I/O to reflinked blocks we always need to write an entire new file system block, and the code enforces the file system block alignment for the entire file if it has any reflinked blocks. Mirror the larger value reported in the statx in the dio_offset_align in the xfs-specific XFS_IOC_DIOINFO ioctl for the same reason. Don't bother adding a new field for the read alignment to this legacy ioctl as all new users should use statx instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250109083109.1441561-6-hch@lst.de Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-09xfs: report the correct read/write dio alignment for reflinked inodesChristoph Hellwig
For I/O to reflinked blocks we always need to write an entire new file system block, and the code enforces the file system block alignment for the entire file if it has any reflinked blocks. Use the new STATX_DIO_READ_ALIGN flag to report the asymmetric read vs write alignments for reflinked files. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250109083109.1441561-5-hch@lst.de Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-09xfs: cleanup xfs_vn_getattrChristoph Hellwig
Split the two bits of optional statx reporting into their own helpers so that they are self-contained instead of deeply indented in the main getattr handler. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250109083109.1441561-4-hch@lst.de Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-09fs: add STATX_DIO_READ_ALIGNChristoph Hellwig
Add a separate dio read align field, as many out of place write file systems can easily do reads aligned to the device sector size, but require bigger alignment for writes. This is usually papered over by falling back to buffered I/O for smaller writes and doing read-modify-write cycles, but performance for this sucks, so applications benefit from knowing the actual write alignment. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250109083109.1441561-3-hch@lst.de Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-09iomap: avoid avoid truncating 64-bit offset to 32 bitsMarco Nelissen
on 32-bit kernels, iomap_write_delalloc_scan() was inadvertently using a 32-bit position due to folio_next_index() returning an unsigned long. This could lead to an infinite loop when writing to an xfs filesystem. Signed-off-by: Marco Nelissen <marco.nelissen@gmail.com> Link: https://lore.kernel.org/r/20250109041253.2494374-1-marco.nelissen@gmail.com Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>