summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2024-03-05btrfs: tree-checker: dump the page status if hit something wrongQu Wenruo
[BUG] There is a bug report about very suspicious tree-checker got triggered: BTRFS critical (device dm-0): corrupted node, root=256 block=8550954455682405139 owner mismatch, have 11858205567642294356 expect [256, 18446744073709551360] BTRFS critical (device dm-0): corrupted node, root=256 block=8550954455682405139 owner mismatch, have 11858205567642294356 expect [256, 18446744073709551360] BTRFS critical (device dm-0): corrupted node, root=256 block=8550954455682405139 owner mismatch, have 11858205567642294356 expect [256, 18446744073709551360] SELinux: inode_doinit_use_xattr: getxattr returned 117 for dev=dm-0 ino=5737268 [ANALYZE] The root cause is still unclear, but there are some clues already: - Unaligned eb bytenr The block bytenr is 8550954455682405139, which is not even aligned to 2. This bytenr is fetched from extent buffer header, not from eb->start. This means, at the initial time of read, eb header bytenr is still correct (the very basis check to continue read), but later something wrong happened, got at least the first page corrupted. Thus we got such obviously incorrect value. - Invalid extent buffer header owner The read itself is triggered for subvolume 256, but the eb header owner is 11858205567642294356, which is not really possible. The problem here is, subvolume id is limited to (1 << 48 - 1), and this one definitely goes beyond that limit. So this value is another garbage. We already got two garbage from an extent buffer, which passed the initial bytenr and csum checks, but later the contents become garbage at some point. This looks like a page lifespan problem (e.g. we didn't properly hold the page). [ENHANCEMENT] The current tree-checker only outputs things from the extent buffer, nothing with the page status. So this patch would enhance the tree-checker output by also dumping the first page, which would look like this: page:00000000aa9f3ce8 refcount:4 mapcount:0 mapping:00000000169aa6b6 index:0x1d0c pfn:0x1022e5 memcg:ffff888103456000 aops:btree_aops [btrfs] ino:1 flags: 0x2ffff0000008000(private|node=0|zone=2|lastcpupid=0xffff) page_type: 0xffffffff() raw: 02ffff0000008000 0000000000000000 dead000000000122 ffff88811e06e220 raw: 0000000000001d0c ffff888102fdb1d8 00000004ffffffff ffff888103456000 page dumped because: eb page dump BTRFS critical (device dm-3): corrupt leaf: root=5 block=30457856 slot=6 ino=257 file_offset=0, invalid disk_bytenr for file extent, have 10617606235235216665, should be aligned to 4096 BTRFS error (device dm-3): read time tree block corruption detected on logical 30457856 mirror 1 From the dump we can see some extra info, something can help us to do extra cross-checks: - Page refcount if it's too low, it definitely means something bad. - Page aops Any mapped eb page should have btree_aops with inode number 1. - Page index Since a mapped eb page should has its bytenr matching the page position, (index << PAGE_SHIFT) should match the bytenr of the bytenr from the critical line. - Page Private flags A mapped eb page should have Private flag set to indicate it's managed by btrfs. Link: https://lore.kernel.org/linux-btrfs/CAHk-=whNdMaN9ntZ47XRKP6DBes2E5w7fi-0U3H2+PS18p+Pzw@mail.gmail.com/ Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-05btrfs: compression: remove dead comments in btrfs_compress_heuristic()Qu Wenruo
Since commit a440d48c7f93 ("Btrfs: heuristic: implement sampling logic"), btrfs_compress_heuristic() is no longer a simple "return true", but more complex to determine if we should compress. Thus the comment is dead and can be confusing, just remove it. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-05btrfs: subpage: make writer lock utilize bitmapQu Wenruo
For the writer counter, it's pretty much the same as the reader counter, and they are exclusive. So move them to the new locked bitmap. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-05btrfs: subpage: make reader lock utilize bitmapQu Wenruo
Currently btrfs_subpage utilizes its atomic member @reader to manage the reader counter. However it is only utilized to prevent the page to be released/unlocked when we still have reads underway. In that use case, we don't really allow multiple readers on the same subpage sector. So here we can introduce a new locked bitmap to represent exactly which subpage range is locked for read. In theory we can remove btrfs_subpage::reader as it's just the set bits of the new locked bitmap. But unfortunately bitmap doesn't provide such handy API yet, so we still keep the reader counter. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-05btrfs: unexport btrfs_subpage_start_writer() and ↵Qu Wenruo
btrfs_subpage_end_and_test_writer() Both functions were introduced in commit 1e1de38792e0 ("btrfs: make process_one_page() to handle subpage locking"), but they have never been utilized out of subpage code. So just unexport them. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-05btrfs: pass a valid extent map cache pointer to __get_extent_map()David Sterba
We can pass a valid em cache pointer down to __get_extent_map() and drop the validity check. This avoids the special case, the call stacks are simple: btrfs_read_folio btrfs_do_readpage __get_extent_map extent_readahead contiguous_readpages btrfs_do_readpage __get_extent_map Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-05NFSD: send OP_CB_RECALL_ANY to clients when number of delegations reaches ↵Dai Ngo
its limit The NFS server should ask clients to voluntarily return unused delegations when the number of granted delegations reaches the max_delegations. This is so that the server can continue to grant delegations for new requests. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Tested-by: Chen Hanxiao <chenhx.fnst@fujitsu.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-03-05NFSD: Document nfsd_setattr() fill-attributes behaviorChuck Lever
Add an explanation to prevent the future removal of the fill- attribute call sites in nfsd_setattr(). Some NFSv3 client implementations don't behave correctly if wcc data is not present in an NFSv3 SETATTR reply. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-03-05udf: remove SLAB_MEM_SPREAD flag usageChengming Zhou
The SLAB_MEM_SPREAD flag is already a no-op after removal of SLAB allocator and in [1] it was fully deprecated. Remove its usage so we can delete it from slab. No functional change. Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/all/20240223-slab-cleanup-flags-v2-1-02f1753e8303@suse.cz/ Message-Id: <20240224135229.830356-1-chengming.zhou@linux.dev>
2024-03-05quota: remove SLAB_MEM_SPREAD flag usageChengming Zhou
The SLAB_MEM_SPREAD flag is already a no-op after removal of SLAB allocator and in [1] it was fully deprecated. Remove its usage so we can delete it from slab. No functional change. Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/all/20240223-slab-cleanup-flags-v2-1-02f1753e8303@suse.cz/ Message-Id: <20240224135118.830073-1-chengming.zhou@linux.dev>
2024-03-05isofs: remove SLAB_MEM_SPREAD flag usageChengming Zhou
The SLAB_MEM_SPREAD flag is already a no-op after removal of SLAB allocator and in [1] it was fully deprecated. Remove its usage so we can delete it from slab. No functional change. Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/all/20240223-slab-cleanup-flags-v2-1-02f1753e8303@suse.cz/ Message-Id: <20240224134901.829591-1-chengming.zhou@linux.dev>
2024-03-05ext2: remove SLAB_MEM_SPREAD flag usageChengming Zhou
The SLAB_MEM_SPREAD flag is already a no-op after removal of SLAB allocator and in [1] it was fully deprecated. Remove its usage so we can delete it from slab. No functional change. Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/all/20240223-slab-cleanup-flags-v2-1-02f1753e8303@suse.cz/ Message-Id: <20240224134816.829424-1-chengming.zhou@linux.dev>
2024-03-05fuse: Convert fuse_writepage_locked to take a folioMatthew Wilcox (Oracle)
The one remaining caller of fuse_writepage_locked() already has a folio, so convert this function entirely. Saves a few calls to compound_head() but no attempt is made to support large folios in this patch. Signed-off-by: "Matthew Wilcox (Oracle)" <willy@infradead.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-03-05fuse: Remove fuse_writepageMatthew Wilcox (Oracle)
The writepage operation is deprecated as it leads to worse performance under high memory pressure due to folios being written out in LRU order rather than sequentially within a file. Use filemap_migrate_folio() to support dirty folio migration instead of writepage. Signed-off-by: "Matthew Wilcox (Oracle)" <willy@infradead.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-03-05virtio_fs: remove duplicate check if queue is brokenLi RongQing
virtqueue_enable_cb() will call virtqueue_poll() which will check if queue is broken at beginning, so remove the virtqueue_is_broken() call Signed-off-by: Li RongQing <lirongqing@baidu.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-03-05fuse: use FUSE_ROOT_ID in fuse_get_root_inode()Miklos Szeredi
...when calling fuse_iget(). Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-03-05fuse: don't unhash rootMiklos Szeredi
The root inode is assumed to be always hashed. Do not unhash the root inode even if it is marked BAD. Fixes: 5d069dbe8aaf ("fuse: fix bad inode") Cc: <stable@vger.kernel.org> # v5.11 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-03-05fuse: fix root lookup with nonzero generationMiklos Szeredi
The root inode has a fixed nodeid and generation (1, 0). Prior to the commit 15db16837a35 ("fuse: fix illegal access to inode with reused nodeid") generation number on lookup was ignored. After this commit lookup with the wrong generation number resulted in the inode being unhashed. This is correct for non-root inodes, but replacing the root inode is wrong and results in weird behavior. Fix by reverting to the old behavior if ignoring the generation for the root inode, but issuing a warning in dmesg. Reported-by: Antonio SJ Musumeci <trapexit@spawn.link> Closes: https://lore.kernel.org/all/CAOQ4uxhek5ytdN8Yz2tNEOg5ea4NkBb4nk0FGPjPk_9nz-VG3g@mail.gmail.com/ Fixes: 15db16837a35 ("fuse: fix illegal access to inode with reused nodeid") Cc: <stable@vger.kernel.org> # v5.14 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-03-05fuse: replace remaining make_bad_inode() with fuse_make_bad()Miklos Szeredi
fuse_do_statx() was added with the wrong helper. Fixes: d3045530bdd2 ("fuse: implement statx") Cc: <stable@vger.kernel.org> # v6.6 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-03-05virtiofs: drop __exit from virtio_fs_sysfs_exit()Stefan Hajnoczi
virtio_fs_sysfs_exit() is called by: - static int __init virtio_fs_init(void) - static void __exit virtio_fs_exit(void) Remove __exit from virtio_fs_sysfs_exit() since virtio_fs_init() is not an __exit function. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202402270649.GYjNX0yw-lkp@intel.com/ Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-03-05fuse: implement passthrough for mmapAmir Goldstein
An mmap request for a file open in passthrough mode, maps the memory directly to the backing file. An mmap of a file in direct io mode, usually uses cached mmap and puts the inode in caching io mode, which denies new passthrough opens of that inode, because caching io mode is conflicting with passthrough io mode. For the same reason, trying to mmap a direct io file, while there is a passthrough file open on the same inode will fail with -ENODEV. An mmap of a file in direct io mode, also needs to wait for parallel dio writes in-progress to complete. If a passthrough file is opened, while an mmap of another direct io file is waiting for parallel dio writes to complete, the wait is aborted and mmap fails with -ENODEV. A FUSE server that uses passthrough and direct io opens on the same inode that may also be mmaped, is advised to provide a backing fd also for the files that are open in direct io mode (i.e. use the flags combination FOPEN_DIRECT_IO | FOPEN_PASSTHROUGH), so that mmap will always use the backing file, even if read/write do not passthrough. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-03-05fuse: implement splice read/write passthroughAmir Goldstein
This allows passing fstests generic/249 and generic/591. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-03-05fuse: implement read/write passthroughAmir Goldstein
Use the backing file read/write helpers to implement read/write passthrough to a backing file. After read/write, we invalidate a/c/mtime/size attributes. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-03-05fuse: implement open in passthrough modeAmir Goldstein
After getting a backing file id with FUSE_DEV_IOC_BACKING_OPEN ioctl, a FUSE server can reply to an OPEN request with flag FOPEN_PASSTHROUGH and the backing file id. The FUSE server should reuse the same backing file id for all the open replies of the same FUSE inode and open will fail (with -EIO) if a the server attempts to open the same inode with conflicting io modes or to setup passthrough to two different backing files for the same FUSE inode. Using the same backing file id for several different inodes is allowed. Opening a new file with FOPEN_DIRECT_IO for an inode that is already open for passthrough is allowed, but only if the FOPEN_PASSTHROUGH flag and correct backing file id are specified as well. The read/write IO of such files will not use passthrough operations to the backing file, but mmap, which does not support direct_io, will use the backing file insead of using the page cache as it always did. Even though all FUSE passthrough files of the same inode use the same backing file as a backing inode reference, each FUSE file opens a unique instance of a backing_file object to store the FUSE path that was used to open the inode and the open flags of the specific open file. The per-file, backing_file object is released along with the FUSE file. The inode associated fuse_backing object is released when the last FUSE passthrough file of that inode is released AND when the backing file id is closed by the server using the FUSE_DEV_IOC_BACKING_CLOSE ioctl. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-03-05fuse: prepare for opening file in passthrough modeAmir Goldstein
In preparation for opening file in passthrough mode, store the fuse_open_out argument in ff->args to be passed into fuse_file_io_open() with the optional backing_id member. This will be used for setting up passthrough to backing file on open reply with FOPEN_PASSTHROUGH flag and a valid backing_id. Opening a file in passthrough mode may fail for several reasons, such as missing capability, conflicting open flags or inode in caching mode. Return EIO from fuse_file_io_open() in those cases. The combination of FOPEN_PASSTHROUGH and FOPEN_DIRECT_IO is allowed - it mean that read/write operations will go directly to the server, but mmap will be done to the backing file. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-03-05fuse: implement ioctls to manage backing filesAmir Goldstein
FUSE server calls the FUSE_DEV_IOC_BACKING_OPEN ioctl with a backing file descriptor. If the call succeeds, a backing file identifier is returned. A later change will be using this backing file id in a reply to OPEN request with the flag FOPEN_PASSTHROUGH to setup passthrough of file operations on the open FUSE file to the backing file. The FUSE server should call FUSE_DEV_IOC_BACKING_CLOSE ioctl to close the backing file by its id. This can be done at any time, but if an open reply with FOPEN_PASSTHROUGH flag is still in progress, the open may fail if the backing file is closed before the fuse file was opened. Setting up backing files requires a server with CAP_SYS_ADMIN privileges. For the backing file to be successfully setup, the backing file must implement both read_iter and write_iter file operations. The limitation on the level of filesystem stacking allowed for the backing file is enforced before setting up the backing file. Signed-off-by: Alessio Balsini <balsini@android.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-03-05fs/aio: Check IOCB_AIO_RW before the struct aio_kiocb conversionBart Van Assche
The first kiocb_set_cancel_fn() argument may point at a struct kiocb that is not embedded inside struct aio_kiocb. With the current code, depending on the compiler, the req->ki_ctx read happens either before the IOCB_AIO_RW test or after that test. Move the req->ki_ctx read such that it is guaranteed that the IOCB_AIO_RW test happens first. Reported-by: Eric Biggers <ebiggers@kernel.org> Cc: Benjamin LaHaise <ben@communityfibre.ca> Cc: Eric Biggers <ebiggers@google.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Avi Kivity <avi@scylladb.com> Cc: Sandeep Dhavale <dhavale@google.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: stable@vger.kernel.org Fixes: b820de741ae4 ("fs/aio: Restrict kiocb_set_cancel_fn() to I/O submitted via libaio") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240304235715.3790858-1-bvanassche@acm.org Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-03-05Revert "fs/aio: Make io_cancel() generate completions again"Bart Van Assche
Patch "fs/aio: Make io_cancel() generate completions again" is based on the assumption that calling kiocb->ki_cancel() does not complete R/W requests. This is incorrect: the two drivers that call kiocb_set_cancel_fn() callers set a cancellation function that calls usb_ep_dequeue(). According to its documentation, usb_ep_dequeue() calls the completion routine with status -ECONNRESET. Hence this revert. Cc: Benjamin LaHaise <ben@communityfibre.ca> Cc: Eric Biggers <ebiggers@google.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Avi Kivity <avi@scylladb.com> Cc: Sandeep Dhavale <dhavale@google.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: stable@vger.kernel.org Reported-by: syzbot+b91eb2ed18f599dd3c31@syzkaller.appspotmail.com Fixes: 54cbc058d86b ("fs/aio: Make io_cancel() generate completions again") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240304182945.3646109-1-bvanassche@acm.org Acked-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-03-04f2fs: introduce SEGS_TO_BLKS/BLKS_TO_SEGS for cleanupChao Yu
Just cleanup, no functional change. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-03-04f2fs: fix to check return value of f2fs_gc_rangeZhiguo Niu
f2fs_gc_range may return error, so its caller f2fs_allocate_pinning_section should determine whether to do retry based on ist return value. Also just do f2fs_gc_range when f2fs_allocate_new_section return -EAGAIN, and check cp error case in f2fs_gc_range. Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-03-04f2fs: fix to check return value __allocate_new_segmentZhiguo Niu
__allocate_new_segment may return error when get_new_segment fails, so its caller should check its return value. Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-03-04f2fs: fix to do sanity check in update_sit_entryZhiguo Niu
If GET_SEGNO return NULL_SEGNO for some unecpected case, update_sit_entry will access invalid memory address, cause system crash. It is better to do sanity check about GET_SEGNO just like update_segment_mtime & locate_dirty_segment. Also remove some redundant judgment code. Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-03-04f2fs: fix to reset fields for unloaded cursegChao Yu
In f2fs_allocate_data_block(), before skip allocating new segment for DATA_PINNED log header, it needs to tag log header as unloaded one to avoid skipping logic in locate_dirty_segment() and __f2fs_save_inmem_curseg(). Signed-off-by: Chao Yu <chao@kernel.org> Reviewed-by: Daeho Jeong <daehojeong@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-03-04f2fs: clean up new_curseg()Chao Yu
Move f2fs_valid_pinned_area() check logic from new_curseg() to get_new_segment(), it can avoid calling __set_free() if it fails to find free segment in conventional zone for pinned file. Signed-off-by: Chao Yu <chao@kernel.org> Reviewed-by: Daeho Jeong <daehojeong@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-03-04f2fs: relocate f2fs_precache_extents() in f2fs_swap_activate()Chao Yu
This patch exchangs position of f2fs_precache_extents() and filemap_fdatawrite(), so that f2fs_precache_extents() can load extent info after physical addresses of all data are fixed. Signed-off-by: Chao Yu <chao@kernel.org> Reviewed-by: Daeho Jeong <daehojeong@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-03-04f2fs: fix blkofs_end correctly in f2fs_migrate_blocks()Chao Yu
In f2fs_migrate_blocks(), when traversing blocks in last section, blkofs_end should be (start_blk + blkcnt - 1) % blk_per_sec, fix it. Signed-off-by: Chao Yu <chao@kernel.org> Reviewed-by: Daeho Jeong <daehojeong@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-03-04f2fs: ro: don't start discard thread for readonly imageChao Yu
[ 9299.893835] F2FS-fs (vdd): Allow to mount readonly mode only mount: /mnt/f2fs: WARNING: source write-protected, mounted read-only. root@qemu:/ ps -ef|grep f2fs root 94 2 0 03:46 ? 00:00:00 [kworker/u17:0-f2fs_post_read_wq] root 6282 2 0 06:21 ? 00:00:00 [f2fs_discard-253:48] There will be no deletion in readonly image, let's skip starting discard thread to save system resources. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-03-04f2fs: ro: compress: fix to avoid caching unaligned extentChao Yu
Mapping info from dump.f2fs: i_addr[0x2d] cluster flag [0xfffffffe : 4294967294] i_addr[0x2e] [0x 10428 : 66600] i_addr[0x2f] [0x 10429 : 66601] i_addr[0x30] [0x 1042a : 66602] f2fs_io fiemap 37 1 /mnt/f2fs/disk-58390c8c.raw Previsouly, it missed to align fofs and ofs_in_node to cluster_size, result in adding incorrect read extent cache, fix it. Before: f2fs_update_read_extent_tree_range: dev = (253,48), ino = 5, pgofs = 37, len = 4, blkaddr = 66600, c_len = 3 After: f2fs_update_read_extent_tree_range: dev = (253,48), ino = 5, pgofs = 36, len = 4, blkaddr = 66600, c_len = 3 Fixes: 94afd6d6e525 ("f2fs: extent cache: support unaligned extent") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-03-04f2fs: fix to check return value in f2fs_insert_range()Chao Yu
In f2fs_insert_range(), it missed to check return value of filemap_write_and_wait_range(), fix it. Meanwhile, just return error number once __exchange_data_block() fails. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-03-04f2fs: fix to use correct segment type in f2fs_allocate_data_block()Chao Yu
@type in f2fs_allocate_data_block() indicates log header's type, it can be CURSEG_COLD_DATA_PINNED or CURSEG_ALL_DATA_ATGC, rather than type of data/node, however IS_DATASEG()/IS_NODESEG() only accept later one, fix it. Fixes: 093749e296e2 ("f2fs: support age threshold based garbage collection") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-03-04Merge tag 'vfs-6.9.rw_hint' of ↵Christian Brauner
gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs Pull write hint fix from Christian Brauner: UFS devices are widely used in mobile applications, e.g. in smartphones. UFS vendors need data lifetime information to achieve good performance. Providing data lifetime information to UFS devices can result in up to 40% lower write amplification. Hence this patch series that restores the bi_write_hint member in struct bio. After this patch series has been merged, patches that implement data lifetime support in the SCSI disk (sd) driver will be sent to the Linux kernel SCSI maintainer. The following changes are included in this patch series: - Improvements for the F_GET_RW_HINT and F_SET_RW_HINT fcntls. - Move enum rw_hint into a new header file. - Support F_SET_RW_HINT for block devices to make it easy to test data lifetime support. - Restore the bio.bi_write_hint member and restore support in the VFS layer and also in the block layer for data lifetime information. The shell script that has been used to test the patch series combined with the SCSI patches is available at the end of this cover letter. * tag 'vfs-6.9.rw_hint' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: block, fs: Restore the per-bio/request data lifetime fields fs: Propagate write hints to the struct block_device inode fs: Move enum rw_hint into a new header file fs: Split fcntl_rw_hint() fs: Verify write lifetime constants at compile time fs: Fix rw_hint validation Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-03-04f2fs: allow to mount if cap is 100Jaegeuk Kim
Don't block mounting the partition, if cap is 100%. Reviewed-by: Chao Yu <chao@kernel.org> Reviewed-by: Daeho Jeong <daehojeong@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-03-04f2fs: print zone status in string and some logJaegeuk Kim
No functional change, but add some more logs. Note, it includes the spelling mistakes pointed by Colin Ian King. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-03-04btrfs: merge btrfs_del_delalloc_inode() helpersDavid Sterba
The helpers btrfs_del_delalloc_inode() and __btrfs_del_delalloc_inode() don't follow the pattern when the "__" helper does a special case and are in fact reversed regarding the naming. We can merge them into one as there's only one place that needs to be open coded. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: pass btrfs_device to btrfs_scratch_superblocks()David Sterba
Replace the two parameters bdev and name by one that can be used to get them both. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: handle transaction commit errors in flush_reservations()David Sterba
Other errors in flush_reservations() are handled and also in the caller. Ignoring commit might make some sense as it's called right after join so it's to poke the whole commit machinery to free space. However for consistency return the error. The caller btrfs_quota_disable() would try to start the transaction which would in turn fail too so there's no effective change. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: use KMEM_CACHE() to create btrfs_free_space cacheKunwu Chan
Use the KMEM_CACHE() macro instead of kmem_cache_create() to simplify the creation of SLAB caches when the default values are used. Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: use KMEM_CACHE() to create delayed ref cachesKunwu Chan
Use the KMEM_CACHE() macro instead of kmem_cache_create() to simplify the creation of SLAB caches related to delayed refs when the default values are used. Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: use KMEM_CACHE() to create btrfs_path cacheKunwu Chan
Use the KMEM_CACHE() macro instead of kmem_cache_create() to simplify the creation of SLAB caches when the default values are used. Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: use KMEM_CACHE() to create btrfs_trans_handle cacheKunwu Chan
Use the KMEM_CACHE() macro instead of kmem_cache_create() to simplify the creation of SLAB caches when the default values are used. Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>