Age | Commit message (Collapse) | Author |
|
Convert f2fs_get_new_data_page() into f2fs_get_new_data_folio() and
add a f2fs_get_new_data_page() wrapper.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
All callers have now been converted to call f2fs_find_data_folio().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
All callers have now been converted to call f2fs_get_sum_folio()
instead.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Convert f2fs_get_sum_page() to f2fs_get_sum_folio() and add a
f2fs_get_sum_page() wrapper.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Also convert get_current_nat_page() to get_current_nat_folio().
Removes three hidden calls to compound_head().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
All callers have now been converted to f2fs_get_meta_folio() so
we can remove this wrapper.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Convert f2fs_get_meta_page() to f2fs_get_meta_folio() and add
f2fs_get_meta_page() as a wrapper.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
All callers have now been converted to f2fs_grab_meta_folio() so
we can remove this wrapper.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Remove a call to find_get_page(). Saves two hidden calls to
compound_head(). Change f2fs_folio_put() to check for IS_ERR_OR_NULL
to handle the case where we got an error pointer back from
filemap_get_folio().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Convert all the callers to receive a folio. Removes a lot of
hidden calls to compound_head() in f2fs_put_page().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
The only caller which passes a page already has a folio, so pass
it in.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Turn f2fs_grab_meta_page() into a wrapper around f2fs_grab_meta_folio().
Saves three hidden calls to compound_head().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Iterate over each folio rather than each page. Convert
f2fs_compress_control_page() to f2fs_compress_control_folio() since
this is the only caller. Removes a reference to page->mapping which
is going away soon as well as calls to fscrypt_is_bounce_page() and
fscrypt_pagecache_page().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This helper returns the inode associated with the f2fs_io_info. That's a
relatively common thing to want, mildly awkward to get and provides one
place to change if we decide to record it directly, or change fio->page
to fio->folio.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
[Jaegeuk Kim: fix wrong fio_inode conversion]
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
syzbot reported a f2fs bug as below:
------------[ cut here ]------------
kernel BUG at fs/f2fs/f2fs.h:2521!
RIP: 0010:dec_valid_block_count+0x3b2/0x3c0 fs/f2fs/f2fs.h:2521
Call Trace:
f2fs_truncate_data_blocks_range+0xc8c/0x11a0 fs/f2fs/file.c:695
truncate_dnode+0x417/0x740 fs/f2fs/node.c:973
truncate_nodes+0x3ec/0xf50 fs/f2fs/node.c:1014
f2fs_truncate_inode_blocks+0x8e3/0x1370 fs/f2fs/node.c:1197
f2fs_do_truncate_blocks+0x840/0x12b0 fs/f2fs/file.c:810
f2fs_truncate_blocks+0x10d/0x300 fs/f2fs/file.c:838
f2fs_truncate+0x417/0x720 fs/f2fs/file.c:888
f2fs_setattr+0xc4f/0x12f0 fs/f2fs/file.c:1112
notify_change+0xbca/0xe90 fs/attr.c:552
do_truncate+0x222/0x310 fs/open.c:65
handle_truncate fs/namei.c:3466 [inline]
do_open fs/namei.c:3849 [inline]
path_openat+0x2e4f/0x35d0 fs/namei.c:4004
do_filp_open+0x284/0x4e0 fs/namei.c:4031
do_sys_openat2+0x12b/0x1d0 fs/open.c:1429
do_sys_open fs/open.c:1444 [inline]
__do_sys_creat fs/open.c:1522 [inline]
__se_sys_creat fs/open.c:1516 [inline]
__x64_sys_creat+0x124/0x170 fs/open.c:1516
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/syscall_64.c:94
The reason is: in fuzzed image, sbi->total_valid_block_count is
inconsistent w/ mapped blocks indexed by inode, so, we should
not trigger panic for such case, instead, let's print log and
set fsck flag.
Fixes: 39a53e0ce0df ("f2fs: add superblock and major in-memory structure")
Reported-by: syzbot+8b376a77b2f364097fbe@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-f2fs-devel/67f3c0b2.050a0220.396535.0547.GAE@google.com
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
When we update inject type via sysfs, it shows wrong rate value as
below, there is a same problem when we update inject rate, fix it.
Before:
F2FS-fs (vdd): build fault injection attr: rate: 0, type: 0xffff
F2FS-fs (vdd): build fault injection attr: rate: 1, type: 0x0
After:
F2FS-fs (vdd): build fault injection type: 0x1
F2FS-fs (vdd): build fault injection rate: 1
Meanwhile, let's avoid turning on all fault types when we enable fault
injection via fault_injection mount option, it will lead to shutdown
filesystem or fail the mount() easily.
mount -o fault_injection=4 /dev/vdd /mnt/f2fs
F2FS-fs (vdd): build fault injection attr: rate: 4, type: 0x7fffff
F2FS-fs (vdd): inject kmalloc in f2fs_kmalloc of f2fs_fill_super+0xbdf/0x27c0
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This patch adds a proc entry named inject_stats to show total injected
count for each fault type.
cat /proc/fs/f2fs/<dev>/inject_stats
fault_type injected_count
kmalloc 0
kvmalloc 0
page alloc 0
page get 0
alloc bio(obsolete) 0
alloc nid 0
orphan 0
no more block 0
too big dir depth 0
evict_inode fail 0
truncate fail 0
read IO error 0
checkpoint error 0
discard error 0
write IO error 0
slab alloc 0
dquot initialize 0
lock_op 0
invalid blkaddr 0
inconsistent blkaddr 0
no free segment 0
inconsistent footer 0
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Set LAZYTIME into sbi during parsing, and transfer it to the sb in
fill_super, so that an sb is not required during option parsing.
(Note: While lazytime is normally handled via mount flag in the vfs,
some f2fs users do expect to be able to use it as an explicit mount
option string via the mount syscall, so this option must remain.)
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Set INLINECRYPT into sbi during parsing, and transfer it to the sb in
fill_super, so that an sb is not required during option parsing.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
For several zoned storage devices, vendors will provide extra space
which was used for device level GC than specs and F2FS can use this
space for filesystem level GC. To do that, we can reserve the space
using reserved_blocks. However, it is not enough, since this extra
space should not be shown to users. So, with this new sysfs node,
we can hide the space by substracting reserved_blocks from total
bytes.
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This reverts commit 94c821fb286b545d37549ff30a0c341e066f0d6c.
It reports that there is potential corruption in node footer,
the most suspious feature is nat_bits, let's revert recovery
related code.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
To simulate inconsistent node footer error.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This patch introduces a new wrapper f2fs_get_xnode_page(), then, caller
can use it to load xattr block to page cache, meanwhile it will do sanity
check on xattr node footer.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This patch introduces a new wrapper f2fs_get_inode_page(), then, caller
can use it to load inode block to page cache, meanwhile it will do sanity
check on inode footer.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Introduce a new mount option "nat_bits" to control nat_bits feature,
by default nat_bits feature is disabled.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Convert f2fs_find_data_page() to f2fs_find_data_folio() and add a
compatibility wrapper. Saves six hidden calls to compound_head().
This was the last caller of f2fs_get_read_data_page(), so remove
the compatibility wrapper.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Convert f2fs_get_lock_data_page() to f2fs_get_lock_data_folio() and
add a compatibility wrapper. Removes three hidden calls to
compound_head().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Convert f2fs_get_read_data_page() into f2fs_get_read_data_folio() and
add a compatibility wrapper. Saves seven hidden calls to compound_head().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Change __get_node_page() to return a folio and convert back to a page in
f2fs_get_node_page() and f2fs_get_node_page_ra().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
All its callers now have access to a folio, so pass it in. Removes
an access to page->mapping.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
The compiler can make some optimisations if we tell it that a function
call doesn't change this memory.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Convert f2fs_grab_cache_page() into f2fs_grab_cache_folio()
and add a wrapper. Removes several calls to deprecated functions.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Convert f2fs_put_page() to f2fs_folio_put() and add a wrapper.
Replaces three calls to compound_head() with one.
[Jaegeuk Kim: fix missing null pointer check in f2fs_put_page]
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Convert f2fs_wait_on_page_writeback() to f2fs_folio_wait_writeback()
and add a compatibiility wrapper. Replaces five calls to
compound_head() with one.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
1. fadvise(fd1, POSIX_FADV_NOREUSE, {0,3});
2. fadvise(fd2, POSIX_FADV_NOREUSE, {1,2});
3. fadvise(fd3, POSIX_FADV_NOREUSE, {3,1});
4. echo 1024 > /sys/fs/f2fs/tuning/reclaim_caches_kb
This gives a way to reclaim file-backed pages by iterating all f2fs mounts until
reclaiming 1MB page cache ranges, registered by #1, #2, and #3.
5. cat /sys/fs/f2fs/tuning/reclaim_caches_kb
-> gives total number of registered file ranges.
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This patch records POSIX_FADV_NOREUSE ranges for users to reclaim the caches
instantly off from LRU.
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This patch adds an ioctl to give a per-file priority hint to attach
REQ_PRIO.
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
F2FS-fs (dm-59): checkpoint=enable has some unwritten data.
------------[ cut here ]------------
WARNING: CPU: 6 PID: 8013 at fs/quota/dquot.c:691 dquot_writeback_dquots+0x2fc/0x308
pc : dquot_writeback_dquots+0x2fc/0x308
lr : f2fs_quota_sync+0xcc/0x1c4
Call trace:
dquot_writeback_dquots+0x2fc/0x308
f2fs_quota_sync+0xcc/0x1c4
f2fs_write_checkpoint+0x3d4/0x9b0
f2fs_issue_checkpoint+0x1bc/0x2c0
f2fs_sync_fs+0x54/0x150
f2fs_do_sync_file+0x2f8/0x814
__f2fs_ioctl+0x1960/0x3244
f2fs_ioctl+0x54/0xe0
__arm64_sys_ioctl+0xa8/0xe4
invoke_syscall+0x58/0x114
checkpoint and f2fs_remount may race as below, resulting triggering warning
in dquot_writeback_dquots().
atomic write remount
- do_remount
- down_write(&sb->s_umount);
- f2fs_remount
- ioctl
- f2fs_do_sync_file
- f2fs_sync_fs
- f2fs_write_checkpoint
- block_operations
- locked = down_read_trylock(&sbi->sb->s_umount)
: fail to lock due to the write lock was held by remount
- up_write(&sb->s_umount);
- f2fs_quota_sync
- dquot_writeback_dquots
- WARN_ON_ONCE(!rwsem_is_locked(&sb->s_umount))
: trigger warning because s_umount lock was unlocked by remount
If checkpoint comes from mount/umount/remount/freeze/quotactl, caller of
checkpoint has already held s_umount lock, calling dquot_writeback_dquots()
in the context should be safe.
So let's record task to sbi->umount_lock_holder, so that checkpoint can
know whether the lock has held in the context or not by checking current
w/ it.
In addition, in order to not misrepresent caller of checkpoint, we should
not allow to trigger async checkpoint for those callers: mount/umount/remount/
freeze/quotactl.
Fixes: af033b2aa8a8 ("f2fs: guarantee journalled quota data by checkpoint")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"In this series, there are several major improvements such as folio
conversion by Matthew, speed-up of block truncation, and caching more
dentry pages.
In addition, we implemented a linear dentry search to address recent
unicode regression, and figured out some false alarms that we could
get rid of.
Enhancements:
- foilio conversion in various IO paths
- optimize f2fs_truncate_data_blocks_range()
- cache more dentry pages
- remove unnecessary blk_finish_plug
- procfs: show mtime in segment_bits
Bug fixes:
- introduce linear search for dentries
- don't call block truncation for aliased file
- fix using wrong 'submitted' value in f2fs_write_cache_pages
- fix to do sanity check correctly on i_inline_xattr_size
- avoid trying to get invalid block address
- fix inconsistent dirty state of atomic file"
* tag 'f2fs-for-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (32 commits)
f2fs: fix inconsistent dirty state of atomic file
f2fs: fix to avoid changing 'check only' behaior of recovery
f2fs: Clean up the loop outside of f2fs_invalidate_blocks()
f2fs: procfs: show mtime in segment_bits
f2fs: fix to avoid return invalid mtime from f2fs_get_section_mtime()
f2fs: Fix format specifier in sanity_check_inode()
f2fs: avoid trying to get invalid block address
f2fs: fix to do sanity check correctly on i_inline_xattr_size
f2fs: remove blk_finish_plug
f2fs: Optimize f2fs_truncate_data_blocks_range()
f2fs: fix using wrong 'submitted' value in f2fs_write_cache_pages
f2fs: add parameter @len to f2fs_invalidate_blocks()
f2fs: update_sit_entry_for_release() supports consecutive blocks.
f2fs: introduce update_sit_entry_for_release/alloc()
f2fs: don't call block truncation for aliased file
f2fs: Introduce linear search for dentries
f2fs: add parameter @len to f2fs_invalidate_internal_cache()
f2fs: expand f2fs_invalidate_compress_page() to f2fs_invalidate_compress_pages_range()
f2fs: ensure that node info flags are always initialized
f2fs: The GC triggered by ioctl also needs to mark the segno as victim
...
|
|
New function can process some consecutive blocks at a time.
Function f2fs_invalidate_blocks()->down_write() and up_write()
are very time-consuming, so if f2fs_invalidate_blocks() can
process consecutive blocks at one time, it will save a lot of time.
Signed-off-by: Yi Sun <yi.sun@unisoc.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This patch addresses an issue where some files in case-insensitive
directories become inaccessible due to changes in how the kernel function,
utf8_casefold(), generates case-folded strings from the commit 5c26d2f1d3f5
("unicode: Don't special case ignorable code points").
F2FS uses these case-folded names to calculate hash values for locating
dentries and stores them on disk. Since utf8_casefold() can produce
different output across kernel versions, stored hash values and newly
calculated hash values may differ. This results in affected files no
longer being found via the hash-based lookup.
To resolve this, the patch introduces a linear search fallback.
If the initial hash-based search fails, F2FS will sequentially scan the
directory entries.
Fixes: 5c26d2f1d3f5 ("unicode: Don't special case ignorable code points")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=219586
Signed-off-by: Daniel Lee <chullee@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
New function can process some consecutive blocks at a time.
Signed-off-by: Yi Sun <yi.sun@unisoc.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
f2fs_invalidate_compress_pages_range()
New function f2fs_invalidate_compress_pages_range() adds the @len
parameter. So it can process some consecutive blocks at a time.
Signed-off-by: Yi Sun <yi.sun@unisoc.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This is the folio equivalent of F2FS_P_SB(). Removes a call to
page_file_mapping() as we know folios seen by f2fs are never part of
the swap cache.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Now that the crc32() library function takes advantage of
architecture-specific optimizations, it is unnecessary to go through the
crypto API. Just use crc32(). This is much simpler, and it improves
performance due to eliminating the crypto API overhead.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/r/20241202010844.144356-19-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
|
|
Quoted:
"at this time, there are still 1086911 extent nodes in this zombie
extent tree that need to be cleaned up.
crash_arm64_sprd_v8.0.3++> extent_tree.node_cnt ffffff80896cc500
node_cnt = {
counter = 1086911
},
"
As reported by Xiuhong, there will be a huge number of extent nodes
in extent tree, it may potentially cause:
- slab memory fragments
- extreme long time shrink on extent tree
- low mapping efficiency
Let's add a sysfs node to limit max read extent count for each inode,
by default, value of this threshold is 10240, it can be updated
according to user's requirement.
Reported-by: Xiuhong Wang <xiuhong.wang@unisoc.com>
Closes: https://lore.kernel.org/linux-f2fs-devel/20241112110627.1314632-1-xiuhong.wang@unisoc.com/
Signed-off-by: Xiuhong Wang <xiuhong.wang@unisoc.com>
Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Fsync data recovery attempts to check and fix write pointer consistency
of cursegs and all other zones. If the write pointers of cursegs are
unaligned, cursegs are changed to new sections.
If recovery fails, zone write pointers are still checked and fixed,
but the latest checkpoint cannot be written back. Additionally, retry-
mount skips recovery and rolls back to reuse the old cursegs whose
zones are already finished. This can lead to unaligned write later.
This patch addresses the issue by leaving writer pointers untouched if
recovery fails. When retry-mount is performed, cursegs and other zones
are checked and fixed after skipping recovery.
Signed-off-by: Song Feng <songfeng@oppo.com>
Signed-off-by: Yongpeng Yang <yangyongpeng1@oppo.com>
Signed-off-by: Sheng Yong <shengyong@oppo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
additional_reserved_segments was introduced by
commit 300a842937fb ("f2fs: fix to reserve space for IO align feature"),
and its initialization was deleted by
commit 87161a2b0aed ("f2fs: deprecate io_bits").
Signed-off-by: LongPing Wei <weilongping@oppo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
In __get_segment_type(), __get_segment_type_6() may return
CURSEG_COLD_DATA_PINNED or CURSEG_ALL_DATA_ATGC log type, but
following f2fs_get_segment_temp() can only handle persistent
log type, fix it.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
syzbot reports deadlock issue of f2fs as below:
======================================================
WARNING: possible circular locking dependency detected
6.12.0-rc3-syzkaller-00087-gc964ced77262 #0 Not tainted
------------------------------------------------------
kswapd0/79 is trying to acquire lock:
ffff888011824088 (&sbi->sb_lock){++++}-{3:3}, at: f2fs_down_write fs/f2fs/f2fs.h:2199 [inline]
ffff888011824088 (&sbi->sb_lock){++++}-{3:3}, at: f2fs_record_stop_reason+0x52/0x1d0 fs/f2fs/super.c:4068
but task is already holding lock:
ffff88804bd92610 (sb_internal#2){.+.+}-{0:0}, at: f2fs_evict_inode+0x662/0x15c0 fs/f2fs/inode.c:842
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (sb_internal#2){.+.+}-{0:0}:
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825
percpu_down_read include/linux/percpu-rwsem.h:51 [inline]
__sb_start_write include/linux/fs.h:1716 [inline]
sb_start_intwrite+0x4d/0x1c0 include/linux/fs.h:1899
f2fs_evict_inode+0x662/0x15c0 fs/f2fs/inode.c:842
evict+0x4e8/0x9b0 fs/inode.c:725
f2fs_evict_inode+0x1a4/0x15c0 fs/f2fs/inode.c:807
evict+0x4e8/0x9b0 fs/inode.c:725
dispose_list fs/inode.c:774 [inline]
prune_icache_sb+0x239/0x2f0 fs/inode.c:963
super_cache_scan+0x38c/0x4b0 fs/super.c:223
do_shrink_slab+0x701/0x1160 mm/shrinker.c:435
shrink_slab+0x1093/0x14d0 mm/shrinker.c:662
shrink_one+0x43b/0x850 mm/vmscan.c:4818
shrink_many mm/vmscan.c:4879 [inline]
lru_gen_shrink_node mm/vmscan.c:4957 [inline]
shrink_node+0x3799/0x3de0 mm/vmscan.c:5937
kswapd_shrink_node mm/vmscan.c:6765 [inline]
balance_pgdat mm/vmscan.c:6957 [inline]
kswapd+0x1ca3/0x3700 mm/vmscan.c:7226
kthread+0x2f0/0x390 kernel/kthread.c:389
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
-> #1 (fs_reclaim){+.+.}-{0:0}:
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825
__fs_reclaim_acquire mm/page_alloc.c:3834 [inline]
fs_reclaim_acquire+0x88/0x130 mm/page_alloc.c:3848
might_alloc include/linux/sched/mm.h:318 [inline]
prepare_alloc_pages+0x147/0x5b0 mm/page_alloc.c:4493
__alloc_pages_noprof+0x16f/0x710 mm/page_alloc.c:4722
alloc_pages_mpol_noprof+0x3e8/0x680 mm/mempolicy.c:2265
alloc_pages_noprof mm/mempolicy.c:2345 [inline]
folio_alloc_noprof+0x128/0x180 mm/mempolicy.c:2352
filemap_alloc_folio_noprof+0xdf/0x500 mm/filemap.c:1010
do_read_cache_folio+0x2eb/0x850 mm/filemap.c:3787
read_mapping_folio include/linux/pagemap.h:1011 [inline]
f2fs_commit_super+0x3c0/0x7d0 fs/f2fs/super.c:4032
f2fs_record_stop_reason+0x13b/0x1d0 fs/f2fs/super.c:4079
f2fs_handle_critical_error+0x2ac/0x5c0 fs/f2fs/super.c:4174
f2fs_write_inode+0x35f/0x4d0 fs/f2fs/inode.c:785
write_inode fs/fs-writeback.c:1503 [inline]
__writeback_single_inode+0x711/0x10d0 fs/fs-writeback.c:1723
writeback_single_inode+0x1f3/0x660 fs/fs-writeback.c:1779
sync_inode_metadata+0xc4/0x120 fs/fs-writeback.c:2849
f2fs_release_file+0xa8/0x100 fs/f2fs/file.c:1941
__fput+0x23f/0x880 fs/file_table.c:431
task_work_run+0x24f/0x310 kernel/task_work.c:228
resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
exit_to_user_mode_loop kernel/entry/common.c:114 [inline]
exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline]
__syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
syscall_exit_to_user_mode+0x168/0x370 kernel/entry/common.c:218
do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89
entry_SYSCALL_64_after_hwframe+0x77/0x7f
-> #0 (&sbi->sb_lock){++++}-{3:3}:
check_prev_add kernel/locking/lockdep.c:3161 [inline]
check_prevs_add kernel/locking/lockdep.c:3280 [inline]
validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3904
__lock_acquire+0x1384/0x2050 kernel/locking/lockdep.c:5202
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825
down_write+0x99/0x220 kernel/locking/rwsem.c:1577
f2fs_down_write fs/f2fs/f2fs.h:2199 [inline]
f2fs_record_stop_reason+0x52/0x1d0 fs/f2fs/super.c:4068
f2fs_handle_critical_error+0x2ac/0x5c0 fs/f2fs/super.c:4174
f2fs_evict_inode+0xa61/0x15c0 fs/f2fs/inode.c:883
evict+0x4e8/0x9b0 fs/inode.c:725
f2fs_evict_inode+0x1a4/0x15c0 fs/f2fs/inode.c:807
evict+0x4e8/0x9b0 fs/inode.c:725
dispose_list fs/inode.c:774 [inline]
prune_icache_sb+0x239/0x2f0 fs/inode.c:963
super_cache_scan+0x38c/0x4b0 fs/super.c:223
do_shrink_slab+0x701/0x1160 mm/shrinker.c:435
shrink_slab+0x1093/0x14d0 mm/shrinker.c:662
shrink_one+0x43b/0x850 mm/vmscan.c:4818
shrink_many mm/vmscan.c:4879 [inline]
lru_gen_shrink_node mm/vmscan.c:4957 [inline]
shrink_node+0x3799/0x3de0 mm/vmscan.c:5937
kswapd_shrink_node mm/vmscan.c:6765 [inline]
balance_pgdat mm/vmscan.c:6957 [inline]
kswapd+0x1ca3/0x3700 mm/vmscan.c:7226
kthread+0x2f0/0x390 kernel/kthread.c:389
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
other info that might help us debug this:
Chain exists of:
&sbi->sb_lock --> fs_reclaim --> sb_internal#2
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
rlock(sb_internal#2);
lock(fs_reclaim);
lock(sb_internal#2);
lock(&sbi->sb_lock);
Root cause is there will be potential deadlock in between
below tasks:
Thread A Kswapd
- f2fs_ioc_commit_atomic_write
- mnt_want_write_file -- down_read lock A
- balance_pgdat
- __fs_reclaim_acquire -- lock B
- shrink_node
- prune_icache_sb
- dispose_list
- f2fs_evict_inode
- sb_start_intwrite -- down_read lock A
- f2fs_do_sync_file
- f2fs_write_inode
- f2fs_handle_critical_error
- f2fs_record_stop_reason
- f2fs_commit_super
- read_mapping_folio
- filemap_alloc_folio_noprof
- fs_reclaim_acquire -- lock B
Both threads try to acquire read lock of lock A, then its upcoming write
lock grabber will trigger deadlock.
Let's always create an asynchronous task in f2fs_handle_critical_error()
rather than calling f2fs_record_stop_reason() synchronously to avoid
this potential deadlock issue.
Fixes: b62e71be2110 ("f2fs: support errors=remount-ro|continue|panic mountoption")
Reported-by: syzbot+be4a9983e95a5e25c8d3@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/6704d667.050a0220.1e4d62.0081.GAE@google.com
Signed-off-by: Chao Yu <chao@kernel.org>
Reviewed-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|