Age | Commit message (Collapse) | Author |
|
In the case of the following call stack for an atomic file,
FI_DIRTY_INODE is set, but FI_ATOMIC_DIRTIED is not subsequently set.
f2fs_file_write_iter
f2fs_map_blocks
f2fs_reserve_new_blocks
inc_valid_block_count
__mark_inode_dirty(dquot)
f2fs_dirty_inode
If FI_ATOMIC_DIRTIED is not set, atomic file can encounter corruption
due to a mismatch between old file size and new data.
To resolve this issue, I changed to set FI_ATOMIC_DIRTIED when
FI_DIRTY_INODE is set. This ensures that FI_DIRTY_INODE, which was
previously cleared by the Writeback thread during the commit atomic, is
set and i_size is updated.
Cc: <stable@vger.kernel.org>
Fixes: fccaa81de87e ("f2fs: prevent atomic file from being dirtied before commit")
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Sunmin Jeong <s_min.jeong@samsung.com>
Signed-off-by: Yeongjin Gil <youngjin.gil@samsung.com>
Reviewed-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
syzbot reports a f2fs bug as below:
F2FS-fs (loop3): Stopped filesystem due to reason: 7
kworker/u8:7: attempt to access beyond end of device
BUG: unable to handle page fault for address: ffffed1604ea3dfa
RIP: 0010:get_ckpt_valid_blocks fs/f2fs/segment.h:361 [inline]
RIP: 0010:has_curseg_enough_space fs/f2fs/segment.h:570 [inline]
RIP: 0010:__get_secs_required fs/f2fs/segment.h:620 [inline]
RIP: 0010:has_not_enough_free_secs fs/f2fs/segment.h:633 [inline]
RIP: 0010:has_enough_free_secs+0x575/0x1660 fs/f2fs/segment.h:649
<TASK>
f2fs_is_checkpoint_ready fs/f2fs/segment.h:671 [inline]
f2fs_write_inode+0x425/0x540 fs/f2fs/inode.c:791
write_inode fs/fs-writeback.c:1525 [inline]
__writeback_single_inode+0x708/0x10d0 fs/fs-writeback.c:1745
writeback_sb_inodes+0x820/0x1360 fs/fs-writeback.c:1976
wb_writeback+0x413/0xb80 fs/fs-writeback.c:2156
wb_do_writeback fs/fs-writeback.c:2303 [inline]
wb_workfn+0x410/0x1080 fs/fs-writeback.c:2343
process_one_work kernel/workqueue.c:3236 [inline]
process_scheduled_works+0xa66/0x1840 kernel/workqueue.c:3317
worker_thread+0x870/0xd30 kernel/workqueue.c:3398
kthread+0x7a9/0x920 kernel/kthread.c:464
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:148
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
Commit 8b10d3653735 ("f2fs: introduce FAULT_NO_SEGMENT") allows to trigger
no free segment fault in allocator, then it will update curseg->segno to
NULL_SEGNO, though, CP_ERROR_FLAG has been set, f2fs_write_inode() missed
to check the flag, and access invalid curseg->segno directly in below call
path, then resulting in panic:
- f2fs_write_inode
- f2fs_is_checkpoint_ready
- has_enough_free_secs
- has_not_enough_free_secs
- __get_secs_required
- has_curseg_enough_space
- get_ckpt_valid_blocks
: access invalid curseg->segno
To avoid this issue, let's:
- check CP_ERROR_FLAG flag in prior to f2fs_is_checkpoint_ready() in
f2fs_write_inode().
- in has_curseg_enough_space(), save curseg->segno into a temp variable,
and verify its validation before use.
Fixes: 8b10d3653735 ("f2fs: introduce FAULT_NO_SEGMENT")
Reported-by: syzbot+b6b347b7a4ea1b2e29b6@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/67973c2b.050a0220.11b1bb.0089.GAE@google.com
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This patch introduces a new wrapper f2fs_get_inode_page(), then, caller
can use it to load inode block to page cache, meanwhile it will do sanity
check on inode footer.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This patch records POSIX_FADV_NOREUSE ranges for users to reclaim the caches
instantly off from LRU.
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
F2FS-fs (dm-105): inconsistent node block, nid:430, node_footer[nid:2198964142,ino:598252782,ofs:118300154,cpver:5409237455940746069,blkaddr:2125070942]
F2FS-fs (dm-105): inconsistent node block, nid:430, node_footer[nid:2198964142,ino:598252782,ofs:118300154,cpver:5409237455940746069,blkaddr:2125070942]
F2FS-fs (dm-105): inconsistent node block, nid:430, node_footer[nid:2198964142,ino:598252782,ofs:118300154,cpver:5409237455940746069,blkaddr:2125070942]
F2FS-fs (dm-105): inconsistent node block, nid:430, node_footer[nid:2198964142,ino:598252782,ofs:118300154,cpver:5409237455940746069,blkaddr:2125070942]
F2FS-fs (dm-105): inconsistent node block, nid:430, node_footer[nid:2198964142,ino:598252782,ofs:118300154,cpver:5409237455940746069,blkaddr:2125070942]
F2FS-fs (dm-105): inconsistent node block, nid:430, node_footer[nid:2198964142,ino:598252782,ofs:118300154,cpver:5409237455940746069,blkaddr:2125070942]
F2FS-fs (dm-105): inconsistent node block, nid:430, node_footer[nid:2198964142,ino:598252782,ofs:118300154,cpver:5409237455940746069,blkaddr:2125070942]
F2FS-fs (dm-105): inconsistent node block, nid:430, node_footer[nid:2198964142,ino:598252782,ofs:118300154,cpver:5409237455940746069,blkaddr:2125070942]
F2FS-fs (dm-105): inconsistent node block, nid:430, node_footer[nid:2198964142,ino:598252782,ofs:118300154,cpver:5409237455940746069,blkaddr:2125070942]
F2FS-fs (dm-105): inconsistent node block, nid:430, node_footer[nid:2198964142,ino:598252782,ofs:118300154,cpver:5409237455940746069,blkaddr:2125070942]
If node block is loaded successfully, but its content is inconsistent, it
doesn't need to retry IO.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
When building for 32-bit platforms, for which 'size_t' is 'unsigned int',
there is a warning due to an incorrect format specifier:
fs/f2fs/inode.c:320:6: error: format specifies type 'unsigned long' but the argument has type 'unsigned int' [-Werror,-Wformat]
318 | f2fs_warn(sbi, "%s: inode (ino=%lx) has corrupted i_inline_xattr_size: %d, min: %lu, max: %lu",
| ~~~
| %u
319 | __func__, inode->i_ino, fi->i_inline_xattr_size,
320 | MIN_INLINE_XATTR_SIZE, MAX_INLINE_XATTR_SIZE);
| ^~~~~~~~~~~~~~~~~~~~~
fs/f2fs/f2fs.h:1855:46: note: expanded from macro 'f2fs_warn'
1855 | f2fs_printk(sbi, false, KERN_WARNING fmt, ##__VA_ARGS__)
| ~~~ ^~~~~~~~~~~
fs/f2fs/xattr.h:86:31: note: expanded from macro 'MIN_INLINE_XATTR_SIZE'
86 | #define MIN_INLINE_XATTR_SIZE (sizeof(struct f2fs_xattr_header) / sizeof(__le32))
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Use the format specifier for 'size_t', '%zu', to resolve the warning.
Fixes: 5c1768b67250 ("f2fs: fix to do sanity check correctly on i_inline_xattr_size")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
syzbot reported an out-of-range access issue as below:
UBSAN: array-index-out-of-bounds in fs/f2fs/f2fs.h:3292:19
index 18446744073709550491 is out of range for type '__le32[923]' (aka 'unsigned int[923]')
CPU: 0 UID: 0 PID: 5338 Comm: syz.0.0 Not tainted 6.12.0-syzkaller-10689-g7af08b57bcb9 #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
ubsan_epilogue lib/ubsan.c:231 [inline]
__ubsan_handle_out_of_bounds+0x121/0x150 lib/ubsan.c:429
read_inline_xattr+0x273/0x280
lookup_all_xattrs fs/f2fs/xattr.c:341 [inline]
f2fs_getxattr+0x57b/0x13b0 fs/f2fs/xattr.c:533
vfs_getxattr_alloc+0x472/0x5c0 fs/xattr.c:393
ima_read_xattr+0x38/0x60 security/integrity/ima/ima_appraise.c:229
process_measurement+0x117a/0x1fb0 security/integrity/ima/ima_main.c:353
ima_file_check+0xd9/0x120 security/integrity/ima/ima_main.c:572
security_file_post_open+0xb9/0x280 security/security.c:3121
do_open fs/namei.c:3830 [inline]
path_openat+0x2ccd/0x3590 fs/namei.c:3987
do_file_open_root+0x3a7/0x720 fs/namei.c:4039
file_open_root+0x247/0x2a0 fs/open.c:1382
do_handle_open+0x85b/0x9d0 fs/fhandle.c:414
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
index: 18446744073709550491 (decimal, unsigned long long)
= 0xfffffffffffffb9b (hexadecimal) = -1125 (decimal, long long)
UBSAN detects that inline_xattr_addr() tries to access .i_addr[-1125].
w/ below testcase, it can reproduce this bug easily:
- mkfs.f2fs -f -O extra_attr,flexible_inline_xattr /dev/sdb
- mount -o inline_xattr_size=512 /dev/sdb /mnt/f2fs
- touch /mnt/f2fs/file
- umount /mnt/f2fs
- inject.f2fs --node --mb i_inline --nid 4 --val 0x1 /dev/sdb
- inject.f2fs --node --mb i_inline_xattr_size --nid 4 --val 2048 /dev/sdb
- mount /dev/sdb /mnt/f2fs
- getfattr /mnt/f2fs/file
The root cause is if metadata of filesystem and inode were fuzzed as below:
- extra_attr feature is enabled
- flexible_inline_xattr feature is enabled
- ri.i_inline_xattr_size = 2048
- F2FS_EXTRA_ATTR bit in ri.i_inline was not set
sanity_check_inode() will skip doing sanity check on fi->i_inline_xattr_size,
result in using invalid inline_xattr_size later incorrectly, fix it.
Meanwhile, let's fix to check lower boundary for .i_inline_xattr_size w/
MIN_INLINE_XATTR_SIZE like we did in parse_options().
There is a related issue reported by syzbot, Qasim Ijaz has anlyzed and
fixed it w/ very similar way [1], as discussed, we all agree that it will
be better to do sanity check in sanity_check_inode() for fix, so finally,
let's fix these two related bugs w/ current patch.
Including commit message from Qasim's patch as below, thanks a lot for
his contribution.
"In f2fs_getxattr(), the function lookup_all_xattrs() allocates a 12-byte
(base_size) buffer for an inline extended attribute. However, when
__find_inline_xattr() calls __find_xattr(), it uses the macro
"list_for_each_xattr(entry, addr)", which starts by calling
XATTR_FIRST_ENTRY(addr). This skips a 24-byte struct f2fs_xattr_header
at the beginning of the buffer, causing an immediate out-of-bounds read
in a 12-byte allocation. The subsequent !IS_XATTR_LAST_ENTRY(entry)
check then dereferences memory outside the allocated region, triggering
the slab-out-of bounds read.
This patch prevents the out-of-bounds read by adding a check to bail
out early if inline_size is too small and does not account for the
header plus the 4-byte value that IS_XATTR_LAST_ENTRY reads."
[1]: https://lore.kernel.org/linux-f2fs-devel/Z32y1rfBY9Qb5ZjM@qasdev.system/
Fixes: 6afc662e68b5 ("f2fs: support flexible inline xattr size")
Reported-by: syzbot+69f5379a1717a0b982a1@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-f2fs-devel/674f4e7d.050a0220.17bd51.004f.GAE@google.com
Reported-by: syzbot <syzbot+f5e74075e096e757bdbf@syzkaller.appspotmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=f5e74075e096e757bdbf
Tested-by: syzbot <syzbot+f5e74075e096e757bdbf@syzkaller.appspotmail.com>
Tested-by: Qasim Ijaz <qasdev00@gmail.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
F2FS should understand how the device aliasing file works and support
deleting the file after use. A device aliasing file can be created by
mkfs.f2fs tool and it can map the whole device with an extent, not
using node blocks. The file space should be pinned and normally used for
read-only usages.
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
creating a large files during checkpoint disable until it runs out of
space and then delete it, then remount to enable checkpoint again, and
then unmount the filesystem triggers the f2fs_bug_on as below:
------------[ cut here ]------------
kernel BUG at fs/f2fs/inode.c:896!
CPU: 2 UID: 0 PID: 1286 Comm: umount Not tainted 6.11.0-rc7-dirty #360
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
RIP: 0010:f2fs_evict_inode+0x58c/0x610
Call Trace:
__die_body+0x15/0x60
die+0x33/0x50
do_trap+0x10a/0x120
f2fs_evict_inode+0x58c/0x610
do_error_trap+0x60/0x80
f2fs_evict_inode+0x58c/0x610
exc_invalid_op+0x53/0x60
f2fs_evict_inode+0x58c/0x610
asm_exc_invalid_op+0x16/0x20
f2fs_evict_inode+0x58c/0x610
evict+0x101/0x260
dispose_list+0x30/0x50
evict_inodes+0x140/0x190
generic_shutdown_super+0x2f/0x150
kill_block_super+0x11/0x40
kill_f2fs_super+0x7d/0x140
deactivate_locked_super+0x2a/0x70
cleanup_mnt+0xb3/0x140
task_work_run+0x61/0x90
The root cause is: creating large files during disable checkpoint
period results in not enough free segments, so when writing back root
inode will failed in f2fs_enable_checkpoint. When umount the file
system after enabling checkpoint, the root inode is dirty in
f2fs_evict_inode function, which triggers BUG_ON. The steps to
reproduce are as follows:
dd if=/dev/zero of=f2fs.img bs=1M count=55
mount f2fs.img f2fs_dir -o checkpoint=disable:10%
dd if=/dev/zero of=big bs=1M count=50
sync
rm big
mount -o remount,checkpoint=enable f2fs_dir
umount f2fs_dir
Let's redirty inode when there is not free segments during checkpoint
is disable.
Signed-off-by: Qi Han <hanqi@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Keep atomic file clean while updating and make it dirtied during commit
in order to avoid unnecessary and excessive inode updates in the previous
fix.
Fixes: 4bf78322346f ("f2fs: mark inode dirty for FI_ATOMIC_COMMITTED flag")
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Convert to use folio, so that we can get rid of 'page->index' to
prepare for removal of 'index' field in structure page [1].
[1] https://lore.kernel.org/all/Zp8fgUSIBGQ1TN0D@casper.infradead.org/
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Chao Yu <chao@kernel.org>
Reviewed-by: Li Zetao <lizetao1@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Convert to use folio and related functionality.
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Use temporary variable instead of F2FS_I() for cleanup.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
In case of the COW file, new updates and GC writes are already
separated to page caches of the atomic file and COW file. As some cases
that use the meta inode for GC, there are some race issues between a
foreground thread and GC thread.
To handle them, we need to take care when to invalidate and wait
writeback of GC pages in COW files as the case of using the meta inode.
Also, a pointer from the COW inode to the original inode is required to
check the state of original pages.
For the former, we can solve the problem by using the meta inode for GC
of COW files. Then let's get a page from the original inode in
move_data_block when GCing the COW file to avoid race condition.
Fixes: 3db1de0e582c ("f2fs: change the current atomic write way")
Cc: stable@vger.kernel.org #v5.19+
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Yeongjin Gil <youngjin.gil@samsung.com>
Signed-off-by: Sunmin Jeong <s_min.jeong@samsung.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Commit f240d3aaf5a1 ("f2fs: do more sanity check on inode") missed
to remove redundant sanity check on flexible_inline_xattr flag, fix
it.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
chenyuwen reports a f2fs bug as below:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000011
fscrypt_set_bio_crypt_ctx+0x78/0x1e8
f2fs_grab_read_bio+0x78/0x208
f2fs_submit_page_read+0x44/0x154
f2fs_get_read_data_page+0x288/0x5f4
f2fs_get_lock_data_page+0x60/0x190
truncate_partial_data_page+0x108/0x4fc
f2fs_do_truncate_blocks+0x344/0x5f0
f2fs_truncate_blocks+0x6c/0x134
f2fs_truncate+0xd8/0x200
f2fs_iget+0x20c/0x5ac
do_garbage_collect+0x5d0/0xf6c
f2fs_gc+0x22c/0x6a4
f2fs_disable_checkpoint+0xc8/0x310
f2fs_fill_super+0x14bc/0x1764
mount_bdev+0x1b4/0x21c
f2fs_mount+0x20/0x30
legacy_get_tree+0x50/0xbc
vfs_get_tree+0x5c/0x1b0
do_new_mount+0x298/0x4cc
path_mount+0x33c/0x5fc
__arm64_sys_mount+0xcc/0x15c
invoke_syscall+0x60/0x150
el0_svc_common+0xb8/0xf8
do_el0_svc+0x28/0xa0
el0_svc+0x24/0x84
el0t_64_sync_handler+0x88/0xec
It is because inode.i_crypt_info is not initialized during below path:
- mount
- f2fs_fill_super
- f2fs_disable_checkpoint
- f2fs_gc
- f2fs_iget
- f2fs_truncate
So, let's relocate truncation of preallocated blocks to f2fs_file_open(),
after fscrypt_file_open().
Fixes: d4dd19ec1ea0 ("f2fs: do not expose unwritten blocks to user by DIO")
Reported-by: chenyuwen <yuwen.chen@xjmz.com>
Closes: https://lore.kernel.org/linux-kernel/20240517085327.1188515-1-yuwen.chen@xjmz.com
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
syzbot reports a f2fs bug as below:
BUG: KASAN: slab-use-after-free in sanity_check_extent_cache+0x370/0x410 fs/f2fs/extent_cache.c:46
Read of size 4 at addr ffff8880739ab220 by task syz-executor200/5097
CPU: 0 PID: 5097 Comm: syz-executor200 Not tainted 6.9.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
print_address_description mm/kasan/report.c:377 [inline]
print_report+0x169/0x550 mm/kasan/report.c:488
kasan_report+0x143/0x180 mm/kasan/report.c:601
sanity_check_extent_cache+0x370/0x410 fs/f2fs/extent_cache.c:46
do_read_inode fs/f2fs/inode.c:509 [inline]
f2fs_iget+0x33e1/0x46e0 fs/f2fs/inode.c:560
f2fs_nfs_get_inode+0x74/0x100 fs/f2fs/super.c:3237
generic_fh_to_dentry+0x9f/0xf0 fs/libfs.c:1413
exportfs_decode_fh_raw+0x152/0x5f0 fs/exportfs/expfs.c:444
exportfs_decode_fh+0x3c/0x80 fs/exportfs/expfs.c:584
do_handle_to_path fs/fhandle.c:155 [inline]
handle_to_path fs/fhandle.c:210 [inline]
do_handle_open+0x495/0x650 fs/fhandle.c:226
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
We missed to cover sanity_check_extent_cache() w/ extent cache lock,
so, below race case may happen, result in use after free issue.
- f2fs_iget
- do_read_inode
- f2fs_init_read_extent_tree
: add largest extent entry in to cache
- shrink
- f2fs_shrink_read_extent_tree
- __shrink_extent_tree
- __detach_extent_node
: drop largest extent entry
- sanity_check_extent_cache
: access et->largest w/o lock
let's refactor sanity_check_extent_cache() to avoid extent cache access
and call it before f2fs_init_read_extent_tree() to fix this issue.
Reported-by: syzbot+74ebe2104433e9dc610d@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-f2fs-devel/00000000000009beea061740a531@google.com
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
syzbot reports f2fs bug as below:
kernel BUG at fs/f2fs/inode.c:933!
RIP: 0010:f2fs_evict_inode+0x1576/0x1590 fs/f2fs/inode.c:933
Call Trace:
evict+0x2a4/0x620 fs/inode.c:664
dispose_list fs/inode.c:697 [inline]
evict_inodes+0x5f8/0x690 fs/inode.c:747
generic_shutdown_super+0x9d/0x2c0 fs/super.c:675
kill_block_super+0x44/0x90 fs/super.c:1667
kill_f2fs_super+0x303/0x3b0 fs/f2fs/super.c:4894
deactivate_locked_super+0xc1/0x130 fs/super.c:484
cleanup_mnt+0x426/0x4c0 fs/namespace.c:1256
task_work_run+0x24a/0x300 kernel/task_work.c:180
ptrace_notify+0x2cd/0x380 kernel/signal.c:2399
ptrace_report_syscall include/linux/ptrace.h:411 [inline]
ptrace_report_syscall_exit include/linux/ptrace.h:473 [inline]
syscall_exit_work kernel/entry/common.c:251 [inline]
syscall_exit_to_user_mode_prepare kernel/entry/common.c:278 [inline]
__syscall_exit_to_user_mode_work kernel/entry/common.c:283 [inline]
syscall_exit_to_user_mode+0x15c/0x280 kernel/entry/common.c:296
do_syscall_64+0x50/0x110 arch/x86/entry/common.c:88
entry_SYSCALL_64_after_hwframe+0x63/0x6b
The root cause is:
- do_sys_open
- f2fs_lookup
- __f2fs_find_entry
- f2fs_i_depth_write
- f2fs_mark_inode_dirty_sync
- f2fs_dirty_inode
- set_inode_flag(inode, FI_DIRTY_INODE)
- umount
- kill_f2fs_super
- kill_block_super
- generic_shutdown_super
- sync_filesystem
: sb is readonly, skip sync_filesystem()
- evict_inodes
- iput
- f2fs_evict_inode
- f2fs_bug_on(sbi, is_inode_flag_set(inode, FI_DIRTY_INODE))
: trigger kernel panic
When we try to repair i_current_depth in readonly filesystem, let's
skip dirty inode to avoid panic in later f2fs_evict_inode().
Cc: stable@vger.kernel.org
Reported-by: syzbot+31e4659a3fe953aec2f4@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-f2fs-devel/000000000000e890bc0609a55cff@google.com
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
inode can be fuzzed, so it can has F2FS_INLINE_DATA flag and valid
i_blocks/i_nid value, this patch supports to do extra sanity check
to detect such corrupted state.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
After commit 3db1de0e582c ("f2fs: change the current atomic write way"),
we removed all GC_FAILURE_ATOMIC usage, let's change i_gc_failures[]
array to i_pin_failure for cleanup.
Meanwhile, let's define i_current_depth and i_gc_failures as union
variable due to they won't be valid at the same time.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
syzbot reports a kernel bug as below:
F2FS-fs (loop0): Mounted with checkpoint version = 48b305e4
==================================================================
BUG: KASAN: slab-out-of-bounds in f2fs_test_bit fs/f2fs/f2fs.h:2933 [inline]
BUG: KASAN: slab-out-of-bounds in current_nat_addr fs/f2fs/node.h:213 [inline]
BUG: KASAN: slab-out-of-bounds in f2fs_get_node_info+0xece/0x1200 fs/f2fs/node.c:600
Read of size 1 at addr ffff88807a58c76c by task syz-executor280/5076
CPU: 1 PID: 5076 Comm: syz-executor280 Not tainted 6.9.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
print_address_description mm/kasan/report.c:377 [inline]
print_report+0x169/0x550 mm/kasan/report.c:488
kasan_report+0x143/0x180 mm/kasan/report.c:601
f2fs_test_bit fs/f2fs/f2fs.h:2933 [inline]
current_nat_addr fs/f2fs/node.h:213 [inline]
f2fs_get_node_info+0xece/0x1200 fs/f2fs/node.c:600
f2fs_xattr_fiemap fs/f2fs/data.c:1848 [inline]
f2fs_fiemap+0x55d/0x1ee0 fs/f2fs/data.c:1925
ioctl_fiemap fs/ioctl.c:220 [inline]
do_vfs_ioctl+0x1c07/0x2e50 fs/ioctl.c:838
__do_sys_ioctl fs/ioctl.c:902 [inline]
__se_sys_ioctl+0x81/0x170 fs/ioctl.c:890
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
The root cause is we missed to do sanity check on i_xattr_nid during
f2fs_iget(), so that in fiemap() path, current_nat_addr() will access
nat_bitmap w/ offset from invalid i_xattr_nid, result in triggering
kasan bug report, fix it.
Reported-and-tested-by: syzbot+3694e283cf5c40df6d14@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-f2fs-devel/00000000000094036c0616e72a1d@google.com
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Let's convert PageWriteback to folio_test_writeback.
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
If f2fs_evict_inode is called between freeze_super and thaw_super, the
s_writer rwsem count may become negative, resulting in hang.
CPU1 CPU2
f2fs_resize_fs() f2fs_evict_inode()
f2fs_freeze
set SBI_IS_FREEZING
skip sb_start_intwrite
f2fs_unfreeze
clear SBI_IS_FREEZING
sb_end_intwrite
To solve this problem, the call to sb_end_write is determined by whether
sb_start_intwrite is called, rather than the current freezing status.
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Sunmin Jeong <s_min.jeong@samsung.com>
Signed-off-by: Yeongjin Gil <youngjin.gil@samsung.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs update from Jaegeuk Kim:
"In this series, we've some progress to support Zoned block device
regarding to the power-cut recovery flow and enabling
checkpoint=disable feature which is essential for Android OTA.
Other than that, some patches touched sysfs entries and tracepoints
which are minor, while several bug fixes on error handlers and
compression flows are good to improve the overall stability.
Enhancements:
- enable checkpoint=disable for zoned block device
- sysfs entries such as discard status, discard_io_aware, dir_level
- tracepoints such as f2fs_vm_page_mkwrite(), f2fs_rename(),
f2fs_new_inode()
- use shared inode lock during f2fs_fiemap() and f2fs_seek_block()
Bug fixes:
- address some power-cut recovery issues on zoned block device
- handle errors and logics on do_garbage_collect(),
f2fs_reserve_new_block(), f2fs_move_file_range(),
f2fs_recover_xattr_data()
- don't set FI_PREALLOCATED_ALL for partial write
- fix to update iostat correctly in f2fs_filemap_fault()
- fix to wait on block writeback for post_read case
- fix to tag gcing flag on page during block migration
- restrict max filesize for 16K f2fs
- fix to avoid dirent corruption
- explicitly null-terminate the xattr list
There are also several clean-up patches to remove dead codes and
better readability"
* tag 'f2fs-for-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (33 commits)
f2fs: show more discard status by sysfs
f2fs: Add error handling for negative returns from do_garbage_collect
f2fs: Constrain the modification range of dir_level in the sysfs
f2fs: Use wait_event_freezable_timeout() for freezable kthread
f2fs: fix to check return value of f2fs_recover_xattr_data
f2fs: don't set FI_PREALLOCATED_ALL for partial write
f2fs: fix to update iostat correctly in f2fs_filemap_fault()
f2fs: fix to check compress file in f2fs_move_file_range()
f2fs: fix to wait on block writeback for post_read case
f2fs: fix to tag gcing flag on page during block migration
f2fs: add tracepoint for f2fs_vm_page_mkwrite()
f2fs: introduce f2fs_invalidate_internal_cache() for cleanup
f2fs: update blkaddr in __set_data_blkaddr() for cleanup
f2fs: introduce get_dnode_addr() to clean up codes
f2fs: delete obsolete FI_DROP_CACHE
f2fs: delete obsolete FI_FIRST_BLOCK_WRITTEN
f2fs: Restrict max filesize for 16K f2fs
f2fs: let's finish or reset zones all the time
f2fs: check write pointers when checkpoint=disable
f2fs: fix write pointers on zoned device after roll forward
...
|
|
Just cleanup, no logic changes.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Commit 3c6c2bebef79 ("f2fs: avoid punch_hole overhead when releasing
volatile data") introduced FI_FIRST_BLOCK_WRITTEN as below reason:
This patch is to avoid some punch_hole overhead when releasing volatile
data. If volatile data was not written yet, we just can make the first
page as zero.
After commit 7bc155fec5b3 ("f2fs: kill volatile write support"), we
won't support volatile write, but it missed to remove obsolete
FI_FIRST_BLOCK_WRITTEN, delete it in this patch.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
There were already assertions that we were not passing a tail page to
error_remove_page(), so make the compiler enforce that by converting
everything to pass and use a folio.
Link: https://lkml.kernel.org/r/20231117161447.2461643-7-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"In this cycle, we introduce a bigger page size support by changing the
internal f2fs's block size aligned to the page size. We also continue
to improve zoned block device support regarding the power off
recovery. As usual, there are some bug fixes regarding the error
handling routines in compression and ioctl.
Enhancements:
- Support Block Size == Page Size
- let f2fs_precache_extents() traverses in file range
- stop iterating f2fs_map_block if hole exists
- preload extent_cache for POSIX_FADV_WILLNEED
- compress: fix to avoid fragment w/ OPU during f2fs_ioc_compress_file()
Bug fixes:
- do not return EFSCORRUPTED, but try to run online repair
- finish previous checkpoints before returning from remount
- fix error handling of __get_node_page and __f2fs_build_free_nids
- clean up zones when not successfully unmounted
- fix to initialize map.m_pblk in f2fs_precache_extents()
- fix to drop meta_inode's page cache in f2fs_put_super()
- set the default compress_level on ioctl
- fix to avoid use-after-free on dic
- fix to avoid redundant compress extension
- do sanity check on cluster when CONFIG_F2FS_CHECK_FS is on
- fix deadloop in f2fs_write_cache_pages()"
* tag 'f2fs-for-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs:
f2fs: finish previous checkpoints before returning from remount
f2fs: fix error handling of __get_node_page
f2fs: do not return EFSCORRUPTED, but try to run online repair
f2fs: fix error path of __f2fs_build_free_nids
f2fs: Clean up errors in segment.h
f2fs: clean up zones when not successfully unmounted
f2fs: let f2fs_precache_extents() traverses in file range
f2fs: avoid format-overflow warning
f2fs: fix to initialize map.m_pblk in f2fs_precache_extents()
f2fs: Support Block Size == Page Size
f2fs: stop iterating f2fs_map_block if hole exists
f2fs: preload extent_cache for POSIX_FADV_WILLNEED
f2fs: set the default compress_level on ioctl
f2fs: compress: fix to avoid fragment w/ OPU during f2fs_ioc_compress_file()
f2fs: fix to drop meta_inode's page cache in f2fs_put_super()
f2fs: split initial and dynamic conditions for extent_cache
f2fs: compress: fix to avoid redundant compress extension
f2fs: compress: do sanity check on cluster when CONFIG_F2FS_CHECK_FS is on
f2fs: compress: fix to avoid use-after-free on dic
f2fs: compress: fix deadloop in f2fs_write_cache_pages()
|
|
Convert to using the new inode timestamp accessor functions.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/20231004185347.80880-34-jlayton@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
This allows f2fs to support cases where the block size = page size for
both 4K and 16K block sizes. Other sizes should work as well, should the
need arise. This does not currently support 4K Block size filesystems if
the page size is 16K.
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"In this cycle, we don't have a highlighted feature enhancement, but
mostly have fixed issues mainly in two parts: 1) zoned block device,
and 2) compression support.
For zoned block device, we've tried to improve the power-off recovery
flow as much as possible. For compression, we found some corner cases
caused by wrong compression policy and logics. Other than them, there
were some reverts and stat corrections.
Bug fixes:
- use finish zone command when closing a zone
- check zone type before sending async reset zone command
- fix to assign compress_level for lz4 correctly
- fix error path of f2fs_submit_page_read()
- don't {,de}compress non-full cluster
- send small discard commands during checkpoint back
- flush inode if atomic file is aborted
- correct to account gc/cp stats
And, there are minor bug fixes, avoiding false lockdep warning, and
clean-ups"
* tag 'f2fs-for-6-6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (25 commits)
f2fs: use finish zone command when closing a zone
f2fs: compress: fix to assign compress_level for lz4 correctly
f2fs: fix error path of f2fs_submit_page_read()
f2fs: clean up error handling in sanity_check_{compress_,}inode()
f2fs: avoid false alarm of circular locking
Revert "f2fs: do not issue small discard commands during checkpoint"
f2fs: doc: fix description of max_small_discards
f2fs: should update REQ_TIME for direct write
f2fs: fix to account cp stats correctly
f2fs: fix to account gc stats correctly
f2fs: remove unneeded check condition in __f2fs_setxattr()
f2fs: fix to update i_ctime in __f2fs_setxattr()
Revert "f2fs: fix to do sanity check on extent cache correctly"
f2fs: increase usage of folio_next_index() helper
f2fs: Only lfs mode is allowed with zoned block device feature
f2fs: check zone type before sending async reset zone command
f2fs: compress: don't {,de}compress non-full cluster
f2fs: allow f2fs_ioc_{,de}compress_file to be interrupted
f2fs: don't reopen the main block device in f2fs_scan_devices
f2fs: fix to avoid mmap vs set_compress_option case
...
|
|
In sanity_check_{compress_,}inode(), it doesn't need to set SBI_NEED_FSCK
in each error case, instead, we can set the flag in do_read_inode() only
once when sanity_check_inode() fails.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
syzbot reports a f2fs bug as below:
UBSAN: array-index-out-of-bounds in fs/f2fs/f2fs.h:3275:19
index 1409 is out of range for type '__le32[923]' (aka 'unsigned int[923]')
Call Trace:
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1e7/0x2d0 lib/dump_stack.c:106
ubsan_epilogue lib/ubsan.c:217 [inline]
__ubsan_handle_out_of_bounds+0x11c/0x150 lib/ubsan.c:348
inline_data_addr fs/f2fs/f2fs.h:3275 [inline]
__recover_inline_status fs/f2fs/inode.c:113 [inline]
do_read_inode fs/f2fs/inode.c:480 [inline]
f2fs_iget+0x4730/0x48b0 fs/f2fs/inode.c:604
f2fs_fill_super+0x640e/0x80c0 fs/f2fs/super.c:4601
mount_bdev+0x276/0x3b0 fs/super.c:1391
legacy_get_tree+0xef/0x190 fs/fs_context.c:611
vfs_get_tree+0x8c/0x270 fs/super.c:1519
do_new_mount+0x28f/0xae0 fs/namespace.c:3335
do_mount fs/namespace.c:3675 [inline]
__do_sys_mount fs/namespace.c:3884 [inline]
__se_sys_mount+0x2d9/0x3c0 fs/namespace.c:3861
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
The issue was bisected to:
commit d48a7b3a72f121655d95b5157c32c7d555e44c05
Author: Chao Yu <chao@kernel.org>
Date: Mon Jan 9 03:49:20 2023 +0000
f2fs: fix to do sanity check on extent cache correctly
The root cause is we applied both v1 and v2 of the patch, v2 is the right
fix, so it needs to revert v1 in order to fix reported issue.
v1:
commit d48a7b3a72f1 ("f2fs: fix to do sanity check on extent cache correctly")
https://lore.kernel.org/lkml/20230109034920.492914-1-chao@kernel.org/
v2:
commit 269d11948100 ("f2fs: fix to do sanity check on extent cache correctly")
https://lore.kernel.org/lkml/20230207134808.1827869-1-chao@kernel.org/
Reported-by: syzbot+601018296973a481f302@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-f2fs-devel/000000000000fcf0690600e4d04d@google.com/
Fixes: d48a7b3a72f1 ("f2fs: fix to do sanity check on extent cache correctly")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
In later patches, we're going to change how the inode's ctime field is
used. Switch to using accessor functions instead of raw accesses of
inode->i_ctime.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Message-Id: <20230705190309.579783-41-jlayton@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
There are several issues in sanity_check_inode():
- The code looks not clean, it checks extra_attr related condition
dispersively.
- It missed to check i_extra_isize w/ lower boundary
- It missed to check feature dependency: prjquota, inode_chksum,
inode_crtime, compression features rely on extra_attr feature.
- It's not necessary to check i_extra_isize due to it will only
be assigned to non-zero value if f2fs_has_extra_attr() is true
in do_read_inode().
Fix them all in this patch.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
The last valid compress related field is i_compress_flag, check its
validity instead of i_log_cluster_size.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Commit 3fde13f817e2 ("f2fs: compress: support compress level")
forgot to do basic compress level check, let's add it.
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
i_crtime will never change after inode creation, so we don't need
to copy it into f2fs_inode_info.i_disk_time[3], and monitor its
change to decide whether updating inode page, remove related stuff.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Let's use BIT() and GENMASK() instead of open it.
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
To fix a race condition between atomic write aborts, I use the inode
lock and make COW inode to be re-usable thoroughout the whole
atomic file inode lifetime.
Reported-by: syzbot+823000d23b3400619f7c@syzkaller.appspotmail.com
Fixes: 3db1de0e582c ("f2fs: change the current atomic write way")
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
In do_read_inode(), sanity check for extent cache should be called after
f2fs_init_read_extent_tree(), fix it.
Fixes: 72840cccc0a1 ("f2fs: allocate the extent_cache by default")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
If the storage gives a corrupted node block due to short power failure and
reset, f2fs stops the entire operations by setting the checkpoint failure flag.
Let's give more chances to live by re-issuing IOs for a while in such critical
path.
Cc: stable@vger.kernel.org
Suggested-by: Randall Huang <huangrandall@google.com>
Suggested-by: Chao Yu <chao@kernel.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
.i_compress_level was introduced by commit 3fde13f817e2 ("f2fs: compress:
support compress level"), but never be used.
This patch updates as below:
- load high 8-bits of on-disk .i_compress_flag to in-memory .i_compress_level
- load low 8-bits of on-disk .i_compress_flag to in-memory .i_compress_flag
- change type of in-memory .i_compress_flag from unsigned short to unsigned
char.
w/ above changes, we can avoid unneeded bit shift whenever during
.init_compress_ctx(), and shrink size of struct f2fs_inode_info.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
In do_read_inode(), sanity_check_inode() should be called after
f2fs_init_read_extent_tree(), fix it.
Fixes: 72840cccc0a1 ("f2fs: allocate the extent_cache by default")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
There is no need to additionally use f2fs_show_injection_info()
to output information. Concatenate time_to_inject() and
__time_to_inject() via a macro. In the new __time_to_inject()
function, pass in the caller function name and parent function.
In this way, we no longer need the f2fs_show_injection_info() function,
and let's remove it.
Suggested-by: Chao Yu <chao@kernel.org>
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This patch introduces a runtime hot/cold data separation method
for f2fs, in order to improve the accuracy for data temperature
classification, reduce the garbage collection overhead after
long-term data updates.
Enhanced hot/cold data separation can record data block update
frequency as "age" of the extent per inode, and take use of the age
info to indicate better temperature type for data block allocation:
- It records total data blocks allocated since mount;
- When file extent has been updated, it calculate the count of data
blocks allocated since last update as the age of the extent;
- Before the data block allocated, it searches for the age info and
chooses the suitable segment for allocation.
Test and result:
- Prepare: create about 30000 files
* 3% for cold files (with cold file extension like .apk, from 3M to 10M)
* 50% for warm files (with random file extension like .FcDxq, from 1K
to 4M)
* 47% for hot files (with hot file extension like .db, from 1K to 256K)
- create(5%)/random update(90%)/delete(5%) the files
* total write amount is about 70G
* fsync will be called for .db files, and buffered write will be used
for other files
The storage of test device is large enough(128G) so that it will not
switch to SSR mode during the test.
Benefit: dirty segment count increment reduce about 14%
- before: Dirty +21110
- after: Dirty +18286
Signed-off-by: qixiaoyu1 <qixiaoyu1@xiaomi.com>
Signed-off-by: xiongping1 <xiongping1@xiaomi.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Let's allocate it to remove the runtime complexity.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This patch prepares extent_cache to be ready for addition.
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Let's descrbie it's read extent cache.
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
We need to make sure i_size doesn't change until atomic write commit is
successful and restore it when commit is failed.
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|