|
Patch series "nilfs2: fix issues with rename operations".
This series fixes BUG_ON check failures reported by syzbot around rename
operations, and a minor behavioral issue where the mtime of a child
directory changes when it is renamed instead of moved.
This patch (of 2):
The directory manipulation routines nilfs_set_link() and
nilfs_delete_entry() rewrite the directory entry in the folio/page
previously read by nilfs_find_entry(), so error handling is omitted on the
assumption that nilfs_prepare_chunk(), which prepares the buffer for
rewriting, will always succeed for these. And if an error is returned, it
triggers the legacy BUG_ON() checks in each routine.
This assumption is wrong, as proven by syzbot: the buffer layer called by
nilfs_prepare_chunk() may call nilfs_get_block() if necessary, which may
fail due to metadata corruption or other reasons. This has been there all
along, but improved sanity checks and error handling may have made it more
reproducible in fuzzing tests.
Fix this issue by adding missing error paths in nilfs_set_link(),
nilfs_delete_entry(), and their caller nilfs_rename().
Link: https://lkml.kernel.org/r/20250111143518.7901-1-konishi.ryusuke@gmail.com
Link: https://lkml.kernel.org/r/20250111143518.7901-2-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Reported-by: syzbot+32c3706ebf5d95046ea1@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=32c3706ebf5d95046ea1
Reported-by: syzbot+1097e95f134f37d9395c@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=1097e95f134f37d9395c
Fixes: 2ba466d74ed7 ("nilfs2: directory entry operations")
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
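The error-path change described in the nilfs2 rename fix above can be sketched in plain C. This is a minimal user-space illustration, not the nilfs2 code: `struct demo_chunk`, `demo_prepare_chunk()`, `demo_set_link()` and `demo_rename()` are hypothetical stand-ins for the directory chunk, nilfs_prepare_chunk(), nilfs_set_link() and nilfs_rename(), and the `-EIO` value is an assumed example of a block-mapping failure.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a directory chunk; not a nilfs2 type. */
struct demo_chunk {
	bool corrupted;	/* simulates a failing block lookup */
	int ino;	/* inode number stored in the directory entry */
};

/* Hypothetical analogue of nilfs_prepare_chunk(): it may fail. */
static int demo_prepare_chunk(const struct demo_chunk *c)
{
	return c->corrupted ? -EIO : 0;
}

/* Analogue of nilfs_set_link(): return the error instead of BUG_ON(). */
static int demo_set_link(struct demo_chunk *c, int new_ino)
{
	int err = demo_prepare_chunk(c);

	if (err)
		return err;	/* the entry is left untouched on failure */
	c->ino = new_ino;
	return 0;
}

/* Analogue of the caller nilfs_rename(): propagate the failure upward. */
static int demo_rename(struct demo_chunk *c, int new_ino)
{
	int err = demo_set_link(c, new_ino);

	if (err)
		return err;
	return 0;
}
```

The point of the sketch is only that a failure in the prepare step becomes an ordinary negative return value that the caller can handle, rather than a fatal assertion.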
|
|
Also for comments that do not cause kernel-doc warnings (those that list
multiple error codes), revise the return value description style to match
Brian G.'s suggestion of "..., or one of the following negative error
codes on failure:".
Link: https://lkml.kernel.org/r/CAAq45aNh1qV8P6XgDhKeNstT=PvcPUaCXsAF-f9rvmzznsZL5A@mail.gmail.com
Link: https://lkml.kernel.org/r/20250110010530.21872-8-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: "Brian G ." <gissf1@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
There are a number of kernel-doc comments for functions that are missing
return values, which also causes a number of warnings when the kernel-doc
script is run with the "-Wall" option.
Fix this issue by adding proper return value descriptions, and improve
code maintainability.
Link: https://lkml.kernel.org/r/20250110010530.21872-7-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: "Brian G ." <gissf1@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Similar to the previous changes to fix return value descriptions, this
fixes the format of the return value descriptions of functions for the
rest.
Link: https://lkml.kernel.org/r/20250110010530.21872-6-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: "Brian G ." <gissf1@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Similar to the previous changes to fix return value descriptions, this
fixes the format of the return value descriptions for metadata file
functions other than sufile.
Link: https://lkml.kernel.org/r/20250110010530.21872-5-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: "Brian G ." <gissf1@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Similar to the previous changes to fix return value descriptions, this
fixes the format of the return value descriptions of functions for
sufile-related functions, eliminating a dozen warnings emitted by the
kernel-doc script.
Link: https://lkml.kernel.org/r/20250110010530.21872-4-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: "Brian G ." <gissf1@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Similar to the previous patch to fix the ioctl return value descriptions,
this fixes the format of the return value descriptions for bmap (and
btree)-related functions, which was causing the kernel-doc script to emit
a number of warnings.
Link: https://lkml.kernel.org/r/20250110010530.21872-3-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: "Brian G ." <gissf1@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "nilfs2: fix kernel-doc comments for function return values",
v2.
This series fixes the inadequacies in the return value descriptions in
nilfs2's kernel-doc comments (mainly incorrect formatting), as well as the
lack of return value descriptions themselves, and fixes most of the
remaining warnings that are output when the kernel-doc script is run with
the "-Wall" option.
This patch (of 7):
In the kernel-doc comments for functions, there are many cases where the
format of the return value description is inaccurate, such as "Return
Value: ...", which causes many warnings to be output when the kernel-doc
script is executed with the "-Wall" option.
This fixes such incorrectly formatted return value descriptions for ioctl
functions.
Link: https://lkml.kernel.org/r/20250110010530.21872-1-konishi.ryusuke@gmail.com
Link: https://lkml.kernel.org/r/20250110010530.21872-2-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: "Brian G ." <gissf1@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
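As an illustration of the return value format this series moves to, here is a small kernel-doc style comment on a made-up function. `demo_lookup()` and its parameters are hypothetical and do not exist in nilfs2; the comment layout follows the "Return:" section style with the "or one of the following negative error codes on failure:" wording quoted elsewhere in this series.

```c
#include <errno.h>
#include <stddef.h>

/**
 * demo_lookup() - find a key in a small table (hypothetical example)
 * @table: array of values to search
 * @len: number of elements in @table
 * @key: value to look for
 * @index: place to store the position of @key
 *
 * Return: 0 on success, or one of the following negative error codes on
 * failure:
 * * %-EINVAL	- @table or @index is NULL.
 * * %-ENOENT	- @key is not present in @table.
 */
static int demo_lookup(const int *table, size_t len, int key, size_t *index)
{
	size_t i;

	if (!table || !index)
		return -EINVAL;

	for (i = 0; i < len; i++) {
		if (table[i] == key) {
			*index = i;
			return 0;
		}
	}
	return -ENOENT;
}
```

A heading such as "Return Value:" in place of "Return:" is the kind of formatting the commit message above describes as causing warnings.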
|
|
nilfs_lookup_dirty_data_buffers(), which iterates through the buffers
attached to dirty data folios/pages, accesses the attached buffers without
locking the folios/pages.
For data cache, nilfs_clear_folio_dirty() may be called asynchronously
when the file system degenerates to read only, so
nilfs_lookup_dirty_data_buffers() still has the potential to cause use
after free issues when buffers lose the protection of their dirty state
midway due to this asynchronous clearing and are unintentionally freed by
try_to_free_buffers().
Eliminate this race issue by adjusting the lock section in this function.
Link: https://lkml.kernel.org/r/20250107200202.6432-3-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Fixes: 8c26c4e2694a ("nilfs2: fix issue with flush kernel thread after remount in RO mode because of driver's internal error or metadata corruption")
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "nilfs2: protect busy buffer heads from being force-cleared".
This series fixes the buffer head state inconsistency issues reported by
syzbot that occur when the filesystem is corrupted and falls back to
read-only, and the associated buffer head use-after-free issue.
This patch (of 2):
Syzbot has reported that after nilfs2 detects filesystem corruption and
falls back to read-only, inconsistencies in the buffer state may occur.
One of the inconsistencies arises when nilfs2 calls mark_buffer_dirty()
to set a data or metadata buffer as dirty, and that function detects that
the buffer is not in the uptodate state:
WARNING: CPU: 0 PID: 6049 at fs/buffer.c:1177 mark_buffer_dirty+0x2e5/0x520
fs/buffer.c:1177
...
Call Trace:
<TASK>
nilfs_palloc_commit_alloc_entry+0x4b/0x160 fs/nilfs2/alloc.c:598
nilfs_ifile_create_inode+0x1dd/0x3a0 fs/nilfs2/ifile.c:73
nilfs_new_inode+0x254/0x830 fs/nilfs2/inode.c:344
nilfs_mkdir+0x10d/0x340 fs/nilfs2/namei.c:218
vfs_mkdir+0x2f9/0x4f0 fs/namei.c:4257
do_mkdirat+0x264/0x3a0 fs/namei.c:4280
__do_sys_mkdirat fs/namei.c:4295 [inline]
__se_sys_mkdirat fs/namei.c:4293 [inline]
__x64_sys_mkdirat+0x87/0xa0 fs/namei.c:4293
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
The other is when nilfs_btree_propagate(), which propagates the dirty
state to the ancestor nodes of a b-tree that point to a dirty buffer,
detects that the origin buffer is not dirty, even though it should be:
WARNING: CPU: 0 PID: 5245 at fs/nilfs2/btree.c:2089
nilfs_btree_propagate+0xc79/0xdf0 fs/nilfs2/btree.c:2089
...
Call Trace:
<TASK>
nilfs_bmap_propagate+0x75/0x120 fs/nilfs2/bmap.c:345
nilfs_collect_file_data+0x4d/0xd0 fs/nilfs2/segment.c:587
nilfs_segctor_apply_buffers+0x184/0x340 fs/nilfs2/segment.c:1006
nilfs_segctor_scan_file+0x28c/0xa50 fs/nilfs2/segment.c:1045
nilfs_segctor_collect_blocks fs/nilfs2/segment.c:1216 [inline]
nilfs_segctor_collect fs/nilfs2/segment.c:1540 [inline]
nilfs_segctor_do_construct+0x1c28/0x6b90 fs/nilfs2/segment.c:2115
nilfs_segctor_construct+0x181/0x6b0 fs/nilfs2/segment.c:2479
nilfs_segctor_thread_construct fs/nilfs2/segment.c:2587 [inline]
nilfs_segctor_thread+0x69e/0xe80 fs/nilfs2/segment.c:2701
kthread+0x2f0/0x390 kernel/kthread.c:389
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>
Both of these issues are caused by the callbacks that handle the
page/folio write requests forcibly clearing various states, including the
working state of the buffers they hold, at unexpected times when they
detect read-only fallback.
Fix these issues by checking if the buffer is referenced before clearing
the page/folio state, and skipping the clear if it is.
Link: https://lkml.kernel.org/r/20250107200202.6432-1-konishi.ryusuke@gmail.com
Link: https://lkml.kernel.org/r/20250107200202.6432-2-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Reported-by: syzbot+b2b14916b77acf8626d7@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=b2b14916b77acf8626d7
Reported-by: syzbot+d98fd19acd08b36ff422@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?extid=d98fd19acd08b36ff422
Fixes: 8c26c4e2694a ("nilfs2: fix issue with flush kernel thread after remount in RO mode because of driver's internal error or metadata corruption")
Tested-by: syzbot+b2b14916b77acf8626d7@syzkaller.appspotmail.com
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
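The "skip the clear if a buffer is referenced" rule from the commit above can be sketched as follows. This is a simplified user-space model under stated assumptions: `struct demo_buffer` and `demo_clear_folio_buffers()` are hypothetical, and a plain integer reference count stands in for whatever the real code inspects on a buffer head.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a buffer attached to a folio. */
struct demo_buffer {
	int refcount;	/* > 0 means some other path is still using it */
	bool dirty;
	bool uptodate;
};

/*
 * Clear the state of all buffers of one folio, unless any of them is
 * still referenced. Returns true if the state was cleared and false
 * if the clear was skipped because a buffer was busy.
 */
static bool demo_clear_folio_buffers(struct demo_buffer *bufs, size_t n)
{
	size_t i;

	/* First pass: refuse to touch anything if one buffer is busy. */
	for (i = 0; i < n; i++) {
		if (bufs[i].refcount > 0)
			return false;
	}

	/* Second pass: no buffer is in use, so clearing is safe. */
	for (i = 0; i < n; i++) {
		bufs[i].dirty = false;
		bufs[i].uptodate = false;
	}
	return true;
}
```

Checking every buffer before clearing any of them keeps the folio in one consistent state: either all of its buffers are cleared or none are.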
|
|
The parameter is not used in __ocfs2_mknod_locked(). So remove it.
No functional change.
Link: https://lkml.kernel.org/r/20250106140634.92241-1-glass.su@suse.com
Signed-off-by: Su Yue <glass.su@suse.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
While running fstests generic/329, the kernel workqueue function
quota_release_workfn() loops endlessly calling ocfs2_release_dquot().
The ocfs2 filesystem is already read-only, so ocfs2_release_dquot() tries
to start a transaction, fails, and returns.
=====================================================================
[ 2918.123602 ][ T275 ] On-disk corruption discovered. Please run
fsck.ocfs2 once the filesystem is unmounted.
[ 2918.124034 ][ T275 ] (kworker/u135:1,275,11):ocfs2_release_dquot:765
ERROR: status = -30
[ 2918.124452 ][ T275 ] (kworker/u135:1,275,11):ocfs2_release_dquot:795
ERROR: status = -30
[ 2918.124883 ][ T275 ] (kworker/u135:1,275,11):ocfs2_start_trans:357
ERROR: status = -30
[ 2918.125276 ][ T275 ] OCFS2: abort (device dm-0): ocfs2_start_trans:
Detected aborted journal
[ 2918.125710 ][ T275 ] On-disk corruption discovered. Please run
fsck.ocfs2 once the filesystem is unmounted.
=====================================================================
ocfs2_release_dquot() is much like dquot_release(), which is called by
ext4 to handle a similar situation. Fix it by marking the dquot as
inactive, as dquot_release() does.
Link: https://lkml.kernel.org/r/20250106140653.92292-1-glass.su@suse.com
Fixes: 9e33d69f553a ("ocfs2: Implementation of local and global quota file handling")
Signed-off-by: Su Yue <glass.su@suse.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
syz reported:
(syz-executor404,5313,0):ocfs2_truncate_log_append:5874 ERROR: bug
expression: tl_count > ocfs2_truncate_recs_per_inode(osb->sb) ||
tl_count == 0
(syz-executor404,5313,0):ocfs2_truncate_log_append:5874 ERROR: Truncate
record count on #77 invalid wanted 39, actual 2087
------------[ cut here ]------------
kernel BUG at fs/ocfs2/alloc.c:5874!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI
CPU: 0 UID: 0 PID: 5313 Comm: syz-executor404 Not tainted
6.12.0-rc5-syzkaller-00299-g11066801dd4b #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:ocfs2_truncate_log_append+0x9a8/0x9c0 fs/ocfs2/alloc.c:5868
RSP: 0018:ffffc9000cf16f40 EFLAGS: 00010292
RAX: b4b54f1d10640800 RBX: 0000000000000027 RCX: b4b54f1d10640800
RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000
RBP: ffffc9000cf17070 R08: ffffffff8174a14c R09: 1ffff11003f8519a
R10: dffffc0000000000 R11: ffffed1003f8519b R12: 1ffff110085f5f58
R13: ffffff3800000000 R14: 000000000000004d R15: ffff8880438f0008
FS: 00005555722df380(0000) GS:ffff88801fc00000(0000)
knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000002000f000 CR3: 000000004010e000 CR4: 0000000000352ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
ocfs2_remove_btree_range+0x1303/0x1860 fs/ocfs2/alloc.c:5789
ocfs2_remove_inode_range+0xff3/0x29f0 fs/ocfs2/file.c:1907
ocfs2_reflink_remap_extent fs/ocfs2/refcounttree.c:4537 [inline]
ocfs2_reflink_remap_blocks+0xcd4/0x1f30 fs/ocfs2/refcounttree.c:4684
ocfs2_remap_file_range+0x5fa/0x8d0 fs/ocfs2/file.c:2736
vfs_copy_file_range+0xc07/0x1510 fs/read_write.c:1615
__do_sys_copy_file_range fs/read_write.c:1705 [inline]
__se_sys_copy_file_range+0x3f2/0x5d0 fs/read_write.c:1668
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fd327167af9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 61 17 00 00 90 48 89 f8 48 89
f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01
f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffe6b8e22e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000146
RAX: ffffffffffffffda RBX: 00007fd3271b005e RCX: 00007fd327167af9
RDX: 0000000000000006 RSI: 0000000000000000 RDI: 0000000000000004
RBP: 00007fd3271de610 R08: 000000000000d8c2 R09: 0000000000000000
R10: 0000000020000640 R11: 0000000000000246 R12: 0000000000000001
R13: 00007ffe6b8e24b8 R14: 0000000000000001 R15: 0000000000000001
</TASK>
The fuzz image has a truncate log inode whose tl_count is bigger than
ocfs2_truncate_recs_per_inode() so it triggers the BUG in
ocfs2_truncate_log_append().
Perform the same check that ocfs2_truncate_log_append() does in
ocfs2_get_truncate_log_info(), when the truncate log inode is read in, so
that we can bail out earlier.
Link: https://lkml.kernel.org/r/20250108024119.60313-1-glass.su@suse.com
Signed-off-by: Su Yue <glass.su@suse.com>
Reported-by: Liebes Wang <wanghaichi0403@gmail.com>
Link: https://lore.kernel.org/ocfs2-devel/CADCV8souQhdP0RdQF1U7KTWtuHDfpn+3LnTt-EEuMmB-pMRrgQ@mail.gmail.com/T/#u
Reported-by: syzbot+a66542ca5ebb4233b563@syzkaller.appspotmail.com
Tested-by: syzbot+a66542ca5ebb4233b563@syzkaller.appspotmail.com
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
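The early validation described in the truncate log commit above amounts to applying the condition from the BUG expression at load time. The sketch below is a user-space model, not the ocfs2 code: `struct demo_truncate_log` and `demo_validate_truncate_log()` are hypothetical, the limit is passed in as a parameter instead of being computed by ocfs2_truncate_recs_per_inode(), and `-EIO` is an assumed error code.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of the on-disk truncate log header. */
struct demo_truncate_log {
	uint16_t tl_count;	/* number of record slots claimed on disk */
	uint16_t tl_used;	/* number of slots currently in use */
};

/*
 * Reject a truncate log whose record count is zero or exceeds what fits
 * in the inode. max_recs is an assumed parameter for this sketch.
 */
static int demo_validate_truncate_log(const struct demo_truncate_log *tl,
				      uint16_t max_recs)
{
	if (tl->tl_count == 0 || tl->tl_count > max_recs)
		return -EIO;	/* assumed error code for a corrupt log */
	return 0;
}
```

With the values from the report (wanted 39, actual 2087), such a check fails when the inode is read instead of hitting the BUG later in the append path.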
|
|
Correct the value of l_next_free_rec to l_count during the online check,
as done in the check_el() function in ocfs2_tools.
Link: https://lkml.kernel.org/r/20250106023432.1320904-2-sunjunchao2870@gmail.com
Signed-off-by: Julian Sun <sunjunchao2870@gmail.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Recently syzbot reported a use-after-free issue[1].
The root cause of the problem is that the journal inode recorded in this
file system image is corrupted. The value of
"di->id2.i_list.l_next_free_rec" is 8193, which is greater than the value
of "di->id2.i_list.l_count" (19).
To solve this problem, an additional check should be added within
ocfs2_get_clusters_nocache(). If the check fails, an error will be
returned and the file system will be set to read-only.
[1]: https://lore.kernel.org/all/67577778.050a0220.a30f1.01bc.GAE@google.com/T/
Link: https://lkml.kernel.org/r/20250106023432.1320904-1-sunjunchao2870@gmail.com
Signed-off-by: Julian Sun <sunjunchao2870@gmail.com>
Reported-by: syzbot+2313dda4dc4885c93578@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=2313dda4dc4885c93578
Tested-by: syzbot+2313dda4dc4885c93578@syzkaller.appspotmail.com
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mark@fasheh.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
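The additional check described in the commit above compares the two extent list fields before the records are used. The following is a minimal sketch with hypothetical names (`struct demo_extent_list`, `demo_check_extent_list()`); the `-EINVAL` return value is an assumption, and the real code additionally switches the filesystem to read-only, which is not modeled here.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of the extent list header fields named above. */
struct demo_extent_list {
	uint16_t l_count;		/* capacity of the record array */
	uint16_t l_next_free_rec;	/* index of the next unused record */
};

/*
 * An extent list is only usable if the next-free index does not point
 * past the end of the record array.
 */
static int demo_check_extent_list(const struct demo_extent_list *el)
{
	if (el->l_next_free_rec > el->l_count)
		return -EINVAL;	/* assumed error code for a corrupt list */
	return 0;
}
```

For the corrupted journal inode in the report, `l_next_free_rec` is 8193 and `l_count` is 19, so the comparison fails before any record beyond the array is accessed.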
|
|
squashfs_fill_page is only used in this file, so make it static.
Use kmap_local instead of kmap_atomic, and return a bool so that
the caller can use folio_end_read() which saves an atomic operation
over calling folio_mark_uptodate() followed by folio_unlock().
[willy@infradead.org: fix polarity of "uptodate" Thanks to Ryan for testing]
Link: https://lkml.kernel.org/r/20250110163300.3346321-2-willy@infradead.org
Link: https://lkml.kernel.org/r/20241220224634.723899-5-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Tested-by: Ryan Roberts <ryan.roberts@arm.com>
Cc: Phillip Lougher <phillip@squashfs.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Remove accesses to page->index and page->mapping. Also use folio
APIs where available. This code still assumes order 0 folios.
[dan.carpenter@linaro.org: fix a NULL vs IS_ERR() bug]
Link: https://lkml.kernel.org/r/7b7f44d6-9153-4d7c-b65b-2d78febe6c7a@stanley.mountain
Link: https://lkml.kernel.org/r/20241220224634.723899-4-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Phillip Lougher <phillip@squashfs.org.uk>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Remove a few accesses to page->mapping.
Link: https://lkml.kernel.org/r/20241220224634.723899-3-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Phillip Lougher <phillip@squashfs.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Remove an access to page->mapping.
Link: https://lkml.kernel.org/r/20241220224634.723899-2-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Phillip Lougher <phillip@squashfs.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Use modern folio APIs where they exist and convert back to struct
page for the internal functions.
Link: https://lkml.kernel.org/r/20241220224634.723899-1-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Phillip Lougher <phillip@squashfs.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Update the compression algorithms supported, and the Squashfs website
location.
Link: https://lkml.kernel.org/r/20241229233752.54481-5-phillip@squashfs.org.uk
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
If Squashfs has been configured to directly read datablocks into the page
cache (SQUASHFS_FILE_DIRECT), then the read_page cache is unnecessary.
This improvement is due to the following two commits, which added the
ability to read datablocks into the page cache when pages were missing,
enabling the fallback which used an intermediate buffer to be removed.
commit f268eedddf359 ("squashfs: extend "page actor" to handle missing pages")
commit 1bb1a07afad97 ("squashfs: don't use intermediate buffer if pages missing")
This reduces the amount of memory used when mounting a filesystem by
block_size * maximum number of threads.
Link: https://lkml.kernel.org/r/20241229233752.54481-3-phillip@squashfs.org.uk
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "squashfs: reduce memory usage and update docs".
This patchset reduces the amount of memory that Squashfs uses when
CONFIG_FILE_DIRECT is configured, and updates various out of date
information in the documentation and Kconfig.
This patch (of 4):
Make squashfs_cache_init() return an ERR_PTR(-ENOMEM) on failure rather
than NULL.
This tidies up some calling code, but it also allows NULL to be returned
as a valid result when a cache hasn't been allocated.
Link: https://lkml.kernel.org/r/20241229233752.54481-1-phillip@squashfs.org.uk
Link: https://lkml.kernel.org/r/20241229233752.54481-2-phillip@squashfs.org.uk
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
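The distinction between "failed" and "no cache needed" that the commit above relies on can be illustrated in user space. The helpers below are a simplified re-implementation of the kernel's error-pointer idea for this sketch only (`demo_err_ptr()`, `demo_ptr_err()`, `demo_is_err()`), and `demo_cache_init()` with its `fail_alloc` switch is hypothetical. Returning NULL for zero entries is an assumption about how a caller might use the new convention.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *demo_err_ptr(long error)
{
	return (void *)error;
}

/* Decode the errno value from an error pointer. */
static inline long demo_ptr_err(const void *ptr)
{
	return (long)ptr;
}

/* Error pointers occupy the top DEMO_MAX_ERRNO addresses. */
static inline int demo_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

struct demo_cache {
	int entries;
};

/*
 * Returns a cache, NULL if no cache is needed (entries == 0), or an
 * error pointer on allocation failure. fail_alloc simulates ENOMEM.
 */
static struct demo_cache *demo_cache_init(int entries, int fail_alloc)
{
	struct demo_cache *cache;

	if (entries == 0)
		return NULL;	/* valid result: nothing was allocated */
	if (fail_alloc)
		return demo_err_ptr(-ENOMEM);

	cache = malloc(sizeof(*cache));
	if (!cache)
		return demo_err_ptr(-ENOMEM);
	cache->entries = entries;
	return cache;
}
```

A caller tests the result with `demo_is_err()` instead of a NULL check, which leaves NULL free to mean that the cache is intentionally absent.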
|
|
When testing the atomic write fix patches, the f2fs_bug_on was
triggered as below:
------------[ cut here ]------------
kernel BUG at fs/f2fs/inode.c:935!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP PTI
CPU: 3 UID: 0 PID: 257 Comm: bash Not tainted 6.13.0-rc1-00033-gc283a70d3497 #5
RIP: 0010:f2fs_evict_inode+0x50f/0x520
Call Trace:
<TASK>
? __die_body+0x65/0xb0
? die+0x9f/0xc0
? do_trap+0xa1/0x170
? f2fs_evict_inode+0x50f/0x520
? f2fs_evict_inode+0x50f/0x520
? handle_invalid_op+0x65/0x80
? f2fs_evict_inode+0x50f/0x520
? exc_invalid_op+0x39/0x50
? asm_exc_invalid_op+0x1a/0x20
? __pfx_f2fs_get_dquots+0x10/0x10
? f2fs_evict_inode+0x50f/0x520
? f2fs_evict_inode+0x2e5/0x520
evict+0x186/0x2f0
prune_icache_sb+0x75/0xb0
super_cache_scan+0x1a8/0x200
do_shrink_slab+0x163/0x320
shrink_slab+0x2fc/0x470
drop_slab+0x82/0xf0
drop_caches_sysctl_handler+0x4e/0xb0
proc_sys_call_handler+0x183/0x280
vfs_write+0x36d/0x450
ksys_write+0x68/0xd0
do_syscall_64+0xc8/0x1a0
? arch_exit_to_user_mode_prepare+0x11/0x60
? irqentry_exit_to_user_mode+0x7e/0xa0
The root cause is: f2fs uses FI_ATOMIC_DIRTIED to indicate dirty
atomic files during commit. If the inode is dirtied during commit,
such as by f2fs_i_pino_write, the vfs inode stays clean and the
f2fs inode is set to FI_DIRTY_INODE. The FI_DIRTY_INODE flag can't
be cleared by write_inode later because the vfs inode is clean. Finally,
f2fs_bug_on is triggered by this inconsistent state at eviction.
To reproduce this situation:
- fd = open("/mnt/test.db", O_WRONLY)
- ioctl(fd, F2FS_IOC_START_ATOMIC_WRITE)
- mv /mnt/test.db /mnt/test1.db
- ioctl(fd, F2FS_IOC_COMMIT_ATOMIC_WRITE)
- echo 3 > /proc/sys/vm/drop_caches
To fix this problem, clear FI_DIRTY_INODE after commit, then
f2fs_mark_inode_dirty_sync will ensure a consistent dirty state.
Fixes: fccaa81de87e ("f2fs: prevent atomic file from being dirtied before commit")
Signed-off-by: Yunlei He <heyunlei@xiaomi.com>
Signed-off-by: Jianan Huang <huangjianan@xiaomi.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI updates from Ard Biesheuvel:
- Increase the headroom in the EFI memory map allocation created by the
EFI stub. This is needed because event callbacks called during
ExitBootServices() may cause fragmentation, and reallocation is not
allowed after that.
- Drop obsolete UGA graphics code and switch to a more ergonomic API to
traverse handle buffers. Simplify some error paths using a __free()
helper while at it.
- Fix some W=1 warnings when CONFIG_EFI=n
- Rely on the dentry cache to keep track of the contents of the
efivarfs filesystem, rather than using a separate linked list.
- Improve and extend efivarfs test cases.
- Synchronize efivarfs with underlying variable store on resume from
hibernation - this is needed because the firmware itself or another
OS running on the same machine may have modified it.
- Fix x86 EFI stub build with GCC 15.
- Fix kexec/x86 false positive warning in EFI memory attributes table
sanity check.
* tag 'efi-next-for-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: (23 commits)
x86/efi: skip memattr table on kexec boot
efivarfs: add variable resync after hibernation
efivarfs: abstract initial variable creation routine
efi: libstub: Use '-std=gnu11' to fix build with GCC 15
selftests/efivarfs: add concurrent update tests
selftests/efivarfs: fix tests for failed write removal
efivarfs: fix error on write to new variable leaving remnants
efivarfs: remove unused efivarfs_list
efivarfs: move variable lifetime management into the inodes
selftests/efivarfs: add check for disallowing file truncation
efivarfs: prevent setting of zero size on the inodes in the cache
efi: sysfb_efi: fix W=1 warnings when EFI is not set
efi/libstub: Use __free() helper for pool deallocations
efi/libstub: Use cleanup helpers for freeing copies of the memory map
efi/libstub: Simplify PCI I/O handle buffer traversal
efi/libstub: Refactor and clean up GOP resolution picker code
efi/libstub: Simplify GOP handling code
efi/libstub: Use C99-style for loop to traverse handle buffer
x86/efistub: Drop long obsolete UGA support
efivarfs: make variable_is_present use dcache lookup
...
|
|
fuse-over-io-uring uses existing functions to find requests based
on their unique id - make these functions non-static.
Signed-off-by: Bernd Schubert <bschubert@ddn.com>
Reviewed-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Luis Henriques <luis@igalia.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
Add special fuse-io-uring handling to the fuse argument
copy handler.
Signed-off-by: Bernd Schubert <bschubert@ddn.com>
Reviewed-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Luis Henriques <luis@igalia.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
Move 'struct fuse_copy_state' and fuse_copy_* functions
to fuse_dev_i.h to make it available for fuse-io-uring.
'copy_out_args()' is renamed to 'fuse_copy_out_args'.
Signed-off-by: Bernd Schubert <bschubert@ddn.com>
Reviewed-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Luis Henriques <luis@igalia.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
This adds basic support for ring SQEs (with opcode=IORING_OP_URING_CMD).
For now only FUSE_IO_URING_CMD_REGISTER is handled to register queue
entries.
Signed-off-by: Bernd Schubert <bschubert@ddn.com>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> # io_uring
Reviewed-by: Luis Henriques <luis@igalia.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
This change sets up FUSE operations to always have headers in
args.in_args[0], even for opcodes without an actual header.
This step prepares for a clean separation of payload from headers,
initially it is used by fuse-over-io-uring.
For opcodes without a header, we use a zero-sized struct as a
placeholder. This approach:
- Keeps things consistent across all FUSE operations
- Will help with payload alignment later
- Avoids future issues when header sizes change
Op codes that already have an op code specific header do not
need modification.
Op codes that have neither payload nor op code headers
are not modified either (FUSE_READLINK and FUSE_DESTROY).
FUSE_BATCH_FORGET already has the header in the right place,
but is not using fuse_copy_args - as -over-uring is currently
not handling forgets it does not matter for now, but header
separation will later need special attention for that op code.
Correct the struct fuse_args->in_args array max size.
Signed-off-by: Bernd Schubert <bschubert@ddn.com>
Reviewed-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Luis Henriques <luis@igalia.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
These are needed by fuse-over-io-uring.
Signed-off-by: Bernd Schubert <bschubert@ddn.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Luis Henriques <luis@igalia.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
Another preparation patch, as this function will be needed by
fuse/dev.c and fuse/dev_uring.c.
Signed-off-by: Bernd Schubert <bschubert@ddn.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Luis Henriques <luis@igalia.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
This function is needed by fuse_uring.c to clean ring queues,
so make it non-static. Since it is no longer static, the function
name 'end_requests' should be prefixed with fuse_.
Signed-off-by: Bernd Schubert <bschubert@ddn.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Luis Henriques <luis@igalia.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
git://git.samba.org/sfrench/cifs-2.6
Pull smb client updates from Steve French:
- Fix oops in DebugData when link speed 0
- Two reparse point fixes
- Ten DFS (global namespace) fixes
- Symlink error handling fix
- Two SMB1 fixes
- Four cleanup fixes
- Improved debugging of status codes
- Fix incorrect output of tracepoints for compounding, and add missing
compounding tracepoint
* tag 'v6.14-rc-smb3-client-fixes-part' of git://git.samba.org/sfrench/cifs-2.6: (23 commits)
smb: client: handle lack of EA support in smb2_query_path_info()
smb: client: don't check for @leaf_fullpath in match_server()
smb: client: get rid of TCP_Server_Info::refpath_lock
cifs: Remove duplicate struct reparse_symlink_data and SYMLINK_FLAG_RELATIVE
cifs: Do not attempt to call CIFSGetSrvInodeNumber() without CAP_INFOLEVEL_PASSTHRU
cifs: Do not attempt to call CIFSSMBRenameOpenFile() without CAP_INFOLEVEL_PASSTHRU
cifs: Remove declaration of dead CIFSSMBQuerySymLink function
cifs: Fix printing Status code into dmesg
cifs: Add missing NT_STATUS_* codes from nterr.h to nterr.c
cifs: Fix endian types in struct rfc1002_session_packet
cifs: Use cifs_autodisable_serverino() for disabling CIFS_MOUNT_SERVER_INUM in readdir.c
smb3: add missing tracepoint for querying wsl EAs
smb: client: fix order of arguments of tracepoints
smb: client: fix oops due to unset link speed
smb: client: correctly handle ErrorContextData as a flexible array
smb: client: don't retry DFS targets on server shutdown
smb: client: fix return value of parse_dfs_referrals()
smb: client: optimize referral walk on failed link targets
smb: client: provide dns_resolve_{unc,name} helpers
smb: client: parse DNS domain name from domain= option
...
|
|
Pull smb server updates from Steve French:
"Three ksmbd server fixes:
- Fix potential memory corruption in IPC calls
- Support FSCTL_QUERY_INTERFACE_INFO for more configurations
- Remove some unused functions"
* tag 'v6.14-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: fix integer overflows on 32 bit systems
ksmbd: browse interfaces list on FSCTL_QUERY_INTERFACE_INFO IOCTL
ksmbd: Remove unused functions
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull fsnotify pre-content notification support from Jan Kara:
"This introduces a new fsnotify event (FS_PRE_ACCESS) that gets
generated before a file's contents are accessed.
The event is synchronous so if there is a listener for this event, the
kernel waits for a reply. On success the execution continues as usual,
on failure we propagate the error to userspace. This allows userspace
to fill in file content on demand from slow storage. The context in
which the events are generated has been picked so that we don't hold
any locks and thus there's no risk of a deadlock for the userspace
handler.
The new pre-content event is available only for users with global
CAP_SYS_ADMIN capability (similarly to other parts of fanotify
functionality) and it is the administrator's responsibility to make sure
the userspace event handler doesn't do stupid stuff that can DoS the
system.
Based on your feedback from the last submission, fsnotify code has
been improved and now file->f_mode encodes whether pre-content event
needs to be generated for the file so the fast path when nobody wants
pre-content event for the file just grows the additional file->f_mode
check. As a bonus this also removes the checks whether the old
FS_ACCESS event needs to be generated from the fast path. Also the
place where the event is generated during page fault has been moved so
now filemap_fault() generates the event if and only if there is no
uptodate folio in the page cache.
Also we have dropped FS_PRE_MODIFY event as current real-world users
of the pre-content functionality don't really use it so let's start
with the minimal useful feature set"
* tag 'fsnotify_hsm_for_v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: (21 commits)
fanotify: Fix crash in fanotify_init(2)
fs: don't block write during exec on pre-content watched files
fs: enable pre-content events on supported file systems
ext4: add pre-content fsnotify hook for DAX faults
btrfs: disable defrag on pre-content watched files
xfs: add pre-content fsnotify hook for DAX faults
fsnotify: generate pre-content permission event on page fault
mm: don't allow huge faults for files with pre content watches
fanotify: disable readahead if we have pre-content watches
fanotify: allow to set errno in FAN_DENY permission response
fanotify: report file range info with pre-content events
fanotify: introduce FAN_PRE_ACCESS permission event
fsnotify: generate pre-content permission event on truncate
fsnotify: pass optional file access range in pre-content event
fsnotify: introduce pre-content permission events
fanotify: reserve event bit of deprecated FAN_DIR_MODIFY
fanotify: rename a misnamed constant
fanotify: don't skip extra event info if no info_mode is set
fsnotify: check if file is actually being watched for pre-content events on open
fsnotify: opt-in for permission events at file open time
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull isofs update from Jan Kara:
"Partial conversion of isofs to folios"
* tag 'fs_for_v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
isofs: Partially convert zisofs_read_folio to use a folio
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull inotify update from Jan Kara:
"A small inotify strcpy() cleanup"
* tag 'fsnotify_for_v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
inotify: Use strscpy() for event->name copies
|
|
Pull XFS updates from Carlos Maiolino:
"This is mostly focused on the implementation of reflink and
reverse-mapping support for XFS's real-time devices.
It also includes several bugfixes.
- Implement reflink support for the realtime device
- Implement reverse-mapping support for the realtime device
- Several bug fixes and cleanups"
* tag 'xfs-merge-6.14' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (121 commits)
xfs: fix buffer lookup vs release race
xfs: check for dead buffers in xfs_buf_find_insert
xfs: add a b_iodone callback to struct xfs_buf
xfs: move b_li_list based retry handling to common code
xfs: simplify xfsaild_resubmit_item
xfs: always complete the buffer inline in xfs_buf_submit
xfs: remove the extra buffer reference in xfs_buf_submit
xfs: move invalidate_kernel_vmap_range to xfs_buf_ioend
xfs: simplify buffer I/O submission
xfs: move in-memory buftarg handling out of _xfs_buf_ioapply
xfs: move write verification out of _xfs_buf_ioapply
xfs: remove xfs_buf_delwri_submit_buffers
xfs: simplify xfs_buf_delwri_pushbuf
xfs: move xfs_buf_iowait out of (__)xfs_buf_submit
xfs: remove the incorrect comment about the b_pag field
xfs: remove the incorrect comment above xfs_buf_free_maps
xfs: fix a double completion for buffers on in-memory targets
xfs/libxfs: replace kmalloc() and memcpy() with kmemdup()
xfs: constify feature checks
xfs: refactor xfs_fs_statfs
...
|
|
- Set `compressedblks = 1` directly for non-bigpcluster cases. This
simplifies the logic a bit since lcluster sizes larger than one block
are unsupported and the details remain unclear.
- For Z_EROFS_LCLUSTER_TYPE_PLAIN pclusters, avoid assuming
`compressedblks = 1` by default. Instead, check if
Z_EROFS_ADVISE_BIG_PCLUSTER_2 is set.
It basically has no impact on existing valid images, but it's useful to
find the gap to prepare for large PLAIN pclusters.
Link: https://lore.kernel.org/r/20250123090109.973463-1-hsiangkao@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull AT_EXECVE_CHECK from Kees Cook:
- Implement AT_EXECVE_CHECK flag to execveat(2) (Mickaël Salaün)
- Implement EXEC_RESTRICT_FILE and EXEC_DENY_INTERACTIVE securebits
(Mickaël Salaün)
- Add selftests and samples for AT_EXECVE_CHECK (Mickaël Salaün)
* tag 'AT_EXECVE_CHECK-v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
ima: instantiate the bprm_creds_for_exec() hook
samples/check-exec: Add an enlighten "inc" interpreter and 28 tests
selftests: ktap_helpers: Fix uninitialized variable
samples/check-exec: Add set-exec
selftests/landlock: Add tests for execveat + AT_EXECVE_CHECK
selftests/exec: Add 32 tests for AT_EXECVE_CHECK and exec securebits
security: Add EXEC_RESTRICT_FILE and EXEC_DENY_INTERACTIVE securebits
exec: Add a new AT_EXECVE_CHECK flag to execveat(2)
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull CRC updates from Eric Biggers:
- Reorganize the architecture-optimized CRC32 and CRC-T10DIF code to be
directly accessible via the library API, instead of requiring the
crypto API. This is much simpler and more efficient.
- Convert some users such as ext4 to use the CRC32 library API instead
of the crypto API. More conversions like this will come later.
- Add a KUnit test that tests and benchmarks multiple CRC variants.
Remove older, less-comprehensive tests that are made redundant by
this.
- Add an entry to MAINTAINERS for the kernel's CRC library code. I'm
volunteering to maintain it. I have additional cleanups and
optimizations planned for future cycles.
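To illustrate what a direct library-style CRC call looks like, here is a minimal user-space sketch of a bitwise CRC32C (Castagnoli) routine. The name sketch_crc32c() and its seed convention are assumptions made for this example; this is not the kernel's lib/ implementation, which uses table-driven and architecture-optimized code.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise CRC32C over a buffer (reflected polynomial 0x82F63B78).
 * Hypothetical helper for illustration only. Pass 0 as the seed for
 * a fresh checksum; the routine applies the usual initial and final
 * inversion itself.
 */
static uint32_t sketch_crc32c(uint32_t crc, const void *data, size_t len)
{
	const unsigned char *p = data;

	crc = ~crc;
	while (len--) {
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++) {
			/* Conditionally XOR the polynomial when the low bit is set. */
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
		}
	}
	return ~crc;
}
```

Under these conventions, sketch_crc32c(0, "123456789", 9) yields the standard CRC32C check value 0xE3069283.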
* tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (31 commits)
MAINTAINERS: add entry for CRC library
powerpc/crc: delete obsolete crc-vpmsum_test.c
lib/crc32test: delete obsolete crc32test.c
lib/crc16_kunit: delete obsolete crc16_kunit.c
lib/crc_kunit.c: add KUnit test suite for CRC library functions
powerpc/crc-t10dif: expose CRC-T10DIF function through lib
arm64/crc-t10dif: expose CRC-T10DIF function through lib
arm/crc-t10dif: expose CRC-T10DIF function through lib
x86/crc-t10dif: expose CRC-T10DIF function through lib
crypto: crct10dif - expose arch-optimized lib function
lib/crc-t10dif: add support for arch overrides
lib/crc-t10dif: stop wrapping the crypto API
scsi: target: iscsi: switch to using the crc32c library
f2fs: switch to using the crc32 library
jbd2: switch to using the crc32c library
ext4: switch to using the crc32c library
lib/crc32: make crc32c() go directly to lib
bcachefs: Explicitly select CRYPTO from BCACHEFS_FS
x86/crc32: expose CRC32 functions through lib
x86/crc32: update prototype for crc32_pclmul_le_16()
...
|
|
If the server doesn't support both EAs and reparse points on a file,
the SMB2_QUERY_INFO request will fail with either
STATUS_NO_EAS_ON_FILE or STATUS_EAS_NOT_SUPPORTED in the compound chain,
so ignore it as long as reparse point isn't
IO_REPARSE_TAG_LX_(CHR|BLK), which would require the EAs to know about
major/minor numbers.
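The rule described above can be sketched as a small predicate. This is a simplified illustration, not the code from the patch: the function name and the SKETCH_ prefixed constants are hypothetical, and the numeric values are assumed to match the NT status codes and reparse tags published in the Microsoft protocol documentation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, taken from the public protocol documentation. */
#define SKETCH_STATUS_EAS_NOT_SUPPORTED 0xC000004Fu
#define SKETCH_STATUS_NO_EAS_ON_FILE    0xC0000052u
#define SKETCH_IO_REPARSE_TAG_LX_CHR    0x80000025u
#define SKETCH_IO_REPARSE_TAG_LX_BLK    0x80000026u

/*
 * Return true when a failed EA query in the compound chain can be
 * ignored: the failure must be one of the two "no EA" statuses, and
 * the reparse point must not be a char/block device, because those
 * need the EAs to learn the major/minor numbers.
 */
static bool sketch_can_ignore_ea_failure(uint32_t status, uint32_t reparse_tag)
{
	bool ea_unavailable = status == SKETCH_STATUS_NO_EAS_ON_FILE ||
			      status == SKETCH_STATUS_EAS_NOT_SUPPORTED;
	bool needs_ea = reparse_tag == SKETCH_IO_REPARSE_TAG_LX_CHR ||
			reparse_tag == SKETCH_IO_REPARSE_TAG_LX_BLK;

	return ea_unavailable && !needs_ea;
}
```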
Reported-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
The matching of DFS connections is already handled by @dfs_conn, so
remove @leaf_fullpath matching altogether.
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
TCP_Server_Info::leaf_fullpath is allocated in cifs_get_tcp_session()
and never changed afterwards, so there is no need to serialize its
access.
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
The following two 'check only recovery' paths depend heavily on the
return value of f2fs_recover_fsync_data(), especially when that value is
greater than 0:
1. when the device is read-only, as in commit
23738e74472f ("f2fs: fix to restrict mount condition on readonly block device")
2. when the mount option NORECOVERY or DISABLE_ROLL_FORWARD is set, as in commit
6781eabba1bd ("f2fs: give -EINVAL for norecovery and rw mount")
However, commit c426d99127b1 ("f2fs: Check write pointer consistency of open zones")
changes the return value unexpectedly, thereby changing the callers' behavior.
This patch makes f2fs_recover_fsync_data() return the correct value and skips
f2fs_check_and_fix_write_pointer() when the device is read-only.
Fixes: c426d99127b1 ("f2fs: Check write pointer consistency of open zones")
Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Now f2fs_invalidate_blocks() supports a continuous range of addresses,
so the for loop can be omitted.
Signed-off-by: Yi Sun <yi.sun@unisoc.com>
Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Show mtime in segment_bits for debug.
cat /proc/fs/f2fs/loop0/segment_bits
format: segment_type|valid_blocks|bitmaps|mtime
segment_type(0:HD, 1:WD, 2:CD, 3:HN, 4:WN, 5:CN)
0 3|1 | 04 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00| ffffffffffffffff
1 4|3 | 00 d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00| ffffffffffffffff
2 5|0 | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00| ffffffffffffffff
3 0|1 | 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00| ffffffffffffffff
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
syzbot reported a f2fs bug as below:
------------[ cut here ]------------
kernel BUG at fs/f2fs/gc.c:373!
CPU: 0 UID: 0 PID: 5316 Comm: syz.0.0 Not tainted 6.13.0-rc3-syzkaller-00044-gaef25be35d23 #0
RIP: 0010:get_cb_cost fs/f2fs/gc.c:373 [inline]
RIP: 0010:get_gc_cost fs/f2fs/gc.c:406 [inline]
RIP: 0010:f2fs_get_victim+0x68b1/0x6aa0 fs/f2fs/gc.c:912
Call Trace:
<TASK>
__get_victim fs/f2fs/gc.c:1707 [inline]
f2fs_gc+0xc89/0x2f60 fs/f2fs/gc.c:1915
f2fs_ioc_gc fs/f2fs/file.c:2624 [inline]
__f2fs_ioctl+0x4cc9/0xb8b0 fs/f2fs/file.c:4482
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:906 [inline]
__se_sys_ioctl+0xf5/0x170 fs/ioctl.c:892
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
It can be reproduced directly with the test case below:
- dd if=/dev/zero of=/tmp/file bs=1M count=64
- mkfs.f2fs /tmp/file
- mount -t f2fs -o loop,mode=fragment:block /tmp/file /mnt/f2fs
- echo 0 > /sys/fs/f2fs/loop0/min_ssr_sections
- dd if=/dev/zero of=/mnt/f2fs/file bs=1M count=5
- umount /mnt/f2fs
- for((i=4096;i<16384;i+=512)) do inject.f2fs --sit 0 --blk $i --mb mtime --val -1 /tmp/file; done
- mount -o loop /tmp/file /mnt/f2fs
- f2fs_io gc 0 /mnt/f2fs/file
static unsigned int get_cb_cost()
{
...
mtime = f2fs_get_section_mtime(sbi, segno);
f2fs_bug_on(sbi, mtime == INVALID_MTIME);
...
}
Root cause: mtime in f2fs_sit_entry can be fuzzed to INVALID_MTIME,
which then triggers the BUG_ON in get_cb_cost() during GC.
Fix this by changing the behavior of f2fs_get_section_mtime() as follows:
- return INVALID_MTIME only if total valid blocks is zero.
- return INVALID_MTIME - 1 if average mtime calculated is
INVALID_MTIME.
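The two rules above can be sketched in user-space C as follows. This is a simplified model, not the kernel function: struct sketch_seg, sketch_section_mtime() and SKETCH_INVALID_MTIME are hypothetical names, and the sentinel is assumed to be the all-ones 64-bit value, matching the ffffffffffffffff mtime shown in segment_bits output.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed sentinel: all-ones 64-bit value meaning "no valid mtime". */
#define SKETCH_INVALID_MTIME UINT64_MAX

/* Hypothetical per-segment record: modification time and valid block count. */
struct sketch_seg {
	uint64_t mtime;
	uint32_t valid_blocks;
};

/*
 * Average the segment mtimes of one section, weighted by valid blocks.
 * The sentinel is returned only for a section with no valid blocks; a
 * computed average that happens to equal the sentinel is nudged down by
 * one so callers never mistake real (possibly fuzzed) data for "empty".
 */
static uint64_t sketch_section_mtime(const struct sketch_seg *segs, size_t nsegs)
{
	uint64_t weighted = 0;
	uint64_t total_valid = 0;

	for (size_t i = 0; i < nsegs; i++) {
		/* Unsigned arithmetic; wraps on overflow for extreme inputs. */
		weighted += segs[i].mtime * segs[i].valid_blocks;
		total_valid += segs[i].valid_blocks;
	}

	if (total_valid == 0)
		return SKETCH_INVALID_MTIME;

	uint64_t avg = weighted / total_valid;
	if (avg == SKETCH_INVALID_MTIME)
		return SKETCH_INVALID_MTIME - 1;
	return avg;
}
```

With this shape, a caller that asserts the result is not the sentinel can only trip on a section that really has no valid blocks.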
Fixes: b19ee7272208 ("f2fs: introduce f2fs_get_section_mtime")
Reported-by: syzbot+b9972806adbe20a910eb@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-f2fs-devel/6768c82e.050a0220.226966.0035.GAE@google.com
Cc: liuderong <liuderong@oppo.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
When building for 32-bit platforms, for which 'size_t' is 'unsigned int',
there is a warning due to an incorrect format specifier:
fs/f2fs/inode.c:320:6: error: format specifies type 'unsigned long' but the argument has type 'unsigned int' [-Werror,-Wformat]
318 | f2fs_warn(sbi, "%s: inode (ino=%lx) has corrupted i_inline_xattr_size: %d, min: %lu, max: %lu",
| ~~~
| %u
319 | __func__, inode->i_ino, fi->i_inline_xattr_size,
320 | MIN_INLINE_XATTR_SIZE, MAX_INLINE_XATTR_SIZE);
| ^~~~~~~~~~~~~~~~~~~~~
fs/f2fs/f2fs.h:1855:46: note: expanded from macro 'f2fs_warn'
1855 | f2fs_printk(sbi, false, KERN_WARNING fmt, ##__VA_ARGS__)
| ~~~ ^~~~~~~~~~~
fs/f2fs/xattr.h:86:31: note: expanded from macro 'MIN_INLINE_XATTR_SIZE'
86 | #define MIN_INLINE_XATTR_SIZE (sizeof(struct f2fs_xattr_header) / sizeof(__le32))
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Use the format specifier for 'size_t', '%zu', to resolve the warning.
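A minimal user-space sketch of the same pattern: a macro built from sizeof has type 'size_t', so '%zu' is the portable specifier on both 32-bit and 64-bit targets. The struct layout and the SKETCH_ names here are stand-ins chosen for illustration, not the f2fs definitions.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for an on-disk header made of 32-bit words. */
struct sketch_xattr_header {
	uint32_t magic;
	uint32_t refcount;
	uint32_t reserved[4];
};

/* The division of two sizeof results has type size_t, not unsigned long. */
#define SKETCH_MIN_INLINE_XATTR_SIZE \
	(sizeof(struct sketch_xattr_header) / sizeof(uint32_t))

/* Format the value with %zu so the specifier matches size_t everywhere. */
static int sketch_format_min(char *buf, size_t len)
{
	return snprintf(buf, len, "min: %zu", SKETCH_MIN_INLINE_XATTR_SIZE);
}
```

Using '%lu' here would compile cleanly on typical 64-bit Linux targets, where 'size_t' is 'unsigned long', and warn only on 32-bit builds, which is why such mismatches tend to be found late.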
Fixes: 5c1768b67250 ("f2fs: fix to do sanity check correctly on i_inline_xattr_size")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|