Age | Commit message | Author |
|
Separate the op from the rq_flag_bits and have f2fs
set/get the bio using bio_set_op_attrs/bio_op.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
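
The idea of keeping the operation separate from the request flags can be illustrated with a small userspace sketch. Everything prefixed with sketch_/SKETCH_ below is hypothetical, and the bit layout (operation in the top bits, flags in the low bits of one word) is an assumption chosen for illustration, not a statement about the kernel's actual definitions of bio_set_op_attrs/bio_op.

```c
#include <assert.h>

/* Assumed layout: the top SKETCH_OP_BITS of the word hold the operation,
 * the remaining low bits hold modifier flags. */
#define SKETCH_OP_BITS   3
#define SKETCH_OP_SHIFT  (8 * sizeof(unsigned int) - SKETCH_OP_BITS)
#define SKETCH_FLAG_MASK ((1u << SKETCH_OP_SHIFT) - 1)

/* Hypothetical operations and flags. */
enum sketch_op { SKETCH_OP_READ, SKETCH_OP_WRITE, SKETCH_OP_DISCARD };
#define SKETCH_FLAG_SYNC (1u << 0)
#define SKETCH_FLAG_META (1u << 1)

struct sketch_bio {
	unsigned int rw;	/* packed operation + flags */
};

/* Store the operation and its flags in one call, as a setter would. */
static void sketch_set_op_attrs(struct sketch_bio *bio, unsigned int op,
				unsigned int flags)
{
	assert(op < (1u << SKETCH_OP_BITS));
	assert((flags & ~SKETCH_FLAG_MASK) == 0);
	bio->rw = (op << SKETCH_OP_SHIFT) | flags;
}

/* Extract only the operation, ignoring the flags. */
static unsigned int sketch_op(const struct sketch_bio *bio)
{
	return bio->rw >> SKETCH_OP_SHIFT;
}

/* Extract only the flags, ignoring the operation. */
static unsigned int sketch_flags(const struct sketch_bio *bio)
{
	return bio->rw & SKETCH_FLAG_MASK;
}
```

With accessors like these, callers compare sketch_op(bio) against an operation value instead of testing individual bits, so the operation and the flags can no longer be confused with each other.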
|
|
This has callers of submit_bio/submit_bio_wait set bio->bi_rw themselves
instead of passing it in. This makes their usage consistent with
generic_make_request and with how the other bio fields are set.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Fixed up fs/ext4/crypto.c
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
This is to avoid cache entry management overhead including radix tree.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This should be 1%, i.e. 10MB per 1GB.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
FI_DIRTY_INODE flag is not covered by inode page lock, so it can be unset
at any time like below.
Thread #1                        Thread #2
- lock_page(ipage)
- update i_fields
                                 - update i_size/i_blocks/and so on
                                 - set FI_DIRTY_INODE
- reset FI_DIRTY_INODE
- set_page_dirty(ipage)
In this case, we can lose the latest i_field information.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
We don't need the lock parameter, which is always true.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
The number should be covered by spin_lock. Otherwise we can see a wrong count
in f2fs_stat.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Remove deprecated parameter.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Previously, f2fs_write_data_pages() calls __f2fs_writepage() which calls
f2fs_write_data_page().
If f2fs_write_data_page() returns AOP_WRITEPAGE_ACTIVATE, __f2fs_writepage()
calls mapping_set_error(). But this should not happen every time, since
f2fs_write_data_page() sometimes skips writing pages without any error.
For example, volatile_write() gives EIO all the time, as Shuoran Liu pointed
out.
Reported-by: Shuoran Liu <liushuoran@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Now we can report an error to f2fs_lookup given by f2fs_find_entry.
Suggested-by: He YunLei <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Commit aaf9607516ed38825268515ef4d773289a44f429 ("f2fs: check node page
contents all the time") pointed out that "sometimes it was reported that
its contents was missing", so it checks the page's mapping and contents.
When "nid != nid_of_node(page)", ERR_PTR(-EIO) will be returned to the
caller. However, commit e1c51b9f1df2f9efc2ec11488717e40cd12015f9 ("f2fs:
clean up node page updating flow") moved the "nid != nid_of_node(page)" test
into "f2fs_bug_on(sbi, nid != nid_of_node(page))". As a result, when
F2FS_CHECK_FS is off and the "sometimes it was reported that its contents
was missing" case happens, a wrong page is returned to the caller.
This patch restores checking node page contents all the time, and
returns the errno so that the caller knows something is wrong and avoids
using the page. This patch also moves f2fs_bug_on to its proper location.
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
If there is no cold page, we don't need to do a loop to flush dirty
data pages.
On /dev/pmem0,
1. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048 conv=fsync
Before : 1.1 GB/s
After : 1.2 GB/s
2. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048
Before : 2.2 GB/s
After : 2.3 GB/s
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
For data pages, let's try to flush as much as possible in background.
On /dev/pmem0,
1. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048 conv=fsync
Before : 800 MB/s
After : 1.1 GB/s
2. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048
Before : 1.3 GB/s
After : 2.2 GB/s
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
If we get ENOMEM or EIO in f2fs_find_entry, we should stop right away.
Otherwise, for example, we can get duplicate directory entry by ->chash and
->clevel.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
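
The "stop right away" rule can be sketched in userspace C. The ERR_PTR/PTR_ERR/IS_ERR helpers below are local re-implementations of the well-known kernel idiom; sketch_find_entry(), sketch_add_link() and the inject_err parameter are hypothetical names invented for this illustration and are not f2fs functions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct sketch_entry {
	int ino;
};

/* Hypothetical lookup: returns the entry, NULL if it is absent, or an
 * ERR_PTR when the lookup itself failed (inject_err simulates that). */
static struct sketch_entry *sketch_find_entry(struct sketch_entry *table,
					       size_t n, int ino, int inject_err)
{
	if (inject_err)
		return ERR_PTR(-inject_err);
	for (size_t i = 0; i < n; i++)
		if (table[i].ino == ino)
			return &table[i];
	return NULL;
}

/* Hypothetical caller: a failed lookup is not the same as "not found".
 * Treating it as "not found" could insert a duplicate entry. */
static int sketch_add_link(struct sketch_entry *table, size_t *n, size_t cap,
			   int ino, int inject_err)
{
	struct sketch_entry *de;

	assert(table && n);
	de = sketch_find_entry(table, *n, ino, inject_err);
	if (IS_ERR(de))
		return (int)PTR_ERR(de);	/* stop right away */
	if (de)
		return -EEXIST;
	if (*n == cap)
		return -ENOSPC;
	table[(*n)++].ino = ino;
	return 0;
}
```

The key point is the IS_ERR() branch: without it, an ENOMEM or EIO during lookup would fall through to the insertion path as if the name did not exist.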
|
|
This patch removes writepages lock.
We can improve multi-threading performance.
tiobench, 32 threads, 4KB write per fsync on SSD
Before: 25.88 MB/s
After: 28.03 MB/s
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This patch sets flush_merge by default.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
If flush commands do not incur any congestion, we don't need to hand them to
the dispatching queue, which causes unnecessary latency.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This patch adds lazytime support.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
If roll-forward recovery can recover i_size, we don't need to update inode's
metadata during fsync.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This patch reduces calls to the following functions across the whole tree.
- sync_inode_page()
- update_inode_page()
- update_inode()
- f2fs_write_inode()
Instead, checkpoint will flush all the dirty inode metadata before syncing
node pages.
Note that this is doable because we call mark_inode_dirty_sync() for every
inode field change that also needs to update the on-disk inode.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This patch registers all the inodes which have dirty metadata so that they
can be synced while a checkpoint is in progress.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This patch calls mark_inode_dirty_sync() for the following on-disk inode
changes.
-> largest
-> ctime/mtime/atime
-> i_current_depth
-> i_xattr_nid
-> i_pino
-> i_advise
-> i_flags
-> i_mode
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This patch introduces f2fs_i_links_write() to call mark_inode_dirty_sync() when
changing inode->i_links.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This patch introduces f2fs_i_blocks_write() to call mark_inode_dirty_sync() when
changing inode->i_blocks.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This patch introduces f2fs_i_size_write() to call mark_inode_dirty_sync() with
i_size_write().
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This patch refactors to use inode pointer for set_inode_flag and
clear_inode_flag.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This reverts commit b951a4ec165af4973b2bd9c80fb5845fbd840435.
Conflicts:
fs/f2fs/checkpoint.c
|
|
it's not needed for file_operations of inodes located on fs defined
in the hosting module and for file_operations that go into procfs.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
preparation for similar switch in ->setxattr() (see the next commit for
rationale).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"In this round, as Ted pointed out, fscrypto allows one more key prefix
given by filesystem to resolve backward compatibility issues. Other
than that, we've fixed several error handling cases by introducing
a fault injection facility. We've also achieved performance
improvement in some workloads as well as a bunch of bug fixes.
Summary:
Enhancements:
- fs-specific prefix for fscrypto
- fault injection facility
- expose validity bitmaps for user to be aware of fragmentation
- fallocate/rm/preallocation speed up
- use percpu counters
Bug fixes:
- some inline_dentry/inline_data bugs
- error handling for atomic/volatile/orphan inodes
- recover broken superblock"
* tag 'for-f2fs-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (73 commits)
f2fs: fix to update dirty page count correctly
f2fs: flush pending bios right away when error occurs
f2fs: avoid ENOSPC fault in the recovery process
f2fs: make exit_f2fs_fs more clear
f2fs: use percpu_counter for total_valid_inode_count
f2fs: use percpu_counter for alloc_valid_block_count
f2fs: use percpu_counter for # of dirty pages in inode
f2fs: use percpu_counter for page counters
f2fs: use bio count instead of F2FS_WRITEBACK page count
f2fs: manipulate dirty file inodes when DATA_FLUSH is set
f2fs: add fault injection to sysfs
f2fs: no need inc dirty pages under inode lock
f2fs: fix incorrect error path handling in f2fs_move_rehashed_dirents
f2fs: fix i_current_depth during inline dentry conversion
f2fs: correct return value type of f2fs_fill_super
f2fs: fix deadlock when flush inline data
f2fs: avoid f2fs_bug_on during recovery
f2fs: show # of orphan inodes
f2fs: support in batch fzero in dnode page
f2fs: support in batch multi blocks preallocation
...
|
|
Let's gather the UUID related functions under one hood.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
If we fail to merge inline data into the inode page while flushing an inline
inode, we skip invoking inode_dec_dirty_pages, which makes the dirty page
count incorrect and results in a panic in ->evict_inode. Fix it.
------------[ cut here ]------------
kernel BUG at /home/yuchao/git/devf2fs/inode.c:336!
invalid opcode: 0000 [#1] PREEMPT SMP
CPU: 3 PID: 10004 Comm: umount Tainted: G O 4.6.0-rc5+ #17
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
task: f0c33000 ti: c5212000 task.ti: c5212000
EIP: 0060:[<f89aacb5>] EFLAGS: 00010202 CPU: 3
EIP is at f2fs_evict_inode+0x85/0x490 [f2fs]
EAX: 00000001 EBX: c4529ea0 ECX: 00000001 EDX: 00000000
ESI: c0131000 EDI: f89dd0a0 EBP: c5213e9c ESP: c5213e78
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
CR0: 80050033 CR2: b75878c0 CR3: 1a36a700 CR4: 000406f0
Stack:
c4529ea0 c4529ef4 c5213e8c c176d45c c4529ef4 00000000 c4529ea0 c4529fac
f89dd0a0 c5213eb0 c1204a68 c5213ed8 c452a2b4 c6680930 c5213ec0 c1204b64
c6680d44 c6680620 c5213eec c120588d ee84b000 ee84b5c0 c5214000 ee84b5e0
Call Trace:
[<c176d45c>] ? _raw_spin_unlock+0x2c/0x50
[<c1204a68>] evict+0xa8/0x170
[<c1204b64>] dispose_list+0x34/0x50
[<c120588d>] evict_inodes+0x10d/0x130
[<c11ea941>] generic_shutdown_super+0x41/0xe0
[<c1185190>] ? unregister_shrinker+0x40/0x50
[<c1185190>] ? unregister_shrinker+0x40/0x50
[<c11eac52>] kill_block_super+0x22/0x70
[<f89af23e>] kill_f2fs_super+0x1e/0x20 [f2fs]
[<c11eae1d>] deactivate_locked_super+0x3d/0x70
[<c11eb383>] deactivate_super+0x43/0x60
[<c1208ec9>] cleanup_mnt+0x39/0x80
[<c1208f50>] __cleanup_mnt+0x10/0x20
[<c107d091>] task_work_run+0x71/0x90
[<c105725a>] exit_to_usermode_loop+0x72/0x9e
[<c1001c7c>] do_fast_syscall_32+0x19c/0x1c0
[<c176dd48>] sysenter_past_esp+0x45/0x74
EIP: [<f89aacb5>] f2fs_evict_inode+0x85/0x490 [f2fs] SS:ESP 0068:c5213e78
---[ end trace d30536330b7fdc58 ]---
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Given errors, this patch flushes pending bios as soon as possible.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This patch avoids injecting ENOSPC, an impossible error, during the recovery process.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
init_f2fs_fs does:
1) f2fs_build_trace_ios
2) init_inodecache
3) create_node_manager_caches
4) create_segment_manager_caches
5) create_checkpoint_caches
6) create_extent_cache
7) kset_create_and_add
8) kobject_init_and_add
9) register_shrinker
10) register_filesystem
11) f2fs_create_root_stats
12) proc_mkdir
exit_f2fs_fs should do cleanup in the reverse order
to make the code clearer.
Signed-off-by: Tiezhu Yang <kernelpatch@126.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
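
The reverse-order rule can be sketched as follows. This is a minimal userspace illustration, not the f2fs code: sketch_init(), sketch_exit(), the numbered steps and the fail_at parameter are all hypothetical, and order_log only exists so the ordering can be observed.

```c
#include <assert.h>

#define LOG_CAP 8
static int order_log[LOG_CAP];	/* +id = set up, -id = torn down */
static int log_len;

/* Hypothetical setup step; fails when id == fail_at. */
static int step_up(int id, int fail_at)
{
	if (id == fail_at)
		return -1;
	assert(log_len < LOG_CAP);
	order_log[log_len++] = id;
	return 0;
}

/* Hypothetical teardown step matching step_up(id). */
static void step_down(int id)
{
	assert(log_len < LOG_CAP);
	order_log[log_len++] = -id;
}

/* Init runs steps 1..3; on failure it unwinds only what was set up,
 * newest first, using the usual goto ladder. */
static int sketch_init(int fail_at)
{
	int err;

	err = step_up(1, fail_at);
	if (err)
		goto fail;
	err = step_up(2, fail_at);
	if (err)
		goto undo_1;
	err = step_up(3, fail_at);
	if (err)
		goto undo_2;
	return 0;

undo_2:
	step_down(2);
undo_1:
	step_down(1);
fail:
	return err;
}

/* Exit mirrors a fully successful init in exactly the reverse order. */
static void sketch_exit(void)
{
	step_down(3);
	step_down(2);
	step_down(1);
}
```

Keeping the exit path and the error-unwind ladder in the same reverse order makes it easy to check by eye that every setup step has exactly one matching teardown.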
|
|
This patch uses percpu_counter to avoid stat_lock.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This patch uses percpu_counter for sbi->alloc_valid_block_count.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This patch adds percpu_counter for # of dirty pages in inode.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This patch substitutes percpu_counter for atomic_counter when counting
various types of pages.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
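
Why a per-CPU counter avoids contention on a shared lock or atomic can be shown with a simplified, single-threaded sketch. The sketch_counter type, NR_SLOTS and BATCH are assumptions made for this example; it imitates the general batching idea behind per-CPU counters and deliberately omits the locking and preemption handling that a real implementation needs.

```c
#include <assert.h>
#include <stdint.h>

#define NR_SLOTS 4	/* stand-in for the number of CPUs (assumed) */
#define BATCH    32	/* fold a local delta once it reaches this size */

struct sketch_counter {
	int64_t count;			/* shared total; would need a lock */
	int32_t local[NR_SLOTS];	/* per-slot deltas not yet folded in */
};

/* Fast path touches only the caller's slot; the shared total is updated
 * only when the local delta grows to BATCH in either direction. */
static void sketch_counter_add(struct sketch_counter *c, int slot,
			       int32_t delta)
{
	int32_t v;

	assert(slot >= 0 && slot < NR_SLOTS);
	v = c->local[slot] + delta;
	if (v >= BATCH || v <= -BATCH) {
		c->count += v;
		c->local[slot] = 0;
	} else {
		c->local[slot] = v;
	}
}

/* Cheap but approximate: ignores deltas still held in the slots. */
static int64_t sketch_counter_read(const struct sketch_counter *c)
{
	return c->count;
}

/* Exact but slower: walks every slot. */
static int64_t sketch_counter_sum(const struct sketch_counter *c)
{
	int64_t sum = c->count;

	for (int i = 0; i < NR_SLOTS; i++)
		sum += c->local[i];
	return sum;
}
```

The trade-off is that a cheap read may lag behind the true value by up to roughly NR_SLOTS * BATCH, which is usually acceptable for statistics such as page counts, while the exact sum is reserved for places that really need it.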
|
|
This can reduce page counting overhead.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs cleanups from Al Viro:
"More cleanups from Christoph"
* 'work.preadv2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
nfsd: use RWF_SYNC
fs: add RWF_DSYNC aand RWF_SYNC
ceph: use generic_write_sync
fs: simplify the generic_write_sync prototype
fs: add IOCB_SYNC and IOCB_DSYNC
direct-io: remove the offset argument to dio_complete
direct-io: eliminate the offset argument to ->direct_IO
xfs: eliminate the pos variable in xfs_file_dio_aio_write
filemap: remove the pos argument to generic_file_direct_write
filemap: remove pos variables in generic_file_read_iter
|
|
Backmerge to resolve a conflict in ovl_lookup_real();
"ovl_lookup_real(): use lookup_one_len_unlocked()" instead,
but it was too late in the cycle to rebase.
|
|
It needs to maintain dirty file inodes only if DATA_FLUSH is set.
Otherwise, let's avoid its overhead.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This patch introduces a new struct f2fs_fault_info and a global f2fs_fault
to save fault injection status. Fault injection entries are created in
/sys/fs/f2fs/fault_injection/ during f2fs module initialization.
Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
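
A fault injection facility of this kind can be sketched in userspace as a counter that forces every Nth call to fail. The struct sketch_fault_info below, its rate and ops fields, and the sketch_alloc() wrapper are hypothetical and only illustrate the general technique; they are not the layout of struct f2fs_fault_info or the sysfs interface.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical injection state: fail once every `rate` calls. */
struct sketch_fault_info {
	unsigned int rate;	/* 0 disables injection */
	unsigned int ops;	/* calls seen since the last injected fault */
};

/* Returns 1 when this call should be forced to fail. */
static int sketch_should_fail(struct sketch_fault_info *fi)
{
	assert(fi);
	if (fi->rate == 0)
		return 0;
	if (++fi->ops >= fi->rate) {
		fi->ops = 0;
		return 1;
	}
	return 0;
}

/* Allocation wrapper that pretends to run out of memory on demand, so
 * callers' error paths can be exercised deterministically. */
static void *sketch_alloc(struct sketch_fault_info *fi, size_t size)
{
	if (sketch_should_fail(fi)) {
		errno = ENOMEM;
		return NULL;
	}
	return malloc(size);
}
```

Routing allocations through a wrapper like this lets a test set the rate, run a workload, and verify that each error path releases its resources correctly.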
|
|
There is no need to increment dirty pages under the inode lock.
Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Fix two bugs in the error path of f2fs_move_rehashed_dirents:
- release the dir's inode page if kmalloc fails
- recover i_current_depth if the conversion fails
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
With the steps below, dentry pages become inaccessible later.
This is because we forget to update i_current_depth in the inode during inline
dentry conversion; if we then fail somewhere afterwards, i_current_depth is
left as 0 in a non-inline directory. Then, during ->lookup, that
current_depth value makes all dentry pages in the first level invisible. Fix
it.
1) mount f2fs with inline_dentry option
2) mkdir dir
3) touch 180 files named [0-179] in dir
4) touch 180 in dir (fail after inline dir conversion)
5) ll dir
ls: cannot access /mnt/f2fs/dir/0: No such file or directory
ls: cannot access /mnt/f2fs/dir/1: No such file or directory
ls: cannot access /mnt/f2fs/dir/2: No such file or directory
ls: cannot access /mnt/f2fs/dir/3: No such file or directory
ls: cannot access /mnt/f2fs/dir/4: No such file or directory
drwxr-xr-x 2 root root 4096 may 13 21:47 ./
drwxr-xr-x 3 root root 4096 may 13 21:46 ../
-????????? ? ? ? ? ? 0
-????????? ? ? ? ? ? 1
-????????? ? ? ? ? ? 10
-????????? ? ? ? ? ? 100
-????????? ? ? ? ? ? 101
-????????? ? ? ? ? ? 102
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Below backtrace info was reported by Yunlei He:
Call Trace:
[<ffffffff817a9395>] schedule+0x35/0x80
[<ffffffff817abb7d>] rwsem_down_read_failed+0xed/0x130
[<ffffffff813c12a8>] call_rwsem_down_read_failed+0x18/0x
[<ffffffff817ab1d0>] down_read+0x20/0x30
[<ffffffffa02a1a12>] f2fs_evict_inode+0x242/0x3a0 [f2fs]
[<ffffffff81217057>] evict+0xc7/0x1a0
[<ffffffff81217cd6>] iput+0x196/0x200
[<ffffffff812134f9>] __dentry_kill+0x179/0x1e0
[<ffffffff812136f9>] dput+0x199/0x1f0
[<ffffffff811fe77b>] __fput+0x18b/0x220
[<ffffffff811fe84e>] ____fput+0xe/0x10
[<ffffffff81097427>] task_work_run+0x77/0x90
[<ffffffff81074d62>] exit_to_usermode_loop+0x73/0xa2
[<ffffffff81003b7a>] do_syscall_64+0xfa/0x110
[<ffffffff817acf65>] entry_SYSCALL64_slow_path+0x25/0x25
Call Trace:
[<ffffffff817a9395>] schedule+0x35/0x80
[<ffffffff81216dc3>] __wait_on_freeing_inode+0xa3/0xd0
[<ffffffff810bc300>] ? autoremove_wake_function+0x40/0x4
[<ffffffff8121771d>] find_inode_fast+0x7d/0xb0
[<ffffffff8121794a>] ilookup+0x6a/0xd0
[<ffffffffa02bc740>] sync_node_pages+0x210/0x650 [f2fs]
[<ffffffff8122e690>] ? do_fsync+0x70/0x70
[<ffffffffa02b085e>] block_operations+0x9e/0xf0 [f2fs]
[<ffffffff8137b795>] ? bio_endio+0x55/0x60
[<ffffffffa02b0942>] write_checkpoint+0x92/0xba0 [f2fs]
[<ffffffff8117da57>] ? mempool_free_slab+0x17/0x20
[<ffffffff8117de8b>] ? mempool_free+0x2b/0x80
[<ffffffff8122e690>] ? do_fsync+0x70/0x70
[<ffffffffa02a53e3>] f2fs_sync_fs+0x63/0xd0 [f2fs]
[<ffffffff8129630f>] ? ext4_sync_fs+0xbf/0x190
[<ffffffff8122e6b0>] sync_fs_one_sb+0x20/0x30
[<ffffffff812002e9>] iterate_supers+0xb9/0x110
[<ffffffff8122e7b5>] sys_sync+0x55/0x90
[<ffffffff81003ae9>] do_syscall_64+0x69/0x110
[<ffffffff817acf65>] entry_SYSCALL64_slow_path+0x25/0x25
With the following execution sequence, we set inline_node in the inode page
after the inode was unlinked, resulting in the deadlock described below:
1. open file
2. write file
3. unlink file
4. write file
5. close file
Thread A                                 Thread B
- dput
 - iput_final
  - inode->i_state |= I_FREEING
  - evict
   - f2fs_evict_inode
                                         - f2fs_sync_fs
                                          - write_checkpoint
                                           - block_operations
                                            - f2fs_lock_all (down_write(cp_rwsem))
    - f2fs_lock_op (down_read(cp_rwsem))
                                            - sync_node_pages
                                             - ilookup
                                              - find_inode_fast
                                               - __wait_on_freeing_inode
                                                 (wait on I_FREEING clear)
To fix this, we now set the inline_node flag only for linked inodes.
Reported-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Tested-by: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: stable@vger.kernel.org # v4.6
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|