Age | Commit message (Collapse) | Author |
|
This fixes warning:
fs/btrfs/zoned.c:491:6: warning: variable ‘zone_size’ set but not used [-Wunused-but-set-variable]
491 | u64 zone_size;
which got introduced in 12659251ca5d ("btrfs: implement log-structured
superblock for ZONED mode"). We'll enable the warning by default and
want clean build until the relevant zoned patches land.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
This makes the file W=1 clean and fixes the following warnings:
fs/btrfs/extent_io.c:414: warning: Function parameter or member 'tree' not described in '__etree_search'
fs/btrfs/extent_io.c:414: warning: Function parameter or member 'offset' not described in '__etree_search'
fs/btrfs/extent_io.c:414: warning: Function parameter or member 'next_ret' not described in '__etree_search'
fs/btrfs/extent_io.c:414: warning: Function parameter or member 'prev_ret' not described in '__etree_search'
fs/btrfs/extent_io.c:414: warning: Function parameter or member 'p_ret' not described in '__etree_search'
fs/btrfs/extent_io.c:414: warning: Function parameter or member 'parent_ret' not described in '__etree_search'
fs/btrfs/extent_io.c:1607: warning: Function parameter or member 'tree' not described in 'find_contiguous_extent_bit'
fs/btrfs/extent_io.c:1607: warning: Function parameter or member 'start' not described in 'find_contiguous_extent_bit'
fs/btrfs/extent_io.c:1607: warning: Function parameter or member 'start_ret' not described in 'find_contiguous_extent_bit'
fs/btrfs/extent_io.c:1607: warning: Function parameter or member 'end_ret' not described in 'find_contiguous_extent_bit'
fs/btrfs/extent_io.c:1607: warning: Function parameter or member 'bits' not described in 'find_contiguous_extent_bit'
fs/btrfs/extent_io.c:1644: warning: Function parameter or member 'tree' not described in 'find_first_clear_extent_bit'
fs/btrfs/extent_io.c:1644: warning: Function parameter or member 'start' not described in 'find_first_clear_extent_bit'
fs/btrfs/extent_io.c:1644: warning: Function parameter or member 'start_ret' not described in 'find_first_clear_extent_bit'
fs/btrfs/extent_io.c:1644: warning: Function parameter or member 'end_ret' not described in 'find_first_clear_extent_bit'
fs/btrfs/extent_io.c:1644: warning: Function parameter or member 'bits' not described in 'find_first_clear_extent_bit'
fs/btrfs/extent_io.c:4187: warning: Function parameter or member 'epd' not described in 'extent_write_cache_pages'
fs/btrfs/extent_io.c:4187: warning: Excess function parameter 'data' description in 'extent_write_cache_pages'
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
With these fixes space-info.c is clear for W=1 warnings, namely the
following ones are fixed:
fs/btrfs/space-info.c:575: warning: Function parameter or member 'fs_info' not described in 'may_commit_transaction'
fs/btrfs/space-info.c:575: warning: Function parameter or member 'space_info' not described in 'may_commit_transaction'
fs/btrfs/space-info.c:1231: warning: Function parameter or member 'fs_info' not described in 'handle_reserve_ticket'
fs/btrfs/space-info.c:1231: warning: Function parameter or member 'space_info' not described in 'handle_reserve_ticket'
fs/btrfs/space-info.c:1231: warning: Function parameter or member 'ticket' not described in 'handle_reserve_ticket'
fs/btrfs/space-info.c:1231: warning: Function parameter or member 'flush' not described in 'handle_reserve_ticket'
fs/btrfs/space-info.c:1315: warning: Function parameter or member 'fs_info' not described in '__reserve_bytes'
fs/btrfs/space-info.c:1315: warning: Function parameter or member 'space_info' not described in '__reserve_bytes'
fs/btrfs/space-info.c:1315: warning: Function parameter or member 'orig_bytes' not described in '__reserve_bytes'
fs/btrfs/space-info.c:1315: warning: Function parameter or member 'flush' not described in '__reserve_bytes'
fs/btrfs/space-info.c:1427: warning: Function parameter or member 'root' not described in 'btrfs_reserve_metadata_bytes'
fs/btrfs/space-info.c:1427: warning: Function parameter or member 'block_rsv' not described in 'btrfs_reserve_metadata_bytes'
fs/btrfs/space-info.c:1427: warning: Function parameter or member 'orig_bytes' not described in 'btrfs_reserve_metadata_bytes'
fs/btrfs/space-info.c:1427: warning: Function parameter or member 'flush' not described in 'btrfs_reserve_metadata_bytes'
fs/btrfs/space-info.c:1462: warning: Function parameter or member 'fs_info' not described in 'btrfs_reserve_data_bytes'
fs/btrfs/space-info.c:1462: warning: Function parameter or member 'bytes' not described in 'btrfs_reserve_data_bytes'
fs/btrfs/space-info.c:1462: warning: Function parameter or member 'flush' not described in 'btrfs_reserve_data_bytes'
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
btrfs_inode_rsv_release/btrfs_delalloc_release_space
Fixes following warnings:
fs/btrfs/delalloc-space.c:205: warning: Function parameter or member 'inode' not described in 'btrfs_inode_rsv_release'
fs/btrfs/delalloc-space.c:205: warning: Function parameter or member 'qgroup_free' not described in 'btrfs_inode_rsv_release'
fs/btrfs/delalloc-space.c:472: warning: Function parameter or member 'reserved' not described in 'btrfs_delalloc_release_space'
fs/btrfs/delalloc-space.c:472: warning: Function parameter or member 'qgroup_free' not described in 'btrfs_delalloc_release_space'
fs/btrfs/delalloc-space.c:472: warning: Excess function parameter 'release_bytes' description in 'btrfs_delalloc_release_space'
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Fixes fs/btrfs/inode.c:3101: warning: Function parameter or member 'fs_info' not described in 'btrfs_wait_on_delayed_iputs'
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Fixes fs/btrfs/block-group.c:1570: warning: Function parameter or member 'fs_info' not described in 'btrfs_rmap_block'
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Fixes fs/btrfs/discard.c:203: warning: Function parameter or member 'now' not described in 'peek_discard_list'
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Fixes following W=1 warnings:
fs/btrfs/free-space-cache.c:1317: warning: Function parameter or member 'root' not described in '__btrfs_write_out_cache'
fs/btrfs/free-space-cache.c:1317: warning: Function parameter or member 'inode' not described in '__btrfs_write_out_cache'
fs/btrfs/free-space-cache.c:1317: warning: Function parameter or member 'ctl' not described in '__btrfs_write_out_cache'
fs/btrfs/free-space-cache.c:1317: warning: Function parameter or member 'block_group' not described in '__btrfs_write_out_cache'
fs/btrfs/free-space-cache.c:1317: warning: Function parameter or member 'io_ctl' not described in '__btrfs_write_out_cache'
fs/btrfs/free-space-cache.c:1317: warning: Function parameter or member 'trans' not described in '__btrfs_write_out_cache'
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
This fixes the following warnings:
fs/btrfs/delayed-ref.c:80: warning: Function parameter or member 'fs_info' not described in 'btrfs_delayed_refs_rsv_release'
fs/btrfs/delayed-ref.c:80: warning: Function parameter or member 'nr' not described in 'btrfs_delayed_refs_rsv_release'
fs/btrfs/delayed-ref.c:128: warning: Function parameter or member 'fs_info' not described in 'btrfs_migrate_to_delayed_refs_rsv'
fs/btrfs/delayed-ref.c:128: warning: Function parameter or member 'src' not described in 'btrfs_migrate_to_delayed_refs_rsv'
fs/btrfs/delayed-ref.c:128: warning: Function parameter or member 'num_bytes' not described in 'btrfs_migrate_to_delayed_refs_rsv'
fs/btrfs/delayed-ref.c:174: warning: Function parameter or member 'fs_info' not described in 'btrfs_delayed_refs_rsv_refill'
fs/btrfs/delayed-ref.c:174: warning: Function parameter or member 'flush' not described in 'btrfs_delayed_refs_rsv_refill'
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
This fixes following W=1 warnings:
fs/btrfs/file-item.c:27: warning: Cannot understand * @inode: the inode we want to update the disk_i_size for
on line 27 - I thought it was a doc line
fs/btrfs/file-item.c:65: warning: Cannot understand * @inode - the inode we're modifying
on line 65 - I thought it was a doc line
fs/btrfs/file-item.c:91: warning: Cannot understand * @inode - the inode we're modifying
on line 91 - I thought it was a doc line
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
This fixes the following compiler warnings:
fs/btrfs/extent_map.c:601: warning: Function parameter or member 'fs_info' not described in 'btrfs_add_extent_mapping'
fs/btrfs/extent_map.c:601: warning: Function parameter or member 'em_tree' not described in 'btrfs_add_extent_mapping'
fs/btrfs/extent_map.c:601: warning: Function parameter or member 'em_in' not described in 'btrfs_add_extent_mapping'
fs/btrfs/extent_map.c:601: warning: Function parameter or member 'start' not described in 'btrfs_add_extent_mapping'
fs/btrfs/extent_map.c:601: warning: Function parameter or member 'len' not described in 'btrfs_add_extent_mapping'
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Fixes fs/btrfs/extent_map.c:399: warning: Function parameter or member
'modified' not described in 'add_extent_mapping'
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
[BUG]
There is a long existing bug in the last parameter of
btrfs_add_ordered_extent(), in commit 771ed689d2cd ("Btrfs: Optimize
compressed writeback and reads") back to 2008.
In that ancient commit btrfs_add_ordered_extent() expects the @type
parameter to be one of the following:
- BTRFS_ORDERED_REGULAR
- BTRFS_ORDERED_NOCOW
- BTRFS_ORDERED_PREALLOC
- BTRFS_ORDERED_COMPRESSED
But we pass 0 in cow_file_range(), which means BTRFS_ORDERED_IO_DONE.
Ironically extra check in __btrfs_add_ordered_extent() won't set the bit
if we see (type == IO_DONE || type == IO_COMPLETE), and avoid any
obvious bug.
But this still leads to regular COW ordered extent having no bit to
indicate its type in various trace events, rendering REGULAR bit
useless.
[FIX]
Change the following aspects to avoid such problem:
- Reorder btrfs_ordered_extent::flags
Now the type bits go first (REGULAR/NOCOW/PREALLCO/COMPRESSED), then
DIRECT bit, finally extra status bits like IO_DONE/COMPLETE/IOERR.
- Add extra ASSERT() for btrfs_add_ordered_extent_*()
- Remove @type parameter for btrfs_add_ordered_extent_compress()
As the only valid @type here is BTRFS_ORDERED_COMPRESSED.
- Remove the unnecessary special check for IO_DONE/COMPLETE in
__btrfs_add_ordered_extent()
This is just to make the code work, with extra ASSERT(), there are
limited values can be passed in.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Fix below warnings reported by coccicheck:
./fs/btrfs/raid56.c:237:2-8: WARNING: NULL check before some freeing
functions is not needed.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Yang Li <abaci-bugfix@linux.alibaba.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Zygo reported the following panic when testing my error handling patches
for relocation:
kernel BUG at fs/btrfs/backref.c:2545!
invalid opcode: 0000 [#1] SMP KASAN PTI CPU: 3 PID: 8472 Comm: btrfs Tainted: G W 14
Hardware name: QEMU Standard PC (i440FX + PIIX,
Call Trace:
btrfs_backref_error_cleanup+0x4df/0x530
build_backref_tree+0x1a5/0x700
? _raw_spin_unlock+0x22/0x30
? release_extent_buffer+0x225/0x280
? free_extent_buffer.part.52+0xd7/0x140
relocate_tree_blocks+0x2a6/0xb60
? kasan_unpoison_shadow+0x35/0x50
? do_relocation+0xc10/0xc10
? kasan_kmalloc+0x9/0x10
? kmem_cache_alloc_trace+0x6a3/0xcb0
? free_extent_buffer.part.52+0xd7/0x140
? rb_insert_color+0x342/0x360
? add_tree_block.isra.36+0x236/0x2b0
relocate_block_group+0x2eb/0x780
? merge_reloc_roots+0x470/0x470
btrfs_relocate_block_group+0x26e/0x4c0
btrfs_relocate_chunk+0x52/0x120
btrfs_balance+0xe2e/0x18f0
? pvclock_clocksource_read+0xeb/0x190
? btrfs_relocate_chunk+0x120/0x120
? lock_contended+0x620/0x6e0
? do_raw_spin_lock+0x1e0/0x1e0
? do_raw_spin_unlock+0xa8/0x140
btrfs_ioctl_balance+0x1f9/0x460
btrfs_ioctl+0x24c8/0x4380
? __kasan_check_read+0x11/0x20
? check_chain_key+0x1f4/0x2f0
? __asan_loadN+0xf/0x20
? btrfs_ioctl_get_supported_features+0x30/0x30
? kvm_sched_clock_read+0x18/0x30
? check_chain_key+0x1f4/0x2f0
? lock_downgrade+0x3f0/0x3f0
? handle_mm_fault+0xad6/0x2150
? do_vfs_ioctl+0xfc/0x9d0
? ioctl_file_clone+0xe0/0xe0
? check_flags.part.50+0x6c/0x1e0
? check_flags.part.50+0x6c/0x1e0
? check_flags+0x26/0x30
? lock_is_held_type+0xc3/0xf0
? syscall_enter_from_user_mode+0x1b/0x60
? do_syscall_64+0x13/0x80
? rcu_read_lock_sched_held+0xa1/0xd0
? __kasan_check_read+0x11/0x20
? __fget_light+0xae/0x110
__x64_sys_ioctl+0xc3/0x100
do_syscall_64+0x37/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
This occurs because of this check
if (RB_EMPTY_NODE(&upper->rb_node))
BUG_ON(!list_empty(&node->upper));
As we are dropping the backref node, if we discover that our upper node
in the edge we just cleaned up isn't linked into the cache that we are
now done with this node, thus the BUG_ON().
However this is an erroneous assumption, as we will look up all the
references for a node first, and then process the pending edges. All of
the 'upper' nodes in our pending edges won't be in the cache's rb_tree
yet, because they haven't been processed. We could very well have many
edges still left to cleanup on this node.
The fact is we simply do not need this check, we can just process all of
the edges only for this node, because below this check we do the
following
if (list_empty(&upper->lower)) {
list_add_tail(&upper->lower, &cache->leaves);
upper->lowest = 1;
}
If the upper node truly isn't used yet, then we add it to the
cache->leaves list to be cleaned up later. If it is still used then the
last child node that has it linked into its node will add it to the
leaves list and then it will be cleaned up.
Fix this problem by dropping this logic altogether. With this fix I no
longer see the panic when testing with error injection in the backref
code.
CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
While testing the error paths in relocation, I hit the following lockdep
splat:
======================================================
WARNING: possible circular locking dependency detected
5.10.0-rc3+ #206 Not tainted
------------------------------------------------------
btrfs-balance/1571 is trying to acquire lock:
ffff8cdbcc8f77d0 (&head_ref->mutex){+.+.}-{3:3}, at: btrfs_lookup_extent_info+0x156/0x3b0
but task is already holding lock:
ffff8cdbc54adbf8 (btrfs-tree-00){++++}-{3:3}, at: __btrfs_tree_lock+0x27/0x100
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (btrfs-tree-00){++++}-{3:3}:
down_write_nested+0x43/0x80
__btrfs_tree_lock+0x27/0x100
btrfs_search_slot+0x248/0x890
relocate_tree_blocks+0x490/0x650
relocate_block_group+0x1ba/0x5d0
kretprobe_trampoline+0x0/0x50
-> #1 (btrfs-csum-01){++++}-{3:3}:
down_read_nested+0x43/0x130
__btrfs_tree_read_lock+0x27/0x100
btrfs_read_lock_root_node+0x31/0x40
btrfs_search_slot+0x5ab/0x890
btrfs_del_csums+0x10b/0x3c0
__btrfs_free_extent+0x49d/0x8e0
__btrfs_run_delayed_refs+0x283/0x11f0
btrfs_run_delayed_refs+0x86/0x220
btrfs_start_dirty_block_groups+0x2ba/0x520
kretprobe_trampoline+0x0/0x50
-> #0 (&head_ref->mutex){+.+.}-{3:3}:
__lock_acquire+0x1167/0x2150
lock_acquire+0x116/0x3e0
__mutex_lock+0x7e/0x7b0
btrfs_lookup_extent_info+0x156/0x3b0
walk_down_proc+0x1c3/0x280
walk_down_tree+0x64/0xe0
btrfs_drop_subtree+0x182/0x260
do_relocation+0x52e/0x660
relocate_tree_blocks+0x2ae/0x650
relocate_block_group+0x1ba/0x5d0
kretprobe_trampoline+0x0/0x50
other info that might help us debug this:
Chain exists of:
&head_ref->mutex --> btrfs-csum-01 --> btrfs-tree-00
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(btrfs-tree-00);
lock(btrfs-csum-01);
lock(btrfs-tree-00);
lock(&head_ref->mutex);
*** DEADLOCK ***
5 locks held by btrfs-balance/1571:
#0: ffff8cdb89749ff8 (&fs_info->delete_unused_bgs_mutex){+.+.}-{3:3}, at: btrfs_balance+0x563/0xf40
#1: ffff8cdb89748838 (&fs_info->cleaner_mutex){+.+.}-{3:3}, at: btrfs_relocate_block_group+0x156/0x300
#2: ffff8cdbc2c16650 (sb_internal#2){.+.+}-{0:0}, at: start_transaction+0x413/0x5c0
#3: ffff8cdbc135f538 (btrfs-treloc-01){+.+.}-{3:3}, at: __btrfs_tree_lock+0x27/0x100
#4: ffff8cdbc54adbf8 (btrfs-tree-00){++++}-{3:3}, at: __btrfs_tree_lock+0x27/0x100
stack backtrace:
CPU: 1 PID: 1571 Comm: btrfs-balance Not tainted 5.10.0-rc3+ #206
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
Call Trace:
dump_stack+0x8b/0xb0
check_noncircular+0xcf/0xf0
? trace_call_bpf+0x139/0x260
__lock_acquire+0x1167/0x2150
lock_acquire+0x116/0x3e0
? btrfs_lookup_extent_info+0x156/0x3b0
__mutex_lock+0x7e/0x7b0
? btrfs_lookup_extent_info+0x156/0x3b0
? btrfs_lookup_extent_info+0x156/0x3b0
? release_extent_buffer+0x124/0x170
? _raw_spin_unlock+0x1f/0x30
? release_extent_buffer+0x124/0x170
btrfs_lookup_extent_info+0x156/0x3b0
walk_down_proc+0x1c3/0x280
walk_down_tree+0x64/0xe0
btrfs_drop_subtree+0x182/0x260
do_relocation+0x52e/0x660
relocate_tree_blocks+0x2ae/0x650
? add_tree_block+0x149/0x1b0
relocate_block_group+0x1ba/0x5d0
elfcorehdr_read+0x40/0x40
? elfcorehdr_read+0x40/0x40
? btrfs_balance+0x796/0xf40
? __kthread_parkme+0x66/0x90
? btrfs_balance+0xf40/0xf40
? balance_kthread+0x37/0x50
? kthread+0x137/0x150
? __kthread_bind_mask+0x60/0x60
? ret_from_fork+0x1f/0x30
As you can see this is bogus, we never take another tree's lock under
the csum lock. This happens because sometimes we have to read tree
blocks from disk without knowing which root they belong to during
relocation. We defaulted to an owner of 0, which translates to an fs
tree. This is fine as all fs trees have the same class, but obviously
isn't fine if the block belongs to a COW only tree.
Thankfully COW only trees only have their owners root as a reference to
them, and since we already look up the extent information during
relocation, go ahead and check and see if this block might belong to a
COW only tree, and if so save the owner in the tree_block struct. This
allows us to read_tree_block with the proper owner, which gets rid of
this lockdep splat.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
This patch will extract the code to grab an extent buffer from a page
into a helper, grab_extent_buffer_from_page().
This reduces one indent level, and provides the work place for later
expansion for subapge support.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
The original comment is from the initial merge, which has several
problems:
- No holes check any more
- No inline decision is made
Update the out-of-date comment with more correct one.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
The refactoring involves the following modifications:
- iosize alignment
In fact we don't really need to manually do alignment at all.
All extent maps should already be aligned, thus basic ASSERT() check
would be enough.
- redundant variables
We have extra variable like blocksize/pg_offset/end.
They are all unnecessary.
@blocksize can be replaced by sectorsize size directly, and it's only
used to verify the em start/size is aligned.
@pg_offset can be easily calculated using @cur and page_offset(page).
@end is just assigned from @page_end and never modified, use
"start + PAGE_SIZE - 1" directly and remove @page_end.
- remove some BUG_ON()s
The BUG_ON()s are for extent map, which we have tree-checker to check
on-disk extent data item and runtime check.
ASSERT() should be enough.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
The parameter offset is confusing, it's supposed to be the disk bytenr
of metadata/data. Rename it to disk_bytenr and update the comment.
Also rename each offset passed to submit_extent_page() as @disk_bytenr
so they're consistent.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
The refactoring involves the following modifications:
- Return bool instead of int
- Parameter update for @cached of btrfs_dec_test_first_ordered_pending()
For btrfs_dec_test_first_ordered_pending(), @cached is only used to
return the finished ordered extent.
Rename it to @finished_ret.
- Comment updates
* Change one stale comment
Which still refers to btrfs_dec_test_ordered_pending(), but the
context is calling btrfs_dec_test_first_ordered_pending().
* Follow the common comment style for both functions
Add more detailed descriptions for parameters and the return value
* Move the reason why test_and_set_bit() is used into the call sites
- Change how the return value is calculated
The most anti-human part of the return value is:
if (...)
ret = 1;
...
return ret == 0;
This means, when we set ret to 1, the function returns 0.
Change the local variable name to @finished, and directly return the
value of it.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
btrfs_dio_private::bytes is only assigned from bio::bi_iter::bi_size,
which is never larger than U32.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Following the rework in e076ab2a2ca7 ("btrfs: shrink delalloc pages
instead of full inodes") the nr variable is no longer passed by
reference to start_delalloc_inodes hence it cannot change. Additionally
we are always guaranteed for it to be positive number hence it's
redundant to have it as a condition in the loop. Simply remove that
usage.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
It's currently u64 which gets instantly translated either to LONG_MAX
(if U64_MAX is passed) or cast to an unsigned long (which is in fact,
wrong because writeback_control::nr_to_write is a signed, long type).
Just convert the function's argument to be long time which obviates the
need to manually convert u64 value to a long. Adjust all call sites
which pass U64_MAX to pass LONG_MAX. Finally ensure that in
shrink_delalloc the u64 is converted to a long without overflowing,
resulting in a negative number.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
After commit 040ee6120cb670 ("Btrfs: send, improve clone range") we do not
use anymore the data_offset field of struct backref_ctx, as after that we
do all the necessary checks for the data offset of file extent items at
clone_range(). Since there are no more users of data_offset from that
structure, remove it.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Instead of having three 'if' to handle non-NULL return value consolidate
this in one 'if (ret)'. That way the code is more obvious:
- Always drop delete_unused_bgs_mutex if ret is not NULL
- If ret is negative -> goto done
- If it's 1 -> reset ret to 0, release the path and finish the loop.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
I noticed that shared ref entries in ref-verify didn't have the proper
owner set, which caused me to think there was something seriously wrong.
However the problem is if we have a parent we simply weren't filling out
the owner part of the reference, even though we have it.
Fix this by making sure we set all the proper fields when we modify a
reference, this way we'll have the proper owner if a problem happens and
we don't waste time thinking we're updating the wrong level.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
I noticed that sometimes I would have the wrong level printed out with
ref-verify while testing some error injection related problems. This is
because we only get the level from the main extent item, but our
references could go off the current leaf into another, and at that point
we lose our level.
Fix this by keeping track of the last tree block level that we found,
the same way we keep track of our bytenr and num_bytes, in case we
happen to wander into another leaf while still processing the references
for a bytenr.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
I was attempting to reproduce a problem that Zygo hit, but my error
injection wasn't firing for a few of the common calls to
btrfs_should_cancel_balance. This is because the compiler decided to
inline it at these spots. Keep this from happening by explicitly
marking the function as noinline so that error injection will always
work.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
The following patches are going to address error handling in relocation,
in order to test those patches I need to be able to inject errors in
btrfs_search_slot and btrfs_cow_block, as we call both of these pretty
often in different cases during relocation.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
It's no longer used. While at it also remove new_dirid in create_subvol
as it's used in a single place and open code it. No functional changes.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Adjust the way free_objectid is being initialized, it now stores
BTRFS_FIRST_FREE_OBJECTID rather than the, somewhat arbitrary,
BTRFS_FIRST_FREE_OBJECTID - 1. This change also has the added benefit
that now it becomes unnecessary to explicitly initialize free_objectid
for a newly create fs root.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
This reflects the true purpose of the member as it's being used solely
in context where a new objectid is being allocated. Future changes will
also change the way it's being used to closely follow this semantics.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
This better reflects the semantics of the function i.e no search is
performed whatsoever.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
This function is used to initialize the in-memory
btrfs_root::highest_objectid member, which is used to get an available
objectid. Rename it to better reflect its semantics.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
First replace all inode instances with a pointer to btrfs_inode. This
removes multiple invocations of the BTRFS_I macro, subsequently remove
2 local variables as they are called only once and simply refer to
them directly.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Return value in __load_free_space_cache is not properly set after
(unlikely) memory allocation failures and 0 is returned instead.
This is not a problem for the caller load_free_space_cache because only
value 1 is considered as 'cache loaded' but for clarity it's better
to set the errors accordingly.
Fixes: a67509c30079 ("Btrfs: add a io_ctl struct and helpers for dealing with the space cache")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
While doing error injection I would sometimes get a corrupt file system.
This is because I was injecting errors at btrfs_search_slot, but would
only do it one time per stack. This uncovered a problem in
commit_fs_roots, where if we get an error we would just break. However
we're in a nested loop, the first loop being a loop to find all the
dirty fs roots, and then subsequent root updates would succeed clearing
the error value.
This isn't likely to happen in real scenarios, however we could
potentially get a random ENOMEM once and then not again, and we'd end up
with a corrupted file system. Fix this by moving the error checking
around a bit to the main loop, as this is the only place where something
will fail, and return the error as soon as it occurs.
With this patch my reproducer no longer corrupts the file system.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
There are controlled by f2fs_freeze().
This fixes xfstests/generic/068 which is stuck at
task:f2fs_ckpt-252:3 state:D stack: 0 pid: 5761 ppid: 2 flags:0x00004000
Call Trace:
__schedule+0x44c/0x8a0
schedule+0x4f/0xc0
percpu_rwsem_wait+0xd8/0x140
? percpu_down_write+0xf0/0xf0
__percpu_down_read+0x56/0x70
issue_checkpoint_thread+0x12c/0x160 [f2fs]
? wait_woken+0x80/0x80
kthread+0x114/0x150
? __checkpoint_and_complete_reqs+0x110/0x110 [f2fs]
? kthread_park+0x90/0x90
ret_from_fork+0x22/0x30
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
For SQPOLL rings tctx_inflight() always returns zero, so it might skip
doing full cancelation. It's fine because we jam all sqpoll submissions
in any case and do go through files cancel for them, but not nice.
Do the intended full cancellation, by mimicking __io_uring_task_cancel()
waiting but impersonating SQPOLL task.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
These functions always return '0' and no callers use the return value.
So make it a void function.
This eliminates the following coccicheck warning:
./fs/jfs/jfs_txnmgr.c:1365:5-7: Unneeded variable: "rc". Return "0" on
line 1414
./fs/jfs/jfs_txnmgr.c:1422:5-7: Unneeded variable: "rc". Return "0" on
line 1527
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
|
|
Currently there is no way to differentiate the file with alive owner
from the file with dead owner but pid of the owner reused. That's why
CRIU can't actually know if it needs to restore file owner or not,
because if it restores owner but actual owner was dead, this can
introduce unexpected signals to the "false"-owner (which reused the
pid).
Let's change the api, so that F_GETOWN(EX) returns 0 in case actual
owner is dead already. This comports with the POSIX spec, which
states that a PID of 0 indicates that no signal will be sent.
Cc: Jeff Layton <jlayton@kernel.org>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Andrei Vagin <avagin@gmail.com>
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
|
|
Add support for an additional filesystem version (sb_fs_format = 1802).
When a filesystem with the new version is mounted, the filesystem
supports "trusted.*" xattrs.
In addition, version 1802 filesystems implement a form of forward
compatibility for xattrs: when xattrs with an unknown prefix (ea_type)
are found on a version 1802 filesystem, those attributes are not shown
by listxattr, and they are not accessible by getxattr, setxattr, or
removexattr.
This mechanism might turn out to be what we need in the future, but if
not, we can always bump the filesystem version and break compatibility
instead.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Andrew Price <anprice@redhat.com>
|
|
Turn on rgrplvb by default for sb_fs_format > 1801.
Mount options still have to override this so a new args field to
differentiate between 'off' and 'not specified' is added, and the new
default is applied only when it's not specified.
Signed-off-by: Andrew Price <anprice@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Add support for FS_VERITY_METADATA_TYPE_SIGNATURE to
FS_IOC_READ_VERITY_METADATA. This allows a userspace server program to
retrieve the built-in signature (if present) of a verity file for
serving to a client which implements fs-verity compatible verification.
See the patch which introduced FS_IOC_READ_VERITY_METADATA for more
details.
The ability for userspace to read the built-in signatures is also useful
because it allows a system that is using the in-kernel signature
verification to migrate to userspace signature verification.
This has been tested using a new xfstest which calls this ioctl via a
new subcommand for the 'fsverity' program from fsverity-utils.
Link: https://lore.kernel.org/r/20210115181819.34732-7-ebiggers@kernel.org
Reviewed-by: Victor Hsieh <victorhsieh@google.com>
Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
|
|
Add support for FS_VERITY_METADATA_TYPE_DESCRIPTOR to
FS_IOC_READ_VERITY_METADATA. This allows a userspace server program to
retrieve the fs-verity descriptor of a file for serving to a client
which implements fs-verity compatible verification. See the patch which
introduced FS_IOC_READ_VERITY_METADATA for more details.
"fs-verity descriptor" here means only the part that userspace cares
about because it is hashed to produce the file digest. It doesn't
include the signature which ext4 and f2fs append to the
fsverity_descriptor struct when storing it on-disk, since that way of
storing the signature is an implementation detail. The next patch adds
a separate metadata_type value for retrieving the signature separately.
This has been tested using a new xfstest which calls this ioctl via a
new subcommand for the 'fsverity' program from fsverity-utils.
Link: https://lore.kernel.org/r/20210115181819.34732-6-ebiggers@kernel.org
Reviewed-by: Victor Hsieh <victorhsieh@google.com>
Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
|
|
Add support for FS_VERITY_METADATA_TYPE_MERKLE_TREE to
FS_IOC_READ_VERITY_METADATA. This allows a userspace server program to
retrieve the Merkle tree of a verity file for serving to a client which
implements fs-verity compatible verification. See the patch which
introduced FS_IOC_READ_VERITY_METADATA for more details.
This has been tested using a new xfstest which calls this ioctl via a
new subcommand for the 'fsverity' program from fsverity-utils.
Link: https://lore.kernel.org/r/20210115181819.34732-5-ebiggers@kernel.org
Reviewed-by: Victor Hsieh <victorhsieh@google.com>
Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
|
|
Add an ioctl FS_IOC_READ_VERITY_METADATA which will allow reading verity
metadata from a file that has fs-verity enabled, including:
- The Merkle tree
- The fsverity_descriptor (not including the signature if present)
- The built-in signature, if present
This ioctl has similar semantics to pread(). It is passed the type of
metadata to read (one of the above three), and a buffer, offset, and
size. It returns the number of bytes read or an error.
Separate patches will add support for each of the above metadata types.
This patch just adds the ioctl itself.
This ioctl doesn't make any assumption about where the metadata is
stored on-disk. It does assume the metadata is in a stable format, but
that's basically already the case:
- The Merkle tree and fsverity_descriptor are defined by how fs-verity
file digests are computed; see the "File digest computation" section
of Documentation/filesystems/fsverity.rst. Technically, the way in
which the levels of the tree are ordered relative to each other wasn't
previously specified, but it's logical to put the root level first.
- The built-in signature is the value passed to FS_IOC_ENABLE_VERITY.
This ioctl is useful because it allows writing a server program that
takes a verity file and serves it to a client program, such that the
client can do its own fs-verity compatible verification of the file.
This only makes sense if the client doesn't trust the server and if the
server needs to provide the storage for the client.
More concretely, there is interest in using this ability in Android to
export APK files (which are protected by fs-verity) to "protected VMs".
This would use Protected KVM (https://lwn.net/Articles/836693), which
provides an isolated execution environment without having to trust the
traditional "host". A "guest" VM can boot from a signed image and
perform specific tasks in a minimum trusted environment using files that
have fs-verity enabled on the host, without trusting the host or
requiring that the guest has its own trusted storage.
Technically, it would be possible to duplicate the metadata and store it
in separate files for serving. However, that would be less efficient
and would require extra care in userspace to maintain file consistency.
In addition to the above, the ability to read the built-in signatures is
useful because it allows a system that is using the in-kernel signature
verification to migrate to userspace signature verification.
Link: https://lore.kernel.org/r/20210115181819.34732-4-ebiggers@kernel.org
Reviewed-by: Victor Hsieh <victorhsieh@google.com>
Acked-by: Jaegeuk Kim <jaegeuk@kernel.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
|
|
Now that fsverity_get_descriptor() validates the sig_size field,
fsverity_verify_signature() doesn't need to do it.
Just change the prototype of fsverity_verify_signature() to take the
signature directly rather than take a fsverity_descriptor.
Link: https://lore.kernel.org/r/20210115181819.34732-3-ebiggers@kernel.org
Reviewed-by: Victor Hsieh <victorhsieh@google.com>
Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org>
Reviewed-by: Amy Parker <enbyamy@gmail.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
|