summaryrefslogtreecommitdiff
path: root/fs/btrfs
AgeCommit message (Collapse)Author
2016-04-29btrfs: untangle gotos a bit in __clear_extent_bitDavid Sterba
Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-29btrfs: untangle gotos a bit in __set_extent_bitDavid Sterba
Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-29btrfs: sink gfp parameter to set_record_extent_bitsDavid Sterba
Single caller passes GFP_NOFS. Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-29btrfs: sink gfp parameter to set_extent_newDavid Sterba
Single caller passes GFP_NOFS. Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-29btrfs: sink gfp parameter to set_extent_defragDavid Sterba
Single caller passes GFP_NOFS. Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-29btrfs: sink gfp parameter to set_extent_delallocDavid Sterba
Callers pass GFP_NOFS and tests pass GFP_KERNEL, but using NOFS there does not hurt. No need to pass the flags around. Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-29btrfs: sink gfp parameter to clear_extent_dirtyDavid Sterba
Callers pass GFP_NOFS. No need to pass the flags around. Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-29btrfs: sink gfp parameter to clear_record_extent_bitsDavid Sterba
Callers pass GFP_NOFS. No need to pass the flags around. Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-29btrfs: sink gfp parameter to clear_extent_bitsDavid Sterba
Callers pass GFP_NOFS and GFP_KERNEL. No need to pass the flags around. Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-29btrfs: sink gfp parameter to set_extent_bitsDavid Sterba
All callers pass GFP_NOFS. Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: uapi/linux/btrfs_tree.h migration, item types and definesJeff Mahoney
The BTRFS_IOC_SEARCH_TREE ioctl returns file system items directly to userspace. In order to decode them, full type information is required. Create a new header, btrfs_tree to contain these since most users won't need them. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: uapi/linux/btrfs.h migration, move struct btrfs_ioctl_defrag_range_argsJeff Mahoney
struct btrfs_ioctl_defrag_range_args is used by the BTRFS_IOC_DEFRAG_RANGE ioctl. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: uapi/linux/btrfs.h migration, move balance flagsJeff Mahoney
The BTRFS_BALANCE_* flags are used by struct btrfs_ioctl_balance_args.flags and btrfs_ioctl_balance_args.{data,meta,sys}.flags in the BTRFS_IOC_BALANCE ioctl. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: uapi/linux/btrfs.h migration, move feature flagsJeff Mahoney
The compat/compat_ro/incompat feature flags are used by the feature set/get ioctls. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: uapi/linux/btrfs.h migration, qgroup limit flagsJeff Mahoney
The BTRFS_QGROUP_LIMIT_* flags are required to tell the kernel which fields are valid when using the BTRFS_IOC_QGROUP_LIMIT ioctl. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: uapi/linux/btrfs.h migration, move BTRFS_LABEL_SIZEJeff Mahoney
BTRFS_LABEL_SIZE is required to define the BTRFS_IOC_GET_FSLABEL and BTRFS_IOC_SET_FSLABEL ioctls. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: refactor btrfs_dev_replace_start for reuseAnand Jain
A refactor patch, and avoids user input verification in the btrfs_dev_replace_start(), and so this function can be reused. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: use fs_info directlyAnand Jain
Local variable fs_info, contains root->fs_info, use it. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: rename flags for vol args v2David Sterba
Rename BTRFS_DEVICE_BY_ID so it's more descriptive that we specify the device by id, it'll be part of the public API. The mask of supported flags is also renamed, only for internal use. The error code for unknown flags is EOPNOTSUPP, fixed. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: rename btrfs_find_device_by_user_inputDavid Sterba
For clarity how we are going to find the device, let's call it a device specifier, devspec for short. Also rename the arguments that are a leftover from previous function purpose. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: use existing device constraints table btrfs_raid_arrayDavid Sterba
We should avoid duplicating the device constraints, let's use the btrfs_raid_array in btrfs_check_raid_min_devices. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: introduce raid-type to error-code table, for minimum device constraintDavid Sterba
Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: pass number of devices to btrfs_check_raid_min_devicesDavid Sterba
Before this patch, btrfs_check_raid_min_devices would do an off-by-one check of the constraints and not the miminmum check, as its name suggests. This is not a problem if the only caller is device remove, but would be confusing for others. Add an argument with the exact number and let the caller(s) decide if this needs any adjustments, like when device replace is running. Reviewed-by: Anand Jain <anand.jain@oracle.com> Tested-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: rename __check_raid_min_devicesDavid Sterba
Underscores are for special functions, use the full prefix for better stacktrace recognition. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: optimize check for stale deviceAnand Jain
Optimize check for stale device to only be checked when there is device added or changed. If there is no update to the device, there is no need to call btrfs_free_stale_device(). Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: introduce device delete by devidAnand Jain
This introduces new ioctl BTRFS_IOC_RM_DEV_V2, which uses enhanced struct btrfs_ioctl_vol_args_v2 to carry devid as an user argument. The patch won't delete the old ioctl interface and so kernel remains backward compatible with user land progs. Test case/script: echo "0 $(blockdev --getsz /dev/sdf) linear /dev/sdf 0" | dmsetup create bad_disk mkfs.btrfs -f -d raid1 -m raid1 /dev/sdd /dev/sde /dev/mapper/bad_disk mount /dev/sdd /btrfs dmsetup suspend bad_disk echo "0 $(blockdev --getsz /dev/sdf) error /dev/sdf 0" | dmsetup load bad_disk dmsetup resume bad_disk echo "bad disk failed. now deleting/replacing" btrfs dev del 3 /btrfs echo $? btrfs fi show /btrfs umount /btrfs btrfs-show-super /dev/sdd | egrep num_device dmsetup remove bad_disk wipefs -a /dev/sdf Signed-off-by: Anand Jain <anand.jain@oracle.com> Reported-by: Martin <m_btrfs@ml1.co.uk> [ adjust messages, s/disk/device/ ] Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: make use of btrfs_scratch_superblocks() in btrfs_rm_device()Anand Jain
With the previous patches now the btrfs_scratch_superblocks() is ready to be used in btrfs_rm_device() so use it. Signed-off-by: Anand Jain <anand.jain@oracle.com> [ use GFP_KERNEL ] Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: enhance btrfs_find_device_by_user_input() to check device pathAnand Jain
The operation of device replace and device delete follows same steps upto some depth with in btrfs kernel, however they don't share codes. This enhancement will help replace and delete to share codes. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: make use of btrfs_find_device_by_user_input()Anand Jain
btrfs_rm_device() has a section of the code which can be replaced btrfs_find_device_by_user_input() Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: create helper btrfs_find_device_by_user_input()Anand Jain
The patch renames btrfs_dev_replace_find_srcdev() to btrfs_find_device_by_user_input() and moves it to volumes.c, so that delete device can use it. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: clean up and optimize __check_raid_min_device()Anand Jain
__check_raid_min_device() which was pealed from btrfs_rm_device() maintianed its original code to show the block move. This patch cleans up __check_raid_min_device(). Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: create helper function __check_raid_min_devices()Anand Jain
move a section of btrfs_rm_device() code to check for min number of the devices into the function __check_raid_min_devices() Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: create a helper function to read the disk superAnand Jain
A part of code from btrfs_scan_one_device() is moved to a new function btrfs_read_disk_super(), so that former function looks cleaner. (In this process it also moves the code which ensures null terminating label). So this creates easy opportunity to merge various duplicate codes on read disk super. Earlier attempt to merge duplicate codes highlighted that there were some issues for which there are duplicate codes (to read disk super), however it was not clear what was the issue. So until we figure that out, its better to keep them in a separate functions. Signed-off-by: Anand Jain <anand.jain@oracle.com> [ use GFP_KERNEL, PAGE_CACHE_ removal related fixups ] Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28Btrfs: do not create empty block group if we have allocated dataLiu Bo
Now we force to create empty block group to keep data profile alive, however, in the below example, we eventually get an empty block group while we're trying to get more space for other types (metadata/system), - Before, block group "A": size=2G, used=1.2G block group "B": size=2G, used=512M - After "btrfs balance start -dusage=50 mount_point", block group "A": size=2G, used=(1.2+0.5)G block group "C": size=2G, used=0 Since there is no data in block group C, it won't be deleted automatically and we have to get the unused 2G until the next mount. Balance itself just moves data and doesn't remove data, so it's safe to not create such a empty block group if we already have data allocated in other block groups. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28Btrfs: __btrfs_buffered_write: Pass valid file offset when releasing ↵Chandan Rajendra
delalloc space The delalloc reserved space is calculated in terms of number of bytes used by an integral number of blocks. This is done by rounding down the value of 'pos' to the nearest multiple of sectorsize. The file offset value held by 'pos' variable may not be aligned to sectorsize and hence when passing it as an argument to btrfs_delalloc_release_space(), we may end up releasing larger delalloc space than we originally had reserved. Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28Btrfs: cleanup error handling in extent_write_cached_pagesLiu Bo
Now that we bail out immediately if ->writepage() returns an error, we don't need an extra error to retain the error code. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28Btrfs: make mapping->writeback_index point to the last written pageLiu Bo
If sequential writer is writing in the middle of the page and it just redirties the last written page by continuing from it. In the above case this can end up with seeking back to that firstly redirtied page after writing all the pages at the end of file because btrfs updates mapping->writeback_index to 1 past the current one. For non-cow filesystems, the cost is only about extra seek, while for cow filesystems such as btrfs, it means unnecessary fragments. To avoid it, we just need to continue writeback from the last written page. This also updates btrfs to behave like what write_cache_pages() does, ie, bail out immediately if there is an error in writepage(). <Ref: https://www.spinics.net/lists/linux-btrfs/msg52628.html> Reported-by: Holger Hoffstätte <holger.hoffstaette@googlemail.com> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: bugfix: handle FS_IOC32_{GETFLAGS,SETFLAGS,GETVERSION} in btrfs_ioctlLuke Dashjr
32-bit ioctl uses these rather than the regular FS_IOC_* versions. They can be handled in btrfs using the same code. Without this, 32-bit {ch,ls}attr fail. Signed-off-by: Luke Dashjr <luke-jr+git@utopios.org> Cc: stable@vger.kernel.org Reviewed-by: Josef Bacik <jbacik@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: fix typos in commentsLuis de Bethencourt
Correct a typo in the chunk_mutex name to make it grepable. Since it is better to fix several typos at once, fixing the 2 more in the same file. Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28Btrfs: Refactor btrfs_lock_cluster() to kill compiler warningGeert Uytterhoeven
fs/btrfs/extent-tree.c: In function ‘btrfs_lock_cluster’: fs/btrfs/extent-tree.c:6399: warning: ‘used_bg’ may be used uninitialized in this function - Replace "again: ... goto again;" by standard C "while (1) { ... }", - Move block not processed during the first iteration of the loop to the end of the loop, which allows to kill the "locked" variable, Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-and-Tested-by: Miao Xie <miaox@cn.fujitsu.com> [ the compilation warning has been fixed by other patch, now we want to clean up the function ] Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: remove save_error_info()Anand Jain
Actually save_error_info() sets the FS state to error and nothing else. Further the word save doesn't induce caffeine when compared to the word set in what actually it does. So to make it better understandable move save_error_info() code to its only consumer itself. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: Simplify conditions about compress while mapping btrfs flags to inode ↵Satoru Takeuchi
flags Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: move error handling code together in ctree.hAnand Jain
Looks like we added the incompatible defines in between the error handling defines in the file ctree.h. Now group them back. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: remove unused function btrfs_assert()Anand Jain
Apparently looks like ASSERT does the same intended job, as intended btrfs_assert(). Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-28btrfs: rename btrfs_std_error to btrfs_handle_fs_errorAnand Jain
btrfs_std_error() handles errors, puts FS into readonly mode (as of now). So its good idea to rename it to btrfs_handle_fs_error(). Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> [ edit changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
2016-04-10xattr_handler: pass dentry and inode as separate arguments of ->get()Al Viro
... and do not assume they are already attached to each other Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-04-10don't bother with ->d_inode->i_sb - it's always equal to ->d_sbAl Viro
... and neither can ever be NULL Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-04-09Merge branch 'for-linus-4.6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "These are bug fixes, including a really old fsync bug, and a few trace points to help us track down problems in the quota code" * 'for-linus-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: fix file/data loss caused by fsync after rename and new inode btrfs: Reset IO error counters before start of device replacing btrfs: Add qgroup tracing Btrfs: don't use src fd for printk btrfs: fallback to vmalloc in btrfs_compare_tree btrfs: handle non-fatal errors in btrfs_qgroup_inherit() btrfs: Output more info for enospc_debug mount option Btrfs: fix invalid reference in replace_path Btrfs: Improve FL_KEEP_SIZE handling in fallocate
2016-04-07Merge tag 'ext4_for_linus_stable' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 bugfixes from Ted Ts'o: "These changes contains a fix for overlayfs interacting with some (badly behaved) dentry code in various file systems. These have been reviewed by Al and the respective file system mtinainers and are going through the ext4 tree for convenience. This also has a few ext4 encryption bug fixes that were discovered in Android testing (yes, we will need to get these sync'ed up with the fs/crypto code; I'll take care of that). It also has some bug fixes and a change to ignore the legacy quota options to allow for xfstests regression testing of ext4's internal quota feature and to be more consistent with how xfs handles this case" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: ignore quota mount options if the quota feature is enabled ext4 crypto: fix some error handling ext4: avoid calling dquot_get_next_id() if quota is not enabled ext4: retry block allocation for failed DIO and DAX writes ext4: add lockdep annotations for i_data_sem ext4: allow readdir()'s of large empty directories to be interrupted btrfs: fix crash/invalid memory access on fsync when using overlayfs ext4 crypto: use dget_parent() in ext4_d_revalidate() ext4: use file_dentry() ext4: use dget_parent() in ext4_file_open() nfs: use file_dentry() fs: add file_dentry() ext4 crypto: don't let data integrity writebacks fail with ENOMEM ext4: check if in-inode xattr is corrupted in ext4_expand_extra_isize_ea()
2016-04-06Btrfs: fix file/data loss caused by fsync after rename and new inodeFilipe Manana
If we rename an inode A (be it a file or a directory), create a new inode B with the old name of inode A and under the same parent directory, fsync inode B and then power fail, at log tree replay time we end up removing inode A completely. If inode A is a directory then all its files are gone too. Example scenarios where this happens: This is reproducible with the following steps, taken from a couple of test cases written for fstests which are going to be submitted upstream soon: # Scenario 1 mkfs.btrfs -f /dev/sdc mount /dev/sdc /mnt mkdir -p /mnt/a/x echo "hello" > /mnt/a/x/foo echo "world" > /mnt/a/x/bar sync mv /mnt/a/x /mnt/a/y mkdir /mnt/a/x xfs_io -c fsync /mnt/a/x <power failure happens> The next time the fs is mounted, log tree replay happens and the directory "y" does not exist nor do the files "foo" and "bar" exist anywhere (neither in "y" nor in "x", nor the root nor anywhere). # Scenario 2 mkfs.btrfs -f /dev/sdc mount /dev/sdc /mnt mkdir /mnt/a echo "hello" > /mnt/a/foo sync mv /mnt/a/foo /mnt/a/bar echo "world" > /mnt/a/foo xfs_io -c fsync /mnt/a/foo <power failure happens> The next time the fs is mounted, log tree replay happens and the file "bar" does not exists anymore. A file with the name "foo" exists and it matches the second file we created. Another related problem that does not involve file/data loss is when a new inode is created with the name of a deleted snapshot and we fsync it: mkfs.btrfs -f /dev/sdc mount /dev/sdc /mnt mkdir /mnt/testdir btrfs subvolume snapshot /mnt /mnt/testdir/snap btrfs subvolume delete /mnt/testdir/snap rmdir /mnt/testdir mkdir /mnt/testdir xfs_io -c fsync /mnt/testdir # or fsync some file inside /mnt/testdir <power failure> The next time the fs is mounted the log replay procedure fails because it attempts to delete the snapshot entry (which has dir item key type of BTRFS_ROOT_ITEM_KEY) as if it were a regular (non-root) entry, resulting in the following error that causes mount to fail: [52174.510532] BTRFS info (device dm-0): failed to delete reference to snap, inode 257 parent 257 [52174.512570] ------------[ cut here ]------------ [52174.513278] WARNING: CPU: 12 PID: 28024 at fs/btrfs/inode.c:3986 __btrfs_unlink_inode+0x178/0x351 [btrfs]() [52174.514681] BTRFS: Transaction aborted (error -2) [52174.515630] Modules linked in: btrfs dm_flakey dm_mod overlay crc32c_generic ppdev xor raid6_pq acpi_cpufreq parport_pc tpm_tis sg parport tpm evdev i2c_piix4 proc [52174.521568] CPU: 12 PID: 28024 Comm: mount Tainted: G W 4.5.0-rc6-btrfs-next-27+ #1 [52174.522805] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS by qemu-project.org 04/01/2014 [52174.524053] 0000000000000000 ffff8801df2a7710 ffffffff81264e93 ffff8801df2a7758 [52174.524053] 0000000000000009 ffff8801df2a7748 ffffffff81051618 ffffffffa03591cd [52174.524053] 00000000fffffffe ffff88015e6e5000 ffff88016dbc3c88 ffff88016dbc3c88 [52174.524053] Call Trace: [52174.524053] [<ffffffff81264e93>] dump_stack+0x67/0x90 [52174.524053] [<ffffffff81051618>] warn_slowpath_common+0x99/0xb2 [52174.524053] [<ffffffffa03591cd>] ? __btrfs_unlink_inode+0x178/0x351 [btrfs] [52174.524053] [<ffffffff81051679>] warn_slowpath_fmt+0x48/0x50 [52174.524053] [<ffffffffa03591cd>] __btrfs_unlink_inode+0x178/0x351 [btrfs] [52174.524053] [<ffffffff8118f5e9>] ? iput+0xb0/0x284 [52174.524053] [<ffffffffa0359fe8>] btrfs_unlink_inode+0x1c/0x3d [btrfs] [52174.524053] [<ffffffffa038631e>] check_item_in_log+0x1fe/0x29b [btrfs] [52174.524053] [<ffffffffa0386522>] replay_dir_deletes+0x167/0x1cf [btrfs] [52174.524053] [<ffffffffa038739e>] fixup_inode_link_count+0x289/0x2aa [btrfs] [52174.524053] [<ffffffffa038748a>] fixup_inode_link_counts+0xcb/0x105 [btrfs] [52174.524053] [<ffffffffa038a5ec>] btrfs_recover_log_trees+0x258/0x32c [btrfs] [52174.524053] [<ffffffffa03885b2>] ? replay_one_extent+0x511/0x511 [btrfs] [52174.524053] [<ffffffffa034f288>] open_ctree+0x1dd4/0x21b9 [btrfs] [52174.524053] [<ffffffffa032b753>] btrfs_mount+0x97e/0xaed [btrfs] [52174.524053] [<ffffffff8108e1b7>] ? trace_hardirqs_on+0xd/0xf [52174.524053] [<ffffffff8117bafa>] mount_fs+0x67/0x131 [52174.524053] [<ffffffff81193003>] vfs_kern_mount+0x6c/0xde [52174.524053] [<ffffffffa032af81>] btrfs_mount+0x1ac/0xaed [btrfs] [52174.524053] [<ffffffff8108e1b7>] ? trace_hardirqs_on+0xd/0xf [52174.524053] [<ffffffff8108c262>] ? lockdep_init_map+0xb9/0x1b3 [52174.524053] [<ffffffff8117bafa>] mount_fs+0x67/0x131 [52174.524053] [<ffffffff81193003>] vfs_kern_mount+0x6c/0xde [52174.524053] [<ffffffff8119590f>] do_mount+0x8a6/0x9e8 [52174.524053] [<ffffffff811358dd>] ? strndup_user+0x3f/0x59 [52174.524053] [<ffffffff81195c65>] SyS_mount+0x77/0x9f [52174.524053] [<ffffffff814935d7>] entry_SYSCALL_64_fastpath+0x12/0x6b [52174.561288] ---[ end trace 6b53049efb1a3ea6 ]--- Fix this by forcing a transaction commit when such cases happen. This means we check in the commit root of the subvolume tree if there was any other inode with the same reference when the inode we are fsync'ing is a new inode (created in the current transaction). Test cases for fstests, covering all the scenarios given above, were submitted upstream for fstests: * fstests: generic test for fsync after renaming directory https://patchwork.kernel.org/patch/8694281/ * fstests: generic test for fsync after renaming file https://patchwork.kernel.org/patch/8694301/ * fstests: add btrfs test for fsync after snapshot deletion https://patchwork.kernel.org/patch/8670671/ Cc: stable@vger.kernel.org Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>