summaryrefslogtreecommitdiff
path: root/fs/btrfs/volumes.c
AgeCommit message (Collapse)Author
2018-03-01btrfs: alloc_chunk: fix DUP stripe size handlingHans van Kranenburg
In case of using DUP, we search for enough unallocated disk space on a device to hold two stripes. The devices_info[ndevs-1].max_avail that holds the amount of unallocated space found is directly assigned to stripe_size, while it's actually twice the stripe size. Later on in the code, an unconditional division of stripe_size by dev_stripes corrects the value, but in the meantime there's a check to see if the stripe_size does not exceed max_chunk_size. Since during this check stripe_size is twice the amount as intended, the check will reduce the stripe_size to max_chunk_size if the actual correct to be used stripe_size is more than half the amount of max_chunk_size. The unconditional division later tries to correct stripe_size, but will actually make sure we can't allocate more than half the max_chunk_size. Fix this by moving the division by dev_stripes before the max chunk size check, so it always contains the right value, instead of putting a duct tape division in further on to get it fixed again. Since in all other cases than DUP, dev_stripes is 1, this change only affects DUP. Other attempts in the past were made to fix this: * 37db63a400 "Btrfs: fix max chunk size check in chunk allocator" tried to fix the same problem, but still resulted in part of the code acting on a wrongly doubled stripe_size value. * 86db25785a "Btrfs: fix max chunk size on raid5/6" unintentionally broke this fix again. The real problem was already introduced with the rest of the code in 73c5de0051. The user visible result however will be that the max chunk size for DUP will suddenly double, while it's actually acting according to the limits in the code again like it was 5 years ago. Reported-by: Naohiro Aota <naohiro.aota@wdc.com> Link: https://www.spinics.net/lists/linux-btrfs/msg69752.html Fixes: 73c5de0051 ("btrfs: quasi-round-robin for chunk allocation") Fixes: 86db25785a ("Btrfs: fix max chunk size on raid5/6") Signed-off-by: Hans van Kranenburg <hans.van.kranenburg@mendix.com> Reviewed-by: David Sterba <dsterba@suse.com> [ update comment ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-02-05btrfs: Fix use-after-free when cleaning up fs_devs with a single stale deviceNikolay Borisov
Commit 4fde46f0cc71 ("Btrfs: free the stale device") introduced btrfs_free_stale_device which iterates the device lists for all registered btrfs filesystems and deletes those devices which aren't mounted. In a btrfs_devices structure has only 1 device attached to it and it is unused then btrfs_free_stale_devices will proceed to also free the btrfs_fs_devices struct itself. Currently this leads to a use after free since list_for_each_entry will try to perform a check on the already freed memory to see if it has to terminate the loop. The fix is to use 'break' when we know we are freeing the current fs_devs. Fixes: 4fde46f0cc71 ("Btrfs: free the stale device") Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-29btrfs: drop devid as device_list_add() argAnand Jain
As struct btrfs_disk_super is being passed, so it can get devid the same way its parent does. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-29btrfs: get device pointer from device_list_add()Anand Jain
Instead of pointer to btrfs_fs_devices as an arg in device_list_add() better to get pointer to btrfs_device as return value, then we have both, pointer to btrfs_device and btrfs_fs_devices. btrfs_device is needed to handle reappearing missing device. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: set the total_devices in device_list_add()Anand Jain
There is no other parent for device_list_add() except for btrfs_scan_one_device(), which would set btrfs_fs_devices::total_devices if device_list_add is successful and this can be done with in device_list_add() itself. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: move pr_info into device_list_addAnand Jain
Commit 60999ca4b403 ("btrfs: make device scan less noisy") adds return value 1 to device_list_add(), so that parent function can call pr_info only when new device is added. Move the pr_info() part into device_list_add() so that this function can be kept simple. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: make btrfs_free_stale_devices() to match the pathAnand Jain
The btrfs_free_stale_devices() is updated to match for the given device path and delete it. (It searches for only unmounted list of devices.) Also drop the comment about different path being used for the same device, since now we will have cli to clean any device that's not a concern any more. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: rename btrfs_free_stale_devices() arg to skip_devAnand Jain
No functional changes. Rename btrfs_free_stale_devices() arg to skip_dev, so that it reflects what that arg for. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: make btrfs_free_stale_devices() argument optionalAnand Jain
This updates btrfs_free_stale_devices() helper function to delete all unmouted devices, when arg is NULL. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: make btrfs_free_stale_device() to iterate all stalesAnand Jain
Let the list iterator iterate further and find other stale devices and delete it. This is in preparation to add support for user land request-able stale devices cleanup. Also rename btrfs_free_stale_device() to btrfs_free_stale_devices(). Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: no need to check for btrfs_fs_devices::seedingAnand Jain
There is no need to check for btrfs_fs_devices::seeding when we have checked for btrfs_fs_devices::opened, because we can't sprout without its seed FS being opened. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: Remove unused readahead spinlockMatthew Wilcox
The reada_lock in struct btrfs_device was only initialised, and not actually used. That's good because there's another lock also called reada_lock in the btrfs_fs_info that was quite heavily used. Remove this one. Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: cleanup btrfs_free_stale_device() usageAnand Jain
We call btrfs_free_stale_device() only when we alloc a new struct btrfs_device (ret=1), so move it closer to where we alloc the new device. Also drop the comments. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22Btrfs: avoid losing data raid profile when deleting a deviceLiu Bo
We've avoided data losing raid profile when doing balance, but it turns out that deleting a device could also result in the same problem. Say we have 3 disks, and they're created with '-d raid1' profile. - We have chunk P (the only data chunk on the empty btrfs). - Suppose that chunk P's two raid1 copies reside in disk A and disk B. - Now, 'btrfs device remove disk B' btrfs_rm_device() -> btrfs_shrink_device() -> btrfs_relocate_chunk() #relocate any chunk on disk B to other places. - Chunk P will be removed and a new chunk will be created to hold those data, but as chunk P is the only one holding raid1 profile, after it goes away, the new chunk will be created as single profile which is our default profile. This fixes the problem by creating an empty data chunk before relocating the data chunk. Metadata/System chunk are supposed to have non-zero bytes all the time so their raid profile is preserved. Reported-by: James Alandt <James.Alandt@wdc.com> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: minor style cleanups in btrfs_scan_one_deviceAnand Jain
Assign ret = -EINVAL where it is actually required. Remove { } around single line if else code. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22Btrfs: make raid6 rebuild retry moreLiu Bo
There is a scenario that can end up with rebuild process failing to return good content, i.e. suppose that all disks can be read without problems and if the content that was read out doesn't match its checksum, currently for raid6 btrfs at most retries twice, - the 1st retry is to rebuild with all other stripes, it'll eventually be a raid5 xor rebuild, - if the 1st fails, the 2nd retry will deliberately fail parity p so that it will do raid6 style rebuild, however, the chances are that another non-parity stripe content also has something corrupted, so that the above retries are not able to return correct content, and users will think of this as data loss. More seriouly, if the loss happens on some important internal btree roots, it could refuse to mount. This extends btrfs to do more retries and each retry fails only one stripe. Since raid6 can tolerate 2 disk failures, if there is one more failure besides the failure on which we're recovering, this can always work. The worst case is to retry as many times as the number of raid6 disks, but given the fact that such a scenario is really rare in practice, it's still acceptable. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: factor btrfs_check_rw_degradable() to check given deviceAnand Jain
Update btrfs_check_rw_degradable() to check against the given device if its lost. We can use this function to know if the volume is going to be in degraded mode OR failed state, when the given device fails. Which is needed when we are handling the device failed state. A preparatory patch does not affect the flow as such. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Qu Wenruo <wqu@suse.com> [ enhance comment ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: Remove pair of bio_get/put in btrfs_schedule_bioNikolay Borisov
This code was added in 492bb6deee34 ("Btrfs: Hold a reference on bios during submit_bio, add some extra bio checks"). However, holding a reference on a bio is necessary only if it's going to be referenced after the submit_bio returns and the bio is completed. In this particular instance this is not the case so there is no need to hold an extra reference since we directly return. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: cleanup device states define BTRFS_DEV_STATE_REPLACE_TGTAnand Jain
Currently device state is being managed by each individual int variable such as struct btrfs_device::is_tgtdev_for_dev_replace. Instead of that declare btrfs_device::dev_state BTRFS_DEV_STATE_MISSING and use the bit operations. Signed-off-by: Anand Jain <anand.jain@oracle.com> [ whitespace adjustments ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: cleanup device states define BTRFS_DEV_STATE_MISSINGAnand Jain
Currently device state is being managed by each individual int variable such as struct btrfs_device::missing. Instead of that declare btrfs_device::dev_state BTRFS_DEV_STATE_MISSING and use the bit operations. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by : Nikolay Borisov <nborisov@suse.com> [ whitespace adjustments ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: cleanup device states define BTRFS_DEV_STATE_IN_FS_METADATAAnand Jain
Currently device state is being managed by each individual int variable such as struct btrfs_device::in_fs_metadata. Instead of that declare device state BTRFS_DEV_STATE_IN_FS_METADATA and use the bit operations. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> [ whitespace adjustments ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: cleanup device states define BTRFS_DEV_STATE_WRITEABLEAnand Jain
Currently device state is being managed by each individual int variable such as struct btrfs_device::writeable. Instead of that declare device state BTRFS_DEV_STATE_WRITEABLE and use the bit operations. Signed-off-by: Anand Jain <anand.jain@oracle.com> [ whitespace adjustments ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: drop btrfs_device::can_discard to query directlyAnand Jain
We can query the bdev directly when needed at btrfs_discard_extent() so drop btrfs_device::can_discard. Signed-off-by: Anand Jain <anand.jain@oracle.com> Suggested-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: factor __btrfs_open_devices() to create btrfs_open_one_device()Anand Jain
No functional changes, create btrfs_open_one_device() from __btrfs_open_devices(). This is a preparatory work to add dynamic device scan. Signed-off-by: Anand Jain <anand.jain@oracle.com> [ minor whitespace fixes ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: move check for device generation to the lastAnand Jain
No functional changes. This helps to move the entire section into a new function. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: set fs_devices->seed directlyAnand Jain
This is in preparation to move a section of code in __btrfs_open_devices() into a new function so that it can be reused. As we set seeding if any of the device is having SB flag BTRFS_SUPER_FLAG_SEEDING, so do it in the device list loop itself. No functional changes. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: simplify btrfs_close_bdevDavid Sterba
Split the conditions a bit. Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: document device lockingDavid Sterba
Overview of the main locks protecting various device-related structures. Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: simplify exit paths in btrfs_init_new_deviceDavid Sterba
Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: use free_device where opencodedDavid Sterba
Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: introduce free_device helperDavid Sterba
A helper to free a device and all it's dynamically allocated members, like the rcu_string name or flush_bio. This is going to replace all open coded places. Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: rename device free rcu helper to free_device_rcuDavid Sterba
Make it clear that it is an RCU helper, we want to use the name free_device for a wrapper freeing all device members. Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: rename btrfs_add_device to btrfs_add_dev_itemAnand Jain
Function btrfs_add_device() is adding the device item so rename to reflect that in the function. Similarly we have btrfs_rm_dev_item(). Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: move volume_mutex into the btrfs_rm_device()Anand Jain
A cleanup patch no functional change, we hold volume_mutex before calling btrfs_rm_device, so move it into the function itself. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22Btrfs: remove rcu_barrier in btrfs_close_devicesLiu Bo
It was introduced because btrfs used to do blkdev_put in a deferred work, now that btrfs has blkdev_put in place, this rcu_barrier can be removed. modprobe -r btrfs will do btrfs_cleanup_fs_uuids(), where it cleanup every %fs_devices on the list, but when we do btrfs_close_devices(), we have replaced the devices on the list with dummy ones which only have the same name and uuid, so modprobe -r btrfs will free those instead of what we were using, this change won't cause a problem for it. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> [ copied 2nd paragraph from mailinglist discussion ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: Fix memory barriers usage with device stats countersNikolay Borisov
Commit addc3fa74e5b ("Btrfs: Fix the problem that the dirty flag of dev stats is cleared") reworked the way device stats changes are tracked. A new atomic dev_stats_ccnt counter was introduced which is incremented every time any of the device stats counters are changed. This serves as a flag whether there are any pending stats changes. However, this patch only partially implemented the correct memory barriers necessary: - It only ordered the stores to the counters but not the reads e.g. btrfs_run_dev_stats - It completely omitted any comments documenting the intended design and how the memory barriers pair with each-other This patch provides the necessary comments as well as adds a missing smp_rmb in btrfs_run_dev_stats. Furthermore since dev_stats_cnt is only a snapshot at best there was no point in reading the counter twice - once in btrfs_dev_stats_dirty and then again when assigning stats_cnt. Just collapse both reads into 1. Fixes: addc3fa74e5b ("Btrfs: Fix the problem that the dirty flag of dev stats is cleared") Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22btrfs: clean up btrfs_dev_stat_inc usageAnand Jain
btrfs_end_bio() is using btrfs_dev_stat_inc() and then btrfs_dev_stat_print_on_error() separately instead use btrfs_dev_stat_inc_and_print() directly. As of now there isn't any bio in btrfs which is - a non-empty write and also the REQ_PREFLUSH flag is set. So in actual the condition if (bio->bi_opf & REQ_PREFLUSH) is never true in btrfs_end_bio(), and so there won't be any redundant error log by using btrfs_dev_stat_inc_and_print() separately one for write and another for flush. This consolidation will help to add the device critical error handles in the function btrfs_dev_stat_inc_and_print() and which can be renamed as needed. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22Btrfs: free btrfs_device in placeLiu Bo
It's pointless to defer it to a kthread helper as we're not under a special context. For reference, commit 1f78160ce1b1 ("Btrfs: using rcu lock in the reader side of devices list") introduced RCU freeing for device structures. Originally the blkdev_put was called from free_device and rcu_barrier had to be called. This is no longer required, bdev and our device structures are now freed separately. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> [ enhance changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-05Merge tag 'for-4.15-rc7-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "We have two more fixes for 4.15, both aimed for stable. The leak fix is obvious, the second patch fixes a bug revealed by the refcount API, when it behaves differently than previous atomic_t and reports refs going from 0 to 1 in one case" * tag 'for-4.15-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix refcount_t usage when deleting btrfs_delayed_nodes btrfs: Fix flush bio leak
2018-01-02btrfs: Fix flush bio leakNikolay Borisov
Commit e0ae99941423 ("btrfs: preallocate device flush bio") reworked the way the flush bio is allocated and used. Concretely it allocates the bio in __alloc_device and then re-uses it multiple times with a very simple endio routine that just calls complete() without consuming a reference. Allocated bios by default come with a ref count of 1, which is then consumed by the endio routine (or not, in which case they should be bio_put by the caller). The way the impleementation works now is that the flush bio has a refcount of 2 and we only ever bio_put it once, leaving it to hang indefinitely. Fix this by removing the extra bio_get in __alloc_device. Fixes: e0ae99941423 ("btrfs: preallocate device flush bio") Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-11-29Merge tag 'for-4.15-rc2-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "We've collected some fixes in since the pre-merge window freeze. There's technically only one regression fix for 4.15, but the rest seems important and candidates for stable. - fix missing flush bio puts in error cases (is serious, but rarely happens) - fix reporting stat::st_blocks for buffered append writes - fix space cache invalidation - fix out of bound memory access when setting zlib level - fix potential memory corruption when fsync fails in the middle - fix crash in integrity checker - incremetnal send fix, path mixup for certain unlink/rename combination - pass flags to writeback so compressed writes can be throttled properly - error handling fixes" * tag 'for-4.15-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: Btrfs: incremental send, fix wrong unlink path after renaming file btrfs: tree-checker: Fix false panic for sanity test Btrfs: fix list_add corruption and soft lockups in fsync btrfs: Fix wild memory access in compression level parser btrfs: fix deadlock when writing out space cache btrfs: clear space cache inode generation always Btrfs: fix reported number of inode blocks after buffered append writes Btrfs: move definition of the function btrfs_find_new_delalloc_bytes Btrfs: bail out gracefully rather than BUG_ON btrfs: dev_alloc_list is not protected by RCU, use normal list_del btrfs: add missing device::flush_bio puts btrfs: Fix transaction abort during failure in btrfs_rm_dev_item Btrfs: add write_flags for compression bio
2017-11-27Rename superblock flags (MS_xyz -> SB_xyz)Linus Torvalds
This is a pure automated search-and-replace of the internal kernel superblock flags. The s_flags are now called SB_*, with the names and the values for the moment mirroring the MS_* flags that they're equivalent to. Note how the MS_xyz flags are the ones passed to the mount system call, while the SB_xyz flags are what we then use in sb->s_flags. The script to do this was: # places to look in; re security/*: it generally should *not* be # touched (that stuff parses mount(2) arguments directly), but # there are two places where we really deal with superblock flags. FILES="drivers/mtd drivers/staging/lustre fs ipc mm \ include/linux/fs.h include/uapi/linux/bfs_fs.h \ security/apparmor/apparmorfs.c security/apparmor/include/lib.h" # the list of MS_... constants SYMS="RDONLY NOSUID NODEV NOEXEC SYNCHRONOUS REMOUNT MANDLOCK \ DIRSYNC NOATIME NODIRATIME BIND MOVE REC VERBOSE SILENT \ POSIXACL UNBINDABLE PRIVATE SLAVE SHARED RELATIME KERNMOUNT \ I_VERSION STRICTATIME LAZYTIME SUBMOUNT NOREMOTELOCK NOSEC BORN \ ACTIVE NOUSER" SED_PROG= for i in $SYMS; do SED_PROG="$SED_PROG -e s/MS_$i/SB_$i/g"; done # we want files that contain at least one of MS_..., # with fs/namespace.c and fs/pnode.c excluded. L=$(for i in $SYMS; do git grep -w -l MS_$i $FILES; done| sort|uniq|grep -v '^fs/namespace.c'|grep -v '^fs/pnode.c') for f in $L; do sed -i $f $SED_PROG; done Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-15btrfs: dev_alloc_list is not protected by RCU, use normal list_delDavid Sterba
The dev_alloc_list list could be protected by various mutexes, depending on the context. The list tracks devices that can take part of allocating new chunks, so the closest mutex is chunk_mutex. Adding a new device from inside the ADD_DEV ioctl will need device_list_mutex and registering a new device from the ioctl needs uuid_mutex. All mutexes naturally guarantee exclusivity against the same context. The device ownership can move between the contexts and the exclusivity is guaranteed by other means, eg. during the mount with the uuid_mutex. There's no RCU involved for dev_alloc_list. Signed-off-by: David Sterba <dsterba@suse.com>
2017-11-15btrfs: add missing device::flush_bio putsDavid Sterba
This fixes potential bio leaks, in several error paths. Unfortunatelly the device structure freeing is opencoded in many places and I missed them when introducing the flush_bio. Most of the time, devices get freed through call_rcu(..., free_device), so it at least it's not that easy to hit the leak, but it's still possible through the path that frees stale devices. Fixes: e0ae99941423 ("btrfs: preallocate device flush bio") Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-11-15btrfs: Fix transaction abort during failure in btrfs_rm_dev_itemNikolay Borisov
btrfs_rm_dev_item calls several function under an active transaction, however it fails to abort it if an error happens. Fix this by adding explicit btrfs_abort_transaction/btrfs_end_transaction calls. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-11-01btrfs: remove BUG_ON in btrfs_rm_dev_replace_free_srcdev()Anand Jain
That was only an extra check to tackle a few bugs around this area, now its safe to remove it. Replace it by an ASSERT. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-10-30btrfs: fix false EIO for missing deviceAnand Jain
When one of the device is missing, bbio_error() takes care of setting the error status. And if its only IO that is pending in that stripe, it fails to check the status of the other IO at %bbio_error before setting the error %bi_status for the %orig_bio. Fix this by checking if %bbio->error has exceeded the %bbio->max_errors. Reproducer as below fdatasync error is seen intermittently. mount -o degraded /dev/sdc /btrfs dd status=none if=/dev/zero of=$(mktemp /btrfs/XXX) bs=4096 count=1 conv=fdatasync dd: fdatasync failed for ‘/btrfs/LSe’: Input/output error The reason for the intermittences of the problem is because the following conditions have to be met, which depends on timing: In btrfs_map_bio() - the RAID1 the missing device has to be at %dev_nr = 1 In bbio_error() . before bbio_error() is called the bio of the not-missing device at %dev_nr = 0 must be completed so that the below condition is true if (atomic_dec_and_test(&bbio->stripes_pending)) { Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-10-30btrfs: use need_full_stripe() in __btrfs_map_block()Anand Jain
A cleanup patch, use need_full_stripe() to replace the open code. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Qu Wenruo <quwenruo.btrfs@gmx.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-10-30btrfs: use BLK_STS defines where neededAnand Jain
At few places we could use BLK_STS_OK and BLK_STS_NOSUPP. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Satoru Taekeuchi <satoru.takeuchi@gmail.com> Reviewed-by: David Sterba <dsterba@suse.com> [ dropped first hunk btrfs_endio_direct_read ] Signed-off-by: David Sterba <dsterba@suse.com>
2017-10-30btrfs: fix use of error or warning for missing deviceAnand Jain
When device is missing without the -o degraded option then its an error so report it as an error instead of a warning. And when -o degraded option is provided, log the missing device as warning. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> [ switch error to bool ] Signed-off-by: David Sterba <dsterba@suse.com>