path: root/fs/btrfs/extent_io.c
2017-07-14  Merge branch 'for-4.13-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux  (Linus Torvalds)
Pull btrfs fixes from David Sterba:
 "We've identified and fixed a silent corruption (introduced by code in the first pull), a fixup after the blk_status_t merge and two fixes to incremental send that Filipe has been hunting for some time"

* 'for-4.13-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  Btrfs: fix unexpected return value of bio_readpage_error
  btrfs: btrfs_create_repair_bio never fails, skip error handling
  btrfs: cloned bios must not be iterated by bio_for_each_segment_all
  Btrfs: fix write corruption due to bio cloning on raid5/6
  Btrfs: incremental send, fix invalid memory access
  Btrfs: incremental send, fix invalid path for link commands
2017-07-14  Btrfs: fix unexpected return value of bio_readpage_error  (Liu Bo)
With the blk_status_t conversion (now present in master), bio_readpage_error() may return 1 because ->submit_bio_hook() may not set %ret if it runs without problems. Fix that unexpected return value by changing btrfs_check_repairable() to return a bool instead of updating %ret; the patch is applicable to both codebases, with and without blk_status_t. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
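With a bool return, the caller of btrfs_check_repairable() picks the error code itself rather than relying on the helper to update %ret. A minimal sketch of that calling pattern, assuming the pre-existing argument lists for btrfs_check_repairable() and free_io_failure() (the real code in bio_readpage_error() may differ in detail):

	/* illustrative only: if the block is not repairable, fail with a definite errno */
	if (!btrfs_check_repairable(inode, failed_bio, failrec, failed_mirror)) {
		free_io_failure(failure_tree, tree, failrec);
		return -EIO;
	}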
2017-07-14  btrfs: btrfs_create_repair_bio never fails, skip error handling  (David Sterba)
As the function uses the non-failing bio allocation, we can remove error handling from the callers as well. Signed-off-by: David Sterba <dsterba@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-07-14  btrfs: cloned bios must not be iterated by bio_for_each_segment_all  (David Sterba)
We've started using cloned bios more in 4.13, and there are some specifics regarding their iteration. Filipe found [1] that the raid56 code iterated a cloned bio using bio_for_each_segment_all, which is incorrect: cloned bios have a wrong bi_vcnt, and this could lead to silent corruptions. This patch adds assertions to all remaining bio_for_each_segment_all cases. [1] https://patchwork.kernel.org/patch/9838535/ Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
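The guard described above amounts to asserting that a bio walked with bio_for_each_segment_all is not a clone. A minimal sketch, assuming btrfs's ASSERT macro and the standard BIO_CLONED bio flag (the actual assertion sites are the remaining bio_for_each_segment_all users):

	struct bio_vec *bvec;
	int i;

	/* a cloned bio shares the parent's bi_io_vec and its bi_vcnt is not reliable */
	ASSERT(!bio_flagged(bio, BIO_CLONED));
	bio_for_each_segment_all(bvec, bio, i) {
		/* safe: this bio owns its bio_vec array */
	}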
2017-07-06  Merge branch 'for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu  (Linus Torvalds)
Pull percpu updates from Tejun Heo:
 "These are the percpu changes for the v4.13-rc1 merge window. There are a couple visibility related changes - tracepoints and allocator stats through debugfs, along with __ro_after_init markings and a cosmetic rename in percpu_counter.

  Please note that the simple O(#elements_in_the_chunk) area allocator used by percpu allocator is again showing scalability issues, primarily with bpf allocating and freeing large number of counters. Dennis is working on the replacement allocator and the percpu allocator will be seeing increased churns in the coming cycles"

* 'for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu: fix static checker warnings in pcpu_destroy_chunk
  percpu: fix early calls for spinlock in pcpu_stats
  percpu: resolve err may not be initialized in pcpu_alloc
  percpu_counter: Rename __percpu_counter_add to percpu_counter_add_batch
  percpu: add tracepoint support for percpu memory
  percpu: expose statistics about percpu memory via debugfs
  percpu: migrate percpu data structures to internal header
  percpu: add missing lockdep_assert_held to func pcpu_free_area
  mark most percpu globals as __ro_after_init
2017-07-05  Merge branch 'for-4.13-part1' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux  (Linus Torvalds)
Pull btrfs updates from David Sterba:
 "The core updates improve error handling (mostly related to bios), with the usual incremental work on the GFP_NOFS (mis)use removal, refactoring or cleanups. Except the two top patches, all have been in for-next for an extensive amount of time.

  User visible changes:
   - statx support
   - quota override tunable
   - improved compression thresholds
   - obsoleted mount option alloc_start

  Core updates:
   - bio-related updates:
      - faster bio cloning
      - no allocation failures
      - preallocated flush bios
   - more kvzalloc use, memalloc_nofs protections, GFP_NOFS updates
   - prep work for btree_inode removal
   - dir-item validation
   - qgroup fixes and updates
   - cleanups:
      - removed unused struct members, unused code, refactoring
      - argument refactoring (fs_info/root, caller -> callee sink)
   - SEARCH_TREE ioctl docs"

* 'for-4.13-part1' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (115 commits)
  btrfs: Remove false alert when fiemap range is smaller than on-disk extent
  btrfs: Don't clear SGID when inheriting ACLs
  btrfs: fix integer overflow in calc_reclaim_items_nr
  btrfs: scrub: fix target device intialization while setting up scrub context
  btrfs: qgroup: Fix qgroup reserved space underflow by only freeing reserved ranges
  btrfs: qgroup: Introduce extent changeset for qgroup reserve functions
  btrfs: qgroup: Fix qgroup reserved space underflow caused by buffered write and quotas being enabled
  btrfs: qgroup: Return actually freed bytes for qgroup release or free data
  btrfs: qgroup: Cleanup btrfs_qgroup_prepare_account_extents function
  btrfs: qgroup: Add quick exit for non-fs extents
  Btrfs: rework delayed ref total_bytes_pinned accounting
  Btrfs: return old and new total ref mods when adding delayed refs
  Btrfs: always account pinned bytes when dropping a tree block ref
  Btrfs: update total_bytes_pinned when pinning down extents
  Btrfs: make BUG_ON() in add_pinned_bytes() an ASSERT()
  Btrfs: make add_pinned_bytes() take an s64 num_bytes instead of u64
  btrfs: fix validation of XATTR_ITEM dir items
  btrfs: Verify dir_item in iterate_object_props
  btrfs: Check name_len before in btrfs_del_root_ref
  btrfs: Check name_len before reading btrfs_get_name
  ...
2017-06-29  btrfs: Remove false alert when fiemap range is smaller than on-disk extent  (Qu Wenruo)
Commit 4751832da990 ("btrfs: fiemap: Cache and merge fiemap extent before submit it to user") introduced a warning to catch an unemitted cached fiemap extent. However, the warning doesn't take the following case into consideration:

	0                       4K                      8K
	|<---- fiemap range --->|
	|<----------- On-disk extent ------------------>|

In this case, the whole 0~8K range is cached, and since it's larger than the fiemap range, it breaks the fiemap extent emit loop. This leaves the fiemap extent cached but not emitted, which is then caught by the final fiemap extent sanity check, causing a kernel warning. This patch removes the kernel warning and renames the sanity check to emit_last_fiemap_cache(), since it's possible and valid to have a cached fiemap extent. Reported-by: David Sterba <dsterba@suse.cz> Reported-by: Adam Borowski <kilobyte@angband.pl> Fixes: 4751832da990 ("btrfs: fiemap: Cache and merge fiemap extent ...") Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-06-27  btrfs: add support for passing in write hints for buffered writes  (Jens Axboe)
Reviewed-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Chris Mason <clm@fb.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-20  percpu_counter: Rename __percpu_counter_add to percpu_counter_add_batch  (Nikolay Borisov)
Currently, percpu_counter_add is a wrapper around __percpu_counter_add which is preempt safe due to explicit calls to preempt_disable. Given how __ prefix is used in percpu related interfaces, the naming unfortunately creates the false sense that __percpu_counter_add is less safe than percpu_counter_add. In terms of context-safety, they're equivalent. The only difference is that the __ version takes a batch parameter. Make this a bit more explicit by just renaming __percpu_counter_add to percpu_counter_add_batch. This patch doesn't cause any functional changes. tj: Minor updates to patch description for clarity. Cosmetic indentation updates. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Chris Mason <clm@fb.com> Cc: Josef Bacik <jbacik@fb.com> Cc: David Sterba <dsterba@suse.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Jan Kara <jack@suse.com> Cc: Jens Axboe <axboe@fb.com> Cc: linux-mm@kvack.org Cc: "David S. Miller" <davem@davemloft.net>
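In practice the rename only changes the spelling at call sites that pass an explicit batch; the batch-less wrapper keeps its name. A small sketch, with a hypothetical counter and batch value chosen for illustration:

	#include <linux/gfp.h>
	#include <linux/percpu_counter.h>

	static struct percpu_counter dirty_bytes;	/* hypothetical counter */

	static int counter_setup(void)
	{
		return percpu_counter_init(&dirty_bytes, 0, GFP_KERNEL);
	}

	static void account_dirty(s64 delta)
	{
		/* old spelling: __percpu_counter_add(&dirty_bytes, delta, 128); */
		percpu_counter_add_batch(&dirty_bytes, delta, 128);

		/* the plain wrapper is unchanged and just as context-safe */
		percpu_counter_add(&dirty_bytes, delta);
	}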
2017-06-19  btrfs: sink gfp parameter to btrfs_io_bio_alloc  (David Sterba)
We can hardcode GFP_NOFS in btrfs_io_bio_alloc, although it means we change it back from GFP_KERNEL in scrub. I'd rather save a few stack bytes by not passing the gfp flags in the remaining, more important, contexts, and the bio allocating API now looks more consistent. Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-06-19  btrfs: add helper to initialize the non-bio part of btrfs_io_bio  (David Sterba)
We use btrfs_bioset for bios and ask bio_alloc_bioset to allocate the entire size of btrfs_io_bio. The member 'bio' is initialized, but the bytes from 0 to the offset of 'bio' are left uninitialized. Although we initialize some of the members in our helpers, we should initialize the whole structure. Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-06-19  btrfs: pass bytes to btrfs_bio_alloc  (David Sterba)
Most callers of btrfs_bio_alloc convert from bytes to sectors. Hide that in the helper and simplify the logic in the callers. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
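Together with the parameter-removal patch below, this leaves the allocator with a byte-based interface; a rough sketch of the resulting call (the exact final signature is an assumption of this note, not quoted from the patch):

	struct bio *bio;

	/* callers pass the byte offset; the helper shifts it to sectors internally,
	 * roughly: bio->bi_iter.bi_sector = disk_bytenr >> 9 */
	bio = btrfs_bio_alloc(bdev, disk_bytenr);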
2017-06-19  btrfs: remove redundant parameters from btrfs_bio_alloc  (David Sterba)
All callers pass gfp_flags=GFP_NOFS and nr_vecs=BIO_MAX_PAGES. submit_extent_page adds __GFP_HIGH that does not make a difference in our case as it allows access to memory reserves but otherwise does not change the constraints. Signed-off-by: David Sterba <dsterba@suse.com>
2017-06-19  btrfs: sink gfp parameter to btrfs_bio_clone  (David Sterba)
All callers pass GFP_NOFS. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-06-19  btrfs: btrfs_io_bio_alloc never fails, skip error handling  (David Sterba)
Update the direct callers of btrfs_io_bio_alloc that do error handling, which we can now remove. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-06-19  btrfs: btrfs_bio_alloc never fails, skip error handling  (David Sterba)
Update the direct callers of btrfs_bio_alloc that do error handling, which we can now remove. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-06-19  btrfs: bioset allocations will never fail, adapt our helpers  (David Sterba)
Christoph pointed out that bio allocations backed by a bioset will never fail. As we always use a bioset for all bio allocations, we can skip the error handling. This patch adjusts our low-level helpers, the cascaded changes to all callers will come next. CC: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-06-19  btrfs: rename btrfs_leaf_data to BTRFS_LEAF_DATA_OFFSET  (Nikolay Borisov)
Commit 5f39d397dfbe ("Btrfs: Create extent_buffer interface for large blocksizes") refactored btrfs_leaf_data function to take extent_buffer rather than struct btrfs_leaf. However, as it turns out the parameter being passed is never used. Furthermore this function no longer returns the leaf data but rather the offset to it. So rename the function to BTRFS_LEAF_DATA_OFFSET to make it consistent with other BTRFS_LEAF_* helpers and turn it into a macro. Signed-off-by: Nikolay Borisov <nborisov@suse.com> [ removed () from the macro ] Signed-off-by: David Sterba <dsterba@suse.com>
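If the macro simply resolves to the offset of the item array inside struct btrfs_leaf, a sketch of the definition and a typical use looks like this (a guess at the shape, not the literal header):

	/* byte offset where leaf items begin, i.e. right after the header */
	#define BTRFS_LEAF_DATA_OFFSET	offsetof(struct btrfs_leaf, items)

	/* typical use: read an item's data relative to the start of the leaf */
	read_extent_buffer(leaf, dest,
			   BTRFS_LEAF_DATA_OFFSET + btrfs_item_offset_nr(leaf, slot),
			   btrfs_item_size_nr(leaf, slot));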
2017-06-19  Btrfs: hardcode GFP_NOFS for btrfs_bio_clone_partial  (Liu Bo)
We only pass GFP_NOFS to btrfs_bio_clone_partial, so let's hardcode it. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-06-19  Btrfs: change how we iterate bios in endio  (Liu Bo)
Since dio submit now uses bio_clone_fast, the submitted bio may not have a reliable bi_vcnt. For the bio vector iterations in the checksum-related functions, bio->bi_iter is not modified yet, so it's safe to use bio_for_each_segment. For the bio vector iterations in dio read's endio, we now save a copy of the bvec_iter in struct btrfs_io_bio when cloning bios and use the helper __bio_for_each_segment with the saved bvec_iter to access each bvec. Also, for dio reads that don't get split, we need to save a copy of the bio iterator in btrfs_bio_clone so that __bio_for_each_segment can access each bvec in dio read's endio. Note that this doesn't affect other calls of btrfs_bio_clone() because they don't need to use this iterator. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
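Conceptually, the clone helper snapshots bi_iter and the dio read endio walks that snapshot instead of trusting bi_vcnt. A minimal sketch using the existing bio iterator helpers (the 'iter' member of btrfs_io_bio follows the commit text; surrounding code is illustrative):

	/* at clone time: remember the clone's view of its data */
	btrfs_io_bio(new_bio)->iter = new_bio->bi_iter;

	/* in dio read endio: walk the saved snapshot, never a clone's bi_vcnt */
	{
		struct bio_vec bvec;
		struct bvec_iter iter;

		__bio_for_each_segment(bvec, bio, iter, btrfs_io_bio(bio)->iter) {
			/* verify the csum of bvec.bv_len bytes at bvec.bv_page + bvec.bv_offset */
		}
	}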
2017-06-19  Btrfs: new helper btrfs_bio_clone_partial  (Liu Bo)
This adds a new helper, btrfs_bio_clone_partial, which allocates a cloned bio that only owns a part of the original bio's data. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-06-19  Btrfs: use bio_clone_fast to clone our bio  (Liu Bo)
For raid1 and raid10, we clone the original bio to the bios which are then sent to different disks. Right now we use bio_clone_bioset to create a clone bio, iterating bi_io_vec to initialize it. This changes it to use bio_clone_fast(), which creates a clone bio but only copies the bi_io_vec pointer instead of iterating bi_io_vec. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
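The switch is essentially a one-call change; a sketch, with error handling omitted since the bioset-backed allocation does not fail (per the patches above):

	struct bio *new;

	/* old: new = bio_clone_bioset(bio, GFP_NOFS, btrfs_bioset);  (walks bi_io_vec) */
	new = bio_clone_fast(bio, GFP_NOFS, btrfs_bioset);

	/* the clone shares the parent's bi_io_vec; only the iterator state is copied */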
2017-06-19  Btrfs: don't pass the inode through clean_io_failure  (Josef Bacik)
Instead pass around the failure tree and the io tree. Signed-off-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-06-19  btrfs: remove inode argument from repair_io_failure  (Josef Bacik)
Once we remove the btree_inode, we won't have an inode to pass anymore, so just pass the fs_info directly and the inum, since we use that to print out the repair message. Signed-off-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-06-19  Btrfs: replace tree->mapping with tree->private_data  (Josef Bacik)
For extent_io trees we have carried the address_mapping of the inode around in the io tree in order to pull the inode back out for calling into various tree ops hooks. This works fine when everything that has an extent_io_tree has an inode. But we are going to remove the btree_inode, so we need to change this. Instead just have a generic void * for private data that we can initialize with, and have all the tree ops use that instead. This had a lot of cascading changes but should be relatively straightforward. Signed-off-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Reviewed-by: David Sterba <dsterba@suse.com> [ minor reordering of the callback prototypes ] Signed-off-by: David Sterba <dsterba@suse.com>
2017-06-18  blk: replace bioset_create_nobvec() with a flags arg to bioset_create()  (NeilBrown)
"flags" arguments are often seen as good API design as they allow easy extensibility. bioset_create_nobvec() is implemented internally as a variation in flags passed to __bioset_create(). To support future extension, make the internal structure part of the API. i.e. add a 'flags' argument to bioset_create() and discard bioset_create_nobvec(). Note that the bio_split allocations in drivers/md/raid* do not need the bvec mempool - they should have used bioset_create_nobvec(). Suggested-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-12  Merge tag 'v4.12-rc5' into for-4.13/block  (Jens Axboe)
We've already got a few conflicts, and upcoming work depends on some of the changes that have gone into mainline as regression fixes for this series. Pull in 4.12-rc5 to resolve these conflicts and make it easier for downstream trees to continue working on 4.13 changes. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-09  block: switch bios to blk_status_t  (Christoph Hellwig)
Replace bi_error with a new bi_status to allow for a clear conversion. Note that device mapper overloaded bi_error with a private value, which we'll have to keep around at least for now and thus propagate to a proper blk_status_t value. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
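For a filesystem like btrfs, the visible part of the conversion is that endio code reads bio->bi_status and translates errno values at the boundary. A sketch using the translation helpers from the same conversion (function names here are illustrative):

	#include <linux/bio.h>
	#include <linux/blk_types.h>
	#include <linux/printk.h>

	static void sample_endio(struct bio *bio)
	{
		/* bi_error (a negative errno) is gone; bi_status is a blk_status_t */
		if (bio->bi_status)
			pr_debug("IO failed: %d\n", blk_status_to_errno(bio->bi_status));
		bio_put(bio);
	}

	static void sample_fail_bio(struct bio *bio, int error)
	{
		bio->bi_status = errno_to_blk_status(error);	/* e.g. -EIO -> BLK_STS_IOERR */
		bio_endio(bio);
	}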
2017-05-16  btrfs: fix incorrect error return ret being passed to mapping_set_error  (Colin Ian King)
The return code ret should be set based on the error code passed into end_extent_writepage, not on ret itself. Thanks to Liu Bo for spotting this mistake in the original fix I submitted. Detected by CoverityScan, CID#1414312 ("Logically dead code") Fixes: 5dca6eea91653e ("Btrfs: mark mapping with error flag to report errors to userspace") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-05-16  btrfs: fiemap: Cache and merge fiemap extent before submit it to user  (Qu Wenruo)
[BUG]
Cycle mounting btrfs can cause fiemap to return different results. Like:

 # mount /dev/vdb5 /mnt/btrfs
 # dd if=/dev/zero bs=16K count=4 oflag=dsync of=/mnt/btrfs/file
 # xfs_io -c "fiemap -v" /mnt/btrfs/file
 /mnt/test/file:
 EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
   0: [0..127]:        25088..25215       128   0x1
 # umount /mnt/btrfs
 # mount /dev/vdb5 /mnt/btrfs
 # xfs_io -c "fiemap -v" /mnt/btrfs/file
 /mnt/test/file:
 EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
   0: [0..31]:         25088..25119        32   0x0
   1: [32..63]:        25120..25151        32   0x0
   2: [64..95]:        25152..25183        32   0x0
   3: [96..127]:       25184..25215        32   0x1

But after the above fiemap, we get the correct merged result if we call fiemap again:

 # xfs_io -c "fiemap -v" /mnt/btrfs/file
 /mnt/test/file:
 EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
   0: [0..127]:        25088..25215       128   0x1

[REASON]
Btrfs will try to merge extent maps when inserting a new extent map:

 btrfs_fiemap(start=0 len=(u64)-1)
 |- extent_fiemap(start=0 len=(u64)-1)
 |- get_extent_skip_holes(start=0 len=64k)
 |  |- btrfs_get_extent_fiemap(start=0 len=64k)
 |  |- btrfs_get_extent(start=0 len=64k)
 |  |  Found on-disk (ino, EXTENT_DATA, 0)
 |  |- add_extent_mapping()
 |  |- Return (em->start=0, len=16k)
 |
 |- fiemap_fill_next_extent(logic=0 phys=X len=16k)
 |
 |- get_extent_skip_holes(start=0 len=64k)
 |  |- btrfs_get_extent_fiemap(start=0 len=64k)
 |  |- btrfs_get_extent(start=16k len=48k)
 |  |  Found on-disk (ino, EXTENT_DATA, 16k)
 |  |- add_extent_mapping()
 |  |  |- try_merge_map()
 |  |     Merge with previous em start=0 len=16k
 |  |     resulting em start=0 len=32k
 |  |- Return (em->start=0, len=32K)    << Merged result
 |- Strip off the unrelated range (0~16K) of the returned em
 |- fiemap_fill_next_extent(logic=16K phys=X+16K len=16K)
    ^^^ Causing split fiemap extent.

And since the em is already merged in add_extent_mapping(), in the next fiemap() call we will get the merged result.

[FIX]
Here we introduce a new structure, fiemap_cache, which records the previous fiemap extent. We always try to merge the current fiemap extent with the cached one before calling fiemap_fill_next_extent(). Only when we fail to merge the current fiemap extent with the cached one do we call fiemap_fill_next_extent() to submit the cached one. By this method, we can merge all fiemap extents.

It could also be done in fs/ioctl.c, however the problem is that if fieinfo->fi_extents_max == 0, we have no space to cache the previous fiemap extent. So I chose to merge it in btrfs. Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
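A sketch of the caching approach described above (the struct layout and helper name follow the commit message; the real emit/flush helpers in extent_io.c may differ in detail):

	struct fiemap_cache {
		u64 offset;	/* logical file offset */
		u64 phys;	/* disk offset */
		u64 len;
		u32 flags;
		bool cached;
	};

	/* merge into the cache if contiguous, otherwise flush the cached extent first */
	static int emit_fiemap_extent(struct fiemap_extent_info *fieinfo,
				      struct fiemap_cache *cache,
				      u64 offset, u64 phys, u64 len, u32 flags)
	{
		int ret;

		if (!cache->cached)
			goto assign;

		if (cache->offset + cache->len == offset &&
		    cache->phys + cache->len == phys &&
		    cache->flags == flags) {
			cache->len += len;
			return 0;
		}

		/* not mergeable: submit the cached extent to user space */
		ret = fiemap_fill_next_extent(fieinfo, cache->offset, cache->phys,
					      cache->len, cache->flags);
		cache->cached = false;
		if (ret)
			return ret;
	assign:
		cache->cached = true;
		cache->offset = offset;
		cache->phys = phys;
		cache->len = len;
		cache->flags = flags;
		return 0;
		/* the caller must flush any remaining cached extent at the end of fiemap */
	}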
2017-04-18  Btrfs: enable repair during read for raid56 profile  (Liu Bo)
Now that scrub can fix data errors with the help of parity for the raid56 profile, repair during read is able to as well. Although the mirror num in the raid56 scenario has different meanings, i.e.

	0 or 1: read data directly
	> 1:    do recovery with parity,

it can fit into how we repair a bad block during read. The trick is to use BTRFS_MAP_READ instead of BTRFS_MAP_WRITE to get the device and position on it. Cc: David Sterba <dsterba@suse.cz> Tested-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-04-18  Btrfs: create a helper for getting chunk map  (Liu Bo)
We have similar code here and there; this merges it into a helper. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-04-18  btrfs: convert extent_state.refs from atomic_t to refcount_t  (Elena Reshetova)
The refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows us to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-04-18  btrfs: convert extent_map.refs from atomic_t to refcount_t  (Elena Reshetova)
The refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows us to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: David Sterba <dsterba@suse.com>
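The conversion in these two patches is mechanical; a sketch of what it looks like for a structure with a 'refs' member (the structure and function names here are illustrative, not the btrfs originals):

	#include <linux/refcount.h>
	#include <linux/slab.h>

	struct example_map {
		refcount_t refs;	/* was: atomic_t refs */
	};

	static void example_init(struct example_map *em)
	{
		refcount_set(&em->refs, 1);		/* was: atomic_set(&em->refs, 1) */
	}

	static void example_get(struct example_map *em)
	{
		refcount_inc(&em->refs);		/* warns and saturates instead of overflowing */
	}

	static void example_put(struct example_map *em)
	{
		if (refcount_dec_and_test(&em->refs))	/* was: atomic_dec_and_test() */
			kfree(em);
	}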
2017-03-29  Btrfs: bring back repair during read  (Liu Bo)
Commit 20a7db8ab3f2 ("btrfs: add dummy callback for readpage_io_failed and drop checks") made a cleanup around readpage_io_failed_hook. It was supposed to keep the original semantics, but it also unexpectedly disabled repair during read for dup, raid1 and raid10. This fixes the problem by letting the data inode call the generic readpage_io_failed callback: its readpage_io_failed_hook returns -EAGAIN in order to notify end_bio_extent_readpage to do the rest. We don't call the generic callback directly because it takes an offset from end_bio_extent_readpage() to calculate the index in the checksum array, and the inode's readpage_io_failed_hook doesn't offer that offset. Cc: David Sterba <dsterba@suse.cz> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> [ keep the const function attribute ] Signed-off-by: David Sterba <dsterba@suse.com>
2017-03-17  Btrfs: fix regression in lock_delalloc_pages  (Liu Bo)
The bug is a regression after commit da2c7009f6ca ("btrfs: teach __process_pages_contig about PAGE_LOCK operation") and commit 76c0021db8fd ("Btrfs: use helper to simplify lock/unlock pages"). If dirty pages which are under writeback get truncated partially before we lock them, we can't find all pages mapping to the delalloc range. The code didn't return an error, so it kept going, found that the delalloc range got truncated, and went on to unlock the dirty pages, at which point the ASSERT caught the error and showed

 -----------------------------------------------------------------------------
 assertion failed: page_ops & PAGE_LOCK, file: fs/btrfs/extent_io.c, line: 1716
 -----------------------------------------------------------------------------

This fixes the bug by returning the proper -EAGAIN. Cc: David Sterba <dsterba@suse.com> Reported-by: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28  btrfs: add dummy callback for readpage_io_failed and drop checks  (David Sterba)
Make the extent_io_ops::readpage_io_failed_hook callback mandatory and define a dummy function for btrfs_extent_io_ops. As the failed-IO callback is not performance critical, the branch vs extra call trade-off does not hurt. Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28  btrfs: drop checks for mandatory extent_io_ops callbacks  (David Sterba)
We know that readpage_end_io_hook, submit_bio_hook and merge_bio_hook are always defined, so we can drop the checks before we call them. Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28  btrfs: let writepage_end_io_hook return void  (David Sterba)
There's no error path in any of the instances; they always return 0. Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28  btrfs: Make get_extent_t take btrfs_inode  (Nikolay Borisov)
In addition to changing the signature, this patch also switches all the functions which are used as an argument to also take btrfs_inode. Namely, those are btrfs_get_extent and btrfs_get_extent_fiemap. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28  btrfs: Make btrfs_clear_bit_hook take btrfs_inode  (Nikolay Borisov)
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28  btrfs: make btrfs_free_io_failure_record take btrfs_inode  (Nikolay Borisov)
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28  btrfs: make clean_io_failure take btrfs_inode  (Nikolay Borisov)
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28  btrfs: make repair_io_failure take btrfs_inode  (Nikolay Borisov)
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28  btrfs: make free_io_failure take btrfs_inode  (Nikolay Borisov)
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28  btrfs: Make btrfs_lookup_ordered_range take btrfs_inode  (Nikolay Borisov)
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-17  btrfs: remove unused parameter from extent_write_cache_pages  (David Sterba)
The 'tree' was used to call a locking hook that does not exist anymore. Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-17  btrfs: remove unused parameter from update_nr_written  (David Sterba)
The logic has been updated in "Btrfs: make mapping->writeback_index point to the last written page" (a91326679f2a0a4c2) and the page is not needed anymore. Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-17  btrfs: remove unused parameter from submit_extent_page  (David Sterba)
This used to hold the maximum number of pages to allocate, but that is now limited by BIO_MAX_PAGES. The local variables are now unused and removed as well. Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-17  btrfs: embed extent_changeset::range_changed to the structure  (David Sterba)
We can embed range_changed in the extent changeset to address the following problems:

 - no need to allocate the ulist dynamically, and we also get rid of the GFP_NOFS for free
 - fix the lack of allocation failure checking in btrfs_qgroup_reserve_data

The stack consumption where extent_changeset is used slightly increases:

 before: 16
 after:  16 - 8 (for pointer) + 32 (sizeof ulist) = 40

which is bearable. Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
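A sketch of the embedding (the real struct lives in the btrfs headers; the init/release helper names here follow the existing btrfs ulist API and are otherwise an assumption of this note):

	#include "ulist.h"	/* btrfs-local header providing struct ulist */

	struct extent_changeset {
		/* how many bytes are set/cleared in this operation */
		unsigned int bytes_changed;

		/* changed ranges, embedded instead of dynamically allocated */
		struct ulist range_changed;	/* was: struct ulist *range_changed */
	};

	static inline void extent_changeset_init(struct extent_changeset *changeset)
	{
		changeset->bytes_changed = 0;
		ulist_init(&changeset->range_changed);
	}

	static inline void extent_changeset_release(struct extent_changeset *changeset)
	{
		changeset->bytes_changed = 0;
		ulist_release(&changeset->range_changed);
	}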