summaryrefslogtreecommitdiff
path: root/fs/gfs2/bmap.c
AgeCommit message (Collapse)Author
2018-12-18gfs2: take jdata unstuff into account in do_growBob Peterson
Before this patch, function do_grow would not reserve enough journal blocks in the transaction to unstuff jdata files while growing them. This patch adds the logic to add one more block if the file to grow is jdata. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
2018-12-11gfs2: add more timing info to journal recovery processAbhi Das
Tells you how many milliseconds map_journal_extents and find_jhead take. Signed-off-by: Abhi Das <adas@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2018-11-16Merge tag 'gfs2-4.20.fixes3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull bfs2 fixes from Andreas Gruenbacher: "Fix two bugs leading to leaked buffer head references: - gfs2: Put bitmap buffers in put_super - gfs2: Fix iomap buffer head reference counting bug And one bug leading to significant slow-downs when deleting large files: - gfs2: Fix metadata read-ahead during truncate (2)" * tag 'gfs2-4.20.fixes3' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: Fix iomap buffer head reference counting bug gfs2: Fix metadata read-ahead during truncate (2) gfs2: Put bitmap buffers in put_super
2018-11-16gfs2: Fix iomap buffer head reference counting bugAndreas Gruenbacher
GFS2 passes the inode buffer head (dibh) from gfs2_iomap_begin to gfs2_iomap_end in iomap->private. It sets that private pointer in gfs2_iomap_get. Users of gfs2_iomap_get other than gfs2_iomap_begin would have to release iomap->private, but this isn't done correctly, leading to a leak of buffer head references. To fix this, move the code for setting iomap->private from gfs2_iomap_get to gfs2_iomap_begin. Fixes: 64bc06bb32 ("gfs2: iomap buffered write support") Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-09gfs2: Fix metadata read-ahead during truncate (2)Andreas Gruenbacher
The previous attempt to fix for metadata read-ahead during truncate was incorrect: for files with a height > 2 (1006989312 bytes with a block size of 4096 bytes), read-ahead requests were not being issued for some of the indirect blocks discovered while walking the metadata tree, leading to significant slow-downs when deleting large files. Fix that. In addition, only issue read-ahead requests in the first pass through the meta-data tree, while deallocating data blocks. Fixes: c3ce5aa9b0 ("gfs2: Fix metadata read-ahead during truncate") Cc: stable@vger.kernel.org # v4.16+ Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2018-10-24Merge tag 'gfs2-4.20.fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 updates from Bob Peterson: "We've got 18 patches for this merge window, none of which are very major: - clean up the gfs2 block allocator to prepare for future performance enhancements (Andreas Gruenbacher) - fix a use-after-free problem (Andy Price) - patches that fix gfs2's broken rgrplvb mount option (me) - cleanup patches and error message improvements (me) - enable getlabel support (Steve Whitehouse and Abhi Das) - flush the glock delete workqueue at exit (Tim Smith)" * tag 'gfs2-4.20.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: Fix minor typo: couln't versus couldn't. gfs2: write revokes should traverse sd_ail1_list in reverse gfs2: Pass resource group to rgblk_free gfs2: Remove unnecessary gfs2_rlist_alloc parameter gfs2: Fix marking bitmaps non-full gfs2: Fix some minor typos gfs2: Rename bitmap.bi_{len => bytes} gfs2: Remove unused RGRP_RSRV_MINBYTES definition gfs2: Move rs_{sizehint, rgd_gh} fields into the inode gfs2: Clean up out-of-bounds check in gfs2_rbm_from_block gfs2: Always check the result of gfs2_rbm_from_block gfs2: getlabel support GFS2: Flush the GFS2 delete workqueue before stopping the kernel threads gfs2: Don't leave s_fs_info pointing to freed memory in init_sbd gfs2: Use fs_* functions instead of pr_* function where we can gfs2: slow the deluge of io error messages gfs2: Don't set GFS2_RDF_UPTODATE when the lvb is updated gfs2: improve debug information when lvb mismatches are found
2018-10-12gfs2: Fix iomap buffered write support for journaled files (2)Andreas Gruenbacher
It turns out that the fix in commit 6636c3cc56 is bad; the assertion that the iomap code no longer creates buffer heads is incorrect for filesystems that set the IOMAP_F_BUFFER_HEAD flag. Instead, what's happening is that gfs2_iomap_begin_write treats all files that have the jdata flag set as journaled files, which is incorrect as long as those files are inline ("stuffed"). We're handling stuffed files directly via the page cache, which is why we ended up with pages without buffer heads in gfs2_page_add_databufs. Fix this by handling stuffed journaled files correctly in gfs2_iomap_begin_write. This reverts commit 6636c3cc5690c11631e6366cf9a28fb99c8b25bb. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2018-10-12gfs2: Pass resource group to rgblk_freeAndreas Gruenbacher
Function rgblk_free can only deal with one resource group at a time, so pass that resource group is as a parameter. Several of the callers already have the resource group at hand, so we only need additional lookup code in a few places. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Steven Whitehouse <swhiteho@redhat.com>
2018-10-09gfs2: Fix iomap buffered write support for journaled filesAndreas Gruenbacher
Commit 64bc06bb32ee broke buffered writes to journaled files (chattr +j): we'll try to journal the buffer heads of the page being written to in gfs2_iomap_journaled_page_done. However, the iomap code no longer creates buffer heads, so we'll BUG() in gfs2_page_add_databufs. Fix that by creating buffer heads ourself when needed. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2018-07-25gfs2: Special-case rindex for gfs2_growAndreas Gruenbacher
To speed up the common case of appending to a file, gfs2_write_alloc_required presumes that writing beyond the end of a file will always require additional blocks to be allocated. This assumption is incorrect for preallocates files, but there are no negative consequences as long as *some* space is still left on the filesystem. One special file that always has some space preallocated beyond the end of the file is the rindex: when growing a filesystem, gfs2_grow adds one or more new resource groups and appends records describing those resource groups to the rindex; the preallocated space ensures that this is always possible. However, when a filesystem is completely full, gfs2_write_alloc_required will indicate that an additional allocation is required, and appending the next record to the rindex will fail even though space for that record has already been preallocated. To fix that, skip the incorrect optimization in gfs2_write_alloc_required, but for the rindex only. Other writes to preallocated space beyond the end of the file are still allowed to fail on completely full filesystems. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Bob Peterson <rpeterso@redhat.com>
2018-07-25Merge branch 'iomap-4.19-merge' into linux-gfs2/for-nextAndreas Gruenbacher
Merge xfs branch 'iomap-4.19-merge' into linux-gfs2/for-next. This brings in readpage and direct I/O support for inline data. The IOMAP_F_BUFFER_HEAD flag introduced in commit "iomap: add initial support for writes without buffer heads" needs to be set for gfs2 as well, so do that in the merge. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2018-07-24Merge branch 'iomap-write' into linux-gfs2/for-nextAndreas Gruenbacher
Pull in the gfs2 iomap-write changes: Tweak the existing code to properly support iomap write and eliminate an unnecessary special case in gfs2_block_map. Implement iomap write support for buffered and direct I/O. Simplify some of the existing code and eliminate code that is no longer used: gfs2: Remove gfs2_write_{begin,end} gfs2: iomap direct I/O support gfs2: gfs2_extent_length cleanup gfs2: iomap buffered write support gfs2: Further iomap cleanups This is based on the following changes on the xfs 'iomap-4.19-merge' branch: iomap: add private pointer to struct iomap iomap: add a page_done callback iomap: generic inline data handling iomap: complete partial direct I/O writes synchronously iomap: mark newly allocated buffer heads as new fs: factor out a __generic_write_end helper Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2018-07-02gfs2: iomap direct I/O supportAndreas Gruenbacher
The page unmapping previously done in gfs2_direct_IO is now done generically in iomap_dio_rw. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Bob Peterson <rpeterso@redhat.com>
2018-07-02gfs2: gfs2_extent_length cleanupAndreas Gruenbacher
Now that gfs2_extent_length is no longer used for determining the size of a hole and always with an upper size limit, the function can be simplified. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Bob Peterson <rpeterso@redhat.com>
2018-07-02gfs2: iomap buffered write supportAndreas Gruenbacher
With the traditional page-based writes, blocks are allocated separately for each page written to. With iomap writes, we can allocate a lot more blocks at once, with a fraction of the allocation overhead for each page. Split calculating the number of blocks that can be allocated at a given position (gfs2_alloc_size) off from gfs2_iomap_alloc: that size determines the number of blocks to allocate and reserve in the journal. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Bob Peterson <rpeterso@redhat.com>
2018-07-02gfs2: Further iomap cleanupsAndreas Gruenbacher
In gfs2_iomap_alloc, set the type of newly allocated extents to IOMAP_MAPPED so that iomap_to_bh will set the bh states correctly: otherwise, the bhs would not be marked as mapped, confusing __mpage_writepage. This means that we need to check for the IOMAP_F_NEW flag in fallocate_chunk now. Further clean up gfs2_iomap_get and implement gfs2_stuffed_iomap here directly. For reads beyond the end of the file, return holes instead of failing with -ENOENT so that we can get rid of that special case in gfs2_block_map. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Bob Peterson <rpeterso@redhat.com>
2018-06-21gfs2: Minor clarification to __gfs2_punch_holeAndreas Gruenbacher
Rename end_off to end_len to make the code less confusing. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-06-05Merge tag 'xfs-4.18-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull xfs updates from Darrick Wong: "New features this cycle include the ability to relabel mounted filesystems, support for fallocated swapfiles, and using FUA for pure data O_DSYNC directio writes. With this cycle we begin to integrate online filesystem repair and refactor the growfs code in preparation for eventual subvolume support, though the road ahead for both features is quite long. There are also numerous refactorings of the iomap code to remove unnecessary log overhead, to disentangle some of the quota code, and to prepare for buffer head removal in a future upstream kernel. Metadata validation continues to improve, both in the hot path veifiers and the online filesystem check code. I anticipate sending a second pull request in a few days with more metadata validation improvements. This series has been run through a full xfstests run over the weekend and through a quick xfstests run against this morning's master, with no major failures reported. Summary: - Strengthen inode number and structure validation when allocating inodes. - Reduce pointless buffer allocations during cache miss - Use FUA for pure data O_DSYNC directio writes - Various iomap refactorings - Strengthen quota metadata verification to avoid unfixable broken quota - Make AGFL block freeing a deferred operation to avoid blowing out transaction reservations when running complex operations - Get rid of the log item descriptors to reduce log overhead - Fix various reflink bugs where inodes were double-joined to transactions - Don't issue discards when trimming unwritten extents - Refactor incore dquot initialization and retrieval interfaces - Fix some locking problmes in the quota scrub code - Strengthen btree structure checks in scrub code - Rewrite swapfile activation to use iomap and support unwritten extents - Make scrub exit to userspace sooner when corruptions or cross-referencing problems are found - Make scrub invoke the data fork scrubber directly on metadata inodes - Don't do background reclamation of post-eof and cow blocks when the fs is suspended - Fix secondary superblock buffer lifespan hinting - Refactor growfs to use table-dispatched functions instead of long stringy functions - Move growfs code to libxfs - Implement online fs label getting and setting - Introduce online filesystem repair (in a very limited capacity) - Fix unit conversion problems in the realtime freemap iteration functions - Various refactorings and cleanups in preparation to remove buffer heads in a future release - Reimplement the old bmap call with iomap - Remove direct buffer head accesses from seek hole/data - Various bug fixes" * tag 'xfs-4.18-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (121 commits) fs: use ->is_partially_uptodate in page_cache_seek_hole_data fs: remove the buffer_unwritten check in page_seek_hole_data fs: move page_cache_seek_hole_data to iomap.c xfs: use iomap_bmap iomap: add an iomap-based bmap implementation iomap: add a iomap_sector helper iomap: use __bio_add_page in iomap_dio_zero iomap: move IOMAP_F_BOUNDARY to gfs2 iomap: fix the comment describing IOMAP_NOWAIT iomap: inline data should be an iomap type, not a flag mm: split ->readpages calls to avoid non-contiguous pages lists mm: return an unsigned int from __do_page_cache_readahead mm: give the 'ret' variable a better name __do_page_cache_readahead block: add a lower-level bio_add_page interface xfs: fix error handling in xfs_refcount_insert() xfs: fix xfs_rtalloc_rec units xfs: strengthen rtalloc query range checks xfs: xfs_rtbuf_get should check the bmapi_read results xfs: xfs_rtword_t should be unsigned, not signed dax: change bdev_dax_supported() to support boolean returns ...
2018-06-04gfs2: Iomap cleanups and improvementsAndreas Gruenbacher
Clean up gfs2_iomap_alloc and gfs2_iomap_get. Document how gfs2_iomap_alloc works: it now needs to be called separately after gfs2_iomap_get where necessary; this will be used later by iomap write. Move gfs2_iomap_ops into bmap.c. Introduce a new gfs2_iomap_get_alloc helper and use it in fallocate_chunk: gfs2_iomap_begin will become unsuitable for fallocate with proper iomap write support. In gfs2_block_map and fallocate_chunk, zero-initialize struct iomap. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-06-04gfs2: Remove ordered write mode handling from gfs2_trans_add_dataAndreas Gruenbacher
In journaled data mode, we need to add each buffer head to the current transaction. In ordered write mode, we only need to add the inode to the ordered inode list. So far, both cases are handled in gfs2_trans_add_data. This makes the code look misleading and is inefficient for small block sizes as well. Handle both cases separately instead. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-06-04gfs2: hole_size improvementAndreas Gruenbacher
Reimplement function hole_size based on a generic function for walking the metadata tree and rename hole_size to gfs2_hole_size. While previously, multiple invocations of hole_size were sometimes needed to walk across the entire hole, the new implementation always returns the entire hole at once (provided that the caller is interested in the total size). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-06-04gfs2: Update find_metapath commentAndreas Gruenbacher
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-06-01iomap: move IOMAP_F_BOUNDARY to gfs2Christoph Hellwig
Just define a range of fs specific flags and use that in gfs2 instead of exposing this internal flag globally. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-06-01iomap: inline data should be an iomap type, not a flagChristoph Hellwig
Inline data is fundamentally different from our normal mapped case in that it doesn't even have a block address. So instead of having a flag for it it should be an entirely separate iomap range type. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-04-16gfs2: Remove sdp->sd_jheightsizeAndreas Gruenbacher
GFS2 keeps two arrarys in the superblock that define the maximum size of an inode depending on the inode's height: sdp->sd_heightsize defines the heights in units of sb->s_blocksize; sdp->sd_jheightsize defines them in units of sb->s_blocksize - sizeof(struct gfs2_meta_header). These arrays are used to determine when additional layers of indirect blocks are needed. The second array is used for directories which have an additional gfs2_meta_header at the beginning of each block. Distinguishing between these two cases makes no sense: the height required for representing N blocks will come out the same no matter if the calculation is done in gross (sb->s_blocksize) or net (sb->s_blocksize - sizeof(struct gfs2_meta_header)) units. Stuffed directories don't have an additional gfs2_meta_header, but the stuffed case is handled separately for both files and directories, anyway. Remove the unncessary sdp->sd_jheightsize array. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-04-12GFS2: Minor improvements to comments and documentationBob Peterson
This patch simply fixes some comments and the gfs2-glocks.txt file: Places where i_rwsem was called i_mutex, and adding i_rw_mutex. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-03-29gfs2: Zero out fallocated blocks in fallocate_chunkAndreas Gruenbacher
Instead of zeroing out fallocated blocks in gfs2_iomap_alloc, zero them out in fallocate_chunk, much higher up the call stack. This gets rid of gfs2's abuse of the IOMAP_ZERO flag as well as the gfs2 specific zeronew buffer flag. I can't think of a reason why zeroing out the blocks in gfs2_iomap_alloc would have any benefits: there is no additional locking at that level that would add protection to the newly allocated blocks. While at it, change fallocate over from gs2_block_map to gfs2_iomap_begin. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Acked-by: Christoph Hellwig <hch@lst.de>
2018-03-23gfs2: Check for the end of metadata in punch_holeAndreas Gruenbacher
When punching a hole or truncating an inode down to a given size, also check if the truncate point / start of the hole is within the range we have metadata for. Otherwise, we can end up freeing blocks that shouldn't be freed, corrupting the inode, or crashing the machine when trying to punch a hole into the void. When growing an inode via truncate, we set the new size but we don't allocate additional levels of indirect blocks and grow the inode height. When shrinking that inode again, the new size may still point beyond the end of the inode's metadata. Fixes xfstest generic/476. Debugged-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-03-08gfs2: Improve gfs2_block_map commentAndreas Gruenbacher
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-03-07gfs2: Fixes to "Implement iomap for block_map" (2)Andreas Gruenbacher
It turns out that commit 3229c18c0d6b2 'Fixes to "Implement iomap for block_map"' introduced another bug in gfs2_iomap_begin that can cause gfs2_block_map to set bh->b_size of an actual buffer to 0. This can lead to arbitrary incorrect behavior including crashes or disk corruption. Revert the incorrect part of that commit. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-02-13gfs2: Fixes to "Implement iomap for block_map"Andreas Gruenbacher
It turns out that commit 3974320ca6 "Implement iomap for block_map" introduced a few bugs that trigger occasional failures with xfstest generic/476: In gfs2_iomap_begin, we jump to do_alloc when we determine that we are beyond the end of the allocated metadata (height > ip->i_height). There, we can end up calling hole_size with a metapath that doesn't match the current metadata tree, which doesn't make sense. After untangling the code at do_alloc, fix this by checking if the block we are looking for is within the range of allocated metadata. In addition, add a BUG() in case gfs2_iomap_begin is accidentally called for reading stuffed files: this is handled separately. Make sure we don't truncate iomap->length for reads beyond the end of the file; in that case, the entire range counts as a hole. Finally, revert to taking a bitmap write lock when doing allocations. It's unclear why that change didn't lead to any failures during testing. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-01-18gfs2: Add gfs2_max_stuffed_sizeAndreas Gruenbacher
Add a small inline function for computing the maximum size of a stuffed inode instead of open coding that in several places throughout the code. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-01-18gfs2: Implement fallocate(FALLOC_FL_PUNCH_HOLE)Andreas Gruenbacher
Implement the top-level bits of punching a hole into a file. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-01-18gfs2: Turn trunc_dealloc into punch_holeAndreas Gruenbacher
Add an upper bound to the range of blocks to deallocate blocks to function trunc_dealloc so that this function can be used for truncating a file as well as for punching a hole into a file. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-01-18gfs2: Generalize truncate codeAndreas Gruenbacher
Pull the code for computing the range of metapointers to iterate out of gfs2_metapath_ra (for readahead), sweep_bh_for_rgrps (for deallocating metapointers within a block), and trunc_dealloc (for walking the metadata tree). In sweep_bh_for_rgrps, move the code for looking up the resource group descriptor of the current resource group out of the inner loop. The metatype check moves to trunc_dealloc. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-01-17Turn gfs2_block_truncate_page into gfs2_block_zero_rangeAndreas Gruenbacher
Turn gfs2_block_truncate_page into a function that zeroes a range within a block rather than only the end of a block. This will be used for cleaning the end of the first partial block and the start of the last partial block when punching a hole in a file. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-01-17gfs2: Improve non-recursive delete algorithmAndreas Gruenbacher
In rare cases, the current non-recursive delete algorithm doesn't deallocate empty intermediary indirect blocks. This should have very little practical effect, but deallocating all blocks correctly should still be preferable as it is cleaner and easier to validate. The fix consists of using the first block to deallocate to compute the start marker of the truncate point instead of the last block that needs to be kept. With that change, computing which indirect blocks are still needed becomes relatively easy. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-01-17gfs2: Fix metadata read-ahead during truncateAndreas Gruenbacher
The metadata read-ahead algorithm broke when switching from recursive to non-recursive delete: the current algorithm reads ahead blocks at height N - 1 while deallocating the blocks at hight N. However, deallocating the blocks at height N requires a complete walk of the metadata tree, not only down to height N - 1. Consequently, all blocks below height N - 1 will be accessed without read-ahead. Fix this by issuing read-aheads as early as possible, after each metapath lookup. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-01-17gfs2: Clean up {lookup,fillup}_metapathAndreas Gruenbacher
Split out the entire lookup loop from lookup_metapath and fillup_metapath. Make both functions return the actual height in mp->mp_aheight, and return 0 on success. Handle lookup errors properly in trunc_dealloc. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-01-17gfs2: Remove minor gfs2_journaled_truncate inefficienciesAndreas Gruenbacher
First, this function truncates the file in chunks. When the original file size isn't block aligned, each chunk that is truncated will remain be misaligned. This is inefficient. Second, this function doesn't recognize where holes are, so it loops through them. For each chunk of a hole, it creates a new transaction. At least avoid creating another transactions whe the current one is still empty. (An better fix would be to skip large holes, of course.) Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-01-17gfs2: truncate: Remove unnecessary oldsize parametersAndreas Gruenbacher
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-01-17gfs2: Clean up trunc_start error pathAndreas Gruenbacher
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-01-17gfs2: Add gfs2_blk2rgrpd comment and fix incorrect useSteven Whitehouse
Document when to use gfs2_blk2rgrpd for "inexact" resource group matching. Based on that, fix an incorrect use of gfs2_blk2rgrpd in sweep_bh_for_rgrps. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2017-10-31GFS2: Implement iomap for block_mapBob Peterson
This patch implements iomap for block mapping, and switches the block_map function to use it under the covers. The additional IOMAP_F_BOUNDARY iomap flag indicates when iomap has reached a "metadata boundary" and fetching the next mapping is likely to incur an additional I/O. This flag is used for setting the bh buffer boundary flag. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2017-10-31GFS2: Make height info part of metapathBob Peterson
This patch eliminates height parameters from function gfs2_bmap_alloc. Function find_metapath determines the metapath's "find height", also known as the desired height. Function lookup_metapath determines the metapath's "actual height", previously known as starting height or sheight. Function gfs2_bmap_alloc now gets both height values from the metapath. This simplification was done as a step toward switching the block_map functions to using iomap. The bh_map responsibilities are also removed from function gfs2_bmap_alloc for the same reason. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2017-09-25gfs2: Clarify gfs2_block_mapAndreas Gruenbacher
Add a comment about the logical block size for directories. Rename "bsize" in gfs2_block_map to "factor". Fix a typo in the description of metaptr1. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2017-08-30GFS2: Fix non-recursive truncate bugBob Peterson
Before this patch if you truncated a file to a smaller size it wasn't freeing all the blocks properly. There are two reasons. First, the metapath comparison was not comparing previous heights. I added a function, mp_eq_to_hgt, which checks the metapath at all heights prior to the target height. Second, in function find_nonnull_ptr, it needed to zero out all pointers for heights following the target height. Translated into decimal integer terms, this way a number like 299, when incremented, becomes 300, not 399. The 2 gets incremented to 3, and the following digits need to be reset. These two things allow the truncate state machine to properly find the blocks it needs to delete. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2017-07-21gfs2: add flag REQ_PRIO for metadata I/OColy Li
When gfs2 does metadata I/O, only REQ_META is used as a metadata hint of the bio. But flag REQ_META is just a hint for block trace, not for block layer code to handle a bio as metadata request. For some of metadata I/Os of gfs2, A REQ_PRIO flag on the metadata bio would be very informative to block layer code. For example, if bcache is used as a I/O cache for gfs2, it will be possible for bcache code to get the hint and cache the pre-fetched metadata blocks on cache device. This behavior may be helpful to improve metadata I/O performance if the following requests hit the cache. Here are the locations in gfs2 code where a REQ_PRIO flag should be added, - All places where REQ_READAHEAD is used, gfs2 code uses this flag for metadata read ahead. - In gfs2_meta_rq() where the first metadata block is read in. - In gfs2_write_buf_to_page(), read in quota metadata blocks to have them up to date. These metadata blocks are probably to be accessed again in future, adding a REQ_PRIO flag may have bcache to keep such metadata in fast cache device. For system without a cache layer, REQ_PRIO can still provide hint to block layer to handle metadata requests more properly. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2017-07-05Merge tag 'gfs2-4.13.fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull GFS2 updates from Bob Peterson: "We've got eight GFS2 patches for this merge window: - Andreas Gruenbacher has four patches related to cleaning up the GFS2 inode evict process. This is about half of his patches designed to fix a long-standing GFS2 hang related to the inode shrinker: Shrinker calls gfs2 evict, evict calls DLM, DLM requires memory and blocks on the shrinker. These four patches have been well tested. His second set of patches are still being tested, so I plan to hold them until the next merge window, after we have more weeks of testing. The first patch eliminates the flush_delayed_work, which can block. - Andreas's second patch protects setting of gl_object for rgrps with a spin_lock to prevent proven races. - His third patch introduces a centralized mechanism for queueing glock work with better reference counting, to prevent more races. -His fourth patch retains a reference to inode glocks when an error occurs while creating an inode. This keeps the subsequent evict from needing to reacquire the glock, which might call into DLM and block in low memory conditions. - Arvind Yadav has a patch to add const to attribute_group structures. - I have a patch to detect directory entry inconsistencies and withdraw the file system if any are found. Better that than silent corruption. - I have a patch to remove a vestigial variable from glock structures, saving some slab space. - I have another patch to remove a vestigial variable from the GFS2 in-core superblock structure" * tag 'gfs2-4.13.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: GFS2: constify attribute_group structures. gfs2: gfs2_create_inode: Keep glock across iput gfs2: Clean up glock work enqueuing gfs2: Protect gl->gl_object by spin lock gfs2: Get rid of flush_delayed_work in gfs2_evict_inode GFS2: Eliminate vestigial sd_log_flush_wrapped GFS2: Remove gl_list from glock structure GFS2: Withdraw when directory entry inconsistencies are detected
2017-07-05gfs2: Protect gl->gl_object by spin lockAndreas Gruenbacher
Put all remaining accesses to gl->gl_object under the gl->gl_lockref.lock spinlock to prevent races. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>