path: root/fs/xfs
Age  Commit message  (Author)
2018-05-25 xfs, proc: hide unused xfs procfs helpers (Arnd Bergmann)
These two functions now trigger a warning when CONFIG_PROC_FS is disabled:

    fs/xfs/xfs_stats.c:128:12: error: 'xqmstat_proc_show' defined but not used [-Werror=unused-function]
     static int xqmstat_proc_show(struct seq_file *m, void *v)
                ^~~~~~~~~~~~~~~~~
    fs/xfs/xfs_stats.c:118:12: error: 'xqm_proc_show' defined but not used [-Werror=unused-function]
     static int xqm_proc_show(struct seq_file *m, void *v)
                ^~~~~~~~~~~~~

Previously, they were referenced from an unused 'static const' structure, which is silently dropped by gcc. We can address the warning by adding the same #ifdef around them that hides the reference.

Fixes: 3f3942aca6da ("proc: introduce proc_create_single{,_data}")
Cc: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
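The fix described above can be illustrated with a small user-space sketch. It is not the kernel patch: `DEMO_PROC_FS` is a hypothetical stand-in for `CONFIG_PROC_FS`, and `demo_stat_show()` and `demo_stats_register()` are invented names. The point is only that the static helper sits under the same `#ifdef` as its sole caller, so neither configuration leaves a defined-but-unused static function behind.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for CONFIG_PROC_FS; remove it to build the "disabled" variant. */
#define DEMO_PROC_FS 1

#ifdef DEMO_PROC_FS
/* Compiled only when the reference below is compiled too. */
static int demo_stat_show(char *buf, size_t len)
{
	return snprintf(buf, len, "qm stats\n");
}
#endif

/* Invented registration hook: the only user of demo_stat_show(). */
int demo_stats_register(char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
#ifdef DEMO_PROC_FS
	return demo_stat_show(buf, len);
#else
	buf[0] = '\0';	/* nothing to show without procfs support */
	return 0;
#endif
}
```

With the macro removed, the helper disappears together with its caller, so `-Werror=unused-function` has nothing to complain about.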
2018-05-22 xfs_vn_lookup: simplify a bit (Al Viro)
have all post-xfs_lookup() branches converge on d_splice_alias() Cc: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-05-22 xfs, dax: introduce xfs_break_dax_layouts() (Dan Williams)
xfs_break_dax_layouts(), similar to xfs_break_leased_layouts(), scans for busy / pinned dax pages and waits for those pages to go idle before any potential extent unmap operation. dax_layout_busy_page() handles synchronizing against new page-busy events (get_user_pages). It invalidates all mappings to trigger the get_user_pages slow path which will eventually block on the xfs inode lock held in XFS_MMAPLOCK_EXCL mode. If dax_layout_busy_page() finds a busy page it returns it for xfs to wait for the page-idle event that will fire when the page reference count reaches 1 (recall ZONE_DEVICE pages are idle at count 1, see generic_dax_pagefree()). While waiting, the XFS_MMAPLOCK_EXCL lock is dropped in order to not deadlock the process that might be trying to elevate the page count of more pages before arranging for any of them to go idle. I.e. the typical case of submitting I/O is that iov_iter_get_pages() elevates the reference count of all pages in the I/O before starting I/O on the first page. The process of elevating the reference count of all pages involved in an I/O may cause faults that need to take XFS_MMAPLOCK_EXCL. Although XFS_MMAPLOCK_EXCL is dropped while waiting, XFS_IOLOCK_EXCL is held while sleeping. We need this to prevent starvation of the truncate path as continuous submission of direct-I/O could starve the truncate path indefinitely if the lock is dropped. Cc: Dave Chinner <david@fromorbit.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Reported-by: Jan Kara <jack@suse.cz> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-05-22 xfs: prepare xfs_break_layouts() for another layout type (Dan Williams)
When xfs is operating as the back-end of a pNFS block server, it prevents collisions between local and remote operations by requiring a lease to be held for remotely accessed blocks. Local filesystem operations break those leases before writing or mutating the extent map of the file.

A similar mechanism is needed to prevent operations on pinned dax mappings, like device-DMA, from colliding with extent unmap operations.

BREAK_WRITE and BREAK_UNMAP are introduced as two distinct levels of layout breaking. Layouts are broken in the BREAK_WRITE case to ensure that layout-holders do not collide with local writes. Additionally, layouts are broken in the BREAK_UNMAP case to make sure the layout-holder has a consistent view of the file's extent map. While BREAK_WRITE breaks can be satisfied by recalling FL_LAYOUT leases, BREAK_UNMAP breaks additionally require waiting for busy dax-pages to go idle while holding XFS_MMAPLOCK_EXCL.

After this refactoring xfs_break_layouts() becomes the entry point for coordinating both types of breaks. Finally, xfs_break_leased_layouts() becomes just the BREAK_WRITE handler.

Note that the unlock tracking is needed in a follow-on change. That will coordinate retrying either break handler until both successfully test for a lease break while maintaining the lock state.

Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Reported-by: Dave Chinner <david@fromorbit.com>
Reported-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-05-22 xfs: prepare xfs_break_layouts() to be called with XFS_MMAPLOCK_EXCL (Dan Williams)
In preparation for adding coordination between extent unmap operations and busy dax-pages, update xfs_break_layouts() to permit it to be called with the mmap lock held. This lock scheme will be required for coordinating the break of 'dax layouts' (non-idle dax (ZONE_DEVICE) pages mapped into the file's address space). Breaking dax layouts will be added to xfs_break_layouts() in a future patch, for now this preps the unmap call sites to take and hold XFS_MMAPLOCK_EXCL over the call to xfs_break_layouts(). Cc: "Darrick J. Wong" <darrick.wong@oracle.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Dave Chinner <david@fromorbit.com> Suggested-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <darrick.wong@oracle.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-05-16 xfs: implement online get/set fs label (Eric Sandeen)
The GET ioctl is trivial, just return the current label. The SET ioctl is more involved: it transactionally modifies the superblock to write a new filesystem label to the primary super. A new variant of xfs_sync_sb then writes the superblock buffer immediately to disk so that the change is visible from userspace. It then invalidates any page cache that userspace might have previously read on the block device so that e.g. blkid can see the change immediately, and updates all secondary superblocks as userspace relabel does. Signed-off-by: Eric Sandeen <sandeen@redhat.com> [darrick: use dchinner's new xfs_update_secondary_sbs function] Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-05-16 proc: introduce proc_create_single{,_data} (Christoph Hellwig)
Variants of proc_create{,_data} that directly take a seq_file show callback and drastically reduce the boilerplate code in the callers. All trivial callers converted over. Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-15 xfs: factor the ag length extension code into libxfs (Dave Chinner)
Growfs currently manually codes the extension of the last AG in a filesystem during the growfs process. Factor that out of the growfs code and move it into libxfs along with the rest of the AG header modification code. Signed-Off-By: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-05-15 xfs: move growfs core to libxfs (Dave Chinner)
So it can be shared with userspace (e.g. mkfs) easily. Signed-Off-By: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-05-15 xfs: rework secondary superblock updates in growfs (Dave Chinner)
Right now we wait until we've committed changes to the primary superblock before we initialise any of the new secondary superblocks. This means that if we have any write errors for new secondary superblocks we end up with garbage in place rather than zeros or even an "in progress" superblock to indicate a grow operation is being done. To ensure we can write the secondary superblocks, initialise them earlier in the same loop that initialises the AG headers. We stamp the new secondary superblocks here with the old geometry, but set the "sb_inprogress" field to indicate that updates are being done to the superblock so they cannot be used. This will result in the secondary superblock fields being updated or triggering errors that will abort the grow before we commit any permanent changes. This also means we can change the update mechanism of the secondary superblocks. We know that we are going to wholly overwrite the information in the struct xfs_sb in the buffer, so there's no point reading it from disk. Just allocate an uncached buffer, zero it in memory, stamp the new superblock structure in it and write it out. If we fail to write it out, then we'll leave the existing sb (old or new w/ inprogress) on disk for repair to deal with later. Signed-Off-By: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-05-15 xfs: separate secondary sb update in growfs (Dave Chinner)
This happens after all the transactions to update the superblock occur, and errors need to be handled slightly differently. Separate out the code into its own function, and clean up the error goto stack in the core growfs code as it is now much simpler. Signed-Off-By: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-05-15 xfs: make imaxpct changes in growfs separate (Dave Chinner)
When growfs changes the imaxpct value of the filesystem, it runs through all the "change size" growfs code, whether it needs to or not. Separate out changing imaxpct into its own function and transaction to simplify the rest of the growfs code. Signed-Off-By: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-05-15 xfs: turn ag header initialisation into a table driven operation (Dave Chinner)
There's still more cookie cutter code in setting up each AG header. Separate all the variables into a simple structure and iterate a table of header definitions to initialise everything. Signed-Off-By: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
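A table-driven initialiser of the kind described above might look roughly like the following user-space sketch. Everything here is an assumption for illustration: the `demo_*` names, the tiny block size, the block numbers, and the four-byte tags written by each callback are not taken from the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_NBLOCKS 4
#define DEMO_BLKSZ   8

/* Hypothetical in-memory AG: a few tiny header blocks. */
struct demo_ag {
	char blocks[DEMO_NBLOCKS][DEMO_BLKSZ];
};

/* One init callback per header type; each stamps an illustrative tag and the AG number. */
static void demo_init_sb(char *b, unsigned agno)   { memcpy(b, "XFSB", 4); b[4] = (char)agno; }
static void demo_init_agf(char *b, unsigned agno)  { memcpy(b, "XAGF", 4); b[4] = (char)agno; }
static void demo_init_agi(char *b, unsigned agno)  { memcpy(b, "XAGI", 4); b[4] = (char)agno; }
static void demo_init_rmap(char *b, unsigned agno) { memcpy(b, "RMB3", 4); b[4] = (char)agno; }

/* One table row per header: where it lives, how to build it, whether it is needed. */
struct demo_hdr_def {
	unsigned blkno;
	void (*init)(char *buf, unsigned agno);
	bool need;
};

/* Walk the table instead of open-coding each header; returns headers written. */
int demo_init_ag_headers(struct demo_ag *ag, unsigned agno, bool has_rmapbt)
{
	const struct demo_hdr_def defs[] = {
		{ 0, demo_init_sb,   true },
		{ 1, demo_init_agf,  true },
		{ 2, demo_init_agi,  true },
		{ 3, demo_init_rmap, has_rmapbt },	/* optional feature */
	};
	int written = 0;

	assert(ag != NULL);
	for (size_t i = 0; i < sizeof(defs) / sizeof(defs[0]); i++) {
		if (!defs[i].need)
			continue;
		memset(ag->blocks[defs[i].blkno], 0, DEMO_BLKSZ);
		defs[i].init(ag->blocks[defs[i].blkno], agno);
		written++;
	}
	return written;
}
```

Adding a new header type then means adding one callback and one table row, rather than another copy of the allocate, zero, stamp, write sequence.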
2018-05-15 xfs: factor ag btree root block initialisation (Dave Chinner)
Cookie cutter code, easily factored. Signed-Off-By: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-05-15 xfs: convert growfs AG header init to use buffer lists (Dave Chinner)
We currently write all new AG headers synchronously, which can be slow for large grow operations. All we really need to do is ensure all the headers are on disk before we run the growfs transaction, so convert this to a buffer list and a delayed write operation. We block waiting for the delayed write buffer submission to complete, so this will fulfill the requirement to have all the buffers written correctly before proceeding. Signed-Off-By: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-05-15 xfs: factor out AG header initialisation from growfs core (Dave Chinner)
The initialisation of new AG headers is mostly common between the userspace mkfs code and growfs in the kernel, so start factoring it out so we can move it to libxfs and use it in both places. Signed-Off-By: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-05-15 xfs: one-shot cached buffers (Dave Chinner)
For the new growfs work, we want to ensure that we serialise secondary superblock updates with other operations (e.g. scrub) correctly, but we don't want to cache the buffers for long term reuse. We need cached buffers for serialisation, however. To solve this, introduce a "oneshot" buffer which will be marshalled through the cache but then released once the last current reference goes away. If the buffer is already cached, then we ignore the "one-shot" behaviour and leave the buffer in the state it was prior to the one-shot command being run. This means we don't perturb either the working set or existing cached buffer state by a one-shot operation. Signed-Off-By: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-05-15 xfs: implement the metadata repair ioctl flag (Darrick J. Wong)
Plumb in the pieces necessary to make the "scrub" subfunction of the scrub ioctl actually work. This means that we make the IFLAG_REPAIR flag to the scrub ioctl actually do something, and we add an errortag knob so that xfstests can force the kernel to rebuild a metadata structure even if there's nothing wrong with it. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-05-15 xfs: create tracepoints for online repair (Darrick J. Wong)
These tracepoints will be used to debug the online repair routines. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-05-15 xfs: teach xfs_bmapi_remap to accept some bmapi flags (Darrick J. Wong)
Teach xfs_bmapi_remap how to map in unwritten extents and to skip rmap updates. This enables us to rebuild real and unwritten extents from the rmapbt. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-05-15 xfs: make xfs_bmapi_remapi work with attribute forks (Darrick J. Wong)
Add a new flags argument to xfs_bmapi_remapi so that we can pass BMAPI flags into the function. This enables us to pass in BMAPI_ATTRFORK so that we can remap things into the attribute fork. Eventually the online repair code will use this to rebuild attribute forks, so make it non-static. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-05-15 xfs: hoist xfs_scrub_agfl_walk to libxfs as xfs_agfl_walk (Darrick J. Wong)
This function is basically a generic AGFL block iterator, so promote it to libxfs ahead of online repair wanting to use it. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-05-15 xfs: avoid ABBA deadlock when scrubbing parent pointers (Darrick J. Wong)
In normal operation, the XFS convention is to take an inode's iolock and then allocate a transaction. However, when scrubbing parent inodes this is inverted -- we allocated the transaction to do the scrub, and now we're trying to grab the parent's iolock. This can lead to ABBA deadlocks: some thread grabbed the parent's iolock and is waiting for space for a transaction while our parent scrubber is sitting on a transaction trying to get the parent's iolock. Therefore, convert all iolock attempts to use trylock; if that fails, they can use the existing mechanisms to back off and try again. The ABBA deadlock didn't happen with a non-repair scrub because the transactions don't reserve any space, but repair scrubs require reservation in order to update metadata. However, any other concurrent metadata update (e.g. directory create in the parent) could also induce this deadlock with the parent scrubber. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-05-15 xfs: scrub the data fork of the realtime inodes (Darrick J. Wong)
The realtime bitmap and summary inodes live on the metadata device, so we can scrub their data forks with the regular scrubbers. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-05-15 xfs: quota scrub should use bmapbtd scrubber (Darrick J. Wong)
Replace the quota scrubber's open-coded data fork scrubber with a redirected call to the bmapbtd scrubber. This strengthens the quota scrub to include all the cross-referencing that it does. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-05-15 xfs: don't continue scrub if already corrupt (Darrick J. Wong)
If we've already decided that something is corrupt, we might as well abort all the loops and exit as quickly as possible. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-05-15 xfs: refactor quota limits initialization (Darrick J. Wong)
Replace all the if (!error) weirdness with helper functions that follow our regular coding practices, and factor out the ternary expression soup. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-05-15 xfs: superblock scrub should use short-lived buffers (Darrick J. Wong)
Secondary superblocks are rarely used, so create a helper to read a given non-primary AG's superblock and ensure that it won't stick around hogging memory. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-05-15 xfs: skip scrub xref if corruption already noted (Darrick J. Wong)
Don't bother looking for cross-referencing problems if the metadata is already corrupt or we've already found a cross-referencing problem. Since we added a helper function for flags testing, convert existing users to use it. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-05-15 xfs: clear sb->s_fs_info on mount failure (Dave Chinner)
We recently had an oops reported on a 4.14 kernel in xfs_reclaim_inodes_count() where sb->s_fs_info pointed to garbage and so the m_perag_tree lookup walked into lala land.

Essentially, the machine was under memory pressure when the mount was being run, and xfs_fs_fill_super() failed after allocating the xfs_mount and attaching it to sb->s_fs_info. It then cleaned up and freed the xfs_mount, but the sb->s_fs_info field still pointed to the freed memory. Hence when the superblock shrinker then ran it fell off the bad pointer.

With the superblock shrinker problem fixed at the VFS level, this stale s_fs_info pointer is still a problem - we use it unconditionally in ->put_super when the superblock is being torn down, and hence we can still trip over it after a ->fill_super call failure. Hence we need to clear s_fs_info if xfs_fs_fill_super() fails, and we need to check if it's valid in the places it can potentially be dereferenced after a ->fill_super failure.

Signed-Off-By: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
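The shape of the fix can be shown with a user-space model. The structures and functions below (`struct demo_super`, `demo_fill_super()`, `demo_put_super()`) are hypothetical stand-ins, and the `fail` parameter simply forces the error path that memory pressure triggered in the report.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the xfs_mount and the VFS superblock. */
struct demo_mount { int ag_count; };
struct demo_super { void *s_fs_info; };

/* Model of ->fill_super: attach the mount early, then possibly fail later. */
int demo_fill_super(struct demo_super *sb, int fail)
{
	struct demo_mount *mp = calloc(1, sizeof(*mp));

	assert(sb != NULL);
	if (!mp)
		return -ENOMEM;
	sb->s_fs_info = mp;

	if (fail)
		goto out_free;
	return 0;

out_free:
	sb->s_fs_info = NULL;	/* the fix: never leave a pointer to freed memory */
	free(mp);
	return -ENOMEM;
}

/* Model of ->put_super: must tolerate a superblock whose fill_super failed. */
void demo_put_super(struct demo_super *sb)
{
	if (!sb->s_fs_info)
		return;
	free(sb->s_fs_info);
	sb->s_fs_info = NULL;
}
```

Both halves matter: clearing the pointer on failure removes the stale reference, and the NULL check keeps teardown safe when it runs after a failed mount.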
2018-05-15 xfs: add mount delay debug option (Dave Chinner)
Similar to log_recovery_delay, this delay occurs between the VFS superblock being initialised and the xfs_mount being fully initialised. It also poisons the per-ag radix tree node so that it can be used for triggering shrinker races during mount such as the following:

<run memory pressure workload in background>

$ cat dirty-mount.sh
#! /bin/bash

umount -f /dev/pmem0
mkfs.xfs -f /dev/pmem0
mount /dev/pmem0 /mnt/test
rm -f /mnt/test/foo
xfs_io -fxc "pwrite 0 4k" -c fsync -c "shutdown" /mnt/test/foo
umount /dev/pmem0

# let's crash it now!
echo 30 > /sys/fs/xfs/debug/mount_delay
mount /dev/pmem0 /mnt/test
echo 0 > /sys/fs/xfs/debug/mount_delay
umount /dev/pmem0
$ sudo ./dirty-mount.sh
.....
[ 60.378118] CPU: 3 PID: 3577 Comm: fs_mark Tainted: G D W 4.16.0-rc5-dgc #440
[ 60.378120] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 60.378124] RIP: 0010:radix_tree_next_chunk+0x76/0x320
[ 60.378127] RSP: 0018:ffffc9000276f4f8 EFLAGS: 00010282
[ 60.383670] RAX: a5a5a5a5a5a5a5a4 RBX: 0000000000000010 RCX: 000000000000001a
[ 60.385277] RDX: 0000000000000000 RSI: ffffc9000276f540 RDI: 0000000000000000
[ 60.386554] RBP: 0000000000000000 R08: 0000000000000000 R09: a5a5a5a5a5a5a5a5
[ 60.388194] R10: 0000000000000006 R11: 0000000000000001 R12: ffffc9000276f598
[ 60.389288] R13: 0000000000000040 R14: 0000000000000228 R15: ffff880816cd6458
[ 60.390827] FS: 00007f5c124b9740(0000) GS:ffff88083fc00000(0000) knlGS:0000000000000000
[ 60.392253] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 60.393423] CR2: 00007f5c11bba0b8 CR3: 000000035580e001 CR4: 00000000000606e0
[ 60.394519] Call Trace:
[ 60.395252] radix_tree_gang_lookup_tag+0xc4/0x130
[ 60.395948] xfs_perag_get_tag+0x37/0xf0
[ 60.396522] xfs_reclaim_inodes_count+0x32/0x40
[ 60.397178] xfs_fs_nr_cached_objects+0x11/0x20
[ 60.397837] super_cache_count+0x35/0xc0
[ 60.399159] shrink_slab.part.66+0xb1/0x370
[ 60.400194] shrink_node+0x7e/0x1a0
[ 60.401058] try_to_free_pages+0x199/0x470
[ 60.402081] __alloc_pages_slowpath+0x3a1/0xd20
[ 60.403729] __alloc_pages_nodemask+0x1c3/0x200
[ 60.404941] cache_grow_begin+0x20b/0x2e0
[ 60.406164] fallback_alloc+0x160/0x200
[ 60.407088] kmem_cache_alloc+0x111/0x4e0
[ 60.408038] ? xfs_buf_rele+0x61/0x430
[ 60.408925] kmem_zone_alloc+0x61/0xe0
[ 60.409965] xfs_inode_alloc+0x24/0x1d0
.....

Signed-Off-By: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-05-15 xfs: factor out nodiscard helpers (Brian Foster)
The changes to skip discards of speculative preallocation and unwritten extents introduced several new wrapper functions through the bunmapi -> extent free codepath to reduce churn in all of the associated callers. In several cases, these wrappers simply toggle a single flag to skip or not skip discards for the resulting blocks. The explicit _nodiscard() wrappers for such an isolated set of callers are a bit overkill. Kill off these wrappers and replace them with calls to the underlying functions in the contexts that need to control discard behavior. Retain the wrappers that preserve the original calling conventions to serve the original purpose of reducing code churn. This is a refactoring patch and does not change behavior. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-05-15 iomap: add a swapfile activation function (Darrick J. Wong)
Add a new iomap_swapfile_activate function so that filesystems can activate swap files without having to use the obsolete and slow bmap function. This enables XFS to support fallocate'd swap files and swap files on realtime devices. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Jan Kara <jack@suse.cz>
2018-05-15 xfs: halt auto-reclamation activities while rebuilding rmap (Darrick J. Wong)
Rebuilding the reverse-mapping tree requires us to quiesce all inodes in the filesystem, so we must stop background reclamation of post-EOF and CoW prealloc blocks. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-05-15 xfs: add BMAPI_NORMAP flag to perform block remapping without updating rmapbt (Darrick J. Wong)
Add a new flag, XFS_BMAPI_NORMAP, which will perform file block remapping without updating the rmapbt. This will be used by the repair code to reconstruct bmbts from the rmapbt, in which case we don't want the rmapbt update. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-05-15 xfs: add repair helpers for the reference count btree (Darrick J. Wong)
Add a couple of functions to the refcount btree and generic btree code that will be used to repair the refcountbt. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-05-15 xfs: add repair helpers for the reverse mapping btree (Darrick J. Wong)
Add a couple of functions to the reverse mapping btree that will be used to repair the rmapbt. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-05-15 xfs: expose various functions to repair code (Darrick J. Wong)
Expose various helpers that the repair code will want to use. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-05-15 xfs: add helpers to calculate btree size (Darrick J. Wong)
Add a bunch of helper functions that calculate the sizes of various btrees. These will be used to repair btrees and btree headers. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-05-15 xfs: refactor scrub transaction allocation function (Darrick J. Wong)
Since the transaction allocation helper is about to become more complex, move it to common.c and remove the redundant parameters. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-05-15 xfs: btree scrub should check minrecs (Darrick J. Wong)
Strengthen the btree block header checks to detect the number of records being less than the btree type's minimum record count. Certain blocks are allowed to violate this constraint -- specifically any btree block at the top of the tree can have fewer than minrecs records. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
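The record-count rule described above reduces to a small predicate. The sketch below is a simplification under stated assumptions: `demo_numrecs_ok()` and its parameters are invented names, and "top of the tree" is modelled as a single `is_top_block` flag rather than derived from a cursor as real scrub code would have to do.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Returns true when a block's record count is acceptable.
 * Blocks at the top of the tree may hold fewer than minrecs records;
 * every other block must stay within [minrecs, maxrecs].
 */
bool demo_numrecs_ok(unsigned numrecs, unsigned minrecs, unsigned maxrecs,
		     bool is_top_block)
{
	assert(minrecs <= maxrecs);

	if (numrecs > maxrecs)
		return false;		/* overfull is never valid */
	if (numrecs >= minrecs)
		return true;		/* within the normal range */
	return is_top_block;		/* underfull only allowed at the top */
}
```

For example, with `minrecs = 4` and `maxrecs = 8`, a block holding two records passes only if it is the top block.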
2018-05-15 xfs: clean up scrub usage of KM_NOFS (Darrick J. Wong)
All scrub code runs in transaction context, which means that memory allocations are automatically run in PF_MEMALLOC_NOFS context. It's therefore unnecessary to pass in KM_NOFS to allocation routines, so clean them all out. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-05-15 xfs: avoid ilock games in the quota scrubber (Darrick J. Wong)
Refactor the quota scrubber to take the quotaofflock and grab the quota inode in the setup function so that we can treat quota in the same "scrub in the context of this inode" (i.e. sc->ip) manner as we treat any other inode. We do have to drop the quota inode's ILOCK_EXCL to use dqiterate, but since dquots have their own individual locks the ILOCK wasn't helping us anyway. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-05-15 xfs: refactor dquot iteration (Darrick J. Wong)
Create a helper function to iterate all the dquots of a given type in the system, and refactor the dquot scrub to use it. This will get more use in the quota repair code. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2018-05-10 xfs: rename on-disk dquot counter zap functions (Darrick J. Wong)
The function 'xfs_qm_dqiterate' doesn't iterate dquots at all, it iterates all dquot blocks of a quota inode and clears the counters. Therefore, change the name to something more descriptive so that we can introduce a real dquot iterator later. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2018-05-10 xfs: replace XFS_QMOPT_DQALLOC with a simple boolean (Darrick J. Wong)
DQALLOC is only ever used with xfs_qm_dqget*, and the only flag that the _dqget family of functions cares about is DQALLOC. Therefore, change it to a boolean 'can alloc?' flag for the dqget interfaces where that makes sense. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-05-10 xfs: remove direct calls to _qm_dqread (Darrick J. Wong)
The quota initialization code needs an "uncached" variant of _dqget to read in default quota limits and timers before the dquot cache is fully set up. We've already split up _dqget into its component pieces so create a fourth variant to address this need, and make dqread internal to xfs_dquot.c again. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2018-05-10 xfs: refactor xfs_qm_dqtobp and xfs_qm_dqalloc (Darrick J. Wong)
Separate the disk dquot read and allocation functionality into two helper functions, then refactor dqread to call them directly. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-05-10 xfs: refactor incore dquot initialization functions (Darrick J. Wong)
Create two incore dquot initialization functions that will help us to disentangle dqget and dqread. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2018-05-10 xfs: fetch dquots directly during quotacheck (Darrick J. Wong)
Quotacheck only runs during mount, which means that there are no other processes in the system that could be doing chown or chproj. Therefore there's no potential for racing to attach dquots to the inode so we can drop all the ILOCK and race detection bits from quotacheck. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>