summaryrefslogtreecommitdiff
path: root/fs/btrfs
AgeCommit message (Collapse)Author
2012-07-23Btrfs: use mnt_want_write_file instead of mnt_want_writeLiu Bo
mnt_want_write_file is faster when file has been opened for write. Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
2012-07-23Btrfs: remove redundant r/o check for superblockLiu Bo
mnt_want_write() and mnt_want_write_file() will check sb->s_flags with MS_RDONLY, and we don't need to do it ourselves. Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
2012-07-23Btrfs: check write access to mount earlier while creating snapshotsLiu Bo
Move check of write access to mount into upper functions so that we can use mnt_want_write_file instead, which is faster than mnt_want_write. Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
2012-07-23Btrfs: fix typo in cow_file_range_async and async_cow_submitLiu Bo
It should be 10 * 1024 * 1024. Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-07-23Btrfs: change how we indicate we're adding csumsJosef Bacik
There is weird logic I had to put in place to make sure that when we were adding csums that we'd used the delalloc block rsv instead of the global block rsv. Part of this meant that we had to free up our transaction reservation before we ran the delayed refs since csum deletion happens during the delayed ref work. The problem with this is that when we release a reservation we will add it to the global reserve if it is not full in order to keep us going along longer before we have to force a transaction commit. By releasing our reservation before we run delayed refs we don't get the opportunity to drain down the global reserve for the work we did, so we won't refill it as often. This isn't a problem per-se, it just results in us possibly committing transactions more and more often, and in rare cases could cause those WARN_ON()'s to pop in use_block_rsv because we ran out of space in our block rsv. This also helps us by holding onto space while the delayed refs run so we don't end up with as many people trying to do things at the same time, which again will help us not force commits or hit the use_block_rsv warnings. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2012-07-23Btrfs: return error of btrfs_update_inode() to callerTsutomu Itoh
We didn't check error of btrfs_update_inode(), but that error looks easy to bubble back up. Reviewed-by: David Sterba <dsterba@suse.cz> Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2012-07-23Btrfs: fix error handling in __add_reloc_root()Dan Carpenter
We dereferenced "node" in the error message after freeing it. Also btrfs_panic() can return so we should return an error code instead of continuing. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
2012-07-23Btrfs: do not ignore errors from btrfs_cleanup_fs_roots() when mountingIlya Dryomov
There used to be a BUG_ON(ret) there before EH patch (79787eaa) went in. Bail out with EINVAL. Cc: David Sterba <dsterba@suse.cz> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2012-07-23Btrfs: do not return EINVAL instead of ENOMEM from open_ctree()Ilya Dryomov
When bailing from open_ctree() err is returned, not ret. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2012-07-23Btrfs: add DEVICE_READY ioctlJosef Bacik
This will be used in conjunction with btrfs device ready <dev>. This is needed for initrd's to have a nice and lightweight way to tell if all of the devices needed for a file system are in the cache currently. This keeps them from having to do mount+sleep loops waiting for devices to show up. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2012-07-23Btrfs: flush delayed inodes if we're short on spaceJosef Bacik
Those crazy gentoo guys have been complaining about ENOSPC errors on their portage volumes. This is because doing things like untar tends to create lots of new files which will soak up all the reservation space in the delayed inodes. Usually this gets papered over by the fact that we will try and commit the transaction, however if this happens in the wrong spot or we choose not to commit the transaction you will be screwed. So add the ability to expclitly flush delayed inodes to free up space. Please test this out guys to make sure it works since as usual I cannot reproduce. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2012-07-23btrfs: join DEV_STATS ioctls to oneDavid Sterba
Commit c11d2c236cc260b36 (Btrfs: add ioctl to get and reset the device stats) introduced two ioctls doing almost the same thing distinguished by just the ioctl number which encodes "do reset after read". I have suggested http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg16604.html to implement it via the ioctl args. This hasn't happen, and I think we should use a more clean way to pass flags and should not waste ioctl numbers. CC: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: David Sterba <dsterba@suse.cz>
2012-07-23btrfs: ignore unfragmented file checks in defrag when compression enabled - ↵Andrew Mahone
rebased Rebased on btrfs-next and retested. Inform should_defrag_range if BTRFS_DEFRAG_RANGE_COMPRESS is set. If so, skip checks for adjacent extents and extent size when deciding whether to defrag, as these can prevent an uncompressed and unfragmented file from being compressed as requested. Signed-off-by: Andrew Mahone <andrew.mahone@gmail.com>
2012-07-23Btrfs: small naming cleanup in join_transaction()Dan Carpenter
"root->fs_info" and "fs_info" are the same, but "fs_info" is prefered because it is shorter and that's what is used in the rest of the function. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
2012-07-23Btrfs: don't update atime on RO subvolumesAlexander Block
Before the update_time inode operation was indroduced, it was not possible to prevent updates of atime on RO subvolumes. VFS was only able to check for RO on the mount, but did not know anything about btrfs subvolumes. btrfs_update_time does now check if the root is RO and skip updating of times. Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-07-23Btrfs: allow mount -o remount,compress=noArnd Hannemann
Btrfs allows to turn on compression on a mounted and used filesystem by issuing mount -o remount,compress=lzo. This patch allows to turn compression off again while the filesystem is mounted. As suggested by David Sterba if the compress-force option was set, it is implicitly cleared if compression is turned off. Tested-by: David Sterba <dsterba@suse.cz> Signed-off-by: Arnd Hannemann <arnd@arndnet.de>
2012-07-23Btrfs: remove ->dirty_inodeJosef Bacik
We do all of our inode updating when we change it, and now that we do ->update_time we don't need ->dirty_inode for atime updates anymore, so just remove it. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com>
2012-07-23Btrfs: reduce calls to wake_up on uncontended locksChris Mason
The btrfs locks were unconditionally calling wake_up as the locks were released. This lead to extra thrashing on the waitqueue, especially for locks that were dominated by readers. Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-07-23Btrfs: don't wait around for new log writers on an SSDChris Mason
Waiting on spindles improves performance, but ssds want all the IO as quickly as we can push it down. Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-07-23btrfs: switch btrfs_ioctl_balance() to mnt_want_write_file()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14VFS: Pass mount flags to sget()David Howells
Pass mount flags to sget() so that it can use them in initialising a new superblock before the set function is called. They could also be passed to the compare function. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14don't pass nameidata to ->create()Al Viro
boolean "does it have to be exclusive?" flag is passed instead; Local filesystem should just ignore it - the object is guaranteed not to be there yet. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14stop passing nameidata to ->lookup()Al Viro
Just the flags; only NFS cares even about that, but there are legitimate uses for such argument. And getting rid of that completely would require splitting ->lookup() into a couple of methods (at least), so let's leave that alone for now... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14vfs: switch i_dentry/d_alias to hlistAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-12Btrfs: fix typo in convert_extent_bitLiu Bo
It should be convert_extent_bit. Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-07-12Btrfs: add qgroup inheritanceArne Jansen
When creating a subvolume or snapshot, it is necessary to initialize the qgroup account with a copy of some other (tracking) qgroup. This patch adds parameters to the ioctls to pass the information from which qgroup to inherit. Signed-off-by: Arne Jansen <sensille@gmx.net>
2012-07-12Btrfs: add qgroup ioctlsArne Jansen
Ioctls to control the qgroup feature like adding and removing qgroups and assigning qgroups. Signed-off-by: Arne Jansen <sensille@gmx.net>
2012-07-12Btrfs: hooks to reserve qgroup spaceArne Jansen
Like block reserves, reserve a small piece of space on each transaction start and for delalloc. These are the hooks that can actually return EDQUOT to the user. The amount of space reserved is tracked in the transaction handle. Signed-off-by: Arne Jansen <sensille@gmx.net>
2012-07-12Btrfs: hooks for qgroup to record delayed refsJan Schmidt
Hooks into qgroup code to record refs and into transaction commit. This is the main entry point for qgroup. Basically every change in extent backrefs got accounted to the appropriate qgroups. Signed-off-by: Arne Jansen <sensille@gmx.net> Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-07-12Btrfs: quota tree support and startupArne Jansen
Init the quota tree along with the others on open_ctree and close_ctree. Add the quota tree to the list of well known trees in btrfs_read_fs_root_no_name. Signed-off-by: Arne Jansen <sensille@gmx.net>
2012-07-12Btrfs: call the qgroup accounting functionsJan Schmidt
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-07-12Btrfs: qgroup implementation and prototypesArne Jansen
Signed-off-by: Arne Jansen <sensille@gmx.net> Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-07-10Btrfs: Test code to change the order of delayed-ref processingArne Jansen
Normally delayed refs get processed in ascending bytenr order. This correlates in most cases to the order added. To expose dependencies on this order, we start to process the tree in the middle instead of the beginning. This code is only effective when SCRAMBLE_DELAYED_REFS is defined. Signed-off-by: Arne Jansen <sensille@gmx.net>
2012-07-10Btrfs: qgroup state and initializationArne Jansen
Add state to fs_info. Signed-off-by: Arne Jansen <sensille@gmx.net>
2012-07-10Btrfs: added helper to create new treesArne Jansen
This creates a brand new tree. Will be used to create the quota tree. Signed-off-by: Arne Jansen <sensille@gmx.net>
2012-07-10Btrfs: check the root passed to btrfs_end_transactionArne Jansen
This patch only add a consistancy check to validate that the same root is passed to start_transaction and end_transaction. Subvolume quota depends on this. Signed-off-by: Arne Jansen <sensille@gmx.net>
2012-07-10Btrfs: add helper for tree enumerationArne Jansen
Often no exact match is wanted but just the next lower or higher item. There's a lot of duplicated code throughout btrfs to deal with the corner cases. This patch adds a helper function that can facilitate searching. Signed-off-by: Arne Jansen <sensille@gmx.net>
2012-07-10Btrfs: qgroup on-disk formatArne Jansen
Not all features are in use by the current version and thus may change in the future. Signed-off-by: Arne Jansen <sensille@gmx.net>
2012-07-10Btrfs: join tree mod log code with the code holding back delayed refsJan Schmidt
We've got two mechanisms both required for reliable backref resolving (tree mod log and holding back delayed refs). You cannot make use of one without the other. So instead of requiring the user of this mechanism to setup both correctly, we join them into a single interface. Additionally, we stop inserting non-blockers into fs_info->tree_mod_seq_list as we did before, which was of no value. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-07-10Btrfs: fix buffer leak in btrfs_next_old_leafJan Schmidt
When calling btrfs_next_old_leaf, we were leaking an extent buffer in the rare case of using the deadlock avoidance code needed for the tree mod log. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-07-05Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs updates from Chris Mason: "I held off on my rc5 pull because I hit an oops during log recovery after a crash. I wanted to make sure it wasn't a regression because we have some logging fixes in here. It turns out that a commit during the merge window just made it much more likely to trigger directory logging instead of full commits, which exposed an old bug. The new backref walking code got some additional fixes. This should be the final set of them. Josef fixed up a corner where our O_DIRECT writes and buffered reads could expose old file contents (not stale, just not the most recent). He and Liu Bo fixed crashes during tree log recover as well. Ilya fixed errors while we resume disk balancing operations on readonly mounts." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: run delayed directory updates during log replay Btrfs: hold a ref on the inode during writepages Btrfs: fix tree log remove space corner case Btrfs: fix wrong check during log recovery Btrfs: use _IOR for BTRFS_IOC_SUBVOL_GETFLAGS Btrfs: resume balance on rw (re)mounts properly Btrfs: restore restriper state on all mounts Btrfs: fix dio write vs buffered read race Btrfs: don't count I/O statistic read errors for missing devices Btrfs: resolve tree mod log locking issue in btrfs_next_leaf Btrfs: fix tree mod log rewind of ADD operations Btrfs: leave critical region in btrfs_find_all_roots as soon as possible Btrfs: always put insert_ptr modifications into the tree mod log Btrfs: fix tree mod log for root replacements at leaf level Btrfs: support root level changes in __resolve_indirect_ref Btrfs: avoid waiting for delayed refs when we must not
2012-07-02Btrfs: run delayed directory updates during log replayChris Mason
While we are resolving directory modifications in the tree log, we are triggering delayed metadata updates to the filesystem btrees. This commit forces the delayed updates to run so the replay code can find any modifications done. It stops us from crashing because the directory deleltion replay expects items to be removed immediately from the tree. Signed-off-by: Chris Mason <chris.mason@fusionio.com> cc: stable@kernel.org
2012-07-02Btrfs: hold a ref on the inode during writepagesJosef Bacik
We can race with unlink and not actually be able to do our igrab in btrfs_add_ordered_extent. This will result in all sorts of problems. Instead of doing the complicated work to try and handle returning an error properly from btrfs_add_ordered_extent, just hold a ref to the inode during writepages. If we cannot grab a ref we know we're freeing this inode anyway and can just drop the dirty pages on the floor, because screw them we're going to invalidate them anyway. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2012-07-02Btrfs: fix tree log remove space corner caseJosef Bacik
The tree log stuff can have allocated space that we end up having split across a bitmap and a real extent. The free space code does not deal with this, it assumes that if it finds an extent or bitmap entry that the entire range must fall within the entry it finds. This isn't necessarily the case, so rework the remove function so it can handle this case properly. This fixed two panics the user hit, first in the case where the space was initially in a bitmap and then in an extent entry, and then the reverse case. Thanks, Reported-and-tested-by: Shaun Reich <sreich@kde.org> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2012-07-02Btrfs: fix wrong check during log recoveryLiu Bo
When we're evicting an inode during log recovery, we need to ensure that the inode is not in orphan state any more, which means inode's run_time flags has _no_ BTRFS_INODE_HAS_ORPHAN_ITEM. Thus, the BUG_ON was triggered because of a wrong check for the flags. Reviewed-by: David Sterba <dsterba@suse.cz> Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2012-07-02Btrfs: use _IOR for BTRFS_IOC_SUBVOL_GETFLAGSAlexander Block
We used the wrong ioctl macro for the getflags ioctl before. As we don't have the set/getflags ioctls in the user space ioctl.h at the moment, it's safe to fix it now. Reviewed-by: David Sterba <dsterba@suse.cz> Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-07-02Btrfs: resume balance on rw (re)mounts properlyIlya Dryomov
This introduces btrfs_resume_balance_async(), which, given that restriper state was recovered earlier by btrfs_recover_balance(), resumes balance in btrfs-balance kthread. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2012-07-02Btrfs: restore restriper state on all mountsIlya Dryomov
Fix a bug that triggered asserts in btrfs_balance() in both normal and resume modes -- restriper state was not properly restored on read-only mounts. This factors out resuming code from btrfs_restore_balance(), which is now also called earlier in the mount sequence to avoid the problem of some early writes getting the old profile. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2012-07-02Btrfs: fix dio write vs buffered read raceJosef Bacik
Miao pointed out there's a problem with mixing dio writes and buffered reads. If the read happens between us invalidating the page range and actually locking the extent we can bring in pages into page cache. Then once the write finishes if somebody tries to read again it will just find uptodate pages and we'll read stale data. So we need to lock the extent and check for uptodate bits in the range. If there are uptodate bits we need to unlock and invalidate again. This will keep this race from happening since we will hold the extent locked until we create the ordered extent, and then teh read side always waits for ordered extents. There was also a race in how we updated i_size, previously we were relying on the generic DIO stuff to adjust the i_size after the DIO had completed, but this happens outside of the extent lock which means reads could come in and not see the updated i_size. So instead move this work into where we create the extents, and then this way the update ordered i_size stuff works properly in the endio handlers. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com>
2012-07-02Btrfs: don't count I/O statistic read errors for missing devicesStefan Behrens
It is normal behaviour of the low level btrfs function btrfs_map_bio() to complete a bio with -EIO if the device is missing, instead of just preventing the bio creation in an earlier step. This used to cause I/O statistic read error increments and annoying printk_ratelimited messages. This commit fixes the issue. Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Reported-by: Carey Underwood <cwillu@cwillu.com>