summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2014-04-28ceph: clear directory's completeness when creating fileYan, Zheng
When creating a file, ceph_set_dentry_offset() puts the new dentry at the end of directory's d_subdirs, then set the dentry's offset based on directory's max offset. The offset does not reflect the real postion of the dentry in directory. Later readdir reply from MDS may change the dentry's position/offset. This inconsistency can cause missing/duplicate entries in readdir result if readdir is partly satisfied by dcache_readdir(). The fix is clear directory's completeness after creating/renaming file. It prevents later readdir from using dcache_readdir(). Fixes: http://tracker.ceph.com/issues/8025 Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-28ceph: use fpos_cmp() to compare dentry positionsYan, Zheng
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-28ceph: check directory's completeness before emitting directory entryYan, Zheng
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Reviewed-by: Sage Weil <sage@inktank.com>
2014-04-28fuse: add renameat2 supportMiklos Szeredi
Support RENAME_EXCHANGE and RENAME_NOREPLACE flags on the userspace ABI. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-04-28fuse: clear MS_I_VERSIONMiklos Szeredi
Fuse doesn't support i_version (yet). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-04-28fuse: clear FUSE_I_CTIME_DIRTY flag on setattrMaxim Patlasov
The patch addresses two use-cases when the flag may be safely cleared: 1. fuse_do_setattr() is called with ATTR_CTIME flag set in attr->ia_valid. In this case attr->ia_ctime bears actual value. In-kernel fuse must send it to the userspace server and then assign the value to inode->i_ctime. 2. fuse_do_setattr() is called with ATTR_SIZE flag set in attr->ia_valid, whereas ATTR_CTIME is not set (truncate(2)). In this case in-kernel fuse must sent "now" to the userspace server and then assign the value to inode->i_ctime. In both cases we could clear I_DIRTY_SYNC, but that needs more thought. Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-04-28fuse: trust kernel i_ctime onlyMaxim Patlasov
Let the kernel maintain i_ctime locally: update i_ctime explicitly on truncate, fallocate, open(O_TRUNC), setxattr, removexattr, link, rename, unlink. The inode flag I_DIRTY_SYNC serves as indication that local i_ctime should be flushed to the server eventually. The patch sets the flag and updates i_ctime in course of operations listed above. Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-04-28fuse: remove .update_timeMiklos Szeredi
This implements updating ctime as well as mtime on file_update_time(). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-04-28fuse: allow ctime flushing to userspaceMaxim Patlasov
The patch extends fuse_setattr_in, and extends the flush procedure (fuse_flush_times()) called on ->write_inode() to send the ctime as well as mtime. Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-04-28fuse: fuse: add time_gran to INIT_OUTMiklos Szeredi
Allow userspace fs to specify time granularity. This is needed because with writeback_cache mode the kernel is responsible for generating mtime and ctime, but if the underlying filesystem doesn't support nanosecond granularity then the cache will contain a different value from the one stored on the filesystem resulting in a change of times after a cache flush. Make the default granularity 1s. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-04-28fuse: add .write_inodeMiklos Szeredi
...and flush mtime from this. This allows us to use the kernel infrastructure for writing out dirty metadata (mtime at this point, but ctime in the next patches and also maybe atime). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-04-28fuse: clean up fsyncMiklos Szeredi
Don't need to start I/O twice (once without i_mutex and one within). Also make sure that even if the userspace filesystem doesn't support FSYNC we do all the steps other than sending the message. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-04-28fuse: fuse: fallocate: use file_update_time()Miklos Szeredi
in preparation for getting rid of FUSE_I_MTIME_DIRTY. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-04-28fuse: update mtime on open(O_TRUNC) in atomic_o_trunc modeMaxim Patlasov
In case of fc->atomic_o_trunc is set, fuse does nothing in fuse_do_setattr() while handling open(O_TRUNC). Hence, i_mtime must be updated explicitly in fuse_finish_open(). The patch also adds extra locking encompassing open(O_TRUNC) operation to avoid races between the truncation and updating i_mtime. Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-04-28fuse: update mtime on truncate(2)Maxim Patlasov
Handling truncate(2), VFS doesn't set ATTR_MTIME bit in iattr structure; only ATTR_SIZE bit is set. In-kernel fuse must handle the case by setting mtime fields of struct fuse_setattr_in to "now" and set FATTR_MTIME bit even though ATTR_MTIME was not set. Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-04-28fuse: do not use uninitialized i_modeMaxim Patlasov
When inode is in I_NEW state, inode->i_mode is not initialized yet. Do not use it before fuse_init_inode() is called. Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-04-28fuse: fix mtime update error in fsyncMiklos Szeredi
Bad case of shadowing. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-04-28fuse: check fallocate modeMiklos Szeredi
Don't allow new fallocate modes until we figure out what (if anything) that takes. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-04-28fuse: add __exit to fuse_ctl_cleanupFabian Frederick
fuse_ctl_cleanup is only called by __exit fuse_exit Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-04-28GFS2: lops.c: replace 0 by NULL for pointersFabian Frederick
Sparse warning: fs/gfs2/lops.c:78:29: "warning: Using plain integer as NULL pointer" Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2014-04-27Merge 3.15-rc3 into staging-nextGreg Kroah-Hartman
2014-04-27Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: limit the path size in send to PATH_MAX Btrfs: correctly set profile flags on seqlock retry Btrfs: use correct key when repeating search for extent item Btrfs: fix inode caching vs tree log Btrfs: fix possible memory leaks in open_ctree() Btrfs: avoid triggering bug_on() when we fail to start inode caching task Btrfs: move btrfs_{set,clear}_and_info() to ctree.h btrfs: replace error code from btrfs_drop_extents btrfs: Change the hole range to a more accurate value. btrfs: fix use-after-free in mount_subvol()
2014-04-27Merge tag 'driver-core-3.15-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg KH: "Here are some kernfs fixes for 3.15-rc3 that resolve some reported problems. Nothing huge, but all needed" * tag 'driver-core-3.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: s390/ccwgroup: Fix memory corruption kernfs: add back missing error check in kernfs_fop_mmap() kernfs: fix a subdir count leak
2014-04-26Btrfs: limit the path size in send to PATH_MAXChris Mason
fs_path_ensure_buf is used to make sure our path buffers for send are big enough for the path names as we construct them. The buffer size is limited to 32K by the length field in the struct. But bugs in the path construction can end up trying to build a huge buffer, and we'll do invalid memmmoves when the buffer length field wraps. This patch is step one, preventing the overflows. Signed-off-by: Chris Mason <clm@fb.com>
2014-04-25Merge tag 'locks-v3.15-2' of git://git.samba.org/jlayton/linuxLinus Torvalds
Pull file locking fixes from Jeff Layton: "File locking related bugfixes for v3.15 (pile #2) - fix for a long-standing bug in __break_lease that can cause soft lockups - renaming of file-private locks to "open file description" locks, and the command macros to more visually distinct names The fix for __break_lease is also in the pile of patches for which Bruce sent a pull request, but I assume that your merge procedure will handle that correctly. For the other patches, I don't like the fact that we need to rename this stuff at this late stage, but it should be settled now (hopefully)" * tag 'locks-v3.15-2' of git://git.samba.org/jlayton/linux: locks: rename FL_FILE_PVT and IS_FILE_PVT to use "*_OFDLCK" instead locks: rename file-private locks to "open file description locks" locks: allow __break_lease to sleep even when break_time is 0
2014-04-25Merge branch 'for-3.15' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
Pull nfsd bugfixes from Bruce Fields: "Three small nfsd bugfixes (including one locks.c fix for a bug triggered only from nfsd). Jeff's patches are for long-existing problems that became easier to trigger since the addition of vfs delegation support" * 'for-3.15' of git://linux-nfs.org/~bfields/linux: Revert "nfsd4: fix nfs4err_resource in 4.1 case" nfsd: set timeparms.to_maxval in setup_callback_client locks: allow __break_lease to sleep even when break_time is 0
2014-04-25kernfs: add back missing error check in kernfs_fop_mmap()Tejun Heo
While updating how mmap enabled kernfs files are handled by lockdep, 9b2db6e18945 ("sysfs: bail early from kernfs_file_mmap() to avoid spurious lockdep warning") inadvertently dropped error return check from kernfs_file_mmap(). The intention was just dropping "if (ops->mmap)" check as the control won't reach the point if the mmap callback isn't implemented, but I mistakenly removed the error return check together with it. This led to Xorg crash on i810 which was reported and bisected to the commit and then to the specific change by Tobias. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-and-bisected-by: Tobias Powalowski <tobias.powalowski@googlemail.com> Tested-by: Tobias Powalowski <tobias.powalowski@googlemail.com> References: http://lkml.kernel.org/g/533D01BD.1010200@googlemail.com Cc: stable <stable@vger.kernel.org> # 3.14 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-25kernfs: fix a subdir count leakJianyu Zhan
Currently kernfs_link_sibling() increates parent->dir.subdirs before adding the node into parent's chidren rb tree. Because it is possible that kernfs_link_sibling() couldn't find a suitable slot and bail out, this leads to a mismatch between elevated subdir count with actual children node numbers. This patches fix this problem, by moving the subdir accouting after the actual addtion happening. Signed-off-by: Jianyu Zhan <nasa4836@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-25kernfs: make kernfs_notify() trigger inotify events tooTejun Heo
kernfs_notify() is used to indicate either new data is available or the content of a file has changed. It currently only triggers poll which may not be the most convenient to monitor especially when there are a lot to monitor. Let's hook it up to fsnotify too so that the events can be monitored via inotify too. fsnotify_modify() requires file * but kernfs_notify() doesn't have any specific file associated; however, we can walk all super_blocks associated with a kernfs_root and as kernfs always associate one ino with inode and one dentry with an inode, it's trivial to look up the dentry associated with a given kernfs_node. As any active monitor would pin dentry, just looking up existing dentry is enough. This patch looks up the dentry associated with the specified kernfs_node and generates events equivalent to fsnotify_modify(). Note that as fsnotify doesn't provide fsnotify_modify() equivalent which can be called with dentry, kernfs_notify() directly calls fsnotify_parent() and fsnotify(). It might be better to add a wrapper in fsnotify.h instead. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: John McCutchan <john@johnmccutchan.com> Cc: Robert Love <rlove@rlove.org> Cc: Eric Paris <eparis@parisplace.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-25kernfs: implement kernfs_root->supers listTejun Heo
Currently, there's no way to find out which super_blocks are associated with a given kernfs_root. Let's implement it - the planned inotify extension to kernfs_notify() needs it. Make kernfs_super_info point back to the super_block and chain it at kernfs_root->supers. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-24cifs: fix actimeo=0 corner case when cifs_i->time == jiffiesJeff Layton
actimeo=0 is supposed to be a special case that ensures that inode attributes are always refetched from the server instead of trusting the cache. The cifs code however uses time_in_range() to determine whether the attributes have timed out. In the case where cifs_i->time equals jiffies, this leads to the cifs code not refetching the inode attributes when it should. Fix this by explicitly testing for actimeo=0, and handling it as a special case. Reported-and-tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
2014-04-24Btrfs: correctly set profile flags on seqlock retryFilipe Manana
If we had to retry on the profiles seqlock (due to a concurrent write), we would set bits on the input flags that corresponded both to the current profile and to previous values of the profile. Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-04-24Btrfs: use correct key when repeating search for extent itemFilipe Manana
If skinny metadata is enabled and our first tree search fails to find a skinny extent item, we may repeat a tree search for a "fat" extent item (if the previous item in the leaf is not the "fat" extent we're looking for). However we were not setting the new key's objectid to the right value, as we previously used the same key variable to peek at the previous item in the leaf, which has a different objectid. So just set the right objectid to avoid modifying/deleting a wrong item if we repeat the tree search. Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-04-24Btrfs: fix inode caching vs tree logMiao Xie
Currently, with inode cache enabled, we will reuse its inode id immediately after unlinking file, we may hit something like following: |->iput inode |->return inode id into inode cache |->create dir,fsync |->power off An easy way to reproduce this problem is: mkfs.btrfs -f /dev/sdb mount /dev/sdb /mnt -o inode_cache,commit=100 dd if=/dev/zero of=/mnt/data bs=1M count=10 oflag=sync inode_id=`ls -i /mnt/data | awk '{print $1}'` rm -f /mnt/data i=1 while [ 1 ] do mkdir /mnt/dir_$i test1=`stat /mnt/dir_$i | grep Inode: | awk '{print $4}'` if [ $test1 -eq $inode_id ] then dd if=/dev/zero of=/mnt/dir_$i/data bs=1M count=1 oflag=sync echo b > /proc/sysrq-trigger fi sleep 1 i=$(($i+1)) done mount /dev/sdb /mnt umount /dev/sdb btrfs check /dev/sdb We fix this problem by adding unlinked inode's id into pinned tree, and we can not reuse them until committing transaction. Cc: stable@vger.kernel.org Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-04-24Btrfs: fix possible memory leaks in open_ctree()Wang Shilong
Fix possible memory leaks in the following error handling paths: read_tree_block() btrfs_recover_log_trees btrfs_commit_super() btrfs_find_orphan_roots() btrfs_cleanup_fs_roots() Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-04-24Btrfs: avoid triggering bug_on() when we fail to start inode caching taskWang Shilong
When running stress test(including snapshots,balance,fstress), we trigger the following BUG_ON() which is because we fail to start inode caching task. [ 181.131945] kernel BUG at fs/btrfs/inode-map.c:179! [ 181.137963] invalid opcode: 0000 [#1] SMP [ 181.217096] CPU: 11 PID: 2532 Comm: btrfs Not tainted 3.14.0 #1 [ 181.240521] task: ffff88013b621b30 ti: ffff8800b6ada000 task.ti: ffff8800b6ada000 [ 181.367506] Call Trace: [ 181.371107] [<ffffffffa036c1be>] btrfs_return_ino+0x9e/0x110 [btrfs] [ 181.379191] [<ffffffffa038082b>] btrfs_evict_inode+0x46b/0x4c0 [btrfs] [ 181.387464] [<ffffffff810b5a70>] ? autoremove_wake_function+0x40/0x40 [ 181.395642] [<ffffffff811dc5fe>] evict+0x9e/0x190 [ 181.401882] [<ffffffff811dcde3>] iput+0xf3/0x180 [ 181.408025] [<ffffffffa03812de>] btrfs_orphan_cleanup+0x1ee/0x430 [btrfs] [ 181.416614] [<ffffffffa03a6abd>] btrfs_mksubvol.isra.29+0x3bd/0x450 [btrfs] [ 181.425399] [<ffffffffa03a6cd6>] btrfs_ioctl_snap_create_transid+0x186/0x190 [btrfs] [ 181.435059] [<ffffffffa03a6e3b>] btrfs_ioctl_snap_create_v2+0xeb/0x130 [btrfs] [ 181.444148] [<ffffffffa03a9656>] btrfs_ioctl+0xf76/0x2b90 [btrfs] [ 181.451971] [<ffffffff8117e565>] ? handle_mm_fault+0x475/0xe80 [ 181.459509] [<ffffffff8167ba0c>] ? __do_page_fault+0x1ec/0x520 [ 181.467046] [<ffffffff81185b35>] ? do_mmap_pgoff+0x2f5/0x3c0 [ 181.474393] [<ffffffff811d4da8>] do_vfs_ioctl+0x2d8/0x4b0 [ 181.481450] [<ffffffff811d5001>] SyS_ioctl+0x81/0xa0 [ 181.488021] [<ffffffff81680b69>] system_call_fastpath+0x16/0x1b We should avoid triggering BUG_ON() here, instead, we output warning messages and clear inode_cache option. Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-04-24Btrfs: move btrfs_{set,clear}_and_info() to ctree.hWang Shilong
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-04-24btrfs: replace error code from btrfs_drop_extentsDavid Sterba
There's a case which clone does not handle and used to BUG_ON instead, (testcase xfstests/btrfs/035), now returns EINVAL. This error code is confusing to the ioctl caller, as it normally signifies errorneous arguments. Change it to ENOPNOTSUPP which allows a fall back to copy instead of clone. This does not affect the common reflink operation. Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com>
2014-04-24btrfs: Change the hole range to a more accurate value.Qu Wenruo
Commit 3ac0d7b96a268a98bd474cab8bce3a9f125aaccf fixed the btrfs expanding write problem but the hole punched is sometimes too large for some iovec, which has unmapped data ranges. This patch will change to hole range to a more accurate value using the counts checked by the write check routines. Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-04-24xfs: enable the finobt feature on v5 superblocksBrian Foster
Add the finobt feature bit to the list of known features. As of this point, the kernel code knows how to mount and manage both finobt and non-finobt formatted filesystems. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-24xfs: report finobt status in fs geometryBrian Foster
Define the XFS_FSOP_GEOM_FLAGS_FINOBT fs geometry flag and set the associated bit if the filesystem supports the free inode btree. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-24xfs: add finobt support to growfsBrian Foster
Add finobt support to growfs. Initialize the agi root/level fields and the root finobt block. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-24xfs: update the finobt on inode freeBrian Foster
An inode free operation can have several effects on the finobt. If all inodes have been freed and the chunk deallocated, we remove the finobt record. If the inode chunk was previously full, we must insert a new record based on the existing inobt record. Otherwise, we modify the record in place. Create the xfs_difree_finobt() function to identify the potential scenarios and update the finobt appropriately. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-24xfs: refactor xfs_difree() inobt bits into xfs_difree_inobt() helperBrian Foster
Refactor xfs_difree() in preparation for the finobt. xfs_difree() performs the validity checks against the ag and reads the agi header. The work of physically updating the inode allocation btree is pushed down into the new xfs_difree_inobt() helper. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-24xfs: use and update the finobt on inode allocationBrian Foster
Replace xfs_dialloc_ag() with an implementation that looks for a record in the finobt. The finobt only tracks records with at least one free inode. This eliminates the need for the intra-ag scan in the original algorithm. Once the inode is allocated, update the finobt appropriately (possibly removing the record) as well as the inobt. Move the original xfs_dialloc_ag() algorithm to xfs_dialloc_ag_inobt() and fall back as such if finobt support is not enabled. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-24xfs: insert newly allocated inode chunks into the finobtBrian Foster
A newly allocated inode chunk, by definition, has at least one free inode, so a record is always inserted into the finobt. Create the xfs_inobt_insert() helper from existing code to insert a record in an inobt based on the provided BTNUM. Update xfs_ialloc_ag_alloc() to invoke the helper for the existing XFS_BTNUM_INO tree and XFS_BTNUM_FINO tree, if enabled. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-24xfs: update inode allocation/free transaction reservations for finobtBrian Foster
Create the xfs_calc_finobt_res() helper to calculate the finobt log reservation for inode allocation and free. Update XFS_IALLOC_SPACE_RES() to reserve blocks for the additional finobt insertion on inode allocation. Create XFS_IFREE_SPACE_RES() to reserve blocks for the potential finobt record insertion on inode free (i.e., if an inode chunk was previously fully allocated). Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-24xfs: support the XFS_BTNUM_FINOBT free inode btree typeBrian Foster
Define the AGI fields for the finobt root/level and add magic numbers. Update the btree code to add support for the new XFS_BTNUM_FINOBT inode btree. The finobt root block is reserved immediately following the inobt root block in the AG. Update XFS_PREALLOC_BLOCKS() to determine the starting AG data block based on whether finobt support is enabled. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-24xfs: reserve v5 superblock read-only compat. feature bit for finobtBrian Foster
Reserve a v5 read-only compatibility feature bit for the finobt and create the xfs_sb_version_hasfinobt() helper to determine whether an fs has the feature enabled. The finobt does not change existing on-disk structures, but must remain consistent with the ialloc btree. Modifications from older kernels would violate that constrant. Therefore, we restrict older kernels to read-only mounts of finobt-enabled filesystems. Note that this does not yet enable the ability to rw mount a finobt fs (by setting the feature bit in the XFS_SB_FEAT_RO_COMPAT_ALL mask). Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-24xfs: refactor xfs_ialloc_btree.c to support multiple inobt numbersBrian Foster
The introduction of the free inode btree (finobt) requires that xfs_ialloc_btree.c handle multiple trees. Refactor xfs_ialloc_btree.c so the caller specifies the btree type on cursor initialization to prepare for addition of the finobt. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>