Age | Commit message (Collapse) | Author |
|
Support RENAME_EXCHANGE and RENAME_NOREPLACE flags on the userspace ABI.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
|
|
Fuse doesn't support i_version (yet).
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
|
|
The patch addresses two use-cases when the flag may be safely cleared:
1. fuse_do_setattr() is called with ATTR_CTIME flag set in attr->ia_valid.
In this case attr->ia_ctime bears actual value. In-kernel fuse must send it
to the userspace server and then assign the value to inode->i_ctime.
2. fuse_do_setattr() is called with ATTR_SIZE flag set in attr->ia_valid,
whereas ATTR_CTIME is not set (truncate(2)).
In this case in-kernel fuse must sent "now" to the userspace server and then
assign the value to inode->i_ctime.
In both cases we could clear I_DIRTY_SYNC, but that needs more thought.
Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
|
|
Let the kernel maintain i_ctime locally: update i_ctime explicitly on
truncate, fallocate, open(O_TRUNC), setxattr, removexattr, link, rename,
unlink.
The inode flag I_DIRTY_SYNC serves as indication that local i_ctime should
be flushed to the server eventually. The patch sets the flag and updates
i_ctime in course of operations listed above.
Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
|
|
This implements updating ctime as well as mtime on file_update_time().
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
|
|
The patch extends fuse_setattr_in, and extends the flush procedure
(fuse_flush_times()) called on ->write_inode() to send the ctime as well as
mtime.
Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
|
|
Allow userspace fs to specify time granularity.
This is needed because with writeback_cache mode the kernel is responsible
for generating mtime and ctime, but if the underlying filesystem doesn't
support nanosecond granularity then the cache will contain a different
value from the one stored on the filesystem resulting in a change of times
after a cache flush.
Make the default granularity 1s.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
|
|
...and flush mtime from this. This allows us to use the kernel
infrastructure for writing out dirty metadata (mtime at this point, but
ctime in the next patches and also maybe atime).
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
|
|
Don't need to start I/O twice (once without i_mutex and one within).
Also make sure that even if the userspace filesystem doesn't support FSYNC
we do all the steps other than sending the message.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
|
|
in preparation for getting rid of FUSE_I_MTIME_DIRTY.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
|
|
In case of fc->atomic_o_trunc is set, fuse does nothing in
fuse_do_setattr() while handling open(O_TRUNC). Hence, i_mtime must be
updated explicitly in fuse_finish_open(). The patch also adds extra locking
encompassing open(O_TRUNC) operation to avoid races between the truncation
and updating i_mtime.
Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
|
|
Handling truncate(2), VFS doesn't set ATTR_MTIME bit in iattr structure;
only ATTR_SIZE bit is set. In-kernel fuse must handle the case by setting
mtime fields of struct fuse_setattr_in to "now" and set FATTR_MTIME bit
even though ATTR_MTIME was not set.
Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
|
|
When inode is in I_NEW state, inode->i_mode is not initialized yet. Do not
use it before fuse_init_inode() is called.
Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
|
|
Bad case of shadowing.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
|
|
Don't allow new fallocate modes until we figure out what (if anything) that
takes.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
|
|
fuse_ctl_cleanup is only called by __exit fuse_exit
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
|
|
Sparse warning: fs/gfs2/lops.c:78:29:
"warning: Using plain integer as NULL pointer"
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: limit the path size in send to PATH_MAX
Btrfs: correctly set profile flags on seqlock retry
Btrfs: use correct key when repeating search for extent item
Btrfs: fix inode caching vs tree log
Btrfs: fix possible memory leaks in open_ctree()
Btrfs: avoid triggering bug_on() when we fail to start inode caching task
Btrfs: move btrfs_{set,clear}_and_info() to ctree.h
btrfs: replace error code from btrfs_drop_extents
btrfs: Change the hole range to a more accurate value.
btrfs: fix use-after-free in mount_subvol()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here are some kernfs fixes for 3.15-rc3 that resolve some reported
problems. Nothing huge, but all needed"
* tag 'driver-core-3.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
s390/ccwgroup: Fix memory corruption
kernfs: add back missing error check in kernfs_fop_mmap()
kernfs: fix a subdir count leak
|
|
fs_path_ensure_buf is used to make sure our path buffers for
send are big enough for the path names as we construct them.
The buffer size is limited to 32K by the length field in
the struct.
But bugs in the path construction can end up trying to build
a huge buffer, and we'll do invalid memmmoves when the
buffer length field wraps.
This patch is step one, preventing the overflows.
Signed-off-by: Chris Mason <clm@fb.com>
|
|
Pull file locking fixes from Jeff Layton:
"File locking related bugfixes for v3.15 (pile #2)
- fix for a long-standing bug in __break_lease that can cause soft
lockups
- renaming of file-private locks to "open file description" locks,
and the command macros to more visually distinct names
The fix for __break_lease is also in the pile of patches for which
Bruce sent a pull request, but I assume that your merge procedure will
handle that correctly.
For the other patches, I don't like the fact that we need to rename
this stuff at this late stage, but it should be settled now
(hopefully)"
* tag 'locks-v3.15-2' of git://git.samba.org/jlayton/linux:
locks: rename FL_FILE_PVT and IS_FILE_PVT to use "*_OFDLCK" instead
locks: rename file-private locks to "open file description locks"
locks: allow __break_lease to sleep even when break_time is 0
|
|
Pull nfsd bugfixes from Bruce Fields:
"Three small nfsd bugfixes (including one locks.c fix for a bug
triggered only from nfsd).
Jeff's patches are for long-existing problems that became easier to
trigger since the addition of vfs delegation support"
* 'for-3.15' of git://linux-nfs.org/~bfields/linux:
Revert "nfsd4: fix nfs4err_resource in 4.1 case"
nfsd: set timeparms.to_maxval in setup_callback_client
locks: allow __break_lease to sleep even when break_time is 0
|
|
While updating how mmap enabled kernfs files are handled by lockdep,
9b2db6e18945 ("sysfs: bail early from kernfs_file_mmap() to avoid
spurious lockdep warning") inadvertently dropped error return check
from kernfs_file_mmap(). The intention was just dropping "if
(ops->mmap)" check as the control won't reach the point if the mmap
callback isn't implemented, but I mistakenly removed the error return
check together with it.
This led to Xorg crash on i810 which was reported and bisected to the
commit and then to the specific change by Tobias.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-bisected-by: Tobias Powalowski <tobias.powalowski@googlemail.com>
Tested-by: Tobias Powalowski <tobias.powalowski@googlemail.com>
References: http://lkml.kernel.org/g/533D01BD.1010200@googlemail.com
Cc: stable <stable@vger.kernel.org> # 3.14
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Currently kernfs_link_sibling() increates parent->dir.subdirs before
adding the node into parent's chidren rb tree.
Because it is possible that kernfs_link_sibling() couldn't find
a suitable slot and bail out, this leads to a mismatch between
elevated subdir count with actual children node numbers.
This patches fix this problem, by moving the subdir accouting
after the actual addtion happening.
Signed-off-by: Jianyu Zhan <nasa4836@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
kernfs_notify() is used to indicate either new data is available or
the content of a file has changed. It currently only triggers poll
which may not be the most convenient to monitor especially when there
are a lot to monitor. Let's hook it up to fsnotify too so that the
events can be monitored via inotify too.
fsnotify_modify() requires file * but kernfs_notify() doesn't have any
specific file associated; however, we can walk all super_blocks
associated with a kernfs_root and as kernfs always associate one ino
with inode and one dentry with an inode, it's trivial to look up the
dentry associated with a given kernfs_node. As any active monitor
would pin dentry, just looking up existing dentry is enough. This
patch looks up the dentry associated with the specified kernfs_node
and generates events equivalent to fsnotify_modify().
Note that as fsnotify doesn't provide fsnotify_modify() equivalent
which can be called with dentry, kernfs_notify() directly calls
fsnotify_parent() and fsnotify(). It might be better to add a wrapper
in fsnotify.h instead.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: John McCutchan <john@johnmccutchan.com>
Cc: Robert Love <rlove@rlove.org>
Cc: Eric Paris <eparis@parisplace.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Currently, there's no way to find out which super_blocks are
associated with a given kernfs_root. Let's implement it - the planned
inotify extension to kernfs_notify() needs it.
Make kernfs_super_info point back to the super_block and chain it at
kernfs_root->supers.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
actimeo=0 is supposed to be a special case that ensures that inode
attributes are always refetched from the server instead of trusting the
cache. The cifs code however uses time_in_range() to determine whether
the attributes have timed out. In the case where cifs_i->time equals
jiffies, this leads to the cifs code not refetching the inode attributes
when it should.
Fix this by explicitly testing for actimeo=0, and handling it as a
special case.
Reported-and-tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
|
|
If we had to retry on the profiles seqlock (due to a concurrent write), we
would set bits on the input flags that corresponded both to the current
profile and to previous values of the profile.
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
If skinny metadata is enabled and our first tree search fails to find a
skinny extent item, we may repeat a tree search for a "fat" extent item
(if the previous item in the leaf is not the "fat" extent we're looking
for). However we were not setting the new key's objectid to the right
value, as we previously used the same key variable to peek at the previous
item in the leaf, which has a different objectid. So just set the right
objectid to avoid modifying/deleting a wrong item if we repeat the tree
search.
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
Currently, with inode cache enabled, we will reuse its inode id immediately
after unlinking file, we may hit something like following:
|->iput inode
|->return inode id into inode cache
|->create dir,fsync
|->power off
An easy way to reproduce this problem is:
mkfs.btrfs -f /dev/sdb
mount /dev/sdb /mnt -o inode_cache,commit=100
dd if=/dev/zero of=/mnt/data bs=1M count=10 oflag=sync
inode_id=`ls -i /mnt/data | awk '{print $1}'`
rm -f /mnt/data
i=1
while [ 1 ]
do
mkdir /mnt/dir_$i
test1=`stat /mnt/dir_$i | grep Inode: | awk '{print $4}'`
if [ $test1 -eq $inode_id ]
then
dd if=/dev/zero of=/mnt/dir_$i/data bs=1M count=1 oflag=sync
echo b > /proc/sysrq-trigger
fi
sleep 1
i=$(($i+1))
done
mount /dev/sdb /mnt
umount /dev/sdb
btrfs check /dev/sdb
We fix this problem by adding unlinked inode's id into pinned tree,
and we can not reuse them until committing transaction.
Cc: stable@vger.kernel.org
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
Fix possible memory leaks in the following error handling paths:
read_tree_block()
btrfs_recover_log_trees
btrfs_commit_super()
btrfs_find_orphan_roots()
btrfs_cleanup_fs_roots()
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
When running stress test(including snapshots,balance,fstress), we trigger
the following BUG_ON() which is because we fail to start inode caching task.
[ 181.131945] kernel BUG at fs/btrfs/inode-map.c:179!
[ 181.137963] invalid opcode: 0000 [#1] SMP
[ 181.217096] CPU: 11 PID: 2532 Comm: btrfs Not tainted 3.14.0 #1
[ 181.240521] task: ffff88013b621b30 ti: ffff8800b6ada000 task.ti: ffff8800b6ada000
[ 181.367506] Call Trace:
[ 181.371107] [<ffffffffa036c1be>] btrfs_return_ino+0x9e/0x110 [btrfs]
[ 181.379191] [<ffffffffa038082b>] btrfs_evict_inode+0x46b/0x4c0 [btrfs]
[ 181.387464] [<ffffffff810b5a70>] ? autoremove_wake_function+0x40/0x40
[ 181.395642] [<ffffffff811dc5fe>] evict+0x9e/0x190
[ 181.401882] [<ffffffff811dcde3>] iput+0xf3/0x180
[ 181.408025] [<ffffffffa03812de>] btrfs_orphan_cleanup+0x1ee/0x430 [btrfs]
[ 181.416614] [<ffffffffa03a6abd>] btrfs_mksubvol.isra.29+0x3bd/0x450 [btrfs]
[ 181.425399] [<ffffffffa03a6cd6>] btrfs_ioctl_snap_create_transid+0x186/0x190 [btrfs]
[ 181.435059] [<ffffffffa03a6e3b>] btrfs_ioctl_snap_create_v2+0xeb/0x130 [btrfs]
[ 181.444148] [<ffffffffa03a9656>] btrfs_ioctl+0xf76/0x2b90 [btrfs]
[ 181.451971] [<ffffffff8117e565>] ? handle_mm_fault+0x475/0xe80
[ 181.459509] [<ffffffff8167ba0c>] ? __do_page_fault+0x1ec/0x520
[ 181.467046] [<ffffffff81185b35>] ? do_mmap_pgoff+0x2f5/0x3c0
[ 181.474393] [<ffffffff811d4da8>] do_vfs_ioctl+0x2d8/0x4b0
[ 181.481450] [<ffffffff811d5001>] SyS_ioctl+0x81/0xa0
[ 181.488021] [<ffffffff81680b69>] system_call_fastpath+0x16/0x1b
We should avoid triggering BUG_ON() here, instead, we output warning messages
and clear inode_cache option.
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
There's a case which clone does not handle and used to BUG_ON instead,
(testcase xfstests/btrfs/035), now returns EINVAL. This error code is
confusing to the ioctl caller, as it normally signifies errorneous
arguments.
Change it to ENOPNOTSUPP which allows a fall back to copy instead of
clone. This does not affect the common reflink operation.
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
Commit 3ac0d7b96a268a98bd474cab8bce3a9f125aaccf fixed the btrfs expanding
write problem but the hole punched is sometimes too large for some
iovec, which has unmapped data ranges.
This patch will change to hole range to a more accurate value using the
counts checked by the write check routines.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
Add the finobt feature bit to the list of known features. As of
this point, the kernel code knows how to mount and manage both
finobt and non-finobt formatted filesystems.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
|
|
Define the XFS_FSOP_GEOM_FLAGS_FINOBT fs geometry flag and set the
associated bit if the filesystem supports the free inode btree.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
|
|
Add finobt support to growfs. Initialize the agi root/level fields
and the root finobt block.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
|
|
An inode free operation can have several effects on the finobt. If
all inodes have been freed and the chunk deallocated, we remove the
finobt record. If the inode chunk was previously full, we must
insert a new record based on the existing inobt record. Otherwise,
we modify the record in place.
Create the xfs_difree_finobt() function to identify the potential
scenarios and update the finobt appropriately.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
|
|
Refactor xfs_difree() in preparation for the finobt. xfs_difree()
performs the validity checks against the ag and reads the agi
header. The work of physically updating the inode allocation btree
is pushed down into the new xfs_difree_inobt() helper.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
|
|
Replace xfs_dialloc_ag() with an implementation that looks for a
record in the finobt. The finobt only tracks records with at least
one free inode. This eliminates the need for the intra-ag scan in
the original algorithm. Once the inode is allocated, update the
finobt appropriately (possibly removing the record) as well as the
inobt.
Move the original xfs_dialloc_ag() algorithm to
xfs_dialloc_ag_inobt() and fall back as such if finobt support is
not enabled.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
|
|
A newly allocated inode chunk, by definition, has at least one
free inode, so a record is always inserted into the finobt.
Create the xfs_inobt_insert() helper from existing code to insert
a record in an inobt based on the provided BTNUM. Update
xfs_ialloc_ag_alloc() to invoke the helper for the existing
XFS_BTNUM_INO tree and XFS_BTNUM_FINO tree, if enabled.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
|
|
Create the xfs_calc_finobt_res() helper to calculate the finobt log
reservation for inode allocation and free. Update
XFS_IALLOC_SPACE_RES() to reserve blocks for the additional finobt
insertion on inode allocation. Create XFS_IFREE_SPACE_RES() to
reserve blocks for the potential finobt record insertion on inode
free (i.e., if an inode chunk was previously fully allocated).
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
|
|
Define the AGI fields for the finobt root/level and add magic
numbers. Update the btree code to add support for the new
XFS_BTNUM_FINOBT inode btree.
The finobt root block is reserved immediately following the inobt
root block in the AG. Update XFS_PREALLOC_BLOCKS() to determine the
starting AG data block based on whether finobt support is enabled.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
|
|
Reserve a v5 read-only compatibility feature bit for the finobt and
create the xfs_sb_version_hasfinobt() helper to determine whether
an fs has the feature enabled.
The finobt does not change existing on-disk structures, but must
remain consistent with the ialloc btree. Modifications from older
kernels would violate that constrant. Therefore, we restrict older
kernels to read-only mounts of finobt-enabled filesystems.
Note that this does not yet enable the ability to rw mount a finobt
fs (by setting the feature bit in the XFS_SB_FEAT_RO_COMPAT_ALL
mask).
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
|
|
The introduction of the free inode btree (finobt) requires that
xfs_ialloc_btree.c handle multiple trees. Refactor xfs_ialloc_btree.c
so the caller specifies the btree type on cursor initialization to
prepare for addition of the finobt.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
|
|
File-private locks have been re-christened as "open file description"
locks. Finish the symbol name cleanup in the internal implementation.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
|
|
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
|
|
There is no good reason to create a filestream when a directory entry
is created. Delay it until the first allocation happens to simply
the code and reduce the amount of mru cache lookups we do.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
|