summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2018-05-30btrfs: Remove fs_info argument from btrfs_uuid_tree_remLu Fengqi
This function always takes a transaction handle which contains a reference to the fs_info. Use that and remove the extra argument. Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> [ rename the function ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30btrfs: Remove fs_info argument from btrfs_uuid_tree_addLu Fengqi
This function always takes a transaction handle which contains a reference to the fs_info. Use that and remove the extra argument. Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30Btrfs: remove unused check of skip_lockingLiu Bo
The check is superfluous since all callers who set search_for_commit also have skip_locking set. ASSERT() is put in place to ensure skip_locking is set by new callers. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30Btrfs: remove always true check in unlock_upLiu Bo
As unlock_up() is written as for () { if (!path->locks[i]) break; ... if (... && path->locks[i]) { } } Apparently, @path->locks[i] is always true at this 'if'. Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Reviewed-by: David Sterba <dsterba@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30Btrfs: grab write lock directly if write_lock_level is the max levelLiu Bo
Typically, when acquiring root node's lock, btrfs tries its best to get read lock and trade for write lock if @write_lock_level implies to do so. In case of (cow && (p->keep_locks || p->lowest_level)), write_lock_level is set to BTRFS_MAX_LEVEL, which means we need to acquire root node's write lock directly. In this particular case, the dance of acquiring read lock and then trading for write lock can be saved. Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30Btrfs: move get root out of btrfs_search_slot to a helperLiu Bo
It's good to have a helper instead of having all get-root details open-coded. The new helper locks (if necessary) and sets root node of the path. Also invert the checks to make the code flow easier to read. There is no functional change in this commit. Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30Btrfs: use more straightforward extent_buffer_uptodate checkLiu Bo
If parent_transid "0" is passed to btrfs_buffer_uptodate(), btrfs_buffer_uptodate() is equivalent to extent_buffer_uptodate(), but extent_buffer_uptodate() is preferred since we don't have to look into verify_parent_transid(). Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30Btrfs: remove superfluous free_extent_buffer in read_block_for_searchLiu Bo
read_block_for_search() can be simplified as: tmp = find_extent_buffer(); if (tmp) return; ... free_extent_buffer(); read_tree_block(); Apparently, @tmp must be NULL at this point, free_extent_buffer() is not needed. Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30btrfs: drop unused space_info parameter from create_space_infoLu Fengqi
Since commit dc2d3005d27d ("btrfs: remove dead create_space_info calls"), there is only one caller btrfs_init_space_info. However, it doesn't need create_space_info to return space_info at all. Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30Btrfs: add parent_transid parameter to veirfy_level_keyLiu Bo
As verify_level_key() is checked after verify_parent_transid(), i.e. if (verify_parent_transid()) ret = -EIO; else if (verify_level_key()) ret = -EUCLEAN; if parent_transid is 0, verify_parent_transid() skips verifying parent_transid and considers eb as valid, and if verify_level_key() reports something wrong, we're not going to know if it's caused by corrupted metadata or non-checkecd eb (e.g. stale eb). The stale eb can be from an outdated raid1 mirror after a degraded mount, see eg "btrfs: fix reading stale metadata blocks after degraded raid1 mounts" (02a3307aa9c20b4f66262) for more details. @parent_transid is able to tell whether the eb's generation has been verified by the caller. Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30btrfs: qgroup: show more meaningful qgroup_rescan_init error messageQu Wenruo
Error message from qgroup_rescan_init() mostly looks like: BTRFS info (device nvme0n1p1): qgroup_rescan_init failed with -115 Which is far from meaningful, and sometimes confusing as for above -EINPROGRESS it's mostly (despite the init race) harmless, but sometimes it can also indicate problem if the return value is -EINVAL. Change it to some more meaningful messages like: BTRFS info (device nvme0n1p1): qgroup rescan is already in progress And BTRFS err(device nvme0n1p1): qgroup rescan init failed, qgroup is not enabled Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> [ update the messages and level ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30Btrfs: fix memory and mount leak in btrfs_ioctl_rm_dev_v2()Omar Sandoval
If we have invalid flags set, when we error out we must drop our writer counter and free the buffer we allocated for the arguments. This bug is trivially reproduced with the following program on 4.7+: #include <fcntl.h> #include <stdint.h> #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <sys/ioctl.h> #include <sys/stat.h> #include <sys/types.h> #include <linux/btrfs.h> #include <linux/btrfs_tree.h> int main(int argc, char **argv) { struct btrfs_ioctl_vol_args_v2 vol_args = { .flags = UINT64_MAX, }; int ret; int fd; if (argc != 2) { fprintf(stderr, "usage: %s PATH\n", argv[0]); return EXIT_FAILURE; } fd = open(argv[1], O_WRONLY); if (fd == -1) { perror("open"); return EXIT_FAILURE; } ret = ioctl(fd, BTRFS_IOC_RM_DEV_V2, &vol_args); if (ret == -1) perror("ioctl"); close(fd); return EXIT_SUCCESS; } When unmounting the filesystem, we'll hit the WARN_ON(mnt_get_writers(mnt)) in cleanup_mnt() and also may prevent the filesystem to be remounted read-only as the writer count will stay lifted. Fixes: 6b526ed70cf1 ("btrfs: introduce device delete by devid") CC: stable@vger.kernel.org # 4.9+ Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30btrfs: lzo: Harden inline lzo compressed extent decompressionQu Wenruo
For inlined extent, we only have one segment, thus less things to check. And further more, inlined extent always has the csum in its leaf header, it's less probable to have corrupted data. Anyway, still check header and segment header. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30btrfs: lzo: Add header length check to avoid potential out-of-bounds accessQu Wenruo
James Harvey reported that some corrupted compressed extent data can lead to various kernel memory corruption. Such corrupted extent data belongs to inode with NODATASUM flags, thus data csum won't help us detecting such bug. If lucky enough, KASAN could catch it like: BUG: KASAN: slab-out-of-bounds in lzo_decompress_bio+0x384/0x7a0 [btrfs] Write of size 4096 at addr ffff8800606cb0f8 by task kworker/u16:0/2338 CPU: 3 PID: 2338 Comm: kworker/u16:0 Tainted: G O 4.17.0-rc5-custom+ #50 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 Workqueue: btrfs-endio btrfs_endio_helper [btrfs] Call Trace: dump_stack+0xc2/0x16b print_address_description+0x6a/0x270 kasan_report+0x260/0x380 memcpy+0x34/0x50 lzo_decompress_bio+0x384/0x7a0 [btrfs] end_compressed_bio_read+0x99f/0x10b0 [btrfs] bio_endio+0x32e/0x640 normal_work_helper+0x15a/0xea0 [btrfs] process_one_work+0x7e3/0x1470 worker_thread+0x1b0/0x1170 kthread+0x2db/0x390 ret_from_fork+0x22/0x40 ... The offending compressed data has the following info: Header: length 32768 (looks completely valid) Segment 0 Header: length 3472882419 (obviously out of bounds) Then when handling segment 0, since it's over the current page, we need the copy the compressed data to temporary buffer in workspace, then such large size would trigger out-of-bounds memory access, screwing up the whole kernel. Fix it by adding extra checks on header and segment headers to ensure we won't access out-of-bounds, and even checks the decompressed data won't be out-of-bounds. Reported-by: James Harvey <jamespharvey20@gmail.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com> Reviewed-by: David Sterba <dsterba@suse.com> [ updated comments ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-29aio: sanitize the limit checking in io_submit(2)Al Viro
as it is, the logics in native io_submit(2) is "if asked for more than LONG_MAX/sizeof(pointer) iocbs to submit, don't bother with more than LONG_MAX/sizeof(pointer)" (i.e. 512M requests on 32bit and 1E requests on 64bit) while compat io_submit(2) goes with "stop after the first PAGE_SIZE/sizeof(pointer) iocbs", i.e. 1K or so. Which is * inconsistent * *way* too much in native case * possibly too little in compat one and * wrong anyway, since the natural point where we ought to stop bothering is ctx->nr_events Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-05-29aio: fold do_io_submit() into callersAl Viro
get rid of insane "copy array of 32bit pointers into an array of native ones" glue. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-05-29aio: shift copyin of iocb into io_submit_one()Al Viro
Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-05-29aio_read_events_ring(): make a bit more readableAl Viro
The logics for 'avail' is * not past the tail of cyclic buffer * no more than asked * not past the end of buffer * not past the end of a page Unobfuscate the last part. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-05-29aio: all callers of aio_{read,write,fsync,poll} treat 0 and -EIOCBQUEUED the ↵Al Viro
same way ... so just make them return 0 when caller does not need to destroy iocb Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-05-29aio: take list removal to (some) callers of aio_complete()Al Viro
We really want iocb out of io_cancel(2) reach before we start tearing it down. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-05-29Merge tag 'afs-fixes-20180529' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull AFS fixes from David Howells: - fix a BUG triggerable from faccessat() - fix the mounting of backup volumes * tag 'afs-fixes-20180529' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: afs: Fix mounting of backup volumes afs: Fix directory permissions check
2018-05-29f2fs: add fsync_mode=nobarrier for non-atomic filesJaegeuk Kim
For non-atomic files, this patch adds an option to give nobarrier which doesn't issue flush commands to the device. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-05-29f2fs: let fstrim issue discard commands in lower priorityJaegeuk Kim
The fstrim gathers huge number of large discard commands, and tries to issue without IO awareness, which results in long user-perceive IO latencies on READ, WRITE, and FLUSH in UFS. We've observed some of commands take several seconds due to long discard latency. This patch limits the maximum size to 2MB per candidate, and check IO congestion when issuing them to disk. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-05-29fs: xfs: Change return type to vm_fault_tSouptick Joarder
Use new return type vm_fault_t for fault handlers. Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-05-29xfs: fix inobt magic number checkDarrick J. Wong
In commit a6a781a58befcbd467c ("xfs: have buffer verifier functions report failing address") the bad magic number return was ported incorrectly. Fixes: a6a781a58befcbd467ce843af4eaca3906aa1f08 Reported-by: syzbot+08ab33be0178b76851c8@syzkaller.appspotmail.com Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
2018-05-29fs: clear writeback errors in inode_init_alwaysDarrick J. Wong
In inode_init_always(), we clear the inode mapping flags, which clears any retained error (AS_EIO, AS_ENOSPC) bits. Unfortunately, we do not also clear wb_err, which means that old mapping errors can leak through to new inodes. This is crucial for the XFS inode allocation path because we recycle old in-core inodes and we do not want error state from an old file to leak into the new file. This bug was discovered by running generic/036 and generic/047 in a loop and noticing that the EIOs generated by the collision of direct and buffered writes in generic/036 would survive the remount between 036 and 047, and get reported to the fsyncs (on different files!) in generic/047. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-05-29vfs: delete unnecessary assignment in vfs_listxattrChengguang Xu
It seems the first error assignment in if branch is redundant. Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-05-29btrfs: lzo: document the compressed data formatQu Wenruo
Although it's not that complex, but such comment could still save several minutes for newer reader/reviewer instead of inferring that from the code. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> [ minor wording updates ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-29btrfs: compression: Add linux/sizes.h for compression.hQu Wenruo
Since compression.h is using the SZ_* macros, and if some file includes only compression.h without linux/sizes.h, it will cause compile error. One example is lzo.c, if it uses BTRFS_MAX_COMPRESSED. Fix it by adding linux/sizes.h in compression.h Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-29Btrfs: fix clone vs chattr NODATASUM raceOmar Sandoval
In btrfs_clone_files(), we must check the NODATASUM flag while the inodes are locked. Otherwise, it's possible that btrfs_ioctl_setflags() will change the flags after we check and we can end up with a party checksummed file. The race window is only a few instructions in size, between the if and the locks which is: 3834 if (S_ISDIR(src->i_mode) || S_ISDIR(inode->i_mode)) 3835 return -EISDIR; where the setflags must be run and toggle the NODATASUM flag (provided the file size is 0). The clone will block on the inode lock, segflags takes the inode lock, changes flags, releases log and clone continues. Not impossible but still needs a lot of bad luck to hit unintentionally. Fixes: 0e7b824c4ef9 ("Btrfs: don't make a file partly checksummed through file clone") CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> [ update changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-29btrfs: propagate failures of __exclude_logged_extent to upper callerGu Jinxiang
Function btrfs_exclude_logged_extents may call __exclude_logged_extent which may fail. Propagate the failures of __exclude_logged_extent to upper caller. Signed-off-by: Gu Jinxiang <gujx@cn.fujitsu.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-29btrfs: Streamline shared ref check in alloc_reserved_tree_blockNikolay Borisov
Instead of setting "parent" to ref->parent only when dealing with a shared ref and subsequently performing another check to see if (parent > 0), check the "node->type" directly and act accordingly. This makes the code more streamline. No functional changes. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-29btrfs: Pass btrfs_delayed_extent_op to alloc_reserved_tree_blockNikolay Borisov
Instead of taking only specific member of this structure, which results in 2 extra arguments, just take the delayed_extent_op struct and reference the arguments inside the functions. No functional changes. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-29btrfs: Simplify alloc_reserved_tree_block interfaceNikolay Borisov
This function currently takes 7 parameters, most of which are proxies for values from btrfs_delayed_ref_node struct which is not passed. This patch simplifies the interface of the function by simply passing said delayed ref node struct to the function. This enables us to: 1. Move locals variables and init code related to them from run_delayed_tree_ref which should only be used inside alloc_reserved_tree_block, such as skinny_metadata and the btrfs_key, representing the extent being inserted. This removes the need for the "ins" argument. Instead, it's replaced by a local var with a more verbose name - extent_key. 2. Now that we have a reference to the node in alloc_reserved_tree_block the delayed_tree_ref struct can be referenced inside the function and this enable removing the "ref->level", "parent" and "ref_root" arguments. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-29btrfs: Remove fs_info argument from alloc_reserved_tree_blockNikolay Borisov
This function already takes a transaction handle which contains a reference to the fs_info. So use this and remove the extra argument. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-29btrfs: tests: drop newline from test_msg stringsDavid Sterba
Now that test_err strings do not need the newline, remove them also from the test_msg. Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-29btrfs: tests: add helper for error messages and update themDavid Sterba
The test failures are not clearly visible in the system log as they're printed at INFO level. Add a new helper that is level ERROR. As this touches almost all strings, I took the opportunity to unify them: - decapitalize the first letter as there's a prefix and the text continues after ":" - glue strings split to more lines and un-indent so they fit to 80 columns - use %llu instead of %Lu - drop \n from the modified messages (test_msg is left untouched) Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-29dlm: remove O_NONBLOCK flag in sctp_connect_to_sockGang He
We should remove O_NONBLOCK flag when calling sock->ops->connect() in sctp_connect_to_sock() function. Why? 1. up to now, sctp socket connect() function ignores the flag argument, that means O_NONBLOCK flag does not take effect, then we should remove it to avoid the confusion (but is not urgent). 2. for the future, there will be a patch to fix this problem, then the flag argument will take effect, the patch has been queued at https://git.kernel.o rg/pub/scm/linux/kernel/git/davem/net.git/commit/net/sctp?id=644fbdeacf1d3ed d366e44b8ba214de9d1dd66a9. But, the O_NONBLOCK flag will make sock->ops->connect() directly return without any wait time, then the connection will not be established, DLM kernel module will call sock->ops->connect() again and again, the bad results are, CPU usage is almost 100%, even trigger soft_lockup problem if the related configurations are enabled, DLM kernel module also prints lots of messages like, [Fri Apr 27 11:23:43 2018] dlm: connecting to 172167592 [Fri Apr 27 11:23:43 2018] dlm: connecting to 172167592 [Fri Apr 27 11:23:43 2018] dlm: connecting to 172167592 [Fri Apr 27 11:23:43 2018] dlm: connecting to 172167592 The upper application (e.g. ocfs2 mount command) is hanged at new_lockspace(), the whole backtrace is as below, tb0307-nd2:~ # cat /proc/2935/stack [<0>] new_lockspace+0x957/0xac0 [dlm] [<0>] dlm_new_lockspace+0xae/0x140 [dlm] [<0>] user_cluster_connect+0xc3/0x3a0 [ocfs2_stack_user] [<0>] ocfs2_cluster_connect+0x144/0x220 [ocfs2_stackglue] [<0>] ocfs2_dlm_init+0x215/0x440 [ocfs2] [<0>] ocfs2_fill_super+0xcb0/0x1290 [ocfs2] [<0>] mount_bdev+0x173/0x1b0 [<0>] mount_fs+0x35/0x150 [<0>] vfs_kern_mount.part.23+0x54/0x100 [<0>] do_mount+0x59a/0xc40 [<0>] SyS_mount+0x80/0xd0 [<0>] do_syscall_64+0x76/0x140 [<0>] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [<0>] 0xffffffffffffffff So, I think we should remove O_NONBLOCK flag here, since DLM kernel module can not handle non-block sockect in connect() properly. Signed-off-by: Gang He <ghe@suse.com> Signed-off-by: David Teigland <teigland@redhat.com>
2018-05-29block: don't print a message when the device went awayChristoph Hellwig
The information about a size change in this case just creates confusion. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-29block: unexport check_disk_size_changeChristoph Hellwig
Only used in block_dev.c and the partitions code, and it should remain that way.. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-28aio: add missing break for the IOCB_CMD_FDSYNC caseChristoph Hellwig
Looks like this got lost in a merge. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-05-28NFS: Optimise away lookups for rename targetsTrond Myklebust
We can optimise away any lookup for a rename target, unless we're being asked to revalidate a dentry that might be in use. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-05-28NFS: If the VFS sets LOOKUP_REVAL then force a lookup of the dentryTrond Myklebust
If nfs_lookup_revalidate() is called with LOOKUP_REVAL because a previous path lookup failed, then we ought to force a full lookup of the component name. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-05-28NFS: Optimise away the close-to-open GETATTR when we have NFSv4 OPENTrond Myklebust
NFSv4 should not need to perform an extra close-to-open GETATTR as part of the process of looking up a regular file, since the OPEN call will do that for us. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-05-28IB: Revert "remove redundant INFINIBAND kconfig dependencies"Arnd Bergmann
Several subsystems depend on INFINIBAND_ADDR_TRANS, which in turn depends on INFINIBAND. However, when with CONFIG_INIFIBAND=m, this leads to a link error when another driver using it is built-in. The INFINIBAND_ADDR_TRANS dependency is insufficient here as this is a 'bool' symbol that does not force anything to be a module in turn. fs/cifs/smbdirect.o: In function `smbd_disconnect_rdma_work': smbdirect.c:(.text+0x1e4): undefined reference to `rdma_disconnect' net/9p/trans_rdma.o: In function `rdma_request': trans_rdma.c:(.text+0x7bc): undefined reference to `rdma_disconnect' net/9p/trans_rdma.o: In function `rdma_destroy_trans': trans_rdma.c:(.text+0x830): undefined reference to `ib_destroy_qp' trans_rdma.c:(.text+0x858): undefined reference to `ib_dealloc_pd' Fixes: 9533b292a7ac ("IB: remove redundant INFINIBAND kconfig dependencies") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Greg Thelen <gthelen@google.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-05-28btrfs: use error code returned by btrfs_read_fs_root_no_name in search ioctlMisono Tomohiro
btrfs_read_fs_root_no_name() may return ERR_PTR(-ENOENT) or ERR_PTR(-ENOMEM) and therefore search_ioctl() and btrfs_search_path_in_tree() should use PTR_ERR() instead of -ENOENT, which all other callers of btrfs_read_fs_root_no_name() do. Drop the error message as it would be confusing, the caller of ioctl will likely interpret the error code and not look into the syslog. Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28Btrfs: allow empty subvol= againOmar Sandoval
I got a report that after upgrading to 4.16, someone's filesystems weren't mounting: [ 23.845852] BTRFS info (device loop0): unrecognized mount option 'subvol=' Before 4.16, this mounted the default subvolume. It turns out that this empty "subvol=" is actually an application bug, but it was causing the application to fail, so it's an ABI break if you squint. The generic parsing code we use for mount options (match_token()) doesn't match an empty string as "%s". Previously, setup_root_args() removed the "subvol=" string, but the mount path was cleaned up to not need that. Add a dummy Opt_subvol_empty to fix this. The simple workaround is to use / or . for the value of 'subvol=' . Fixes: 312c89fbca06 ("btrfs: cleanup btrfs_mount() using btrfs_mount_root()") CC: stable@vger.kernel.org # 4.16+ Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: fix describe_relocation when printing unknown flagsAnand Jain
Looks like the original idea was to print the hex of the flags which is not coded with their flag name. So use the current buf pointer bp instead of buf. Reaching the uknown flags should never happen, it's there just in case. Fixes: ebce0e01b930b ("btrfs: make block group flags in balance printks human-readable") Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: use kvzalloc for EXTENT_SAME temporary dataDavid Sterba
The dedupe range is 16 MiB, with 4 KiB pages and 8 byte pointers, the arrays can be 32KiB large. To avoid allocation failures due to fragmented memory, use the allocation with fallback to vmalloc. The arrays are allocated and freed only inside btrfs_extent_same and reused for all the ranges. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28Btrfs: reuse cmp workspace in EXTENT_SAME ioctlTimofey Titovets
We support big dedup requests by splitting range to smaller parts, and call dedupe logic on each of them. Instead of repeated allocation and deallocation, allocate once at the beginning and reuse in the iteration. Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>