summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2014-04-02fuse: Flush files on wb closePavel Emelyanov
Any write request requires a file handle to report to the userspace. Thus when we close a file (and free the fuse_file with this info) we have to flush all the outstanding dirty pages. filemap_write_and_wait() is enough because every page under fuse writeback is accounted in ff->count. This delays actual close until all fuse wb is completed. In case of "write cache" turned off, the flush is ensured by fuse_vma_close(). Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-04-02fuse: Trust kernel i_mtime onlyMaxim Patlasov
Let the kernel maintain i_mtime locally: - clear S_NOCMTIME - implement i_op->update_time() - flush mtime on fsync and last close - update i_mtime explicitly on truncate and fallocate Fuse inode flag FUSE_I_MTIME_DIRTY serves as indication that local i_mtime should be flushed to the server eventually. Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-04-02fuse: Trust kernel i_size onlyPavel Emelyanov
Make fuse think that when writeback is on the inode's i_size is always up-to-date and not update it with the value received from the userspace. This is done because the page cache code may update i_size without letting the FS know. This assumption implies fixing the previously introduced short-read helper -- when a short read occurs the 'hole' is filled with zeroes. fuse_file_fallocate() is also fixed because now we should keep i_size up to date, so it must be updated if FUSE_FALLOCATE request succeeded. Signed-off-by: Maxim V. Patlasov <MPatlasov@parallels.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-04-02fuse: Connection bit for enabling writebackPavel Emelyanov
Off (0) by default. Will be used in the next patches and will be turned on at the very end. Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com> Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-04-02fuse: Prepare to handle short readsPavel Emelyanov
A helper which gets called when read reports less bytes than was requested. See patch "trust kernel i_size only" for details. Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com> Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-04-02fuse: Linking file to inode helperPavel Emelyanov
When writeback is ON every writeable file should be in per-inode write list, not only mmap-ed ones. Thus introduce a helper for this linkage. Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com> Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-04-01ocfs2_file_aio_write(): switch to generic_perform_write()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01ceph_aio_write(): switch to generic_perform_write()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01xfs_file_buffered_aio_write(): switch to generic_perform_write()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01generic_file_direct_write(): get rid of ppos argumentAl Viro
always equal to &iocb->ki_pos. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01btrfs_file_aio_write(): get rid of pposAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01kill the 5th argument of generic_file_buffered_write()Al Viro
same story - it's &iocb->ki_pos in all cases Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01kill the 4th argument of __generic_file_aio_write()Al Viro
It's always equal to &iocb->ki_pos, where iocb is the value of the 1st argument. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01ocfs2: don't open-code kernel_recvmsg()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01constify blk_rq_map_user_iov() and friendsAl Viro
sg_iovec array passed to it can be const Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01ocfs2: don't open-code kernel_sendmsg()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01read_code(): go through vfs_read() instead of calling the method directlyAl Viro
... and don't skip on sanity checks. It's *not* a hot path, TYVM (a couple of calls per a.out execve(), for pity sake) and headers of random a.out binary are not to be trusted. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01fold cifs_iovec_read() into its (only) callerAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01cifs_iovec_read: keep iov_iter between the calls of cifs_readdata_to_iov()Al Viro
... we are doing them on adjacent parts of file, so what happens is that each subsequent call works to rebuild the iov_iter to exact state it had been abandoned in by previous one. Just keep it through the entire cifs_iovec_read(). And use copy_page_to_iter() instead of doing kmap/copy_to_user/kunmap manually... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01switch vmsplice_to_user() to copy_page_to_iter()Al Viro
I've switched the sanity checks on iovec to rw_copy_check_uvector(); we might need to do a local analog, if any behaviour differences are not actually bugfixes here... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01switch pipe_read() to copy_page_to_iter()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01cifs_iovec_read(): resubmit shouldn't restart the loopAl Viro
... by that point the request we'd just resent is in the head of the list anyway. Just return to the beginning of the loop body... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01callers of iov_copy_from_user_atomic() don't need pagecache_disable()Al Viro
... it does that itself (via kmap_atomic()) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01switch ->is_partially_uptodate() to saner argumentsAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01pipe: kill ->map() and ->unmap()Al Viro
all pipe_buffer_operations have the same instances of those... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01fuse/dev: use atomic mapsAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01VFS: Make delayed_free() call free_vfsmnt()David Howells
Make delayed_free() call free_vfsmnt() so that we don't have two functions doing the same job. This requires the calls to mnt_free_id() in free_vfsmnt() to be moved into the callers of that function. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01cifs: ->rename() without ->lookup() makes no senseAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01get rid of pointless checks for NULL ->i_opAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01ntfs: don't put NULL into ->i_op/->i_fopAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01new helper: readlink_copy()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01get rid of files_defer_init()Al Viro
the only thing it's doing these days is calculation of upper limit for fs.nr_open sysctl and that can be done statically Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01namei.c: move EXPORT_SYMBOL to corresponding definitionsAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01get_write_access() is inlined, exporting it is pointlessAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01tidy do_dentry_open() up a bitAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01mark struct file that had write access grabbed by open()Al Viro
new flag in ->f_mode - FMODE_WRITER. Set by do_dentry_open() in case when it has grabbed write access, checked by __fput() to decide whether it wants to drop the sucker. Allows to stop bothering with mnt_clone_write() in alloc_file(), along with fewer special_file() checks. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01fold __get_file_write_access() into its only callerAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01get rid of DEBUG_WRITECOUNTAl Viro
it only makes control flow in __fput() and friends more convoluted. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01don't bother with {get,put}_write_access() on non-regular filesAl Viro
it's pointless and actually leads to wrong behaviour in at least one moderately convoluted case (pipe(), close one end, try to get to another via /proc/*/fd and run into ETXTBUSY). Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01ncpfs: switch to sockfd_lookup()/sockfd_put()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01reduce m_start() cost...Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01smarter propagate_mnt()Al Viro
The current mainline has copies propagated to *all* nodes, then tears down the copies we made for nodes that do not contain counterparts of the desired mountpoint. That sets the right propagation graph for the copies (at teardown time we move the slaves of removed node to a surviving peer or directly to master), but we end up paying a fairly steep price in useless allocations. It's fairly easy to create a situation where N calls of mount(2) create exactly N bindings, with O(N^2) vfsmounts allocated and freed in process. Fortunately, it is possible to avoid those allocations/freeings. The trick is to create copies in the right order and find which one would've eventually become a master with the current algorithm. It turns out to be possible in O(nodes getting propagation) time and with no extra allocations at all. One part is that we need to make sure that eventual master will be created before its slaves, so we need to walk the propagation tree in a different order - by peer groups. And iterate through the peers before dealing with the next group. Another thing is finding the (earlier) copy that will be a master of one we are about to create; to do that we are (temporary) marking the masters of mountpoints we are attaching the copies to. Either we are in a peer of the last mountpoint we'd dealt with, or we have the following situation: we are attaching to mountpoint M, the last copy S_0 had been attached to M_0 and there are sequences S_0...S_n, M_0...M_n such that S_{i+1} is a master of S_{i}, S_{i} mounted on M{i} and we need to create a slave of the first S_{k} such that M is getting propagation from M_{k}. It means that the master of M_{k} will be among the sequence of masters of M. On the other hand, the nearest marked node in that sequence will either be the master of M_{k} or the master of M_{k-1} (the latter - in the case if M_{k-1} is a slave of something M gets propagation from, but in a wrong peer group). So we go through the sequence of masters of M until we find a marked one (P). Let N be the one before it. Then we go through the sequence of masters of S_0 until we find one (say, S) mounted on a node D that has P as master and check if D is a peer of N. If it is, S will be the master of new copy, if not - the master of S will be. That's it for the hard part; the rest is fairly simple. Iterator is in next_group(), handling of one prospective mountpoint is propagate_one(). It seems to survive all tests and gives a noticably better performance than the current mainline for setups that are seriously using shared subtrees. Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01Merge branch 'for-3.15/core' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull core block layer updates from Jens Axboe: "This is the pull request for the core block IO bits for the 3.15 kernel. It's a smaller round this time, it contains: - Various little blk-mq fixes and additions from Christoph and myself. - Cleanup of the IPI usage from the block layer, and associated helper code. From Frederic Weisbecker and Jan Kara. - Duplicate code cleanup in bio-integrity from Gu Zheng. This will give you a merge conflict, but that should be easy to resolve. - blk-mq notify spinlock fix for RT from Mike Galbraith. - A blktrace partial accounting bug fix from Roman Pen. - Missing REQ_SYNC detection fix for blk-mq from Shaohua Li" * 'for-3.15/core' of git://git.kernel.dk/linux-block: (25 commits) blk-mq: add REQ_SYNC early rt,blk,mq: Make blk_mq_cpu_notify_lock a raw spinlock blk-mq: support partial I/O completions blk-mq: merge blk_mq_insert_request and blk_mq_run_request blk-mq: remove blk_mq_alloc_rq blk-mq: don't dump CPU -> hw queue map on driver load blk-mq: fix wrong usage of hctx->state vs hctx->flags blk-mq: allow blk_mq_init_commands() to return failure block: remove old blk_iopoll_enabled variable blktrace: fix accounting of partially completed requests smp: Rename __smp_call_function_single() to smp_call_function_single_async() smp: Remove wait argument from __smp_call_function_single() watchdog: Simplify a little the IPI call smp: Move __smp_call_function_single() below its safe version smp: Consolidate the various smp_call_function_single() declensions smp: Teach __smp_call_function_single() to check for offline cpus smp: Remove unused list_head from csd smp: Iterate functions through llist_for_each_entry_safe() block: Stop abusing rq->csd.list in blk-softirq block: Remove useless IPI struct initialization ...
2014-04-02f2fs: fix to cover io->bio with io_rwsemJaegeuk Kim
In the f2fs_wait_on_page_writeback, io->bio should be covered by io_rwsem. Otherwise, the bio pointer can become a dangling pointer due to data races. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-04-02f2fs: fix error path when fail to read inline dataChao Yu
We should unlock page in ->readpage() path and also should unlock & release page in error path of ->write_begin() to avoid deadlock or memory leak. So let's add release code to fix the problem when we fail to read inline data. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-04-02f2fs: use list_for_each_entry{_safe} for simplyfying codeChao Yu
This patch use list_for_each_entry{_safe} instead of list_for_each{_safe} for simplfying code. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-04-02f2fs: avoid free slab cache under spinlockChao Yu
Move kmem_cache_free out of spinlock protection region for better performance. Change log from v1: o remove spinlock protection for kmem_cache_free in destroy_node_manager suggested by Jaegeuk Kim. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-04-01ext4: fix premature freeing of partial clusters split across leaf blocksEric Whitney
Xfstests generic/311 and shared/298 fail when run on a bigalloc file system. Kernel error messages produced during the tests report that blocks to be freed are already on the to-be-freed list. When e2fsck is run at the end of the tests, it typically reports bad i_blocks and bad free blocks counts. The bug that causes these failures is located in ext4_ext_rm_leaf(). Code at the end of the function frees a partial cluster if it's not shared with an extent remaining in the leaf. However, if all the extents in the leaf have been removed, the code dereferences an invalid extent pointer (off the front of the leaf) when the check for sharing is made. This generally has the effect of unconditionally freeing the partial cluster, which leads to the observed failures when the partial cluster is shared with the last extent in the next leaf. Fix this by attempting to free the cluster only if extents remain in the leaf. Any remaining partial cluster will be freed if possible when the next leaf is processed or when leaf removal is complete. Signed-off-by: Eric Whitney <enwlinux@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
2014-04-01Merge tag 'driver-core-3.15-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core and sysfs updates from Greg KH: "Here's the big driver core / sysfs update for 3.15-rc1. Lots of kernfs updates to make it useful for other subsystems, and a few other tiny driver core patches. All have been in linux-next for a while" * tag 'driver-core-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (42 commits) Revert "sysfs, driver-core: remove unused {sysfs|device}_schedule_callback_owner()" kernfs: cache atomic_write_len in kernfs_open_file numa: fix NULL pointer access and memory leak in unregister_one_node() Revert "driver core: synchronize device shutdown" kernfs: fix off by one error. kernfs: remove duplicate dir.c at the top dir x86: align x86 arch with generic CPU modalias handling cpu: add generic support for CPU feature based module autoloading sysfs: create bin_attributes under the requested group driver core: unexport static function create_syslog_header firmware: use power efficient workqueue for unloading and aborting fw load firmware: give a protection when map page failed firmware: google memconsole driver fixes firmware: fix google/gsmi duplicate efivars_sysfs_init() drivers/base: delete non-required instances of include <linux/init.h> kernfs: fix kernfs_node_from_dentry() ACPI / platform: drop redundant ACPI_HANDLE check kernfs: fix hash calculation in kernfs_rename_ns() kernfs: add CONFIG_KERNFS sysfs, kobject: add sysfs wrapper for kernfs_enable_ns() ...
2014-04-01ext2: acl: remove unneeded include of linux/capability.hJakub Sitnicki
acl.c has not been (directly) using the interface defined by linux/capability.h header since commit 3bd858ab1c451725c07a ("Introduce is_owner_or_cap() to wrap CAP_FOWNER use with fsuid check"). Remove it. Signed-off-by: Jakub Sitnicki <jsitnicki@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>