path: root/fs
2017-04-20 NFS: Clean up _nfs4_proc_exchange_id() (Anna Schumaker)
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 NFS: Clean up nfs4_proc_bind_one_conn_to_session() (Anna Schumaker)
Returning errors directly even lets us remove the goto. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 NFS: Remove extra dprintk()s from nfs4namespace.c (Anna Schumaker)
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 NFS: Clean up nfs4_get_rootfh() (Anna Schumaker)
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 NFS: Remove extra dprintk()s from nfs4client.c (Anna Schumaker)
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 NFS: Clean up nfs4_init_server() (Anna Schumaker)
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 NFS: Clean up nfs4_set_client() (Anna Schumaker)
If we cut out the dprintk()s, then we can return error codes directly and cut out the goto. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
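The same transformation recurs throughout this cleanup series. A minimal userspace sketch of the before/after shape, with hypothetical names (struct client, alloc_resources(), connect_client()) standing in for the real NFS code:

#include <stdio.h>

struct client { int id; };

/* Stubs standing in for the real setup steps. */
static int alloc_resources(struct client *clp) { (void)clp; return 0; }
static int connect_client(struct client *clp) { (void)clp; return 0; }

/* Before: every failure funnels through a label just so it can be logged. */
static int setup_client_old(struct client *clp)
{
        int error;

        error = alloc_resources(clp);
        if (error)
                goto out;
        error = connect_client(clp);
out:
        fprintf(stderr, "%s: returning %d\n", __func__, error);
        return error;
}

/* After: with the logging gone, each error is returned where it happens. */
static int setup_client_new(struct client *clp)
{
        int error = alloc_resources(clp);

        if (error)
                return error;
        return connect_client(clp);
}

int main(void)
{
        struct client c = { .id = 1 };

        return setup_client_old(&c) || setup_client_new(&c);
}

Both variants behave identically; the second simply has one less jump target to read around.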
2017-04-20 NFS: Clean up nfs4_check_server_scope() (Anna Schumaker)
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 NFS: Clean up nfs4_check_serverowner_major_id() (Anna Schumaker)
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 NFS: Create a common nfs4_match_client() function (Anna Schumaker)
This puts all the common code in a single place for the walk_client_list() functions. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 NFS: Clean up nfs4_check_serverowner_minor_id() (Anna Schumaker)
Once again, we can remove the function and compare integer values directly. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 NFS: Clean up nfs4_match_clientids() (Anna Schumaker)
If we cut out the dprintk()s, then we don't even need this to be a separate function. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 NFS: Clean up nfs42_layoutstat_done() (Anna Schumaker)
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 NFS: Remove extra dprintk()s from namespace.c (Anna Schumaker)
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 NFS: Clean up nfs_direct_commit_complete() (Anna Schumaker)
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 NFS: Remove nfs_direct_readpage_release() (Anna Schumaker)
Just remove the function and have the caller use nfs_release_request() instead. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 NFS: Clean up extra dprintk()s in client.c (Anna Schumaker)
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 NFS: Clean up nfs_init_client() (Anna Schumaker)
We always call nfs_mark_client_ready() even if nfs_create_rpc_client() returns an error, so we can rearrange nfs_init_client() to mark the client ready from a single place. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 NFS: Remove extra dprintk()s from callback_xdr.c (Anna Schumaker)
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 NFS: Clean up encode_cb_sequence_res() (Anna Schumaker)
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 NFS: Clean up decode_notify_lock_args() (Anna Schumaker)
Let's cut out the goto and return any errors immediately. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 NFS: Clean up decode_cb_sequence_args() (Anna Schumaker)
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 NFS: Clean up decode_layoutrecall_args() (Anna Schumaker)
Additionally, this change lets us cut out the goto by returning errors immediately. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 NFS: Clean up decode_recall_args() (Anna Schumaker)
Removing the dprintk() lets us simplify the function by returning status codes directly, rather than using a goto. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 NFS: Clean up decode_getattr_args() (Anna Schumaker)
Removing the dprintk() lets us return the status value directly, rather than jumping to a label if an error occurs. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 NFS: Remove extra dprintk()s from callback_proc.c (Anna Schumaker)
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 NFS: Clean up nfs4_callback_layoutrecall() (Anna Schumaker)
In addition to removing the dprintk(), this patch also initializes "res" to the default return value instead of doing this through an else condition. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 NFS: Clean up do_callback_layoutrecall() (Anna Schumaker)
Removing the dprintk()s lets us simplify the function by removing the else condition entirely and returning the status of initiate_{file,bulk}_draining() directly. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 nfs: flexfilelayout: remove v3-only data server limitation (Tigran Mkrtchyan)
Flexfilelayout supports data servers which talk NFS v3 and v4.{0,1,2}. However, this code path is disabled and only v3 data servers are accepted. This change removes that limitation. Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 NFS: switch back to ->iterate() (Benjamin Coddington)
NFS has some optimizations for readdir to choose between using READDIR or READDIRPLUS based on workload, and which NFS operation to use is determined by subsequent interactions with lookup, d_revalidate, and getattr.

Concurrent use of nfs_readdir() via ->iterate_shared() can cause those optimizations to repeatedly invalidate the pagecache used to store directory entries during readdir(), which causes some very bad performance for directories with many entries (more than about 10000).

There are a couple of ways to fix this in NFS, but no fix would be as simple as going back to ->iterate() to serialize nfs_readdir(), and neither fix I tested performed as well as going back to ->iterate(). The first required taking the directory's i_lock for each entry, with the result of terrible contention. The second way adds another flag to the nfs_inode, and so keeps the optimizations working for large directories. The difference from using ->iterate() here is that much more memory is consumed for a given workload without any performance gain.

The workings of nfs_readdir() are such that concurrent users are serialized within read_cache_page() waiting to retrieve pages of entries from the server. By serializing this work in iterate_dir() instead, contention for cache pages is reduced. Waiting processes can have an uncontended pass at the entirety of the directory's pagecache once previous processes have completed filling it.

v2 - Keep the bits needed for parallel lookup

Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20 ovl: check IS_APPEND() on real upper inode (Amir Goldstein)
For overlay file open, check IS_APPEND() on the real upper inode inside d_real(), because the overlay inode does not have the S_APPEND flag and IS_APPEND() can only be checked at open time. Note that because overlayfs does not copy up the chattr inode flags (i.e. S_APPEND, S_IMMUTABLE), the IS_APPEND() check is only relevant for upper inodes that were set with chattr +a and not for lower inodes that had chattr +a before copy up. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-04-20 vfs: ftruncate check IS_APPEND() on real upper inode (Amir Goldstein)
ftruncate() of an overlayfs inode was checking IS_APPEND() on the overlay inode, but the overlay inode does not have the S_APPEND flag. Check IS_APPEND() on the real upper inode instead. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
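The user-visible rule being enforced is that truncation must fail on an append-only file. A small, hypothetical userspace check of that behaviour (the target file must first be made append-only with "chattr +a" as root; before this fix the check could be missed when the file was reached through an overlayfs mount):

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
        if (argc != 2) {
                fprintf(stderr, "usage: %s <append-only-file>\n", argv[0]);
                return 1;
        }

        /* Append-only files may be opened for appending writes... */
        int fd = open(argv[1], O_WRONLY | O_APPEND);
        if (fd < 0) {
                perror("open");
                return 1;
        }

        /* ...but truncating them must be refused. */
        if (ftruncate(fd, 0) < 0)
                printf("ftruncate: %s (EPERM expected)\n", strerror(errno));
        else
                printf("ftruncate succeeded: append-only was not enforced!\n");

        close(fd);
        return 0;
}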
2017-04-20 ovl: Use designated initializers (Kees Cook)
Prepare to mark sensitive kernel structures for randomization by making sure they're using designated initializers. These were identified during allyesconfig builds of x86, arm, and arm64, with most initializer fixes extracted from grsecurity. For these cases, use { }, which will be zero-filled, instead of undesignated NULLs. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
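For readers unfamiliar with the idiom, here is a standalone C sketch (hypothetical struct and functions, not the overlayfs code) contrasting positional initializers with the designated and "{ }" forms the patch switches to:

#include <stdio.h>

struct ops {
        int (*open)(void);
        int (*release)(void);
        int flags;
};

static int demo_open(void) { return 0; }

/* Positional initializer: meaning depends on member order, and skipped
 * members must be padded out with explicit NULLs/zeros. */
static const struct ops legacy = { demo_open, NULL, 1 };

/* Designated initializer: members are named, anything unspecified is
 * zero-filled, and the initializer survives structure reordering. */
static const struct ops modern = {
        .open  = demo_open,
        .flags = 1,
};

/* Empty designated initializer: the whole struct is zero-filled; this is
 * the "{ }" form used in place of undesignated NULLs. */
static const struct ops empty = { };

int main(void)
{
        printf("legacy.flags=%d modern.flags=%d empty.open=%p\n",
               legacy.flags, modern.flags, (void *)empty.open);
        return 0;
}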
2017-04-20 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (David S. Miller)
A function in kernel/bpf/syscall.c which got a bug fix in 'net' was moved to kernel/bpf/verifier.c in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20 Annotate hardware config module parameters in fs/pstore/ (David Howells)
When the kernel is running in secure boot mode, we lock down the kernel to prevent userspace from modifying the running kernel image. Whilst this includes prohibiting access to things like /dev/mem, it must also prevent access by means of configuring driver modules in such a way as to cause a device to access or modify the kernel image.

To this end, annotate module_param* statements that refer to hardware configuration and indicate for future reference what type of parameter they specify. The parameter parser in the core sees this information and can skip such parameters with an error message if the kernel is locked down. The module initialisation then runs as normal, but just sees whatever the default values for those parameters are.

Note that we do still need to do the module initialisation because some drivers have viable defaults set in case parameters aren't specified and some drivers support automatic configuration (e.g. PNP or PCI) in addition to manually coded parameters.

This patch annotates drivers in fs/pstore/.

Suggested-by: Alan Cox <gnomes@lxorguk.ukuu.org.uk> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Kees Cook <keescook@chromium.org> cc: Anton Vorontsov <anton@enomsg.org> cc: Colin Cross <ccross@android.com> cc: Tony Luck <tony.luck@intel.com>
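The annotation mechanism referred to here is the module_param_hw*() family of macros. A rough sketch of how a driver might use it, assuming a hypothetical module and parameter (this is not the actual fs/pstore/ change):

#include <linux/init.h>
#include <linux/module.h>
#include <linux/moduleparam.h>

static unsigned long io_base;

/* Declare io_base as hardware configuration (hwtype is one of ioport,
 * iomem, irq, dma or other) so a locked-down kernel can reject the
 * user-supplied value while the module still loads with its default. */
module_param_hw(io_base, ulong, iomem, 0444);
MODULE_PARM_DESC(io_base, "Base address of the device's register window");

static int __init hwparam_demo_init(void)
{
        pr_info("hwparam_demo: io_base=0x%lx\n", io_base);
        return 0;
}

static void __exit hwparam_demo_exit(void)
{
}

module_init(hwparam_demo_init);
module_exit(hwparam_demo_exit);

MODULE_DESCRIPTION("Illustration of annotating a hardware config parameter");
MODULE_LICENSE("GPL");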
2017-04-19 Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6 (Linus Torvalds)
Pull CIFS fix from Steve French: "One more cifs fix for stable" * 'for-next' of git://git.samba.org/sfrench/cifs-2.6: cifs: Do not send echoes before Negotiate is complete
2017-04-19 nsfs: mark dentry with DCACHE_RCUACCESS (Cong Wang)
Andrey reported a use-after-free in __ns_get_path():

  spin_lock include/linux/spinlock.h:299 [inline]
  lockref_get_not_dead+0x19/0x80 lib/lockref.c:179
  __ns_get_path+0x197/0x860 fs/nsfs.c:66
  open_related_ns+0xda/0x200 fs/nsfs.c:143
  sock_ioctl+0x39d/0x440 net/socket.c:1001
  vfs_ioctl fs/ioctl.c:45 [inline]
  do_vfs_ioctl+0x1bf/0x1780 fs/ioctl.c:685
  SYSC_ioctl fs/ioctl.c:700 [inline]
  SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691

We are under rcu read lock protection at that point:

  rcu_read_lock();
  d = atomic_long_read(&ns->stashed);
  if (!d)
          goto slow;
  dentry = (struct dentry *)d;
  if (!lockref_get_not_dead(&dentry->d_lockref))
          goto slow;
  rcu_read_unlock();

but we don't use a proper RCU API on the free path; therefore a parallel __d_free() could free it at the same time. We need to mark the stashed dentry with DCACHE_RCUACCESS so that __d_free() will be called after all readers leave RCU.

Fixes: e149ed2b805f ("take the targets of /proc/*/ns/* symlinks to separate fs") Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-19 jffs2: fix spelling mistake: "requestied" -> "requested" (Colin Ian King)
trivial fix to spelling mistake in JFFS2_ERROR message Signed-off-by: Colin Ian King <colin.king@canonical.com> [Brian: also fix 'an' -> 'a'] Signed-off-by: Brian Norris <computersforpeace@gmail.com>
2017-04-19 f2fs: introduce async IPU policy (Hou Pengyang)
This patch introduces an ASYNC IPU policy. In a scenario with a large number of asynchronous updates (e.g. log writing on Android), the disk becomes seriously fragmented and more frequent GC is triggered. This patch uses IPU to rewrite the asynchronous update writes, since async writes are not sensitive to I/O latency. Signed-off-by: Hou Pengyang <houpengyang@huawei.com>
2017-04-19 f2fs: add undiscard blocks stat (Chao Yu)
This patch adds accounting of undiscarded blocks. Signed-off-by: Chao Yu <yuchao0@huawei.com>
2017-04-19 f2fs: unlock cp_rwsem early for IPU writes (Chao Yu)
For IPU writes, there won't be any updates in the dnode page since we will reuse the old block address instead of allocating a new one, so we don't need to lock cp_rwsem during IPU I/O submission. Signed-off-by: Chao Yu <yuchao0@huawei.com>
2017-04-19 f2fs: introduce __check_rb_tree_consistence (Chao Yu)
Introduce __check_rb_tree_consistence to check the consistency of the rb-tree based discard cache at runtime. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-04-19 f2fs: trace __submit_discard_cmd (Chao Yu)
Add an event class f2fs_discard for introducing f2fs_queue_discard, then use f2fs_{queue,issue}_discard to trace __{queue,submit}_discard_cmd. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-04-19 f2fs: in prior to issue big discard (Chao Yu)
Keep issuing large discards first, instead of whichever size happens to come next, which we expect will help to: - recycle unused large space in the flash storage device quickly. - give a chance to a) wait for small discards to be merged into bigger ones, or b) avoid issuing discards for blocks that have since been reallocated by SSR. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-04-19 f2fs: clean up discard_cmd_control structure (Chao Yu)
Avoid long variable names in the discard_cmd_control structure; no logic change. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-04-19 f2fs: use rb-tree to track pending discard commands (Chao Yu)
Introduce an rb-tree based discard cache infrastructure to speed up lookup and merge operations on discard entries. Signed-off-by: Chao Yu <yuchao0@huawei.com> [Jaegeuk Kim: initialize dc to avoid build warning] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
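The point of keeping pending discards in one ordered structure is that adjacent or overlapping ranges can be merged cheaply as they are queued. A userspace sketch of that merge-on-insert idea (a sorted linked list is used here only to keep the example short; the real cache is keyed by start block in an rb-tree for O(log n) lookup):

#include <stdio.h>
#include <stdlib.h>

struct discard_range {
        unsigned long long start;  /* first block to discard */
        unsigned long long len;    /* number of blocks */
        struct discard_range *next;
};

static struct discard_range *head;

static unsigned long long range_end(const struct discard_range *r)
{
        return r->start + r->len;
}

/* Insert [start, start+len) keeping the list sorted by start block and
 * merging every range that touches or overlaps the new one. */
static void queue_discard(unsigned long long start, unsigned long long len)
{
        struct discard_range *prev = NULL, *cur = head, *nr;

        while (cur && cur->start < start) {
                prev = cur;
                cur = cur->next;
        }

        if (prev && range_end(prev) >= start) {
                /* Extend the previous range instead of allocating. */
                if (start + len > range_end(prev))
                        prev->len = start + len - prev->start;
                nr = prev;
        } else {
                nr = malloc(sizeof(*nr));
                if (!nr)
                        return;
                nr->start = start;
                nr->len = len;
                nr->next = cur;
                if (prev)
                        prev->next = nr;
                else
                        head = nr;
        }

        /* Absorb any following ranges the (possibly grown) range now reaches. */
        while (nr->next && range_end(nr) >= nr->next->start) {
                struct discard_range *victim = nr->next;

                if (range_end(victim) > range_end(nr))
                        nr->len = range_end(victim) - nr->start;
                nr->next = victim->next;
                free(victim);
        }
}

int main(void)
{
        queue_discard(100, 50);
        queue_discard(10, 20);
        queue_discard(150, 10);  /* extends [100,150) to [100,160) */
        queue_discard(25, 80);   /* bridges [10,30) and [100,160) */

        for (struct discard_range *r = head; r; r = r->next)
                printf("discard %llu..%llu\n", r->start, range_end(r) - 1);
        return 0;  /* prints a single merged range: discard 10..159 */
}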
2017-04-19 GFS2: Non-recursive delete (Bob Peterson)
Implement truncate/delete as a non-recursive algorithm. The older algorithm was implemented with recursion to strip off each layer at a time (going by height, starting with the maximum height). This version tries to do the same thing but without recursion, and without needing to allocate new structures or lists in memory.

For example, say you want to truncate a very large file to 1 byte, and its end-of-file metapath is: 0.505.463.428. The starting metapath would be 0.0.0.0. Since it's a truncate to non-zero, it needs to preserve that byte, and all metadata pointing to it. So it would start at 0.0.0.0, look up all its metadata buffers, then free all data blocks pointed to at the highest level. After that buffer is "swept", it moves on to 0.0.0.1, then 0.0.0.2, etc., reading in buffers and sweeping them clean. When it gets to the end of the 0.0.0 metadata buffer (for 4K blocks the last valid one is 0.0.0.508), it backs up to the previous height and starts working on 0.0.1.0, then 0.0.1.1, and so forth. After it reaches the end and sweeps 0.0.1.508, it continues with 0.0.2.0, and so on. When that height is exhausted, and it reaches 0.0.508.508, it backs up another level, to 0.1.0.0, then 0.1.0.1, through 0.1.0.508. So it has to keep marching backwards and forwards through the metadata until it's all swept clean.

Once it has all the data blocks freed, it lowers the strip height, and begins the process all over again, but with one less height. This time it sweeps 0.0.0 through 0.505.463. When that's clean, it lowers the strip height again and works to free 0.505. Eventually it strips the lowest height, 0. For a delete or truncate to 0, all metadata for all heights of 0.0.0.0 would be freed. For a truncate to 1 byte, 0.0.0.0 would be preserved.

This isn't much different from normal integer incrementing, where an integer gets incremented from 0000 (0.0.0.0) to 3021 (3.0.2.1). So 0000 gets incremented to 0001, 0002, up to 0009, then on to 0010, 0011 up to 0099, then 0100 and so forth. It's just that each "digit" goes from 0 to 508 (for a total of 509 pointers) rather than from 0 to 9. Note that the dinode will only have 483 pointers due to the dinode structure itself. Also note: this is just an example. These numbers (509 and 483) are based on a standard 4K block size. Smaller block sizes will yield smaller numbers of indirect pointers accordingly.

The truncation process is accomplished with the help of two major functions and a few helper functions. Functions do_strip and recursive_scan are obsolete, so they are removed.

New function sweep_bh_for_rgrps cleans a buffer_head pointed to by the given metapath and height. By cleaning, I mean it frees all blocks starting at the offset passed in metapath. It starts at the first block in the buffer pointed to by the metapath and identifies its resource group (rgrp). From there it frees all subsequent block pointers that lie within that rgrp. If it's already inside a transaction, it stays within it as long as it can. In other words, it doesn't close a transaction until it knows it's freed what it can from the resource group. In this way, multiple buffers may be cleaned in a single transaction, as long as those blocks in the buffer all lie within the same rgrp. If it's not in a transaction, it starts one. If the buffer_head has references to blocks within multiple rgrps, it frees all the blocks inside the first rgrp it finds, then closes the transaction. Then it repeats the cycle: identifies the next unfreed block, uses it to find its rgrp, then starts a new transaction for that set. It repeats this process until the buffer_head contains no more references to any blocks past the given metapath.

Function trunc_dealloc has been reworked into a finite state automaton. It has basically 3 active states: DEALLOC_MP_FULL, DEALLOC_MP_LOWER, and DEALLOC_FILL_MP.

The DEALLOC_MP_FULL state implies the metapath has a full set of buffers out to the "shrink height", and therefore, it can call function sweep_bh_for_rgrps to free the blocks within the highest height of the metapath. If it's just swept the lowest level (or an error has occurred) the state machine is ended. Otherwise it proceeds to the DEALLOC_MP_LOWER state.

The DEALLOC_MP_LOWER state implies we are finished with a given buffer_head, which may now be released, and therefore we are then missing some buffer information from the metapath. So we need to find more buffers to read in. In most cases, this is just a matter of releasing the buffer_head and moving to the next pointer from the previous height, so it may be read in and swept as well. If it can't find another non-null pointer to process, it checks whether it's reached the end of a height and needs to lower the strip height, or whether it still needs to move forward through the previous height's metadata. In this state, all zero-pointers are skipped. From this state, it can only loop around (once more backing up another height) or, once a valid metapath is found (one that has non-zero pointers), proceed to state DEALLOC_FILL_MP.

The DEALLOC_FILL_MP state implies that we have a metapath but not all its buffers are read in. So we must proceed to read in buffer_heads until the metapath has a valid buffer for every height. If the previous state backed us up 3 heights, we may need to read in a buffer, increment the height, then repeat the process until buffers have been read in for all required heights. If it's successful reading a buffer, and it's at the highest height we need, it proceeds back to the DEALLOC_MP_FULL state. If it's unable to fill in a buffer (encounters a hole, etc.), it tries to find another non-zero block pointer. If they're all zero, it lowers the height and returns to the DEALLOC_MP_LOWER state. If it finds a good non-null pointer, it loops around and reads it in, while keeping the metapath in lock-step with the pointers it examines.

The state machine runs until the truncation request is satisfied. Then any transactions are ended, the quota and statfs data are updated, and the function is complete.

Helper function metaptr1 was introduced to be an easy way to determine the start of a buffer_head's indirect pointers. Helper function lookup_mp_height was introduced to find a metapath index and read in the buffer that corresponds to it. In this way, function lookup_metapath becomes a simple loop to call it for every height. Helper function fillup_metapath is similar to lookup_metapath except it can do partial lookups. If the state machine backed up multiple levels (like 2999 wrapping to 3000), it needs to find out the next starting point and start issuing metadata reads at that point. Helper function hptrs is a shortcut to determine how many pointers should be expected in a buffer. Height 0 is the dinode which has fewer pointers than the others.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
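The "integer incrementing" analogy above can be made concrete in a few lines of C: treat the metapath as a multi-digit counter whose digits wrap at the number of pointers per block. The radices below (483 for the dinode, 509 for indirect blocks) are taken from the 4K-block example in the message; this is a sketch of the analogy only, not GFS2 code:

#include <stdbool.h>
#include <stdio.h>

#define MAX_HEIGHT 4

static const unsigned int ptrs_per_height[MAX_HEIGHT] = {
        483, 509, 509, 509      /* dinode, then indirect blocks */
};

/* Advance mp to the next metapath; returns false once every slot wraps,
 * i.e. the whole region below the starting point has been visited. */
static bool metapath_next(unsigned int mp[MAX_HEIGHT])
{
        for (int h = MAX_HEIGHT - 1; h >= 0; h--) {
                if (++mp[h] < ptrs_per_height[h])
                        return true;    /* no carry needed */
                mp[h] = 0;              /* wrap this digit, carry to the parent */
        }
        return false;
}

int main(void)
{
        unsigned int mp[MAX_HEIGHT] = { 0, 0, 0, 507 };

        for (int i = 0; i < 4; i++) {
                printf("%u.%u.%u.%u\n", mp[0], mp[1], mp[2], mp[3]);
                metapath_next(mp);
        }
        /* Prints 0.0.0.507, 0.0.0.508, 0.0.1.0, 0.0.1.1 */
        return 0;
}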
2017-04-19 quota: Remove dquot_quotactl_ops (Jan Kara)
Nobody uses them anymore. Signed-off-by: Jan Kara <jack@suse.cz>
2017-04-19 reiserfs: Remove i_attrs_to_sd_attrs() (Jan Kara)
Now that all places setting inode->i_flags that should be reflected in on-disk flags are gone, we can remove the i_attrs_to_sd_attrs() call. Signed-off-by: Jan Kara <jack@suse.cz>
2017-04-19 reiserfs: Remove useless setting of i_flags (Jan Kara)
reiserfs_new_inode() clears the IMMUTABLE and APPEND flags from a symlink's i_flags; however, a few lines below in sd_attrs_to_i_attrs() we will happily overwrite i_flags with whatever we inherited from the directory. Since this behavior has been there for ages, just remove the useless setting of i_flags. Signed-off-by: Jan Kara <jack@suse.cz>