summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2015-12-089p: ->evict_inode() should kick out ->i_data, not ->i_mappingAl Viro
For block devices the pagecache is associated with the inode on bdevfs, not with the aliasing ones on the mountable filesystems. The latter have its own ->i_data empty and ->i_mapping pointing to the (unique per major/minor) bdevfs inode. That guarantees cache coherence between all block device inodes with the same device number. Eviction of an alias inode has no business trying to evict the pages belonging to bdevfs one; moreover, ->i_mapping is only safe to access when the thing is opened. At the time of ->evict_inode() the victim is definitely *not* opened. We are about to kill the address space embedded into struct inode (inode->i_data) and that's what we need to empty of any pages. 9p instance tries to empty inode->i_mapping instead, which is both unsafe and bogus - if we have several device nodes with the same device number in different places, closing one of them should not try to empty the (shared) page cache. Fortunately, other instances in the tree are OK; they are evicting from &inode->i_data instead, as 9p one should. Cc: stable@vger.kernel.org # v2.6.32+, ones prior to 2.6.36 need only half of that Reported-by: "Suzuki K. Poulose" <Suzuki.Poulose@arm.com> Tested-by: "Suzuki K. Poulose" <Suzuki.Poulose@arm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-08cifs: avoid unused variable and labelArnd Bergmann
The newly introduced cifs_clone_file_range() function produces two harmless compile-time warnings: cifsfs.c: In function 'cifs_clone_file_range': cifsfs.c:963:1: warning: label 'out_unlock' defined but not used [-Wunused-label] cifsfs.c:924:20: warning: unused variable 'src_tcon' [-Wunused-variable] In both cases, removing the extraneous line avoids the warning. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: c6f2a1e2e5f8 ("vfs: pull btrfs clone API to vfs layer") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-08Documentation: filesystem: Fix typo in fs/eventfd.cMasanari Iida
This patch fix typos found in Documentation/filesystems.xml, DocBook/filesystems/API-eventfd-signal.html, and DocBook/filesystems.aux.xml These files are generated from comments within the source, so I had to fix typos in fs/eventfd.c Signed-off-by: Masanari Iida <standby24x7@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2015-12-08fs/super.c: use && instead of & for warn_on conditionVincent Stehlé
This fixes the following sparse warning: fs/super.c:1202:9: warning: dubious: x & !y Bitwise and logical and are equivalent here, but logical was intended. The generated code is identical, with and without CONFIG_LOCKDEP. Signed-off-by: Vincent Stehlé <vincent.stehle@freescale.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2015-12-07nfsd: implement the NFSv4.2 CLONE operationChristoph Hellwig
This is basically a remote version of the btrfs CLONE operation, so the implementation is fairly trivial. Made even more trivial by stealing the XDR code and general framework Anna Schumaker's COPY prototype. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-07nfsd: Pass filehandle to nfs4_preprocess_stateid_op()Anna Schumaker
This will be needed so COPY can look up the saved_fh in addition to the current_fh. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-07vfs: pull btrfs clone API to vfs layerChristoph Hellwig
The btrfs clone ioctls are now adopted by other file systems, with NFS and CIFS already having support for them, and XFS being under active development. To avoid growth of various slightly incompatible implementations, add one to the VFS. Note that clones are different from file copies in several ways: - they are atomic vs other writers - they support whole file clones - they support 64-bit legth clones - they do not allow partial success (aka short writes) - clones are expected to be a fast metadata operation Because of that it would be rather cumbersome to try to piggyback them on top of the recent clone_file_range infrastructure. The converse isn't true and the clone_file_range system call could try clone file range as a first attempt to copy, something that further patches will enable. Based on earlier work from Peng Tao. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-07locks: new locks_mandatory_area calling conventionChristoph Hellwig
Pass a loff_t end for the last byte instead of the 32-bit count parameter to allow full file clones even on 32-bit architectures. While we're at it also simplify the read/write selection. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-07SUNRPC: Fix callback channelTrond Myklebust
The NFSv4.1 callback channel is currently broken because the receive message will keep shrinking because the backchannel receive buffer size never gets reset. The easiest solution to this problem is instead of changing the receive buffer, to rather adjust the copied request. Fixes: 38b7631fbe42 ("nfs4: limit callback decoding to received bytes") Cc: Benjamin Coddington <bcodding@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-07ext4: use pre-zeroed blocks for DAX page faultsJan Kara
Make DAX fault path use pre-zeroed blocks to avoid races with extent conversion and zeroing when two page faults to the same block happen. Signed-off-by: Jan Kara <jack@suse.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-12-07ext4: implement allocation of pre-zeroed blocksJan Kara
DAX page fault path needs to get blocks that are pre-zeroed to avoid races when two concurrent page faults happen in the same block of a file. Implement support for this in ext4_map_blocks(). Signed-off-by: Jan Kara <jack@suse.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-12-07ext4: provide ext4_issue_zeroout()Jan Kara
Create new function ext4_issue_zeroout() to zeroout contiguous (both logically and physically) part of inode data. We will need to issue zeroout when extent structure is not readily available and this function will allow us to do it without making up fake extent structures. Signed-off-by: Jan Kara <jack@suse.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-12-07ext4: get rid of EXT4_GET_BLOCKS_NO_LOCK flagJan Kara
When dioread_nolock mode is enabled, we grab i_data_sem in ext4_ext_direct_IO() and therefore we need to instruct _ext4_get_block() not to grab i_data_sem again using EXT4_GET_BLOCKS_NO_LOCK. However holding i_data_sem over overwrite direct IO isn't needed these days. We have exclusion against truncate / hole punching because we increase i_dio_count under i_mutex in ext4_ext_direct_IO() so once ext4_file_write_iter() verifies blocks are allocated & written, they are guaranteed to stay so during the whole direct IO even after we drop i_mutex. So we can just remove this locking abuse and the no longer necessary EXT4_GET_BLOCKS_NO_LOCK flag. Signed-off-by: Jan Kara <jack@suse.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-12-07ext4: document lock orderingJan Kara
We have enough locks that it's probably worth documenting the lock ordering rules we have in ext4. Signed-off-by: Jan Kara <jack@suse.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-12-07ext4: fix races of writeback with punch hole and zero rangeJan Kara
When doing delayed allocation, update of on-disk inode size is postponed until IO submission time. However hole punch or zero range fallocate calls can end up discarding the tail page cache page and thus on-disk inode size would never be properly updated. Make sure the on-disk inode size is updated before truncating page cache. Signed-off-by: Jan Kara <jack@suse.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-12-07ext4: fix races between buffered IO and collapse / insert rangeJan Kara
Current code implementing FALLOC_FL_COLLAPSE_RANGE and FALLOC_FL_INSERT_RANGE is prone to races with buffered writes and page faults. If buffered write or write via mmap manages to squeeze between filemap_write_and_wait_range() and truncate_pagecache() in the fallocate implementations, the written data is simply discarded by truncate_pagecache() although it should have been shifted. Fix the problem by moving filemap_write_and_wait_range() call inside i_mutex and i_mmap_sem. That way we are protected against races with both buffered writes and page faults. Signed-off-by: Jan Kara <jack@suse.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-12-07ext4: move unlocked dio protection from ext4_alloc_file_blocks()Jan Kara
Currently ext4_alloc_file_blocks() was handling protection against unlocked DIO. However we now need to sometimes call it under i_mmap_sem and sometimes not and DIO protection ranks above it (although strictly speaking this cannot currently create any deadlocks). Also ext4_zero_range() was actually getting & releasing unlocked DIO protection twice in some cases. Luckily it didn't introduce any real bug but it was a land mine waiting to be stepped on. So move DIO protection out from ext4_alloc_file_blocks() into the two callsites. Signed-off-by: Jan Kara <jack@suse.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-12-07ext4: fix races between page faults and hole punchingJan Kara
Currently, page faults and hole punching are completely unsynchronized. This can result in page fault faulting in a page into a range that we are punching after truncate_pagecache_range() has been called and thus we can end up with a page mapped to disk blocks that will be shortly freed. Filesystem corruption will shortly follow. Note that the same race is avoided for truncate by checking page fault offset against i_size but there isn't similar mechanism available for punching holes. Fix the problem by creating new rw semaphore i_mmap_sem in inode and grab it for writing over truncate, hole punching, and other functions removing blocks from extent tree and for read over page faults. We cannot easily use i_data_sem for this since that ranks below transaction start and we need something ranking above it so that it can be held over the whole truncate / hole punching operation. Also remove various workarounds we had in the code to reduce race window when page fault could have created pages with stale mapping information. Signed-off-by: Jan Kara <jack@suse.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-12-07Merge tag 'ext4_for_linus_stable' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "Ext4 bug fixes for v4.4, including fixes for post-2038 time encodings, some endian conversion problems with ext4 encryption, potential memory leaks after truncate in data=journal mode, and an ocfs2 regression caused by a jbd2 performance improvement" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: jbd2: fix null committed data return in undo_access ext4: add "static" to ext4_seq_##name##_fops struct ext4: fix an endianness bug in ext4_encrypted_follow_link() ext4: fix an endianness bug in ext4_encrypted_zeroout() jbd2: Fix unreclaimed pages after truncate in data=journal mode ext4: Fix handling of extended tv_sec
2015-12-07btrfs: make set_range_writeback return voidDavid Sterba
Does not return any errors, nor anything from the callgraph. There's a BUG_ON but it's a sanity check and not an error condition we could recover from. Signed-off-by: David Sterba <dsterba@suse.com>
2015-12-07btrfs: make extent_range_redirty_for_io return voidDavid Sterba
Does not return any errors, nor anything from the callgraph. There's a BUG_ON but it's a sanity check and not an error condition we could recover from. Signed-off-by: David Sterba <dsterba@suse.com>
2015-12-07btrfs: make extent_range_clear_dirty_for_io return voidDavid Sterba
Does not return any errors, nor anything from the callgraph. There's a BUG_ON but it's a sanity check and not an error condition we could recover from. Signed-off-by: David Sterba <dsterba@suse.com>
2015-12-07btrfs: make end_extent_writepage return voidDavid Sterba
Does not return any errors, nor anything from the callgraph. The branch in end_bio_extent_writepage has been skipped since 5fd02043553b ("Btrfs: finish ordered extents in their own thread"). Signed-off-by: David Sterba <dsterba@suse.com>
2015-12-07btrfs: make extent_clear_unlock_delalloc return voidDavid Sterba
Does not return any errors, nor anything from the callgraph. Signed-off-by: David Sterba <dsterba@suse.com>
2015-12-07btrfs: make clear_extent_buffer_uptodate return voidDavid Sterba
Does not return any errors, nor anything from the callgraph. Signed-off-by: David Sterba <dsterba@suse.com>
2015-12-07btrfs: make set_extent_buffer_uptodate return voidDavid Sterba
Does not return any errors, nor anything from the callgraph. Signed-off-by: David Sterba <dsterba@suse.com>
2015-12-07btrfs: remove a trivial helper btrfs_set_buffer_uptodateDavid Sterba
Signed-off-by: David Sterba <dsterba@suse.com>
2015-12-06xfs: Change how listxattr generates synthetic attributesAndreas Gruenbacher
Instead of adding the synthesized POSIX ACL attribute names after listing all non-synthesized attributes, generate them immediately when listing the non-synthesized attributes. In addition, merge xfs_xattr_put_listent and xfs_xattr_put_listent_sizes to ensure that the list size is computed correctly; the split version was overestimating the list size for non-root users. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Cc: Dave Chinner <david@fromorbit.com> Cc: xfs@oss.sgi.com Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06tmpfs: listxattr should include POSIX ACL xattrsAndreas Gruenbacher
When a file on tmpfs has an ACL or a Default ACL, listxattr should include the corresponding xattr name. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: James Morris <james.l.morris@oracle.com> Cc: Hugh Dickins <hughd@google.com> Cc: linux-mm@kvack.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06tmpfs: Use xattr handler infrastructureAndreas Gruenbacher
Use the VFS xattr handler infrastructure and get rid of similar code in the filesystem. For implementing shmem_xattr_handler_set, we need a version of simple_xattr_set which removes the attribute when value is NULL. Use this to implement kernfs_iop_removexattr as well. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: James Morris <james.l.morris@oracle.com> Cc: Hugh Dickins <hughd@google.com> Cc: linux-mm@kvack.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06btrfs: Use xattr handler infrastructureAndreas Gruenbacher
Use the VFS xattr handler infrastructure and get rid of similar code in the filesystem. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06vfs: Distinguish between full xattr names and proper prefixesAndreas Gruenbacher
Add an additional "name" field to struct xattr_handler. When the name is set, the handler matches attributes with exactly that name. When the prefix is set instead, the handler matches attributes with the given prefix and with a non-empty suffix. This patch should avoid bugs like the one fixed in commit c361016a in the future. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06posix acls: Remove duplicate xattr name definitionsAndreas Gruenbacher
Remove POSIX_ACL_XATTR_{ACCESS,DEFAULT} and GFS2_POSIX_ACL_{ACCESS,DEFAULT} and replace them with the definitions in <include/uapi/linux/xattr.h>. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06gfs2: Remove gfs2_xattr_acl_chmodAndreas Gruenbacher
Function gfs2_xattr_acl_chmod is unused since commit e01580bf. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: James Morris <james.l.morris@oracle.com> Acked-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06vfs: Remove vfs_xattr_cmpAndreas Gruenbacher
This function was only briefly used in security/integrity/evm, between commits 66dbc325 and 15647eb3. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06restore_nameidata(): no need to clear now->stackAl Viro
microoptimization: in all callers *now is in the frame we are about to leave. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06namei.c: take "jump to root" into a new helperAl Viro
... and use it both in path_init() (for absolute pathnames) and get_link() (for absolute symlinks). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06path_init(): set nd->inode earlier in cwd-relative caseAl Viro
that allows to kill the recheck of nd->seq on the way out in this case, and this check on the way out is left only for absolute pathnames. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06namei.c: fold set_root_rcu() into set_root()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06don't opencode iget_failed()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06f2fs: it's umode_t, not mode_t...Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06typo in fs/namei.c commentMike Marshall
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06coredump: Use 64bit time for unix time of coredumpArnd Bergmann
struct timeval on 32-bit systems will have its tv_sec value overflow in year 2038 and beyond. Use a 64 bit value to print time of the coredump in seconds. ktime_get_real_seconds is chosen here for efficiency reasons. Suggested by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Tina Ruchandani <ruchandani.tina@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06adfs: constify adfs_dir_ops structuresJulia Lawall
The adfs_dir_ops structures are never modified, so declare them as const. Done with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06vfs: show_vfsstat: remove redundant initialization and check of error codeDmitry V. Levin
As err variable is now always checked right after each assignment, its initialization is redundant and could be safely removed. For the same reason, the last check of err is also redundant and could be removed as well. Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06vfs: show_mountinfo: cleanup error code checksDmitry V. Levin
Check err variable right after each assignment. This change makes initialization of err redundant, so remove the initialization. Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06vfs: show_vfsmnt: remove redundant initialization of error codeDmitry V. Levin
As err variable is now always checked right after the first assignment, its initialization is redundant and could be safely removed. Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06fs/bad_inode.c: is_bad_inode can be booleanYaowei Bai
This patch makes is_bad_inode return bool to improve readability due to this particular function only using either one or zero as its return value. No functional change. Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06fs/dcache.c: is_subdir can be booleanYaowei Bai
This patch makes is_subdir return bool to improve readability due to this particular function only using either one or zero as its return value. No functional change. Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06fs/namespace.c: path_is_under can be booleanYaowei Bai
This patch makes path_is_under return bool to improve readability due to this particular function only using either one or zero as its return value. No functional change. Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>