summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2015-12-07ext4: get rid of EXT4_GET_BLOCKS_NO_LOCK flagJan Kara
When dioread_nolock mode is enabled, we grab i_data_sem in ext4_ext_direct_IO() and therefore we need to instruct _ext4_get_block() not to grab i_data_sem again using EXT4_GET_BLOCKS_NO_LOCK. However holding i_data_sem over overwrite direct IO isn't needed these days. We have exclusion against truncate / hole punching because we increase i_dio_count under i_mutex in ext4_ext_direct_IO() so once ext4_file_write_iter() verifies blocks are allocated & written, they are guaranteed to stay so during the whole direct IO even after we drop i_mutex. So we can just remove this locking abuse and the no longer necessary EXT4_GET_BLOCKS_NO_LOCK flag. Signed-off-by: Jan Kara <jack@suse.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-12-07ext4: document lock orderingJan Kara
We have enough locks that it's probably worth documenting the lock ordering rules we have in ext4. Signed-off-by: Jan Kara <jack@suse.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-12-07ext4: fix races of writeback with punch hole and zero rangeJan Kara
When doing delayed allocation, update of on-disk inode size is postponed until IO submission time. However hole punch or zero range fallocate calls can end up discarding the tail page cache page and thus on-disk inode size would never be properly updated. Make sure the on-disk inode size is updated before truncating page cache. Signed-off-by: Jan Kara <jack@suse.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-12-07ext4: fix races between buffered IO and collapse / insert rangeJan Kara
Current code implementing FALLOC_FL_COLLAPSE_RANGE and FALLOC_FL_INSERT_RANGE is prone to races with buffered writes and page faults. If buffered write or write via mmap manages to squeeze between filemap_write_and_wait_range() and truncate_pagecache() in the fallocate implementations, the written data is simply discarded by truncate_pagecache() although it should have been shifted. Fix the problem by moving filemap_write_and_wait_range() call inside i_mutex and i_mmap_sem. That way we are protected against races with both buffered writes and page faults. Signed-off-by: Jan Kara <jack@suse.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-12-07ext4: move unlocked dio protection from ext4_alloc_file_blocks()Jan Kara
Currently ext4_alloc_file_blocks() was handling protection against unlocked DIO. However we now need to sometimes call it under i_mmap_sem and sometimes not and DIO protection ranks above it (although strictly speaking this cannot currently create any deadlocks). Also ext4_zero_range() was actually getting & releasing unlocked DIO protection twice in some cases. Luckily it didn't introduce any real bug but it was a land mine waiting to be stepped on. So move DIO protection out from ext4_alloc_file_blocks() into the two callsites. Signed-off-by: Jan Kara <jack@suse.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-12-07ext4: fix races between page faults and hole punchingJan Kara
Currently, page faults and hole punching are completely unsynchronized. This can result in page fault faulting in a page into a range that we are punching after truncate_pagecache_range() has been called and thus we can end up with a page mapped to disk blocks that will be shortly freed. Filesystem corruption will shortly follow. Note that the same race is avoided for truncate by checking page fault offset against i_size but there isn't similar mechanism available for punching holes. Fix the problem by creating new rw semaphore i_mmap_sem in inode and grab it for writing over truncate, hole punching, and other functions removing blocks from extent tree and for read over page faults. We cannot easily use i_data_sem for this since that ranks below transaction start and we need something ranking above it so that it can be held over the whole truncate / hole punching operation. Also remove various workarounds we had in the code to reduce race window when page fault could have created pages with stale mapping information. Signed-off-by: Jan Kara <jack@suse.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-12-07Merge tag 'ext4_for_linus_stable' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "Ext4 bug fixes for v4.4, including fixes for post-2038 time encodings, some endian conversion problems with ext4 encryption, potential memory leaks after truncate in data=journal mode, and an ocfs2 regression caused by a jbd2 performance improvement" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: jbd2: fix null committed data return in undo_access ext4: add "static" to ext4_seq_##name##_fops struct ext4: fix an endianness bug in ext4_encrypted_follow_link() ext4: fix an endianness bug in ext4_encrypted_zeroout() jbd2: Fix unreclaimed pages after truncate in data=journal mode ext4: Fix handling of extended tv_sec
2015-12-07btrfs: make set_range_writeback return voidDavid Sterba
Does not return any errors, nor anything from the callgraph. There's a BUG_ON but it's a sanity check and not an error condition we could recover from. Signed-off-by: David Sterba <dsterba@suse.com>
2015-12-07btrfs: make extent_range_redirty_for_io return voidDavid Sterba
Does not return any errors, nor anything from the callgraph. There's a BUG_ON but it's a sanity check and not an error condition we could recover from. Signed-off-by: David Sterba <dsterba@suse.com>
2015-12-07btrfs: make extent_range_clear_dirty_for_io return voidDavid Sterba
Does not return any errors, nor anything from the callgraph. There's a BUG_ON but it's a sanity check and not an error condition we could recover from. Signed-off-by: David Sterba <dsterba@suse.com>
2015-12-07btrfs: make end_extent_writepage return voidDavid Sterba
Does not return any errors, nor anything from the callgraph. The branch in end_bio_extent_writepage has been skipped since 5fd02043553b ("Btrfs: finish ordered extents in their own thread"). Signed-off-by: David Sterba <dsterba@suse.com>
2015-12-07btrfs: make extent_clear_unlock_delalloc return voidDavid Sterba
Does not return any errors, nor anything from the callgraph. Signed-off-by: David Sterba <dsterba@suse.com>
2015-12-07btrfs: make clear_extent_buffer_uptodate return voidDavid Sterba
Does not return any errors, nor anything from the callgraph. Signed-off-by: David Sterba <dsterba@suse.com>
2015-12-07btrfs: make set_extent_buffer_uptodate return voidDavid Sterba
Does not return any errors, nor anything from the callgraph. Signed-off-by: David Sterba <dsterba@suse.com>
2015-12-07btrfs: remove a trivial helper btrfs_set_buffer_uptodateDavid Sterba
Signed-off-by: David Sterba <dsterba@suse.com>
2015-12-06xfs: Change how listxattr generates synthetic attributesAndreas Gruenbacher
Instead of adding the synthesized POSIX ACL attribute names after listing all non-synthesized attributes, generate them immediately when listing the non-synthesized attributes. In addition, merge xfs_xattr_put_listent and xfs_xattr_put_listent_sizes to ensure that the list size is computed correctly; the split version was overestimating the list size for non-root users. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Cc: Dave Chinner <david@fromorbit.com> Cc: xfs@oss.sgi.com Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06tmpfs: listxattr should include POSIX ACL xattrsAndreas Gruenbacher
When a file on tmpfs has an ACL or a Default ACL, listxattr should include the corresponding xattr name. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: James Morris <james.l.morris@oracle.com> Cc: Hugh Dickins <hughd@google.com> Cc: linux-mm@kvack.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06tmpfs: Use xattr handler infrastructureAndreas Gruenbacher
Use the VFS xattr handler infrastructure and get rid of similar code in the filesystem. For implementing shmem_xattr_handler_set, we need a version of simple_xattr_set which removes the attribute when value is NULL. Use this to implement kernfs_iop_removexattr as well. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: James Morris <james.l.morris@oracle.com> Cc: Hugh Dickins <hughd@google.com> Cc: linux-mm@kvack.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06btrfs: Use xattr handler infrastructureAndreas Gruenbacher
Use the VFS xattr handler infrastructure and get rid of similar code in the filesystem. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06vfs: Distinguish between full xattr names and proper prefixesAndreas Gruenbacher
Add an additional "name" field to struct xattr_handler. When the name is set, the handler matches attributes with exactly that name. When the prefix is set instead, the handler matches attributes with the given prefix and with a non-empty suffix. This patch should avoid bugs like the one fixed in commit c361016a in the future. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06posix acls: Remove duplicate xattr name definitionsAndreas Gruenbacher
Remove POSIX_ACL_XATTR_{ACCESS,DEFAULT} and GFS2_POSIX_ACL_{ACCESS,DEFAULT} and replace them with the definitions in <include/uapi/linux/xattr.h>. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06gfs2: Remove gfs2_xattr_acl_chmodAndreas Gruenbacher
Function gfs2_xattr_acl_chmod is unused since commit e01580bf. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: James Morris <james.l.morris@oracle.com> Acked-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06vfs: Remove vfs_xattr_cmpAndreas Gruenbacher
This function was only briefly used in security/integrity/evm, between commits 66dbc325 and 15647eb3. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06restore_nameidata(): no need to clear now->stackAl Viro
microoptimization: in all callers *now is in the frame we are about to leave. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06namei.c: take "jump to root" into a new helperAl Viro
... and use it both in path_init() (for absolute pathnames) and get_link() (for absolute symlinks). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06path_init(): set nd->inode earlier in cwd-relative caseAl Viro
that allows to kill the recheck of nd->seq on the way out in this case, and this check on the way out is left only for absolute pathnames. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06namei.c: fold set_root_rcu() into set_root()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06don't opencode iget_failed()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06f2fs: it's umode_t, not mode_t...Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06typo in fs/namei.c commentMike Marshall
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06coredump: Use 64bit time for unix time of coredumpArnd Bergmann
struct timeval on 32-bit systems will have its tv_sec value overflow in year 2038 and beyond. Use a 64 bit value to print time of the coredump in seconds. ktime_get_real_seconds is chosen here for efficiency reasons. Suggested by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Tina Ruchandani <ruchandani.tina@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06adfs: constify adfs_dir_ops structuresJulia Lawall
The adfs_dir_ops structures are never modified, so declare them as const. Done with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06vfs: show_vfsstat: remove redundant initialization and check of error codeDmitry V. Levin
As err variable is now always checked right after each assignment, its initialization is redundant and could be safely removed. For the same reason, the last check of err is also redundant and could be removed as well. Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06vfs: show_mountinfo: cleanup error code checksDmitry V. Levin
Check err variable right after each assignment. This change makes initialization of err redundant, so remove the initialization. Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06vfs: show_vfsmnt: remove redundant initialization of error codeDmitry V. Levin
As err variable is now always checked right after the first assignment, its initialization is redundant and could be safely removed. Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06fs/bad_inode.c: is_bad_inode can be booleanYaowei Bai
This patch makes is_bad_inode return bool to improve readability due to this particular function only using either one or zero as its return value. No functional change. Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06fs/dcache.c: is_subdir can be booleanYaowei Bai
This patch makes is_subdir return bool to improve readability due to this particular function only using either one or zero as its return value. No functional change. Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06fs/namespace.c: path_is_under can be booleanYaowei Bai
This patch makes path_is_under return bool to improve readability due to this particular function only using either one or zero as its return value. No functional change. Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06fs/file.c: __const_max is actually __const_min :-)Rasmus Villemoes
7f4b36f9bb930 "get rid of files_defer_init()" inexplicably changed a min() to a __const_max() - but the __const_max macro actually gives the minimum... So no functional change, just less confusing naming. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06namei: page_getlink() and page_follow_link_light() are the same thingAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06ufs: get rid of ->setattr() for symlinksAl Viro
It was to needed for a couple of months in 2010, until UFS quota support got dropped. Since then it's equivalent to simple_setattr() (i.e. the default) for everything except the regular files. And dropping it there allows to convert all UFS symlinks to {page,simple}_symlink_inode_operations, getting rid of fs/ufs/symlink.c completely. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06udf: don't duplicate page_symlink_inode_operationsAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06logfs: don't duplicate page_symlink_inode_operationsAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06switch befs long symlinks to page_symlink_operationsAl Viro
just give them the right ->readpage()... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: "A couple of fixes (-stable fodder) + dead code removal after the overlayfs fix. I agree that it's better to separate from the fix part to make backporting easier, but IMO it's not worth delaying said dead code removal until the next window" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: Don't reset ->total_link_count on nested calls of vfs_path_lookup() ovl: get rid of the dead code left from broken (and disabled) optimizations ovl: fix permission checking for setattr
2015-12-06Don't reset ->total_link_count on nested calls of vfs_path_lookup()Al Viro
we already zero it on outermost set_nameidata(), so initialization in path_init() is pointless and wrong. The same DoS exists on pre-4.2 kernels, but there a slightly different fix will be needed. Cc: stable@vger.kernel.org # v4.2 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06ovl: get rid of the dead code left from broken (and disabled) optimizationsAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06ovl: fix permission checking for setattrMiklos Szeredi
[Al Viro] The bug is in being too enthusiastic about optimizing ->setattr() away - instead of "copy verbatim with metadata" + "chmod/chown/utimes" (with the former being always safe and the latter failing in case of insufficient permissions) it tries to combine these two. Note that copyup itself will have to do ->setattr() anyway; _that_ is where the elevated capabilities are right. Having these two ->setattr() (one to set verbatim copy of metadata, another to do what overlayfs ->setattr() had been asked to do in the first place) combined is where it breaks. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Cc: <stable@vger.kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-04f2fs: fix to convert inline inode in ->setattrChao Yu
In commit 3c4541452748 ("f2fs: do not trim preallocated blocks when truncating after i_size"), in order to follow the regulation: "truncate(x) where x > i_size will not trim all blocks past i_size." like other file systems, in ->setattr we invoked truncate_setsize instead of f2fs_truncate to avoid unneeded block trimming in such case, but forgot to call f2fs_convert_inline_inode keep consistency of inline data conversion rule. This patch fixes to convert inline data if necessary. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-12-04f2fs: use sbi->blocks_per_seg to avoid unnecessary calculationChao Yu
Use sbi->blocks_per_seg directly to avoid unnecessary calculation when using 1 << sbi->log_blocks_per_seg. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>