summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2015-03-18gfs2: allow fallocate to max out quotas/fs efficientlyAbhi Das
We can quickly get an estimate of how many blocks are available for allocation restricted by quota and fs size respectively, using the ap->allowed field in the gfs2_alloc_parms structure. gfs2_quota_check() and gfs2_inplace_reserve() provide these values. Once we have the total number of blocks available to us, we can compute how many bytes of data can be written using those blocks instead of guessing inefficiently. Signed-off-by: Abhi Das <adas@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com>
2015-03-18gfs2: allow quota_check and inplace_reserve to return available blocksAbhi Das
struct gfs2_alloc_parms is passed to gfs2_quota_check() and gfs2_inplace_reserve() with ap->target containing the number of blocks being requested for allocation in the current operation. We add a new field to struct gfs2_alloc_parms called 'allowed'. gfs2_quota_check() and gfs2_inplace_reserve() return the max blocks allowed by quota and the max blocks allowed by the chosen rgrp respectively in 'allowed'. A new field 'min_target', when non-zero, tells gfs2_quota_check() and gfs2_inplace_reserve() to not return -EDQUOT/-ENOSPC when there are atleast 'min_target' blocks allowable/available. The assumption is that the caller is ok with just 'min_target' blocks and will likely proceed with allocating them. Signed-off-by: Abhi Das <adas@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com>
2015-03-18gfs2: perform quota checks against allocation parametersAbhi Das
Use struct gfs2_alloc_parms as an argument to gfs2_quota_check() and gfs2_quota_lock_check() to check for quota violations while accounting for the new blocks requested by the current operation in ap->target. Previously, the number of new blocks requested during an operation were not accounted for during quota_check and would allow these operations to exceed quota. This was not very apparent since most operations allocated only 1 block at a time and quotas would get violated in the next operation. i.e. quota excess would only be by 1 block or so. With fallocate, (where we allocate a bunch of blocks at once) the quota excess is non-trivial and is addressed by this patch. Signed-off-by: Abhi Das <adas@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com>
2015-03-18GFS2: Move gfs2_file_splice_write outside of #ifdefBob Peterson
This patch moves function gfs2_file_splice_write so it's not conditionally compiled. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com>
2015-03-18GFS2: Allocate reservation during splice_writeBob Peterson
This patch adds a GFS2-specific function for splice_write which first calls function gfs2_rs_alloc to make sure a reservation structure has been allocated before attempting to reserve blocks. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com>
2015-03-18GFS2: gfs2_set_acl(): Cache "no acl" as wellAndreas Gruenbacher
When removing a default acl or setting an access acl that is entirely represented in the file mode, we end up with acl == NULL in gfs2_set_acl(). In that case, bring gfs2 in line with other file systems and cache the NULL acl with set_cached_acl() instead of invalidating the cache with forget_cached_acl(). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2015-03-18ovl: upper fs should not be R/Ohujianyang
After importing multi-lower layer support, users could mount a r/o partition as the left most lowerdir instead of using it as upperdir. And a r/o upperdir may cause an error like overlayfs: failed to create directory ./workdir/work during mount. This patch check the *s_flags* of upper fs and return an error if it is a r/o partition. The checking of *upper_mnt->mnt_sb->s_flags* can be removed now. This patch also remove /* FIXME: workdir is not needed for a R/O mount */ from ovl_fill_super() because: 1) for upper fs r/o case Setting a r/o partition as upper is prevented, no need to care about workdir in this case. 2) for "mount overlay -o ro" with a r/w upper fs case Users could remount overlayfs to r/w in this case, so workdir should not be omitted. Signed-off-by: hujianyang <hujianyang@huawei.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2015-03-18ovl: check lowerdir amount for non-upper mounthujianyang
Recently multi-lower layer mount support allow upperdir and workdir to be omitted, then cause overlayfs can be mount with only one lowerdir directory. This action make no sense and have potential risk. This patch check the total number of lower directories to prevent mounting overlayfs with only one directory. Also, an error message is added to indicate lower directories exceed OVL_MAX_STACK limit. Signed-off-by: hujianyang <hujianyang@huawei.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2015-03-18ovl: print error message for invalid mount optionshujianyang
Overlayfs should print an error message if an incorrect mount option is caught like other filesystems. After this patch, improper option input could be clearly known. Reported-by: Fabian Sturm <fabian.sturm@aduu.de> Signed-off-by: hujianyang <hujianyang@huawei.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2015-03-17Btrfs: fix outstanding_extents accounting in DIOJosef Bacik
We are keeping track of how many extents we need to reserve properly based on the amount we want to write, but we were still incrementing outstanding_extents if we wrote less than what we requested. This isn't quite right since we will be limited to our max extent size. So instead lets do something horrible! Keep track of how many outstanding_extents we reserved, and decrement each time we allocate an extent. If we use our entire reserve make sure to jack up outstanding_extents on the inode so the accounting works out properly. Thanks, Reported-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <jbacik@fb.com>
2015-03-17Btrfs: add sanity test for outstanding_extents accountingJosef Bacik
I introduced a regression wrt outstanding_extents accounting. These are tricky areas that aren't easily covered by xfstests as we could change MAX_EXTENT_SIZE at any time. So add sanity tests to cover the various conditions that are tricky in order to make sure we don't introduce regressions in the future. Thanks, Signed-off-by: Josef Bacik <jbacik@fb.com>
2015-03-17Btrfs: just free dummy extent buffersJosef Bacik
If we fail during our sanity tests we could get NULL deref's because we unload the module before the dummy extent buffers are free'd via RCU. So check for this case and just free the things directly. Thanks, Signed-off-by: Josef Bacik <jbacik@fb.com>
2015-03-17Btrfs: account merges/splits properlyJosef Bacik
My fix Btrfs: fix merge delalloc logic only fixed half of the problems, it didn't fix the case where we have two large extents on either side and then join them together with a new small extent. We need to instead keep track of how many extents we have accounted for with each side of the new extent, and then see how many extents we need for the new large extent. If they match then we know we need to keep our reservation, otherwise we need to drop our reservation. This shows up with a case like this [BTRFS_MAX_EXTENT_SIZE+4K][4K HOLE][BTRFS_MAX_EXTENT_SIZE+4K] Previously the logic would have said that the number extents required for the new size (3) is larger than the number of extents required for the largest side (2) therefore we need to keep our reservation. But this isn't the case, since both sides require a reservation of 2 which leads to 4 for the whole range currently reserved, but we only need 3, so we need to drop one of the reservations. The same problem existed for splits, we'd think we only need 3 extents when creating the hole but in reality we need 4. Thanks, Signed-off-by: Josef Bacik <jbacik@fb.com>
2015-03-17pagemap: do not leak physical addresses to non-privileged userspaceKirill A. Shutemov
As pointed by recent post[1] on exploiting DRAM physical imperfection, /proc/PID/pagemap exposes sensitive information which can be used to do attacks. This disallows anybody without CAP_SYS_ADMIN to read the pagemap. [1] http://googleprojectzero.blogspot.com/2015/03/exploiting-dram-rowhammer-bug-to-gain.html [ Eventually we might want to do anything more finegrained, but for now this is the simple model. - Linus ] Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Acked-by: Andy Lutomirski <luto@amacapital.net> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Mark Seaborn <mseaborn@chromium.org> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-03-17fs: add dirtytime_expire_seconds sysctlTheodore Ts'o
Add a tuning knob so we can adjust the dirtytime expiration timeout, which is very useful for testing lazytime. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz>
2015-03-17fs: make sure the timestamps for lazytime inodes eventually get writtenTheodore Ts'o
Jan Kara pointed out that if there is an inode which is constantly getting dirtied with I_DIRTY_PAGES, an inode with an updated timestamp will never be written since inode->dirtied_when is constantly getting updated. We fix this by adding an extra field to the inode, dirtied_time_when, so inodes with a stale dirtytime can get detected and handled. In addition, if we have a dirtytime inode caused by an atime update, and there is no write activity on the file system, we need to have a secondary system to make sure these inodes get written out. We do this by setting up a second delayed work structure which wakes up the CPU much more rarely compared to writeback_expire_centisecs. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz>
2015-03-17reiserfs: fix __RASSERT format stringNicolas Iooss
__RASSERT format string does not use the PID argument. reiserfs_panic arguments are therefore formatted with the wrong format specifier (for example __LINE__ with %s). This bug was introduced when commit c3a9c2109f84 ("reiserfs: rework reiserfs_panic") removed a "reiserfs[%i]" prefix. This bug is only triggered when using CONFIG_REISERFS_CHECK, otherwise __RASSERT is never used. Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Jan Kara <jack@suse.cz>
2015-03-17Btrfs: prepare block group cache before writingJosef Bacik
Writing the block group cache will modify the extent tree quite a bit because it truncates the old space cache and pre-allocates new stuff. To try and cut down on the churn lets do the setup dance first, then later on hopefully we can avoid looping with newly dirtied roots. Thanks, Signed-off-by: Josef Bacik <jbacik@fb.com>
2015-03-16kernfs: handle poll correctly on 'direct_read' files.NeilBrown
Kernfs supports two styles of read: direct_read and seqfile_read. The latter supports 'poll' correctly thanks to the update of '->event' in kernfs_seq_show. The former does not as '->event' is never updated on a read. So add an appropriate update in kernfs_file_direct_read(). This was noticed because some 'md' sysfs attributes were recently changed to use direct reads. Reported-by: Prakash Punnoor <prakash@punnoor.de> Reported-by: Torsten Kaiser <just.for.lkml@googlemail.com> Fixes: 750f199ee8b578062341e6ddfe36c59ac8ff2dcb Signed-off-by: NeilBrown <neilb@suse.de> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-16pstore: Fix the ramoops module parameters updateWang Long
In the function ramoops_probe, the console_size, pmsg_size, ftrace_size may be update because the value is not the power of two. We should update the module parameter variables as well so they are visible through /sys/module/ramoops/parameters correctly. Signed-off-by: Wang Long <long.wanglong@huawei.com> Acked-by: Mark Salyzyn <salyzyn@android.com> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
2015-03-16Merge branch 'quota_interface' into for_next_testingJan Kara
2015-03-16udf: use int for allocated blocks instead of sector_tFabian Frederick
Fix the following warnings: fs/udf/balloc.c:768:15: warning: conversion to 'sector_t' from 'int' may change the sign of the result [-Wsign-conversion] allocated = udf_bitmap_prealloc_blocks(sb, ^ fs/udf/balloc.c:773:15: warning: conversion to 'sector_t' from 'int' may change the sign of the result [-Wsign-conversion] allocated = udf_table_prealloc_blocks(sb, ^ fs/udf/balloc.c:778:15: warning: conversion to 'sector_t' from 'int' may change the sign of the result [-Wsign-conversion] allocated = udf_bitmap_prealloc_blocks(sb, ^ fs/udf/balloc.c:783:15: warning: conversion to 'sector_t' from 'int' may change the sign of the result [-Wsign-conversion] allocated = udf_table_prealloc_blocks(sb, ^ fs/udf/balloc.c:791:26: warning: conversion to 'loff_t' from 'sector_t' may change the sign of the result [-Wsign-conversion] inode_add_bytes(inode, allocated << sb->s_blocksize_bits); ^ fs/udf/balloc.c:792:2: warning: conversion to 'int' from 'sector_t' may alter its value [-Wconversion] return allocated; Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Jan Kara <jack@suse.cz>
2015-03-14locks: fix generic_delete_lease tracepoint to use victim pointerJeff Layton
It's possible that "fl" won't point at a valid lock at this point, so use "victim" instead which is either a valid lock or NULL. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
2015-03-14udf: remove redundant buffer_head.h includesFabian Frederick
buffer_head.h was already included in udfdecl.h Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Jan Kara <jack@suse.cz>
2015-03-14udf: remove else after return in __load_block_bitmap()Fabian Frederick
else after return is not needed. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Jan Kara <jack@suse.cz>
2015-03-14udf: remove unused variable in udf_table_free_blocks()Fabian Frederick
Fix set but not used warning. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Jan Kara <jack@suse.cz>
2015-03-13Btrfs: fix ASSERT(list_empty(&cur_trans->dirty_bgs_list)Josef Bacik
Dave could hit this assert consistently running btrfs/078. This is because when we update the block groups we could truncate the free space, which would try to delete the csums for that range and dirty the csum root. For this to happen we have to have already written out the csum root so it's kind of hard to hit this case. This patch fixes this by changing the logic to only write the dirty block groups if the dirty_cowonly_roots list is empty. This will get us the same effect as before since we add the extent root last, and will cover the case that we dirty some other root again but not the extent root. Thanks, Reported-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-03-13Btrfs: account for the correct number of extents for delalloc reservationsJosef Bacik
Direct IO can easily pass in an buffer that is greater than BTRFS_MAX_EXTENT_SIZE, so take this into account when reserving extents in the delalloc reservation code. Thanks, Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-03-13Btrfs: fix merge delalloc logicJosef Bacik
My patch to properly count outstanding extents wrt MAX_EXTENT_SIZE introduced a regression when re-dirtying already dirty areas. We have logic in split to make sure we are taking the largest space into account but didn't have it for merge, so it was sometimes making us think we were turning a tiny extent into a huge extent, when in reality we already had a huge extent and needed to use the other side in our logic. This fixes the regression that was reported by a user on list. Thanks, Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-03-13Btrfs: fix comp_oper to get right orderLiu Bo
Case (oper1->seq > oper2->seq) should differ with case (oper1->seq < oper2->seq). Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.cz> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-03-13Btrfs: catch transaction abortion after waiting for itLiu Bo
This problem is uncovered by a test case: http://patchwork.ozlabs.org/patch/244297. Fsync() can report success when it actually doesn't. When we have several threads running fsync() at the same tiem and in one fsync() we get a transaction abortion due to some problems(in the test case it's disk failures), and other fsync()s may return successfully which makes userspace programs think that data is now safely flushed into disk. It's because that after fsyncs() fail btrfs_sync_log() due to disk failures, they get to try btrfs_commit_transaction() where it finds that there is already a transaction being committed, and they'll just call wait_for_commit() and return. Note that we actually check "trans->aborted" in btrfs_end_transaction, but it's likely that the error message is still not yet throwed out and only after wait_for_commit() we're sure whether the transaction is committed successfully. This add the necessary check and it now passes the test. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-03-13btrfs: fix sizeof format specifier in btrfs_check_super_valid()Fabian Frederick
This patch fixes mips compilation warning: fs/btrfs/disk-io.c: In function 'btrfs_check_super_valid': fs/btrfs/disk-io.c:3927:21: warning: format '%lu' expects argument of type 'long unsigned int', but argument 3 has type 'unsigned int' [-Wformat] Signed-off-by: Fabian Frederick <fabf@skynet.be> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Chris Mason <clm@fb.com>
2015-03-13fs: split generic and aio kiocbChristoph Hellwig
Most callers in the kernel want to perform synchronous file I/O, but still have to bloat the stack with a full struct kiocb. Split out the parts needed in filesystem code from those in the aio code, and only allocate those needed to pass down argument on the stack. The aio code embedds the generic iocb in the one it allocates and can easily get back to it by using container_of. Also add a ->ki_complete method to struct kiocb, this is used to call into the aio code and thus removes the dependency on aio for filesystems impementing asynchronous operations. It will also allow other callers to substitute their own completion callback. We also add a new ->ki_flags field to work around the nasty layering violation recently introduced in commit 5e33f6 ("usb: gadget: ffs: add eventfd notification about ffs events"). Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-03-13fs: don't allow to complete sync iocbs through aio_completeChristoph Hellwig
The AIO interface is fairly complex because it tries to allow filesystems to always work async and then wakeup a synchronous caller through aio_complete. It turns out that basically no one was doing this to avoid the complexity and context switches, and we've already fixed up the remaining users and can now get rid of this case. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-03-13fuse: handle synchronous iocbs internallyChristoph Hellwig
Based on a patch from Maxim Patlasov <MPatlasov@parallels.com>. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-03-13nfs: clean up nfs_direct_IOPeng Tao
This follows up "nfs: fix dio deadlock when O_DIRECT flag is flipped" and removes the unnecessary CONFIG_NFS_SWAP switch. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-03-12fs: remove ki_nbytesChristoph Hellwig
There is no need to pass the total request length in the kiocb, as we already get passed in through the iov_iter argument. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-03-12fanotify: fix event filtering with FAN_ONDIR setSuzuki K. Poulose
With FAN_ONDIR set, the user can end up getting events, which it hasn't marked. This was revealed with fanotify04 testcase failure on Linux-4.0-rc1, and is a regression from 3.19, revealed with 66ba93c0d7fe6 ("fanotify: don't set FAN_ONDIR implicitly on a marks ignored mask"). # /opt/ltp/testcases/bin/fanotify04 [ ... ] fanotify04 7 TPASS : event generated properly for type 100000 fanotify04 8 TFAIL : fanotify04.c:147: got unexpected event 30 fanotify04 9 TPASS : No event as expected The testcase sets the adds the following marks : FAN_OPEN | FAN_ONDIR for a fanotify on a dir. Then does an open(), followed by close() of the directory and expects to see an event FAN_OPEN(0x20). However, the fanotify returns (FAN_OPEN|FAN_CLOSE_NOWRITE(0x10)). This happens due to the flaw in the check for event_mask in fanotify_should_send_event() which does: if (event_mask & marks_mask & ~marks_ignored_mask) return true; where, event_mask == (FAN_ONDIR | FAN_CLOSE_NOWRITE), marks_mask == (FAN_ONDIR | FAN_OPEN), marks_ignored_mask == 0 Fix this by masking the outgoing events to the user, as we already take care of FAN_ONDIR and FAN_EVENT_ON_CHILD. Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com> Tested-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Eric Paris <eparis@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-03-12nilfs2: fix deadlock of segment constructor during recoveryRyusuke Konishi
According to a report from Yuxuan Shui, nilfs2 in kernel 3.19 got stuck during recovery at mount time. The code path that caused the deadlock was as follows: nilfs_fill_super() load_nilfs() nilfs_salvage_orphan_logs() * Do roll-forwarding, attach segment constructor for recovery, and kick it. nilfs_segctor_thread() nilfs_segctor_thread_construct() * A lock is held with nilfs_transaction_lock() nilfs_segctor_do_construct() nilfs_segctor_drop_written_files() iput() iput_final() write_inode_now() writeback_single_inode() __writeback_single_inode() do_writepages() nilfs_writepage() nilfs_construct_dsync_segment() nilfs_transaction_lock() --> deadlock This can happen if commit 7ef3ff2fea8b ("nilfs2: fix deadlock of segment constructor over I_SYNC flag") is applied and roll-forward recovery was performed at mount time. The roll-forward recovery can happen if datasync write is done and the file system crashes immediately after that. For instance, we can reproduce the issue with the following steps: < nilfs2 is mounted on /nilfs (device: /dev/sdb1) > # dd if=/dev/zero of=/nilfs/test bs=4k count=1 && sync # dd if=/dev/zero of=/nilfs/test conv=notrunc oflag=dsync bs=4k count=1 && reboot -nfh < the system will immediately reboot > # mount -t nilfs2 /dev/sdb1 /nilfs The deadlock occurs because iput() can run segment constructor through writeback_single_inode() if MS_ACTIVE flag is not set on sb->s_flags. The above commit changed segment constructor so that it calls iput() asynchronously for inodes with i_nlink == 0, but that change was imperfect. This fixes the another deadlock by deferring iput() in segment constructor even for the case that mount is not finished, that is, for the case that MS_ACTIVE flag is not set. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Reported-by: Yuxuan Shui <yshuiv7@gmail.com> Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-03-12ocfs2: make append_dio an incompat featureMark Fasheh
It turns out that making this feature ro_compat isn't quite enough to prevent accidental corruption on mount from older kernels. Ocfs2 (like other file systems) will process orphaned inodes even when the user mounts in 'ro' mode. So for the case of a filesystem not knowing the append_dio feature, mounting the filesystem could result in orphaned-for-dio files being deleted, which we clearly don't want. So instead, turn this into an incompat flag. Btw, this is kind of my fault - initially I asked that we add a flag to cover the feature and even suggested that we use an ro flag. It wasn't until I was looking through our commits for v4.0-rc1 that I realized we actually want this to be incompat. Signed-off-by: Mark Fasheh <mfasheh@suse.de> Cc: Joseph Qi <joseph.qi@huawei.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-03-12jfs: %pf is only for function pointersScott Wood
Use %ps for actual addresses, otherwise you'll get bad output on arches like ppc64 where %pf expects a function descriptor. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com> Cc: jfs-discussion@lists.sourceforge.net
2015-03-12NFSv4: Append delegations to the per-client list instead of prependingTrond Myklebust
Do so on the assumption that for most use cases, that list will turn into a more or less LRU-ordered list, and so the list traversals in nfs_client_return_marked_delegations() are likely to be shorter before hitting a candidate to return. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-03-12NFS: remount with security change should return EINVALBenjamin Coddington
A remount that alters security flavors can appear to succeed when it should instead return -EINVAL. Check to see if the current security flavor exists within the flavors specified in the remount options, and if not fail the remount. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-03-12nfs: do not export discarded symbolsArnd Bergmann
The function nfs4_pnfs_v3_ds_connect_unload is exported so it can be used by other modules, but is also marked '__exit' and will be discarded when built into the kernel, as pointed out by this linker error: `nfs4_pnfs_v3_ds_connect_unload' referenced in section `___ksymtab_gpl+nfs4_pnfs_v3_ds_connect_unload' of fs/built-in.o: defined in discarded section `.exit.text' of fs/built-in.o This removes the __exit annotation to make it safe to call this function. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: 5f01d9539496 ("nfs41: create NFSv3 DS connection if specified") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-03-12NFSv4.1: don't export static symbolJulia Lawall
The semantic patch that fixes this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @r@ type T; identifier f; @@ static T f (...) { ... } @@ identifier r.f; declarer name EXPORT_SYMBOL_GPL; @@ -EXPORT_SYMBOL_GPL(f); // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-03-06coredump: Fix do_coredump() commentBastien Nocera
Signed-off-by: Bastien Nocera <hadess@hadess.net> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2015-03-06treewide: Fix typo in printk messagesMasanari Iida
This patch fix spelling typo in printk messages. Signed-off-by: Masanari Iida <standby24x7@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2015-03-06Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "Outside of misc fixes, Filipe has a few fsync corners and we're pulling in one more of Josef's fixes from production use here" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs:__add_inode_ref: out of bounds memory read when looking for extended ref. Btrfs: fix data loss in the fast fsync path Btrfs: remove extra run_delayed_refs in update_cowonly_root Btrfs: incremental send, don't rename a directory too soon btrfs: fix lost return value due to variable shadowing Btrfs: do not ignore errors from btrfs_lookup_xattr in do_setxattr Btrfs: fix off-by-one logic error in btrfs_realloc_node Btrfs: add missing inode update when punching hole Btrfs: abort the transaction if we fail to update the free space cache inode Btrfs: fix fsync race leading to ordered extent memory leaks
2015-03-06Merge tag 'locks-v4.0-3' of git://git.samba.org/jlayton/linuxLinus Torvalds
Pull file locking fix from Jeff Layton: "Just a single patch to fix a memory leak that Daniel Wagner discovered while doing some testing with leases" * tag 'locks-v4.0-3' of git://git.samba.org/jlayton/linux: locks: fix fasync_struct memory leak in lease upgrade/downgrade handling
2015-03-06Merge tag 'nfs-for-4.0-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client bugfixes from Trond Myklebust: "Highlights include: - Fix a regression in the NFSv4 open state recovery code - Fix a regression in the NFSv4 close code - Fix regressions and side-effects of the loop-back mounted NFS fixes in 3.18, that cause the NFS read() syscall to return EBUSY. - Fix regressions around the readdirplus code and how it interacts with the VFS lazy unmount changes that went into v3.18. - Fix issues with out-of-order RPC call replies replacing updated attributes with stale ones (particularly after a truncate()). - Fix an underflow checking issue with RPC/RDMA credits - Fix a number of issues with the NFSv4 delegation return/free code. - Fix issues around stale NFSv4.1 leases when doing a mount" * tag 'nfs-for-4.0-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (24 commits) NFSv4.1: Clear the old state by our client id before establishing a new lease NFSv4: Fix a race in NFSv4.1 server trunking discovery NFS: Don't write enable new pages while an invalidation is proceeding NFS: Fix a regression in the read() syscall NFSv4: Ensure we skip delegations that are already being returned NFSv4: Pin the superblock while we're returning the delegation NFSv4: Ensure we honour NFS_DELEGATION_RETURNING in nfs_inode_set_delegation() NFSv4: Ensure that we don't reap a delegation that is being returned NFS: Fix stateid used for NFS v4 closes NFSv4: Don't call put_rpccred() under the rcu_read_lock() NFS: Don't require a filehandle to refresh the inode in nfs_prime_dcache() NFSv3: Use the readdir fileid as the mounted-on-fileid NFS: Don't invalidate a submounted dentry in nfs_prime_dcache() NFSv4: Set a barrier in the update_changeattr() helper NFS: Fix nfs_post_op_update_inode() to set an attribute barrier NFS: Remove size hack in nfs_inode_attrs_need_update() NFSv4: Add attribute update barriers to delegreturn and pNFS layoutcommit NFS: Add attribute update barriers to NFS writebacks NFS: Set an attribute barrier on all updates NFS: Add attribute update barriers to nfs_setattr_update_inode() ...