summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2016-03-15ocfs2: fix a tiny race that leads file system read-onlyJiufei Xue
when o2hb detect a node down, it first set the dead node to recovery map and create ocfs2rec which will replay journal for dead node. o2hb thread then call dlm_do_local_recovery_cleanup() to delete the lock for dead node. After the lock of dead node is gone, locks for other nodes can be granted and may modify the meta data without replaying journal of the dead node. The detail is described as follows. N1 N2 N3(master) modify the extent tree of inode, and commit dirty metadata to journal, then goes down. o2hb thread detects N1 goes down, set recovery map and delete the lock of N1. dlm_thread flush ast for the lock of N2. do not detect the death of N1, so recovery map is empty. read inode from disk without replaying the journal of N1 and modify the extent tree of the inode that N1 had modified. ocfs2rec recover the journal of N1. The modification of N2 is lost. The modification of N1 and N2 are not serial, and it will lead to read-only file system. We can set recovery_waiting flag to the lock resource after delete the lock for dead node to prevent other node from getting the lock before dlm recovery. After dlm recovery, the recovery map on N2 is not empty, ocfs2_inode_lock_full_nested() will wait for ocfs2 recovery. Signed-off-by: Jiufei Xue <xuejiufei@huawei.com> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15ocfs2/dlm: return EINVAL when the lockres on migration target is in ↵xuejiufei
DROPPING_REF state If master migrate this lock resource to node when it happened to purge it, a new lock resource will be created and inserted into hash list. If then master goes down, the lock resource being purged is recovered, so there exist two lock resource with different owner. So return error to master if the lock resource is in DROPPING state, master will retry to migrate this lock resource. Signed-off-by: xuejiufei <xuejiufei@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15ocfs2/dlm: clear DROPPING_REF flag when the master goes downxuejiufei
If the master goes down after return in-progress for deref message. The lock resource on non-master node can not be purged. Clear the DROPPING_REF flag and recovery it. Signed-off-by: xuejiufei <xuejiufei@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15ocfs2/dlm: return in progress if master can not clear the refmap bit right nowxuejiufei
Master returns in-progress to non-master node when it can not clear the refmap bit right now. And non-master node will not purge the lock resource until receiving deref done message. Signed-off-by: xuejiufei <xuejiufei@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15ocfs2/dlm: add DEREF_DONE messagexuejiufei
This series of patches is to fix the dis-order issue of setting/clearing refmap bit described below. Node 1 Node 2(master) dlmlock dlm_do_master_request dlm_master_request_handler -> dlm_lockres_set_refmap_bit dlmlock succeed dlmunlock succeed dlm_purge_lockres dlm_deref_handler -> find lock resource is in DLM_LOCK_RES_SETREF_INPROG state, so dispatch a deref work dlm_purge_lockres succeed. call dlmlock again dlm_do_master_request dlm_master_request_handler -> dlm_lockres_set_refmap_bit deref work trigger, call dlm_lockres_clear_refmap_bit to clear Node 1 from refmap dlm_purge_lockres succeed dlm_send_remote_lock_request return DLM_IVLOCKID because the lockres is not exist BUG if the lockres is $RECOVERY This series of patches add a new message to keep the order of set and clear. Other nodes can purge the lock resource only after the refmap bit on master is cleared. This patch is to add DEREF_DONE message and corresponding handler. Node can purge the lock resource after receiving this message. As a new message is added, so increase the minor number of dlm protocol version. Signed-off-by: xuejiufei <xuejiufei@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15ocfs2/dlm: fix a typo in dlmcommon.hJoseph Qi
Refer to cluster/tcp.h, NET_MAX_PAYLOAD_BYTES is a typo for O2NET_MAX_PAYLOAD_BYTES. Since currently DLM_MIG_LOCKRES_RESERVED is not actually used, it won't cause any problem. But we'd better correct it for further use. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <joseph.qi@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15ocfs2: use spinlock_irqsave() to downconvert lock in ocfs2_osb_dump()jiangyiwen
Commit a75e9ccabd92 ("ocfs2: use spinlock irqsave for downconvert lock") missed an unmodified place in ocfs2_osb_dump(), so it still exists a deadlock scenario. ocfs2_wake_downconvert_thread ocfs2_rw_unlock ocfs2_dio_end_io dio_complete ..... bio_endio req_bio_endio .... scsi_io_completion blk_done_softirq __do_softirq do_softirq irq_exit do_IRQ ocfs2_osb_dump cat /sys/kernel/debug/ocfs2/${uuid}/fs_state This patch still uses spin_lock_irqsave() - replace spin_lock() to solve this situation. Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15ocfs2/cluster: replace the interrupt safe spinlocks with common onesjiangyiwen
There actually no hardware or software interrupts in the context which using o2hb_live_lock, so we don't need to worry about race conditions caused by irq/softirq with spinlock held. Turning off irq is not good for system performance after all. Just replace them with a non interrupt safe function. Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15Merge branch 'x86-asm-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm updates from Ingo Molnar: "This is another big update. Main changes are: - lots of x86 system call (and other traps/exceptions) entry code enhancements. In particular the complex parts of the 64-bit entry code have been migrated to C code as well, and a number of dusty corners have been refreshed. (Andy Lutomirski) - vDSO special mapping robustification and general cleanups (Andy Lutomirski) - cpufeature refactoring, cleanups and speedups (Borislav Petkov) - lots of other changes ..." * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (64 commits) x86/cpufeature: Enable new AVX-512 features x86/entry/traps: Show unhandled signal for i386 in do_trap() x86/entry: Call enter_from_user_mode() with IRQs off x86/entry/32: Change INT80 to be an interrupt gate x86/entry: Improve system call entry comments x86/entry: Remove TIF_SINGLESTEP entry work x86/entry/32: Add and check a stack canary for the SYSENTER stack x86/entry/32: Simplify and fix up the SYSENTER stack #DB/NMI fixup x86/entry: Only allocate space for tss_struct::SYSENTER_stack if needed x86/entry: Vastly simplify SYSENTER TF (single-step) handling x86/entry/traps: Clear DR6 early in do_debug() and improve the comment x86/entry/traps: Clear TIF_BLOCKSTEP on all debug exceptions x86/entry/32: Restore FLAGS on SYSEXIT x86/entry/32: Filter NT and speed up AC filtering in SYSENTER x86/entry/compat: In SYSENTER, sink AC clearing below the existing FLAGS test selftests/x86: In syscall_nt, test NT|TF as well x86/asm-offsets: Remove PARAVIRT_enabled x86/entry/32: Introduce and use X86_BUG_ESPFIX instead of paravirt_enabled uprobes: __create_xol_area() must nullify xol_mapping.fault x86/cpufeature: Create a new synthetic cpu capability for machine check recovery ...
2016-03-15GFS2: Eliminate parameter non_block on gfs2_inode_lookupBob Peterson
Now that we're not filtering out I_FREEING inodes from our lookups anymore, we can eliminate the non_block parameter from the lookup function. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com>
2016-03-15GFS2: Don't filter out I_FREEING inodes anymoreBob Peterson
This patch basically reverts a very old patch from 2008, 7a9f53b3c1875bef22ad4588e818bc046ef183da, with the title "Alternate gfs2_iget to avoid looking up inodes being freed". The original patch was designed to avoid a deadlock caused by lock ordering with try_rgrp_unlink. The patch forced the function to not find inodes that were being removed by VFS. The problem is, that made it impossible for nodes to delete their own unlinked dinodes after a certain point in time, because the inode needed was not found by this filtering process. There is no longer a need for the patch, since function try_rgrp_unlink no longer locks the inode: All it does is queue the glock onto the delete work_queue, so there should be no more deadlock. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2016-03-15GFS2: Prevent delete work from occurring on glocks used for createBob Peterson
This patch tries to prevent delete work (queued via iopen callback) from executing if the glock is currently being used to create a new inode. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com>
2016-03-15GFS2: Fix direct IO write rounding errorBob Peterson
The fsx test in xfstests was failing because it was using direct IO writes which were using a bad calculation. It was using loff_t lstart = offset & (PAGE_CACHE_SIZE - 1); when it should be loff_t lstart = offset & ~(PAGE_CACHE_SIZE - 1); Thus, the write at offset 0x67e00 was calculating lstart to be 0xe00, the address of our corruption. Instead, it should have been 0x67000. This patch fixes the calculation. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com>
2016-03-15gfs2: avoid uninitialized variable warningArnd Bergmann
We get a bogus warning about a potential uninitialized variable use in gfs2, because the compiler does not figure out that we never use the leaf number if get_leaf_nr() returns an error: fs/gfs2/dir.c: In function 'get_first_leaf': fs/gfs2/dir.c:802:9: warning: 'leaf_no' may be used uninitialized in this function [-Wmaybe-uninitialized] fs/gfs2/dir.c: In function 'dir_split_leaf': fs/gfs2/dir.c:1021:8: warning: 'leaf_no' may be used uninitialized in this function [-Wmaybe-uninitialized] Changing the 'if (!error)' to 'if (!IS_ERR_VALUE(error))' is sufficient to let gcc understand that this is exactly the same condition as in IS_ERR() so it can optimize the code path enough to understand it. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2016-03-15Merge branch 'xfs-misc-fixes-4.6-4' into for-nextDave Chinner
2016-03-15xfs: always set rvalp in xfs_dir2_node_trim_freeChristoph Hellwig
xfs_dir2_node_trim_free can return with setting the rvalp argument pointer. Initialize it to 0 at the beginning of the function and only update it to 1 if we succeeded trimming a freespace block. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-03-15xfs: ensure committed is initialized in xfs_trans_rollEric Sandeen
__xfs_trans_roll() can return without setting the *committed argument; this was a problem for xfs_bmap_finish(): int committed;/* xact committed or not */ ... error = __xfs_trans_roll(tp, ip, &committed); if (error) { ... if (committed) { and we tested an uninitialized "committed" variable on the error path. No caller is preserving "committed" state across calls to __xfs_trans_roll(), so just initialize committed inside the function to avoid future errors like this. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-03-15xfs: borrow indirect blocks from freed extent when availableBrian Foster
xfs_bmap_del_extent() handles extent removal from the in-core and on-disk extent lists. When removing a delalloc range, it updates the indirect block reservation appropriately based on the removal. It currently enforces that the new indirect block reservation is less than or equal to the original. This is normally the case in all situations except for in certain cases when the removed range creates a hole in a single delalloc extent, thus splitting a single delalloc extent in two. It is possible with small enough extents to split an indlen==1 extent into two such slightly smaller extents. This leaves one extent with 0 indirect blocks and leads to assert failures in other areas (e.g., xfs_bunmapi() if the extent happens to be removed). Update the indlen distribution code to steal blocks from the deleted extent, if necessary, to satisfy the worst case total indirect reservation for the new extents. This is safe as the caller does not update the fdblocks counters until the extent is removed. Blocks stolen in this manner simply remain accounted as allocated, having ownership transferred from the data extent to an indirect reservation. As a precaution, fall back to the original reservation algorithm if the new indlen requirement is not met and warn if we end up with extents without any reservation at all to detect this more easily in the future. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-03-15xfs: refactor delalloc indlen reservation split into helperBrian Foster
The delayed allocation indirect reservation splitting code is not sufficient in some cases where a delalloc extent is split in two. In preparation for enhancements to this code, refactor the current indlen distribution algorithm into a new helper function. [dchinner: rename temp, temp2 variables] Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-03-15xfs: update freeblocks counter after extent deletionBrian Foster
xfs_bunmapi() currently updates the fdblocks counter, unreserves quota, etc. before the extent is deleted by xfs_bmap_del_extent(). The function has problems dividing up the indirect reserved blocks for scenarios where a single delalloc extent is split in two. Particularly, there aren't always enough blocks reserved for multiple extents in a single extent reservation. The solution to this problem is to allow the extent removal code to steal from the deleted extent to meet indirect reservation requirements. Move the block of code in xfs_bmapi() that updates the fdblocks counter to after the call to xfs_bmap_del_extent() to allow the codepath to update the extent record before the free blocks are accounted. Also, reshuffle the code slightly so the delalloc accounting occurs near the xfs_bmap_del_extent() call to provide context for the comments. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-03-15xfs: debug mode forced buffered write failureBrian Foster
Add a DEBUG mode-only sysfs knob to enable forced buffered write failure. An additional side effect of this mode is brute force killing of delayed allocation blocks in the range of the write. The latter is the prime motiviation behind this patch, as userspace test infrastructure requires a reliable mechanism to create and split delalloc extents without causing extent conversion. Certain fallocate operations (i.e., zero range) were used for this in the past, but the implementations have changed such that delalloc extents are flushed and converted to real blocks, rendering the test useless. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-03-14Orangefs: fix sloppy cleanups of debugfs and sysfs init failures.Mike Marshall
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-03-14Orangefs: follow_link -> get_link changeMike Marshall
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-03-14Orangefs: Extra sanity insurance on buffer before using string functions on it.Mike Marshall
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-03-14Orangefs: merge to v4.5Mike Marshall
Merge tag 'v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into current Linux 4.5
2016-03-14btrfs: Fix misspellings in comments.Adam Buchbinder
Signed-off-by: Adam Buchbinder <adam.buchbinder@gmail.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-03-14fuse: Add reference counting for fuse_io_privSeth Forshee
The 'reqs' member of fuse_io_priv serves two purposes. First is to track the number of oustanding async requests to the server and to signal that the io request is completed. The second is to be a reference count on the structure to know when it can be freed. For sync io requests these purposes can be at odds. fuse_direct_IO() wants to block until the request is done, and since the signal is sent when 'reqs' reaches 0 it cannot keep a reference to the object. Yet it needs to use the object after the userspace server has completed processing requests. This leads to some handshaking and special casing that it needlessly complicated and responsible for at least one race condition. It's much cleaner and safer to maintain a separate reference count for the object lifecycle and to let 'reqs' just be a count of outstanding requests to the userspace server. Then we can know for sure when it is safe to free the object without any handshaking or special cases. The catch here is that most of the time these objects are stack allocated and should not be freed. Initializing these objects with a single reference that is never released prevents accidental attempts to free the objects. Fixes: 9d5722b7777e ("fuse: handle synchronous iocbs internally") Cc: stable@vger.kernel.org # v4.1+ Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-03-14fuse: do not use iocb after it may have been freedRobert Doebbelin
There's a race in fuse_direct_IO(), whereby is_sync_kiocb() is called on an iocb that could have been freed if async io has already completed. The fix in this case is simple and obvious: cache the result before starting io. It was discovered by KASan: kernel: ================================================================== kernel: BUG: KASan: use after free in fuse_direct_IO+0xb1a/0xcc0 at addr ffff88036c414390 Signed-off-by: Robert Doebbelin <robert@quobyte.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: bcba24ccdc82 ("fuse: enable asynchronous processing direct IO") Cc: <stable@vger.kernel.org> # 3.10+
2016-03-14btrfs: Print Warning only if ENOSPC_DEBUG is enabledAshish Samant
Dont print warning for ENOSPC error unless ENOSPC_DEBUG is enabled. Use btrfs_debug if it is enabled. Signed-off-by: Ashish Samant <ashish.samant@oracle.com> [ preserve the WARN_ON ] Signed-off-by: David Sterba <dsterba@suse.com>
2016-03-14dcache.c: new helper: __d_add()Al Viro
d_add() with inode->i_lock already held; common to d_add() and d_splice_alias(). All ->lookup() instances that end up hashing the dentry they are given will hash it here. This almost completes the preparations to parallel lookups proper - the only remaining bit is taking security_d_instantiate() past d_rehash() and doing rehashing without dropping ->d_lock. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-14don't bother with __d_instantiate(dentry, NULL)Al Viro
it's a no-op - bumping ->d_seq is pointless there. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-14untangle fsnotify_d_instantiate() a bitAl Viro
First of all, don't bother calling it if inode is NULL - that makes inode argument unused. Moreover, do it *before* dropping ->d_lock, not right after that (and don't bother grabbing ->d_lock in it, of course). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-14uninline d_add()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-14replace d_add_unique() with saner primitiveAl Viro
new primitive: d_exact_alias(dentry, inode). If there is an unhashed dentry with the same name/parent and given inode, rehash, grab and return it. Otherwise, return NULL. The only caller of d_add_unique() switched to d_exact_alias() + d_splice_alias(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-14quota: use lookup_one_len_unlocked()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-14cifs_get_root(): use lookup_one_len_unlocked()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-14nfs_lookup: don't bother with d_instantiate(dentry, NULL)Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-14kill dentry_unhash()Al Viro
the last user is gone Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-14ceph_fill_trace(): don't bother with d_instantiate(dn, NULL)Al Viro
... and use d_add(dn, NULL) in case we need to hash a negative unhashed rather than using d_rehash() directly. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-14autofs4: don't bother with d_instantiate(dentry, NULL) in ->lookup()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-14configfs: move d_rehash() into configfs_create() for regular filesAl Viro
... and turn it into d_add in there Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-14ceph: don't bother with d_rehash() in splice_dentry()Al Viro
d_splice_alias() guarantees that it'll be always hashed Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-14namei: teach lookup_slow() to skip revalidateAl Viro
... and make mountpoint_last() use it. That makes all candidates for lookup with parent locked shared go through lookup_slow(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-14namei: massage lookup_slow() to be usable by lookup_one_len_unlocked()Al Viro
Return dentry and don't pass nameidata or path; lift crossing mountpoints into the caller. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-14lookup_one_len_unlocked(): use lookup_dcache()Al Viro
No need to lock parent just because of ->d_revalidate() on child; contrary to the stale comment, lookup_dcache() *can* be used without locking the parent. Result can be moved as soon as we return, of course, but the same is true for lookup_one_len_unlocked() itself. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-14namei: simplify invalidation logics in lookup_dcache()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-14namei: change calling conventions for lookup_{fast,slow} and follow_managed()Al Viro
Have lookup_fast() return 1 on success and 0 on "need to fall back"; lookup_slow() and follow_managed() return positive (1) on success. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-14namei: untanlge lookup_fast()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-13ext4: clean up error handling in the MMP supportvikram.jadhav07
There is memory leak as both caller function kmmpd() and callee read_mmp_block() not releasing bh_check (i.e buffer_head). Given patch fixes this problem. [ Additional changes suggested by Andreas Dilger -- TYT ] Signed-off-by: Jadhav Vikram <vikramjadhavpucsd2007@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-03-13jbd2: do not fail journal because of frozen_buffer allocation failureMichal Hocko
Journal transaction might fail prematurely because the frozen_buffer is allocated by GFP_NOFS request: [ 72.440013] do_get_write_access: OOM for frozen_buffer [ 72.440014] EXT4-fs: ext4_reserve_inode_write:4729: aborting transaction: Out of memory in __ext4_journal_get_write_access [ 72.440015] EXT4-fs error (device sda1) in ext4_reserve_inode_write:4735: Out of memory (...snipped....) [ 72.495559] do_get_write_access: OOM for frozen_buffer [ 72.495560] EXT4-fs: ext4_reserve_inode_write:4729: aborting transaction: Out of memory in __ext4_journal_get_write_access [ 72.496839] do_get_write_access: OOM for frozen_buffer [ 72.496841] EXT4-fs: ext4_reserve_inode_write:4729: aborting transaction: Out of memory in __ext4_journal_get_write_access [ 72.505766] Aborting journal on device sda1-8. [ 72.505851] EXT4-fs (sda1): Remounting filesystem read-only This wasn't a problem until "mm: page_alloc: do not lock up GFP_NOFS allocations upon OOM" because small GPF_NOFS allocations never failed. This allocation seems essential for the journal and GFP_NOFS is too restrictive to the memory allocator so let's use __GFP_NOFAIL here to emulate the previous behavior. Signed-off-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>