summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2017-10-26xfs: use xfs_bmap_del_extent_delay for the data fork as wellChristoph Hellwig
And remove the delalloc code from xfs_bmap_del_extent, which gets renamed to xfs_bmap_del_extent_real to fit the naming scheme used by the other xfs_bmap_{add,del}_extent_* routines. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-10-26xfs: rename bno to end in __xfs_bunmapiChristoph Hellwig
Rename the bno variable that's used as the end of the range in __xfs_bunmapi to end, which better describes it. Additionally change the start variable which takes the initial value of bno to be the function parameter itself. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-10-26xfs: don't set XFS_BTCUR_BPRV_WASDEL in xfs_bunmapiChristoph Hellwig
The XFS_BTCUR_BPRV_WASDEL flag is supposed to indicate that we are converting a delayed allocation to a real one, which isn't the case in xfs_bunmapi. Setting it could theoretically lead to misaccounting here, but it's unlikely that we ever hit it in practice. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-10-26xfs: use xfs_iext_get_extent instead of open coding itChristoph Hellwig
This avoids exposure to details of the extent list implementation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-10-26xfs: fix incorrect extent state in xfs_bmap_add_extent_unwritten_realChristoph Hellwig
There was one spot in xfs_bmap_add_extent_unwritten_real that didn't use the passed in new extent state but always converted to normal, leading to wrong behavior when converting from normal to unwritten. Only found by code inspection, it seems like this code path to move partial extent from written to unwritten while merging it with the next extent is rarely exercised. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-10-26xfs: simplify the xfs_getbmap interfaceChristoph Hellwig
Instead of passing in a formatter callback allocate the bmap buffer in the caller and process the entries there. Additionally replace the in-kernel buffer with a new much smaller structure, and unify the implementation of the different ioctls in a single function. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-10-26xfs: rewrite getbmap using the xfs_iext_* helpersChristoph Hellwig
Currently getbmap uses xfs_bmapi_read to query the extent map, and then fixes up various bits that are eventually reported to userspace. This patch instead rewrites it to use xfs_iext_lookup_extent and xfs_iext_get_extent to iteratively process the extent map. This not only avoids the need to allocate a map for the returned xfs_bmbt_irec structures but also greatly simplified the code. There are two intentional behavior changes compared to the old code: - the current code reports unwritten extents that don't directly border a written one as unwritten even when not passing the BMV_IF_PREALLOC option, contrary to the documentation. The new code requires the BMV_IF_PREALLOC flag to report the unwrittent extent bit. - The new code does never merges consecutive extents, unlike the old code that sometimes does it based on the boundaries of the xfs_bmapi_read calls. Note that the extent merging behavior was entirely undocumented. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-10-26SMB3: Validate negotiate request must always be signedSteve French
According to MS-SMB2 3.2.55 validate_negotiate request must always be signed. Some Windows can fail the request if you send it unsigned See kernel bugzilla bug 197311 CC: Stable <stable@vger.kernel.org> Acked-by: Ronnie Sahlberg <lsahlber.redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
2017-10-26Merge tag 'ceph-for-4.14-rc7' of git://github.com/ceph/ceph-clientLinus Torvalds
Pull ceph fix from Ilya Dryomov: "A small lock imbalance fix, marked for stable" * tag 'ceph-for-4.14-rc7' of git://github.com/ceph/ceph-client: ceph: unlock dangling spinlock in try_flush_caps()
2017-10-26f2fs: remove several redundant assignmentsColin Ian King
There are several assignments to variables that are redundant as the values are never read when the variables are updated later and so the redundant statements can be safely removed. Cleans up clang warnings: fs/f2fs/segment.c:923:19: warning: Value stored to 'p' during its initialization is never read fs/f2fs/segment.c:2060:2: warning: Value stored to 'hint' is never read fs/f2fs/segment.c:2353:2: warning: Value stored to 'start_block' is never read fs/f2fs/segment.c:2354:2: warning: Value stored to 'end_block' is never read Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-26f2fs: avoid using timespecArnd Bergmann
All uses of timespec are deprecated, and this one is not particularly useful, as the documented method for converting seconds to jiffies is to multiply by 'HZ'. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-26f2fs: fix to correct no_fggc_candidateChao Yu
There may be extreme case as below: For one section contains one segment, and there are total 100 segments with 10% over-privision ratio in f2fs partition, fggc_threshold will be rounded down to 460 instead of 460.8 as below caclulation: sbi->fggc_threshold = div_u64((u64)(main_count - ovp_count) * BLKS_PER_SEC(sbi), (main_count - resv_count)); If section usage is as: 60 segments which contain 460 valid blocks 40 segments which contain 462 valid blocks As valid block number in all sections is large than fggc_threshold, so none of them will be chosen as candidate due to incorrect fggc_threshold. Let's just soften the term of choosing foreground GC candidates. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-26Revert "f2fs: return wrong error number on f2fs_quota_write"Jaegeuk Kim
This reverts commit 4f31d26b0c17f2aae6a6afeb823a87e20671ab4b. It turns out that we need to report error number if nothing was written. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-26f2fs: remove obsolete pointer for truncate_xattr_nodeJaegeuk Kim
This patch removes obosolete parameter for truncate_xattr_node. Suggested-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-26f2fs: retry ENOMEM for quota_read|writeJaegeuk Kim
This gives another chance to read or write quota data. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-26f2fs: limit # of inmemory pagesJaegeuk Kim
If some abnormal users try lots of atomic write operations, f2fs is able to produce pinned pages in the main memory which affects system performance. This patch limits that as 20% over total memory size, and if f2fs reaches to the limit, it will drop all the inmemory pages. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-26f2fs: update ctx->pos correctly when hitting hole in directoryChao Yu
This patch fixes to update ctx->pos correctly when hitting hole in directory. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-26f2fs: relocate readahead codes in readdir()Chao Yu
Previously, for large directory, we just do readahead only once in readdir(), readdir()'s performance may drop when traversing latter blocks. In order to avoid this, relocate readahead codes to covering all traverse flow. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-26f2fs: allow readdir() to be interruptedChao Yu
This patch follows ext4 to allow readdir() in large empty directory to be interrupted. Referenced commit of ext4: 1f60fbe72749 ("ext4: allow readdir()'s of large empty directories to be interrupted"). Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-26f2fs: trace f2fs_readdirChao Yu
This patch adds trace for f2fs_readdir. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-26f2fs: trace f2fs_lookupChao Yu
This patch adds trace for f2fs_lookup. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-26f2fs: skip searching non-exist range in truncate_holeWeichao Guo
Let's skip entire non-exist area to speed up truncate_hole by using get_next_page_offset. Signed-off-by: Weichao Guo <guoweichao@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-26f2fs: expose some sectors to user in inline data or dentry caseJaegeuk Kim
If there's some data written through inline data or dentry, we need to shouw st_blocks. This fixes reporting zero blocks even though there is small written data. Cc: stable@vger.kernel.org Reviewed-by: Chao Yu <yuchao0@huawei.com> [Jaegeuk Kim: avoid link file for quotacheck] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-26f2fs: avoid stale fi->gdirty_list pointerJaegeuk Kim
When doing fault injection test, f2fs_evict_inode() didn't remove gdirty_list which incurs a kernel panic due to wrong pointer access. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-26f2fs/crypto: drop crypto key at evict_inode onlyJaegeuk Kim
This patch avoids dropping crypto key in f2fs_drop_inode, so we can guarantee it happens only at evict_inode. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-26f2fs: fix to avoid race when accessing last_disk_sizeChao Yu
last_disk_size could be wrong due to concurrently updating, so using i_sem semaphore to make last_disk_size updating exclusive to fix this issue. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-26f2fs: Fix bool initialization/comparisonThomas Meyer
Bool initializations should use true and false. Bool tests don't need comparisons. Signed-off-by: Thomas Meyer <thomas@m3y3r.de> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-26f2fs: give up CP_TRIMMED_FLAG if it drops discardsChao Yu
In ->umount, once we drop remained discard entries, we should not set CP_TRIMMED_FLAG with another checkpoint. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-26f2fs: trace f2fs_remove_discardChao Yu
This patch adds tracepoint to trace f2fs_remove_discard. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-26f2fs: reduce cmd_lock coverage in __issue_discard_cmdChao Yu
__submit_discard_cmd may lead long latency due to exhaustion of I/O request resource in block layer, so issuing all discard under cmd_lock may lead to hangtask, in order to avoid that, let's reduce it's coverage. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-26f2fs: split discard policyChao Yu
There are many different scenarios such as fstrim, umount, urgent or background where we will issue discards, actually, they need use different policy in aspect of io aware, discard granularity, delay interval and so on. But now they just share one common discard policy, so there will be race when changing policy in between these scenarios, the interference of changing discard policy will be very serious. This patch changes to split discard policy for different scenarios. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-26f2fs: wrap discard policyChao Yu
This patch wraps scattered optional parameters into discard policy as below, later, with it we expect that we can adjust these parameters with proper strategy in different scenario. struct discard_policy { unsigned int min_interval; /* used for candidates exist */ unsigned int max_interval; /* used for candidates not exist */ unsigned int max_requests; /* # of discards issued per round */ unsigned int io_aware_gran; /* minimum granularity discard not be aware of I/O */ bool io_aware; /* issue discard in idle time */ bool sync; /* submit discard with REQ_SYNC flag */ }; This patch doesn't change any logic of codes. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-26f2fs: support issuing/waiting discard in rangeChao Yu
Fstrim intends to trim invalid blocks of filesystem only with specified range and granularity, but actually, it will issue all previous cached discard commands which may be out-of-range and be with unmatched granularity, it's unneeded. In order to fix above issues, this patch introduces new helps to support to issue and wait discard in range and adds a new fstrim_list for tracking in-flight discard from ->fstrim. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-26Merge tag 'xfs-4.14-fixes-7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull xfs fix from Darrick Wong: "Here's (hopefully) the last bugfix for 4.14: - Rework nowait locking code to reduce locking overhead penalty" * tag 'xfs-4.14-fixes-7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: fix AIM7 regression
2017-10-25SMB: fix validate negotiate info uninitialised memory useDavid Disseldorp
An undersize validate negotiate info server response causes the client to use uninitialised memory for struct validate_negotiate_info_rsp comparisons of Dialect, SecurityMode and/or Capabilities members. Link: https://bugzilla.samba.org/show_bug.cgi?id=13092 Fixes: 7db0a6efdc3e ("SMB3: Work around mount failure when using SMB3 dialect to Macs") Signed-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com>
2017-10-25SMB: fix leak of validate negotiate info response bufferDavid Disseldorp
Fixes: ff1c038addc4 ("Check SMB3 dialects against downgrade attacks") Signed-off-by: David Disseldorp <ddiss@suse.de> Signed-off-by: Steve French <smfrench@gmail.com>
2017-10-25CIFS: Fix NULL pointer deref on SMB2_tcon() failureAurélien Aptel
If SendReceive2() fails rsp is set to NULL but is dereferenced in the error handling code. Cc: stable@vger.kernel.org Signed-off-by: Aurelien Aptel <aaptel@suse.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com>
2017-10-25CIFS: do not send invalid input buffer on QUERY_INFO requestsAurelien Aptel
query_info() doesn't use the InputBuffer field of the QUERY_INFO request, therefore according to [MS-SMB2] it must: a) set the InputBufferOffset to 0 b) send a zero-length InputBuffer Doing a) is trivial but b) is a bit more tricky. The packet is allocated according to it's StructureSize, which takes into account an extra 1 byte buffer which we don't need here. StructureSize fields must have constant values no matter the actual length of the whole packet so we can't just edit that constant. Both the NetBIOS-over-TCP message length ("rfc1002 length") L and the iovec length L' have to be updated. Since L' is computed from L we just update L by decrementing it by one. Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <smfrench@gmail.com>
2017-10-25cifs: Select all required crypto modulesBenjamin Gilbert
Some dependencies were lost when CIFS_SMB2 was merged into CIFS. Fixes: 2a38e12053b7 ("[SMB3] Remove ifdef since SMB3 (and later) now STRONGLY preferred") Signed-off-by: Benjamin Gilbert <benjamin.gilbert@coreos.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com> CC: Stable <stable@vger.kernel.org> Signed-off-by: Steve French <smfrench@gmail.com>
2017-10-25fuse: fix READDIRPLUS skipping an entryMiklos Szeredi
Marios Titas running a Haskell program noticed a problem with fuse's readdirplus: when it is interrupted by a signal, it skips one directory entry. The reason is that fuse erronously updates ctx->pos after a failed dir_emit(). The issue originates from the patch adding readdirplus support. Reported-by: Jakob Unterwurzacher <jakobunt@gmail.com> Tested-by: Marios Titas <redneb@gmx.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: 0b05b18381ee ("fuse: implement NFS-like readdirplus support") Cc: <stable@vger.kernel.org> # v3.9
2017-10-25locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns ↵Mark Rutland
to READ_ONCE()/WRITE_ONCE() Please do not apply this to mainline directly, instead please re-run the coccinelle script shown below and apply its output. For several reasons, it is desirable to use {READ,WRITE}_ONCE() in preference to ACCESS_ONCE(), and new code is expected to use one of the former. So far, there's been no reason to change most existing uses of ACCESS_ONCE(), as these aren't harmful, and changing them results in churn. However, for some features, the read/write distinction is critical to correct operation. To distinguish these cases, separate read/write accessors must be used. This patch migrates (most) remaining ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following coccinelle script: ---- // Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and // WRITE_ONCE() // $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch virtual patch @ depends on patch @ expression E1, E2; @@ - ACCESS_ONCE(E1) = E2 + WRITE_ONCE(E1, E2) @ depends on patch @ expression E; @@ - ACCESS_ONCE(E) + READ_ONCE(E) ---- Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: davem@davemloft.net Cc: linux-arch@vger.kernel.org Cc: mpe@ellerman.id.au Cc: shuah@kernel.org Cc: snitzer@redhat.com Cc: thor.thayer@linux.intel.com Cc: tj@kernel.org Cc: viro@zeniv.linux.org.uk Cc: will.deacon@arm.com Link: http://lkml.kernel.org/r/1508792849-3115-19-git-send-email-paulmck@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-25locking/atomics, fs/ncpfs: Convert ACCESS_ONCE() to READ_ONCE()/WRITE_ONCE()Mark Rutland
The NCPFS code has some stale comments regarding ACCESS_ONCE() uses which were removed a long time ago. Let's remove the stale comments. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Vandrovec <petr@vandrovec.name> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: davem@davemloft.net Cc: linux-arch@vger.kernel.org Cc: mpe@ellerman.id.au Cc: shuah@kernel.org Cc: snitzer@redhat.com Cc: thor.thayer@linux.intel.com Cc: tj@kernel.org Cc: will.deacon@arm.com Link: http://lkml.kernel.org/r/1508792849-3115-5-git-send-email-paulmck@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-25locking/atomics, fs/dcache: Convert ACCESS_ONCE() to READ_ONCE()/WRITE_ONCE()Mark Rutland
For several reasons, it is desirable to use {READ,WRITE}_ONCE() in preference to ACCESS_ONCE(), and new code is expected to use one of the former. So far, there's been no reason to change most existing uses of ACCESS_ONCE(), as these aren't currently harmful. However, for some features it is necessary to instrument reads and writes separately, which is not possible with ACCESS_ONCE(). This distinction is critical to correct operation. It's possible to transform the bulk of kernel code using the Coccinelle script below. However, this doesn't handle comments, leaving references to ACCESS_ONCE() instances which have been removed. As a preparatory step, this patch converts the dcache code and comments to use {READ,WRITE}_ONCE() consistently. ---- virtual patch @ depends on patch @ expression E1, E2; @@ - ACCESS_ONCE(E1) = E2 + WRITE_ONCE(E1, E2) @ depends on patch @ expression E; @@ - ACCESS_ONCE(E) + READ_ONCE(E) ---- Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: davem@davemloft.net Cc: linux-arch@vger.kernel.org Cc: mpe@ellerman.id.au Cc: shuah@kernel.org Cc: snitzer@redhat.com Cc: thor.thayer@linux.intel.com Cc: tj@kernel.org Cc: will.deacon@arm.com Link: http://lkml.kernel.org/r/1508792849-3115-4-git-send-email-paulmck@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-25ceph: unlock dangling spinlock in try_flush_caps()Jeff Layton
sparse warns: fs/ceph/caps.c:2042:9: warning: context imbalance in 'try_flush_caps' - wrong count at exit We need to exit this function with the lock unlocked, but a couple of cases leave it locked. Cc: stable@vger.kernel.org Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-10-24ovl: do not cleanup unsupported index entriesAmir Goldstein
With index=on, ovl_indexdir_cleanup() tries to cleanup invalid index entries (e.g. bad index name). This behavior could result in cleaning of entries created by newer kernels and is therefore undesirable. Instead, abort mount if such entries are encountered. We still cleanup 'stale' entries and 'orphan' entries, both those cases can be a result of offline changes to lower and upper dirs. When encoutering an index entry of type directory or whiteout, kernel was supposed to fallback to read-only mount, but the fill_super() operation returns EROFS in this case instead of returning success with read-only mount flag, so mount fails when encoutering directory or whiteout index entries. Bless this behavior by returning -EINVAL on directory and whiteout index entries as we do for all unsupported index entries. Fixes: 61b674710cd9 ("ovl: do not cleanup directory and whiteout index..") Cc: <stable@vger.kernel.org> # v4.13 Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2017-10-24ovl: handle ENOENT on index lookupAmir Goldstein
Treat ENOENT from index entry lookup the same way as treating a returned negative dentry. Apparently, either could be returned if file is not found, depending on the underlying file system. Fixes: 359f392ca53e ("ovl: lookup index entry for copy up origin") Cc: <stable@vger.kernel.org> # v4.13 Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2017-10-24ovl: fix EIO from lookup of non-indexed upperAmir Goldstein
Commit fbaf94ee3cd5 ("ovl: don't set origin on broken lower hardlink") attempt to avoid the condition of non-indexed upper inode with lower hardlink as origin. If this condition is found, lookup returns EIO. The protection of commit mentioned above does not cover the case of lower that is not a hardlink when it is copied up (with either index=off/on) and then lower is hardlinked while overlay is offline. Changes to lower layer while overlayfs is offline should not result in unexpected behavior, so a permanent EIO error after creating a link in lower layer should not be considered as correct behavior. This fix replaces EIO error with success in cases where upper has origin but no index is found, or index is found that does not match upper inode. In those cases, lookup will not fail and the returned overlay inode will be hashed by upper inode instead of by lower origin inode. Fixes: 359f392ca53e ("ovl: lookup index entry for copy up origin") Cc: <stable@vger.kernel.org> # v4.13 Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-10-24locking/barriers: Convert users of lockless_dereference() to READ_ONCE()Will Deacon
READ_ONCE() now has an implicit smp_read_barrier_depends() call, so it can be used instead of lockless_dereference() without any change in semantics. Signed-off-by: Will Deacon <will.deacon@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1508840570-22169-4-git-send-email-will.deacon@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-24Merge tag 'v4.14-rc6' into locking/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-23fs/9p: Compare qid.path in v9fs_test_inodeTuomas Tynkkynen
Commit fd2421f54423 ("fs/9p: When doing inode lookup compare qid details and inode mode bits.") transformed v9fs_qid_iget() to use iget5_locked() instead of iget_locked(). However, the test() callback is not checking fid.path at all, which means that a lookup in the inode cache can now accidentally locate a completely wrong inode from the same inode hash bucket if the other fields (qid.type and qid.version) match. Fixes: fd2421f54423 ("fs/9p: When doing inode lookup compare qid details and inode mode bits.") Cc: stable@vger.kernel.org Reviewed-by: Latchesar Ionkov <lucho@ionkov.net> Signed-off-by: Tuomas Tynkkynen <tuomas@tuxera.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>