summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2018-06-05CIFS: Pass page offset for encryptingSteve French
Encryption function needs to read data starting page offset from input buffer. This doesn't affect decryption path since it allocates its own page buffers. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-06-05CIFS: Pass page offset for calculating signatureLong Li
When calculating signature for the packet, it needs to read into the correct page offset for the data. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-06-05CIFS: SMBD: Support page offset in memory registrationLong Li
Change code to pass the correct page offset during memory registration for RDMA read/write. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-06-05CIFS: SMBD: Support page offset in RDMA recvLong Li
RDMA recv function needs to place data to the correct place starting at page offset. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-06-05CIFS: SMBD: Support page offset in RDMA sendLong Li
The RDMA send function needs to look at offset in the request pages, and send data starting from there. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-06-05CIFS: When sending data on socket, pass the correct page offsetLong Li
It's possible that the offset is non-zero in the page to send, change the function to pass this offset to socket. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-06-05CIFS: Introduce helper function to get page offset and length in smb_rqstLong Li
Introduce a function rqst_page_get_length to return the page offset and length for a given page in smb_rqst. This function is to be used by following patches. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-06-05CIFS: Calculate the correct request length based on page offset and tail sizeLong Li
It's possible that the page offset is non-zero in the pages in a request, change the function to calculate the correct data buffer length. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-06-05Merge tag 'fscrypt_for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt Pull fscrypt updates from Ted Ts'o: "Add bunch of cleanups, and add support for the Speck128/256 algorithms. Yes, Speck is contrversial, but the intention is to use them only for the lowest end Android devices, where the alternative *really* is no encryption at all for data stored at rest" * tag 'fscrypt_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt: fscrypt: log the crypto algorithm implementations fscrypt: add Speck128/256 support fscrypt: only derive the needed portion of the key fscrypt: separate key lookup from key derivation fscrypt: use a common logging function fscrypt: remove internal key size constants fscrypt: remove unnecessary check for non-logon key type fscrypt: make fscrypt_operations.max_namelen an integer fscrypt: drop empty name check from fname_decrypt() fscrypt: drop max_namelen check from fname_decrypt() fscrypt: don't special-case EOPNOTSUPP from fscrypt_get_encryption_info() fscrypt: don't clear flags on crypto transform fscrypt: remove stale comment from fscrypt_d_revalidate() fscrypt: remove error messages for skcipher_request_alloc() failure fscrypt: remove unnecessary NULL check when allocating skcipher fscrypt: clean up after fscrypt_prepare_lookup() conversions fs, fscrypt: only define ->s_cop when FS_ENCRYPTION is enabled fscrypt: use unbound workqueue for decryption
2018-06-05Merge tag 'xfs-4.18-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull xfs updates from Darrick Wong: "New features this cycle include the ability to relabel mounted filesystems, support for fallocated swapfiles, and using FUA for pure data O_DSYNC directio writes. With this cycle we begin to integrate online filesystem repair and refactor the growfs code in preparation for eventual subvolume support, though the road ahead for both features is quite long. There are also numerous refactorings of the iomap code to remove unnecessary log overhead, to disentangle some of the quota code, and to prepare for buffer head removal in a future upstream kernel. Metadata validation continues to improve, both in the hot path veifiers and the online filesystem check code. I anticipate sending a second pull request in a few days with more metadata validation improvements. This series has been run through a full xfstests run over the weekend and through a quick xfstests run against this morning's master, with no major failures reported. Summary: - Strengthen inode number and structure validation when allocating inodes. - Reduce pointless buffer allocations during cache miss - Use FUA for pure data O_DSYNC directio writes - Various iomap refactorings - Strengthen quota metadata verification to avoid unfixable broken quota - Make AGFL block freeing a deferred operation to avoid blowing out transaction reservations when running complex operations - Get rid of the log item descriptors to reduce log overhead - Fix various reflink bugs where inodes were double-joined to transactions - Don't issue discards when trimming unwritten extents - Refactor incore dquot initialization and retrieval interfaces - Fix some locking problmes in the quota scrub code - Strengthen btree structure checks in scrub code - Rewrite swapfile activation to use iomap and support unwritten extents - Make scrub exit to userspace sooner when corruptions or cross-referencing problems are found - Make scrub invoke the data fork scrubber directly on metadata inodes - Don't do background reclamation of post-eof and cow blocks when the fs is suspended - Fix secondary superblock buffer lifespan hinting - Refactor growfs to use table-dispatched functions instead of long stringy functions - Move growfs code to libxfs - Implement online fs label getting and setting - Introduce online filesystem repair (in a very limited capacity) - Fix unit conversion problems in the realtime freemap iteration functions - Various refactorings and cleanups in preparation to remove buffer heads in a future release - Reimplement the old bmap call with iomap - Remove direct buffer head accesses from seek hole/data - Various bug fixes" * tag 'xfs-4.18-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (121 commits) fs: use ->is_partially_uptodate in page_cache_seek_hole_data fs: remove the buffer_unwritten check in page_seek_hole_data fs: move page_cache_seek_hole_data to iomap.c xfs: use iomap_bmap iomap: add an iomap-based bmap implementation iomap: add a iomap_sector helper iomap: use __bio_add_page in iomap_dio_zero iomap: move IOMAP_F_BOUNDARY to gfs2 iomap: fix the comment describing IOMAP_NOWAIT iomap: inline data should be an iomap type, not a flag mm: split ->readpages calls to avoid non-contiguous pages lists mm: return an unsigned int from __do_page_cache_readahead mm: give the 'ret' variable a better name __do_page_cache_readahead block: add a lower-level bio_add_page interface xfs: fix error handling in xfs_refcount_insert() xfs: fix xfs_rtalloc_rec units xfs: strengthen rtalloc query range checks xfs: xfs_rtbuf_get should check the bmapi_read results xfs: xfs_rtword_t should be unsigned, not signed dax: change bdev_dax_supported() to support boolean returns ...
2018-06-05Merge tag 'ext4_for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "A lot of cleanups and bug fixes, especially dealing with corrupted file systems" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (23 commits) ext4: fix fencepost error in check for inode count overflow during resize ext4: correctly handle a zero-length xattr with a non-zero e_value_offs ext4: bubble errors from ext4_find_inline_data_nolock() up to ext4_iget() ext4: do not allow external inodes for inline data ext4: report delalloc reserve as non-free in statfs for project quota ext4: remove NULL check before calling kmem_cache_destroy() jbd2: remove NULL check before calling kmem_cache_destroy() jbd2: remove bunch of empty lines with jbd2 debug ext4: handle errors on ext4_commit_super ext4: do not update s_last_mounted of a frozen fs ext4: factor out helper ext4_sample_last_mounted() vfs: add the sb_start_intwrite_trylock() helper ext4: update mtime in ext4_punch_hole even if no blocks are released ext4: add verifier check for symlink with append/immutable flags fs: ext4: add new return type vm_fault_t ext4: fix hole length detection in ext4_ind_map_blocks() ext4: mark block bitmap corrupted when found ext4: mark inode bitmap corrupted when found ext4: add new ext4_mark_group_bitmap_corrupted() helper ext4: fix wrong return value in ext4_read_inode_bitmap() ...
2018-06-05ncpfs: remove compat functionalityGreg Kroah-Hartman
Now that ncpfs is gone from the tree, no need to have the compatibility thunking layer around, it will not actually go anywhere :) So delete that logic from fs/compat.c, it is no longer needed. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-06-05iomap: fsync swap files before iterating mappingsDarrick J. Wong
Swap files require that all the file mapping metadata be stable on disk. It is insufficient to flush dirty pages in the page cache because that won't necessarily result in filesystems pushing all their metadata out to disk. Therefore, call fsync from iomap_swapfile_activate. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz>
2018-06-05jfs: Fix inconsistency between memory allocation and ea_buf->max_sizeShankara Pailoor
The code is assuming the buffer is max_size length, but we weren't allocating enough space for it. Signed-off-by: Shankara Pailoor <shankarapailoor@gmail.com> Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
2018-06-05NFSv4: Fix a compiler warning when CONFIG_NFS_V4_1 is undefinedTrond Myklebust
Fix a compiler warning: fs/nfs/nfs4proc.c:910:13: warning: 'nfs4_layoutget_release' defined but not used [-Wunused-function] static void nfs4_layoutget_release(void *calldata) ^~~~~~~~~~~~~~~~~~~~~~ Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-06-05btrfs: Check error of btrfs_iget in btrfs_search_path_in_tree_userMisono Tomohiro
The patch introducing the ioctl was not the latest version at the time of merging to the mainline and needs a fixup from this patch. Fixes: ba637a252d30 ("btrfs: Check error of btrfs_iget() in btrfs_search_path_in_tree_user") Signed-off-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-06-04xfs: use xfs_trans_getsb in xfs_sync_sb_bufEric Sandeen
Use xfs_trans_getsb rather than reaching right in for mp->m_sb_bp; I think this is more correct, and it facilitates building this libxfs code in userspace as well. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-06-04xfs: don't assert on corrupted unlinked inode listDarrick J. Wong
Use the per-ag inode number verifiers to detect corrupt lists and error out, instead of using ASSERTs. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-06-04xfs: explicitly pass buffer size to xfs_corruption_errorDarrick J. Wong
Explicitly pass the buffer length to xfs_corruption_error() instead of assuming XFS_CORRUPTION_DUMP_LEN so that we avoid dumping off the end of the buffer. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-06-04xfs: don't assert when on-disk btree pointers are garbageDarrick J. Wong
Don't ASSERT when we encounter bad on-disk btree pointers in the debug check functions. Log the error to leave breadcrumbs and let the upper layers deal with it. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-06-04xfs: strengthen btree pointer checks before useDarrick J. Wong
Instead of ASSERTing on null btree pointers in xfs_btree_ptr_to_daddr, use the new block number verifiers to ensure that the btree pointer doesn't point to any sensitive areas (AG headers, past-EOFS) and return -EFSCORRUPTED if this is the case. Remove the ASSERT because on-disk corruptions shouldn't trigger ASSERTs. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-06-04xfs: introduce xfs_btree_debug_check_ptrDarrick J. Wong
Make xfs_btree_check_ptr a non-debug function and introduce a new _debug version that only runs when #ifdef DEBUG. This will enable us to reuse the checking logic with other parts of the btree code. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-06-04xfs: check directory bestfree information in the verifierDarrick J. Wong
Create a variant of xfs_dir2_data_freefind that is suitable for use in a verifier. Because _freefind is called by the verifier, we simply duplicate the _freefind function, convert the ASSERTs to return __this_address, and modify the verifier to call our new function. Once we've made it impossible for directory blocks with bad bestfree data to make it into the filesystem we can remove the DEBUG code from the regular _freefind function. Underlying argument: corruption of on-disk metadata should return -EFSCORRUPTED instead of blowing ASSERTs. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-06-04cifs: For SMB2 security informaion query, check for minimum sized security ↵Shirish Pargaonkar
descriptor instead of sizeof FileAllInformation class Validate_buf () function checks for an expected minimum sized response passed to query_info() function. For security information, the size of a security descriptor can be smaller (one subauthority, no ACEs) than the size of the structure that defines FileInfoClass of FileAllInformation. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199725 Cc: <stable@vger.kernel.org> Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com> Reviewed-by: Noah Morrison <noah.morrison@rubrik.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-06-04CIFS: Fix signing for SMB2/3Aurelien Aptel
It seems Ronnie's preamble removal broke signing. the signing functions are called when: A) we send a request (to sign it) B) when we recv a response (to check the signature). On code path A, the smb2 header is in iov[1] but on code path B, the smb2 header is in iov[0] (and there's only one vector). So we have different iov indexes for the smb2 header but the signing function always use index 1. Fix this by checking the nb of io vectors in the signing function as a hint. Signed-off-by: Aurelien Aptel <aaptel@suse.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-06-04Merge branch 'siginfo-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull siginfo updates from Eric Biederman: "This set of changes close the known issues with setting si_code to an invalid value, and with not fully initializing struct siginfo. There remains work to do on nds32, arc, unicore32, powerpc, arm, arm64, ia64 and x86 to get the code that generates siginfo into a simpler and more maintainable state. Most of that work involves refactoring the signal handling code and thus careful code review. Also not included is the work to shrink the in kernel version of struct siginfo. That depends on getting the number of places that directly manipulate struct siginfo under control, as it requires the introduction of struct kernel_siginfo for the in kernel things. Overall this set of changes looks like it is making good progress, and with a little luck I will be wrapping up the siginfo work next development cycle" * 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (46 commits) signal/sh: Stop gcc warning about an impossible case in do_divide_error signal/mips: Report FPE_FLTUNK for undiagnosed floating point exceptions signal/um: More carefully relay signals in relay_signal. signal: Extend siginfo_layout with SIL_FAULT_{MCEERR|BNDERR|PKUERR} signal: Remove unncessary #ifdef SEGV_PKUERR in 32bit compat code signal/signalfd: Add support for SIGSYS signal/signalfd: Remove __put_user from signalfd_copyinfo signal/xtensa: Use force_sig_fault where appropriate signal/xtensa: Consistenly use SIGBUS in do_unaligned_user signal/um: Use force_sig_fault where appropriate signal/sparc: Use force_sig_fault where appropriate signal/sparc: Use send_sig_fault where appropriate signal/sh: Use force_sig_fault where appropriate signal/s390: Use force_sig_fault where appropriate signal/riscv: Replace do_trap_siginfo with force_sig_fault signal/riscv: Use force_sig_fault where appropriate signal/parisc: Use force_sig_fault where appropriate signal/parisc: Use force_sig_mceerr where appropriate signal/openrisc: Use force_sig_fault where appropriate signal/nios2: Use force_sig_fault where appropriate ...
2018-06-04Merge branch 'userns-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull userns updates from Eric Biederman: "This is the last couple of vfs bits to enable root in a user namespace to mount and manipulate a filesystem with backing store (AKA not a virtual filesystem like proc, but a filesystem where the unprivileged user controls the content). The target filesystem for this work is fuse, and Miklos should be sending you the pull request for the fuse bits this merge window. The two key patches are "evm: Don't update hmacs in user ns mounts" and "vfs: Don't allow changing the link count of an inode with an invalid uid or gid". Those close small gaps in the vfs that would be a problem if an unprivileged fuse filesystem is mounted. The rest of the changes are things that are now safe to allow a root user in a user namespace to do with a filesystem they have mounted. The most interesting development is that remount is now safe" * 'userns-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: fs: Allow CAP_SYS_ADMIN in s_user_ns to freeze and thaw filesystems capabilities: Allow privileged user in s_user_ns to set security.* xattrs fs: Allow superblock owner to access do_remount_sb() fs: Allow superblock owner to replace invalid owners of inodes vfs: Allow userns root to call mknod on owned filesystems. vfs: Don't allow changing the link count of an inode with an invalid uid or gid evm: Don't update hmacs in user ns mounts
2018-06-04xfs: don't return garbage buffers in xfs_da3_node_readDarrick J. Wong
If we're reading a node in a dir/attr btree and the buffer comes off the disk with a magic number we don't recognize, don't ASSERT and don't set a garbage buffer type (0 also triggers ASSERTs). Instead, report the corruption, release the buffer, and return -EFSCORRUPTED because that's what the dabtree is -- corrupt. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-06-04xfs: don't ASSERT on short form btree root pointer of zeroDarrick J. Wong
Don't ASSERT if the short form btree root pointer is zero. Now that we use xfs_verify_agbno to check all short form btree pointers, we'll let that log the error and pass it to the upper layers. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-06-04xfs: btree lookup shouldn't ASSERT on empty btree nodesDarrick J. Wong
If a btree lookup encounters an empty btree node or an empty btree leaf on a multi-level btree, that's evidence of a corrupt on-disk btree. Therefore, we should return -EFSCORRUPTED to the upper levels, not an ASSERT failure. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-06-04xfs: xfs_alloc_get_rec should return EFSCORRUPTED for obvious bnobt corruptionDarrick J. Wong
Return -EFSCORRUPTED when the bnobt/cntbt return obviously corrupt values, rather than letting them bounce around in the internal code. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-06-04xfs: remove redundant ASSERT on insufficient bestfree length in _leaf_addnameDarrick J. Wong
In xfs_dir2_leaf_addname we ASSERT if the length of the unused space described by bestfree[0] is less the amount of space we wish to consume. Immediately after it is a call to xfs_dir2_data_use_free where the offset parameter is offset of the unused space and the length parameter is the amount of space we wish to consume. Both values (and the unused space pointer) are passed into xfs_dir2_data_check_free, which also validates that the region of unused space is big enough to cover the space we wish to consume. This is effectively the same check that the ASSERT covers, and since a check failure results in a corruption message being logged we can remove the ASSERT. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-06-04xfs: don't assert when reporting on-disk corruption while loading btreeDarrick J. Wong
Don't bother ASSERTing when we're already going to log and return the corruption status. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-06-04xfs: don't forbid setting dax flag on directories if device doesn't daxDarrick J. Wong
On a directory, the DAX flag is merely a hint that files created in the directory should have the DAX flag set at creation time. We don't care if the underlying device supports DAX or not because directory metadata are always cached in DRAM. We don't care if new files get the flag even if the device doesn't support DAX because we always check for DAX support before setting the VFS flag (S_DAX). Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-06-04Merge tag '4.18-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull cifs updates from Steve French: - smb3 fixes for stable - addition of ftrace hooks for cifs.ko - improvements in compounding and smbdirect (rdma) * tag '4.18-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6: (38 commits) CIFS: Add support for direct pages in wdata CIFS: Use offset when reading pages CIFS: Add support for direct pages in rdata cifs: update multiplex loop to handle compounded responses cifs: remove header_preamble_size where it is always 0 cifs: remove struct smb2_hdr CIFS: 511c54a2f69195b28afb9dd119f03787b1625bb4 adds a check for session expiry, status STATUS_NETWORK_SESSION_EXPIRED, however the server can also respond with STATUS_USER_SESSION_DELETED in cases where the session has been idle for some time and the server reaps the session to recover resources. cifs: change smb2_get_data_area_len to take a smb2_sync_hdr as argument cifs: update smb2_calc_size to use smb2_sync_hdr instead of smb2_hdr cifs: remove struct smb2_oplock_break_rsp cifs: remove rfc1002 header from all SMB2 response structures smb3: on reconnect set PreviousSessionId field smb3: Add posix create context for smb3.11 posix mounts smb3: add tracepoints for smb2/smb3 open cifs: add debug output to show nocase mount option smb3: add define for id for posix create context and corresponding struct cifs: update smb2_check_message to handle PDUs without a 4 byte length header smb3: allow "posix" mount option to enable new SMB311 protocol extensions smb3: add support for posix negotiate context cifs: allow disabling less secure legacy dialects ...
2018-06-04Merge tag 'gfs2-4.18.fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 updates from Bob Peterson: "We've got nine more patches for this merge window. - remove sd_jheightsize to greatly simplify some code (Andreas Gruenbacher) - fix some comments (Andreas) - fix a glock recursion bug when allocation errors occur (Andreas) - improve the hole_size function so it returns the entire hole rather than figuring it out piecemeal (Andreas) - clean up gfs2_stuffed_write_end to remove a lot of redundancy (Andreas) - clarify code with regard to the way ordered writes are processed (Andreas) - a bunch of improvements and cleanups of the iomap code to pave the way for iomap writes, which is a future patch set (Andreas) - fix a bug where block reservations can run off the end of a bitmap (Bob Peterson) - add Andreas to the MAINTAINERS file (Bob Peterson)" * tag 'gfs2-4.18.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: MAINTAINERS: Add Andreas Gruenbacher as a maintainer for gfs2 gfs2: Iomap cleanups and improvements gfs2: Remove ordered write mode handling from gfs2_trans_add_data gfs2: gfs2_stuffed_write_end cleanup gfs2: hole_size improvement GFS2: gfs2_free_extlen can return an extent that is too long GFS2: Fix allocation error bug with recursive rgrp glocking gfs2: Update find_metapath comment gfs2: Remove sdp->sd_jheightsize
2018-06-04Merge tag 'dlm-4.18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm Pull dlm updates from David Teigland: "These three commits fix and clean up the flags dlm was using on its SCTP sockets. This improves performance and fixes some bad connection delays" * tag 'dlm-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: dlm: remove O_NONBLOCK flag in sctp_connect_to_sock dlm: make sctp_connect_to_sock() return in specified time dlm: fix a clerical error when set SCTP_NODELAY
2018-06-04f2fs: fix to clear FI_VOLATILE_FILE correctlyChao Yu
Thread A Thread B - f2fs_release_file - clear_inode_flag(FI_VOLATILE_FILE) - wb_writeback - writeback_sb_inodes - __writeback_single_inode - do_writepages - f2fs_write_data_pages - __write_data_page all volatile file's pages are writebacked to storage - set_inode_flag(FI_DROP_CACHE) - filemap_fdatawrite There is a hole that mm can flush all dirty pages of volatile file as inode is not tagged with both FI_VOLATILE_FILE and FI_DROP_CACHE flags, we should never writeback the page #0 and also it's unneeded to writeback other pages. This patch adjusts to relocate clear_inode_flag(FI_VOLATILE_FILE), so that FI_VOLATILE_FILE flag can be remained before all dirty pages were dropped to avoid issue. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-06-04f2fs: let sync node IO interrupt async oneChao Yu
Although mixed sync/async IOs can have continuous LBA, as they have different IO priority, block IO scheduler will add them into different queues and commit them separately, result in splited IOs which causes wrose performance. This patch gives high priority to synchronous IO of nodes, means that once synchronous flow starts, it can interrupt asynchronous writeback flow of system flusher, so more big IOs can be expected. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-06-04f2fs: don't change wbc->sync_modeChao Yu
We should never falsify wbc->sync_mode passed from mm, otherwise mm can trigger writeback with wrong IO priority. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-06-04f2fs: fix to update mtime correctlyChao Yu
If we change system time to the past, get_mtime() will return a overflowed time, and SIT_I(sbi)->max_mtime will be udpated incorrectly, this patch fixes the two issues. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-06-04Merge tag 'for-4.18-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs updates from David Sterba: "User visible features: - added support for the ioctl FS_IOC_FSGETXATTR, per-inode flags, successor of GET/SETFLAGS; now supports only existing flags: append, immutable, noatime, nodump, sync - 3 new unprivileged ioctls to allow users to enumerate subvolumes - dedupe syscall implementation does not restrict the range to 16MiB, though it still splits the whole range to 16MiB chunks - on user demand, rmdir() is able to delete an empty subvolume, export the capability in sysfs - fix inode number types in tracepoints, other cleanups - send: improved speed when dealing with a large removed directory, measurements show decrease from 2000 minutes to 2 minutes on a directory with 2 million entries - pre-commit check of superblock to detect a mysterious in-memory corruption - log message updates Other changes: - orphan inode cleanup improved, does no keep long-standing reservations that could lead up to early ENOSPC in some cases - slight improvement of handling snapshotted NOCOW files by avoiding some unnecessary tree searches - avoid OOM when dealing with many unmergeable small extents at flush time - speedup conversion of free space tree representations from/to bitmap/tree - code refactoring, deletion, cleanups: + delayed refs + delayed iput + redundant argument removals + memory barrier cleanups + remove a redundant mutex supposedly excluding several ioctls to run in parallel - new tracepoints for blockgroup manipulation - more sanity checks of compressed headers" * tag 'for-4.18-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (183 commits) btrfs: Add unprivileged version of ino_lookup ioctl btrfs: Add unprivileged ioctl which returns subvolume's ROOT_REF btrfs: Add unprivileged ioctl which returns subvolume information Btrfs: clean up error handling in btrfs_truncate() btrfs: Factor out write portion of btrfs_get_blocks_direct btrfs: Factor out read portion of btrfs_get_blocks_direct btrfs: return ENOMEM if path allocation fails in btrfs_cross_ref_exist btrfs: raid56: Remove VLA usage btrfs: return error value if create_io_em failed in cow_file_range btrfs: drop useless member qgroup_reserved of btrfs_pending_snapshot btrfs: drop unused parameter qgroup_reserved btrfs: balance dirty metadata pages in btrfs_finish_ordered_io btrfs: lift some btrfs_cross_ref_exist checks in nocow path btrfs: Remove fs_info argument from btrfs_uuid_tree_rem btrfs: Remove fs_info argument from btrfs_uuid_tree_add Btrfs: remove unused check of skip_locking Btrfs: remove always true check in unlock_up Btrfs: grab write lock directly if write_lock_level is the max level Btrfs: move get root out of btrfs_search_slot to a helper Btrfs: use more straightforward extent_buffer_uptodate check ...
2018-06-04Merge tag 'affs-for-4.18-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull affs fix from David Sterba: "A potential memory leak fix for AFFS" * tag 'affs-for-4.18-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: affs: fix potential memory leak when parsing option 'prefix'
2018-06-04Merge branch 'work.aio-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull aio updates from Al Viro: "Majority of AIO stuff this cycle. aio-fsync and aio-poll, mostly. The only thing I'm holding back for a day or so is Adam's aio ioprio - his last-minute fixup is trivial (missing stub in !CONFIG_BLOCK case), but let it sit in -next for decency sake..." * 'work.aio-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits) aio: sanitize the limit checking in io_submit(2) aio: fold do_io_submit() into callers aio: shift copyin of iocb into io_submit_one() aio_read_events_ring(): make a bit more readable aio: all callers of aio_{read,write,fsync,poll} treat 0 and -EIOCBQUEUED the same way aio: take list removal to (some) callers of aio_complete() aio: add missing break for the IOCB_CMD_FDSYNC case random: convert to ->poll_mask timerfd: convert to ->poll_mask eventfd: switch to ->poll_mask pipe: convert to ->poll_mask crypto: af_alg: convert to ->poll_mask net/rxrpc: convert to ->poll_mask net/iucv: convert to ->poll_mask net/phonet: convert to ->poll_mask net/nfc: convert to ->poll_mask net/caif: convert to ->poll_mask net/bluetooth: convert to ->poll_mask net/sctp: convert to ->poll_mask net/tipc: convert to ->poll_mask ...
2018-06-04Merge branch 'work.lookup' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull dcache lookup cleanups from Al Viro: "Cleaning ->lookup() instances up - mostly d_splice_alias() conversions" * 'work.lookup' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (29 commits) switch the rest of procfs lookups to d_splice_alias() procfs: switch instantiate_t to d_splice_alias() don't bother with tid_fd_revalidate() in lookups proc_lookupfd_common(): don't bother with instantiate unless the file is open procfs: get rid of ancient BS in pid_revalidate() uses cifs_lookup(): switch to d_splice_alias() cifs_lookup(): cifs_get_inode_...() never returns 0 with *inode left NULL 9p: unify paths in v9fs_vfs_lookup() ncp_lookup(): use d_splice_alias() hfsplus: switch to d_splice_alias() hfs: don't allow mounting over .../rsrc hfs: use d_splice_alias() omfs_lookup(): report IO errors, use d_splice_alias() orangefs_lookup: simplify openpromfs: switch to d_splice_alias() xfs_vn_lookup: simplify a bit adfs_lookup: do not fail with ENOENT on negatives, use d_splice_alias() adfs_lookup_byname: .. *is* taken care of in fs/namei.c romfs_lookup: switch to d_splice_alias() qnx6_lookup: switch to d_splice_alias() ...
2018-06-04rxrpc: Fix handling of call quietly cancelled out on serverDavid Howells
Sometimes an in-progress call will stop responding on the fileserver when the fileserver quietly cancels the call with an internally marked abort (RX_CALL_DEAD), without sending an ABORT to the client. This causes the client's call to eventually expire from lack of incoming packets directed its way, which currently leads to it being cancelled locally with ETIME. Note that it's not currently clear as to why this happens as it's really hard to reproduce. The rotation policy implement by kAFS, however, doesn't differentiate between ETIME meaning we didn't get any response from the server and ETIME meaning the call got cancelled mid-flow. The latter leads to an oops when fetching data as the rotation partially resets the afs_read descriptor, which can result in a cleared page pointer being dereferenced because that page has already been filled. Handle this by the following means: (1) Set a flag on a call when we receive a packet for it. (2) Store the highest packet serial number so far received for a call (bearing in mind this may wrap). (3) If, when the "not received anything recently" timeout expires on a call, we've received at least one packet for a call and the connection as a whole has received packets more recently than that call, then cancel the call locally with ECONNRESET rather than ETIME. This indicates that the call was definitely in progress on the server. (4) In kAFS, if the rotation algorithm sees ECONNRESET rather than ETIME, don't try the next server, but rather abort the call. This avoids the oops as we don't try to reuse the afs_read struct. Rather, as-yet ungotten pages will be reread at a later data. Also: (5) Add an rxrpc tracepoint to log detection of the call being reset. Without this, I occasionally see an oops like the following: general protection fault: 0000 [#1] SMP PTI ... RIP: 0010:_copy_to_iter+0x204/0x310 RSP: 0018:ffff8800cae0f828 EFLAGS: 00010206 RAX: 0000000000000560 RBX: 0000000000000560 RCX: 0000000000000560 RDX: ffff8800cae0f968 RSI: ffff8800d58b3312 RDI: 0005080000000000 RBP: ffff8800cae0f968 R08: 0000000000000560 R09: ffff8800ca00f400 R10: ffff8800c36f28d4 R11: 00000000000008c4 R12: ffff8800cae0f958 R13: 0000000000000560 R14: ffff8800d58b3312 R15: 0000000000000560 FS: 00007fdaef108080(0000) GS:ffff8800ca680000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fb28a8fa000 CR3: 00000000d2a76002 CR4: 00000000001606e0 Call Trace: skb_copy_datagram_iter+0x14e/0x289 rxrpc_recvmsg_data.isra.0+0x6f3/0xf68 ? trace_buffer_unlock_commit_regs+0x4f/0x89 rxrpc_kernel_recv_data+0x149/0x421 afs_extract_data+0x1e0/0x798 ? afs_wait_for_call_to_complete+0xc9/0x52e afs_deliver_fs_fetch_data+0x33a/0x5ab afs_deliver_to_call+0x1ee/0x5e0 ? afs_wait_for_call_to_complete+0xc9/0x52e afs_wait_for_call_to_complete+0x12b/0x52e ? wake_up_q+0x54/0x54 afs_make_call+0x287/0x462 ? afs_fs_fetch_data+0x3e6/0x3ed ? rcu_read_lock_sched_held+0x5d/0x63 afs_fs_fetch_data+0x3e6/0x3ed afs_fetch_data+0xbb/0x14a afs_readpages+0x317/0x40d __do_page_cache_readahead+0x203/0x2ba ? ondemand_readahead+0x3a7/0x3c1 ondemand_readahead+0x3a7/0x3c1 generic_file_buffered_read+0x18b/0x62f __vfs_read+0xdb/0xfe vfs_read+0xb2/0x137 ksys_read+0x50/0x8c do_syscall_64+0x7d/0x1a0 entry_SYSCALL_64_after_hwframe+0x49/0xbe Note the weird value in RDI which is a result of trying to kmap() a NULL page pointer. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04Merge tag 'locks-v4.18-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux Pull fasync fix from Jeff Layton: "Just a single fix for a deadlock in the fasync handling code that Kirill observed while testing. The fix is to change the fa_lock to be rwlock_t, and use a read lock in kill_fasync_rcu" * tag 'locks-v4.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux: fasync: Fix deadlock between task-context and interrupt-context kill_fasync()
2018-06-04Merge tag 'docs-4.18' of git://git.lwn.net/linuxLinus Torvalds
Pull documentation updates from Jonathan Corbet: "There's been a fair amount of work in the docs tree this time around, including: - Extensive RST conversions and organizational work in the memory-management docs thanks to Mike Rapoport. - An update of Documentation/features from Andrea Parri and a script to keep it updated. - Various LICENSES updates from Thomas, along with a script to check SPDX tags. - Work to fix dangling references to documentation files; this involved a fair number of one-liner comment changes outside of Documentation/ ... and the usual list of documentation improvements, typo fixes, etc" * tag 'docs-4.18' of git://git.lwn.net/linux: (103 commits) Documentation: document hung_task_panic kernel parameter docs/admin-guide/mm: add high level concepts overview docs/vm: move ksm and transhuge from "user" to "internals" section. docs: Use the kerneldoc comments for memalloc_no*() doc: document scope NOFS, NOIO APIs docs: update kernel versions and dates in tables docs/vm: transhuge: split userspace bits to admin-guide/mm/transhuge docs/vm: transhuge: minor updates docs/vm: transhuge: change sections order Documentation: arm: clean up Marvell Berlin family info Documentation: gpio: driver: Fix a typo and some odd grammar docs: ranoops.rst: fix location of ramoops.txt scripts/documentation-file-ref-check: rewrite it in perl with auto-fix mode docs: uio-howto.rst: use a code block to solve a warning mm, THP, doc: Add document for thp_swpout/thp_swpout_fallback w1: w1_io.c: fix a kernel-doc warning Documentation/process/posting: wrap text at 80 cols docs: admin-guide: add cgroup-v2 documentation Revert "Documentation/features/vm: Remove arch support status file for 'pte_special'" Documentation: refcount-vs-atomic: Update reference to LKMM doc. ...
2018-06-04NFS: Filter cache invalidation when holding a delegationTrond Myklebust
If the client holds a delegation, then ensure we filter out attempts to invalidate the size, owner, group owner, or mode unless we made the change, in which case, check that NFS_INO_REVAL_FORCED is set by the caller. Always filter out attempts to invalidate the change attribute and size, since we are authoritative for those. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-06-04NFS: Ignore NFS_INO_REVAL_FORCED in nfs_check_inode_attributes()Trond Myklebust
If we hold a delegation, we should not need to call nfs_check_inode_attributes() since we already know which attributes are valid, and which ones may still need revalidation. The state of the NFS_INO_REVAL_FORCED flag is therefore irrelevant. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>