summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2020-11-13Merge tag 'ext4_for_linus_bugfixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "Two ext4 bug fixes, one being a revert of a commit sent during the merge window" * tag 'ext4_for_linus_bugfixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: Revert "ext4: fix superblock checksum calculation race" ext4: handle dax mount option collision
2020-11-12Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscryptLinus Torvalds
Pull fscrypt fix from Eric Biggers: "Fix a regression where new files weren't using inline encryption when they should be" * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt: fscrypt: fix inline encryption not used on new files
2020-11-12Merge tag 'gfs2-v5.10-rc3-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 fixes from Andreas Gruenbacher: "Fix jdata data corruption and glock reference leak" * tag 'gfs2-v5.10-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: Fix case in which ail writes are done to jdata holes Revert "gfs2: Ignore journal log writes for jdata holes" gfs2: fix possible reference leak in gfs2_check_blk_type
2020-11-12Merge tag 'nfs-for-5.10-2' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds
Pull NFS client bugfixes from Anna Schumaker: "Stable fixes: - Fix failure to unregister shrinker Other fixes: - Fix unnecessary locking to clear up some contention - Fix listxattr receive buffer size - Fix default mount options for nfsroot" * tag 'nfs-for-5.10-2' of git://git.linux-nfs.org/projects/anna/linux-nfs: NFS: Remove unnecessary inode lock in nfs_fsync_dir() NFS: Remove unnecessary inode locking in nfs_llseek_dir() NFS: Fix listxattr receive buffer size NFSv4.2: fix failure to unregister shrinker nfsroot: Default mount option should ask for built-in NFS version
2020-11-12gfs2: Fix case in which ail writes are done to jdata holesBob Peterson
Patch b2a846dbef4e ("gfs2: Ignore journal log writes for jdata holes") tried (unsuccessfully) to fix a case in which writes were done to jdata blocks, the blocks are sent to the ail list, then a punch_hole or truncate operation caused the blocks to be freed. In other words, the ail items are for jdata holes. Before b2a846dbef4e, the jdata hole caused function gfs2_block_map to return -EIO, which was eventually interpreted as an IO error to the journal, and then withdraw. This patch changes function gfs2_get_block_noalloc, which is only used for jdata writes, so it returns -ENODATA rather than -EIO, and when -ENODATA is returned to gfs2_ail1_start_one, the error is ignored. We can safely ignore it because gfs2_ail1_start_one is only called when the jdata pages have already been written and truncated, so the ail1 content no longer applies. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2020-11-12Revert "gfs2: Ignore journal log writes for jdata holes"Bob Peterson
This reverts commit b2a846dbef4ef54ef032f0f5ee188c609a0278a7. That commit changed the behavior of function gfs2_block_map to return -ENODATA in cases where a hole (IOMAP_HOLE) is encountered and create is false. While that fixed the intended problem for jdata, it also broke other callers of gfs2_block_map such as some jdata block reads. Before the patch, an encountered hole would be skipped and the buffer seen as unmapped by the caller. The patch changed the behavior to return -ENODATA, which is interpreted as an error by the caller. The -ENODATA return code should be restricted to the specific case where jdata holes are encountered during ail1 writes. That will be done in a later patch. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2020-11-12NFS: Remove unnecessary inode lock in nfs_fsync_dir()Trond Myklebust
nfs_inc_stats() is already thread-safe, and there are no other reasons to hold the inode lock here. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-11-12NFS: Remove unnecessary inode locking in nfs_llseek_dir()Trond Myklebust
Remove the contentious inode lock, and instead provide thread safety using the file->f_lock spinlock. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-11-12NFS: Fix listxattr receive buffer sizeChuck Lever
Certain NFSv4.2/RDMA tests fail with v5.9-rc1. rpcrdma_convert_kvec() runs off the end of the rl_segments array because rq_rcv_buf.tail[0].iov_len holds a very large positive value. The resultant kernel memory corruption is enough to crash the client system. Callers of rpc_prepare_reply_pages() must reserve an extra XDR_UNIT in the maximum decode size for a possible XDR pad of the contents of the xdr_buf's pages. That guarantees the allocated receive buffer will be large enough to accommodate the usual contents plus that XDR pad word. encode_op_hdr() cannot add that extra word. If it does, xdr_inline_pages() underruns the length of the tail iovec. Fixes: 3e1f02123fba ("NFSv4.2: add client side XDR handling for extended attributes") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-11-12NFSv4.2: fix failure to unregister shrinkerJ. Bruce Fields
We forgot to unregister the nfs4_xattr_large_entry_shrinker. That leaves the global list of shrinkers corrupted after unload of the nfs module, after which possibly unrelated code that calls register_shrinker() or unregister_shrinker() gets a BUG() with "supervisor write access in kernel mode". And similarly for the nfs4_xattr_large_entry_lru. Reported-by: Kris Karas <bugs-a17@moonlit-rail.com> Tested-By: Kris Karas <bugs-a17@moonlit-rail.com> Fixes: 95ad37f90c33 "NFSv4.2: add client side xattr caching." Signed-off-by: J. Bruce Fields <bfields@redhat.com> CC: stable@vger.kernel.org Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-11-12gfs2: fix possible reference leak in gfs2_check_blk_typeZhang Qilong
In the fail path of gfs2_check_blk_type, forgetting to call gfs2_glock_dq_uninit will result in rgd_gh reference leak. Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2020-11-11fscrypt: fix inline encryption not used on new filesEric Biggers
The new helper function fscrypt_prepare_new_inode() runs before S_ENCRYPTED has been set on the new inode. This accidentally made fscrypt_select_encryption_impl() never enable inline encryption on newly created files, due to its use of fscrypt_needs_contents_encryption() which only returns true when S_ENCRYPTED is set. Fix this by using S_ISREG() directly instead of fscrypt_needs_contents_encryption(), analogous to what select_encryption_mode() does. I didn't notice this earlier because by design, the user-visible behavior is the same (other than performance, potentially) regardless of whether inline encryption is used or not. Fixes: a992b20cd4ee ("fscrypt: add fscrypt_prepare_new_inode() and fscrypt_set_context()") Reviewed-by: Satya Tangirala <satyat@google.com> Link: https://lore.kernel.org/r/20201111015224.303073-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-11-11Revert "ext4: fix superblock checksum calculation race"Theodore Ts'o
This reverts commit acaa532687cdc3a03757defafece9c27aa667546 which can result in a ext4_superblock_csum_set() trying to sleep while a spinlock is being held. For more discussion of this issue, please see: https://lore.kernel.org/r/000000000000f50cb705b313ed70@google.com Reported-by: syzbot+7a4ba6a239b91a126c28@syzkaller.appspotmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-11ext4: handle dax mount option collisionHarshad Shirwadkar
Mount options dax=inode and dax=never collided with fast_commit and journal checksum. Redefine the mount flags to remove the collision. Reported-by: Murphy Zhou <jencce.kernel@gmail.com> Fixes: 9cb20f94afcd2 ("fs/ext4: Make DAX mount option a tri-state") Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201111183209.447175-1-harshads@google.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-10Merge tag 'for-5.10-rc3-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "A handful of minor fixes and updates: - handle missing device replace item on mount (syzbot report) - fix space reservation calculation when finishing relocation - fix memory leak on error path in ref-verify (debugging feature) - fix potential overflow during defrag on 32bit arches - minor code update to silence smatch warning - minor error message updates" * tag 'for-5.10-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: ref-verify: fix memory leak in btrfs_ref_tree_mod btrfs: dev-replace: fail mount if we don't have replace item with target device btrfs: scrub: update message regarding read-only status btrfs: clean up NULL checks in qgroup_unreserve_range() btrfs: fix min reserved size calculation in merge_reloc_root btrfs: print the block rsv type when we fail our reservation btrfs: fix potential overflow in cluster_pages_for_defrag on 32bit arch
2020-11-10Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscryptLinus Torvalds
Pull fscrypt fix from Eric Biggers: "Fix a regression where a new WARN_ON() was reachable when using FSCRYPT_POLICY_FLAG_IV_INO_LBLK_32 on ext4, causing xfstest generic/602 to sometimes fail on ext4" * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt: fscrypt: remove reachable WARN in fscrypt_setup_iv_ino_lblk_32_key()
2020-11-09Merge tag 'nfsd-5.10-1' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
Pull nfsd fixes from Bruce Fields: "This is mainly server-to-server copy and fallout from Chuck's 5.10 rpc refactoring" * tag 'nfsd-5.10-1' of git://linux-nfs.org/~bfields/linux: net/sunrpc: fix useless comparison in proc_do_xprt() net/sunrpc: return 0 on attempt to write to "transports" NFSD: fix missing refcount in nfsd4_copy by nfsd4_do_async_copy NFSD: Fix use-after-free warning when doing inter-server copy NFSD: MKNOD should return NFSERR_BADTYPE instead of NFSERR_INVAL SUNRPC: Fix general protection fault in trace_rpc_xdr_overflow() NFSD: NFSv3 PATHCONF Reply is improperly formed
2020-11-09Merge tag 'ext4_for_linus_cleanups' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes and cleanups from Ted Ts'o: "More fixes and cleanups for the new fast_commit features, but also a few other miscellaneous bug fixes and a cleanup for the MAINTAINERS file" * tag 'ext4_for_linus_cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (28 commits) jbd2: fix up sparse warnings in checkpoint code ext4: fix sparse warnings in fast_commit code ext4: cleanup fast commit mount options jbd2: don't start fast commit on aborted journal ext4: make s_mount_flags modifications atomic ext4: issue fsdev cache flush before starting fast commit ext4: disable fast commit with data journalling ext4: fix inode dirty check in case of fast commits ext4: remove unnecessary fast commit calls from ext4_file_mmap ext4: mark buf dirty before submitting fast commit buffer ext4: fix code documentatioon ext4: dedpulicate the code to wait on inode that's being committed jbd2: don't read journal->j_commit_sequence without taking a lock jbd2: don't touch buffer state until it is filled jbd2: add todo for a fast commit performance optimization jbd2: don't pass tid to jbd2_fc_end_commit_fallback() jbd2: don't use state lock during commit path jbd2: drop jbd2_fc_init documentation ext4: clean up the JBD2 API that initializes fast commits jbd2: rename j_maxlen to j_total_len and add jbd2_journal_max_txn_bufs ...
2020-11-09Merge tag 'erofs-for-5.10-rc4-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs Pull erofs fixes from Gao Xiang: "A week ago, Vladimir reported an issue that the kernel log would become polluted if the page allocation debug option is enabled. I also found this when I cleaned up magical page->mapping and originally planned to submit these all for 5.11 but it seems the impact can be noticed so submit the fix in advance. In addition, nl6720 also reported that atime is empty although EROFS has the only one on-disk timestamp as a practical consideration for now but it's better to derive it as what we did for the other timestamps. Summary: - fix setting up pcluster improperly for temporary pages - derive atime instead of leaving it empty" * tag 'erofs-for-5.10-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: erofs: fix setting up pcluster for temporary pages erofs: derive atime instead of leaving it empty
2020-11-08Merge tag 'xfs-5.10-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull xfs fixes from Darrick Wong: - Fix an uninitialized struct problem - Fix an iomap problem zeroing unwritten EOF blocks - Fix some clumsy error handling when writeback fails on filesystems with blocksize < pagesize - Fix a retry loop not resetting loop variables properly - Fix scrub flagging rtinherit inodes on a non-rt fs, since the kernel actually does permit that combination - Fix excessive page cache flushing when unsharing part of a file * tag 'xfs-5.10-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: only flush the unshared range in xfs_reflink_unshare xfs: fix scrub flagging rtinherit even if there is no rt device xfs: fix missing CoW blocks writeback conversion retry iomap: clean up writeback state logic on writepage error iomap: support partial page discard on writeback block mapping failure xfs: flush new eof page on truncate to avoid post-eof corruption xfs: set xefi_discard when creating a deferred agfl free log intent item
2020-11-08Merge branch 'hch' (patches from Christoph)Linus Torvalds
Merge procfs splice read fixes from Christoph Hellwig: "Greg reported a problem due to the fact that Android tests use procfs files to test splice, which stopped working with the changes for set_fs() removal. This series adds read_iter support for seq_file, and uses those for various proc files using seq_file to restore splice read support" [ Side note: Christoph initially had a scripted "move everything over" patch, which looks fine, but I personally would prefer us to actively discourage splice() on random files. So this does just the minimal basic core set of proc file op conversions. For completeness, and in case people care, that script was sed -i -e 's/\.proc_read\(\s*=\s*\)seq_read/\.proc_read_iter\1seq_read_iter/g' but I'll wait and see if somebody has a strong argument for using splice on random small /proc files before I'd run it on the whole kernel. - Linus ] * emailed patches from Christoph Hellwig <hch@lst.de>: proc "seq files": switch to ->read_iter proc "single files": switch to ->read_iter proc/stat: switch to ->read_iter proc/cpuinfo: switch to ->read_iter proc: wire up generic_file_splice_read for iter ops seq_file: add seq_read_iter
2020-11-07Merge tag 'io_uring-5.10-2020-11-07' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull io_uring fixes from Jens Axboe: "A set of fixes for io_uring: - SQPOLL cancelation fixes - Two fixes for the io_identity COW - Cancelation overflow fix (Pavel) - Drain request cancelation fix (Pavel) - Link timeout race fix (Pavel)" * tag 'io_uring-5.10-2020-11-07' of git://git.kernel.dk/linux-block: io_uring: fix link lookup racing with link timeout io_uring: use correct pointer for io_uring_show_cred() io_uring: don't forget to task-cancel drained reqs io_uring: fix overflowed cancel w/ linked ->files io_uring: drop req/tctx io_identity separately io_uring: ensure consistent view of original task ->mm from SQPOLL io_uring: properly handle SQPOLL request cancelations io-wq: cancel request if it's asking for files and we don't have them
2020-11-07jbd2: fix up sparse warnings in checkpoint codeTheodore Ts'o
Add missing __acquires() and __releases() annotations. Also, in an "this should never happen" WARN_ON check, if it *does* actually happen, we need to release j_state_lock since this function is always supposed to release that lock. Otherwise, things will quickly grind to a halt after the WARN_ON trips. Fixes: 96f1e0974575 ("jbd2: avoid long hold times of j_state_lock...") Cc: stable@kernel.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-07ext4: fix sparse warnings in fast_commit codeTheodore Ts'o
Add missing __acquire() and __releases() annotations, and make fc_ineligible_reasons[] static, as it is not used outside of fs/ext4/fast_commit.c. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06ext4: cleanup fast commit mount optionsHarshad Shirwadkar
Drop no_fc mount option that disable fast commit even if it was enabled at mkfs time. Move fc_debug_force mount option under ifdef EXT4_DEBUG to annotate that this is strictly for debugging and testing purposes and should not be used in production. Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201106035911.1942128-23-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06jbd2: don't start fast commit on aborted journalHarshad Shirwadkar
Fast commit should not be started if the journal is aborted. Signed-off-by: Harshad Shirwadkar<harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201106035911.1942128-22-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06ext4: make s_mount_flags modifications atomicHarshad Shirwadkar
Fast commit file system states are recorded in sbi->s_mount_flags. Fast commit expects these bit manipulations to be atomic. This patch adds helpers to make those modifications atomic. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201106035911.1942128-21-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06ext4: issue fsdev cache flush before starting fast commitHarshad Shirwadkar
If the journal dev is different from fsdev, issue a cache flush before committing fast commit blocks to disk. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20201106035911.1942128-20-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06ext4: disable fast commit with data journallingHarshad Shirwadkar
Fast commits don't work with data journalling. This patch disables the fast commit support when data journalling is turned on. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20201106035911.1942128-19-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06ext4: fix inode dirty check in case of fast commitsHarshad Shirwadkar
In case of fast commits, determine if the inode is dirty by checking if the inode is on fast commit list. This also helps us get rid of ext4_inode_info.i_fc_committed_subtid field. Reported-by: Andrea Righi <andrea.righi@canonical.com> Tested-by: Andrea Righi <andrea.righi@canonical.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201106035911.1942128-18-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06ext4: remove unnecessary fast commit calls from ext4_file_mmapHarshad Shirwadkar
Remove unnecessary calls to ext4_fc_start_update() and ext4_fc_stop_update() from ext4_file_mmap(). Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201106035911.1942128-17-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06ext4: mark buf dirty before submitting fast commit bufferHarshad Shirwadkar
Mark the fast commit buffer as dirty before submission. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201106035911.1942128-16-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06ext4: fix code documentatioonHarshad Shirwadkar
Add a TODO to remember fixing REQ_FUA | REQ_PREFLUSH for fast commit buffers. Also, fix a typo in top level comment in fast_commit.c Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201106035911.1942128-15-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06ext4: dedpulicate the code to wait on inode that's being committedHarshad Shirwadkar
This patch removes the deduplicates the code that implements waiting on inode that's being committed. That code is moved into a new function. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201106035911.1942128-14-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06jbd2: don't read journal->j_commit_sequence without taking a lockHarshad Shirwadkar
Take journal state lock before reading journal->j_commit_sequence. Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201106035911.1942128-13-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06jbd2: don't touch buffer state until it is filledHarshad Shirwadkar
Fast commit buffers should be filled in before toucing their state. Remove code that sets buffer state as dirty before the buffer is passed to the file system. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201106035911.1942128-12-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06jbd2: add todo for a fast commit performance optimizationHarshad Shirwadkar
Fast commit performance can be optimized if commit thread doesn't wait for ongoing fast commits to complete until the transaction enters T_FLUSH state. Document this optimization. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201106035911.1942128-11-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06jbd2: don't pass tid to jbd2_fc_end_commit_fallback()Harshad Shirwadkar
In jbd2_fc_end_commit_fallback(), we know which tid to commit. There's no need for caller to pass it. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201106035911.1942128-10-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06jbd2: don't use state lock during commit pathHarshad Shirwadkar
Variables journal->j_fc_off, journal->j_fc_wbuf are accessed during commit path. Since today we allow only one process to perform a fast commit, there is no need take state lock before accessing these variables. This patch removes these locks and adds comments to describe this. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201106035911.1942128-9-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06ext4: clean up the JBD2 API that initializes fast commitsHarshad Shirwadkar
This patch removes jbd2_fc_init() API and its related functions to simplify enabling fast commits. With this change, the number of fast commit blocks to use is solely determined by the JBD2 layer. So, we move the default value for minimum number of fast commit blocks from ext4/fast_commit.h to include/linux/jbd2.h. However, whether or not to use fast commits is determined by the file system. The file system just sets the fast commit feature using jbd2_journal_set_features(). JBD2 layer then determines how many blocks to use for fast commits (based on the value found in the JBD2 superblock). Note that the JBD2 feature flag of fast commits is just an indication that there are fast commit blocks present on disk. It doesn't tell JBD2 layer about the intent of the file system of whether to it wants to use fast commit or not. That's why, we blindly clear the fast commit flag in journal_reset() after the recovery is done. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201106035911.1942128-7-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06jbd2: rename j_maxlen to j_total_len and add jbd2_journal_max_txn_bufsHarshad Shirwadkar
The on-disk superblock field sb->s_maxlen represents the total size of the journal including the fast commit area and is no more the max number of blocks available for a transaction. The maximum number of blocks available to a transaction is reduced by the number of fast commit blocks. So, this patch renames j_maxlen to j_total_len to better represent its intent. Also, it adds a function to calculate max number of bufs available for a transaction. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201106035911.1942128-6-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06ext4: fixup ext4_fc_track_* functions' signatureHarshad Shirwadkar
Firstly, pass handle to all ext4_fc_track_* functions and use transaction id found in handle->h_transaction->h_tid for tracking fast commit updates. Secondly, don't pass inode to ext4_fc_track_link/create/unlink functions. inode can be found inside these functions as d_inode(dentry). However, rename path is an exeception. That's because in that case, we need inode that's not same as d_inode(dentry). To handle that, add a couple of low-level wrapper functions that take inode and dentry as arguments. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201106035911.1942128-5-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06ext4: drop redundant calls ext4_fc_track_rangeHarshad Shirwadkar
ext4_fc_track_range() should only be called when blocks are added or removed from an inode. So, the only places from where we need to call this function are ext4_map_blocks(), punch hole, collapse / zero range, truncate. Remove all the other redundant calls to ths function. Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201106035911.1942128-4-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06ext4: mark fc ineligible if inode gets evictied due to mem pressureHarshad Shirwadkar
If inode gets evicted due to memory pressure, we have to remove it from the fast commit list. However, that inode may have uncommitted changes that fast commits will lose. So, just fall back to full commits in this case. Also, rename the fast commit ineligiblity reason from "EXT4_FC_REASON_MEM" to "EXT4_FC_REASON_MEM_NOMEM" for better expression. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201106035911.1942128-3-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06ext4: describe fast_commit feature flagsHarshad Shirwadkar
Fast commit feature has flags in the file system as well in JBD2. The meaning of fast commit feature flags can get confusing. Update docs and code to add more documentation about it. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20201106035911.1942128-2-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06ext4: unlock xattr_sem properly in ext4_inline_data_truncate()Joseph Qi
It takes xattr_sem to check inline data again but without unlock it in case not have. So unlock it before return. Fixes: aef1c8513c1f ("ext4: let ext4_truncate handle inline data correctly") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Tao Ma <boyu.mt@taobao.com> Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/1604370542-124630-1-git-send-email-joseph.qi@linux.alibaba.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
2020-11-06ext4: silence an uninitialized variable warningDan Carpenter
Smatch complains that "i" can be uninitialized if we don't enter the loop. I don't know if it's possible but we may as well silence this warning. [ Initialize i to sb->s_blocksize instead of 0. The only way the for loop could be skipped entirely is the in-memory data structures, in particular the bh->b_data for the on-disk superblock has gotten corrupted enough that calculated value of group is >= to ext4_get_groups_count(sb). In that case, we want to exit immediately without allocating a block. -- TYT ] Fixes: 8016e29f4362 ("ext4: fast commit recovery path") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20201030114620.GB3251003@mwanda Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
2020-11-06ext4: correctly report "not supported" for {usr,grp}jquota when !CONFIG_QUOTAKaixu Xia
The macro MOPT_Q is used to indicates the mount option is related to quota stuff and is defined to be MOPT_NOSUPPORT when CONFIG_QUOTA is disabled. Normally the quota options are handled explicitly, so it didn't matter that the MOPT_STRING flag was missing, even though the usrjquota and grpjquota mount options take a string argument. It's important that's present in the !CONFIG_QUOTA case, since without MOPT_STRING, the mount option matcher will match usrjquota= followed by an integer, and will otherwise skip the table entry, and so "mount option not supported" error message is never reported. [ Fixed up the commit description to better explain why the fix works. --TYT ] Fixes: 26092bf52478 ("ext4: use a table-driven handler for mount options") Signed-off-by: Kaixu Xia <kaixuxia@tencent.com> Link: https://lore.kernel.org/r/1603986396-28917-1-git-send-email-kaixuxia@tencent.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
2020-11-06Merge tag 'ceph-for-5.10-rc3' of git://github.com/ceph/ceph-clientLinus Torvalds
Pull ceph fix from Ilya Dryomov: "A fix for a potential stall on umount caused by the MDS dropping our REQUEST_CLOSE message. The code that handled this case was inadvertently disabled in 5.9, this patch removes it entirely and fixes the problem in a way that is consistent with ceph-fuse" * tag 'ceph-for-5.10-rc3' of git://github.com/ceph/ceph-client: ceph: check session state after bumping session->s_seq
2020-11-06proc "seq files": switch to ->read_iterChristoph Hellwig
Implement ->read_iter for all proc "seq files" so that splice works on them. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>