summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2019-08-29Merge tag '5.3-rc6-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull cifs fixes from Steve French: "A few small SMB3 fixes, and a larger one to fix various older string handling functions" * tag '5.3-rc6-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: update internal module number cifs: replace various strncpy with strscpy and similar cifs: Use kzfree() to zero out the password cifs: set domainName when a domain-key is used in multiuser
2019-08-28nfsd: eliminate an unnecessary acl size limitJ. Bruce Fields
We're unnecessarily limiting the size of an ACL to less than what most filesystems will support. Some users do hit the limit and it's confusing and unnecessary. It still seems prudent to impose some limit on the number of ACEs the client gives us before passing it straight to kmalloc(). So, let's just limit it to the maximum number that would be possible given the amount of data left in the argument buffer. That will still leave one limit beyond whatever the filesystem imposes: the client and server negotiate a limit on the size of a request, which we have to respect. But we're no longer imposing any additional arbitrary limit. struct nfs4_ace is 20 bytes on my system and the maximum call size we'll negotiate is about a megabyte, so in practice this is limiting the allocation here to about a megabyte. Reported-by: "de Vandiere, Louis" <louis.devandiere@atos.net> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-08-28xfs: log proper length of btree block in scrub/repairEric Sandeen
xfs_trans_log_buf() takes a final argument of the last byte to log in the buffer; b_length is in basic blocks, so this isn't the correct last byte. Fix it. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-28xfs: reinitialize rm_flags when unpacking an offset into an rmap irecDarrick J. Wong
In xfs_rmap_irec_offset_unpack, we should always clear the contents of rm_flags before we begin unpacking the encoded (ondisk) offset into the incore rm_offset and incore rm_flags fields. Remove the open-coded field zeroing as this encourages api misuse. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-08-28xfs: remove unnecessary int returns from deferred bmap functionsDarrick J. Wong
Remove the return value from the functions that schedule deferred bmap operations since they never fail and do not return status. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-08-28xfs: remove unnecessary int returns from deferred refcount functionsDarrick J. Wong
Remove the return value from the functions that schedule deferred refcount operations since they never fail and do not return status. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-08-28xfs: remove unnecessary int returns from deferred rmap functionsDarrick J. Wong
Remove the return value from the functions that schedule deferred rmap operations since they never fail and do not return status. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-08-28xfs: remove unnecessary parameter from xfs_iext_inc_seqDarrick J. Wong
This function doesn't use the @state parameter, so get rid of it. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-08-28xfs: fix sign handling problem in xfs_bmbt_diff_two_keysDarrick J. Wong
In xfs_bmbt_diff_two_keys, we perform a signed int64_t subtraction with two unsigned 64-bit quantities. If the second quantity is actually the "maximum" key (all ones) as used in _query_all, the subtraction effectively becomes addition of two positive numbers and the function returns incorrect results. Fix this with explicit comparisons of the unsigned values. Nobody needs this now, but the online repair patches will need this to work properly. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-08-28xfs: don't return _QUERY_ABORT from xfs_rmap_has_other_keysDarrick J. Wong
The xfs_rmap_has_other_keys helper aborts the iteration as soon as it has an answer. Don't let this abort leak out to callers. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-08-28xfs: fix maxicount division by zero errorDarrick J. Wong
In xfs_ialloc_setup_geometry, it's possible for a malicious/corrupt fs image to set an unreasonably large value for sb_inopblog which will cause ialloc_blks to be zero. If sb_imax_pct is also set, this results in a division by zero error in the second do_div call. Therefore, force maxicount to zero if ialloc_blks is zero. Note that the kernel metadata verifiers will catch the garbage inopblog value and abort the fs mount long before it tries to set up the inode geometry; this is needed to avoid a crash in xfs_db while setting up the xfs_mount structure. Found by fuzzing sb_inopblog to 122 in xfs/350. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
2019-08-28ext4: fix integer overflow when calculating commit intervalzhangyi (F)
If user specify a large enough value of "commit=" option, it may trigger signed integer overflow which may lead to sbi->s_commit_interval becomes a large or small value, zero in particular. UBSAN: Undefined behaviour in ../fs/ext4/super.c:1592:31 signed integer overflow: 536870912 * 1000 cannot be represented in type 'int' [...] Call trace: [...] [<ffffff9008a2d120>] ubsan_epilogue+0x34/0x9c lib/ubsan.c:166 [<ffffff9008a2d8b8>] handle_overflow+0x228/0x280 lib/ubsan.c:197 [<ffffff9008a2d95c>] __ubsan_handle_mul_overflow+0x4c/0x68 lib/ubsan.c:218 [<ffffff90086d070c>] handle_mount_opt fs/ext4/super.c:1592 [inline] [<ffffff90086d070c>] parse_options+0x1724/0x1a40 fs/ext4/super.c:1773 [<ffffff90086d51c4>] ext4_remount+0x2ec/0x14a0 fs/ext4/super.c:4834 [...] Although it is not a big deal, still silence the UBSAN by limit the input value. Signed-off-by: zhangyi (F) <yi.zhang@huawei.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz>
2019-08-28ext4: use percpu_counters for extent_status cache hits/missesYang Guo
@es_stats_cache_hits and @es_stats_cache_misses are accessed frequently in ext4_es_lookup_extent function, it would influence the ext4 read/write performance in NUMA system. Let's optimize it using percpu_counter, it is profitable for the performance. The test command is as below: fio -name=randwrite -numjobs=8 -filename=/mnt/test1 -rw=randwrite -ioengine=libaio -direct=1 -iodepth=64 -sync=0 -norandommap -group_reporting -runtime=120 -time_based -bs=4k -size=5G And the result is better 10% than the initial implement: without the patch,IOPS=197k, BW=770MiB/s (808MB/s)(90.3GiB/120002msec) with the patch, IOPS=218k, BW=852MiB/s (894MB/s)(99.9GiB/120002msec) Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Yang Guo <guoyang2@huawei.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
2019-08-28ext4: fix potential use after free after remounting with noblock_validityzhangyi (F)
Remount process will release system zone which was allocated before if "noblock_validity" is specified. If we mount an ext4 file system to two mountpoints with default mount options, and then remount one of them with "noblock_validity", it may trigger a use after free problem when someone accessing the other one. # mount /dev/sda foo # mount /dev/sda bar User access mountpoint "foo" | Remount mountpoint "bar" | ext4_map_blocks() | ext4_remount() check_block_validity() | ext4_setup_system_zone() ext4_data_block_valid() | ext4_release_system_zone() | free system_blks rb nodes access system_blks rb nodes | trigger use after free | This problem can also be reproduced by one mountpint, At the same time, add_system_zone() can get called during remount as well so there can be racing ext4_data_block_valid() reading the rbtree at the same time. This patch add RCU to protect system zone from releasing or building when doing a remount which inverse current "noblock_validity" mount option. It assign the rbtree after the whole tree was complete and do actual freeing after rcu grace period, avoid any intermediate state. Reported-by: syzbot+1e470567330b7ad711d5@syzkaller.appspotmail.com Signed-off-by: zhangyi (F) <yi.zhang@huawei.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz>
2019-08-27cifs: update internal module numberSteve French
To 2.22 Signed-off-by: Steve French <stfrench@microsoft.com>
2019-08-27cifs: replace various strncpy with strscpy and similarRonnie Sahlberg
Using strscpy is cleaner, and avoids some problems with handling maximum length strings. Linus noticed the original problem and Aurelien pointed out some additional problems. Fortunately most of this is SMB1 code (and in particular the ASCII string handling older, which is less common). Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-08-27cifs: Use kzfree() to zero out the passwordDan Carpenter
It's safer to zero out the password so that it can never be disclosed. Fixes: 0c219f5799c7 ("cifs: set domainName when a domain-key is used in multiuser") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-08-27cifs: set domainName when a domain-key is used in multiuserRonnie Sahlberg
RHBZ: 1710429 When we use a domain-key to authenticate using multiuser we must also set the domainnmame for the new volume as it will be used and passed to the server in the NTLMSSP Domain-name. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-08-27Merge tag 'nfs-for-5.3-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client bugfixes from Trond Myklebust: "Highlights include: Stable fixes: - Fix a page lock leak in nfs_pageio_resend() - Ensure O_DIRECT reports an error if the bytes read/written is 0 - Don't handle errors if the bind/connect succeeded - Revert "NFSv4/flexfiles: Abort I/O early if the layout segment was invalidat ed" Bugfixes: - Don't refresh attributes with mounted-on-file information - Fix return values for nfs4_file_open() and nfs_finish_open() - Fix pnfs layoutstats reporting of I/O errors - Don't use soft RPC calls for pNFS/flexfiles I/O, and don't abort for soft I/O errors when the user specifies a hard mount. - Various fixes to the error handling in sunrpc - Don't report writepage()/writepages() errors twice" * tag 'nfs-for-5.3-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFS: remove set but not used variable 'mapping' NFSv2: Fix write regression NFSv2: Fix eof handling NFS: Fix writepage(s) error handling to not report errors twice NFS: Fix spurious EIO read errors pNFS/flexfiles: Don't time out requests on hard mounts SUNRPC: Handle connection breakages correctly in call_status() Revert "NFSv4/flexfiles: Abort I/O early if the layout segment was invalidated" SUNRPC: Handle EADDRINUSE and ENOBUFS correctly pNFS/flexfiles: Turn off soft RPC calls SUNRPC: Don't handle errors if the bind/connect succeeded NFS: On fatal writeback errors, we need to call nfs_inode_remove_request() NFS: Fix initialisation of I/O result struct in nfs_pgio_rpcsetup NFS: Ensure O_DIRECT reports an error if the bytes read/written is 0 NFSv4/pnfs: Fix a page lock leak in nfs_pageio_resend() NFSv4: Fix return value in nfs_finish_open() NFSv4: Fix return values for nfs4_file_open() NFS: Don't refresh attributes with mounted-on-file information
2019-08-27io_uring: allocate the two rings togetherHristo Venev
Both the sq and the cq rings have sizes just over a power of two, and the sq ring is significantly smaller. By bundling them in a single alllocation, we get the sq ring for free. This also means that IORING_OFF_SQ_RING and IORING_OFF_CQ_RING now mean the same thing. If we indicate this to userspace, we can save a mmap call. Signed-off-by: Hristo Venev <hristo@venev.name> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-27fs/io_uring.c: convert put_page() to put_user_page*()John Hubbard
For pages that were retained via get_user_pages*(), release those pages via the new put_user_page*() routines, instead of via put_page() or release_pages(). This is part a tree-wide conversion, as described in commit fc1d8e7cca2d ("mm: introduce put_user_page*(), placeholder versions"). Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jens Axboe <axboe@kernel.dk> Cc: linux-fsdevel@vger.kernel.org Cc: linux-block@vger.kernel.org Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-27writeback, memcg: Implement cgroup_writeback_by_id()Tejun Heo
Implement cgroup_writeback_by_id() which initiates cgroup writeback from bdi and memcg IDs. This will be used by memcg foreign inode flushing. v2: Use wb_get_lookup() instead of wb_get_create() to avoid creating spurious wbs. v3: Interpret 0 @nr as 1.25 * nr_dirty to implement best-effort flushing while avoding possible livelocks. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-27writeback: Generalize and expose wb_completionTejun Heo
wb_completion is used to track writeback completions. We want to use it from memcg side for foreign inode flushes. This patch updates it to remember the target waitq instead of assuming bdi->wb_waitq and expose it outside of fs-writeback.c. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-27NFS: remove set but not used variable 'mapping'YueHaibing
Fixes gcc '-Wunused-but-set-variable' warning: fs/nfs/write.c: In function nfs_page_async_flush: fs/nfs/write.c:609:24: warning: variable mapping set but not used [-Wunused-but-set-variable] It is not use since commit aefb623c422e ("NFS: Fix writepage(s) error handling to not report errors twice") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-27NFSv2: Fix write regressionTrond Myklebust
Ensure we update the write result count on success, since the RPC call itself does not do so. Reported-by: Jan Stancek <jstancek@redhat.com> Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Tested-by: Jan Stancek <jstancek@redhat.com>
2019-08-27NFSv2: Fix eof handlingTrond Myklebust
If we received a reply from the server with a zero length read and no error, then that implies we are at eof. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-27udf: augment UDF permissions on new inodesSteven J. Magnani
Windows presents files created within Linux as read-only, even when permissions in Linux indicate the file should be writable. UDF defines a slightly different set of basic file permissions than Linux. Specifically, UDF has "delete" and "change attribute" permissions for each access class (user/group/other). Linux has no equivalents for these. When the Linux UDF driver creates a file (or directory), no UDF delete or change attribute permissions are granted. The lack of delete permission appears to cause Windows to mark an item read-only when its permissions otherwise indicate that it should be read-write. Fix this by having UDF delete permissions track Linux write permissions. Also grant UDF change attribute permission to the owner when creating a new inode. Reported by: Ty Young Signed-off-by: Steven J. Magnani <steve@digidescorp.com> Link: https://lore.kernel.org/r/20190827121359.9954-1-steve@digidescorp.com Signed-off-by: Jan Kara <jack@suse.cz>
2019-08-26xfs: bmap scrub should only scrub records onceDarrick J. Wong
The inode block mapping scrub function does more work for btree format extent maps than is absolutely necessary -- first it will walk the bmbt and check all the entries, and then it will load the incore tree and check every entry in that tree, possibly for a second time. Simplify the code and decrease check runtime by separating the two responsibilities. The bmbt walk will make sure the incore extent mappings are loaded, check the shape of the bmap btree (via xchk_btree) and check that every bmbt record has a corresponding incore extent map; and the incore extent map walk takes all the responsibility for checking the mapping records and cross referencing them with other AG metadata. This enables us to clean up some messy parameter handling and reduce redundant code. Rename a few functions to make the split of responsibilities clearer. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-08-26xfs: remove excess function parameter description in ↵zhengbin
'xfs_btree_sblock_v5hdr_verify' Fixes gcc warning: fs/xfs/libxfs/xfs_btree.c:4475: warning: Excess function parameter 'max_recs' description in 'xfs_btree_sblock_v5hdr_verify' fs/xfs/libxfs/xfs_btree.c:4475: warning: Excess function parameter 'pag_max_level' description in 'xfs_btree_sblock_v5hdr_verify' Fixes: c5ab131ba0df ("libxfs: refactor short btree block verification") Signed-off-by: zhengbin <zhengbin13@huawei.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-26xfs: add kmem_alloc_io()Dave Chinner
Memory we use to submit for IO needs strict alignment to the underlying driver contraints. Worst case, this is 512 bytes. Given that all allocations for IO are always a power of 2 multiple of 512 bytes, the kernel heap provides natural alignment for objects of these sizes and that suffices. Until, of course, memory debugging of some kind is turned on (e.g. red zones, poisoning, KASAN) and then the alignment of the heap objects is thrown out the window. Then we get weird IO errors and data corruption problems because drivers don't validate alignment and do the wrong thing when passed unaligned memory buffers in bios. TO fix this, introduce kmem_alloc_io(), which will guaranteeat least 512 byte alignment of buffers for IO, even if memory debugging options are turned on. It is assumed that the minimum allocation size will be 512 bytes, and that sizes will be power of 2 mulitples of 512 bytes. Use this everywhere we allocate buffers for IO. This no longer fails with log recovery errors when KASAN is enabled due to the brd driver not handling unaligned memory buffers: # mkfs.xfs -f /dev/ram0 ; mount /dev/ram0 /mnt/test Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-26xfs: get allocation alignment from the buftargDave Chinner
Needed to feed into the allocation routine to guarantee the memory buffers we add to bios are correctly aligned to the underlying device. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-26xfs: add kmem allocation trace pointsDave Chinner
When trying to correlate XFS kernel allocations to memory reclaim behaviour, it is useful to know what allocations XFS is actually attempting. This information is not directly available from tracepoints in the generic memory allocation and reclaim tracepoints, so these new trace points provide a high level indication of what the XFS memory demand actually is. There is no per-filesystem context in this code, so we just trace the type of allocation, the size and the allocation constraints. The kmem code also doesn't include much of the common XFS headers, so there are a few definitions that need to be added to the trace headers and a couple of types that need to be made common to avoid needing to include the whole world in the kmem code. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-26NFS: Fix writepage(s) error handling to not report errors twiceTrond Myklebust
If writepage()/writepages() saw an error, but handled it without reporting it, we should not be re-reporting that error on exit. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-26NFS: Fix spurious EIO read errorsTrond Myklebust
If the client attempts to read a page, but the read fails due to some spurious error (e.g. an ACCESS error or a timeout, ...) then we need to allow other processes to retry. Also try to report errors correctly when doing a synchronous readpage. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-26pNFS/flexfiles: Don't time out requests on hard mountsTrond Myklebust
If the mount is hard, we should ignore the 'io_maxretrans' module parameter so that we always keep retrying. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-26Revert "NFSv4/flexfiles: Abort I/O early if the layout segment was invalidated"Trond Myklebust
This reverts commit a79f194aa4879e9baad118c3f8bb2ca24dbef765. The mechanism for aborting I/O is racy, since we are not guaranteed that the request is asleep while we're changing both task->tk_status and task->tk_action. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: stable@vger.kernel.org # v5.1
2019-08-26pNFS/flexfiles: Turn off soft RPC callsTrond Myklebust
The pNFS/flexfiles I/O requests are sent with the SOFTCONN flag set, so they automatically time out if the connection breaks. It should therefore not be necessary to have the soft flag set in addition. Fixes: 5f01d9539496 ("nfs41: create NFSv3 DS connection if specified") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-26fs: xfs: Remove KM_NOSLEEP and KM_SLEEP.Tetsuo Handa
Since no caller is using KM_NOSLEEP and no callee branches on KM_SLEEP, we can remove KM_NOSLEEP and replace KM_SLEEP with 0. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-08-26Deprecate nfsd fault injectionJ. Bruce Fields
This is only useful for client testing. I haven't really maintained it, and reference counting and locking are wrong at this point. You can get some of the same functionality now from nfsd/clients/. It was a good idea but I think its time has passed. In the unlikely event of users, hopefully the BROKEN dependency will prompt them to speak up. Otherwise I expect to remove it soon. Reported-by: Alex Lyakas <alex@zadara.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-08-26udf: Use dynamic debug infrastructureJan Kara
Instead of relying on UDFFS_DEBUG define for debug printing, just use standard pr_debug() prints and rely on CONFIG_DYNAMIC_DEBUG infrastructure for enabling or disabling prints. Signed-off-by: Jan Kara <jack@suse.cz>
2019-08-26udf: reduce leakage of blocks related to named streamsSteven J. Magnani
Windows is capable of creating UDF files having named streams. One example is the "Zone.Identifier" stream attached automatically to files downloaded from a network. See: https://msdn.microsoft.com/en-us/library/dn392609.aspx Modification of a file having one or more named streams in Linux causes the stream directory to become detached from the file, essentially leaking all blocks pertaining to the file's streams. Fix by saving off information about an inode's streams when reading it, for later use when its on-disk data is updated. Link: https://lore.kernel.org/r/20190814125002.10869-1-steve@digidescorp.com Signed-off-by: Steven J. Magnani <steve@digidescorp.com> Signed-off-by: Jan Kara <jack@suse.cz>
2019-08-25Merge tag 'for-linus-5.3-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs Pull UBIFS and JFFS2 fixes from Richard Weinberger: "UBIFS: - Don't block too long in writeback_inodes_sb() - Fix for a possible overrun of the log head - Fix double unlock in orphan_delete() JFFS2: - Remove C++ style from UAPI header and unbreak picky toolchains" * tag 'for-linus-5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs: ubifs: Limit the number of pages in shrink_liability ubifs: Correctly initialize c->min_log_bytes ubifs: Fix double unlock around orphan_delete() jffs2: Remove C++ style comments from uapi header
2019-08-24jbd2: add missing tracepoint for reserved handleXiaoguang Wang
This issue was found when I use ebpf to trace every jbd2 handle's running info in dioread_nolock case. Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2019-08-24userfaultfd_release: always remove uffd flags and clear vm_userfaultfd_ctxOleg Nesterov
userfaultfd_release() should clear vm_flags/vm_userfaultfd_ctx even if mm->core_state != NULL. Otherwise a page fault can see userfaultfd_missing() == T and use an already freed userfaultfd_ctx. Link: http://lkml.kernel.org/r/20190820160237.GB4983@redhat.com Fixes: 04f5866e41fb ("coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping") Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Peter Xu <peterx@redhat.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@mellanox.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-08-24Merge tag 'xfs-5.3-fixes-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull xfs fix from Darrick Wong: "A single patch that fixes a xfs lockup problem when a chown/chgrp operation fails due to running out of quota. It has survived the usual xfstests runs and merges cleanly with this morning's master: - Fix a forgotten inode unlock when chown/chgrp fail due to quota" * tag 'xfs-5.3-fixes-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: fix missing ILOCK unlock when xfs_setattr_nonsize fails due to EDQUOT
2019-08-24erofs: move erofs out of stagingGao Xiang
EROFS filesystem has been merged into linux-staging for a year. EROFS is designed to be a better solution of saving extra storage space with guaranteed end-to-end performance for read-only files with the help of reduced metadata, fixed-sized output compression and decompression inplace technologies. In the past year, EROFS was greatly improved by many people as a staging driver, self-tested, betaed by a large number of our internal users, successfully applied to almost all in-service HUAWEI smartphones as the part of EMUI 9.1 and proven to be stable enough to be moved out of staging. EROFS is a self-contained filesystem driver. Although there are still some TODOs to be more generic, we have a dedicated team actively keeping on working on EROFS in order to make it better with the evolution of Linux kernel as the other in-kernel filesystems. As Pavel suggested, it's better to do as one commit since git can do moves and all histories will be saved in this way. Let's promote it from staging and enhance it more actively as a "real" part of kernel for more wider scenarios! Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Pavel Machek <pavel@denx.de> Cc: David Sterba <dsterba@suse.cz> Cc: Amir Goldstein <amir73il@gmail.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Darrick J . Wong <darrick.wong@oracle.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Richard Weinberger <richard@nod.at> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Chao Yu <yuchao0@huawei.com> Cc: Miao Xie <miaoxie@huawei.com> Cc: Li Guifu <bluce.liguifu@huawei.com> Cc: Fang Wei <fangwei1@huawei.com> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190822213659.5501-1-hsiangkao@aol.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-23ext4: fix punch hole for inline_data file systemsTheodore Ts'o
If a program attempts to punch a hole on an inline data file, we need to convert it to a normal file first. This was detected using ext4/032 using the adv configuration. Simple reproducer: mke2fs -Fq -t ext4 -O inline_data /dev/vdc mount /vdc echo "" > /vdc/testfile xfs_io -c 'truncate 33554432' /vdc/testfile xfs_io -c 'fpunch 0 1048576' /vdc/testfile umount /vdc e2fsck -fy /dev/vdc Cc: stable@vger.kernel.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2019-08-23Merge tag 'for-linus-20190823' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: "Here's a set of fixes that should go into this release. This contains: - Three minor fixes for NVMe. - Three minor tweaks for the io_uring polling logic. - Officially mark Song as the MD maintainer, after he's been filling that role sucessfully for the last 6 months or so" * tag 'for-linus-20190823' of git://git.kernel.dk/linux-block: io_uring: add need_resched() check in inner poll loop md: update MAINTAINERS info io_uring: don't enter poll loop if we have CQEs pending nvme: Add quirk for LiteON CL1 devices running FW 22301111 nvme: Fix cntlid validation when not using NVMEoF nvme-multipath: fix possible I/O hang when paths are updated io_uring: fix potential hang with polled IO
2019-08-23Merge tag 'xfs-5.3-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull xfs fixes from Darrick Wong: "Here are a few more bug fixes that trickled in since the last pull. They've survived the usual xfstests runs and merge cleanly with this morning's master. I expect there to be one more pull request tomorrow for the fix to that quota related inode unlock bug that we were reviewing last night, but it will continue to soak in the testing machine for several more hours. - Fix missing compat ioctl handling for get/setlabel - Fix missing ioctl pointer sanitization on s390 - Fix a page locking deadlock in the dedupe comparison code - Fix inadequate locking in reflink code w.r.t. concurrent directio - Fix broken error detection when breaking layouts" * tag 'xfs-5.3-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: fs/xfs: Fix return code of xfs_break_leased_layouts() xfs: fix reflink source file racing with directio writes vfs: fix page locking deadlocks when deduping files xfs: compat_ioctl: use compat_ptr() xfs: fall back to native ioctls for unhandled compat ones
2019-08-23Merge tag 'ceph-for-5.3-rc6' of git://github.com/ceph/ceph-clientLinus Torvalds
Pull ceph fixes from Ilya Dryomov: "Three important fixes tagged for stable (an indefinite hang, a crash on an assert and a NULL pointer dereference) plus a small series from Luis fixing instances of vfree() under spinlock" * tag 'ceph-for-5.3-rc6' of git://github.com/ceph/ceph-client: libceph: fix PG split vs OSD (re)connect race ceph: don't try fill file_lock on unsuccessful GETFILELOCK reply ceph: clear page dirty before invalidate page ceph: fix buffer free while holding i_ceph_lock in fill_inode() ceph: fix buffer free while holding i_ceph_lock in __ceph_build_xattrs_blob() ceph: fix buffer free while holding i_ceph_lock in __ceph_setxattr() libceph: allow ceph_buffer_put() to receive a NULL ceph_buffer