git.armlinux.org.uk/linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2020-01-09	xfs: reject invalid flags combinations in XFS_IOC_ATTRMULTI_BY_HANDLE	Christoph Hellwig
	While the flags field in the ABI and the on-disk format allows for multiple namespace flags, that is a logically invalid combination that scrub complains about. Reject it at the ioctl level, as all other interface already get this right at higher levels. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-01-09	xfs: clear kernel only flags in XFS_IOC_ATTRMULTI_BY_HANDLE	Christoph Hellwig
	Don't allow passing arbitrary flags as they change behavior including memory allocation that the call stack is not prepared for. Fixes: ddbca70cc45c ("xfs: allocate xattr buffer on demand") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-01-09	udf: Fix free space reporting for metadata and virtual partitions	Jan Kara
	Free space on filesystems with metadata or virtual partition maps currently gets misreported. This is because these partitions are just remapped onto underlying real partitions from which keep track of free blocks. Take this remapping into account when counting free blocks as well. Reviewed-by: Pali Rohár <pali.rohar@gmail.com> Reported-by: Pali Rohár <pali.rohar@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2020-01-09	fs: move guard_bio_eod() after bio_set_op_attrs	Ming Lei
	Commit 85a8ce62c2ea ("block: add bio_truncate to fix guard_bio_eod") adds bio_truncate() for handling bio EOD. However, bio_truncate() doesn't use the passed 'op' parameter from guard_bio_eod's callers. So bio_trunacate() may retrieve wrong 'op', and zering pages may not be done for READ bio. Fixes this issue by moving guard_bio_eod() after bio_set_op_attrs() in submit_bh_wbc() so that bio_truncate() can always retrieve correct op info. Meantime remove the 'op' parameter from guard_bio_eod() because it isn't used any more. Cc: Carlos Maiolino <cmaiolino@redhat.com> Cc: linux-fsdevel@vger.kernel.org Fixes: 85a8ce62c2ea ("block: add bio_truncate to fix guard_bio_eod") Signed-off-by: Ming Lei <ming.lei@redhat.com> Fold in kerneldoc and bio_op() change. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-08	pstore/ram: Regularize prz label allocation lifetime	Kees Cook
	In my attempt to fix a memory leak, I introduced a double-free in the pstore error path. Instead of trying to manage the allocation lifetime between persistent_ram_new() and its callers, adjust the logic so persistent_ram_new() always takes a kstrdup() copy, and leaves the caller's allocation lifetime up to the caller. Therefore callers are _always_ responsible for freeing their label. Before, it only needed freeing when the prz itself failed to allocate, and not in any of the other prz failure cases, which callers would have no visibility into, which is the root design problem that lead to both the leak and now double-free bugs. Reported-by: Cengiz Can <cengiz@kernel.wtf> Link: https://lore.kernel.org/lkml/d4ec59002ede4aaf9928c7f7526da87c@kernel.wtf Fixes: 8df955a32a73 ("pstore/ram: Fix error-path memory leak in persistent_ram_new() callers") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
2020-01-08	btrfs: fix memory leak in qgroup accounting	Johannes Thumshirn
	When running xfstests on the current btrfs I get the following splat from kmemleak: unreferenced object 0xffff88821b2404e0 (size 32): comm "kworker/u4:7", pid 26663, jiffies 4295283698 (age 8.776s) hex dump (first 32 bytes): 01 00 00 00 00 00 00 00 10 ff fd 26 82 88 ff ff ...........&.... 10 ff fd 26 82 88 ff ff 20 ff fd 26 82 88 ff ff ...&.... ..&.... backtrace: [<00000000f94fd43f>] ulist_alloc+0x25/0x60 [btrfs] [<00000000fd023d99>] btrfs_find_all_roots_safe+0x41/0x100 [btrfs] [<000000008f17bd32>] btrfs_find_all_roots+0x52/0x70 [btrfs] [<00000000b7660afb>] btrfs_qgroup_rescan_worker+0x343/0x680 [btrfs] [<0000000058e66778>] btrfs_work_helper+0xac/0x1e0 [btrfs] [<00000000f0188930>] process_one_work+0x1cf/0x350 [<00000000af5f2f8e>] worker_thread+0x28/0x3c0 [<00000000b55a1add>] kthread+0x109/0x120 [<00000000f88cbd17>] ret_from_fork+0x35/0x40 This corresponds to: (gdb) l (btrfs_find_all_roots_safe+0x41) 0x8d7e1 is in btrfs_find_all_roots_safe (fs/btrfs/backref.c:1413). 1408 1409 tmp = ulist_alloc(GFP_NOFS); 1410 if (!tmp) 1411 return -ENOMEM; 1412 roots = ulist_alloc(GFP_NOFS); 1413 if (!*roots) { 1414 ulist_free(tmp); 1415 return -ENOMEM; 1416 } 1417 Following the lifetime of the allocated 'roots' ulist, it gets freed again in btrfs_qgroup_account_extent(). But this does not happen if the function is called with the 'BTRFS_FS_QUOTA_ENABLED' flag cleared, then btrfs_qgroup_account_extent() does a short leave and directly returns. Instead of directly returning we should jump to the 'out_free' in order to free all resources as expected. CC: stable@vger.kernel.org # 4.14+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> [ add comment ] Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-08	gfs2: minor cleanup: remove unneeded variable ret in gfs2_jdata_writepage	Bob Peterson
	This patch simply removes variable ret, which is used to store the return code of its call to __gfs2_jdata_writepage, in favor of just returning the result directly. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2020-01-08	btrfs: do not delete mismatched root refs	Josef Bacik
	btrfs_del_root_ref() will simply WARN_ON() if the ref doesn't match in any way, and then continue to delete the reference. This shouldn't happen, we have these values because there's more to the reference than the original root and the sub root. If any of these checks fail, return -ENOENT. CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-08	btrfs: fix invalid removal of root ref	Josef Bacik
	If we have the following sequence of events btrfs sub create A btrfs sub create A/B btrfs sub snap A C mkdir C/foo mv A/B C/foo rm -rf * We will end up with a transaction abort. The reason for this is because we create a root ref for B pointing to A. When we create a snapshot of C we still have B in our tree, but because the root ref points to A and not C we will make it appear to be empty. The problem happens when we move B into C. This removes the root ref for B pointing to A and adds a ref of B pointing to C. When we rmdir C we'll see that we have a ref to our root and remove the root ref, despite not actually matching our reference name. Now btrfs_del_root_ref() allowing this to work is a bug as well, however we know that this inode does not actually point to a root ref in the first place, so we shouldn't be calling btrfs_del_root_ref() in the first place and instead simply look up our dir index for this item and do the rest of the removal. CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-08	btrfs: rework arguments of btrfs_unlink_subvol	Josef Bacik
	btrfs_unlink_subvol takes the name of the dentry and the root objectid based on what kind of inode this is, either a real subvolume link or a empty one that we inherited as a snapshot. We need to fix how we unlink in the case for BTRFS_EMPTY_SUBVOL_DIR_OBJECTID in the future, so rework btrfs_unlink_subvol to just take the dentry and handle getting the right objectid given the type of inode this is. There is no functional change here, simply pushing the work into btrfs_unlink_subvol() proper. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-08	udf: Update header files to UDF 2.60	Pali Rohár
	This change synchronizes header files ecma_167.h and osta_udf.h with udftools 2.2 project which already has definitions for UDF 2.60 revision. Link: https://lore.kernel.org/r/20200107212904.30471-3-pali.rohar@gmail.com Signed-off-by: Pali Rohár <pali.rohar@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2020-01-08	udf: Move OSTA Identifier Suffix macros from ecma_167.h to osta_udf.h	Pali Rohár
	Rename structure name and its members to match naming convention and fix endianity type for UDFRevision member. Also remove duplicate definition of UDF_ID_COMPLIANT which is already in osta_udf.h. Link: https://lore.kernel.org/r/20200107212904.30471-2-pali.rohar@gmail.com Signed-off-by: Pali Rohár <pali.rohar@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2020-01-08	udf: Fix spelling in EXT_NEXT_EXTENT_ALLOCDESCS	Pali Rohár
	Change EXT_NEXT_EXTENT_ALLOCDECS to proper spelling EXT_NEXT_EXTENT_ALLOCDESCS. Link: https://lore.kernel.org/r/20200107212904.30471-1-pali.rohar@gmail.com Signed-off-by: Pali Rohár <pali.rohar@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2020-01-07	io_uring: remove punt of short reads to async context	Jens Axboe
	We currently punt any short read on a regular file to async context, but this fails if the short read is due to running into EOF. This is especially problematic since we only do the single prep for commands now, as we don't reset kiocb->ki_pos. This can result in a 4k read on a 1k file returning zero, as we detect the short read and then retry from async context. At the time of retry, the position is now 1k, and we end up reading nothing, and hence return 0. Instead of trying to patch around the fact that short reads can be legitimate and won't succeed in case of retry, remove the logic to punt a short read to async context. Simply return it. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-07	xfs: remove shadow variable in xfs_btree_lshift	Eric Sandeen
	Sparse warns about a shadow variable in this function after the Fixed: commit added another int i; with larger scope. It's safe to remove the one with the smaller scope to fix this shadow, although the shadow itself is harmless. Fixes: 2c813ad66a72 ("xfs: support btrees with overlapping intervals for keys") Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-01-07	gfs2: eliminate ssize parameter from gfs2_struct2blk	Bob Peterson
	Every caller of function gfs2_struct2blk specified sizeof(u64). This patch eliminates the unnecessary parameter and replaces the size calculation with a new superblock variable that is computed to be the maximum number of block pointers we can fit inside a log descriptor, as is done for pointers per dinode and indirect block. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Andrew Price <anprice@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2020-01-07	gfs2: Another gfs2_find_jhead fix	Andreas Gruenbacher
	On filesystems with a block size smaller than the page size, gfs2_find_jhead can split a page across two bios (for example, when blocks are not allocated consecutively). When that happens, the first bio that completes will unlock the page in its bi_end_io handler even though the page hasn't been read completely yet. Fix that by using a chained bio for the rest of the page. While at it, clean up the sector calculation logic in gfs2_log_alloc_bio. In gfs2_find_jhead, simplify the disk block and offset calculation logic and fix a variable name. Fixes: f4686c26ecc3 ("gfs2: read journal in large chunks") Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2020-01-07	erofs: remove void tagging/untagging of workgroup pointers	Vladimir Zapolskiy
	Because workgroup pointers inserted to a radix tree are always tagged with a single value of 0, it is possible to remove tagging and untagging of the pointers completely. Signed-off-by: Vladimir Zapolskiy <vladimir@tuxera.com> Link: https://lore.kernel.org/r/20200102120118.14979-4-vladimir@tuxera.com Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
2020-01-07	erofs: remove unused tag argument while registering a workgroup	Vladimir Zapolskiy
	All workgroups are registered with tag value set to 0, to simplify erofs_register_workgroup() interface the tag argument can be removed, if its only value is sent down to the function body. Signed-off-by: Vladimir Zapolskiy <vladimir@tuxera.com> Link: https://lore.kernel.org/r/20200102120118.14979-3-vladimir@tuxera.com Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
2020-01-07	erofs: remove unused tag argument while finding a workgroup	Vladimir Zapolskiy
	It is feasible to simplify erofs_find_workgroup() interface by removing an unused function argument. While formally the argument is used in the function itself, its assigned value is ignored on the caller side. Signed-off-by: Vladimir Zapolskiy <vladimir@tuxera.com> Link: https://lore.kernel.org/r/20200102120118.14979-2-vladimir@tuxera.com Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
2020-01-07	erofs: correct indentation of an assigned structure inside a function	Vladimir Zapolskiy
	Trivial change, the expected indentation ruled by the coding style hasn't been met. Signed-off-by: Vladimir Zapolskiy <vladimir@tuxera.com> Link: https://lore.kernel.org/r/20200102120232.15074-1-vladimir@tuxera.com Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
2020-01-06	debugfs: Fix warnings when building documentation	Daniel W. S. Almeida
	Fix the following warnings: fs/debugfs/inode.c:423: WARNING: Inline literal start-string without end-string. fs/debugfs/inode.c:502: WARNING: Inline literal start-string without end-string. fs/debugfs/inode.c:534: WARNING: Inline literal start-string without end-string. fs/debugfs/inode.c:627: WARNING: Inline literal start-string without end-string. fs/debugfs/file.c:496: WARNING: Inline literal start-string without end-string. fs/debugfs/file.c:502: WARNING: Inline literal start-string without end-string. fs/debugfs/file.c:581: WARNING: Inline literal start-string without end-string. fs/debugfs/file.c:587: WARNING: Inline literal start-string without end-string. fs/debugfs/file.c:846: WARNING: Inline literal start-string without end-string. fs/debugfs/file.c:852: WARNING: Inline literal start-string without end-string. fs/debugfs/file.c:899: WARNING: Inline literal start-string without end-string. fs/debugfs/file.c:905: WARNING: Inline literal start-string without end-string. fs/debugfs/file.c:1091: WARNING: Inline literal start-string without end-string. fs/debugfs/file.c:1097: WARNING: Inline literal start-string without end-string By replacing %ERR_PTR with ERR_PTR. Signed-off-by: Daniel W. S. Almeida <dwlsalmeida@gmail.com> Link: https://lore.kernel.org/r/20191227010035.854913-1-dwlsalmeida@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-06	chardev: Avoid potential use-after-free in 'chrdev_open()'	Will Deacon
	'chrdev_open()' calls 'cdev_get()' to obtain a reference to the 'struct cdev ' stashed in the 'i_cdev' field of the target inode structure. If the pointer is NULL, then it is initialised lazily by looking up the kobject in the 'cdev_map' and so the whole procedure is protected by the 'cdev_lock' spinlock to serialise initialisation of the shared pointer. Unfortunately, it is possible for the initialising thread to fail after* installing the new pointer, for example if the subsequent '->open()' call on the file fails. In this case, 'cdev_put()' is called, the reference count on the kobject is dropped and, if nobody else has taken a reference, the release function is called which finally clears 'inode->i_cdev' from 'cdev_purge()' before potentially freeing the object. The problem here is that a racing thread can happily take the 'cdev_lock' and see the non-NULL pointer in the inode, which can result in a refcount increment from zero and a warning: \| ------------[ cut here ]------------ \| refcount_t: addition on 0; use-after-free. \| WARNING: CPU: 2 PID: 6385 at lib/refcount.c:25 refcount_warn_saturate+0x6d/0xf0 \| Modules linked in: \| CPU: 2 PID: 6385 Comm: repro Not tainted 5.5.0-rc2+ #22 \| Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 \| RIP: 0010:refcount_warn_saturate+0x6d/0xf0 \| Code: 05 55 9a 15 01 01 e8 9d aa c8 ff 0f 0b c3 80 3d 45 9a 15 01 00 75 ce 48 c7 c7 00 9c 62 b3 c6 08 \| RSP: 0018:ffffb524c1b9bc70 EFLAGS: 00010282 \| RAX: 0000000000000000 RBX: ffff9e9da1f71390 RCX: 0000000000000000 \| RDX: ffff9e9dbbd27618 RSI: ffff9e9dbbd18798 RDI: ffff9e9dbbd18798 \| RBP: 0000000000000000 R08: 000000000000095f R09: 0000000000000039 \| R10: 0000000000000000 R11: ffffb524c1b9bb20 R12: ffff9e9da1e8c700 \| R13: ffffffffb25ee8b0 R14: 0000000000000000 R15: ffff9e9da1e8c700 \| FS: 00007f3b87d26700(0000) GS:ffff9e9dbbd00000(0000) knlGS:0000000000000000 \| CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 \| CR2: 00007fc16909c000 CR3: 000000012df9c000 CR4: 00000000000006e0 \| DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 \| DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 \| Call Trace: \| kobject_get+0x5c/0x60 \| cdev_get+0x2b/0x60 \| chrdev_open+0x55/0x220 \| ? cdev_put.part.3+0x20/0x20 \| do_dentry_open+0x13a/0x390 \| path_openat+0x2c8/0x1470 \| do_filp_open+0x93/0x100 \| ? selinux_file_ioctl+0x17f/0x220 \| do_sys_open+0x186/0x220 \| do_syscall_64+0x48/0x150 \| entry_SYSCALL_64_after_hwframe+0x44/0xa9 \| RIP: 0033:0x7f3b87efcd0e \| Code: 89 54 24 08 e8 a3 f4 ff ff 8b 74 24 0c 48 8b 3c 24 41 89 c0 44 8b 54 24 08 b8 01 01 00 00 89 f4 \| RSP: 002b:00007f3b87d259f0 EFLAGS: 00000293 ORIG_RAX: 0000000000000101 \| RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f3b87efcd0e \| RDX: 0000000000000000 RSI: 00007f3b87d25a80 RDI: 00000000ffffff9c \| RBP: 00007f3b87d25e90 R08: 0000000000000000 R09: 0000000000000000 \| R10: 0000000000000000 R11: 0000000000000293 R12: 00007ffe188f504e \| R13: 00007ffe188f504f R14: 00007f3b87d26700 R15: 0000000000000000 \| ---[ end trace 24f53ca58db8180a ]--- Since 'cdev_get()' can already fail to obtain a reference, simply move it over to use 'kobject_get_unless_zero()' instead of 'kobject_get()', which will cause the racing thread to return -ENXIO if the initialising thread fails unexpectedly. Cc: Hillf Danton <hdanton@sina.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Reported-by: syzbot+82defefbbd8527e1c2cb@syzkaller.appspotmail.com Signed-off-by: Will Deacon <will@kernel.org> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20191219120203.32691-1-will@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-06	fs: Fix page_mkwrite off-by-one errors	Andreas Gruenbacher
	The check in block_page_mkwrite that is meant to determine whether an offset is within the inode size is off by one. This bug has been copied into iomap_page_mkwrite and several filesystems (ubifs, ext4, f2fs, ceph). Fix that by introducing a new page_mkwrite_check_truncate helper that checks for truncate and computes the bytes in the page up to EOF. Use the helper in iomap. NOTE from Darrick: The original patch fixed a number of filesystems, but then there were merge conflicts with the f2fs for-next tree; a subsequent re-submission of the patch had different btrfs changes with no explanation; and Christoph complained that each per-fs fix should be a separate patch. In my view that's too much risk to take on, so I decided to drop all the hunks except for iomap, since I've actually QA'd XFS. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> [darrick: drop everything but the iomap parts] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-01-06	xfs: quota: move to time64_t interfaces	Arnd Bergmann
	As a preparation for removing the 32-bit time_t type and all associated interfaces, change xfs to use time64_t and ktime_get_real_seconds() for the quota housekeeping. This avoids one difference between 32-bit and 64-bit kernels, raising the theoretical limit for the quota grace period to year 2106 on 32-bit instead of year 2038. Note that common user space tools using the XFS quotactl interface instead of the generic one still use the y2038 dates. To fix quotas properly, both the on-disk format and user space still need to be changed. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-01-06	xfs: rename compat_time_t to old_time32_t	Arnd Bergmann
	The compat_time_t type has been removed everywhere else, as most users rely on old_time32_t for both native and compat mode handling of 32-bit time_t. Remove the last one in xfs. Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-01-06	ext2: Adjust indentation in ext2_fill_super	Nathan Chancellor
	Clang warns: ../fs/ext2/super.c:1076:3: warning: misleading indentation; statement is not part of the previous 'if' [-Wmisleading-indentation] sbi->s_groups_count = ((le32_to_cpu(es->s_blocks_count) - ^ ../fs/ext2/super.c:1074:2: note: previous statement is here if (EXT2_BLOCKS_PER_GROUP(sb) == 0) ^ 1 warning generated. This warning occurs because there is a space before the tab on this line. Remove it so that the indentation is consistent with the Linux kernel coding style and clang no longer warns. Fixes: 41f04d852e35 ("[PATCH] ext2: fix mounts at 16T") Link: https://github.com/ClangBuiltLinux/linux/issues/827 Link: https://lore.kernel.org/r/20191218031930.31393-1-natechancellor@gmail.com Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2020-01-04	ocfs2: fix the crash due to call ocfs2_get_dlm_debug once less	Gang He
	Because ocfs2_get_dlm_debug() function is called once less here, ocfs2 file system will trigger the system crash, usually after ocfs2 file system is unmounted. This system crash is caused by a generic memory corruption, these crash backtraces are not always the same, for exapmle, ocfs2: Unmounting device (253,16) on (node 172167785) general protection fault: 0000 [#1] SMP PTI CPU: 3 PID: 14107 Comm: fence_legacy Kdump: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) RIP: 0010:__kmalloc+0xa5/0x2a0 Code: 00 00 4d 8b 07 65 4d 8b RSP: 0018:ffffaa1fc094bbe8 EFLAGS: 00010286 RAX: 0000000000000000 RBX: d310a8800d7a3faf RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000dc0 RDI: ffff96e68fc036c0 RBP: d310a8800d7a3faf R08: ffff96e6ffdb10a0 R09: 00000000752e7079 R10: 000000000001c513 R11: 0000000004091041 R12: 0000000000000dc0 R13: 0000000000000039 R14: ffff96e68fc036c0 R15: ffff96e68fc036c0 FS: 00007f699dfba540(0000) GS:ffff96e6ffd80000(0000) knlGS:00000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055f3a9d9b768 CR3: 000000002cd1c000 CR4: 00000000000006e0 Call Trace: ext4_htree_store_dirent+0x35/0x100 [ext4] htree_dirblock_to_tree+0xea/0x290 [ext4] ext4_htree_fill_tree+0x1c1/0x2d0 [ext4] ext4_readdir+0x67c/0x9d0 [ext4] iterate_dir+0x8d/0x1a0 __x64_sys_getdents+0xab/0x130 do_syscall_64+0x60/0x1f0 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7f699d33a9fb This regression problem was introduced by commit e581595ea29c ("ocfs: no need to check return value of debugfs_create functions"). Link: http://lkml.kernel.org/r/20191225061501.13587-1-ghe@suse.com Fixes: e581595ea29c ("ocfs: no need to check return value of debugfs_create functions") Signed-off-by: Gang He <ghe@suse.com> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> [5.3+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-04	ocfs2: call journal flush to mark journal as empty after journal recovery ↵	Kai Li
	when mount If journal is dirty when mount, it will be replayed but jbd2 sb log tail cannot be updated to mark a new start because journal->j_flag has already been set with JBD2_ABORT first in journal_init_common. When a new transaction is committed, it will be recored in block 1 first(journal->j_tail is set to 1 in journal_reset). If emergency restart happens again before journal super block is updated unfortunately, the new recorded trans will not be replayed in the next mount. The following steps describe this procedure in detail. 1. mount and touch some files 2. these transactions are committed to journal area but not checkpointed 3. emergency restart 4. mount again and its journals are replayed 5. journal super block's first s_start is 1, but its s_seq is not updated 6. touch a new file and its trans is committed but not checkpointed 7. emergency restart again 8. mount and journal is dirty, but trans committed in 6 will not be replayed. This exception happens easily when this lun is used by only one node. If it is used by multi-nodes, other node will replay its journal and its journal super block will be updated after recovery like what this patch does. ocfs2_recover_node->ocfs2_replay_journal. The following jbd2 journal can be generated by touching a new file after journal is replayed, and seq 15 is the first valid commit, but first seq is 13 in journal super block. logdump: Block 0: Journal Superblock Seq: 0 Type: 4 (JBD2_SUPERBLOCK_V2) Blocksize: 4096 Total Blocks: 32768 First Block: 1 First Commit ID: 13 Start Log Blknum: 1 Error: 0 Feature Compat: 0 Feature Incompat: 2 block64 Feature RO compat: 0 Journal UUID: 4ED3822C54294467A4F8E87D2BA4BC36 FS Share Cnt: 1 Dynamic Superblk Blknum: 0 Per Txn Block Limit Journal: 0 Data: 0 Block 1: Journal Commit Block Seq: 14 Type: 2 (JBD2_COMMIT_BLOCK) Block 2: Journal Descriptor Seq: 15 Type: 1 (JBD2_DESCRIPTOR_BLOCK) No. Blocknum Flags 0. 587 none UUID: 00000000000000000000000000000000 1. 8257792 JBD2_FLAG_SAME_UUID 2. 619 JBD2_FLAG_SAME_UUID 3. 24772864 JBD2_FLAG_SAME_UUID 4. 8257802 JBD2_FLAG_SAME_UUID 5. 513 JBD2_FLAG_SAME_UUID JBD2_FLAG_LAST_TAG ... Block 7: Inode Inode: 8257802 Mode: 0640 Generation: 57157641 (0x3682809) FS Generation: 2839773110 (0xa9437fb6) CRC32: 00000000 ECC: 0000 Type: Regular Attr: 0x0 Flags: Valid Dynamic Features: (0x1) InlineData User: 0 (root) Group: 0 (root) Size: 7 Links: 1 Clusters: 0 ctime: 0x5de5d870 0x11104c61 -- Tue Dec 3 11:37:20.286280801 2019 atime: 0x5de5d870 0x113181a1 -- Tue Dec 3 11:37:20.288457121 2019 mtime: 0x5de5d870 0x11104c61 -- Tue Dec 3 11:37:20.286280801 2019 dtime: 0x0 -- Thu Jan 1 08:00:00 1970 ... Block 9: Journal Commit Block Seq: 15 Type: 2 (JBD2_COMMIT_BLOCK) The following is journal recovery log when recovering the upper jbd2 journal when mount again. syslog: ocfs2: File system on device (252,1) was not unmounted cleanly, recovering it. fs/jbd2/recovery.c:(do_one_pass, 449): Starting recovery pass 0 fs/jbd2/recovery.c:(do_one_pass, 449): Starting recovery pass 1 fs/jbd2/recovery.c:(do_one_pass, 449): Starting recovery pass 2 fs/jbd2/recovery.c:(jbd2_journal_recover, 278): JBD2: recovery, exit status 0, recovered transactions 13 to 13 Due to first commit seq 13 recorded in journal super is not consistent with the value recorded in block 1(seq is 14), journal recovery will be terminated before seq 15 even though it is an unbroken commit, inode 8257802 is a new file and it will be lost. Link: http://lkml.kernel.org/r/20191217020140.2197-1-li.kai4@h3c.com Signed-off-by: Kai Li <li.kai4@h3c.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Changwei Ge <gechangwei@live.cn> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-04	fs/posix_acl.c: fix kernel-doc warnings	Randy Dunlap
	Fix kernel-doc warnings in fs/posix_acl.c. Also fix one typo (setgit -> setgid). fs/posix_acl.c:647: warning: Function parameter or member 'inode' not described in 'posix_acl_update_mode' fs/posix_acl.c:647: warning: Function parameter or member 'mode_p' not described in 'posix_acl_update_mode' fs/posix_acl.c:647: warning: Function parameter or member 'acl' not described in 'posix_acl_update_mode' Link: http://lkml.kernel.org/r/29b0dc46-1f28-a4e5-b1d0-ba2b65629779@infradead.org Fixes: 073931017b49d ("posix_acl: Clear SGID bit when setting file permissions") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Jan Kara <jack@suse.cz> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-04	fs/namespace.c: make to_mnt_ns() static	Eric Biggers
	Make to_mnt_ns() static to address the following 'sparse' warning: fs/namespace.c:1731:22: warning: symbol 'to_mnt_ns' was not declared. Should it be static? Link: http://lkml.kernel.org/r/20191209234830.156260-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-04	fs/nsfs.c: include headers for missing declarations	Eric Biggers
	Include linux/proc_fs.h and fs/internal.h to address the following 'sparse' warnings: fs/nsfs.c:41:32: warning: symbol 'ns_dentry_operations' was not declared. Should it be static? fs/nsfs.c:145:5: warning: symbol 'open_related_ns' was not declared. Should it be static? Link: http://lkml.kernel.org/r/20191209234822.156179-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-04	fs/direct-io.c: include fs/internal.h for missing prototype	Eric Biggers
	Include fs/internal.h to address the following 'sparse' warning: fs/direct-io.c:591:5: warning: symbol 'sb_init_dio_done_wq' was not declared. Should it be static? Link: http://lkml.kernel.org/r/20191209234544.128302-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-03	Merge tag 'for-5.5-rc4-tag' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "A few fixes for btrfs: - blkcg accounting problem with compression that could stall writes - setting up blkcg bio for compression crashes due to NULL bdev pointer - fix possible infinite loop in writeback for nocow files (here possible means almost impossible, 13 things that need to happen to trigger it)" * tag 'for-5.5-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: Btrfs: fix infinite loop during nocow writeback due to race btrfs: fix compressed write bio blkcg attribution btrfs: punt all bios created in btrfs_submit_compressed_write()
2020-01-03	Merge tag 'block-5.5-20200103' of git://git.kernel.dk/linux-block	Linus Torvalds
	Pull block fixes from Jens Axboe: "Three fixes in here: - Fix for a missing split on default memory boundary mask (4G) (Ming) - Fix for multi-page read bio truncate (Ming) - Fix for null_blk zone close request handling (Damien)" * tag 'block-5.5-20200103' of git://git.kernel.dk/linux-block: null_blk: Fix REQ_OP_ZONE_CLOSE handling block: fix splitting segments on boundary masks block: add bio_truncate to fix guard_bio_eod
2020-01-03	dax: Pass dax_dev instead of bdev to dax_writeback_mapping_range()	Vivek Goyal
	As of now dax_writeback_mapping_range() takes "struct block_device" as a parameter and dax_dev is searched from bdev name. This also involves taking a fresh reference on dax_dev and putting that reference at the end of function. We are developing a new filesystem virtio-fs and using dax to access host page cache directly. But there is no block device. IOW, we want to make use of dax but want to get rid of this assumption that there is always a block device associated with dax_dev. So pass in "struct dax_device" as parameter instead of bdev. ext2/ext4/xfs are current users and they already have a reference on dax_device. So there is no need to take reference and drop reference to dax_device on each call of this function. Suggested-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Link: https://lore.kernel.org/r/20200103183307.GB13350@redhat.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-01-03	mm/hugetlbfs: fix for_each_hstate() loop in init_hugetlbfs_fs()	Jan Stancek
	LTP memfd_create04 started failing for some huge page sizes after v5.4-10135-gc3bfc5dd73c6. The problem is the check introduced to for_each_hstate() loop that should skip default_hstate_idx. Since it doesn't update 'i' counter, all subsequent huge page sizes are skipped as well. Fixes: 8fc312b32b25 ("mm/hugetlbfs: fix error handling when setting up mounts") Signed-off-by: Jan Stancek <jstancek@redhat.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-03	nfsd: use true,false for bool variable in nfssvc.c	zhengbin
	Fixes coccicheck warning: fs/nfsd/nfssvc.c:394:2-14: WARNING: Assignment of 0/1 to bool variable fs/nfsd/nfssvc.c:407:2-14: WARNING: Assignment of 0/1 to bool variable fs/nfsd/nfssvc.c:422:2-14: WARNING: Assignment of 0/1 to bool variable Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-01-03	nfsd: use true,false for bool variable in nfs4proc.c	zhengbin
	Fixes coccicheck warning: fs/nfsd/nfs4proc.c:235:1-18: WARNING: Assignment of 0/1 to bool variable fs/nfsd/nfs4proc.c:368:1-17: WARNING: Assignment of 0/1 to bool variable Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-01-03	nfsd: use true,false for bool variable in vfs.c	zhengbin
	Fixes coccicheck warning: fs/nfsd/vfs.c:1389:5-13: WARNING: Assignment of 0/1 to bool variable fs/nfsd/vfs.c:1398:5-13: WARNING: Assignment of 0/1 to bool variable fs/nfsd/vfs.c:1415:2-10: WARNING: Assignment of 0/1 to bool variable Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-01-03	compat_ioctl: simplify the implementation	Arnd Bergmann
	Now that both native and compat ioctl syscalls are in the same file, a couple of simplifications can be made, bringing the implementation closer together: - do_vfs_ioctl(), ioctl_preallocate(), and compat_ioctl_preallocate() can become static, allowing the compiler to optimize better - slightly update the coding style for consistency between the functions. - rather than listing each command in two switch statements for the compat case, just call a single function that has all the common commands. As a side-effect, FS_IOC_RESVSP/FS_IOC_RESVSP64 are now available to x86 compat tasks, along with FS_IOC_RESVSP_32/FS_IOC_RESVSP64_32. This is harmless for i386 emulation, and can be considered a bugfix for x32 emulation, which never supported these in the past. Reviewed-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-01-03	compat_ioctl: move sys_compat_ioctl() to ioctl.c	Arnd Bergmann
	The rest of the fs/compat_ioctl.c file is no longer useful now, so move the actual syscall as planned. Reviewed-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-01-03	compat_ioctl: scsi: move ioctl handling into drivers	Arnd Bergmann
	Each driver calling scsi_ioctl() gets an equivalent compat_ioctl() handler that implements the same commands by calling scsi_compat_ioctl(). The scsi_cmd_ioctl() and scsi_cmd_blk_ioctl() functions are compatible at this point, so any driver that calls those can do so for both native and compat mode, with the argument passed through compat_ptr(). With this, we can remove the entries from fs/compat_ioctl.c. The new code is larger, but should be easier to maintain and keep updated with newly added commands. Reviewed-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-01-02	Merge tag 'pstore-v5.5-rc5' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull pstore bug fixes from Kees Cook: - always reset circular buffer state when writing new dump (Aleksandr Yashkin) - fix rare error-path memory leak (Kees Cook) * tag 'pstore-v5.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: pstore/ram: Write new dumps to start of recycled zones pstore/ram: Fix error-path memory leak in persistent_ram_new() callers
2020-01-02	Revert "fs: remove ksys_dup()"	Dominik Brodowski
	This reverts commit 8243186f0cc7 ("fs: remove ksys_dup()") and the subsequent fix for it in commit 2d3145f8d280 ("early init: fix error handling when opening /dev/console"). Trying to use filp_open() and f_dupfd() instead of pseudo-syscalls caused more trouble than what is worth it: it requires accessing vfs internals and it turns out there were other bugs in it too. In particular, the file reference counting was wrong - because unlike the original "open+2dup" sequence it used "filp_open+3f_dupfd" and thus had an extra leaked file reference. That in turn then caused odd problems with Androidx86 long after boot becaue of how the extra reference to the console kept the session active even after all file descriptors had been closed. Reported-by: youling 257 <youling257@gmail.com> Cc: Arvind Sankar <nivedita@alum.mit.edu> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-02	pstore/ram: Write new dumps to start of recycled zones	Aleksandr Yashkin
	The ram_core.c routines treat przs as circular buffers. When writing a new crash dump, the old buffer needs to be cleared so that the new dump doesn't end up in the wrong place (i.e. at the end). The solution to this problem is to reset the circular buffer state before writing a new Oops dump. Signed-off-by: Aleksandr Yashkin <a.yashkin@inango-systems.com> Signed-off-by: Nikolay Merinov <n.merinov@inango-systems.com> Signed-off-by: Ariel Gilman <a.gilman@inango-systems.com> Link: https://lore.kernel.org/r/20191223133816.28155-1-n.merinov@inango-systems.com Fixes: 896fc1f0c4c6 ("pstore/ram: Switch to persistent_ram routines") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
2020-01-02	pstore/ram: Fix error-path memory leak in persistent_ram_new() callers	Kees Cook
	For callers that allocated a label for persistent_ram_new(), if the call fails, they must clean up the allocation. Suggested-by: Navid Emamdoost <navid.emamdoost@gmail.com> Fixes: 1227daa43bce ("pstore/ram: Clarify resource reservation labels") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/lkml/20191211191353.14385-1-navid.emamdoost@gmail.com Signed-off-by: Kees Cook <keescook@chromium.org>
2019-12-31	fscrypt: remove redundant bi_status check	Eric Biggers
	submit_bio_wait() already returns bi_status translated to an errno. So the additional check of bi_status is redundant and can be removed. Link: https://lore.kernel.org/r/20191209204509.228942-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2019-12-31	fscrypt: Allow modular crypto algorithms	Herbert Xu
	The commit 643fa9612bf1 ("fscrypt: remove filesystem specific build config option") removed modular support for fs/crypto. This causes the Crypto API to be built-in whenever fscrypt is enabled. This makes it very difficult for me to test modular builds of the Crypto API without disabling fscrypt which is a pain. As fscrypt is still evolving and it's developing new ties with the fs layer, it's hard to build it as a module for now. However, the actual algorithms are not required until a filesystem is mounted. Therefore we can allow them to be built as modules. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Link: https://lore.kernel.org/r/20191227024700.7vrzuux32uyfdgum@gondor.apana.org.au Signed-off-by: Eric Biggers <ebiggers@google.com>
2019-12-31	fscrypt: don't check for ENOKEY from fscrypt_get_encryption_info()	Eric Biggers
	fscrypt_get_encryption_info() returns 0 if the encryption key is unavailable; it never returns ENOKEY. So remove checks for ENOKEY. Link: https://lore.kernel.org/r/20191209212348.243331-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>