summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2020-03-23btrfs: hold a ref on the root in create_subvolJosef Bacik
We're creating the new root here, but we should hold the ref until after we've initialized the inode for it. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-23btrfs: hold a ref on the root in fixup_tree_root_locationJosef Bacik
Looking up the inode from an arbitrary tree means we need to hold a ref on that root. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-23btrfs: hold a ref on the root in __btrfs_run_defrag_inodeJosef Bacik
We are looking up an arbitrary inode, we need to hold a ref on the root while we're doing this. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-23btrfs: hold a root ref in btrfs_get_dentryJosef Bacik
Looking up the inode we need to search the root, make sure we hold a reference on that root while we're doing the lookup. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-23btrfs: hold a ref on the root in resolve_indirect_refJosef Bacik
We're looking up a random root, we need to hold a ref on it while we're using it. Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-23btrfs: hold a ref on fs roots while they're in the radix treeJosef Bacik
If the root is sitting in the radix tree, we should probably have a ref for the radix tree. Grab a ref on the root when we insert it, and drop it when it gets deleted. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-23btrfs: describe the space reservation system in generalJosef Bacik
Add another comment to cover how the space reservation system works generally. This covers the actual reservation flow, as well as how flushing is handled. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-23btrfs: add a comment describing delalloc space reservationJosef Bacik
delalloc space reservation is tricky because it encompasses both data and metadata. Make it clear what each side does, the general flow of how space is moved throughout the lifetime of a write, and what goes into the calculations. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-23btrfs: add a comment describing block reservesJosef Bacik
This is a giant comment at the top of block-rsv.c describing generally how block reserves work. It is purely about the block reserves themselves, and nothing to do with how the actual reservation system works. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-23btrfs: handle NULL roots in btrfs_put/btrfs_grab_fs_rootJosef Bacik
We want to use this for dropping all roots, and in some error cases we may not have a root, so handle this to make the cleanup code easier. Make btrfs_grab_fs_root the same so we can use it in cases where the root may not exist (like the quota root). Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-23btrfs: make the fs root init functions staticJosef Bacik
Now that the orphan cleanup stuff doesn't use this directly we can just make them static. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-23btrfs: open code btrfs_read_fs_root_no_nameJosef Bacik
All this does is call btrfs_get_fs_root() with check_ref == true. Just use btrfs_get_fs_root() so we don't have a bunch of different helpers that do the same thing. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-23btrfs: remove btrfs_read_fs_root, not used anymoreJosef Bacik
All helpers should either be using btrfs_get_fs_root() or btrfs_read_tree_root(). Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-23btrfs: make relocation use btrfs_read_tree_root()Josef Bacik
Relocation has it's special roots, we don't want to save these in the root cache either, so swap it to use btrfs_read_tree_root(). However the reloc root does need REF_COWS set, so make sure we set it everywhere we use this helper, as it no longer does the REF_COWS setting. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-23btrfs: export and use btrfs_read_tree_root for tree-logJosef Bacik
Tree-log uses btrfs_read_fs_root to load its log, but this just calls btrfs_read_tree_root. We don't save the log roots in our root cache, so just export this helper and use it in the logging code. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-23btrfs: make btrfs_find_orphan_roots use btrfs_get_fs_rootJosef Bacik
btrfs_find_orphan_roots has this weird thing where it looks up the root in cache to see if it is there before just reading the root. But the read it uses just reads the root, it doesn't do any of the init work, we do that by hand here. But this is unnecessary, all we really want is to see if the root still exists and add it to the dead roots list to be cleaned up, otherwise we delete the orphan item. Fix this by just using btrfs_get_fs_root directly with check_ref set to false so we get the orphan root items. Then we just handle in cache and out of cache roots the same, add them to the dead roots list and carry on. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-23btrfs: move fs root init stuff into btrfs_init_fs_rootJosef Bacik
We have a helper for reading fs roots that just reads the fs root off the disk and then sets REF_COWS and init's the inheritable flags. Move this into btrfs_init_fs_root so we can later get rid of this helper and consolidate all of the fs root reading into one helper. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-23btrfs: push __setup_root into btrfs_alloc_rootJosef Bacik
There's no reason to not init the root at alloc time, and with later patches it actually causes problems if we error out mounting the fs before the tree_root is init'ed because we expect it to have a valid ref count. Fix this by pushing __setup_root into btrfs_alloc_root. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-23btrfs: delete the ordered isize update codeJosef Bacik
Now that we have a safe way to update the isize, remove all of this code as it's no longer needed. Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-23btrfs: replace all uses of btrfs_ordered_update_i_sizeJosef Bacik
Now that we have a safe way to update the i_size, replace all uses of btrfs_ordered_update_i_size with btrfs_inode_safe_disk_i_size_write. Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-23btrfs: use the file extent tree infrastructureJosef Bacik
We want to use this everywhere we modify the file extent items permanently. These include: 1) Inserting new file extents for writes and prealloc extents. 2) Truncating inode items. 3) btrfs_cont_expand(). 4) Insert inline extents. 5) Insert new extents from log replay. 6) Insert a new extent for clone, as it could be past i_size. 7) Hole punching For hole punching in particular it might seem it's not necessary because anybody extending would use btrfs_cont_expand, however there is a corner that still can give us trouble. Start with an empty file and fallocate KEEP_SIZE 1M-2M We now have a 0 length file, and a hole file extent from 0-1M, and a prealloc extent from 1M-2M. Now punch 1M-1.5M Because this is past i_size we have [HOLE EXTENT][ NOTHING ][PREALLOC] [0 1M][1M 1.5M][1.5M 2M] with an i_size of 0. Now if we pwrite 0-1.5M we'll increas our i_size to 1.5M, but our disk_i_size is still 0 until the ordered extent completes. However if we now immediately truncate 2M on the file we'll just call btrfs_cont_expand(inode, 1.5M, 2M), since our old i_size is 1.5M. If we commit the transaction here and crash we'll expose the gap. To fix this we need to clear the file extent mapping for the range that we punched but didn't insert a corresponding file extent for. This will mean the truncate will only get an disk_i_size set to 1M if we crash before the finish ordered io happens. I've written an xfstest to reproduce the problem and validate this fix. Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-23btrfs: introduce per-inode file extent treeJosef Bacik
In order to keep track of where we have file extents on disk, and thus where it is safe to adjust the i_size to, we need to have a tree in place to keep track of the contiguous areas we have file extents for. Add helpers to use this tree, as it's not required for NO_HOLES file systems. We will use this by setting DIRTY for areas we know we have file extent item's set, and clearing it when we remove file extent items for truncation. Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-23btrfs: use btrfs_ordered_update_i_size in clone_finish_inode_updateJosef Bacik
We were using btrfs_i_size_write(), which unconditionally jacks up inode->disk_i_size. However since clone can operate on ranges we could have pending ordered extents for a range prior to the start of our clone operation and thus increase disk_i_size too far and have a hole with no file extent. Fix this by using the btrfs_ordered_update_i_size helper which will do the right thing in the face of pending ordered extents outside of our clone range. Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-23btrfs: update the comment of btrfs_control_ioctl()Su Yue
Btrfsctl was removed in 2012, now the function btrfs_control_ioctl() is only used for devices ioctls. So update the comment. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Su Yue <Damenly_Su@gmx.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-23btrfs: relocation: Add introduction of how relocation worksQu Wenruo
Relocation is one of the most complex part of btrfs, while it's also the foundation stone for online resizing, profile converting. For such a complex facility, we should at least have some introduction to it. This patch will add an basic introduction at pretty a high level, explaining: - What relocation does - How relocation is done Only mentioning how data reloc tree and reloc tree are involved in the operation. No details like the backref cache, or the data reloc tree contents. - Which function to refer. More detailed comments will be added for reloc tree creation, data reloc tree creation and backref cache. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-23Btrfs: don't iterate mod seq list when putting a tree mod seqFilipe Manana
Each new element added to the mod seq list is always appended to the list, and each one gets a sequence number coming from a counter which gets incremented everytime a new element is added to the list (or a new node is added to the tree mod log rbtree). Therefore the element with the lowest sequence number is always the first element in the list. So just remove the list iteration at btrfs_put_tree_mod_seq() that computes the minimum sequence number in the list and replace it with a check for the first element's sequence number. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-23btrfs: Add overview of device replaceQu Wenruo
The overview of btrfs dev-replace. It mentions some corner cases caused by the write duplication and scrub based data copy. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> [ adjust wording ] Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-23xfs: remove xlog_state_want_syncChristoph Hellwig
Open code the xlog_state_want_sync logic in its two callers given that this function is a trivial wrapper around xlog_state_switch_iclogs. Move the lockdep assert into xlog_state_switch_iclogs to not lose this debugging aid, and improve the comment that documents xlog_state_switch_iclogs as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-03-23xfs: move the ioerror check out of xlog_state_clean_iclogChristoph Hellwig
Use the shutdown flag in the log to bypass xlog_state_clean_iclog entirely in case of a shut down log. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-03-23xfs: refactor xlog_state_clean_iclogChristoph Hellwig
Factor out a few self-contained helpers from xlog_state_clean_iclog, and update the documentation so it primarily documents why things happens instead of how. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-03-23xfs: remove the aborted parameter to xlog_state_done_syncingChristoph Hellwig
We can just check for a shut down log all the way down in xlog_cil_committed instead of passing the parameter. This means a slight behavior change in that we now also abort log items if the shutdown came in halfway into the I/O completion processing, which actually is the right thing to do. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-03-23xfs: simplify log shutdown checking in xfs_log_release_iclogChristoph Hellwig
There is no need to check for the ioerror state before the lock, as the shutdown case is not a fast path. Also remove the call to force shutdown the file system, as it must have been shut down already for an iclog to be in the ioerror state. Also clean up the flow of the function a bit. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-03-23xfs: simplify the xfs_log_release_iclog calling conventionChristoph Hellwig
The only caller of xfs_log_release_iclog doesn't care about the return value, so remove it. Also don't bother passing the mount pointer, given that we can trivially derive it from the iclog. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-03-23xfs: factor out a xlog_wait_on_iclog helperChristoph Hellwig
Factor out the shared code to wait for a log force into a new helper. This helper uses the XLOG_FORCED_SHUTDOWN check previous only used by the unmount code over the equivalent iclog ioerror state used by the other two functions. There is a slight behavior change in that the force of the unmount record is now accounted in the log force statistics. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-03-23xfs: merge xlog_cil_push into xlog_cil_push_workChristoph Hellwig
xlog_cil_push is only called by xlog_cil_push_work, so merge the two functions. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-03-23hibernate: Allow uswsusp to write to swapDomenico Andreoli
It turns out that there is one use case for programs being able to write to swap devices, and that is the userspace hibernation code. Quick fix: disable the S_SWAPFILE check if hibernation is configured. Fixes: dc617f29dbe5 ("vfs: don't allow writes to swap files") Reported-by: Domenico Andreoli <domenico.andreoli@linux.com> Reported-by: Marian Klein <mkleinsoft@gmail.com> Signed-off-by: Domenico Andreoli <domenico.andreoli@linux.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-03-23io-uring: drop 'free_pfile' in struct io_file_putHillf Danton
Sync removal of file is only used in case of a GFP_KERNEL kmalloc failure at the cost of io_file_put::done and work flush, while a glich like it can be handled at the call site without too much pain. That said, what is proposed is to drop sync removing of file, and the kink in neck as well. Signed-off-by: Hillf Danton <hdanton@sina.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-03-23io-uring: drop completion when removing fileHillf Danton
A case of task hung was reported by syzbot, INFO: task syz-executor975:9880 blocked for more than 143 seconds. Not tainted 5.6.0-rc6-syzkaller #0 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. syz-executor975 D27576 9880 9878 0x80004000 Call Trace: schedule+0xd0/0x2a0 kernel/sched/core.c:4154 schedule_timeout+0x6db/0xba0 kernel/time/timer.c:1871 do_wait_for_common kernel/sched/completion.c:83 [inline] __wait_for_common kernel/sched/completion.c:104 [inline] wait_for_common kernel/sched/completion.c:115 [inline] wait_for_completion+0x26a/0x3c0 kernel/sched/completion.c:136 io_queue_file_removal+0x1af/0x1e0 fs/io_uring.c:5826 __io_sqe_files_update.isra.0+0x3a1/0xb00 fs/io_uring.c:5867 io_sqe_files_update fs/io_uring.c:5918 [inline] __io_uring_register+0x377/0x2c00 fs/io_uring.c:7131 __do_sys_io_uring_register fs/io_uring.c:7202 [inline] __se_sys_io_uring_register fs/io_uring.c:7184 [inline] __x64_sys_io_uring_register+0x192/0x560 fs/io_uring.c:7184 do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe and bisect pointed to 05f3fb3c5397 ("io_uring: avoid ring quiesce for fixed file set unregister and update"). It is down to the order that we wait for work done before flushing it while nobody is likely going to wake us up. We can drop that completion on stack as flushing work itself is a sync operation we need and no more is left behind it. To that end, io_file_put::done is re-used for indicating if it can be freed in the workqueue worker context. Reported-and-Inspired-by: syzbot <syzbot+538d1957ce178382a394@syzkaller.appspotmail.com> Signed-off-by: Hillf Danton <hdanton@sina.com> Rename ->done to ->free_pfile Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-03-23ceph: fix memory leak in ceph_cleanup_snapid_map()Luis Henriques
kmemleak reports the following memory leak: unreferenced object 0xffff88821feac8a0 (size 96): comm "kworker/1:0", pid 17, jiffies 4294896362 (age 20.512s) hex dump (first 32 bytes): a0 c8 ea 1f 82 88 ff ff 00 c9 ea 1f 82 88 ff ff ................ 00 00 00 00 00 00 00 00 00 01 00 00 00 00 ad de ................ backtrace: [<00000000b3ea77fb>] ceph_get_snapid_map+0x75/0x2a0 [<00000000d4060942>] fill_inode+0xb26/0x1010 [<0000000049da6206>] ceph_readdir_prepopulate+0x389/0xc40 [<00000000e2fe2549>] dispatch+0x11ab/0x1521 [<000000007700b894>] ceph_con_workfn+0xf3d/0x3240 [<0000000039138a41>] process_one_work+0x24d/0x590 [<00000000eb751f34>] worker_thread+0x4a/0x3d0 [<000000007e8f0d42>] kthread+0xfb/0x130 [<00000000d49bd1fa>] ret_from_fork+0x3a/0x50 A kfree is missing while looping the 'to_free' list of ceph_snapid_map objects. Cc: stable@vger.kernel.org Fixes: 75c9627efb72 ("ceph: map snapid to anonymous bdev ID") Signed-off-by: Luis Henriques <lhenriques@suse.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-03-23ceph: check POOL_FLAG_FULL/NEARFULL in addition to OSDMAP_FULL/NEARFULLIlya Dryomov
CEPH_OSDMAP_FULL/NEARFULL aren't set since mimic, so we need to consult per-pool flags as well. Unfortunately the backwards compatibility here is lacking: - the change that deprecated OSDMAP_FULL/NEARFULL went into mimic, but was guarded by require_osd_release >= RELEASE_LUMINOUS - it was subsequently backported to luminous in v12.2.2, but that makes no difference to clients that only check OSDMAP_FULL/NEARFULL because require_osd_release is not client-facing -- it is for OSDs Since all kernels are affected, the best we can do here is just start checking both map flags and pool flags and send that to stable. These checks are best effort, so take osdc->lock and look up pool flags just once. Remove the FIXME, since filesystem quotas are checked above and RADOS quotas are reflected in POOL_FLAG_FULL: when the pool reaches its quota, both POOL_FLAG_FULL and POOL_FLAG_FULL_QUOTA are set. Cc: stable@vger.kernel.org Reported-by: Yanhu Cao <gmayyyha@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Acked-by: Sage Weil <sage@redhat.com>
2020-03-23ext2: fix empty body warnings when -Wextra is usedRandy Dunlap
When EXT2_ATTR_DEBUG is not defined, modify the 2 debug macros to use the no_printk() macro instead of <nothing>. This fixes gcc warnings when -Wextra is used: ../fs/ext2/xattr.c:252:42: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body] ../fs/ext2/xattr.c:258:42: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body] ../fs/ext2/xattr.c:330:42: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body] ../fs/ext2/xattr.c:872:45: warning: suggest braces around empty body in an ‘else’ statement [-Wempty-body] I have verified that the only object code change (with gcc 7.5.0) is the reversal of some instructions from 'cmp a,b' to 'cmp b,a'. Link: https://lore.kernel.org/r/e18a7395-61fb-2093-18e8-ed4f8cf56248@infradead.org Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Jan Kara <jack@suse.com> Cc: linux-ext4@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz>
2020-03-22f2fs: fix to account compressed blocks in f2fs_compressed_blocks()Chao Yu
por_fsstress reports inconsistent status in orphan inode, the root cause of this is in f2fs_write_raw_pages() we decrease i_compr_blocks incorrectly due to wrong calculation in f2fs_compressed_blocks(). So this patch exposes below two functions based on __f2fs_cluster_blocks: - f2fs_compressed_blocks: get count of compressed blocks in compressed cluster - f2fs_cluster_blocks: get count of valid blocks (including reserved blocks) in compressed cluster. Then use f2fs_compress_blocks() to get correct compressed blocks count in f2fs_write_raw_pages(). sanity_check_inode: inode (ino=ad80) hash inconsistent i_compr_blocks:2, i_blocks:1, run fsck to fix Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-03-22f2fs: xattr.h: Replace zero-length array with flexible-array memberGustavo A. R. Silva
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-03-22f2fs: fix to update f2fs_super_block fields under sb_lockChao Yu
Fields in struct f2fs_super_block should be updated under coverage of sb_lock, fix to adjust update_sb_metadata() for that rule. Fixes: 04f0b2eaa3b3 ("f2fs: ioctl for removing a range from F2FS") Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-03-22f2fs: Add a new CP flag to help fsck fix resize SPO issuesSahitya Tummala
Add and set a new CP flag CP_RESIZEFS_FLAG during online resize FS to help fsck fix the metadata mismatch that may happen due to SPO during resize, where SB got updated but CP data couldn't be written yet. fsck errors - Info: CKPT version = 6ed7bccb Wrong user_block_count(2233856) [f2fs_do_mount:3365] Checkpoint is polluted Signed-off-by: Sahitya Tummala <stummala@codeaurora.org> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-03-22f2fs: Fix mount failure due to SPO after a successful online resize FSSahitya Tummala
Even though online resize is successfully done, a SPO immediately after resize, still causes below error in the next mount. [ 11.294650] F2FS-fs (sda8): Wrong user_block_count: 2233856 [ 11.300272] F2FS-fs (sda8): Failed to get valid F2FS checkpoint This is because after FS metadata is updated in update_fs_metadata() if the SBI_IS_DIRTY is not dirty, then CP will not be done to reflect the new user_block_count. Signed-off-by: Sahitya Tummala <stummala@codeaurora.org> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-03-22f2fs: use kmem_cache pool during inline xattr lookupsChao Yu
It's been observed that kzalloc() on lookup_all_xattrs() are called millions of times on Android, quickly becoming the top abuser of slub memory allocator. Use a dedicated kmem cache pool for xattr lookups to mitigate this. Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-03-22CIFS: Fix bug which the return value by asynchronous read is errorYilu Lin
This patch is used to fix the bug in collect_uncached_read_data() that rc is automatically converted from a signed number to an unsigned number when the CIFS asynchronous read fails. It will cause ctx->rc is error. Example: Share a directory and create a file on the Windows OS. Mount the directory to the Linux OS using CIFS. On the CIFS client of the Linux OS, invoke the pread interface to deliver the read request. The size of the read length plus offset of the read request is greater than the maximum file size. In this case, the CIFS server on the Windows OS returns a failure message (for example, the return value of smb2.nt_status is STATUS_INVALID_PARAMETER). After receiving the response message, the CIFS client parses smb2.nt_status to STATUS_INVALID_PARAMETER and converts it to the Linux error code (rdata->result=-22). Then the CIFS client invokes the collect_uncached_read_data function to assign the value of rdata->result to rc, that is, rc=rdata->result=-22. The type of the ctx->total_len variable is unsigned integer, the type of the rc variable is integer, and the type of the ctx->rc variable is ssize_t. Therefore, during the ternary operation, the value of rc is automatically converted to an unsigned number. The final result is ctx->rc=4294967274. However, the expected result is ctx->rc=-22. Signed-off-by: Yilu Lin <linyilu@huawei.com> Signed-off-by: Steve French <stfrench@microsoft.com> CC: Stable <stable@vger.kernel.org> Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
2020-03-22CIFS: check new file size when extending file by fallocateMurphy Zhou
xfstests generic/228 checks if fallocate respect RLIMIT_FSIZE. After fallocate mode 0 extending enabled, we can hit this failure. Fix this by check the new file size with vfs helper, return error if file size is larger then RLIMIT_FSIZE(ulimit -f). This patch has been tested by LTP/xfstests aginst samba and Windows server. Acked-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Murphy Zhou <jencce.kernel@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com> CC: Stable <stable@vger.kernel.org>
2020-03-22SMB3: Minor cleanup of protocol definitionsSteve French
And add one missing define (COMPRESSION_TRANSFORM_ID) and flag (TRANSFORM_FLAG_ENCRYPTED) Signed-off-by: Steve French <stfrench@microsoft.com>