summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2021-10-26btrfs: reflink: initialize return value to 0 in btrfs_extent_same()Sidong Yang
Fix a warning reported by smatch that ret could be returned without initialized. The dedupe operations are supposed to to return 0 for a 0 length range but the caller does not pass olen == 0. To keep this behaviour and also fix the warning initialize ret to 0. Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Sidong Yang <realwakka@gmail.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26btrfs: subpage: pack all subpage bitmaps into a larger bitmapQu Wenruo
Currently we use u16 bitmap to make 4k sectorsize work for 64K page size. But this u16 bitmap is not large enough to contain larger page size like 128K, nor is space efficient for 16K page size. To handle both cases, here we pack all subpage bitmaps into a larger bitmap, now btrfs_subpage::bitmaps[] will be the ultimate bitmap for subpage usage. Each sub-bitmap will has its start bit number recorded in btrfs_subpage_info::*_start, and its bitmap length will be recorded in btrfs_subpage_info::bitmap_nr_bits. All subpage bitmap operations will be converted from using direct u16 operations to bitmap operations, with above *_start calculated. For 64K page size with 4K sectorsize, this should not cause much difference. While for 16K page size, we will only need 1 unsigned long (u32) to store all the bitmaps, which saves quite some space. Furthermore, this allows us to support larger page size like 128K and 258K. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-25btrfs: subpage: introduce btrfs_subpage_bitmap_infoQu Wenruo
Currently we use fixed size u16 bitmap for subpage bitmap. This is fine for 4K sectorsize with 64K page size. But for 4K sectorsize and larger page size, the bitmap is too small, while for smaller page size like 16K, u16 bitmaps waste too much space. Here we introduce a new helper structure, btrfs_subpage_bitmap_info, to record the proper bitmap size, and where each bitmap should start at. By this, we can later compact all subpage bitmaps into one u32 bitmap. This patch is the first step. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-25btrfs: subpage: make btrfs_alloc_subpage() return btrfs_subpage directlyQu Wenruo
The existing calling convention of btrfs_alloc_subpage() is pretty awful. Change it to a more common pattern by returning struct btrfs_subpage directly and let the caller to determine if the call succeeded. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-25btrfs: subpage: only call btrfs_alloc_subpage() when sectorsize is smaller ↵Qu Wenruo
than PAGE_SIZE There are two call sites of btrfs_alloc_subpage(): - btrfs_attach_subpage() We have ensured sectorsize is smaller than PAGE_SIZE - alloc_extent_buffer() We call btrfs_alloc_subpage() unconditionally. The alloc_extent_buffer() forces us to check the sectorsize size against page size inside btrfs_alloc_subpage(). Since the function name, btrfs_alloc_subpage(), already indicates it should only get called for subpage cases, do the check in alloc_extent_buffer() and add an ASSERT() in btrfs_alloc_subpage(). Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-25btrfs: update comment for fs_devices::seed_list in btrfs_rm_deviceSu Yue
Update it since commit 944d3f9fac61 ("btrfs: switch seed device to list api") did conversion from fs_devices::seed to fs_devices::seed_list. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Su Yue <l@damenly.su> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-25btrfs: drop unnecessary ret in ioctl_quota_rescan_statusAnand Jain
There is no need for the variable ret after d66105cfa873 ("btrfs: allocate btrfs_ioctl_quota_rescan_args on stack"), remove it. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-25btrfs: send: simplify send_create_inode_if_neededMarcos Paulo de Souza
The out label is being overused, we can simply return if the condition permits. No functional changes. Reviewed-by: Su Yue <l@damenly.su> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-25btrfs: rename btrfs_alloc_chunk to btrfs_create_chunkNikolay Borisov
The user facing function used to allocate new chunks is btrfs_chunk_alloc, unfortunately there is yet another similar sounding function - btrfs_alloc_chunk. This creates confusion, especially since the latter function can be considered "private" in the sense that it implements the first stage of chunk creation and as such is called by btrfs_chunk_alloc. To avoid the awkwardness that comes with having similarly named but distinctly different in their purpose function rename btrfs_alloc_chunk to btrfs_create_chunk, given that the main purpose of this function is to orchestrate the whole process of allocating a chunk - reserving space into devices, deciding on characteristics of the stripe size and creating the in-memory structures. Reviewed-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-24Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds
Pull autofs fix from Al Viro: "Fix for a braino of mine (in getting rid of open-coded dentry_path_raw() in autofs a couple of cycles ago). Mea culpa... Obvious -stable fodder" * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: autofs: fix wait name hash calculation in autofs_wait()
2021-10-24Merge tag '5.15-rc6-ksmbd-fixes' of git://git.samba.org/ksmbdLinus Torvalds
Pull ksmbd fixes from Steve French: "Ten fixes for the ksmbd kernel server, for improved security and additional buffer overflow checks: - a security improvement to session establishment to reduce the possibility of dictionary attacks - fix to ensure that maximum i/o size negotiated in the protocol is not less than 64K and not more than 8MB to better match expected behavior - fix for crediting (flow control) important to properly verify that sufficient credits are available for the requested operation - seven additional buffer overflow, buffer validation checks" * tag '5.15-rc6-ksmbd-fixes' of git://git.samba.org/ksmbd: ksmbd: add buffer validation in session setup ksmbd: throttle session setup failures to avoid dictionary attacks ksmbd: validate OutputBufferLength of QUERY_DIR, QUERY_INFO, IOCTL requests ksmbd: validate credit charge after validating SMB2 PDU body size ksmbd: add buffer validation for smb direct ksmbd: limit read/write/trans buffer size not to exceed 8MB ksmbd: validate compound response buffer ksmbd: fix potencial 32bit overflow from data area check in smb2_write ksmbd: improve credits management ksmbd: add validation in smb2_ioctl
2021-10-22Merge tag 'io_uring-5.15-2021-10-22' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull io_uring fixes from Jens Axboe: "Two fixes for the max workers limit API that was introduced this series: one fix for an issue with that code, and one fixing a linked timeout regression in this series" * tag 'io_uring-5.15-2021-10-22' of git://git.kernel.dk/linux-block: io_uring: apply worker limits to previous users io_uring: fix ltimeout unprep io_uring: apply max_workers limit to all future users io-wq: max_worker fixes
2021-10-22Merge tag 'fuse-fixes-5.15-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse fixes from Miklos Szeredi: "Syzbot discovered a race in case of reusing the fuse sb (introduced in this cycle). Fix it by doing the s_fs_info initialization at the proper place" * tag 'fuse-fixes-5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: clean up error exits in fuse_fill_super() fuse: always initialize sb->s_fs_info fuse: clean up fuse_mount destruction fuse: get rid of fuse_put_super() fuse: check s_root when destroying sb
2021-10-21io_uring: apply worker limits to previous usersPavel Begunkov
Another change to the API io-wq worker limitation API added in 5.15, apply the limit to all prior users that already registered a tctx. It may be confusing as it's now, in particular the change covers the following 2 cases: TASK1 | TASK2 _________________________________________________ ring = create() | | limit_iowq_workers() *not limited* | TASK1 | TASK2 _________________________________________________ ring = create() | | issue_requests() limit_iowq_workers() | | *not limited* A note on locking, it's safe to traverse ->tctx_list as we hold ->uring_lock, but do that after dropping sqd->lock to avoid possible problems. It's also safe to access tctx->io_wq there because tasks kill it only after removing themselves from tctx_list, see io_uring_cancel_generic() -> io_uring_clean_tctx() Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/d6e09ecc3545e4dc56e43c906ee3d71b7ae21bed.1634818641.git.asml.silence@gmail.com Reviewed-by: Hao Xu <haoxu@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-21fuse: clean up error exits in fuse_fill_super()Miklos Szeredi
Instead of "goto err", return error directly, since there's no error cleanup to do now. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-10-21fuse: always initialize sb->s_fs_infoMiklos Szeredi
Syzkaller reports a null pointer dereference in fuse_test_super() that is caused by sb->s_fs_info being NULL. This is due to the fact that fuse_fill_super() is initializing s_fs_info, which is too late, it's already on the fs_supers list. The initialization needs to be done in sget_fc() with the sb_lock held. Move allocation of fuse_mount and fuse_conn from fuse_fill_super() into fuse_get_tree(). After this ->kill_sb() will always be called with non-NULL ->s_fs_info, hence fuse_mount_destroy() can drop the test for non-NULL "fm". Reported-by: syzbot+74a15f02ccb51f398601@syzkaller.appspotmail.com Fixes: 5d5b74aa9c76 ("fuse: allow sharing existing sb") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-10-21fuse: clean up fuse_mount destructionMiklos Szeredi
1. call fuse_mount_destroy() for open coded variants 2. before deactivate_locked_super() don't need fuse_mount destruction since that will now be done (if ->s_fs_info is not cleared) 3. rearrange fuse_mount setup in fuse_get_tree_submount() so that the regular pattern can be used Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-10-21fuse: get rid of fuse_put_super()Miklos Szeredi
The ->put_super callback is called from generic_shutdown_super() in case of a fully initialized sb. This is called from kill_***_super(), which is called from ->kill_sb instances. Fuse uses ->put_super to destroy the fs specific fuse_mount and drop the reference to the fuse_conn, while it does the same on each error case during sb setup. This patch moves the destruction from fuse_put_super() to fuse_mount_destroy(), called at the end of all ->kill_sb instances. A follup patch will clean up the error paths. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-10-21fuse: check s_root when destroying sbMiklos Szeredi
Checking "fm" works because currently sb->s_fs_info is cleared on error paths; however, sb->s_root is what generic_shutdown_super() checks to determine whether the sb was fully initialized or not. This change will allow cleanup of sb setup error paths. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-10-20autofs: fix wait name hash calculation in autofs_wait()Ian Kent
There's a mistake in commit 2be7828c9fefc ("get rid of autofs_getpath()") that affects kernels from v5.13.0, basically missed because of me not fully testing the change for Al. The problem is that the hash calculation for the wait name qstr hasn't been updated to account for the change to use dentry_path_raw(). This prevents the correct matching an existing wait resulting in multiple notifications being sent to the daemon for the same mount which must not occur. The problem wasn't discovered earlier because it only occurs when multiple processes trigger a request for the same mount concurrently so it only shows up in more aggressive testing. Fixes: 2be7828c9fefc ("get rid of autofs_getpath()") Cc: stable@vger.kernel.org Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2021-10-20Merge tag 'ceph-for-5.15-rc7' of git://github.com/ceph/ceph-clientLinus Torvalds
Pull ceph fixes from Ilya Dryomov: "Two important filesystem fixes, marked for stable. The blocklisted superblocks issue was particularly annoying because for unexperienced users it essentially exacted a reboot to establish a new functional mount in that scenario" * tag 'ceph-for-5.15-rc7' of git://github.com/ceph/ceph-client: ceph: fix handling of "meta" errors ceph: skip existing superblocks that are blocklisted or shut down when mounting
2021-10-20io_uring: fix ltimeout unprepPavel Begunkov
io_unprep_linked_timeout() is broken, first it needs to return back REQ_F_ARM_LTIMEOUT, so the linked timeout is enqueued and disarmed. But now we refcounted it, and linked timeouts may get not executed at all, leaking a request. Just kill the unprep optimisation. Fixes: 906c6caaf586 ("io_uring: optimise io_prep_linked_timeout()") Reported-by: Beld Zhang <beldzhang@gmail.com> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/51b8e2bfc4bea8ee625cf2ba62b2a350cc9be031.1634719585.git.asml.silence@gmail.com Link: https://github.com/axboe/liburing/issues/460 Reported-by: Beld Zhang <beldzhang@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-20io_uring: apply max_workers limit to all future usersPavel Begunkov
Currently, IORING_REGISTER_IOWQ_MAX_WORKERS applies only to the task that issued it, it's unexpected for users. If one task creates a ring, limits workers and then passes it to another task the limit won't be applied to the other task. Another pitfall is that a task should either create a ring or submit at least one request for IORING_REGISTER_IOWQ_MAX_WORKERS to work at all, furher complicating the picture. Change the API, save the limits and apply to all future users. Note, it should be done first before giving away the ring or submitting new requests otherwise the result is not guaranteed. Fixes: 2e480058ddc2 ("io-wq: provide a way to limit max number of workers") Link: https://github.com/axboe/liburing/issues/460 Reported-by: Beld Zhang <beldzhang@gmail.com> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/51d0bae97180e08ab722c0d5c93e7439cfb6f697.1634683237.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-20ksmbd: add buffer validation in session setupMarios Makassikis
Make sure the security buffer's length/offset are valid with regards to the packet length. Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Marios Makassikis <mmakassikis@freebox.fr> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-10-20ksmbd: throttle session setup failures to avoid dictionary attacksNamjae Jeon
To avoid dictionary attacks (repeated session setups rapidly sent) to connect to server, ksmbd make a delay of a 5 seconds on session setup failure to make it harder to send enough random connection requests to break into a server if a user insert the wrong password 10 times in a row. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-10-20ksmbd: validate OutputBufferLength of QUERY_DIR, QUERY_INFO, IOCTL requestsHyunchul Lee
Validate OutputBufferLength of QUERY_DIR, QUERY_INFO, IOCTL requests and check the free size of response buffer for these requests. Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-10-19io-wq: max_worker fixesPavel Begunkov
First, fix nr_workers checks against max_workers, with max_worker registration, it may pretty easily happen that nr_workers > max_workers. Also, synchronise writing to acct->max_worker with wqe->lock. It's not an actual problem, but as we don't care about io_wqe_create_worker(), it's better than WRITE_ONCE()/READ_ONCE(). Fixes: 2e480058ddc2 ("io-wq: provide a way to limit max number of workers") Reported-by: Beld Zhang <beldzhang@gmail.com> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/11f90e6b49410b7d1a88f5d04fb8d95bb86b8cf3.1634671835.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19ceph: fix handling of "meta" errorsJeff Layton
Currently, we check the wb_err too early for directories, before all of the unsafe child requests have been waited on. In order to fix that we need to check the mapping->wb_err later nearer to the end of ceph_fsync. We also have an overly-complex method for tracking errors after blocklisting. The errors recorded in cleanup_session_requests go to a completely separate field in the inode, but we end up reporting them the same way we would for any other error (in fsync). There's no real benefit to tracking these errors in two different places, since the only reporting mechanism for them is in fsync, and we'd need to advance them both every time. Given that, we can just remove i_meta_err, and convert the places that used it to instead just use mapping->wb_err instead. That also fixes the original problem by ensuring that we do a check_and_advance of the wb_err at the end of the fsync op. Cc: stable@vger.kernel.org URL: https://tracker.ceph.com/issues/52864 Reported-by: Patrick Donnelly <pdonnell@redhat.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Xiubo Li <xiubli@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2021-10-19ceph: skip existing superblocks that are blocklisted or shut down when mountingJeff Layton
Currently when mounting, we may end up finding an existing superblock that corresponds to a blocklisted MDS client. This means that the new mount ends up being unusable. If we've found an existing superblock with a client that is already blocklisted, and the client is not configured to recover on its own, fail the match. Ditto if the superblock has been forcibly unmounted. While we're in here, also rename "other" to the more conventional "fsc". Cc: stable@vger.kernel.org URL: https://bugzilla.redhat.com/show_bug.cgi?id=1901499 Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2021-10-18vfs: check fd has read access in kernel_read_file_from_fd()Matthew Wilcox (Oracle)
If we open a file without read access and then pass the fd to a syscall whose implementation calls kernel_read_file_from_fd(), we get a warning from __kernel_read(): if (WARN_ON_ONCE(!(file->f_mode & FMODE_READ))) This currently affects both finit_module() and kexec_file_load(), but it could affect other syscalls in the future. Link: https://lkml.kernel.org/r/20211007220110.600005-1-willy@infradead.org Fixes: b844f0ecbc56 ("vfs: define kernel_copy_file_from_fd()") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reported-by: Hao Sun <sunhao.th@gmail.com> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Mimi Zohar <zohar@linux.ibm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18ocfs2: mount fails with buffer overflow in strlenValentin Vidic
Starting with kernel 5.11 built with CONFIG_FORTIFY_SOURCE mouting an ocfs2 filesystem with either o2cb or pcmk cluster stack fails with the trace below. Problem seems to be that strings for cluster stack and cluster name are not guaranteed to be null terminated in the disk representation, while strlcpy assumes that the source string is always null terminated. This causes a read outside of the source string triggering the buffer overflow detection. detected buffer overflow in strlen ------------[ cut here ]------------ kernel BUG at lib/string.c:1149! invalid opcode: 0000 [#1] SMP PTI CPU: 1 PID: 910 Comm: mount.ocfs2 Not tainted 5.14.0-1-amd64 #1 Debian 5.14.6-2 RIP: 0010:fortify_panic+0xf/0x11 ... Call Trace: ocfs2_initialize_super.isra.0.cold+0xc/0x18 [ocfs2] ocfs2_fill_super+0x359/0x19b0 [ocfs2] mount_bdev+0x185/0x1b0 legacy_get_tree+0x27/0x40 vfs_get_tree+0x25/0xb0 path_mount+0x454/0xa20 __x64_sys_mount+0x103/0x140 do_syscall_64+0x3b/0xc0 entry_SYSCALL_64_after_hwframe+0x44/0xae Link: https://lkml.kernel.org/r/20210929180654.32460-1-vvidic@valentin-vidic.from.hr Signed-off-by: Valentin Vidic <vvidic@valentin-vidic.from.hr> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18ocfs2: fix data corruption after conversion from inline formatJan Kara
Commit 6dbf7bb55598 ("fs: Don't invalidate page buffers in block_write_full_page()") uncovered a latent bug in ocfs2 conversion from inline inode format to a normal inode format. The code in ocfs2_convert_inline_data_to_extents() attempts to zero out the whole cluster allocated for file data by grabbing, zeroing, and dirtying all pages covering this cluster. However these pages are beyond i_size, thus writeback code generally ignores these dirty pages and no blocks were ever actually zeroed on the disk. This oversight was fixed by commit 693c241a5f6a ("ocfs2: No need to zero pages past i_size.") for standard ocfs2 write path, inline conversion path was apparently forgotten; the commit log also has a reasoning why the zeroing actually is not needed. After commit 6dbf7bb55598, things became worse as writeback code stopped invalidating buffers on pages beyond i_size and thus these pages end up with clean PageDirty bit but with buffers attached to these pages being still dirty. So when a file is converted from inline format, then writeback triggers, and then the file is grown so that these pages become valid, the invalid dirtiness state is preserved, mark_buffer_dirty() does nothing on these pages (buffers are already dirty) but page is never written back because it is clean. So data written to these pages is lost once pages are reclaimed. Simple reproducer for the problem is: xfs_io -f -c "pwrite 0 2000" -c "pwrite 2000 2000" -c "fsync" \ -c "pwrite 4000 2000" ocfs2_file After unmounting and mounting the fs again, you can observe that end of 'ocfs2_file' has lost its contents. Fix the problem by not doing the pointless zeroing during conversion from inline format similarly as in the standard write path. [akpm@linux-foundation.org: fix whitespace, per Joseph] Link: https://lkml.kernel.org/r/20210930095405.21433-1-jack@suse.cz Fixes: 6dbf7bb55598 ("fs: Don't invalidate page buffers in block_write_full_page()") Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Tested-by: Joseph Qi <joseph.qi@linux.alibaba.com> Acked-by: Gang He <ghe@suse.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: "Markov, Andrey" <Markov.Andrey@Dell.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-18userfaultfd: fix a race between writeprotect and exit_mmap()Nadav Amit
A race is possible when a process exits, its VMAs are removed by exit_mmap() and at the same time userfaultfd_writeprotect() is called. The race was detected by KASAN on a development kernel, but it appears to be possible on vanilla kernels as well. Use mmget_not_zero() to prevent the race as done in other userfaultfd operations. Link: https://lkml.kernel.org/r/20210921200247.25749-1-namit@vmware.com Fixes: 63b2d4174c4ad ("userfaultfd: wp: add the writeprotect API to userfaultfd ioctl") Signed-off-by: Nadav Amit <namit@vmware.com> Tested-by: Li Wang <liwang@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-17Merge tag 'io_uring-5.15-2021-10-17' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull io_uring fix from Jens Axboe: "Just a single fix for a wrong condition for grabbing a lock, a regression in this merge window" * tag 'io_uring-5.15-2021-10-17' of git://git.kernel.dk/linux-block: io_uring: fix wrong condition to grab uring lock
2021-10-17Merge tag 'driver-core-5.15-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg KH: "Here are some small driver core fixes for 5.15-rc6, all of which have been in linux-next for a while with no reported issues. They include: - kernfs negative dentry bugfix - simple pm bus fixes to resolve reported issues" * tag 'driver-core-5.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: drivers: bus: Delete CONFIG_SIMPLE_PM_BUS drivers: bus: simple-pm-bus: Add support for probing simple bus only devices driver core: Reject pointless SYNC_STATE_ONLY device links kernfs: don't create a negative dentry if inactive node exists
2021-10-15ksmbd: validate credit charge after validating SMB2 PDU body sizeRalph Boehme
smb2_validate_credit_charge() accesses fields in the SMB2 PDU body, but until smb2_calc_size() is called the PDU has not yet been verified to be large enough to access the PDU dynamic part length field. Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Ralph Boehme <slow@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-10-15ksmbd: add buffer validation for smb directHyunchul Lee
Add buffer validation for smb direct. Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-10-15ksmbd: limit read/write/trans buffer size not to exceed 8MBNamjae Jeon
ksmbd limit read/write/trans buffer size not to exceed maximum 8MB. And set the minimum value of max response buffer size to 64KB. Windows client doesn't send session setup request if ksmbd set max trans/read/write size lower than 64KB in smb2 negotiate. It means windows allow at least 64 KB or more about this value. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-10-15Merge tag 'ntfs3_for_5.15' of ↵Linus Torvalds
git://github.com/Paragon-Software-Group/linux-ntfs3 Pull ntfs3 fixes from Konstantin Komarov: "Use the new api for mounting as requested by Christoph. Also fixed: - some memory leaks and panic - xfstests (tested on x86_64) generic/016 generic/021 generic/022 generic/041 generic/274 generic/423 - some typos, wrong returned error codes, dead code, etc" * tag 'ntfs3_for_5.15' of git://github.com/Paragon-Software-Group/linux-ntfs3: (70 commits) fs/ntfs3: Check for NULL pointers in ni_try_remove_attr_list fs/ntfs3: Refactor ntfs_read_mft fs/ntfs3: Refactor ni_parse_reparse fs/ntfs3: Refactor ntfs_create_inode fs/ntfs3: Refactor ntfs_readlink_hlp fs/ntfs3: Rework ntfs_utf16_to_nls fs/ntfs3: Fix memory leak if fill_super failed fs/ntfs3: Keep prealloc for all types of files fs/ntfs3: Remove unnecessary functions fs/ntfs3: Forbid FALLOC_FL_PUNCH_HOLE for normal files fs/ntfs3: Refactoring of ntfs_set_ea fs/ntfs3: Remove locked argument in ntfs_set_ea fs/ntfs3: Use available posix_acl_release instead of ntfs_posix_acl_release fs/ntfs3: Check for NULL if ATTR_EA_INFO is incorrect fs/ntfs3: Refactoring of ntfs_init_from_boot fs/ntfs3: Reject mount if boot's cluster size < media sector size fs/ntfs3: Refactoring lock in ntfs_init_acl fs/ntfs3: Change posix_acl_equiv_mode to posix_acl_update_mode fs/ntfs3: Pass flags to ntfs_set_ea in ntfs_set_acl_ex fs/ntfs3: Refactor ntfs_get_acl_ex for better readability ...
2021-10-14io_uring: fix wrong condition to grab uring lockHao Xu
Grab uring lock when we are in io-worker rather than in the original or system-wq context since we already hold it in these two situation. Signed-off-by: Hao Xu <haoxu@linux.alibaba.com> Fixes: b66ceaf324b3 ("io_uring: move iopoll reissue into regular IO path") Link: https://lore.kernel.org/r/20211014140400.50235-1-haoxu@linux.alibaba.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-13ksmbd: validate compound response bufferNamjae Jeon
Add the check to validate compound response buffer. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-10-13ksmbd: fix potencial 32bit overflow from data area check in smb2_writeNamjae Jeon
DataOffset and Length validation can be potencial 32bit overflow. This patch fix it. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-10-13ksmbd: improve credits managementHyunchul Lee
* Requests except READ, WRITE, IOCTL, INFO, QUERY DIRECOTRY, CANCEL must consume one credit. * If client's granted credits are insufficient, refuse to handle requests. * Windows server 2016 or later grant up to 8192 credits to clients at once. Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-10-13ksmbd: add validation in smb2_ioctlNamjae Jeon
Add validation for request/response buffer size check in smb2_ioctl and fsctl_copychunk() take copychunk_ioctl_req pointer and the other arguments instead of smb2_ioctl_req structure and remove an unused smb2_ioctl_req argument of fsctl_validate_negotiate_info. Cc: Tom Talpey <tom@talpey.com> Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com> Cc: Ralph Böhme <slow@samba.org> Cc: Steve French <smfrench@gmail.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Acked-by: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-10-12fs/ntfs3: Check for NULL pointers in ni_try_remove_attr_listKonstantin Komarov
Check for potential NULL pointers. Print error message if found. Thread, that leads to this commit: https://lore.kernel.org/ntfs3/227c13e3-5a22-0cba-41eb-fcaf41940711@paragon-software.com/ Reported-by: Mohammad Rasim <mohammad.rasim96@gmail.com> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2021-10-11Merge tag 'for-5.15-rc5-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "A few more error handling fixes, stemming from code inspection, error injection or fuzzing" * tag 'for-5.15-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix abort logic in btrfs_replace_file_extents btrfs: check for error when looking up inode during dir entry replay btrfs: unify lookup return value when dir entry is missing btrfs: deal with errors when adding inode reference during log replay btrfs: deal with errors when replaying dir entry during log replay btrfs: deal with errors when checking if a dir entry exists during log replay btrfs: update refs for any root except tree log roots btrfs: unlock newly allocated extent buffer after error
2021-10-11fs/ntfs3: Refactor ntfs_read_mftKonstantin Komarov
Don't save size of attribute reparse point as size of symlink. Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2021-10-11fs/ntfs3: Refactor ni_parse_reparseKonstantin Komarov
Change argument from void* to struct REPARSE_DATA_BUFFER* We copy data to buffer, so we can read it later in ntfs_read_mft. Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2021-10-11fs/ntfs3: Refactor ntfs_create_inodeKonstantin Komarov
Set size for symlink, so we don't need to calculate it on the fly. Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2021-10-11fs/ntfs3: Refactor ntfs_readlink_hlpKonstantin Komarov
Rename some variables. Returned err by default is EINVAL. Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>