summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2023-02-02ceph: move mount state enum to super.hXiubo Li
These flags are only used in ceph filesystem in fs/ceph, so just move it to the place it should be. Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2023-02-01hostfs: Replace kmap() with kmap_local_page()Fabio M. De Francesco
The use of kmap() is being deprecated in favor of kmap_local_page(). There are two main problems with kmap(): (1) It comes with an overhead as the mapping space is restricted and protected by a global lock for synchronization and (2) it also requires global TLB invalidation when the kmap’s pool wraps and it might block when the mapping space is fully utilized until a slot becomes available. With kmap_local_page() the mappings are per thread, CPU local, can take page faults, and can be called from any context (including interrupts). It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore, the tasks can be preempted and, when they are scheduled to run again, the kernel virtual addresses are restored and still valid. Therefore, replace kmap() with kmap_local_page() in hostfs_kern.c, it being the only file with kmap() call sites currently left in fs/hostfs. Cc: "Venkataramanan, Anirudh" <anirudh.venkataramanan@intel.com> Suggested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2023-01-31Sync mm-stable with mm-hotfixes-stable to pick up dependent patchesAndrew Morton
Merge branch 'mm-hotfixes-stable' into mm-stable
2023-01-31Squashfs: fix handling and sanity checking of xattr_ids countPhillip Lougher
A Sysbot [1] corrupted filesystem exposes two flaws in the handling and sanity checking of the xattr_ids count in the filesystem. Both of these flaws cause computation overflow due to incorrect typing. In the corrupted filesystem the xattr_ids value is 4294967071, which stored in a signed variable becomes the negative number -225. Flaw 1 (64-bit systems only): The signed integer xattr_ids variable causes sign extension. This causes variable overflow in the SQUASHFS_XATTR_*(A) macros. The variable is first multiplied by sizeof(struct squashfs_xattr_id) where the type of the sizeof operator is "unsigned long". On a 64-bit system this is 64-bits in size, and causes the negative number to be sign extended and widened to 64-bits and then become unsigned. This produces the very large number 18446744073709548016 or 2^64 - 3600. This number when rounded up by SQUASHFS_METADATA_SIZE - 1 (8191 bytes) and divided by SQUASHFS_METADATA_SIZE overflows and produces a length of 0 (stored in len). Flaw 2 (32-bit systems only): On a 32-bit system the integer variable is not widened by the unsigned long type of the sizeof operator (32-bits), and the signedness of the variable has no effect due it always being treated as unsigned. The above corrupted xattr_ids value of 4294967071, when multiplied overflows and produces the number 4294963696 or 2^32 - 3400. This number when rounded up by SQUASHFS_METADATA_SIZE - 1 (8191 bytes) and divided by SQUASHFS_METADATA_SIZE overflows again and produces a length of 0. The effect of the 0 length computation: In conjunction with the corrupted xattr_ids field, the filesystem also has a corrupted xattr_table_start value, where it matches the end of filesystem value of 850. This causes the following sanity check code to fail because the incorrectly computed len of 0 matches the incorrect size of the table reported by the superblock (0 bytes). len = SQUASHFS_XATTR_BLOCK_BYTES(*xattr_ids); indexes = SQUASHFS_XATTR_BLOCKS(*xattr_ids); /* * The computed size of the index table (len bytes) should exactly * match the table start and end points */ start = table_start + sizeof(*id_table); end = msblk->bytes_used; if (len != (end - start)) return ERR_PTR(-EINVAL); Changing the xattr_ids variable to be "usigned int" fixes the flaw on a 64-bit system. This relies on the fact the computation is widened by the unsigned long type of the sizeof operator. Casting the variable to u64 in the above macro fixes this flaw on a 32-bit system. It also means 64-bit systems do not implicitly rely on the type of the sizeof operator to widen the computation. [1] https://lore.kernel.org/lkml/000000000000cd44f005f1a0f17f@google.com/ Link: https://lkml.kernel.org/r/20230127061842.10965-1-phillip@squashfs.org.uk Fixes: 506220d2ba21 ("squashfs: add more sanity checks in xattr id lookup") Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk> Reported-by: <syzbot+082fa4af80a5bb1a9843@syzkaller.appspotmail.com> Cc: Alexey Khoroshilov <khoroshilov@ispras.ru> Cc: Fedor Pchelkin <pchelkin@ispras.ru> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-31mm: hugetlb: proc: check for hugetlb shared PMD in /proc/PID/smapsMike Kravetz
Patch series "Fixes for hugetlb mapcount at most 1 for shared PMDs". This issue of mapcount in hugetlb pages referenced by shared PMDs was discussed in [1]. The following two patches address user visible behavior caused by this issue. [1] https://lore.kernel.org/linux-mm/Y9BF+OCdWnCSilEu@monkey/ This patch (of 2): A hugetlb page will have a mapcount of 1 if mapped by multiple processes via a shared PMD. This is because only the first process increases the map count, and subsequent processes just add the shared PMD page to their page table. page_mapcount is being used to decide if a hugetlb page is shared or private in /proc/PID/smaps. Pages referenced via a shared PMD were incorrectly being counted as private. To fix, check for a shared PMD if mapcount is 1. If a shared PMD is found count the hugetlb page as shared. A new helper to check for a shared PMD is added. [akpm@linux-foundation.org: simplification, per David] [akpm@linux-foundation.org: hugetlb.h: include page_ref.h for page_count()] Link: https://lkml.kernel.org/r/20230126222721.222195-2-mike.kravetz@oracle.com Fixes: 25ee01a2fca0 ("mm: hugetlb: proc: add hugetlb-related fields to /proc/PID/smaps") Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Acked-by: Peter Xu <peterx@redhat.com> Cc: David Hildenbrand <david@redhat.com> Cc: James Houghton <jthoughton@google.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev> Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com> Cc: Yang Shi <shy828301@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-31freevxfs: Kconfig: fix spellingRandy Dunlap
Fix a spello in freevxfs Kconfig. (reported by codespell) Link: https://lkml.kernel.org/r/20230124181638.15604-1-rdunlap@infradead.org Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-31squashfs: harden sanity check in squashfs_read_xattr_id_tableFedor Pchelkin
While mounting a corrupted filesystem, a signed integer '*xattr_ids' can become less than zero. This leads to the incorrect computation of 'len' and 'indexes' values which can cause null-ptr-deref in copy_bio_to_actor() or out-of-bounds accesses in the next sanity checks inside squashfs_read_xattr_id_table(). Found by Linux Verification Center (linuxtesting.org) with Syzkaller. Link: https://lkml.kernel.org/r/20230117105226.329303-2-pchelkin@ispras.ru Fixes: 506220d2ba21 ("squashfs: add more sanity checks in xattr id lookup") Reported-by: <syzbot+082fa4af80a5bb1a9843@syzkaller.appspotmail.com> Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Cc: Phillip Lougher <phillip@squashfs.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-31gfs2: Improve gfs2_make_fs_rw error handlingAndreas Gruenbacher
In gfs2_make_fs_rw(), make sure to call gfs2_consist() to report an inconsistency and mark the filesystem as withdrawn when gfs2_find_jhead() fails. At the end of gfs2_make_fs_rw(), when we discover that the filesystem has been withdrawn, make sure we report an error. This also replaces the gfs2_withdrawn() check after gfs2_find_jhead(). Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Cc: syzbot+f51cb4b9afbd87ec06f2@syzkaller.appspotmail.com Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-01-31Revert "GFS2: free disk inode which is deleted by remote node -V2"Bob Peterson
This reverts commit 970343cd4904 ("GFS2: free disk inode which is deleted by remote node -V2"). The original intent behind commit 970343cd49 was to cull dentries when a remote node requests to demote an iopen glock, which happens when the remote node tries to delete the inode. This is now handled by gfs2_try_evict(), which is called via iopen_go_callback() -> gfs2_queue_try_to_evict(). Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-01-31gfs2: Evict inodes cooperativelyAndreas Gruenbacher
Add a gfs2_evict_inodes() helper that evicts inodes cooperatively across the cluster. This avoids running into timeouts during unmount unnecessarily. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-01-31gfs2: Flush delete work before shrinking inode cacheAndreas Gruenbacher
In gfs2_kill_sb(), flush the delete work queue after setting the SDF_DEACTIVATING flag. This ensures that no new inodes will be instantiated anymore, and the inode cache will be empty after the following kill_block_super() -> generic_shutdown_super() -> evict_inodes() call. With that, function gfs2_make_fs_ro() now calls gfs2_flush_delete_work() after the workqueue has been destroyed. Skip that by checking for the presence of the SDF_DEACTIVATING flag. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-01-31gfs2: Cease delete work during unmountBob Peterson
Add a check to delete_work_func() so that it quits when it finds that the filesystem is deactivating. This speeds up the delete workqueue draining in gfs2_kill_sb(). In addition, make sure that iopen_go_callback() won't queue any new delete work while the filesystem is deactivating. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-01-31gfs2: Add SDF_DEACTIVATING super block flagBob Peterson
Add a new SDF_DEACTIVATING super block flag that is set when the filesystem has started to deactivate. This will be used in the next patch to stop and drain the delete work during unmount. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-01-31gfs2: check gl_object in rgrp glopsBob Peterson
Function gfs2_clear_rgrpd() is called during unmount to free all rgrps and their sub-objects. If the rgrp glock is held (e.g. in SH) it calls gfs2_glock_cb() to unlock, then calls flush_delayed_work() to make sure any glock work is finished. However, there is a race with other cluster nodes who may request the rgrp glock in another mode (say, EX). Func gfs2_clear_rgrpd() calls glock_clear_object() which sets gl_object to NULL but that's done without holding the gl_lockref spin_lock. While the lock is not held Another node's demote request can cause the state machine to run again, and since the gl_lockref is released in do_xmote, the second process's call to do_xmote can call go_inval (rgrp_go_inval) after the gl_object has been cleared, which results in NULL pointer reference of the rgrp glock's gl_object. Other go_inval glops functions don't require the gl_object to exist, as evidenced by function inode_go_inval() which explicitly checks for if (ip) before referencing gl_object. This patch does the same thing for rgrp glocks. Both the go_inval and go_sync ops are patched to check the existence of gl_object (rgd) before trying to dereference it. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-01-31gfs2: Split the two kinds of glock "delete" workAndreas Gruenbacher
Function delete_work_func() is used for two purposes: * to immediately try to evict the glock's inode, and * to verify after a little while that the inode has been deleted as expected, and didn't just get skipped. These two operations are not separated very well, so introduce two new glock flags to improved that. Split gfs2_queue_delete_work() into gfs2_queue_try_to_evict and gfs2_queue_verify_evict(). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-01-31gfs2: Move delete workqueue into super blockAndreas Gruenbacher
Move the global delete workqueue into struct gfs2_sbd so that we can flush / drain it without interfering with other filesystems. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-01-31gfs2: Get rid of GLF_PENDING_DELETE flagAndreas Gruenbacher
Get rid of the GLF_PENDING_DELETE glock flag introduced by commit a0e3cc65fa29 ("gfs2: Turn gl_delete into a delayed work"). The only use of that flag is to prevent the iopen glock from being demoted (i.e., unlocked) while delete work is pending. It turns out that demoting the iopen glock while delete work is pending is perfectly fine; we only need to make sure that the glock isn't being freed while still in use. This is ensured by the previous patch because delete_work_func() owns a reference while the work is queued or running. With these changes, gfs2_queue_delete_work() no longer takes the glock spin lock, so we can use it in iopen_go_callback() instead of open-coding it there. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-01-31gfs2: Make glock lru list scanning saferAndreas Gruenbacher
In __gfs2_glock_put(), remove the glock from the lru list *after* dropping the glock lock. This prevents deadlocks against gfs2_scan_glock_lru(). In gfs2_scan_glock_lru(), make sure that the glock's reference count is zero before moving the glock to the dispose list. This skips glocks that are marked dead as well as glocks that are still in use. Additionally, switch to spin_trylock() as we already do in gfs2_dispose_glock_lru(); this alone would also be enough to prevent deadlocks against __gfs2_glock_put(). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-01-31gfs2: Clean up gfs2_scan_glock_lruAndreas Gruenbacher
Switch to list_for_each_entry_safe() and eliminate the "skipped" list in gfs2_scan_glock_lru(). At the same time, scan the requested number of items to scan, not one more than that number. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-01-31gfs2: Improve gfs2_upgrade_iopen_glock commentAndreas Gruenbacher
Improve the comment describing the inode and iopen glock interactions and the glock poking related to inode evict. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-01-31f2fs: remove __add_sum_entryChristoph Hellwig
This function just assigns a summary entry. This can be done entirely typesafe with an open code struct assignment that relies on array indexing. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-01-31f2fs: fix to abort atomic write only during do_exist()Chao Yu
Commit 7a10f0177e11 ("f2fs: don't give partially written atomic data from process crash") attempted to drop atomic write data after process crash, however, f2fs_abort_atomic_write() may be called from noncrash case, fix it by adding missed PF_EXITING check condition f2fs_file_flush(). - application crashs - do_exit - exit_signals -- sets PF_EXITING - exit_files - put_files_struct - close_files - filp_close - flush (f2fs_file_flush) - check atomic_write_task && PF_EXITING - f2fs_abort_atomic_write Fixes: 7a10f0177e11 ("f2fs: don't give partially written atomic data from process crash") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-01-31f2fs: allow set compression option of files without blocksYangtao Li
Files created by truncate have a size but no blocks, so they can be allowed to set compression option. Fixes: e1e8debec656 ("f2fs: add F2FS_IOC_SET_COMPRESS_OPTION ioctl") Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-01-31f2fs: fix information leak in f2fs_move_inline_dirents()Eric Biggers
When converting an inline directory to a regular one, f2fs is leaking uninitialized memory to disk because it doesn't initialize the entire directory block. Fix this by zero-initializing the block. This bug was introduced by commit 4ec17d688d74 ("f2fs: avoid unneeded initializing when converting inline dentry"), which didn't consider the security implications of leaking uninitialized memory to disk. This was found by running xfstest generic/435 on a KMSAN-enabled kernel. Fixes: 4ec17d688d74 ("f2fs: avoid unneeded initializing when converting inline dentry") Cc: <stable@vger.kernel.org> # v4.3+ Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-01-31fs: f2fs: initialize fsdata in pagecache_write()Alexander Potapenko
When aops->write_begin() does not initialize fsdata, KMSAN may report an error passing the latter to aops->write_end(). Fix this by unconditionally initializing fsdata. Suggested-by: Eric Biggers <ebiggers@kernel.org> Fixes: 95ae251fe828 ("f2fs: add fs-verity support") Signed-off-by: Alexander Potapenko <glider@google.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-01-31f2fs: fix to check warm_data_age_thresholdYangtao Li
hot_data_age_threshold is a non-zero positive number, and condition 2 includes condition 1, so there is no need to additionally judge whether t is 0. And let's remove it. Signed-off-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-01-31f2fs: return true if all cmd were issued or no cmd need to be issued for ↵Yangtao Li
f2fs_issue_discard_timeout() f2fs_issue_discard_timeout() returns whether discard cmds are dropped, which does not match the meaning of the function. Let's change it to return whether all discard cmd are issued. After commit 4d67490498ac ("f2fs: Don't create discard thread when device doesn't support realtime discard"), f2fs_issue_discard_timeout() is alse called by f2fs_remount(). Since the comments of f2fs_issue_discard_timeout() doesn't make much sense, let's update it. Signed-off-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-01-31f2fs: fix to show discard_unit mount optYangtao Li
Convert to show discard_unit only when has DISCARD opt. Signed-off-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-01-31f2fs: fix to do sanity check on extent cache correctlyChao Yu
In do_read_inode(), sanity_check_inode() should be called after f2fs_init_read_extent_tree(), fix it. Fixes: 72840cccc0a1 ("f2fs: allocate the extent_cache by default") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-01-31f2fs: remove unneeded f2fs_cp_error() in f2fs_create_whiteout()Chao Yu
f2fs_rename() has checked CP_ERROR_FLAG, so remove redundant check in f2fs_create_whiteout(). Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-01-31Merge tag 'v6.2-rc6' into sched/core, to pick up fixesIngo Molnar
Pick up fixes before merging another batch of cpuidle updates. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2023-01-30ksmbd: Fix spelling mistake "excceed" -> "exceeded"Colin Ian King
There is a spelling mistake in an error message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-01-30ksmbd: update Kconfig to note Kerberos support and fix indentationSteve French
Fix indentation of server config options, and also since support for very old, less secure, NTLM authentication was removed (and quite a while ago), remove the mention of that in Kconfig, but do note Kerberos (not just NTLMv2) which are supported and much more secure. Acked-by: Namjae Jeon <linkinjeon@kernel.org> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-01-30ksmbd: Remove duplicated codesDawei Li
ksmbd_neg_token_init_mech_token() and ksmbd_neg_token_targ_resp_token() share same implementation, unify them. Signed-off-by: Dawei Li <set_pte_at@outlook.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-01-30ksmbd: fix typo, syncronous->synchronousDawei Li
syncronous->synchronous Signed-off-by: Dawei Li <set_pte_at@outlook.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-01-30f2fs: clear atomic_write_task in f2fs_abort_atomic_write()Chao Yu
Otherwise, last .atomic_write_task will be remained in structure f2fs_inode_info, resulting in aborting atomic_write accidentally in race case. Meanwhile, clear original_i_size as well. Fixes: 7a10f0177e11 ("f2fs: don't give partially written atomic data from process crash") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-01-30f2fs: introduce trace_f2fs_replace_atomic_write_blockChao Yu
Commit 3db1de0e582c ("f2fs: change the current atomic write way") removed old tracepoints, but it missed to add new one, this patch fixes to introduce trace_f2fs_replace_atomic_write_block to trace atomic_write commit flow. Fixes: 3db1de0e582c ("f2fs: change the current atomic write way") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-01-30f2fs: introduce discard_io_aware_gran sysfs nodeYangtao Li
The current discard_io_aware_gran is a fixed value, change it to be configurable through the sys node. Signed-off-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-01-30f2fs: drop useless initializer and unneeded local variableYangtao Li
No need to initialize idx twice. BTW, remove the unnecessary cnt variable. Signed-off-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-01-30f2fs: add iostat support for flushYangtao Li
In this patch, it adds to account flush count. Signed-off-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-01-30fscache: Use clear_and_wake_up_bit() in fscache_create_volume_work()Hou Tao
fscache_create_volume_work() uses wake_up_bit() to wake up the processes which are waiting for the completion of volume creation. According to comments in wake_up_bit() and waitqueue_active(), an extra smp_mb() is needed to guarantee the memory order between FSCACHE_VOLUME_CREATING flag and waitqueue_active() before invoking wake_up_bit(). Fixing it by using clear_and_wake_up_bit() to add the missing memory barrier. Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com> Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20230113115211.2895845-3-houtao@huaweicloud.com/ # v3
2023-01-30fscache: Use wait_on_bit() to wait for the freeing of relinquished volumeHou Tao
The freeing of relinquished volume will wake up the pending volume acquisition by using wake_up_bit(), however it is mismatched with wait_var_event() used in fscache_wait_on_volume_collision() and it will never wake up the waiter in the wait-queue because these two functions operate on different wait-queues. According to the implementation in fscache_wait_on_volume_collision(), if the wake-up of pending acquisition is delayed longer than 20 seconds (e.g., due to the delay of on-demand fd closing), the first wait_var_event_timeout() will timeout and the following wait_var_event() will hang forever as shown below: FS-Cache: Potential volume collision new=00000024 old=00000022 ...... INFO: task mount:1148 blocked for more than 122 seconds. Not tainted 6.1.0-rc6+ #1 task:mount state:D stack:0 pid:1148 ppid:1 Call Trace: <TASK> __schedule+0x2f6/0xb80 schedule+0x67/0xe0 fscache_wait_on_volume_collision.cold+0x80/0x82 __fscache_acquire_volume+0x40d/0x4e0 erofs_fscache_register_volume+0x51/0xe0 [erofs] erofs_fscache_register_fs+0x19c/0x240 [erofs] erofs_fc_fill_super+0x746/0xaf0 [erofs] vfs_get_super+0x7d/0x100 get_tree_nodev+0x16/0x20 erofs_fc_get_tree+0x20/0x30 [erofs] vfs_get_tree+0x24/0xb0 path_mount+0x2fa/0xa90 do_mount+0x7c/0xa0 __x64_sys_mount+0x8b/0xe0 do_syscall_64+0x30/0x60 entry_SYSCALL_64_after_hwframe+0x46/0xb0 Considering that wake_up_bit() is more selective, so fix it by using wait_on_bit() instead of wait_var_event() to wait for the freeing of relinquished volume. In addition because waitqueue_active() is used in wake_up_bit() and clear_bit() doesn't imply any memory barrier, use clear_and_wake_up_bit() to add the missing memory barrier between cursor->flags and waitqueue_active(). Fixes: 62ab63352350 ("fscache: Implement volume registration") Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com> Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20230113115211.2895845-2-houtao@huaweicloud.com/ # v3
2023-01-29ksmbd: Implements sess->rpc_handle_list as xarrayDawei Li
For some ops on rpc handle: 1. ksmbd_session_rpc_method(), possibly on high frequency. 2. ksmbd_session_rpc_close(). id is used as indexing key to lookup channel, in that case, linear search based on list may suffer a bit for performance. Implements sess->rpc_handle_list as xarray. Signed-off-by: Dawei Li <set_pte_at@outlook.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-01-29ksmbd: Implements sess->ksmbd_chann_list as xarrayDawei Li
For some ops on channel: 1. lookup_chann_list(), possibly on high frequency. 2. ksmbd_chann_del(). Connection is used as indexing key to lookup channel, in that case, linear search based on list may suffer a bit for performance. Implements sess->ksmbd_chann_list as xarray. Signed-off-by: Dawei Li <set_pte_at@outlook.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-01-28fscrypt: support decrypting data from large foliosEric Biggers
Try to make the filesystem-level decryption functions in fs/crypto/ aware of large folios. This includes making fscrypt_decrypt_bio() support the case where the bio contains large folios, and making fscrypt_decrypt_pagecache_blocks() take a folio instead of a page. There's no way to actually test this with large folios yet, but I've tested that this doesn't cause any regressions. Note that this patch just handles *decryption*, not encryption which will be a little more difficult. Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lore.kernel.org/r/20230127224202.355629-1-ebiggers@kernel.org
2023-01-28Merge tag '6.2-rc5-ksmbd-server-fixes' of git://git.samba.org/ksmbdLinus Torvalds
Pull ksmbd server fixes from Steve French: "Four smb3 server fixes, all also for stable: - fix for signing bug - fix to more strictly check packet length - add a max connections parm to limit simultaneous connections - fix error message flood that can occur with newer Samba xattr format" * tag '6.2-rc5-ksmbd-server-fixes' of git://git.samba.org/ksmbd: ksmbd: downgrade ndr version error message to debug ksmbd: limit pdu length size according to connection status ksmbd: do not sign response to session request for guest login ksmbd: add max connections parameter
2023-01-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Conflicts: drivers/net/ethernet/intel/ice/ice_main.c 418e53401e47 ("ice: move devlink port creation/deletion") 643ef23bd9dd ("ice: Introduce local var for readability") https://lore.kernel.org/all/20230127124025.0dacef40@canb.auug.org.au/ https://lore.kernel.org/all/20230124005714.3996270-1-anthony.l.nguyen@intel.com/ drivers/net/ethernet/engleder/tsnep_main.c 3d53aaef4332 ("tsnep: Fix TX queue stop/wake for multiple queues") 25faa6a4c5ca ("tsnep: Replace TX spin_lock with __netif_tx_lock") https://lore.kernel.org/all/20230127123604.36bb3e99@canb.auug.org.au/ net/netfilter/nf_conntrack_proto_sctp.c 13bd9b31a969 ("Revert "netfilter: conntrack: add sctp DATA_SENT state"") a44b7651489f ("netfilter: conntrack: unify established states for SCTP paths") f71cb8f45d09 ("netfilter: conntrack: sctp: use nf log infrastructure for invalid packets") https://lore.kernel.org/all/20230127125052.674281f9@canb.auug.org.au/ https://lore.kernel.org/all/d36076f3-6add-a442-6d4b-ead9f7ffff86@tessares.net/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-27Merge tag '6.2-rc5-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull cifs fix from Steve French: "Fix for reconnect oops in smbdirect (RDMA), also is marked for stable" * tag '6.2-rc5-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: Fix oops due to uncleared server->smbd_conn in reconnect
2023-01-27ipc,namespace: batch free ipc_namespace structuresRik van Riel
Instead of waiting for an RCU grace period between each ipc_namespace structure that is being freed, wait an RCU grace period for every batch of ipc_namespace structures. Thanks to Al Viro for the suggestion of the helper function. This speeds up the run time of the test case that allocates ipc_namespaces in a loop from 6 minutes, to a little over 1 second: real 0m1.192s user 0m0.038s sys 0m1.152s Signed-off-by: Rik van Riel <riel@surriel.com> Reported-by: Chris Mason <clm@meta.com> Tested-by: Giuseppe Scrivano <gscrivan@redhat.com> Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2023-01-27fsverity: support verifying data from large foliosEric Biggers
Try to make fs/verity/verify.c aware of large folios. This includes making fsverity_verify_bio() support the case where the bio contains large folios, and adding a function fsverity_verify_folio() which is the equivalent of fsverity_verify_page(). There's no way to actually test this with large folios yet, but I've tested that this doesn't cause any regressions. Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20230127221529.299560-1-ebiggers@kernel.org