path: root/fs
Age | Commit message | Author
2022-05-21 | ksmbd: Fix some kernel-doc comments | Yang Li
Remove some warnings found by running scripts/kernel-doc, which is caused by using 'make W=1'.
fs/ksmbd/misc.c:30: warning: Function parameter or member 'str' not described in 'match_pattern'
fs/ksmbd/misc.c:30: warning: Excess function parameter 'string' description in 'match_pattern'
fs/ksmbd/misc.c:163: warning: Function parameter or member 'share' not described in 'convert_to_nt_pathname'
fs/ksmbd/misc.c:163: warning: Function parameter or member 'path' not described in 'convert_to_nt_pathname'
fs/ksmbd/misc.c:163: warning: Excess function parameter 'filename' description in 'convert_to_nt_pathname'
fs/ksmbd/misc.c:163: warning: Excess function parameter 'sharepath' description in 'convert_to_nt_pathname'
fs/ksmbd/misc.c:259: warning: Function parameter or member 'share' not described in 'convert_to_unix_name'
fs/ksmbd/misc.c:259: warning: Function parameter or member 'name' not described in 'convert_to_unix_name'
fs/ksmbd/misc.c:259: warning: Excess function parameter 'path' description in 'convert_to_unix_name'
fs/ksmbd/misc.c:259: warning: Excess function parameter 'tid' description in 'convert_to_unix_name'
Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-21 | ksmbd: fix wrong smbd max read/write size check | Namjae Jeon
The smb-direct max read/write size can differ from the smb2 max read/write size, so smb2_read() can return an error because of a wrong max read/write size check. This patch uses smb_direct_max_read_write_size for this check in smb-direct read/write(). Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Reviewed-by: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-21 | ksmbd: add smbd max io size parameter | Namjae Jeon
Add 'smbd max io size' parameter to adjust smbd-direct max read/write size. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Reviewed-by: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-21 | ksmbd: handle smb2 query dir request for OutputBufferLength that is too small | Namjae Jeon
We found an issue where ksmbd returns a STATUS_NO_MORE_FILES response even though there are still dentries that need to be read, during a file read/write test using framtest utils. The Windows client sends an smb2 query dir request with an OutputBufferLength (128) that is too small to contain even one entry. This patch makes ksmbd immediately return an OutputBufferLength of zero in the response to the client. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Reviewed-by: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-21 | ksmbd: smbd: handle multiple Buffer descriptors | Hyunchul Lee
Make ksmbd handle multiple buffer descriptors when reading and writing files using SMB direct: Post the work requests of rdma_rw_ctx for RDMA read/write in smb_direct_rdma_xmit(), and the work request for the READ/WRITE response with a remote invalidation in smb_direct_writev(). Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-21 | ksmbd: smbd: change the return value of get_sg_list | Hyunchul Lee
Make get_sg_list return EINVAL if there aren't mapped scatterlists. Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-21 | ksmbd: smbd: simplify tracking pending packets | Hyunchul Lee
Because we don't have to track pending packets separately as packets with payload and packets without payload, merge the tracking code. Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-21 | ksmbd: smbd: introduce read/write credits for RDMA read/write | Hyunchul Lee
To read and write a file, an SMB2_READ/SMB2_WRITE request has to be granted a number of rw credits equal to the pages the request wants to transfer divided by the maximum pages which can be registered with one MR. Also allocate enough RDMA resources for the maximum number of rw credits allowed by ksmbd. Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
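The credit count described above can be sketched as a small userspace model. This is not the ksmbd code: the page size, the helper names, and the choice to round the division up are all assumptions made for illustration.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


def pages_spanned(offset: int, length: int) -> int:
    """Number of pages touched by a buffer of `length` bytes at byte `offset`."""
    if length == 0:
        return 0
    first_page = offset // PAGE_SIZE
    last_page = (offset + length - 1) // PAGE_SIZE
    return last_page - first_page + 1


def rw_credits_needed(offset: int, length: int, pages_per_credit: int) -> int:
    """Hypothetical helper: pages to transfer divided by the pages one MR
    can register, rounded up (the rounding is an assumption)."""
    pages = pages_spanned(offset, length)
    return (pages + pages_per_credit - 1) // pages_per_credit
```

With 256 pages per credit, a 257-page transfer would need two credits under this model, while any transfer of up to 256 pages needs one.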
2022-05-21 | ksmbd: smbd: change prototypes of RDMA read/write related functions | Hyunchul Lee
Change the prototypes of RDMA read/write operations to accept a pointer and length of buffer descriptors. Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-21 | cifs: set the CREATE_NOT_FILE when opening the directory in use_cached_dir() | Ronnie Sahlberg
This enforces that we can only do this for directories and not normal files, or else the server will return an error. This means that we will have to conditionally check whether the path refers to a directory or not in all the call-sites where we are unsure. Right now this check is for "" i.e. root. Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Reviewed-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-21 | cifs: check for smb1 in open_cached_dir() | Ronnie Sahlberg
Check protocol version in open_cached_dir() and return not supported for SMB1. This allows us to call open_cached_dir() from code that is common to both smb1 and smb2/3 in future patches without having to do this check in the call-site. At the same time, add a check if tcon is valid or not for the same reason. Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Reviewed-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-21 | cifs: move definition of cifs_fattr earlier in cifsglob.h | Ronnie Sahlberg
This only moves these definitions to come earlier in the file but not change the definition itself. This is done to reduce the amount of changes in future patches. Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Reviewed-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-21 | io_uring: cleanup handling of the two task_work lists | Jens Axboe
Rather than pass in a bool for whether or not this work item needs to go into the priority list or not, provide separate helpers for it. For most use cases, this also then gets rid of the branch for non-priority task work. While at it, rename the prior_task_list to prio_task_list. Prior is a confusing name for it, as it would seem to indicate that this is the previous task_work list. prio makes it clear that this is a priority task_work list. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-20 | cifs: print TIDs as hex | Enzo Matsumiya
Makes these debug messages easier to read Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-20 | cifs: return ENOENT for DFS lookup_cache_entry() | Enzo Matsumiya
EEXIST didn't make sense to use when dfs_cache_find() couldn't find a cache entry nor retrieve a referral target. It also doesn't make sense for cifs_dfs_query_info_nonascii_quirk() to emulate ENOENT anymore. Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-20 | cifs: don't call cifs_dfs_query_info_nonascii_quirk() if nodfs was set | Enzo Matsumiya
Also return EOPNOTSUPP if path is remote but nodfs was set. Fixes: a2809d0e1696 ("cifs: quirk for STATUS_OBJECT_NAME_INVALID returned for non-ASCII dfs refs") Cc: stable@vger.kernel.org Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-20 | NFSD: Clean up nfsd_open_verified() | Chuck Lever
Its only caller always passes S_IFREG as the @type parameter. As an additional clean-up, add a kerneldoc comment. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-05-20 | NFSD: Remove do_nfsd_create() | Chuck Lever
Now that its two callers have their own version-specific instance of this function, do_nfsd_create() is no longer used. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-05-20 | NFSD: Refactor NFSv4 OPEN(CREATE) | Chuck Lever
Copy do_nfsd_create() to nfs4proc.c and remove NFSv3-specific logic. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-05-20 | NFSD: Refactor NFSv3 CREATE | Chuck Lever
The NFSv3 CREATE and NFSv4 OPEN(CREATE) use cases are about to diverge such that it makes sense to split do_nfsd_create() into one version for NFSv3 and one for NFSv4. As a first step, copy do_nfsd_create() to nfs3proc.c and remove NFSv4-specific logic. One immediate legibility benefit is that the logic for handling NFSv3 createhow is now quite straightforward. NFSv4 createhow has some subtleties that IMO do not belong in generic code. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-05-20 | NFSD: Refactor nfsd_create_setattr() | Chuck Lever
I'd like to move do_nfsd_create() out of vfs.c. Therefore nfsd_create_setattr() needs to be made publicly visible. Note that both call sites in vfs.c commit both the new object and its parent directory, so just combine those common metadata commits into nfsd_create_setattr(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-05-20 | NFSD: Avoid calling fh_drop_write() twice in do_nfsd_create() | Chuck Lever
Clean up: The "out" label already invokes fh_drop_write(). Note that fh_drop_write() is already careful not to invoke mnt_drop_write() if either it has already been done or there is nothing to drop. Therefore no change in behavior is expected. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
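The reason a second drop is harmless can be shown with a minimal model of an idempotent "drop" guarded by a flag. The class and attribute names below are hypothetical stand-ins, not the nfsd data structures.

```python
class FileHandleModel:
    """Toy model: a handle that holds at most one write reference on a mount."""

    def __init__(self):
        self.want_write = False   # True while this handle holds a write reference
        self.mnt_writers = 0      # stands in for the mount's writer count

    def want(self):
        # Take the write reference only once per handle.
        if not self.want_write:
            self.mnt_writers += 1
            self.want_write = True

    def drop_write(self):
        # Drop only if something is held, so repeated calls are no-ops.
        if self.want_write:
            self.want_write = False
            self.mnt_writers -= 1
```

Calling `drop_write()` twice in a row leaves `mnt_writers` at zero rather than going negative, which is the property the commit message relies on when it removes the redundant call.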
2022-05-20 | NFSD: Clean up nfsd3_proc_create() | Chuck Lever
As near as I can tell, mode bit masking and setting S_IFREG is already done by do_nfsd_create() and vfs_create(). The NFSv4 path (do_open_lookup), for example, does not bother with this special processing. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-05-20 | xfs: free xfs_attrd_log_items correctly | Darrick J. Wong
Technically speaking, objects allocated out of a specific slab cache are supposed to be freed to that slab cache. The popular slab backends will take care of this for us, but SLOB famously doesn't. Fix this, even if slob + xfs are not that common of a combination. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-05-20 | xfs: validate xattr name earlier in recovery | Darrick J. Wong
When we're validating a recovered xattr log item during log recovery, we should check the name before starting to allocate resources. This isn't strictly necessary on its own, but it means that we won't bother with huge memory allocations during recovery if the attr name is garbage, which will simplify the changes in the next patch. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-05-20 | xfs: reject unknown xattri log item filter flags during recovery | Darrick J. Wong
Make sure we screen the "attr flags" field of recovered xattr intent log items to reject flag bits that we don't know about. This is really the attr *filter* field from xfs_da_args, so rename the field and create a mask to make checking for invalid bits easier. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
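Screening a flags field against a mask of known bits can be sketched as follows. The flag names and bit values here are made up for illustration; the real on-disk values are not given in the commit message.

```python
# Hypothetical filter bits, chosen only for this example.
FILTER_ROOT = 1 << 0
FILTER_SECURE = 1 << 1
FILTER_INCOMPLETE = 1 << 2

# One mask naming every bit we understand makes the check a single AND.
KNOWN_FILTER_MASK = FILTER_ROOT | FILTER_SECURE | FILTER_INCOMPLETE


def filter_is_valid(flags: int) -> bool:
    """Reject any value that sets a bit outside the known mask."""
    return (flags & ~KNOWN_FILTER_MASK) == 0
```

A recovered item whose field sets an unknown bit, for example `1 << 7`, fails the check and can be rejected before any further processing.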
2022-05-20 | xfs: reject unknown xattri log item operation flags during recovery | Darrick J. Wong
Make sure we screen the op flags field of recovered xattr intent log items to reject flag bits that we don't know about. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-05-20 | xfs: don't leak the retained da state when doing a leaf to node conversion | Darrick J. Wong
If a setxattr operation finds an xattr structure in leaf format, adding the attr can fail due to lack of space and hence requires an upgrade to node format. After this happens, we'll roll the transaction and re-enter the state machine, at which time we need to perform a second lookup of the attribute name to find its new location. This lookup attaches a new da state structure to the xfs_attr_item but doesn't free the old one (from the leaf lookup) and leaks it. Fix that. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-05-20 | xfs: don't leak da state when freeing the attr intent item | Darrick J. Wong
kmemleak reported that we lost an xfs_da_state while removing xattrs in generic/020:
unreferenced object 0xffff88801c0e4b40 (size 480):
  comm "attr", pid 30515, jiffies 4294931061 (age 5.960s)
  hex dump (first 32 bytes):
    78 bc 65 07 00 c9 ff ff 00 30 60 1c 80 88 ff ff x.e......0`.....
    02 00 00 00 00 00 00 00 80 18 83 4e 80 88 ff ff ...........N....
  backtrace:
    [<ffffffffa023ef4a>] xfs_da_state_alloc+0x1a/0x30 [xfs]
    [<ffffffffa021b6f3>] xfs_attr_node_hasname+0x23/0x90 [xfs]
    [<ffffffffa021c6f1>] xfs_attr_set_iter+0x441/0xa30 [xfs]
    [<ffffffffa02b5104>] xfs_xattri_finish_update+0x44/0x80 [xfs]
    [<ffffffffa02b515e>] xfs_attr_finish_item+0x1e/0x40 [xfs]
    [<ffffffffa0244744>] xfs_defer_finish_noroll+0x184/0x740 [xfs]
    [<ffffffffa02a6473>] __xfs_trans_commit+0x153/0x3e0 [xfs]
    [<ffffffffa021d149>] xfs_attr_set+0x469/0x7e0 [xfs]
    [<ffffffffa02a78d9>] xfs_xattr_set+0x89/0xd0 [xfs]
    [<ffffffff812e6512>] __vfs_removexattr+0x52/0x70
    [<ffffffff812e6a08>] __vfs_removexattr_locked+0xb8/0x150
    [<ffffffff812e6af6>] vfs_removexattr+0x56/0x100
    [<ffffffff812e6bf8>] removexattr+0x58/0x90
    [<ffffffff812e6cce>] path_removexattr+0x9e/0xc0
    [<ffffffff812e6d44>] __x64_sys_lremovexattr+0x14/0x20
    [<ffffffff81786b35>] do_syscall_64+0x35/0x80
I think this is a consequence of xfs_attr_node_removename_setup attaching a new da(btree) state to xfs_attr_item and never freeing it. I /think/ it's the case that the remove paths could detach the da state earlier in the remove state machine since nothing else accesses the state. However, let's future-proof the new xattr code by adding a catch-all when we free the xfs_attr_item to make sure we never leak the da state. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
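The leak pattern in this commit and the previous one, and the "catch-all on free" remedy, can be illustrated with a toy allocation tracker. Everything below (class names, the `attach_state` method) is hypothetical and only models the ownership rule, not the XFS code.

```python
class TrackingAllocator:
    """Records live allocations so that leaks are visible in the model."""

    def __init__(self):
        self.live = set()
        self._next_id = 0

    def alloc(self) -> int:
        self._next_id += 1
        self.live.add(self._next_id)
        return self._next_id

    def free(self, handle: int) -> None:
        self.live.remove(handle)


class AttrItemModel:
    """Owns at most one lookup state at a time."""

    def __init__(self, allocator: TrackingAllocator):
        self.allocator = allocator
        self.da_state = None

    def attach_state(self) -> None:
        # Release the state from an earlier lookup before attaching a new one.
        if self.da_state is not None:
            self.allocator.free(self.da_state)
        self.da_state = self.allocator.alloc()

    def free(self) -> None:
        # Catch-all: whatever path got us here, do not leave a state behind.
        if self.da_state is not None:
            self.allocator.free(self.da_state)
            self.da_state = None
```

In this model a second `attach_state()` leaves exactly one live allocation, and `free()` leaves none, regardless of which step of a state machine attached the state.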
2022-05-19 | namei: cleanup double word in comment | Tom Rix
Remove the second 'to'. Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-05-19 | get rid of dead code in legitimize_root() | Al Viro
Combination of LOOKUP_IS_SCOPED and NULL nd->root.mnt is impossible after successful path_init(). All places where ->root.mnt might become NULL do that only if LOOKUP_IS_SCOPED is not there and path_init() itself can return success without setting nd->root only if ND_ROOT_PRESET had been set (in which case nd->root had been set by caller and never changed) or if the name had been a relative one *and* none of the bits in LOOKUP_IS_SCOPED had been present. Since all calls of legitimize_root() must be downstream of successful path_init(), the check for !nd->root.mnt && (nd->flags & LOOKUP_IS_SCOPED) is pure paranoia. FWIW, it had been discussed (and agreed upon) with Aleksa back when scoped lookups had been merged; looks like that had fallen through the cracks back then. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-05-19 | fs/namei.c:reserve_stack(): tidy up the call of try_to_unlazy() | Al Viro
!foo() != 0 is a strange way to spell !foo(); fallout from "fs: make unlazy_walk() error handling consistent"... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-05-19 | m->mnt_root->d_inode->i_sb is a weird way to spell m->mnt_sb... | Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-05-19 | uninline may_mount() and don't opencode it in fspick(2)/fsopen(2) | Al Viro
It's done once per (mount-related) syscall and there's no point whatsoever making it inline. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-05-19 | f2fs: make f2fs_read_inline_data() more readable | Chao Liu
In f2fs_read_inline_data(), the check of the inline_data flag is confusing, because we already checked it before calling. So this patch adds some comments for f2fs_has_inline_data(). Signed-off-by: Chao Liu <liuchao@coolpad.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-05-19 | fs/ntfs: remove redundant variable idx | Colin Ian King
The variable idx is assigned a value and is never read. The variable is not used and is redundant, remove it. Cleans up clang scan build warning: warning: Although the value stored to 'idx' is used in the enclosing expression, the value is never actually read from 'idx' [deadcode.DeadStores] Link: https://lkml.kernel.org/r/20220517093646.93628-2-colin.i.king@gmail.com Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Anton Altaparmakov <anton@tuxera.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19 | fat: remove time truncations in vfat_create/vfat_mkdir | Chung-Chiang Cheng
All the timestamps in vfat_create() and vfat_mkdir() come from fat_time_fat2unix() which ensures time granularity. We don't need to truncate them to fit FAT's format. Moreover, fat_truncate_crtime() and fat_timespec64_trunc_10ms() are also removed because there is no caller anymore. Link: https://lkml.kernel.org/r/20220503152536.2503003-4-cccheng@synology.com Signed-off-by: Chung-Chiang Cheng <cccheng@synology.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19 | fat: report creation time in statx | Chung-Chiang Cheng
creation time is no longer mixed with change time. Add an in-memory field for it, and report it in statx if supported. Link: https://lkml.kernel.org/r/20220503152536.2503003-3-cccheng@synology.com Signed-off-by: Chung-Chiang Cheng <cccheng@synology.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19 | fat: ignore ctime updates, and keep ctime identical to mtime in memory | Chung-Chiang Cheng
FAT supports creation time but not change time, and there was no corresponding timestamp for creation time in previous VFS. The original implementation took the compromise of saving the in-memory change time into the on-disk creation time field, but this would lead to compatibility issues with non-linux systems. To address this issue, this patch changes the behavior of ctime. It will no longer be loaded and stored from the creation time on disk. Instead of that, it'll be consistent with the in-memory mtime and share the same on-disk field. All updates to mtime will also be applied to ctime in memory, while all updates to ctime will be ignored. Link: https://lkml.kernel.org/r/20220503152536.2503003-2-cccheng@synology.com Signed-off-by: Chung-Chiang Cheng <cccheng@synology.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
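The new ctime behaviour can be summarised with a small model. The class, its methods, and the dictionary standing in for the on-disk directory entry are all hypothetical; only the rules (ctime mirrors mtime, direct ctime updates are ignored, the on-disk creation time is left alone) come from the message above.

```python
class FatTimesModel:
    """In-memory timestamps for one inode, following the rules above."""

    def __init__(self, mtime: int):
        self.mtime = mtime
        self.ctime = mtime  # ctime always mirrors mtime

    @classmethod
    def load(cls, on_disk: dict) -> "FatTimesModel":
        # ctime is no longer read from the on-disk creation time field.
        return cls(on_disk["mtime"])

    def update_mtime(self, t: int) -> None:
        self.mtime = t
        self.ctime = t

    def update_ctime(self, t: int) -> None:
        # Direct ctime updates are ignored.
        pass

    def store(self, on_disk: dict) -> dict:
        # Only the shared mtime field is written; creation time is untouched.
        updated = dict(on_disk)
        updated["mtime"] = self.mtime
        return updated
```

Under this model, a sequence of ctime updates never changes what another operating system would later read as the file's creation time.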
2022-05-19 | fat: split fat_truncate_time() into separate functions | Chung-Chiang Cheng
Split fat_truncate_time() into a separate function for each timestamp, in preparation for the later creation time work. This patch does not introduce any functional changes; it is merely a refactoring. Link: https://lkml.kernel.org/r/20220503152536.2503003-1-cccheng@synology.com Signed-off-by: Chung-Chiang Cheng <cccheng@synology.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19 | mm: zswap: add basic meminfo and vmstat coverage | Johannes Weiner
Currently it requires poking at debugfs to figure out the size and population of the zswap cache on a host. There are no counters for reads and writes against the cache. As a result, it's difficult to understand zswap behavior on production systems. Print zswap memory consumption and how many pages are zswapped out in /proc/meminfo. Count zswapouts and zswapins in /proc/vmstat. Link: https://lkml.kernel.org/r/20220510152847.230957-6-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: David Hildenbrand <david@redhat.com> Cc: Dan Streetman <ddstreet@ieee.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Roman Gushchin <guro@fb.com> Cc: Seth Jennings <sjenning@redhat.com> Cc: Shakeel Butt <shakeelb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net | Jakub Kicinski
drivers/net/ethernet/mellanox/mlx5/core/main.c
  b33886971dbc ("net/mlx5: Initialize flow steering during driver probe")
  40379a0084c2 ("net/mlx5_fpga: Drop INNOVA TLS support")
  f2b41b32cde8 ("net/mlx5: Remove ipsec_ops function table")
  https://lore.kernel.org/all/20220519040345.6yrjromcdistu7vh@sx1/
  16d42d313350 ("net/mlx5: Drain fw_reset when removing device")
  8324a02c342a ("net/mlx5: Add exit route when waiting for FW")
  https://lore.kernel.org/all/20220519114119.060ce014@canb.auug.org.au/
tools/testing/selftests/net/mptcp/mptcp_join.sh
  e274f7154008 ("selftests: mptcp: add subflow limits test-cases")
  b6e074e171bc ("selftests: mptcp: add infinite map testcase")
  5ac1d2d63451 ("selftests: mptcp: Add tests for userspace PM type")
  https://lore.kernel.org/all/20220516111918.366d747f@canb.auug.org.au/
net/mptcp/options.c
  ba2c89e0ea74 ("mptcp: fix checksum byte order")
  1e39e5a32ad7 ("mptcp: infinite mapping sending")
  ea66758c1795 ("tcp: allow MPTCP to update the announced window")
  https://lore.kernel.org/all/20220519115146.751c3a37@canb.auug.org.au/
net/mptcp/pm.c
  95d686517884 ("mptcp: fix subflow accounting on close")
  4d25247d3ae4 ("mptcp: bypass in-kernel PM restrictions for non-kernel PMs")
  https://lore.kernel.org/all/20220516111435.72f35dca@canb.auug.org.au/
net/mptcp/subflow.c
  ae66fb2ba6c3 ("mptcp: Do TCP fallback on early DSS checksum failure")
  0348c690ed37 ("mptcp: add the fallback check")
  f8d4bcacff3b ("mptcp: infinite mapping receiving")
  https://lore.kernel.org/all/20220519115837.380bb8d4@canb.auug.org.au/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-19 | kernfs: Separate kernfs_pr_cont_buf and rename_lock. | Hao Luo
Previously the protection of kernfs_pr_cont_buf was piggy backed by rename_lock, which means that pr_cont() needs to be protected under rename_lock. This can cause potential circular lock dependencies. If there is an OOM, we have the following call hierarchy:
  -> cpuset_print_current_mems_allowed()
    -> pr_cont_cgroup_name()
      -> pr_cont_kernfs_name()
pr_cont_kernfs_name() will grab rename_lock and call printk. So we have the following lock dependencies:
  kernfs_rename_lock -> console_sem
Sometimes, printk does a wakeup before releasing console_sem, which has the dependence chain:
  console_sem -> p->pi_lock -> rq->lock
Now, imagine one wants to read cgroup_name under rq->lock, for example, printing cgroup_name in a tracepoint in the scheduler code. They will be holding rq->lock and take rename_lock:
  rq->lock -> kernfs_rename_lock
Now they will deadlock. A prevention to this circular lock dependency is to separate the protection of pr_cont_buf from rename_lock. In principle, rename_lock is to protect the integrity of cgroup name when copying to buf. Once pr_cont_buf has got its content, rename_lock can be dropped. So it's safe to drop rename_lock after kernfs_name_locked (and kernfs_path_from_node_locked) and rely on a dedicated pr_cont_lock to protect pr_cont_buf. Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Hao Luo <haoluo@google.com> Link: https://lore.kernel.org/r/20220516190951.3144144-1-haoluo@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
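The circular dependency described above can be checked mechanically by treating each "A is held while taking B" relation as a directed edge and looking for a cycle. The sketch below is a generic depth-first cycle check, not lockdep; the lock names are taken from the message, and the "after" edge list is an assumption about how the dependencies look once printk runs under pr_cont_lock instead of rename_lock.

```python
def has_cycle(edges):
    """Return True if the directed lock-order graph contains a cycle."""
    graph = {}
    for held, taken in edges:
        graph.setdefault(held, []).append(taken)
        graph.setdefault(taken, [])

    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {node: UNSEEN for node in graph}

    def visit(node):
        state[node] = IN_PROGRESS
        for nxt in graph[node]:
            if state[nxt] == IN_PROGRESS:
                return True  # back edge: a lock-order cycle
            if state[nxt] == UNSEEN and visit(nxt):
                return True
        state[node] = DONE
        return False

    return any(state[n] == UNSEEN and visit(n) for n in graph)


# Dependencies before the change, as listed in the commit message.
BEFORE = [
    ("kernfs_rename_lock", "console_sem"),
    ("console_sem", "p->pi_lock"),
    ("p->pi_lock", "rq->lock"),
    ("rq->lock", "kernfs_rename_lock"),
]

# Assumed dependencies after the change: printk is reached under pr_cont_lock.
AFTER = [
    ("pr_cont_lock", "console_sem"),
    ("console_sem", "p->pi_lock"),
    ("p->pi_lock", "rq->lock"),
    ("rq->lock", "kernfs_rename_lock"),
]
```

`has_cycle(BEFORE)` reports the loop through rq->lock, while `has_cycle(AFTER)` finds none, because nothing in the assumed graph takes pr_cont_lock while holding rq->lock.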
2022-05-19 | fs-verity: Use struct_size() helper in enable_verity() | Zhang Jianhua
Follow the best practice for allocating a variable-sized structure. Signed-off-by: Zhang Jianhua <chris.zjh@huawei.com> [ebiggers: adjusted commit message] Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20220519022450.2434483-1-chris.zjh@huawei.com
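The point of a size helper for a structure with a trailing array is that the multiplication and addition cannot silently wrap. The following is a userspace model of that idea with a 64-bit size type assumed; the function names are illustrative and this is not the kernel macro itself.

```python
SIZE_MAX = 2**64 - 1  # assumes a 64-bit size type


def wrapping_size(header_bytes: int, elem_bytes: int, count: int) -> int:
    """Open-coded arithmetic: wraps around on overflow, like unsigned math."""
    return (header_bytes + elem_bytes * count) % (SIZE_MAX + 1)


def saturating_struct_size(header_bytes: int, elem_bytes: int, count: int) -> int:
    """Overflow-checked arithmetic: saturates at SIZE_MAX instead of wrapping,
    so an allocator asked for this size fails rather than under-allocating."""
    total = header_bytes + elem_bytes * count
    return SIZE_MAX if total > SIZE_MAX else total
```

For a 16-byte header with 8-byte elements and a count of 2**61, the open-coded version wraps to 16 bytes, while the saturating version returns `SIZE_MAX`.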
2022-05-19 | NFSD: Show state of courtesy client in client info | Dai Ngo
Update client_info_show to show state of courtesy client and seconds since last renew. Reviewed-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-05-19 | NFSD: add support for lock conflict to courteous server | Dai Ngo
This patch allows an expired client with lock state to be in COURTESY state. A lock conflict with a COURTESY client is resolved by the fs/lock code using the lm_lock_expirable and lm_expire_lock callbacks in struct lock_manager_operations. If the conflicting client is in COURTESY state, set it to EXPIRABLE and schedule the laundromat to run immediately to expire the client. The lm_expire_lock callback waits for the laundromat to flush its work queue before returning to the caller. Reviewed-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-05-19 | fs/lock: add 2 callbacks to lock_manager_operations to resolve conflict | Dai Ngo
Add 2 new callbacks, lm_lock_expirable and lm_expire_lock, to lock_manager_operations to allow the lock manager to take appropriate action to resolve the lock conflict if possible. A new field, lm_mod_owner, is also added to lock_manager_operations. The lm_mod_owner is used by the fs/lock code to make sure the lock manager module, such as nfsd, is not freed while a lock conflict is being resolved. lm_lock_expirable checks whether the lock conflict can be resolved and returns true if so, otherwise false. This callback must be called with the flc_lock held, so it cannot block. lm_expire_lock is called to resolve the lock conflict if the returned value from lm_lock_expirable is true. This callback is called without the flc_lock held since it's allowed to block. Upon returning from this callback, the lock conflict should be resolved and the caller is expected to restart the conflict check from the beginning of the list. A lock manager, such as the NFSv4 courteous server, uses this callback to resolve a conflict by destroying the lock owner, or the NFSv4 courtesy client (a client that has expired but is allowed to maintain its state) that owns the lock. Reviewed-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org>
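The calling convention described above (check under the lock without blocking, resolve outside the lock, then restart the scan) can be sketched with a userspace model. The function name, the list of dictionaries, and the callback signatures are hypothetical; only the two callback names and the locking rule come from the message.

```python
import threading


def find_blocking_lock(flc_lock, locks, conflicts, lm_lock_expirable, lm_expire_lock):
    """Return the first conflicting lock that cannot be expired, or None.

    Model assumption: lm_expire_lock() removes expirable locks from `locks`;
    if it did not, this loop would never terminate.
    """
    while True:
        with flc_lock:
            blocker = next((lk for lk in locks if conflicts(lk)), None)
            if blocker is None:
                return None
            # Called with flc_lock held, so this callback must not block.
            if not lm_lock_expirable(blocker):
                return blocker
        # flc_lock has been released here, so this callback may block.
        lm_expire_lock()
        # Loop around: restart the conflict check from the start of the list.
```

The restart matters because the list may have changed arbitrarily while the lock was dropped, so any position remembered from the previous scan is no longer trustworthy.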
2022-05-19 | fs/lock: add helper locks_owner_has_blockers to check for blockers | Dai Ngo
Add helper locks_owner_has_blockers to check if there are any blockers for a given lockowner. Reviewed-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org>
2022-05-19 | NFSD: move create/destroy of laundry_wq to init_nfsd and exit_nfsd | Dai Ngo
This patch moves create/destroy of laundry_wq from nfs4_state_start and nfs4_state_shutdown_net to init_nfsd and exit_nfsd to prevent the laundromat from being freed while a thread is processing a conflicting lock. Reviewed-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-05-19 | NFSD: add support for share reservation conflict to courteous server | Dai Ngo
This patch allows an expired client with open state to be in COURTESY state. A share/access conflict with a COURTESY client is resolved by setting the COURTESY client to EXPIRABLE state, scheduling the laundromat to run, and returning nfserr_jukebox to the requesting client. Reviewed-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>