path: root/fs
AgeCommit message (Collapse)Author
2022-05-23NFSD: Trace filecache opensChuck Lever
Instrument calls to nfsd_open_verified() to get a sense of the filecache hit rate. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-05-23NFSD: Move documenting comment for nfsd4_process_open2()Chuck Lever
Clean up nfsd4_open() by converting a large comment at the only call site for nfsd4_process_open2() to a kerneldoc comment in front of that function. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-05-23NFSD: Fix whitespaceChuck Lever
Clean up: Pull case arms back one tab stop to conform to every other switch statement in fs/nfsd/nfs4proc.c. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-05-23NFSD: Remove dprintk call sites from tail of nfsd4_open()Chuck Lever
Clean up: These relics are not likely to benefit server administrators. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-05-23NFSD: Instantiate a struct file when creating a regular NFSv4 fileChuck Lever
There have been reports of races that cause NFSv4 OPEN(CREATE) to return an error even though the requested file was created. NFSv4 does not provide a status code for this case. To mitigate some of these problems, reorganize the NFSv4 OPEN(CREATE) logic to allocate resources before the file is actually created, and open the new file while the parent directory is still locked. Two new APIs are added: + Add an API that works like nfsd_file_acquire() but does not open the underlying file. The OPEN(CREATE) path can use this API when it already has an open file. + Add an API that is kin to dentry_open(). NFSD needs to create a file and grab an open "struct file *" atomically. The alloc_empty_file() has to be done before the inode create. If it fails (for example, because the NFS server has exceeded its max_files limit), we avoid creating the file and can still return an error to the NFS client. BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=382 Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: JianHong Yin <jiyin@redhat.com>
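The ordering described above (reserve the open-file resource first, create the file second) can be illustrated with a small userspace model. This is only a sketch: `model_server`, `model_reserve_file()` and `model_open_create()` are hypothetical names invented for illustration and are not the NFSD or VFS API.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a server with a bounded open-file table. */
struct model_server {
	int files_in_use;	/* open-file slots currently taken */
	int max_files;		/* stand-in for the max_files limit */
	bool file_exists;	/* whether the target file exists on disk */
};

/* Reserve a slot first, analogous to allocating the empty file early. */
static int model_reserve_file(struct model_server *s)
{
	if (s->files_in_use >= s->max_files)
		return -ENFILE;
	s->files_in_use++;
	return 0;
}

/* Create-and-open: nothing is created unless the slot was reserved. */
static int model_open_create(struct model_server *s)
{
	int err = model_reserve_file(s);

	if (err)
		return err;	/* fail before touching the namespace */
	s->file_exists = true;	/* the create step; the slot is already held */
	return 0;
}
```

With this ordering, a resource failure is reported while the file still does not exist, so the client never sees an error for a file that was in fact created.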
2022-05-23fanotify: fix incorrect fmode_t castsVasily Averin
Fixes sparse warnings:

    fs/notify/fanotify/fanotify_user.c:267:63: sparse: warning: restricted fmode_t degrades to integer
    fs/notify/fanotify/fanotify_user.c:1351:28: sparse: warning: restricted fmode_t degrades to integer

FMODE_NONOTIFY has the bitwise fmode_t type and requires the __force attribute for any casts. Signed-off-by: Vasily Averin <vvs@openvz.org> Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/9adfd6ac-1b89-791e-796b-49ada3293985@openvz.org
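As a rough illustration of why the cast needs `__force`, the sketch below declares a restricted ("bitwise") integer type and converts it to a plain integer. The `demo_` names and the flag value are assumptions made for this example; outside of sparse (`__CHECKER__`) the annotations expand to nothing, so the code compiles with an ordinary C compiler.

```c
#include <assert.h>

/* Under sparse these are attributes; for a normal compiler they are empty. */
#ifndef __bitwise
# ifdef __CHECKER__
#  define __bitwise __attribute__((bitwise))
# else
#  define __bitwise
# endif
#endif
#ifndef __force
# ifdef __CHECKER__
#  define __force __attribute__((force))
# else
#  define __force
# endif
#endif

/* Hypothetical restricted type, modelled on a bitwise flag type. */
typedef unsigned int __bitwise demo_fmode_t;

/* Illustrative flag value; building the constant also needs a forced cast. */
#define DEMO_FMODE_FLAG ((__force demo_fmode_t)0x4000000)

/*
 * Converting the restricted type to a plain integer. Without __force,
 * sparse would report "restricted demo_fmode_t degrades to integer".
 */
static unsigned int demo_flags_from_mode(demo_fmode_t mode)
{
	return (__force unsigned int)mode;
}
```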
2022-05-23exfat: check if cluster num is validTadeusz Struk
Syzbot reported slab-out-of-bounds read in exfat_clear_bitmap. This was triggered by a reproducer calling truncate with size 0, which causes the following trace:

    BUG: KASAN: slab-out-of-bounds in exfat_clear_bitmap+0x147/0x490 fs/exfat/balloc.c:174
    Read of size 8 at addr ffff888115aa9508 by task syz-executor251/365

    Call Trace:
     __dump_stack lib/dump_stack.c:77 [inline]
     dump_stack_lvl+0x1e2/0x24b lib/dump_stack.c:118
     print_address_description+0x81/0x3c0 mm/kasan/report.c:233
     __kasan_report mm/kasan/report.c:419 [inline]
     kasan_report+0x1a4/0x1f0 mm/kasan/report.c:436
     __asan_report_load8_noabort+0x14/0x20 mm/kasan/report_generic.c:309
     exfat_clear_bitmap+0x147/0x490 fs/exfat/balloc.c:174
     exfat_free_cluster+0x25a/0x4a0 fs/exfat/fatent.c:181
     __exfat_truncate+0x99e/0xe00 fs/exfat/file.c:217
     exfat_truncate+0x11b/0x4f0 fs/exfat/file.c:243
     exfat_setattr+0xa03/0xd40 fs/exfat/file.c:339
     notify_change+0xb76/0xe10 fs/attr.c:336
     do_truncate+0x1ea/0x2d0 fs/open.c:65

Move the is_valid_cluster() helper from fatent.c to a common header to make it reusable in other *.c files, and use is_valid_cluster() in exfat_clear_bitmap() and exfat_set_bitmap() to validate that the cluster number is within the valid range. Link: https://syzkaller.appspot.com/bug?id=50381fc73821ecae743b8cf24b4c9a04776f767c Reported-by: syzbot+a4087e40b9c13aad7892@syzkaller.appspotmail.com Fixes: 1e49a94cf707 ("exfat: add bitmap operations") Cc: stable@vger.kernel.org # v5.7+ Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
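A minimal userspace sketch of the kind of range check described above. The `demo_` names are hypothetical, and the assumptions that valid clusters start at 2 and that `num_clusters` is an exclusive upper bound are stated here for illustration rather than taken from the commit text.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define DEMO_FIRST_CLUSTER 2u	/* assumption: the first two entries are reserved */

struct demo_sb_info {
	unsigned int num_clusters;	/* assumed exclusive upper bound */
};

static bool demo_is_valid_cluster(const struct demo_sb_info *sbi,
				  unsigned int clus)
{
	return clus >= DEMO_FIRST_CLUSTER && clus < sbi->num_clusters;
}

/* Clear the allocation bit for a cluster, refusing out-of-range numbers. */
static int demo_clear_bitmap(const struct demo_sb_info *sbi,
			     unsigned char *bitmap, unsigned int clus)
{
	unsigned int bit;

	if (!demo_is_valid_cluster(sbi, clus))
		return -EINVAL;	/* never index the bitmap out of bounds */

	bit = clus - DEMO_FIRST_CLUSTER;
	bitmap[bit / 8] &= (unsigned char)~(1u << (bit % 8));
	return 0;
}
```

Checking before computing the bit index matters because an invalid cluster number (for example 0) would otherwise wrap around in the unsigned subtraction and index far outside the bitmap.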
2022-05-23exfat: reduce block requests when zeroing a clusterYuezhang Mo
If 'dirsync' is enabled, zeroing a cluster sector by sector generates many block requests, which prevents the block device from reaching its full performance. This commit submits all the sectors of a cluster at once, which reduces the number of block requests and lets the block device perform at its full potential. Test: create 1000 directories on an SD card with:

    $ time (for ((i=0;i<1000;i++)); do mkdir dir${i}; done)

Performance has been improved by more than 73% on imx6q-sabrelite.

    Cluster size    Before        After        Improvement
    64 KBytes       3m34.036s     0m56.052s    73.8%
    128 KBytes      6m2.644s      1m13.354s    79.8%
    256 KBytes      11m22.202s    1m39.451s    85.4%

    imx6q-sabrelite:
     - CPU: 792 MHz x4
     - Memory: 1GB DDR3
     - SD Card: SanDisk 8GB Class 4

Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Reviewed-by: Andy Wu <Andy.Wu@sony.com> Reviewed-by: Aoyama Wataru <wataru.aoyama@sony.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
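The arithmetic behind these figures can be sketched as follows. The helper names are hypothetical, and the 512-byte sector size used in the usage note is an assumption (the commit message does not state the sector size of the test card).

```c
#include <assert.h>

/* Number of requests when each sector of a cluster is submitted on its own. */
static unsigned int demo_requests_per_cluster(unsigned int cluster_bytes,
					      unsigned int sector_bytes)
{
	return cluster_bytes / sector_bytes;
}

/* Relative improvement in percent, from elapsed times in seconds. */
static double demo_improvement_pct(double before_s, double after_s)
{
	return (before_s - after_s) / before_s * 100.0;
}
```

Under the 512-byte assumption, `demo_requests_per_cluster(64 * 1024, 512)` gives 128 per-sector requests that collapse into a single submission, and `demo_improvement_pct(214.036, 56.052)` reproduces the first table row (3m34.036s before, 0m56.052s after, about 73.8%).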
2022-05-23exfat: introduce mount option 'sys_tz'Chung-Chiang Cheng
The EXFAT_TZ_VALID bit in {create,modify,access}_tz corresponds to the OffsetValid field in the exFAT specification [1]. When this bit isn't set, timestamps should be treated as having the same UTC offset as the current local time. Currently, there is an option 'time_offset' for users to specify the UTC offset for this case. This patch introduces a new mount option 'sys_tz' to use the system timezone as the time offset. Link: [1] https://docs.microsoft.com/en-us/windows/win32/fileio/exfat-specification#74102-offsetvalid-field Signed-off-by: Chung-Chiang Cheng <cccheng@synology.com> Acked-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
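A rough sketch of how a timezone byte with an OffsetValid bit might be interpreted. The `demo_` names are hypothetical. The encoding assumed here (top bit is OffsetValid, low 7 bits are a two's-complement offset in 15-minute units) follows my reading of the exFAT specification, and the fallback parameter stands in for whatever 'time_offset' or 'sys_tz' would supply.

```c
#include <assert.h>

#define DEMO_TZ_VALID 0x80u	/* assumed position of the OffsetValid bit */

/*
 * Return the UTC offset in minutes for a timestamp's timezone byte.
 * If OffsetValid is clear, use the caller-supplied fallback offset.
 */
static int demo_utc_offset_minutes(unsigned char tz, int fallback_minutes)
{
	int off = tz & 0x7F;	/* 7-bit field, 15-minute units */

	if (!(tz & DEMO_TZ_VALID))
		return fallback_minutes;
	if (off <= 0x3F)
		return off * 15;		/* east of UTC */
	return (off - 0x80) * 15;		/* west of UTC (negative) */
}
```

For example, a byte of `0x84` decodes to +60 minutes, `0xFC` to -60 minutes, and `0x04` (valid bit clear) yields whatever fallback the caller passes.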
2022-05-23exfat: fix referencing wrong parent directory information after renamingYuezhang Mo
During renaming, the parent directory information may be updated, but the file/directory still refers to the old parent directory information. This bug causes 2 problems. (1) The renamed file cannot be written.

    [10768.175172] exFAT-fs (sda1): error, failed to bmap (inode : 7afd50e4 iblock : 0, err : -5)
    [10768.184285] exFAT-fs (sda1): Filesystem has been set read-only
    ash: write error: Input/output error

(2) Some dentries of the renamed file/directory are not set to deleted after removing the file/directory. exfat_update_parent_info() is a workaround for the wrong parent directory information being used after renaming. Now that bug is fixed, this is no longer needed, so remove it. Fixes: 5f2aa075070c ("exfat: add inode operations") Cc: stable@vger.kernel.org # v5.7+ Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Reviewed-by: Andy Wu <Andy.Wu@sony.com> Reviewed-by: Aoyama Wataru <wataru.aoyama@sony.com> Reviewed-by: Daniel Palmer <daniel.palmer@sony.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2022-05-23Merge branch 'guilt/xfs-5.19-misc-3' into xfs-5.19-for-nextDave Chinner
2022-05-23xfs: share xattr name and value buffers when logging xattr updatesDarrick J. Wong
While running xfs/297 and generic/642, I noticed a crash in xfs_attri_item_relog when it tries to copy the attr name to the new xattri log item. I think what happened here was that we called ->iop_commit on the old attri item (which nulls out the pointers) as part of a log force at the same time that a chained attr operation was ongoing. The system was busy enough that at some later point, the defer ops operation decided it was necessary to relog the attri log item, but as we've detached the name buffer from the old attri log item, we can't copy it to the new one, and kaboom. I think there's a broader refcounting problem with LARP mode -- the setxattr code can return to userspace before the CIL actually formats and commits the log item, which results in a UAF bug. Therefore, the xattr log item needs to be able to retain a reference to the name and value buffers until the log items have completely cleared the log. Furthermore, each time we create an intent log item, we allocate new memory and (re)copy the contents; sharing here would be very useful. Solve the UAF and the unnecessary memory allocations by having the log code create a single refcounted buffer to contain the name and value contents. This buffer can be passed from old to new during a relog operation, and the logging code can (optionally) attach it to the xfs_attr_item for reuse when LARP mode is enabled. This also fixes a problem where the xfs_attri_log_item objects weren't being freed back to the same cache where they came from. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
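The idea of a single refcounted buffer that holds both name and value, and that is handed from an old log item to a new one, can be sketched in userspace like this. All `demo_` names are hypothetical, and the plain (non-atomic) counter is a simplification of this sketch; it says nothing about how the XFS code does its reference counting.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One allocation holding the name bytes followed by the value bytes. */
struct demo_nv_buf {
	unsigned int refcount;
	size_t name_len;
	size_t value_len;
	char data[];
};

static struct demo_nv_buf *demo_nv_alloc(const void *name, size_t name_len,
					 const void *value, size_t value_len)
{
	struct demo_nv_buf *b = malloc(sizeof(*b) + name_len + value_len);

	if (!b)
		return NULL;
	b->refcount = 1;
	b->name_len = name_len;
	b->value_len = value_len;
	if (name_len)
		memcpy(b->data, name, name_len);
	if (value_len)
		memcpy(b->data + name_len, value, value_len);
	return b;
}

/* Take another reference, e.g. when relogging to a new intent item. */
static struct demo_nv_buf *demo_nv_get(struct demo_nv_buf *b)
{
	b->refcount++;
	return b;
}

/* Drop a reference; returns 1 if this was the last one and the buffer was freed. */
static int demo_nv_put(struct demo_nv_buf *b)
{
	if (--b->refcount)
		return 0;
	free(b);
	return 1;
}
```

Because the new holder takes its own reference before the old one drops theirs, the contents stay valid for as long as any log item still points at the buffer, and no copy is made on relog.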
2022-05-23xfs: do not use logged xattr updates on V4 filesystemsDarrick J. Wong
V4 superblocks do not contain the log_incompat feature bit, which means that we cannot protect xattr log items against kernels that are too old to know how to recover them. Turn off the log items for such filesystems and adjust the "delayed" name to reflect what it's really controlling. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-05-22afs: Adjust ACK interpretation to try and cope with NATDavid Howells
If a client's address changes, say if it is NAT'd, this can disrupt an in progress operation. For most operations, this is not much of a problem, but StoreData can be different as some servers modify the target file as the data comes in, so if a store request is disrupted, the file can get corrupted on the server. The problem is that the server doesn't recognise packets that come after the change of address as belonging to the original client and will bounce them, either by sending an OUT_OF_SEQUENCE ACK to the apparent new call if the packet number falls within the initial sequence number window of a call or by sending an EXCEEDS_WINDOW ACK if it falls outside and then aborting it. In both cases, firstPacket will be 1 and previousPacket will be 0 in the ACK information. Fix this by the following means: (1) If a client call receives an EXCEEDS_WINDOW ACK with firstPacket as 1 and previousPacket as 0, assume this indicates that the server saw the incoming packets from a different peer and thus as a different call. Fail the call with error -ENETRESET. (2) Also fail the call if a similar OUT_OF_SEQUENCE ACK occurs if the first packet has been hard-ACK'd. If it hasn't been hard-ACK'd, the ACK packet will cause it to get retransmitted, so the call will just be repeated. (3) Make afs_select_fileserver() treat -ENETRESET as a straight fail of the operation. (4) Prioritise the error code over things like -ECONNRESET as the server did actually respond. (5) Make writeback treat -ENETRESET as a retryable error and make it redirty all the pages involved in a write so that the VM will retry. Note that there is still a circumstance that I can't easily deal with: if the operation is fully received and processed by the server, but the reply is lost due to address change. There's no way to know if the op happened. We can examine the server, but a conflicting change could have been made by a third party - and we can't tell the difference. 
In such a case, a message like: kAFS: vnode modified {100058:146266} b7->b8 YFS.StoreData64 (op=2646a) will be logged to dmesg on the next op to touch the file and the client will reset the inode state, including invalidating clean parts of the pagecache. Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-afs@lists.infradead.org Link: http://lists.infradead.org/pipermail/linux-afs/2021-December/004811.html # v1 Signed-off-by: David S. Miller <davem@davemloft.net>
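Rules (1) and (2) above amount to a small decision function. The sketch below models it in userspace; the enum, struct and function names are hypothetical and do not correspond to the rxrpc or afs data structures.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum demo_ack_reason {
	DEMO_ACK_OTHER,
	DEMO_ACK_OUT_OF_SEQUENCE,
	DEMO_ACK_EXCEEDS_WINDOW,
};

struct demo_ack {
	enum demo_ack_reason reason;
	unsigned int first_packet;
	unsigned int previous_packet;
};

/*
 * Return -ENETRESET if the ACK suggests the server now sees us as a
 * different peer (and thus a different call); return 0 to carry on.
 */
static int demo_check_ack(const struct demo_ack *ack,
			  bool first_packet_hard_acked)
{
	if (ack->first_packet != 1 || ack->previous_packet != 0)
		return 0;
	if (ack->reason == DEMO_ACK_EXCEEDS_WINDOW)
		return -ENETRESET;			/* rule (1) */
	if (ack->reason == DEMO_ACK_OUT_OF_SEQUENCE &&
	    first_packet_hard_acked)
		return -ENETRESET;			/* rule (2) */
	return 0;	/* not hard-ACK'd: retransmission repeats the call */
}
```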
2022-05-22rxrpc, afs: Fix selection of abort codesDavid Howells
The RX_USER_ABORT code should really only be used to indicate that the user of the rxrpc service (ie. userspace) implicitly caused a call to be aborted - for instance if the AF_RXRPC socket is closed whilst the call was in progress. (The user may also explicitly abort a call and specify the abort code to use). Change some of the points of generation to use other abort codes instead: (1) Abort the call with RXGEN_SS_UNMARSHAL or RXGEN_CC_UNMARSHAL if we see ENOMEM and EFAULT during received data delivery and abort with RX_CALL_DEAD in the default case. (2) Abort with RXGEN_SS_MARSHAL if we get ENOMEM whilst trying to send a reply. (3) Abort with RX_CALL_DEAD if we stop hearing from the peer if we had heard from the peer and abort with RX_CALL_TIMEOUT if we hadn't. (4) Abort with RX_CALL_DEAD if we try to disconnect a call that's not completed successfully or been aborted. Reported-by: Jeffrey Altman <jaltman@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: David S. Miller <davem@davemloft.net>
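Points (1) and (3) can be read as a mapping from failure condition to abort code. The sketch below uses symbolic enum values only (no wire values), and the choice of the "SS" variant for service-side calls and the "CC" variant for client-side calls is an assumption of this example, not something the commit message states.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Symbolic stand-ins; these are not the on-the-wire abort code values. */
enum demo_abort {
	DEMO_RX_CALL_DEAD,
	DEMO_RX_CALL_TIMEOUT,
	DEMO_RXGEN_SS_UNMARSHAL,
	DEMO_RXGEN_CC_UNMARSHAL,
};

/* Point (1): choose an abort code for an error during data delivery. */
static enum demo_abort demo_delivery_abort(int err, bool is_service_call)
{
	switch (err) {
	case -ENOMEM:
	case -EFAULT:
		/* assumption: SS for service-side calls, CC for client-side */
		return is_service_call ? DEMO_RXGEN_SS_UNMARSHAL
				       : DEMO_RXGEN_CC_UNMARSHAL;
	default:
		return DEMO_RX_CALL_DEAD;
	}
}

/* Point (3): choose an abort code when the peer goes quiet. */
static enum demo_abort demo_silence_abort(bool heard_from_peer)
{
	return heard_from_peer ? DEMO_RX_CALL_DEAD : DEMO_RX_CALL_TIMEOUT;
}
```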
2022-05-22rxrpc: Fix locking issueDavid Howells
There's a locking issue with the per-netns list of calls in rxrpc. The pieces of code that add and remove a call from the list use write_lock() and the calls procfile uses read_lock() to access it. However, the timer callback function may trigger a removal by trying to queue a call for processing and finding that it's already queued - at which point it has a spare refcount that it has to do something with. If it puts the call and this reduces the refcount to 0, the call will be removed from the list. Unfortunately, since the _bh variants of the locking functions aren't used, this can deadlock.

    ================================
    WARNING: inconsistent lock state
    5.18.0-rc3-build4+ #10 Not tainted
    --------------------------------
    inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
    ksoftirqd/2/25 [HC0[0]:SC1[1]:HE1:SE0] takes:
    ffff888107ac4038 (&rxnet->call_lock){+.?.}-{2:2}, at: rxrpc_put_call+0x103/0x14b
    {SOFTIRQ-ON-W} state was registered at:
    ...
     Possible unsafe locking scenario:
           CPU0
           ----
      lock(&rxnet->call_lock);
      <Interrupt>
        lock(&rxnet->call_lock);

     *** DEADLOCK ***

    1 lock held by ksoftirqd/2/25:
     #0: ffff8881008ffdb0 ((&call->timer)){+.-.}-{0:0}, at: call_timer_fn+0x5/0x23d

Changes
=======
ver #2) - Changed to using list_next_rcu() rather than rcu_dereference() directly.

Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both") Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-22afs: Fix afs_getattr() to refetch file status if callback break occurredDavid Howells
If a callback break occurs (change notification), afs_getattr() needs to issue an FS.FetchStatus RPC operation to update the status of the file being examined by the stat-family of system calls. Fix afs_getattr() to do this if AFS_VNODE_CB_PROMISED has been cleared on a vnode by a callback break. Skip this if AT_STATX_DONT_SYNC is set. This can be tested by appending to a file on one AFS client and then using "stat -L" to examine its length on a machine running kafs. This can also be watched through tracing on the kafs machine. The callback break is seen: kworker/1:1-46 [001] ..... 978.910812: afs_cb_call: c=0000005f YFSCB.CallBack kworker/1:1-46 [001] ...1. 978.910829: afs_cb_break: 100058:23b4c:242d2c2 b=2 s=1 break-cb kworker/1:1-46 [001] ..... 978.911062: afs_call_done: c=0000005f ret=0 ab=0 [0000000082994ead] And then the stat command generated no traffic if unpatched, but with this change a call to fetch the status can be observed: stat-4471 [000] ..... 986.744122: afs_make_fs_call: c=000000ab 100058:023b4c:242d2c2 YFS.FetchStatus stat-4471 [000] ..... 986.745578: afs_call_done: c=000000ab ret=0 ab=0 [0000000087fc8c84] Fixes: 08e0e7c82eea ("[AF_RXRPC]: Make the in-kernel AFS filesystem use AF_RXRPC.") Reported-by: Markus Suvanto <markus.suvanto@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Tested-by: Markus Suvanto <markus.suvanto@gmail.com> Tested-by: kafs-testing+fedora34_64checkkafs-build-496@auristor.com Link: https://bugzilla.kernel.org/show_bug.cgi?id=216010 Link: https://lore.kernel.org/r/165308359800.162686.14122417881564420962.stgit@warthog.procyon.org.uk/ # v1 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
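The condition described above can be modelled as a small predicate. The names are hypothetical; `DEMO_AT_STATX_DONT_SYNC` is defined locally so the sketch is self-contained, with the value I believe the Linux UAPI headers use for `AT_STATX_DONT_SYNC`.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-in for AT_STATX_DONT_SYNC (assumed value 0x4000). */
#define DEMO_AT_STATX_DONT_SYNC 0x4000u

/*
 * Decide whether getattr should fetch fresh status from the server.
 * cb_promised models the "callback promise still held" flag on the vnode.
 */
static bool demo_need_fetch_status(bool cb_promised, unsigned int query_flags)
{
	if (query_flags & DEMO_AT_STATX_DONT_SYNC)
		return false;	/* caller asked for cached attributes only */
	return !cb_promised;	/* promise broken: cached status may be stale */
}
```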
2022-05-22xfs: Remove duplicate includeJiapeng Chong
Clean up the following includecheck warning: ./fs/xfs/xfs_attr_item.c: xfs_inode.h is included more than once. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-05-22xfs: reduce IOCB_NOWAIT judgment for retry exclusive unaligned DIOKaixu Xia
Retry unaligned DIO with exclusive blocking semantics only when the IOCB_NOWAIT flag is not set. If we are doing nonblocking user I/O, propagate the error directly. Signed-off-by: Kaixu Xia <kaixuxia@tencent.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
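A small sketch of the retry decision. The names and the flag value are hypothetical, and the use of `-EAGAIN` as the "needs exclusive access" signal is an assumption of this example; the commit message only says that the retry must not happen when IOCB_NOWAIT is set.

```c
#include <assert.h>
#include <errno.h>

#define DEMO_IOCB_NOWAIT 0x1u	/* hypothetical flag value */

enum demo_action {
	DEMO_PROPAGATE,		/* return the result to the caller as-is */
	DEMO_RETRY_EXCLUSIVE,	/* retry with exclusive, blocking semantics */
};

/* Decide what to do after the first (shared) attempt at unaligned DIO. */
static enum demo_action demo_after_unaligned_dio(int ret,
						 unsigned int iocb_flags)
{
	if (ret == -EAGAIN && !(iocb_flags & DEMO_IOCB_NOWAIT))
		return DEMO_RETRY_EXCLUSIVE;
	return DEMO_PROPAGATE;	/* nonblocking I/O: hand the error back */
}
```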
2022-05-22xfs: Remove dead codeJiapeng Chong
Remove the entire xlog_recover_check_summary() function; it is dead code and has been for 12 years. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-05-22xfs: fix typo in commentJulia Lawall
Spelling mistake (triple letters) in comment. Detected with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-05-22xfs: rename struct xfs_attr_item to xfs_attr_intentDarrick J. Wong
Everywhere else in XFS, structures that capture the state of an ongoing deferred work item all have names that end with "_intent". The new extended attribute deferred work items are not named as such, so fix it to follow the naming convention used elsewhere. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-05-22xfs: clean up state variable usage in xfs_attr_node_remove_attrDarrick J. Wong
The state variable is now a local variable pointing to a heap allocation, so we don't need to zero-initialize it, nor do we need the conditional to decide if we should free it. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-05-22xfs: put attr[id] log item cache init with the othersDarrick J. Wong
Initialize and destroy the xattr log item caches in the same places that we do all the other log item caches. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-05-22xfs: remove struct xfs_attr_item.xattri_flagsDarrick J. Wong
Nobody uses this field, so get rid of it and the unused flag definition. Rearrange the structure layout to reduce its size from 104 to 96 bytes. This gets us from 39 to 42 objects per page. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-05-22xfs: use a separate slab cache for deferred xattr work stateDarrick J. Wong
Create a separate slab cache for struct xfs_attr_item objects, since we can pack the (104-byte) intent items more tightly than we can with the general slab cache objects. On x86, this means 39 intents per memory page instead of 32. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
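The per-page counts quoted in these slab cache commits follow from integer division of the page size by the object size. The sketch below assumes a 4096-byte page, assumes the general-purpose cache rounds a 104-byte object up to a 128-byte bucket, and ignores any slab metadata or alignment padding; `demo_objs_per_page()` is a hypothetical helper.

```c
#include <assert.h>

#define DEMO_PAGE_SIZE 4096u	/* assumption: 4 KiB pages, as on x86 */

/* Whole objects of the given size that fit in one page. */
static unsigned int demo_objs_per_page(unsigned int obj_size)
{
	return DEMO_PAGE_SIZE / obj_size;
}
```

Under those assumptions, `demo_objs_per_page(128)` gives 32, `demo_objs_per_page(104)` gives 39, and a 96-byte object gives 42.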
2022-05-22xfs: put the xattr intent item op flags in their own namespaceDarrick J. Wong
The flags that are stored in the extended attr intent log item really should have a separate namespace from the rest of the XFS_ATTR_* flags. Give them one to make it a little more obvious that they're intent item flags. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-05-22xfs: clean up xfs_attr_node_hasnameDarrick J. Wong
The calling conventions of this function are a mess -- callers /can/ provide a pointer to a pointer to a state structure, but it's not required, and as evidenced by the last two patches, the callers that do weren't careful enough about how to deal with an existing da state. Push the allocation and freeing responsibility to the callers, which means that callers from the xattr node state machine steps now have the visibility to allocate or free the da state structure as they please. As a bonus, the node remove/add paths for larp-mode replaces can reset the da state structure instead of freeing and immediately reallocating it. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-05-22smb3: add trace point for oplock not foundSteve French
In order to debug problems with server potentially sending us an oplock that we don't recognize (or a race with close and oplock break) it would be helpful to have a dynamic trace point for this case. New tracepoint is called trace_smb3_oplock_not_found Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-22cifs: return the more nuanced writeback error on close()ChenXiaoSong
As filemap_check_errors() only reports -EIO or -ENOSPC, we return the more nuanced writeback error -(file->f_mapping->wb_err & MAX_ERRNO).

    filemap_write_and_wait
      filemap_write_and_wait_range
        filemap_check_errors
          -ENOSPC or -EIO
    filemap_check_wb_err
      errseq_check
        return -(file->f_mapping->wb_err & MAX_ERRNO)

Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com> Signed-off-by: Steve French <stfrench@microsoft.com>
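The expression -(wb_err & MAX_ERRNO) extracts the errno stored in the low bits of an error-sequence value. A userspace sketch, assuming MAX_ERRNO is 4095 and that the bits above the errno field hold a flag and a counter; the `demo_` names are hypothetical.

```c
#include <assert.h>
#include <errno.h>

#define DEMO_MAX_ERRNO 4095u	/* assumed value of MAX_ERRNO */

typedef unsigned int demo_errseq_t;

/* Recover the negative errno recorded in an error-sequence value. */
static int demo_wb_err_to_errno(demo_errseq_t wb_err)
{
	return -(int)(wb_err & DEMO_MAX_ERRNO);
}
```

For instance, a value whose low bits hold EDQUOT and whose upper bits hold some counter still decodes to -EDQUOT, an error that a check limited to -EIO and -ENOSPC cannot report.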
2022-05-21smb3: add trace point for lease not found issueSteve French
When trying to debug problems with server sending us a lease we don't recognize, it would be helpful to have a dynamic trace point for this case. New tracepoint is called trace_smb3_lease_not_found Acked-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-21cifs: smbd: fix typo in commentJulia Lawall
Spelling mistake (triple letters) in comment. Detected with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-21ext4: fix bug_on in ext4_writepagesYe Bin
We got the following issue:

    EXT4-fs error (device loop0): ext4_mb_generate_buddy:1141: group 0, block bitmap and bg descriptor inconsistent: 25 vs 31513 free cls
    ------------[ cut here ]------------
    kernel BUG at fs/ext4/inode.c:2708!
    invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
    CPU: 2 PID: 2147 Comm: rep Not tainted 5.18.0-rc2-next-20220413+ #155
    RIP: 0010:ext4_writepages+0x1977/0x1c10
    RSP: 0018:ffff88811d3e7880 EFLAGS: 00010246
    RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff88811c098000
    RDX: 0000000000000000 RSI: ffff88811c098000 RDI: 0000000000000002
    RBP: ffff888128140f50 R08: ffffffffb1ff6387 R09: 0000000000000000
    R10: 0000000000000007 R11: ffffed10250281ea R12: 0000000000000001
    R13: 00000000000000a4 R14: ffff88811d3e7bb8 R15: ffff888128141028
    FS: 00007f443aed9740(0000) GS:ffff8883aef00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000020007200 CR3: 000000011c2a4000 CR4: 00000000000006e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     <TASK>
     do_writepages+0x130/0x3a0
     filemap_fdatawrite_wbc+0x83/0xa0
     filemap_flush+0xab/0xe0
     ext4_alloc_da_blocks+0x51/0x120
     __ext4_ioctl+0x1534/0x3210
     __x64_sys_ioctl+0x12c/0x170
     do_syscall_64+0x3b/0x90

It may happen as follows:

1. write inline_data inode

    vfs_write
    new_sync_write
    ext4_file_write_iter
    ext4_buffered_write_iter
    generic_perform_write
    ext4_da_write_begin
    ext4_da_write_inline_data_begin -> if the inline data size is too small, a block is allocated for the write, so the mapping will have a dirty page
    ext4_da_convert_inline_data_to_extent -> clear EXT4_STATE_MAY_INLINE_DATA

2. fallocate

    do_vfs_ioctl
    ioctl_preallocate
    vfs_fallocate
    ext4_fallocate
    ext4_convert_inline_data
    ext4_convert_inline_data_nolock
    ext4_map_blocks -> on failure, goto restore data
    ext4_restore_inline_data
    ext4_create_inline_data
    ext4_write_inline_data
    ext4_set_inode_state -> set inode EXT4_STATE_MAY_INLINE_DATA

3. writepages

    __ext4_ioctl
    ext4_alloc_da_blocks
    filemap_flush
    filemap_fdatawrite_wbc
    do_writepages
    ext4_writepages
    if (ext4_has_inline_data(inode))
        BUG_ON(ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA))

The root cause of this issue is that, under delayed allocation mode, we do not destroy the inline data until ext4_writepages is called, but the inode may already have been converted from inline to extent. To solve this issue, we call filemap_flush first. Cc: stable@kernel.org Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220516122634.1690462-1-yebin10@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-05-21ext4: refactor and move ext4_ioctl_get_encryption_pwsalt()Ritesh Harjani
This patch moves the code for the FS_IOC_GET_ENCRYPTION_PWSALT case into ext4's crypto.c file, i.e. ext4_ioctl_get_encryption_pwsalt() and uuid_is_zero(). This is mostly refactoring and should not cause any functional change. Suggested-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Ritesh Harjani <ritesh.list@gmail.com> Link: https://lore.kernel.org/r/5af98b17152a96b245b4f7d2dfb8607fc93e36aa.1652595565.git.ritesh.list@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-05-21ext4: cleanup function defs from ext4.h into crypto.cRitesh Harjani
Some of these functions when CONFIG_FS_ENCRYPTION is enabled are not really inline (let compiler be the best judge of it). Remove inline and move them into crypto.c where they should be present. Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Ritesh Harjani <ritesh.list@gmail.com> Link: https://lore.kernel.org/r/b7b9de2c7226298663fb5a0c28909135e2ab220f.1652595565.git.ritesh.list@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-05-21ext4: move ext4 crypto code to its own file crypto.cRitesh Harjani
This cleans up the super.c file, which has grown quite large. Start moving ext4 crypto related code to where it should have been in the first place, i.e. fs/ext4/crypto.c. Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Ritesh Harjani <ritesh.list@gmail.com> Link: https://lore.kernel.org/r/7d637e093cbc34d727397e8d41a53a1b9ca7d7a4.1652595565.git.ritesh.list@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-05-21ksmbd: fix outstanding credits related bugsHyunchul Lee
Outstanding credits must be initialized to 0, because they represent the sum of credits consumed by in-flight requests. Outstanding credits must also be compared with total credits in smb2_validate_credit_charge(), because total credits are the sum of credits granted by ksmbd. This patch fixes the following error, seen while running frametest with Windows clients:

    Limits exceeding the maximum allowable outstanding requests, given : 128, pending : 8065

Fixes: b589f5db6d4a ("ksmbd: limits exceeding the maximum allowable outstanding requests") Cc: stable@vger.kernel.org Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com> Reported-by: Yufan Chen <wiz.chen@gmail.com> Tested-by: Yufan Chen <wiz.chen@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
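The accounting described above can be sketched as follows: outstanding credits start at zero, grow when a request is admitted, shrink when it completes, and are compared against the total granted. The struct and function names are hypothetical and this is not the ksmbd implementation.

```c
#include <assert.h>
#include <errno.h>

struct demo_conn {
	unsigned int total_credits;		/* credits granted to the client */
	unsigned int outstanding_credits;	/* consumed by in-flight requests */
};

/* Admit a request only if its charge fits within the granted credits. */
static int demo_validate_credit_charge(struct demo_conn *c,
				       unsigned int charge)
{
	if (charge + c->outstanding_credits > c->total_credits)
		return -EINVAL;
	c->outstanding_credits += charge;
	return 0;
}

/* Give the credits back when the request completes. */
static void demo_request_done(struct demo_conn *c, unsigned int charge)
{
	c->outstanding_credits -= charge;
}
```

If the outstanding counter were left uninitialized or compared against the wrong quantity, a check like this could reject valid requests, which would be consistent with the "pending : 8065" message quoted above.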
2022-05-21ksmbd: smbd: fix connection dropped issueHyunchul Lee
When there are bursty connection requests, RDMA connection event handler is deferred and Negotiation requests are received even if connection status is NEW. To handle it, set the status to CONNECTED if Negotiation requests are received. Reported-by: Yufan Chen <wiz.chen@gmail.com> Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com> Tested-by: Yufan Chen <wiz.chen@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-21ksmbd: Fix some kernel-doc commentsYang Li
Remove some warnings found by running scripts/kernel-doc, which is caused by using 'make W=1'.

    fs/ksmbd/misc.c:30: warning: Function parameter or member 'str' not described in 'match_pattern'
    fs/ksmbd/misc.c:30: warning: Excess function parameter 'string' description in 'match_pattern'
    fs/ksmbd/misc.c:163: warning: Function parameter or member 'share' not described in 'convert_to_nt_pathname'
    fs/ksmbd/misc.c:163: warning: Function parameter or member 'path' not described in 'convert_to_nt_pathname'
    fs/ksmbd/misc.c:163: warning: Excess function parameter 'filename' description in 'convert_to_nt_pathname'
    fs/ksmbd/misc.c:163: warning: Excess function parameter 'sharepath' description in 'convert_to_nt_pathname'
    fs/ksmbd/misc.c:259: warning: Function parameter or member 'share' not described in 'convert_to_unix_name'
    fs/ksmbd/misc.c:259: warning: Function parameter or member 'name' not described in 'convert_to_unix_name'
    fs/ksmbd/misc.c:259: warning: Excess function parameter 'path' description in 'convert_to_unix_name'
    fs/ksmbd/misc.c:259: warning: Excess function parameter 'tid' description in 'convert_to_unix_name'

Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-21ksmbd: fix wrong smbd max read/write size checkNamjae Jeon
The smb-direct max read/write size can differ from the smb2 max read/write size, so smb2_read() can return an error because of the wrong max read/write size check. This patch uses smb_direct_max_read_write_size for this check in smb-direct read/write(). Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Reviewed-by: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-21ksmbd: add smbd max io size parameterNamjae Jeon
Add 'smbd max io size' parameter to adjust smbd-direct max read/write size. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Reviewed-by: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-21ksmbd: handle smb2 query dir request for OutputBufferLength that is too smallNamjae Jeon
We found an issue where ksmbd returns a STATUS_NO_MORE_FILES response even though there are still dentries that need to be read, while running a file read/write test with the frametest utility. The Windows client sends an smb2 query dir request with an OutputBufferLength (128) that is too small to contain even one entry. This patch makes ksmbd immediately return a response with OutputBufferLength set to zero to the client. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Reviewed-by: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-21ksmbd: smbd: handle multiple Buffer descriptorsHyunchul Lee
Make ksmbd handle multiple buffer descriptors when reading and writing files using SMB direct: Post the work requests of rdma_rw_ctx for RDMA read/write in smb_direct_rdma_xmit(), and the work request for the READ/WRITE response with a remote invalidation in smb_direct_writev(). Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-21ksmbd: smbd: change the return value of get_sg_listHyunchul Lee
Make get_sg_list return EINVAL if there aren't mapped scatterlists. Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-21ksmbd: smbd: simplify tracking pending packetsHyunchul Lee
Because we don't have to track pending packets separately for packets with payload and packets without payload, merge the tracking code. Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-21ksmbd: smbd: introduce read/write credits for RDMA read/writeHyunchul Lee
To read and write a file, an SMB2_READ/SMB2_WRITE request has to be granted a number of rw credits equal to the number of pages the request wants to transfer divided by the maximum number of pages that can be registered with one MR. Also allocate enough RDMA resources for the maximum number of rw credits allowed by ksmbd. Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
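The credit count described above is a division of pages by pages-per-MR. The sketch below rounds the quotient up, which is an assumption of this example (a partial MR still needs a whole credit); `demo_rw_credits()` and the numbers in the usage note are hypothetical.

```c
#include <assert.h>

/*
 * Credits needed to transfer `pages` pages when one memory region (MR)
 * can register at most `pages_per_mr` pages. Rounds up.
 */
static unsigned int demo_rw_credits(unsigned int pages,
				    unsigned int pages_per_mr)
{
	return (pages + pages_per_mr - 1) / pages_per_mr;
}
```

For example, with 256 pages per MR, a 256-page transfer needs one credit and a 257-page transfer needs two.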
2022-05-21ksmbd: smbd: change prototypes of RDMA read/write related functionsHyunchul Lee
Change the prototypes of RDMA read/write operations to accept a pointer and length of buffer descriptors. Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-21cifs: set the CREATE_NOT_FILE when opening the directory in use_cached_dir()Ronnie Sahlberg
This enforces that we can only do this for directories and not normal files, or else the server will return an error. This means that we will have to conditionally check whether the path refers to a directory or not in all the call sites where we are unsure. Right now this check is for "" i.e. root. Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Reviewed-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-21cifs: check for smb1 in open_cached_dir()Ronnie Sahlberg
Check protocol version in open_cached_dir() and return not supported for SMB1. This allows us to call open_cached_dir() from code that is common to both smb1 and smb2/3 in future patches without having to do this check in the call-site. At the same time, add a check if tcon is valid or not for the same reason. Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Reviewed-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-05-21cifs: move definition of cifs_fattr earlier in cifsglob.hRonnie Sahlberg
This only moves these definitions to come earlier in the file but does not change the definitions themselves. This is done to reduce the number of changes in future patches. Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Reviewed-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>