summaryrefslogtreecommitdiff
path: root/fs/smb/client/file.c
AgeCommit message (Collapse)Author
2025-02-11cifs: pick channels for individual subrequestsShyam Prasad N
The netfs library could break down a read request into multiple subrequests. When multichannel is used, there is potential to improve performance when each of these subrequests pick a different channel. Today we call cifs_pick_channel when the main read request is initialized in cifs_init_request. This change moves this to cifs_prepare_read, which is the right place to pick channel since it gets called for each subrequest. Interestingly cifs_prepare_write already does channel selection for individual subreq, but looks like it was missed for read. This is especially important when multichannel is used with increased rasize. In my test setup, with rasize set to 8MB, a sequential read of large file was taking 11.5s without this change. With the change, it completed in 9s. The difference is even more signigicant with bigger rasize. Cc: <stable@vger.kernel.org> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-01-20Merge tag 'vfs-6.14-rc1.netfs' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs netfs updates from Christian Brauner: "This contains read performance improvements and support for monolithic single-blob objects that have to be read/written as such (e.g. AFS directory contents). The implementation of the two parts is interwoven as each makes the other possible. - Read performance improvements The read performance improvements are intended to speed up some loss of performance detected in cifs and to a lesser extend in afs. The problem is that we queue too many work items during the collection of read results: each individual subrequest is collected by its own work item, and then they have to interact with each other when a series of subrequests don't exactly align with the pattern of folios that are being read by the overall request. Whilst the processing of the pages covered by individual subrequests as they complete potentially allows folios to be woken in parallel and with minimum delay, it can shuffle wakeups for sequential reads out of order - and that is the most common I/O pattern. The final assessment and cleanup of an operation is then held up until the last I/O completes - and for a synchronous sequential operation, this means the bouncing around of work items just adds latency. Two changes have been made to make this work: (1) All collection is now done in a single "work item" that works progressively through the subrequests as they complete (and also dispatches retries as necessary). (2) For readahead and AIO, this work item be done on a workqueue and can run in parallel with the ultimate consumer of the data; for synchronous direct or unbuffered reads, the collection is run in the application thread and not offloaded. Functions such as smb2_readv_callback() then just tell netfslib that the subrequest has terminated; netfslib does a minimal bit of processing on the spot - stat counting and tracing mostly - and then queues/wakes up the worker. This simplifies the logic as the collector just walks sequentially through the subrequests as they complete and walks through the folios, if buffered, unlocking them as it goes. It also keeps to a minimum the amount of latency injected into the filesystem's low-level I/O handling The way netfs supports filesystems using the deprecated PG_private_2 flag is changed: folios are flagged and added to a write request as they complete and that takes care of scheduling the writes to the cache. The originating read request can then just unlock the pages whatever happens. - Single-blob object support Single-blob objects are files for which the content of the file must be read from or written to the server in a single operation because reading them in parts may yield inconsistent results. AFS directories are an example of this as there exists the possibility that the contents are generated on the fly and would differ between reads or might change due to third party interference. Such objects will be written to and retrieved from the cache if one is present, though we allow/may need to propose multiple subrequests to do so. The important part is that read from/write to the *server* is monolithic. Single blob reading is, for the moment, fully synchronous and does result collection in the application thread and, also for the moment, the API is supplied the buffer in the form of a folio_queue chain rather than using the pagecache. - Related afs changes This series makes a number of changes to the kafs filesystem, primarily in the area of directory handling: - AFS's FetchData RPC reply processing is made partially asynchronous which allows the netfs_io_request's outstanding operation counter to be removed as part of reducing the collection to a single work item. - Directory and symlink reading are plumbed through netfslib using the single-blob object API and are now cacheable with fscache. This also allows the afs_read struct to be eliminated and netfs_io_subrequest to be used directly instead. - Directory and symlink content are now stored in a folio_queue buffer rather than in the pagecache. This means we don't require the RCU read lock and xarray iteration to access it, and folios won't randomly disappear under us because the VM wants them back. - The vnode operation lock is changed from a mutex struct to a private lock implementation. The problem is that the lock now needs to be dropped in a separate thread and mutexes don't permit that. - When a new directory or symlink is created, we now initialise it locally and mark it valid rather than downloading it (we know what it's likely to look like). - We now use the in-directory hashtable to reduce the number of entries we need to scan when doing a lookup. The edit routines have to maintain the hash chains. - Cancellation (e.g. by signal) of an async call after the rxrpc_call has been set up is now offloaded to the worker thread as there will be a notification from rxrpc upon completion. This avoids a double cleanup. - A "rolling buffer" implementation is created to abstract out the two separate folio_queue chaining implementations I had (one for read and one for write). - Functions are provided to create/extend a buffer in a folio_queue chain and tear it down again. This is used to handle AFS directories, but could also be used to create bounce buffers for content crypto and transport crypto. - The was_async argument is dropped from netfs_read_subreq_terminated() Instead we wake the read collection work item by either queuing it or waking up the app thread. - We don't need to use BH-excluding locks when communicating between the issuing thread and the collection thread as neither of them now run in BH context. - Also included are a number of new tracepoints; a split of the netfslib write collection code to put retrying into its own file (it gets more complicated with content encryption). - There are also some minor fixes AFS included, including fixing the AFS directory format struct layout, reducing some directory over-invalidation and making afs_mkdir() translate EEXIST to ENOTEMPY (which is not available on all systems the servers support). - Finally, there's a patch to try and detect entry into the folio unlock function with no folio_queue structs in the buffer (which isn't allowed in the cases that can get there). This is a debugging patch, but should be minimal overhead" * tag 'vfs-6.14-rc1.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (31 commits) netfs: Report on NULL folioq in netfs_writeback_unlock_folios() afs: Add a tracepoint for afs_read_receive() afs: Locally initialise the contents of a new symlink on creation afs: Use the contained hashtable to search a directory afs: Make afs_mkdir() locally initialise a new directory's content netfs: Change the read result collector to only use one work item afs: Make {Y,}FS.FetchData an asynchronous operation afs: Fix cleanup of immediately failed async calls afs: Eliminate afs_read afs: Use netfslib for symlinks, allowing them to be cached afs: Use netfslib for directories afs: Make afs_init_request() get a key if not given a file netfs: Add support for caching single monolithic objects such as AFS dirs netfs: Add functions to build/clean a buffer in a folio_queue afs: Add more tracepoints to do with tracking validity cachefiles: Add auxiliary data trace cachefiles: Add some subrequest tracepoints netfs: Remove some extraneous directory invalidations afs: Fix directory format encoding struct afs: Fix EEXIST error returned from afs_rmdir() to be ENOTEMPTY ...
2024-12-23smb: enable reuse of deferred file handles for write operationsBharath SM
Previously, deferred file handles were reused only for read operations, this commit extends to reusing deferred handles for write operations. By reusing these handles we can reduce the need for open/close operations over the wire. Signed-off-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-12-20netfs: Drop the was_async arg from netfs_read_subreq_terminated()David Howells
Drop the was_async argument from netfs_read_subreq_terminated(). Almost every caller is either in process context and passes false. Some filesystems delegate the call to a workqueue to avoid doing the work in their network message queue parsing thread. The only exception is netfs_cache_read_terminated() which handles completion in the cache - which is usually a callback from the backing filesystem in softirq context, though it can be from process context if an error occurred. In this case, delegate to a workqueue. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/CAHk-=wiVC5Cgyz6QKXFu6fTaA6h4CjexDR-OV9kL6Vo5x9v8=A@mail.gmail.com/ Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20241216204124.3752367-10-dhowells@redhat.com cc: Jeff Layton <jlayton@kernel.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-12-20netfs: Drop the error arg from netfs_read_subreq_terminated()David Howells
Drop the error argument from netfs_read_subreq_terminated() in favour of passing the value in subreq->error. Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20241216204124.3752367-9-dhowells@redhat.com cc: Jeff Layton <jlayton@kernel.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-10-02smb: client: Correct typos in multiple comments across various filesShen Lichuan
Fixed some confusing typos that were currently identified witch codespell, the details are as follows: -in the code comments: fs/smb/client/cifsacl.h:58: inheritence ==> inheritance fs/smb/client/cifsencrypt.c:242: origiginal ==> original fs/smb/client/cifsfs.c:164: referece ==> reference fs/smb/client/cifsfs.c:292: ned ==> need fs/smb/client/cifsglob.h:779: initital ==> initial fs/smb/client/cifspdu.h:784: altetnative ==> alternative fs/smb/client/cifspdu.h:2409: conrol ==> control fs/smb/client/cifssmb.c:1218: Expirement ==> Experiment fs/smb/client/cifssmb.c:3021: conver ==> convert fs/smb/client/cifssmb.c:3998: asterik ==> asterisk fs/smb/client/file.c:2505: useable ==> usable fs/smb/client/fs_context.h:263: timemout ==> timeout fs/smb/client/misc.c:257: responsbility ==> responsibility fs/smb/client/netmisc.c:1006: divisable ==> divisible fs/smb/client/readdir.c:556: endianess ==> endianness fs/smb/client/readdir.c:818: bu ==> by fs/smb/client/smb2ops.c:2180: snaphots ==> snapshots fs/smb/client/smb2ops.c:3586: otions ==> options fs/smb/client/smb2pdu.c:2979: timestaps ==> timestamps fs/smb/client/smb2pdu.c:4574: memmory ==> memory fs/smb/client/smb2transport.c:699: origiginal ==> original fs/smb/client/smbdirect.c:222: happenes ==> happens fs/smb/client/smbdirect.c:1347: registartions ==> registrations fs/smb/client/smbdirect.h:114: accoutning ==> accounting Signed-off-by: Shen Lichuan <shenlichuan@vivo.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-09-19Merge tag 'v6.12-rc-smb3-client-fixes-part1' of ↵Linus Torvalds
git://git.samba.org/sfrench/cifs-2.6 Pull smb client updates from Steve French: - cleanups (moving duplicated code, removing unused code etc) - fixes relating to "sfu" mount options (for better handling special file types) - SMB3.1.1 compression fixes/improvements * tag 'v6.12-rc-smb3-client-fixes-part1' of git://git.samba.org/sfrench/cifs-2.6: (24 commits) smb: client: fix compression heuristic functions cifs: Update SFU comments about fifos and sockets cifs: Add support for creating SFU symlinks smb: use LIST_HEAD() to simplify code cifs: Recognize SFU socket type cifs: Show debug message when SFU Fifo type was detected cifs: Put explicit zero byte into SFU block/char types cifs: Add support for reading SFU symlink location cifs: Fix recognizing SFU symlinks smb: client: compress: fix an "illegal accesses" issue smb: client: compress: fix a potential issue of freeing an invalid pointer smb: client: compress: LZ77 code improvements cleanup smb: client: insert compression check/call on write requests smb3: mark compression as CONFIG_EXPERIMENTAL and fix missing compression operation cifs: Remove obsoleted declaration for cifs_dir_open smb: client: Use min() macro cifs: convert to use ERR_CAST() smb: add comment to STATUS_MCA_OCCURED smb: move SMB2 Status code to common header file smb: move some duplicate definitions to common/smbacl.h ...
2024-09-15smb: use LIST_HEAD() to simplify codeHongbo Li
list_head can be initialized automatically with LIST_HEAD() instead of calling INIT_LIST_HEAD(). No functional impact. Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-09-12netfs: Speed up buffered readingDavid Howells
Improve the efficiency of buffered reads in a number of ways: (1) Overhaul the algorithm in general so that it's a lot more compact and split the read submission code between buffered and unbuffered versions. The unbuffered version can be vastly simplified. (2) Read-result collection is handed off to a work queue rather than being done in the I/O thread. Multiple subrequests can be processes simultaneously. (3) When a subrequest is collected, any folios it fully spans are collected and "spare" data on either side is donated to either the previous or the next subrequest in the sequence. Notes: (*) Readahead expansion is massively slows down fio, presumably because it causes a load of extra allocations, both folio and xarray, up front before RPC requests can be transmitted. (*) RDMA with cifs does appear to work, both with SIW and RXE. (*) PG_private_2-based reading and copy-to-cache is split out into its own file and altered to use folio_queue. Note that the copy to the cache now creates a new write transaction against the cache and adds the folios to be copied into it. This allows it to use part of the writeback I/O code. Signed-off-by: David Howells <dhowells@redhat.com> cc: Jeff Layton <jlayton@kernel.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20240814203850.2240469-20-dhowells@redhat.com/ # v2 Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-09-05netfs: Move max_len/max_nr_segs from netfs_io_subrequest to netfs_io_streamDavid Howells
Move max_len/max_nr_segs from struct netfs_io_subrequest to struct netfs_io_stream as we only issue one subreq at a time and then don't need these values again for that subreq unless and until we have to retry it - in which case we want to renegotiate them. Signed-off-by: David Howells <dhowells@redhat.com> cc: Jeff Layton <jlayton@kernel.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20240814203850.2240469-8-dhowells@redhat.com/ # v2 Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-09-05netfs, cifs: Move CIFS_INO_MODIFIED_ATTR to netfs_inodeDavid Howells
Move CIFS_INO_MODIFIED_ATTR to netfs_inode as NETFS_ICTX_MODIFIED_ATTR and then make netfs_perform_write() set it. This means that cifs doesn't need to implement the ->post_modify() hook. Signed-off-by: David Howells <dhowells@redhat.com> cc: Jeff Layton <jlayton@kernel.org> cc: Steve French <sfrench@samba.org> cc: Paulo Alcantara <pc@manguebit.com> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20240814203850.2240469-7-dhowells@redhat.com/ # v2 Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-28cifs: Fix lack of credit renegotiation on read retryDavid Howells
When netfslib asks cifs to issue a read operation, it prefaces this with a call to ->clamp_length() which cifs uses to negotiate credits, providing receive capacity on the server; however, in the event that a read op needs reissuing, netfslib doesn't call ->clamp_length() again as that could shorten the subrequest, leaving a gap. This causes the retried read to be done with zero credits which causes the server to reject it with STATUS_INVALID_PARAMETER. This is a problem for a DIO read that is requested that would go over the EOF. The short read will be retried, causing EINVAL to be returned to the user when it fails. Fix this by making cifs_req_issue_read() negotiate new credits if retrying (NETFS_SREQ_RETRYING now gets set in the read side as well as the write side in this instance). This isn't sufficient, however: the new credits might not be sufficient to complete the remainder of the read, so also add an additional field, rreq->actual_len, that holds the actual size of the op we want to perform without having to alter subreq->len. We then rely on repeated short reads being retried until we finish the read or reach the end of file and make a zero-length read. Also fix a couple of places where the subrequest start and length need to be altered by the amount so far transferred when being used. Fixes: 69c3c023af25 ("cifs: Implement netfslib hooks") Signed-off-by: David Howells <dhowells@redhat.com> cc: Steve French <sfrench@samba.org> cc: Paulo Alcantara <pc@manguebit.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2024-08-18smb3: fix broken cached reads when posix locksSteve French
Mandatory locking is enforced for cached reads, which violates default posix semantics, and also it is enforced inconsistently. This affected recent versions of libreoffice, and can be demonstrated by opening a file twice from the same client, locking it from handle one and trying to read from it from handle two (which fails, returning EACCES). There is already a mount option "forcemandatorylock" (which defaults to off), so with this change only when the user intentionally specifies "forcemandatorylock" on mount will we break posix semantics on read to a locked range (ie we will only fail in this case, if the user mounts with "forcemandatorylock"). An earlier patch fixed the write path. Fixes: 85160e03a79e ("CIFS: Implement caching mechanism for mandatory brlocks") Cc: stable@vger.kernel.org Cc: Pavel Shilovsky <piastryyy@gmail.com> Reviewed-by: David Howells <dhowells@redhat.com> Reported-by: abartlet@samba.org Reported-by: Kevin Ottens <kevin.ottens@enioka.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-08-17Merge tag 'v6.11-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull smb client fixes from Steve French: - fix for clang warning - additional null check - fix for cached write with posix locks - flexible structure fix * tag 'v6.11-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: smb: smb2pdu.h: Use static_assert() to check struct sizes smb3: fix lock breakage for cached writes smb/client: avoid possible NULL dereference in cifs_free_subrequest()
2024-08-15smb3: fix lock breakage for cached writesSteve French
Mandatory locking is enforced for cached writes, which violates default posix semantics, and also it is enforced inconsistently. This apparently breaks recent versions of libreoffice, but can also be demonstrated by opening a file twice from the same client, locking it from handle one and writing to it from handle two (which fails, returning EACCES). Since there was already a mount option "forcemandatorylock" (which defaults to off), with this change only when the user intentionally specifies "forcemandatorylock" on mount will we break posix semantics on write to a locked range (ie we will only fail the write in this case, if the user mounts with "forcemandatorylock"). Fixes: 85160e03a79e ("CIFS: Implement caching mechanism for mandatory brlocks") Cc: stable@vger.kernel.org Cc: Pavel Shilovsky <piastryyy@gmail.com> Reported-by: abartlet@samba.org Reported-by: Kevin Ottens <kevin.ottens@enioka.com> Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-08-15smb/client: avoid possible NULL dereference in cifs_free_subrequest()Su Hui
Clang static checker (scan-build) warning: cifsglob.h:line 890, column 3 Access to field 'ops' results in a dereference of a null pointer. Commit 519be989717c ("cifs: Add a tracepoint to track credits involved in R/W requests") adds a check for 'rdata->server', and let clang throw this warning about NULL dereference. When 'rdata->credits.value != 0 && rdata->server == NULL' happens, add_credits_and_wake_if() will call rdata->server->ops->add_credits(). This will cause NULL dereference problem. Add a check for 'rdata->server' to avoid NULL dereference. Cc: stable@vger.kernel.org Fixes: 69c3c023af25 ("cifs: Implement netfslib hooks") Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Su Hui <suhui@nfschina.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-08-139p: Fix DIO read through netfsDominique Martinet
If a program is watching a file on a 9p mount, it won't see any change in size if the file being exported by the server is changed directly in the source filesystem, presumably because 9p doesn't have change notifications, and because netfs skips the reads if the file is empty. Fix this by attempting to read the full size specified when a DIO read is requested (such as when 9p is operating in unbuffered mode) and dealing with a short read if the EOF was less than the expected read. To make this work, filesystems using netfslib must not set NETFS_SREQ_CLEAR_TAIL if performing a DIO read where that read hit the EOF. I don't want to mandatorily clear this flag in netfslib for DIO because, say, ceph might make a read from an object that is not completely filled, but does not reside at the end of file - and so we need to clear the excess. This can be tested by watching an empty file over 9p within a VM (such as in the ktest framework): while true; do read content; if [ -n "$content" ]; then echo $content; break; fi; done < /host/tmp/foo then writing something into the empty file. The watcher should immediately display the file content and break out of the loop. Without this fix, it remains in the loop indefinitely. Fixes: 80105ed2fd27 ("9p: Use netfslib read/write_iter") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218916 Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/1229195.1723211769@warthog.procyon.org.uk cc: Eric Van Hensbergen <ericvh@kernel.org> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Ilya Dryomov <idryomov@gmail.com> cc: Steve French <sfrench@samba.org> cc: Paulo Alcantara <pc@manguebit.com> cc: Trond Myklebust <trond.myklebust@hammerspace.com> cc: v9fs@lists.linux.dev cc: linux-afs@lists.infradead.org cc: ceph-devel@vger.kernel.org cc: linux-cifs@vger.kernel.org cc: linux-nfs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-07-20cifs: Fix missing fscache invalidationDavid Howells
A network filesystem needs to implement a netfslib hook to invalidate fscache if it's to be able to use the cache. Fix cifs to implement the cache invalidation hook. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Cc: stable@vger.kernel.org Fixes: 3ee1a1fc3981 ("cifs: Cut over to using netfslib") Signed-off-by: Steve French <stfrench@microsoft.com>
2024-07-19cifs: Add a tracepoint to track credits involved in R/W requestsDavid Howells
Add a tracepoint to track the credit changes and server in_flight value involved in the lifetime of a R/W request, logging it against the request/subreq debugging ID. This requires the debugging IDs to be recorded in the cifs_credits struct. The tracepoint can be enabled with: echo 1 >/sys/kernel/debug/tracing/events/cifs/smb3_rw_credits/enable Also add a three-state flag to struct cifs_credits to note if we're interested in determining when the in_flight contribution ends and, if so, to track whether we've decremented the contribution yet. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2024-07-19cifs: Fix setting of zero_point after DIO writeDavid Howells
At the moment, at the end of a DIO write, cifs calls netfs_resize_file() to adjust the size of the file if it needs it. This will reduce the zero_point (the point above which we assume a read will just return zeros) if it's more than the new i_size, but won't increase it. With DIO writes, however, we definitely want to increase it as we have clobbered the local pagecache and then written some data that's not available locally. Fix cifs to make the zero_point above the end of a DIO or unbuffered write. This fixes corruption seen occasionally with the generic/708 xfs-test. In that case, the read-back of some of the written data is being short-circuited and replaced with zeroes. Fixes: 3ee1a1fc3981 ("cifs: Cut over to using netfslib") Cc: stable@vger.kernel.org Reported-by: Steve French <sfrench@samba.org> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2024-07-19cifs: Fix missing error code setDavid Howells
In cifs_strict_readv(), the default rc (-EACCES) is accidentally cleared by a successful return from netfs_start_io_direct(), such that if cifs_find_lock_conflict() fails, we don't return an error. Fix this by resetting the default error code. Fixes: 14b1cd25346b ("cifs: Fix locking in cifs_strict_readv()") Cc: stable@vger.kernel.org Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2024-07-02cifs: Fix read-performance regression by dropping readahead expansionDavid Howells
cifs_expand_read() is causing a performance regression of around 30% by causing extra pagecache to be allocated for an inode in the readahead path before we begin actually dispatching RPC requests, thereby delaying the actual I/O. The expansion is sized according to the rsize parameter, which seems to be 4MiB on my test system; this is a big step up from the first requests made by the fio test program. Simple repro (look at read bandwidth number): fio --name=writetest --filename=/xfstest.test/foo --time_based --runtime=60 --size=16M --numjobs=1 --rw=read Fix this by removing cifs_expand_readahead(). Readahead expansion is mostly useful for when we're using the local cache if the local cache has a block size greater than PAGE_SIZE, so we can dispense with it when not caching. Fixes: 69c3c023af25 ("cifs: Implement netfslib hooks") Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> cc: Jeff Layton <jlayton@kernel.org> cc: Matthew Wilcox <willy@infradead.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org Signed-off-by: Steve French <stfrench@microsoft.com>
2024-06-20cifs: Move the 'pid' from the subreq to the reqDavid Howells
Move the reference pid from the cifs_io_subrequest struct to the cifs_io_request struct as it's the same for all subreqs of a particular request. Signed-off-by: David Howells <dhowells@redhat.com> cc: Paulo Alcantara <pc@manguebit.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2024-06-20cifs: Only pick a channel once per read requestDavid Howells
In cifs, only pick a channel when setting up a read request rather than doing so individually for every subrequest and instead use that channel for all. This mirrors what the code in v6.9 does. Signed-off-by: David Howells <dhowells@redhat.com> cc: Steve French <sfrench@samba.org> cc: Paulo Alcantara <pc@manguebit.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2024-06-18cifs: drop the incorrect assertion in cifs_swap_rw()Barry Song
Since commit 2282679fb20b ("mm: submit multipage write for SWP_FS_OPS swap-space"), we can plug multiple pages then unplug them all together. That means iov_iter_count(iter) could be way bigger than PAGE_SIZE, it actually equals the size of iov_iter_npages(iter, INT_MAX). Note this issue has nothing to do with large folios as we don't support THP_SWPOUT to non-block devices. Fixes: 2282679fb20b ("mm: submit multipage write for SWP_FS_OPS swap-space") Reported-by: Christoph Hellwig <hch@lst.de> Closes: https://lore.kernel.org/linux-mm/20240614100329.1203579-1-hch@lst.de/ Cc: NeilBrown <neilb@suse.de> Cc: Anna Schumaker <anna@kernel.org> Cc: Steve French <sfrench@samba.org> Cc: Trond Myklebust <trondmy@kernel.org> Cc: Chuanhua Han <hanchuanhua@oppo.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Chris Li <chrisl@kernel.org> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Jeff Layton <jlayton@kernel.org> Cc: Paulo Alcantara <pc@manguebit.com> Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com> Cc: Shyam Prasad N <sprasad@microsoft.com> Cc: Tom Talpey <tom@talpey.com> Cc: Bharath SM <bharathsm@microsoft.com> Cc: <stable@vger.kernel.org> Signed-off-by: Barry Song <v-songbaohua@oppo.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-05-21smb3: reenable swapfiles over SMB3 mountsSteve French
With the changes to folios/netfs it is now easier to reenable swapfile support over SMB3 which fixes various xfstests Reviewed-by: David Howells <dhowells@redhat.com> Suggested-by: David Howells <dhowells@redhat.com> Fixes: e1209d3a7a67 ("mm: introduce ->swap_rw and use it for reads from SWP_FS_OPS swap-space") Signed-off-by: Steve French <stfrench@microsoft.com>
2024-05-13cifs: Fix locking in cifs_strict_readv()Steve French
Fix to take the i_rwsem (through the netfs locking wrappers) before taking cinode->lock_sem. Fixes: 3ee1a1fc3981 ("cifs: Cut over to using netfslib") Reported-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-05-01cifs: Remove some code that's no longer used, part 3David Howells
Remove some code that was #if'd out with the netfslib conversion. This is split into parts for file.c as the diff generator otherwise produces a hard to read diff for part of it where a big chunk is cut out. Signed-off-by: David Howells <dhowells@redhat.com> cc: Steve French <sfrench@samba.org> cc: Shyam Prasad N <nspmangalore@gmail.com> cc: Rohith Surabattula <rohiths.msft@gmail.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
2024-05-01cifs: Remove some code that's no longer used, part 2David Howells
Remove some code that was #if'd out with the netfslib conversion. This is split into parts for file.c as the diff generator otherwise produces a hard to read diff for part of it where a big chunk is cut out. Signed-off-by: David Howells <dhowells@redhat.com> cc: Steve French <sfrench@samba.org> cc: Shyam Prasad N <nspmangalore@gmail.com> cc: Rohith Surabattula <rohiths.msft@gmail.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
2024-05-01cifs: Remove some code that's no longer used, part 1David Howells
Remove some code that was #if'd out with the netfslib conversion. This is split into parts for file.c as the diff generator otherwise produces a hard to read diff for part of it where a big chunk is cut out. Signed-off-by: David Howells <dhowells@redhat.com> cc: Steve French <sfrench@samba.org> cc: Shyam Prasad N <nspmangalore@gmail.com> cc: Rohith Surabattula <rohiths.msft@gmail.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
2024-05-01cifs: Cut over to using netfslibDavid Howells
Make the cifs filesystem use netfslib to handle reading and writing on behalf of cifs. The changes include: (1) Various read_iter/write_iter type functions are turned into wrappers around netfslib API functions or are pointed directly at those functions: cifs_file_direct{,_nobrl}_ops switch to use netfs_unbuffered_read_iter and netfs_unbuffered_write_iter. Large pieces of code that will be removed are #if'd out and will be removed in subsequent patches. [?] Why does cifs mark the page dirty in the destination buffer of a DIO read? Should that happen automatically? Does netfs need to do that? Signed-off-by: David Howells <dhowells@redhat.com> cc: Steve French <sfrench@samba.org> cc: Shyam Prasad N <nspmangalore@gmail.com> cc: Rohith Surabattula <rohiths.msft@gmail.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
2024-05-01cifs: Implement netfslib hooksDavid Howells
Provide implementation of the netfslib hooks that will be used by netfslib to ask cifs to set up and perform operations. Of particular note are (*) cifs_clamp_length() - This is used to negotiate the size of the next subrequest in a read request, taking into account the credit available and the rsize. The credits are attached to the subrequest. (*) cifs_req_issue_read() - This is used to issue a subrequest that has been set up and clamped. (*) cifs_prepare_write() - This prepares to fill a subrequest by picking a channel, reopening the file and requesting credits so that we can set the maximum size of the subrequest and also sets the maximum number of segments if we're doing RDMA. (*) cifs_issue_write() - This releases any unneeded credits and issues an asynchronous data write for the contiguous slice of file covered by the subrequest. This should possibly be folded in to all ->async_writev() ops and that called directly. (*) cifs_begin_writeback() - This gets the cached writable handle through which we do writeback (this does not affect writethrough, unbuffered or direct writes). At this point, cifs is not wired up to actually *use* netfslib; that will be done in a subsequent patch. Signed-off-by: David Howells <dhowells@redhat.com> cc: Steve French <sfrench@samba.org> cc: Shyam Prasad N <nspmangalore@gmail.com> cc: Rohith Surabattula <rohiths.msft@gmail.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
2024-05-01cifs: Move cifs_loose_read_iter() and cifs_file_write_iter() to file.cDavid Howells
Move cifs_loose_read_iter() and cifs_file_write_iter() to file.c so that they are colocated with similar functions rather than being split with cifsfs.c. Signed-off-by: David Howells <dhowells@redhat.com> cc: Steve French <sfrench@samba.org> cc: Shyam Prasad N <nspmangalore@gmail.com> cc: Rohith Surabattula <rohiths.msft@gmail.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
2024-05-01cifs: Replace the writedata replay bool with a netfs sreq flagDavid Howells
Replace the 'replay' bool in cifs_writedata (now cifs_io_subrequest) with a flag in the netfs_io_subrequest flags. Signed-off-by: David Howells <dhowells@redhat.com> cc: Steve French <sfrench@samba.org> cc: Shyam Prasad N <nspmangalore@gmail.com> cc: Rohith Surabattula <rohiths.msft@gmail.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
2024-05-01cifs: Make wait_mtu_credits take size_t argsDavid Howells
Make the wait_mtu_credits functions use size_t for the size and num arguments rather than unsigned int as netfslib uses size_t/ssize_t for arguments and return values to allow for extra capacity. Signed-off-by: David Howells <dhowells@redhat.com> cc: Steve French <sfrench@samba.org> cc: Shyam Prasad N <nspmangalore@gmail.com> cc: Rohith Surabattula <rohiths.msft@gmail.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: linux-cachefs@redhat.com cc: netfs@lists.linux.dev cc: linux-mm@kvack.org
2024-05-01cifs: Use more fields from netfs_io_subrequestDavid Howells
Use more fields from netfs_io_subrequest instead of those incorporated into cifs_io_subrequest from cifs_readdata and cifs_writedata. Signed-off-by: David Howells <dhowells@redhat.com> cc: Steve French <sfrench@samba.org> cc: Shyam Prasad N <nspmangalore@gmail.com> cc: Rohith Surabattula <rohiths.msft@gmail.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
2024-05-01cifs: Replace cifs_writedata with a wrapper around netfs_io_subrequestDavid Howells
Replace the cifs_writedata struct with the same wrapper around netfs_io_subrequest that was used to replace cifs_readdata. Signed-off-by: David Howells <dhowells@redhat.com> cc: Steve French <sfrench@samba.org> cc: Shyam Prasad N <nspmangalore@gmail.com> cc: Rohith Surabattula <rohiths.msft@gmail.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
2024-05-01cifs: Replace cifs_readdata with a wrapper around netfs_io_subrequestDavid Howells
Netfslib has a facility whereby the allocation for netfs_io_subrequest can be increased to so that filesystem-specific data can be tagged on the end. Prepare to use this by making a struct, cifs_io_subrequest, that wraps netfs_io_subrequest, and absorb struct cifs_readdata into it. Signed-off-by: David Howells <dhowells@redhat.com> cc: Steve French <sfrench@samba.org> cc: Shyam Prasad N <nspmangalore@gmail.com> cc: Rohith Surabattula <rohiths.msft@gmail.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
2024-05-01cifs: Use alternative invalidation to using launder_folioDavid Howells
Use writepages-based flushing invalidation instead of invalidate_inode_pages2() and ->launder_folio(). This will allow ->launder_folio() to be removed eventually. Signed-off-by: David Howells <dhowells@redhat.com> cc: Steve French <sfrench@samba.org> cc: Shyam Prasad N <nspmangalore@gmail.com> cc: Rohith Surabattula <rohiths.msft@gmail.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org
2024-04-29mm: Remove the PG_fscache alias for PG_private_2David Howells
Remove the PG_fscache alias for PG_private_2 and use the latter directly. Use of this flag for marking pages undergoing writing to the cache should be considered deprecated and the folios should be marked dirty instead and the write done in ->writepages(). Note that PG_private_2 itself should be considered deprecated and up for future removal by the MM folks too. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: Matthew Wilcox (Oracle) <willy@infradead.org> cc: Ilya Dryomov <idryomov@gmail.com> cc: Xiubo Li <xiubli@redhat.com> cc: Steve French <sfrench@samba.org> cc: Paulo Alcantara <pc@manguebit.com> cc: Ronnie Sahlberg <ronniesahlberg@gmail.com> cc: Shyam Prasad N <sprasad@microsoft.com> cc: Tom Talpey <tom@talpey.com> cc: Bharath SM <bharathsm@microsoft.com> cc: Trond Myklebust <trond.myklebust@hammerspace.com> cc: Anna Schumaker <anna@kernel.org> cc: netfs@lists.linux.dev cc: ceph-devel@vger.kernel.org cc: linux-cifs@vger.kernel.org cc: linux-nfs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
2024-04-03smb3: retrying on failed server closeRitvik Budhiraja
In the current implementation, CIFS close sends a close to the server and does not check for the success of the server close. This patch adds functionality to check for server close return status and retries in case of an EBUSY or EAGAIN error. This can help avoid handle leaks Cc: stable@vger.kernel.org Signed-off-by: Ritvik Budhiraja <rbudhiraja@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-04-02cifs: Fix caching to try to do open O_WRONLY as rdwr on serverDavid Howells
When we're engaged in local caching of a cifs filesystem, we cannot perform caching of a partially written cache granule unless we can read the rest of the granule. This can result in unexpected access errors being reported to the user. Fix this by the following: if a file is opened O_WRONLY locally, but the mount was given the "-o fsc" flag, try first opening the remote file with GENERIC_READ|GENERIC_WRITE and if that returns -EACCES, try dropping the GENERIC_READ and doing the open again. If that last succeeds, invalidate the cache for that file as for O_DIRECT. Fixes: 70431bfd825d ("cifs: Support fscache indexing rewrite") Signed-off-by: David Howells <dhowells@redhat.com> cc: Steve French <sfrench@samba.org> cc: Shyam Prasad N <nspmangalore@gmail.com> cc: Rohith Surabattula <rohiths.msft@gmail.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-14cifs: fixes for get_inode_infoMeetakshi Setiya
Fix potential memory leaks, add error checking, remove unnecessary initialisation of status_file_deleted and do not use cifs_iget() to get inode in reparse_info_to_fattr since fattrs may not be fully set. Fixes: ffceb7640cbf ("smb: client: do not defer close open handles to deleted files") Reported-by: Paulo Alcantara <pc@manguebit.com> Signed-off-by: Meetakshi Setiya <msetiya@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-14cifs: defer close file handles having RH leaseBharath SM
Previously we only deferred closing file handles with RHW lease. To enhance performance benefits from deferred closes, we now include handles with RH leases as well. Signed-off-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-13Merge tag '6.9-rc-smb3-client-fixes-part1' of ↵Linus Torvalds
git://git.samba.org/sfrench/cifs-2.6 Pull smb client updates from Steve French: - fix for folios/netfs data corruption in cifs_extend_writeback - additional tracepoint added - updates for special files and symlinks: improvements to allow selecting use of either WSL or NFS reparse point format on creating special files - allocation size improvement for cached files - minor cleanup patches - fix to allow changing the password on remount when password for the session is expired. - lease key related fixes: caching hardlinked files, deletes of deferred close files, and an important fix to better reuse lease keys for compound operations, which also can avoid lease break timeouts when low on credits - fix potential data corruption with write/readdir races - compression cleanups and a fix for compression headers * tag '6.9-rc-smb3-client-fixes-part1' of git://git.samba.org/sfrench/cifs-2.6: (24 commits) cifs: update internal module version number for cifs.ko smb: common: simplify compression headers smb: common: fix fields sizes in compression_pattern_payload_v1 smb: client: negotiate compression algorithms smb3: add dynamic trace point for ioctls cifs: Fix writeback data corruption smb: client: return reparse type in /proc/mounts smb: client: set correct d_type for reparse DFS/DFSR and mount point smb: client: parse uid, gid, mode and dev from WSL reparse points smb: client: introduce SMB2_OP_QUERY_WSL_EA smb: client: Fix a NULL vs IS_ERR() check in wsl_set_xattrs() smb: client: add support for WSL reparse points smb: client: reduce number of parameters in smb2_compound_op() smb: client: fix potential broken compound request smb: client: move most of reparse point handling code to common file smb: client: introduce reparse mount option smb: client: retry compound request without reusing lease smb: client: do not defer close open handles to deleted files smb: client: reuse file lease key in compound operations smb3: update allocation size more accurately on write completion ...
2024-03-11Merge tag 'vfs-6.9.file' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull file locking updates from Christian Brauner: "A few years ago struct file_lock_context was added to allow for separate lists to track different types of file locks instead of using a singly-linked list for all of them. Now leases no longer need to be tracked using struct file_lock. However, a lot of the infrastructure is identical for leases and locks so separating them isn't trivial. This splits a group of fields used by both file locks and leases into a new struct file_lock_core. The new core struct is embedded in struct file_lock. Coccinelle was used to convert a lot of the callers to deal with the move, with the remaining 25% or so converted by hand. Afterwards several internal functions in fs/locks.c are made to work with struct file_lock_core. Ultimately this allows to split struct file_lock into struct file_lock and struct file_lease. The file lease APIs are then converted to take struct file_lease" * tag 'vfs-6.9.file' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (51 commits) filelock: fix deadlock detection in POSIX locking filelock: always define for_each_file_lock() smb: remove redundant check filelock: don't do security checks on nfsd setlease calls filelock: split leases out of struct file_lock filelock: remove temporary compatibility macros smb/server: adapt to breakup of struct file_lock smb/client: adapt to breakup of struct file_lock ocfs2: adapt to breakup of struct file_lock nfsd: adapt to breakup of struct file_lock nfs: adapt to breakup of struct file_lock lockd: adapt to breakup of struct file_lock fuse: adapt to breakup of struct file_lock gfs2: adapt to breakup of struct file_lock dlm: adapt to breakup of struct file_lock ceph: adapt to breakup of struct file_lock afs: adapt to breakup of struct file_lock 9p: adapt to breakup of struct file_lock filelock: convert seqfile handling to use file_lock_core filelock: convert locks_translate_pid to take file_lock_core ...
2024-03-10cifs: Fix writeback data corruptionDavid Howells
cifs writeback doesn't correctly handle the case where cifs_extend_writeback() hits a point where it is considering an additional folio, but this would overrun the wsize - at which point it drops out of the xarray scanning loop and calls xas_pause(). The problem is that xas_pause() advances the loop counter - thereby skipping that page. What needs to happen is for xas_reset() to be called any time we decide we don't want to process the page we're looking at, but rather send the request we are building and start a new one. Fix this by copying and adapting the netfslib writepages code as a temporary measure, with cifs writeback intending to be offloaded to netfslib in the near future. This also fixes the issue with the use of filemap_get_folios_tag() causing retry of a bunch of pages which the extender already dealt with. This can be tested by creating, say, a 64K file somewhere not on cifs (otherwise copy-offload may get underfoot), mounting a cifs share with a wsize of 64000, copying the file to it and then comparing the original file and the copy: dd if=/dev/urandom of=/tmp/64K bs=64k count=1 mount //192.168.6.1/test /mnt -o user=...,pass=...,wsize=64000 cp /tmp/64K /mnt/64K cmp /tmp/64K /mnt/64K Without the fix, the cmp fails at position 64000 (or shortly thereafter). Fixes: d08089f649a0 ("cifs: Change the I/O paths to use an iterator rather than a page list") Signed-off-by: David Howells <dhowells@redhat.com> cc: Steve French <sfrench@samba.org> cc: Paulo Alcantara <pc@manguebit.com> cc: Ronnie Sahlberg <ronniesahlberg@gmail.com> cc: Shyam Prasad N <sprasad@microsoft.com> cc: Tom Talpey <tom@talpey.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: samba-technical@lists.samba.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-10smb: client: do not defer close open handles to deleted filesMeetakshi Setiya
When a file/dentry has been deleted before closing all its open handles, currently, closing them can add them to the deferred close list. This can lead to problems in creating file with the same name when the file is re-created before the deferred close completes. This issue was seen while reusing a client's already existing lease on a file for compound operations and xfstest 591 failed because of the deferred close handle that remained valid even after the file was deleted and was being reused to create a file with the same name. The server in this case returns an error on open with STATUS_DELETE_PENDING. Recreating the file would fail till the deferred handles are closed (duration specified in closetimeo). This patch fixes the issue by flagging all open handles for the deleted file (file path to be precise) by setting status_file_deleted to true in the cifsFileInfo structure. As per the information classes specified in MS-FSCC, SMB2 query info response from the server has a DeletePending field, set to true to indicate that deletion has been requested on that file. If this is the case, flag the open handles for this file too. When doing close in cifs_close for each of these handles, check the value of this boolean field and do not defer close these handles if the corresponding filepath has been deleted. Signed-off-by: Meetakshi Setiya <msetiya@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-10smb3: update allocation size more accurately on write completionSteve French
Changes to allocation size are approximated for extending writes of cached files until the server returns the actual value (on SMB3 close or query info for example), but it was setting the estimated value for number of blocks to larger than the file size even if the file is likely sparse which breaks various xfstests (e.g. generic/129, 130, 221, 228). When i_size and i_blocks are updated in write completion do not increase allocation size more than what was written (rounded up to 512 bytes). Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-10cifs: prevent updating file size from server if we have a read/write leaseBharath SM
In cases of large directories, the readdir operation may span multiple round trips to retrieve contents. This introduces a potential race condition in case of concurrent write and readdir operations. If the readdir operation initiates before a write has been processed by the server, it may update the file size attribute to an older value. Address this issue by avoiding file size updates from readdir when we have read/write lease. Scenario: 1) process1: open dir xyz 2) process1: readdir instance 1 on xyz 3) process2: create file.txt for write 4) process2: write x bytes to file.txt 5) process2: close file.txt 6) process2: open file.txt for read 7) process1: readdir 2 - overwrites file.txt inode size to 0 8) process2: read contents of file.txt - bug, short read with 0 bytes Cc: stable@vger.kernel.org Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>