summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2021-09-23cifs: fix incorrect check for null pointer in header_assembleSteve French
Although very unlikely that the tlink pointer would be null in this case, get_next_mid function can in theory return null (but not an error) so need to check for null (not for IS_ERR, which can not be returned here). Address warning: fs/smbfs_client/connect.c:2392 cifs_match_super() warn: 'tlink' isn't an ERR_PTR Pointed out by Dan Carpenter via smatch code analysis tool CC: stable@vger.kernel.org Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-23smb3: correct server pointer dereferencing check to be more consistentSteve French
Address warning: fs/smbfs_client/misc.c:273 header_assemble() warn: variable dereferenced before check 'treeCon->ses->server' Pointed out by Dan Carpenter via smatch code analysis tool Although the check is likely unneeded, adding it makes the code more consistent and easier to read, as the same check is done elsewhere in the function. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-23Merge tag 'for-5.15-rc2-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - regression fix for leak of transaction handle after verity rollback failure - properly reset device last error between mounts - improve one error handling case when checksumming bios - fixup confusing displayed size of space info free space * tag 'for-5.15-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: prevent __btrfs_dump_space_info() to underflow its free space btrfs: fix mount failure due to past and transient device flush error btrfs: fix transaction handle leak after verity rollback failure btrfs: replace BUG_ON() in btrfs_csum_one_bio() with proper error handling
2021-09-23smb3: correct smb3 ACL security descriptorSteve French
Address warning: fs/smbfs_client/smb2pdu.c:2425 create_sd_buf() warn: struct type mismatch 'smb3_acl vs cifs_acl' Pointed out by Dan Carpenter via smatch code analysis tool Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-23cifs: Clear modified attribute bit from inode flagsSteve French
Clear CIFS_INO_MODIFIED_ATTR bit from inode flags after updating mtime and ctime Signed-off-by: Rohith Surabattula <rohiths@microsoft.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Acked-by: Ronnie Sahlberg <lsahlber@redhat.com> Cc: stable@vger.kernel.org # 5.13+ Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-23cifs: Deal with some warnings from W=1David Howells
Deal with some warnings generated from make W=1: (1) Add/remove/fix kerneldoc parameters descriptions. (2) Turn cifs' rqst_page_get_length()'s banner comment into a kerneldoc comment. It should probably be prefixed with "cifs_" though. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-23erofs: clear compacted_2b if compacted_4b_initial > totalidxYue Hu
Currently, the whole indexes will only be compacted 4B if compacted_4b_initial > totalidx. So, the calculated compacted_2b is worthless for that case. It may waste CPU resources. No need to update compacted_4b_initial as mkfs since it's used to fulfill the alignment of the 1st compacted_2b pack and would handle the case above. We also need to clarify compacted_4b_end here. It's used for the last lclusters which aren't fitted in the previous compacted_2b packs. Some messages are from Xiang. Link: https://lore.kernel.org/r/20210914035915.1190-1-zbestahu@gmail.com Signed-off-by: Yue Hu <huyue2@yulong.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> [ Gao Xiang: it's enough to use "compacted_4b_initial < totalidx". ] Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2021-09-23erofs: fix misbehavior of unsupported chunk format checkGao Xiang
Unsupported chunk format should be checked with "if (vi->chunkformat & ~EROFS_CHUNK_FORMAT_ALL)" Found when checking with 4k-byte blockmap (although currently mkfs uses inode chunk indexes format by default.) Link: https://lore.kernel.org/r/20210922095141.233938-1-hsiangkao@linux.alibaba.com Fixes: c5aa903a59db ("erofs: support reading chunk-based uncompressed files") Reviewed-by: Liu Bo <bo.liu@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2021-09-22ksmbd: remove follow symlinks supportNamjae Jeon
Use LOOKUP_NO_SYMLINKS flags for default lookup to prohibit the middle of symlink component lookup and remove follow symlinks parameter support. We re-implement it as reparse point later. Test result: smbclient -Ulinkinjeon%1234 //172.30.1.42/share -c "get hacked/passwd passwd" NT_STATUS_OBJECT_NAME_NOT_FOUND opening remote file \hacked\passwd Cc: Ralph Böhme <slow@samba.org> Cc: Steve French <smfrench@gmail.com> Acked-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-22ksmbd: check protocol id in ksmbd_verify_smb_message()Namjae Jeon
When second smb2 pdu has invalid protocol id, ksmbd doesn't detect it and allow to process smb2 request. This patch add the check it in ksmbd_verify_smb_message() and don't use protocol id of smb2 request as protocol id of response. Reviewed-by: Ronnie Sahlberg <ronniesahlberg@gmail.com> Reviewed-by: Ralph Böhme <slow@samba.org> Reported-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-22fs-verity: fix signed integer overflow with i_size near S64_MAXEric Biggers
If the file size is almost S64_MAX, the calculated number of Merkle tree levels exceeds FS_VERITY_MAX_LEVELS, causing FS_IOC_ENABLE_VERITY to fail. This is unintentional, since as the comment above the definition of FS_VERITY_MAX_LEVELS states, it is enough for over U64_MAX bytes of data using SHA-256 and 4K blocks. (Specifically, 4096*128**8 >= 2**64.) The bug is actually that when the number of blocks in the first level is calculated from i_size, there is a signed integer overflow due to i_size being signed. Fix this by treating i_size as unsigned. This was found by the new test "generic: test fs-verity EFBIG scenarios" (https://lkml.kernel.org/r/b1d116cd4d0ea74b9cd86f349c672021e005a75c.1631558495.git.boris@bur.io). This didn't affect ext4 or f2fs since those have a smaller maximum file size, but it did affect btrfs which allows files up to S64_MAX bytes. Reported-by: Boris Burkov <boris@bur.io> Fixes: 3fda4c617e84 ("fs-verity: implement FS_IOC_ENABLE_VERITY ioctl") Fixes: fd2d1acfcadf ("fs-verity: add the hook for file ->open()") Cc: <stable@vger.kernel.org> # v5.4+ Reviewed-by: Boris Burkov <boris@bur.io> Link: https://lore.kernel.org/r/20210916203424.113376-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2021-09-22Merge tag 'nfsd-5.15-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: "Critical bug fixes: - Fix crash in NLM TEST procedure - NFSv4.1+ backchannel not restored after PATH_DOWN" * tag 'nfsd-5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: nfsd: back channel stuck in SEQ4_STATUS_CB_PATH_DOWN NLM: Fix svcxdr_encode_owner()
2021-09-22ext2: fix sleeping in atomic bugs on errorDan Carpenter
The ext2_error() function syncs the filesystem so it sleeps. The caller is holding a spinlock so it's not allowed to sleep. ext2_statfs() <- disables preempt -> ext2_count_free_blocks() -> ext2_get_group_desc() Fix this by using WARN() to print an error message and a stack trace instead of using ext2_error(). Link: https://lore.kernel.org/r/20210921203233.GA16529@kili Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jan Kara <jack@suse.cz>
2021-09-21cifs: fix a sign extension bugDan Carpenter
The problem is the mismatched types between "ctx->total_len" which is an unsigned int, "rc" which is an int, and "ctx->rc" which is a ssize_t. The code does: ctx->rc = (rc == 0) ? ctx->total_len : rc; We want "ctx->rc" to store the negative "rc" error code. But what happens is that "rc" is type promoted to a high unsigned int and 'ctx->rc" will store the high positive value instead of a negative value. The fix is to change "rc" from an int to a ssize_t. Fixes: c610c4b619e5 ("CIFS: Add asynchronous write support through kernel AIO") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-21ksmbd: add default data stream name in FILE_STREAM_INFORMATIONNamjae Jeon
Windows client expect to get default stream name(::DATA) in FILE_STREAM_INFORMATION response even if there is no stream data in file. This patch fix update failure when writing ppt or doc files. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Reviewed-By: Tom Talpey <tom@talpey.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-21ksmbd: log that server is experimental at module loadSteve French
While we are working through detailed security reviews of ksmbd server code we should remind users that it is an experimental module by adding a warning when the module loads. Currently the module shows as experimental in Kconfig and is disabled by default, but we don't want to confuse users. Although ksmbd passes a wide variety of the important functional tests (since initial focus had been largely on functional testing such as smbtorture, xfstests etc.), and ksmbd has added key security features (e.g. GCM256 encryption, Kerberos support), there are ongoing detailed reviews of the code base for path processing and network buffer decoding, and this patch reminds users that the module should be considered "experimental." Reviewed-by: Namjae Jeon <linkinjeon@kernel.org> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-21ceph: fix off by one bugs in unsafe_request_wait()Dan Carpenter
The "> max" tests should be ">= max" to prevent an out of bounds access on the next lines. Fixes: e1a4541ec0b9 ("ceph: flush the mdlog before waiting on unsafe reqs") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2021-09-21qnx4: work around gcc false positive warning bugLinus Torvalds
In commit b7213ffa0e58 ("qnx4: avoid stringop-overread errors") I tried to teach gcc about how the directory entry structure can be two different things depending on a status flag. It made the code clearer, and it seemed to make gcc happy. However, Arnd points to a gcc bug, where despite using two different members of a union, gcc then gets confused, and uses the size of one of the members to decide if a string overrun happens. And not necessarily the rigth one. End result: with some configurations, gcc-11 will still complain about the source buffer size being overread: fs/qnx4/dir.c: In function 'qnx4_readdir': fs/qnx4/dir.c:76:32: error: 'strnlen' specified bound [16, 48] exceeds source size 1 [-Werror=stringop-overread] 76 | size = strnlen(name, size); | ^~~~~~~~~~~~~~~~~~~ fs/qnx4/dir.c:26:22: note: source object declared here 26 | char de_name; | ^~~~~~~ because gcc will get confused about which union member entry is actually getting accessed, even when the source code is very clear about it. Gcc internally will have combined two "redundant" pointers (pointing to different union elements that are at the same offset), and takes the size checking from one or the other - not necessarily the right one. This is clearly a gcc bug, but we can work around it fairly easily. The biggest thing here is the big honking comment about why we do what we do. Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99578#c6 Reported-and-tested-by: Arnd Bergmann <arnd@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-21debugfs: debugfs_create_file_size(): use IS_ERR to check for errorNirmoy Das
debugfs_create_file() returns encoded error so use IS_ERR for checking return value. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Fixes: ff9fb72bc077 ("debugfs: return error values, not NULL") Cc: stable <stable@vger.kernel.org> References: https://gitlab.freedesktop.org/drm/amd/-/issues/1686 Link: https://lore.kernel.org/r/20210902102917.2233-1-nirmoy.das@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-20Merge tag 'afs-fixes-20210913' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull AFS fixes from David Howells: "Fixes for AFS problems that can cause data corruption due to interaction with another client modifying data cached locally: - When d_revalidating a dentry, don't look at the inode to which it points. Only check the directory to which the dentry belongs. This was confusing things and causing the silly-rename cleanup code to remove the file now at the dentry of a file that got deleted. - Fix mmap data coherency. When a callback break is received that relates to a file that we have cached, the data content may have been changed (there are other reasons, such as the user's rights having been changed). However, we're checking it lazily, only on entry to the kernel, which doesn't happen if we have a writeable shared mapped page on that file. We make the kernel keep track of mmapped files and clear all PTEs mapping to that file as soon as the callback comes in by calling unmap_mapping_pages() (we don't necessarily want to zap the pagecache). This causes the kernel to be reentered when userspace tries to access the mmapped address range again - and at that point we can query the server and, if we need to, zap the page cache. Ideally, I would check each file at the point of notification, but that involves poking the server[*] - which is holding an exclusive lock on the vnode it is changing, waiting for all the clients it notified to reply. This could then deadlock against the server. Further, invalidating the pagecache might call ->launder_page(), which would try to write to the file, which would definitely deadlock. (AFS doesn't lease file access). [*] Checking to see if the file content has changed is a matter of comparing the current data version number, but we have to ask the server for that. We also need to get a new callback promise and we need to poke the server for that too. - Add some more points at which the inode is validated, since we're doing it lazily, notably in ->read_iter() and ->page_mkwrite(), but also when performing some directory operations. Ideally, checking in ->read_iter() would be done in some derivation of filemap_read(). If we're going to call the server to read the file, then we get the file status fetch as part of that. - The above is now causing us to make a lot more calls to afs_validate() to check the inode - and afs_validate() takes the RCU read lock each time to make a quick check (ie. afs_check_validity()). This is entirely for the purpose of checking cb_s_break to see if the server we're using reinitialised its list of callbacks - however this isn't a very common event, so most of the time we're taking this needlessly. Add a new cell-wide counter to count the number of reinitialisations done by any server and check that - and only if that changes, take the RCU read lock and check the server list (the server list may change, but the cell a file is part of won't). - Don't update vnode->cb_s_break and ->cb_v_break inside the validity checking loop. The cb_lock is done with read_seqretry, so we might go round the loop a second time after resetting those values - and that could cause someone else checking validity to miss something (I think). Also included are patches for fixes for some bugs encountered whilst debugging this: - Fix a leak of afs_read objects and fix a leak of keys hidden by that. - Fix a leak of pages that couldn't be added to extend a writeback. - Fix the maintenance of i_blocks when i_size is changed by a local write or a local dir edit" Link: https://bugzilla.kernel.org/show_bug.cgi?id=214217 [1] Link: https://lore.kernel.org/r/163111665183.283156.17200205573146438918.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163113612442.352844.11162345591911691150.stgit@warthog.procyon.org.uk/ # i_blocks patch * tag 'afs-fixes-20210913' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: afs: Fix updating of i_blocks on file/dir extension afs: Fix corruption in reads at fpos 2G-4G from an OpenAFS server afs: Try to avoid taking RCU read lock when checking vnode validity afs: Fix mmap coherency vs 3rd-party changes afs: Fix incorrect triggering of sillyrename on 3rd-party invalidation afs: Add missing vnode validation checks afs: Fix page leak afs: Fix missing put on afs_read objects and missing get on the key therein
2021-09-20Merge tag '5.15-rc1-ksmbd' of git://git.samba.org/ksmbdLinus Torvalds
Pull ksmbd server fixes from Steve French: "Three ksmbd fixes, including an important security fix for path processing, and a buffer overflow check, and a trivial fix for incorrect header inclusion" * tag '5.15-rc1-ksmbd' of git://git.samba.org/ksmbd: ksmbd: add validation for FILE_FULL_EA_INFORMATION of smb2_get_info ksmbd: prevent out of share access ksmbd: transport_rdma: Don't include rwlock.h directly
2021-09-20Merge tag '5.15-rc1-smb3' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull cifs client fixes from Steve French: - two deferred close fixes (for bugs found with xfstests 478 and 461) - a deferred close improvement in rename - two trivial fixes for incorrect Linux comment formatting of multiple cifs files (pointed out by automated kernel test robot and checkpatch) * tag '5.15-rc1-smb3' of git://git.samba.org/sfrench/cifs-2.6: cifs: Not to defer close on file when lock is set cifs: Fix soft lockup during fsstress cifs: Deferred close performance improvements cifs: fix incorrect kernel doc comments cifs: remove pathname for file from SPDX header
2021-09-18ksmbd: add validation for FILE_FULL_EA_INFORMATION of smb2_get_infoNamjae Jeon
Add validation to check whether req->InputBufferLength is smaller than smb2_ea_info_req structure size. Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com> Cc: Ralph Böhme <slow@samba.org> Cc: Steve French <smfrench@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-17ksmbd: prevent out of share accessHyunchul Lee
Because of .., files outside the share directory could be accessed. To prevent this, normalize the given path and remove all . and .. components. In addition to the usual large set of regression tests (smbtorture and xfstests), ran various tests on this to specifically check path name validation including libsmb2 tests to verify path normalization: ./examples/smb2-ls-async smb://172.30.1.15/homes2/../ ./examples/smb2-ls-async smb://172.30.1.15/homes2/foo/../ ./examples/smb2-ls-async smb://172.30.1.15/homes2/foo/../../ ./examples/smb2-ls-async smb://172.30.1.15/homes2/foo/../ ./examples/smb2-ls-async smb://172.30.1.15/homes2/foo/..bar/ ./examples/smb2-ls-async smb://172.30.1.15/homes2/foo/bar../ ./examples/smb2-ls-async smb://172.30.1.15/homes2/foo/bar.. ./examples/smb2-ls-async smb://172.30.1.15/homes2/foo/bar../../../../ Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-17cifs: Not to defer close on file when lock is setRohith Surabattula
Close file immediately when lock is set. Cc: stable@vger.kernel.org # 5.13+ Signed-off-by: Rohith Surabattula <rohiths@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-17cifs: Fix soft lockup during fsstressRohith Surabattula
Below traces are observed during fsstress and system got hung. [ 130.698396] watchdog: BUG: soft lockup - CPU#6 stuck for 26s! Cc: stable@vger.kernel.org # 5.13+ Signed-off-by: Rohith Surabattula <rohiths@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-17cifs: Deferred close performance improvementsRohith Surabattula
During unlink/rename instead of closing all the deferred handles under tcon, close only handles under the requested dentry. Signed-off-by: Rohith Surabattula <rohiths@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-17btrfs: prevent __btrfs_dump_space_info() to underflow its free spaceQu Wenruo
It's not uncommon where __btrfs_dump_space_info() gets called under over-commit situations. In that case free space would underflow as total allocated space is not enough to handle all the over-committed space. Such underflow values can sometimes cause confusion for users enabled enospc_debug mount option, and takes some seconds for developers to convert the underflow value to signed result. Just output the free space as s64 to avoid such problem. Reported-by: Eli V <eliventer@gmail.com> Link: https://lore.kernel.org/linux-btrfs/CAJtFHUSy4zgyhf-4d9T+KdJp9w=UgzC2A0V=VtmaeEpcGgm1-Q@mail.gmail.com/ CC: stable@vger.kernel.org # 5.4+ Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-17btrfs: fix mount failure due to past and transient device flush errorFilipe Manana
When we get an error flushing one device, during a super block commit, we record the error in the device structure, in the field 'last_flush_error'. This is used to later check if we should error out the super block commit, depending on whether the number of flush errors is greater than or equals to the maximum tolerated device failures for a raid profile. However if we get a transient device flush error, unmount the filesystem and later try to mount it, we can fail the mount because we treat that past error as critical and consider the device is missing. Even if it's very likely that the error will happen again, as it's probably due to a hardware related problem, there may be cases where the error might not happen again. One example is during testing, and a test case like the new generic/648 from fstests always triggers this. The test cases generic/019 and generic/475 also trigger this scenario, but very sporadically. When this happens we get an error like this: $ mount /dev/sdc /mnt mount: /mnt wrong fs type, bad option, bad superblock on /dev/sdc, missing codepage or helper program, or other error. $ dmesg (...) [12918.886926] BTRFS warning (device sdc): chunk 13631488 missing 1 devices, max tolerance is 0 for writable mount [12918.888293] BTRFS warning (device sdc): writable mount is not allowed due to too many missing devices [12918.890853] BTRFS error (device sdc): open_ctree failed The failure happens because when btrfs_check_rw_degradable() is called at mount time, or at remount from RO to RW time, is sees a non zero value in a device's ->last_flush_error attribute, and therefore considers that the device is 'missing'. Fix this by setting a device's ->last_flush_error to zero when we close a device, making sure the error is not seen on the next mount attempt. We only need to track flush errors during the current mount, so that we never commit a super block if such errors happened. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-17btrfs: fix transaction handle leak after verity rollback failureFilipe Manana
During a verity rollback, if we fail to update the inode or delete the orphan, we abort the transaction and return without releasing our transaction handle. Fix that by releasing the handle. Fixes: 146054090b0859 ("btrfs: initial fsverity support") Fixes: 705242538ff348 ("btrfs: verity metadata orphan items") Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-17btrfs: replace BUG_ON() in btrfs_csum_one_bio() with proper error handlingQu Wenruo
There is a BUG_ON() in btrfs_csum_one_bio() to catch code logic error. It has indeed caught several bugs during subpage development. But the BUG_ON() itself will bring down the whole system which is an overkill. Replace it with a WARN() and exit gracefully, so that it won't crash the whole system while we can still catch the code logic error. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-17Merge tag 'iov_iter.3-5.15-2021-09-17' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull io_uring iov_iter retry fixes from Jens Axboe: "This adds a helper to save/restore iov_iter state, and modifies io_uring to use it. After that is done, we can now kill the iter->truncated addition that we added for this release. The io_uring change is being overly cautious with the save/restore/advance, but better safe than sorry and we can always improve that and reduce the overhead if it proves to be of concern. The only case to be worried about in this regard is huge IO, where iteration can take a while to iterate segments. I spent some time writing test cases, and expanded the coverage quite a bit from the last posting of this. liburing carries this regression test case now: https://git.kernel.dk/cgit/liburing/tree/test/file-verify.c which exercises all of this. It now also supports provided buffers, and explicitly tests for end-of-file/device truncation as well. On top of that, Pavel sanitized the IOPOLL retry path to follow the exact same pattern as normal IO" * tag 'iov_iter.3-5.15-2021-09-17' of git://git.kernel.dk/linux-block: io_uring: move iopoll reissue into regular IO path Revert "iov_iter: track truncated size" io_uring: use iov_iter state save/restore helpers iov_iter: add helper to save iov_iter state
2021-09-17Merge tag 'io_uring-5.15-2021-09-17' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull io_uring fixes from Jens Axboe: "Mostly fixes for regressions in this cycle, but also a few fixes that predate this release. The odd one out is a tweak to the direct files added in this release, where attempting to reuse a slot is allowed instead of needing an explicit removal of that slot first. It's a considerable improvement in usability to that API, hence I'm sending it for -rc2. - io-wq race fix and cleanup (Hao) - loop_rw_iter() type fix - SQPOLL max worker race fix - Allow poll arm for O_NONBLOCK files, fixing a case where it's impossible to properly use io_uring if you cannot modify the file flags - Allow direct open to simply reuse a slot, instead of needing it explicitly removed first (Pavel) - Fix a case where we missed signal mask restoring in cqring_wait, if we hit -EFAULT (Xiaoguang)" * tag 'io_uring-5.15-2021-09-17' of git://git.kernel.dk/linux-block: io_uring: allow retry for O_NONBLOCK if async is supported io_uring: auto-removal for direct open/accept io_uring: fix missing sigmask restore in io_cqring_wait() io_uring: pin SQPOLL data before unlocking ring lock io-wq: provide IO_WQ_* constants for IORING_REGISTER_IOWQ_MAX_WORKERS arg items io-wq: fix potential race of acct->nr_workers io-wq: code clean of io_wqe_create_worker() io_uring: ensure symmetry in handling iter types in loop_rw_iter()
2021-09-17nfsd: back channel stuck in SEQ4_STATUS_CB_PATH_DOWNDai Ngo
When the back channel enters SEQ4_STATUS_CB_PATH_DOWN state, the client recovers by sending BIND_CONN_TO_SESSION but the server fails to recover the back channel and leaves it as NFSD4_CB_DOWN. Fix by enhancing nfsd4_bind_conn_to_session to probe the back channel by calling nfsd4_probe_callback. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-09-17NLM: Fix svcxdr_encode_owner()Chuck Lever
Dai Ngo reports that, since the XDR overhaul, the NLM server crashes when the TEST procedure wants to return NLM_DENIED. There is a bug in svcxdr_encode_owner() that none of our standard test cases found. Replace the open-coded function with a call to an appropriate pre-fabricated XDR helper. Reported-by: Dai Ngo <Dai.Ngo@oracle.com> Fixes: a6a63ca5652e ("lockd: Common NLM XDR helpers") Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-09-17ksmbd: transport_rdma: Don't include rwlock.h directlyMike Galbraith
rwlock.h specifically asks to not be included directly. In fact, the proper spinlock.h include isn't needed either, it comes with the huge pile that kthread.h ends up pulling in, so just drop it entirely. Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-17mm: Fully initialize invalidate_lock, amend lock class laterSebastian Andrzej Siewior
The function __init_rwsem() is not part of the official API, it just a helper function used by init_rwsem(). Changing the lock's class and name should be done by using lockdep_set_class_and_name() after the has been fully initialized. The overhead of the additional class struct and setting it twice is negligible and it works across all locks. Fully initialize the lock with init_rwsem() and then set the custom class and name for the lock. Fixes: 730633f0b7f95 ("mm: Protect operations adding pages to page cache with invalidate_lock") Link: https://lore.kernel.org/r/20210901084403.g4fezi23cixemlhh@linutronix.de Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Jan Kara <jack@suse.cz>
2021-09-15qnx4: avoid stringop-overread errorsLinus Torvalds
The qnx4 directory entries are 64-byte blocks that have different contents depending on the a status byte that is in the last byte of the block. In particular, a directory entry can be either a "link info" entry with a 48-byte name and pointers to the real inode information, or an "inode entry" with a smaller 16-byte name and the full inode information. But the code was written to always just treat the directory name as if it was part of that "inode entry", and just extend the name to the longer case if the status byte said it was a link entry. That work just fine and gives the right results, but now that gcc is tracking data structure accesses much more, the code can trigger a compiler error about using up to 48 bytes (the long name) in a structure that only has that shorter name in it: fs/qnx4/dir.c: In function ‘qnx4_readdir’: fs/qnx4/dir.c:51:32: error: ‘strnlen’ specified bound 48 exceeds source size 16 [-Werror=stringop-overread] 51 | size = strnlen(de->di_fname, size); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from fs/qnx4/qnx4.h:3, from fs/qnx4/dir.c:16: include/uapi/linux/qnx4_fs.h:45:25: note: source object declared here 45 | char di_fname[QNX4_SHORT_NAME_MAX]; | ^~~~~~~~ which is because the source code doesn't really make this whole "one of two different types" explicit. Fix this by introducing a very explicit union of the two types, and basically explaining to the compiler what is really going on. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-15io_uring: move iopoll reissue into regular IO pathPavel Begunkov
230d50d448acb ("io_uring: move reissue into regular IO path") made non-IOPOLL I/O to not retry from ki_complete handler. Follow it steps and do the same for IOPOLL. Same problems, same implementation, same -EAGAIN assumptions. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/f80dfee2d5fa7678f0052a8ab3cfca9496a112ca.1631699928.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-15io_uring: use iov_iter state save/restore helpersJens Axboe
Get rid of the need to do re-expand and revert on an iterator when we encounter a short IO, or failure that warrants a retry. Use the new state save/restore helpers instead. We keep the iov_iter_state persistent across retries, if we need to restart the read or write operation. If there's a pending retry, the operation will always exit with the state correctly saved. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-14io_uring: allow retry for O_NONBLOCK if async is supportedJens Axboe
A common complaint is that using O_NONBLOCK files with io_uring can be a bit of a pain. Be a bit nicer and allow normal retry IFF the file does support async behavior. This makes it possible to use io_uring more reliably with O_NONBLOCK files, for use cases where it either isn't possible or feasible to modify the file flags. Cc: stable@vger.kernel.org Reported-and-tested-by: Dan Melnic <dmm@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-14io_uring: auto-removal for direct open/acceptPavel Begunkov
It might be inconvenient that direct open/accept deviates from the update semantics and fails if the slot is taken instead of removing a file sitting there. Implement this auto-removal. Note that removal might need to allocate and so may fail. However, if an empty slot is specified, it's guaraneed to not fail on the fd installation side for valid userspace programs. It's needed for users who can't tolerate such failures, e.g. accept where the other end never retries. Suggested-by: Franz-B. Tuneke <franz-bernhard.tuneke@tu-dortmund.de> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/c896f14ea46b0eaa6c09d93149e665c2c37979b4.1631632300.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-14io_uring: fix missing sigmask restore in io_cqring_wait()Xiaoguang Wang
Move get_timespec() section in io_cqring_wait() before the sigmask saving, otherwise we'll fail to restore sigmask once get_timespec() returns error. Fixes: c73ebb685fb6 ("io_uring: add timeout support for io_uring_enter()") Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Link: https://lore.kernel.org/r/20210914143852.9663-1-xiaoguang.wang@linux.alibaba.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-13io_uring: pin SQPOLL data before unlocking ring lockJens Axboe
We need to re-check sqd->thread after we've dropped the lock. Pin the sqd before doing the lockdep lock dance, and check if the thread is alive after that. It's either NULL or alive, as the SQPOLL thread cannot exit without holding the same sqd->lock. Reported-and-tested-by: syzbot+337de45f13a4fd54d708@syzkaller.appspotmail.com Fixes: fa84693b3c89 ("io_uring: ensure IORING_REGISTER_IOWQ_MAX_WORKERS works with SQPOLL") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-13cifs: fix incorrect kernel doc commentsSteve French
Correct kernel-doc comments pointed out by the automated kernel test robot. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-13cifs: remove pathname for file from SPDX headerSteve French
checkpatch complains about source files with filenames (e.g. in these cases just below the SPDX header in comments at the top of various files in fs/cifs). It also is helpful to change this now so will be less confusing when the parent directory is renamed e.g. from fs/cifs to fs/smb_client (or fs/smbfs) Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-13io-wq: provide IO_WQ_* constants for IORING_REGISTER_IOWQ_MAX_WORKERS arg itemsEugene Syromiatnikov
The items passed in the array pointed by the arg parameter of IORING_REGISTER_IOWQ_MAX_WORKERS io_uring_register operation carry certain semantics: they refer to different io-wq worker categories; provide IO_WQ_* constants in the UAPI, so these categories can be referenced in the user space code. Suggested-by: Jens Axboe <axboe@kernel.dk> Complements: 2e480058ddc21ec5 ("io-wq: provide a way to limit max number of workers") Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com> Link: https://lore.kernel.org/r/20210913154415.GA12890@asgard.redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-13afs: Fix updating of i_blocks on file/dir extensionDavid Howells
When an afs file or directory is modified locally such that the total file size is extended, i_blocks needs to be recalculated too. Fix this by making afs_write_end() and afs_edit_dir_add() call afs_set_i_size() rather than setting inode->i_size directly as that also recalculates inode->i_blocks. This can be tested by creating and writing into directories and files and then examining them with du. Without this change, directories show a 4 blocks (they start out at 2048 bytes) and files show 0 blocks; with this change, they should show a number of blocks proportional to the file size rounded up to 1024. Fixes: 31143d5d515e ("AFS: implement basic file write support") Fixes: 63a4681ff39c ("afs: Locally edit directory data for mkdir/create/unlink/...") Reported-by: Markus Suvanto <markus.suvanto@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> Tested-by: Markus Suvanto <markus.suvanto@gmail.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/163113612442.352844.11162345591911691150.stgit@warthog.procyon.org.uk/
2021-09-13afs: Fix corruption in reads at fpos 2G-4G from an OpenAFS serverDavid Howells
AFS-3 has two data fetch RPC variants, FS.FetchData and FS.FetchData64, and Linux's afs client switches between them when talking to a non-YFS server if the read size, the file position or the sum of the two have the upper 32 bits set of the 64-bit value. This is a problem, however, since the file position and length fields of FS.FetchData are *signed* 32-bit values. Fix this by capturing the capability bits obtained from the fileserver when it's sent an FS.GetCapabilities RPC, rather than just discarding them, and then picking out the VICED_CAPABILITY_64BITFILES flag. This can then be used to decide whether to use FS.FetchData or FS.FetchData64 - and also FS.StoreData or FS.StoreData64 - rather than using upper_32_bits() to switch on the parameter values. This capabilities flag could also be used to limit the maximum size of the file, but all servers must be checked for that. Note that the issue does not exist with FS.StoreData - that uses *unsigned* 32-bit values. It's also not a problem with Auristor servers as its YFS.FetchData64 op uses unsigned 64-bit values. This can be tested by cloning a git repo through an OpenAFS client to an OpenAFS server and then doing "git status" on it from a Linux afs client[1]. Provided the clone has a pack file that's in the 2G-4G range, the git status will show errors like: error: packfile .git/objects/pack/pack-5e813c51d12b6847bbc0fcd97c2bca66da50079c.pack does not match index error: packfile .git/objects/pack/pack-5e813c51d12b6847bbc0fcd97c2bca66da50079c.pack does not match index This can be observed in the server's FileLog with something like the following appearing: Sun Aug 29 19:31:39 2021 SRXAFS_FetchData, Fid = 2303380852.491776.3263114, Host 192.168.11.201:7001, Id 1001 Sun Aug 29 19:31:39 2021 CheckRights: len=0, for host=192.168.11.201:7001 Sun Aug 29 19:31:39 2021 FetchData_RXStyle: Pos 18446744071815340032, Len 3154 Sun Aug 29 19:31:39 2021 FetchData_RXStyle: file size 2400758866 ... Sun Aug 29 19:31:40 2021 SRXAFS_FetchData returns 5 Note the file position of 18446744071815340032. This is the requested file position sign-extended. Fixes: b9b1f8d5930a ("AFS: write support fixes") Reported-by: Markus Suvanto <markus.suvanto@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> Tested-by: Markus Suvanto <markus.suvanto@gmail.com> cc: linux-afs@lists.infradead.org cc: openafs-devel@openafs.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=214217#c9 [1] Link: https://lore.kernel.org/r/951332.1631308745@warthog.procyon.org.uk/
2021-09-13afs: Try to avoid taking RCU read lock when checking vnode validityDavid Howells
Try to avoid taking the RCU read lock when checking the validity of a vnode's callback state. The only thing it's needed for is to pin the parent volume's server list whilst we search it to find the record of the server we're currently using to see if it has been reinitialised (ie. it sent us a CB.InitCallBackState* RPC). Do this by the following means: (1) Keep an additional per-cell counter (fs_s_break) that's incremented each time any of the fileservers in the cell reinitialises. Since the new counter can be accessed without RCU from the vnode, we can check that first - and only if it differs, get the RCU read lock and check the volume's server list. (2) Replace afs_get_s_break_rcu() with afs_check_server_good() which now indicates whether the callback promise is still expected to be present on the server. This does the checks as described in (1). (3) Restructure afs_check_validity() to take account of the change in (2). We can also get rid of the valid variable and just use the need_clear variable with the addition of the afs_cb_break_no_promise reason. (4) afs_check_validity() probably shouldn't be altering vnode->cb_v_break and vnode->cb_s_break when it doesn't have cb_lock exclusively locked. Move the change to vnode->cb_v_break to __afs_break_callback(). Delegate the change to vnode->cb_s_break to afs_select_fileserver() and set vnode->cb_fs_s_break there also. (5) afs_validate() no longer needs to get the RCU read lock around its call to afs_check_validity() - and can skip the call entirely if we don't have a promise. Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Markus Suvanto <markus.suvanto@gmail.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/163111669583.283156.1397603105683094563.stgit@warthog.procyon.org.uk/