summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2023-03-23ext4: remove unused group parameter in ext4_block_bitmap_csum_verifyKemeng Shi
Remove unused group parameter in ext4_block_bitmap_csum_verify. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230221203027.2359920-4-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-03-23ext4: remove unused group parameter in ext4_inode_bitmap_csum_setKemeng Shi
Remove unused group parameter in ext4_inode_bitmap_csum_set. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230221203027.2359920-3-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-03-23ext4: remove unused group parameter in ext4_inode_bitmap_csum_verifyKemeng Shi
Remove unused group parameter in ext4_inode_bitmap_csum_verify. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230221203027.2359920-2-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-03-23ext4: improve inode table blocks counting in ext4_num_overhead_clustersKemeng Shi
As inode table blocks are contiguous, inode table blocks inside the block_group can be represented as range [itbl_cluster_start, itbl_cluster_last]. Then we can simply account inode table cluters and check cluster overlap with [itbl_cluster_start, itbl_cluster_last] instead of traverse each block of inode table. By the way, this patch fixes code style problem of comment for ext4_num_overhead_clusters. [ Merged fix-up patch which fixed potentially access to an uninitialzied stack variable. --TYT ] Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Link: https://lore.kernel.org/r/20230221115919.1918161-8-shikemeng@huaweicloud.com Link: https://lore.kernel.org/r/202303171446.eLEhZzAu-lkp@intel.com/ Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-03-23ext4: stop trying to verify just initialized bitmap in ↵Kemeng Shi
ext4_read_block_bitmap_nowait For case we initialize a bitmap bh, we will set bitmap bh verified. We can return immediately instead of goto verify to remove unnecessary work for trying to verify bitmap bh. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Link: https://lore.kernel.org/r/20230221115919.1918161-7-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-03-23ext4: remove stale comment in ext4_init_block_bitmapKemeng Shi
Commit bdfb6ff4a255d ("ext4: mark group corrupt on group descriptor checksum") added flag to indicate corruption of group instead of marking all blocks used. Just remove stale comment. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Link: https://lore.kernel.org/r/20230221115919.1918161-6-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-03-23ext4: remove unnecessary check in ext4_bg_num_gdb_nometaKemeng Shi
We only call ext4_bg_num_gdb_nometa if there is no meta_bg feature or group does not reach first_meta_bg, so group must not reside in meta group. Then group has a global gdt if superblock exists. Just remove confusing branch that meta_bg feature exists. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Link: https://lore.kernel.org/r/20230221115919.1918161-5-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-03-23ext4: call ext4_bg_num_gdb_[no]meta directly in ext4_num_base_meta_clustersKemeng Shi
ext4_num_base_meta_clusters is already aware of meta_bg feature and test if block_group is inside real meta block groups before calling ext4_bg_num_gdb. Then ext4_bg_num_gdb will check if block group is inside a real meta block groups again to decide either ext4_bg_num_gdb_meta or ext4_bg_num_gdb_nometa is needed. Call ext4_bg_num_gdb_meta or ext4_bg_num_gdb_nometa directly after we check if block_group is inside a meta block groups in ext4_num_base_meta_clusters to remove redundant check of meta block groups in ext4_bg_num_gdb. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Link: https://lore.kernel.org/r/20230221115919.1918161-4-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-03-23ext4: correct validation check of inode table in ext4_valid_block_bitmapKemeng Shi
1.Last valid cluster of inode table is EXT4_B2C(sbi, offset + sbi->s_itb_per_group - 1). We should make sure last valid cluster is < max_bit, i.e., EXT4_B2C(sbi, offset + sbi->s_itb_per_group - 1) is < max_bit rather than EXT4_B2C(sbi, offset + sbi->s_itb_per_group) is < max_bit. 2.Bit search length should be last valid cluster plus 1, i.e., EXT4_B2C(sbi, offset + sbi->s_itb_per_group - 1) + 1 rather than EXT4_B2C(sbi, offset + sbi->s_itb_per_group). Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Link: https://lore.kernel.org/r/20230221115919.1918161-3-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-03-23ext4: properly handle error of ext4_init_block_bitmap in ↵Kemeng Shi
ext4_read_block_bitmap_nowait We mark buffer_head of bitmap successfully initialized even error occurs in ext4_init_block_bitmap. Although we will return error, we will get a invalid buffer_head of bitmap from next ext4_read_block_bitmap_nowait which is marked buffer_verified but not successfully initialized actually in previous ext4_read_block_bitmap_nowait. Fix this by only marking buffer_head successfully initialized if ext4_init_block_bitmap successes. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Link: https://lore.kernel.org/r/20230221115919.1918161-2-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-03-23ext4: fix comment: "start start" -> "start" in mpage_prepare_extent_to_map()Theodore Ts'o
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-03-23ext4: Fix warnings when freezing filesystem with journaled dataJan Kara
Test generic/390 in data=journal mode often triggers a warning that ext4_do_writepages() tries to start a transaction on frozen filesystem. This happens because although all dirty data is properly written, jbd2 checkpointing code writes data through submit_bh() and as a result only buffer dirty bits are cleared but page dirty bits stay set. Later when the filesystem is frozen, writeback code comes, tries to write supposedly dirty pages and the warning triggers. Fix the problem by calling sync_filesystem() once more after flushing the whole journal to clear stray page dirty bits. [ Applied fixup patches to address crashes when running data=journal tests; see links for more details -- TYT ] Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230308142528.12384-1-jack@suse.cz Reported-by: Eric Biggers <ebiggers@kernel.org> Link: https://lore.kernel.org/all/20230319183617.GA896@sol.localdomain Link: https://lore.kernel.org/r/20230323145404.21381-1-jack@suse.cz Link: https://lore.kernel.org/r/20230323145404.21381-2-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-03-23nilfs2: fix kernel-infoleak in nilfs_ioctl_wrap_copy()Ryusuke Konishi
The ioctl helper function nilfs_ioctl_wrap_copy(), which exchanges a metadata array to/from user space, may copy uninitialized buffer regions to user space memory for read-only ioctl commands NILFS_IOCTL_GET_SUINFO and NILFS_IOCTL_GET_CPINFO. This can occur when the element size of the user space metadata given by the v_size member of the argument nilfs_argv structure is larger than the size of the metadata element (nilfs_suinfo structure or nilfs_cpinfo structure) on the file system side. KMSAN-enabled kernels detect this issue as follows: BUG: KMSAN: kernel-infoleak in instrument_copy_to_user include/linux/instrumented.h:121 [inline] BUG: KMSAN: kernel-infoleak in _copy_to_user+0xc0/0x100 lib/usercopy.c:33 instrument_copy_to_user include/linux/instrumented.h:121 [inline] _copy_to_user+0xc0/0x100 lib/usercopy.c:33 copy_to_user include/linux/uaccess.h:169 [inline] nilfs_ioctl_wrap_copy+0x6fa/0xc10 fs/nilfs2/ioctl.c:99 nilfs_ioctl_get_info fs/nilfs2/ioctl.c:1173 [inline] nilfs_ioctl+0x2402/0x4450 fs/nilfs2/ioctl.c:1290 nilfs_compat_ioctl+0x1b8/0x200 fs/nilfs2/ioctl.c:1343 __do_compat_sys_ioctl fs/ioctl.c:968 [inline] __se_compat_sys_ioctl+0x7dd/0x1000 fs/ioctl.c:910 __ia32_compat_sys_ioctl+0x93/0xd0 fs/ioctl.c:910 do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline] __do_fast_syscall_32+0xa2/0x100 arch/x86/entry/common.c:178 do_fast_syscall_32+0x37/0x80 arch/x86/entry/common.c:203 do_SYSENTER_32+0x1f/0x30 arch/x86/entry/common.c:246 entry_SYSENTER_compat_after_hwframe+0x70/0x82 Uninit was created at: __alloc_pages+0x9f6/0xe90 mm/page_alloc.c:5572 alloc_pages+0xab0/0xd80 mm/mempolicy.c:2287 __get_free_pages+0x34/0xc0 mm/page_alloc.c:5599 nilfs_ioctl_wrap_copy+0x223/0xc10 fs/nilfs2/ioctl.c:74 nilfs_ioctl_get_info fs/nilfs2/ioctl.c:1173 [inline] nilfs_ioctl+0x2402/0x4450 fs/nilfs2/ioctl.c:1290 nilfs_compat_ioctl+0x1b8/0x200 fs/nilfs2/ioctl.c:1343 __do_compat_sys_ioctl fs/ioctl.c:968 [inline] __se_compat_sys_ioctl+0x7dd/0x1000 fs/ioctl.c:910 __ia32_compat_sys_ioctl+0x93/0xd0 fs/ioctl.c:910 do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline] __do_fast_syscall_32+0xa2/0x100 arch/x86/entry/common.c:178 do_fast_syscall_32+0x37/0x80 arch/x86/entry/common.c:203 do_SYSENTER_32+0x1f/0x30 arch/x86/entry/common.c:246 entry_SYSENTER_compat_after_hwframe+0x70/0x82 Bytes 16-127 of 3968 are uninitialized ... This eliminates the leak issue by initializing the page allocated as buffer using get_zeroed_page(). Link: https://lkml.kernel.org/r/20230307085548.6290-1-konishi.ryusuke@gmail.com Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by: syzbot+132fdd2f1e1805fdc591@syzkaller.appspotmail.com Link: https://lkml.kernel.org/r/000000000000a5bd2d05f63f04ae@google.com Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-03-23ocfs2: Switch to security_inode_init_security()Roberto Sassu
In preparation for removing security_old_inode_init_security(), switch to security_inode_init_security(). Extend the existing ocfs2_initxattrs() to take the ocfs2_security_xattr_info structure from fs_info, and populate the name/value/len triple with the first xattr provided by LSMs. As fs_info was not used before, ocfs2_initxattrs() can now handle the case of replicating the behavior of security_old_inode_init_security(), i.e. just obtaining the xattr, in addition to setting all xattrs provided by LSMs. Supporting multiple xattrs is not currently supported where security_old_inode_init_security() was called (mknod, symlink), as it requires non-trivial changes that can be done at a later time. Like for reiserfs, even if EVM is invoked, it will not provide an xattr (if it is not the first to set it, its xattr will be discarded; if it is the first, it does not have xattrs to calculate the HMAC on). Finally, since security_inode_init_security(), unlike security_old_inode_init_security(), returns zero instead of -EOPNOTSUPP if no xattrs were provided by LSMs or if inodes are private, additionally check in ocfs2_init_security_get() if the xattr name is set. If not, act as if security_old_inode_init_security() returned -EOPNOTSUPP, and set si->enable to zero to notify to the functions following ocfs2_init_security_get() that no xattrs are available. Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Reviewed-by: Casey Schaufler <casey@schaufler-ca.com> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-03-23reiserfs: Switch to security_inode_init_security()Roberto Sassu
In preparation for removing security_old_inode_init_security(), switch to security_inode_init_security(). Commit 572302af1258 ("reiserfs: Add missing calls to reiserfs_security_free()") fixed possible memory leaks and another issue related to adding an xattr at inode creation time. Define the initxattrs callback reiserfs_initxattrs(), to populate the name/value/len triple in the reiserfs_security_handle() with the first xattr provided by LSMs. Make a copy of the xattr value, as security_inode_init_security() frees it. After the call to security_inode_init_security(), remove the check for returning -EOPNOTSUPP, as security_inode_init_security() changes it to zero. Multiple xattrs are currently not supported, as the reiserfs_security_handle structure is exported to user space. As a consequence, even if EVM is invoked, it will not provide an xattr (if it is not the first to set it, its xattr will be discarded; if it is the first, it does not have xattrs to calculate the HMAC on). Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Reviewed-by: Casey Schaufler <casey@schaufler-ca.com> Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-03-23Merge tag 'gfs2-v6.3-rc3-fix' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 fix from Andreas Gruenbacher: - Reinstate commit 970343cd4904 ("GFS2: free disk inode which is deleted by remote node -V2") as reverting that commit could cause gfs2_put_super() to hang. * tag 'gfs2-v6.3-rc3-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: Reinstate "GFS2: free disk inode which is deleted by remote node -V2"
2023-03-23Reinstate "GFS2: free disk inode which is deleted by remote node -V2"Bob Peterson
It turns out that reverting commit 970343cd4904 ("GFS2: free disk inode which is deleted by remote node -V2") causes a regression related to evicting inodes that were unlinked on a different cluster node. We could also have simply added a call to d_mark_dontcache() to function gfs2_try_evict(), but the original pre-revert code is better tested and proven. This reverts commit 445cb1277e10d7e19b631ef8a64aa3f055df377d. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-03-23Merge tag 'zonefs-6.3-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs Pull zonefs fixes from Damien Le Moal: - Silence a false positive smatch warning about an uninitialized variable - Fix an error message to provide more useful information about invalid zone append write results * tag 'zonefs-6.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs: zonefs: Fix error message in zonefs_file_dio_append() zonefs: Prevent uninitialized symbol 'size' warning
2023-03-23cifs: print session id while listing open filesShyam Prasad N
In the output of /proc/fs/cifs/open_files, we only print the tree id for the tcon of each open file. It becomes difficult to know which tcon these files belong to with just the tree id. This change dumps ses id in addition to all other data today. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2023-03-23cifs: dump pending mids for all channels in DebugDataShyam Prasad N
Currently, we only dump the pending mid information only on the primary channel in /proc/fs/cifs/DebugData. If multichannel is active, we do not print the pending MID list on secondary channels. This change will dump the pending mids for all the channels based on server->conn_id. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2023-03-23cifs: empty interface list when server doesn't support query interfacesShyam Prasad N
When querying server interfaces returns -EOPNOTSUPP, clear the list of interfaces. Assumption is that multichannel would be disabled too. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2023-03-23cifs: do not poll server interfaces too regularlyShyam Prasad N
We have the server interface list hanging off the tcon structure today for reasons unknown. So each tcon which is connected to a file server can query them separately, which is really unnecessary. To avoid this, in the query function, we will check the time of last update of the interface list, and avoid querying the server if it is within a certain range. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2023-03-23ext4: Convert data=journal writeback to use ext4_writepages()Jan Kara
Add support for writeback of journalled data directly into ext4_writepages() instead of offloading it to write_cache_pages(). This actually significantly simplifies the code and reduces code duplication. For checkpointing of committed data we can use ext4_writepages() rightaway the same way as writeback of ordered data uses it on transaction commit. For journalling of dirty mapped pages, we need to add a special case to mpage_prepare_extent_to_map() to add all page buffers to the journal. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20230228051319.4085470-8-tytso@mit.edu
2023-03-23ext4: Move mpage_page_done() calls after error handlingJan Kara
In case mpage_submit_page() returns error, it doesn't really matter whether we call mpage_page_done() and then return error or whether we return directly because in that case page cleanup will be done by mpage_release_unused_pages() instead. Logically, it makes more sense to leave the cleanup to mpage_release_unused_pages() because we didn't succeed in writing the page. So move mpage_page_done() calls after the error handling. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20230228051319.4085470-7-tytso@mit.edu
2023-03-23ext4: Move page unlocking out of mpage_submit_page()Jan Kara
Move page unlocking during page writeback out of mpage_submit_page() into the callers. This will allow writeback in data=journal mode to keep the page locked for a bit longer. Since page unlocking it tightly connected to increment of mpd->first_page (as that determines cleanup of locked but unwritten pages), move page unlocking as well as mpd->first_page handling into a helper function. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20230228051319.4085470-6-tytso@mit.edu
2023-03-23ext4: Don't unlock page in ext4_bio_write_page()Jan Kara
Do not unlock the written page in ext4_bio_write_page(). Instead leave the page locked and unlock it in the callers. We'll need to keep the page locked for data=journal writeback for a bit longer. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20230228051319.4085470-5-tytso@mit.edu
2023-03-23ext4: Mark page for delayed dirtying only if it is pinnedJan Kara
In data=journal mode, page should be dirtied only when it has buffers for checkpoint or it is writeably mapped. In the first case, we don't need to do anything special. In the second case, page was already added to the journal by ext4_page_mkwrite() and since transaction commit writeprotects mapped pages again, page should be writeable (and thus dirtied) only while it is part of the running transaction. So nothing needs to be done either. The only special case is when someone pins the page and uses this pin for modifying page data. So recognize this special case and only then mark the page as having data that needs adding to the journal. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20230228051319.4085470-4-tytso@mit.edu
2023-03-23ext4: Use nr_to_write directly in mpage_prepare_extent_to_map()Jan Kara
When looking up extent of pages to map in mpage_prepare_extent_to_map() we count how many pages we still need to find in a copy of wbc->nr_to_write counter. With more complex page handling for data=journal mode, it will be easier to use wbc->nr_to_write directly so that we don't forget to carry over changes back to nr_to_write counter. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20230228051319.4085470-3-tytso@mit.edu
2023-03-23ext4: Update stale comment about write constraintsJan Kara
The comment above do_journal_get_write_access() is very stale. Most of it just does not refer to what the function does today or how jbd2 works. The bit about transaction handling during write(2) is still correct so just update the function names in that part and move the comment to a more appropriate place. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20230228051319.4085470-2-tytso@mit.edu
2023-03-22cifs: lock chan_lock outside match_sessionShyam Prasad N
Coverity had rightly indicated a possible deadlock due to chan_lock being done inside match_session. All callers of match_* functions should pick up the necessary locks and call them. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Cc: stable@vger.kernel.org Fixes: 724244cdb382 ("cifs: protect session channel fields with chan_lock") Signed-off-by: Steve French <stfrench@microsoft.com>
2023-03-22ksmbd: return STATUS_NOT_SUPPORTED on unsupported smb2.0 dialectNamjae Jeon
ksmbd returned "Input/output error" when mounting with vers=2.0 to ksmbd. It should return STATUS_NOT_SUPPORTED on unsupported smb2.0 dialect. Cc: stable@vger.kernel.org Reported-by: Steve French <stfrench@microsoft.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-03-22ksmbd: don't terminate inactive sessions after a few secondsNamjae Jeon
Steve reported that inactive sessions are terminated after a few seconds. ksmbd terminate when receiving -EAGAIN error from kernel_recvmsg(). -EAGAIN means there is no data available in timeout. So ksmbd should keep connection with unlimited retries instead of terminating inactive sessions. Cc: stable@vger.kernel.org Reported-by: Steve French <stfrench@microsoft.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-03-22ksmbd: fix possible refcount leak in smb2_open()ChenXiaoSong
Reference count of acls will leak when memory allocation fails. Fix this by adding the missing posix_acl_release(). Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3") Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-03-22ksmbd: add low bound validation to FSCTL_QUERY_ALLOCATED_RANGESNamjae Jeon
Smatch static checker warning: fs/ksmbd/vfs.c:1040 ksmbd_vfs_fqar_lseek() warn: no lower bound on 'length' fs/ksmbd/vfs.c:1041 ksmbd_vfs_fqar_lseek() warn: no lower bound on 'start' Fix unexpected result that could caused from negative start and length. Fixes: f44158485826 ("cifsd: add file operations") Reported-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-03-22ksmbd: add low bound validation to FSCTL_SET_ZERO_DATANamjae Jeon
Smatch static checker warning: fs/ksmbd/smb2pdu.c:7759 smb2_ioctl() warn: no lower bound on 'off' Fix unexpected result that could caused from negative off and bfz. Fixes: b5e5f9dfc915 ("ksmbd: check invalid FileOffset and BeyondFinalZero in FSCTL_ZERO_DATA") Reported-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-03-22ksmbd: set FILE_NAMED_STREAMS attribute in FS_ATTRIBUTE_INFORMATIONNamjae Jeon
If vfs objects = streams_xattr in ksmbd.conf FILE_NAMED_STREAMS should be set to Attributes in FS_ATTRIBUTE_INFORMATION. MacOS client show "Format: SMB (Unknown)" on faked NTFS and no streams support. Cc: stable@vger.kernel.org Reported-by: Miao Lihua <441884205@qq.com> Tested-by: Miao Lihua <441884205@qq.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-03-22ksmbd: fix wrong signingkey creation when encryption is AES256Namjae Jeon
MacOS and Win11 support AES256 encrytion and it is included in the cipher array of encryption context. Especially on macOS, The most preferred cipher is AES256. Connecting to ksmbd fails on newer MacOS clients that support AES256 encryption. MacOS send disconnect request after receiving final session setup response from ksmbd. Because final session setup is signed with signing key was generated incorrectly. For signging key, 'L' value should be initialized to 128 if key size is 16bytes. Cc: stable@vger.kernel.org Reported-by: Miao Lihua <441884205@qq.com> Tested-by: Miao Lihua <441884205@qq.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-03-22NFSv4: Fix hangs when recovering open state after a server rebootTrond Myklebust
When we're using a cached open stateid or a delegation in order to avoid sending a CLAIM_PREVIOUS open RPC call to the server, we don't have a new open stateid to present to update_open_stateid(). Instead rely on nfs4_try_open_cached(), just as if we were doing a normal open. Fixes: d2bfda2e7aa0 ("NFSv4: don't reprocess cached open CLAIM_PREVIOUS") Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2023-03-22open: return EINVAL for O_DIRECTORY | O_CREATChristian Brauner
After a couple of years and multiple LTS releases we received a report that the behavior of O_DIRECTORY | O_CREAT changed starting with v5.7. On kernels prior to v5.7 combinations of O_DIRECTORY, O_CREAT, O_EXCL had the following semantics: (1) open("/tmp/d", O_DIRECTORY | O_CREAT) * d doesn't exist: create regular file * d exists and is a regular file: ENOTDIR * d exists and is a directory: EISDIR (2) open("/tmp/d", O_DIRECTORY | O_CREAT | O_EXCL) * d doesn't exist: create regular file * d exists and is a regular file: EEXIST * d exists and is a directory: EEXIST (3) open("/tmp/d", O_DIRECTORY | O_EXCL) * d doesn't exist: ENOENT * d exists and is a regular file: ENOTDIR * d exists and is a directory: open directory On kernels since to v5.7 combinations of O_DIRECTORY, O_CREAT, O_EXCL have the following semantics: (1) open("/tmp/d", O_DIRECTORY | O_CREAT) * d doesn't exist: ENOTDIR (create regular file) * d exists and is a regular file: ENOTDIR * d exists and is a directory: EISDIR (2) open("/tmp/d", O_DIRECTORY | O_CREAT | O_EXCL) * d doesn't exist: ENOTDIR (create regular file) * d exists and is a regular file: EEXIST * d exists and is a directory: EEXIST (3) open("/tmp/d", O_DIRECTORY | O_EXCL) * d doesn't exist: ENOENT * d exists and is a regular file: ENOTDIR * d exists and is a directory: open directory This is a fairly substantial semantic change that userspace didn't notice until Pedro took the time to deliberately figure out corner cases. Since no one noticed this breakage we can somewhat safely assume that O_DIRECTORY | O_CREAT combinations are likely unused. The v5.7 breakage is especially weird because while ENOTDIR is returned indicating failure a regular file is actually created. This doesn't make a lot of sense. Time was spent finding potential users of this combination. Searching on codesearch.debian.net showed that codebases often express semantical expectations about O_DIRECTORY | O_CREAT which are completely contrary to what our code has done and currently does. The expectation often is that this particular combination would create and open a directory. This suggests users who tried to use that combination would stumble upon the counterintuitive behavior no matter if pre-v5.7 or post v5.7 and quickly realize neither semantics give them what they want. For some examples see the code examples in [1] to [3] and the discussion in [4]. There are various ways to address this issue. The lazy/simple option would be to restore the pre-v5.7 behavior and to just live with that bug forever. But since there's a real chance that the O_DIRECTORY | O_CREAT quirk isn't relied upon we should try to get away with murder(ing bad semantics) first. If we need to Frankenstein pre-v5.7 behavior later so be it. So let's simply return EINVAL categorically for O_DIRECTORY | O_CREAT combinations. In addition to cleaning up the old bug this also opens up the possiblity to make that flag combination do something more intuitive in the future. Starting with this commit the following semantics apply: (1) open("/tmp/d", O_DIRECTORY | O_CREAT) * d doesn't exist: EINVAL * d exists and is a regular file: EINVAL * d exists and is a directory: EINVAL (2) open("/tmp/d", O_DIRECTORY | O_CREAT | O_EXCL) * d doesn't exist: EINVAL * d exists and is a regular file: EINVAL * d exists and is a directory: EINVAL (3) open("/tmp/d", O_DIRECTORY | O_EXCL) * d doesn't exist: ENOENT * d exists and is a regular file: ENOTDIR * d exists and is a directory: open directory One additional note, O_TMPFILE is implemented as: #define __O_TMPFILE 020000000 #define O_TMPFILE (__O_TMPFILE | O_DIRECTORY) #define O_TMPFILE_MASK (__O_TMPFILE | O_DIRECTORY | O_CREAT) For older kernels it was important to return an explicit error when O_TMPFILE wasn't supported. So O_TMPFILE requires that O_DIRECTORY is raised alongside __O_TMPFILE. It also enforced that O_CREAT wasn't specified. Since O_DIRECTORY | O_CREAT could be used to create a regular allowing that combination together with __O_TMPFILE would've meant that false positives were possible, i.e., that a regular file was created instead of a O_TMPFILE. This could've been used to trick userspace into thinking it operated on a O_TMPFILE when it wasn't. Now that we block O_DIRECTORY | O_CREAT completely the check for O_CREAT in the __O_TMPFILE branch via if ((flags & O_TMPFILE_MASK) != O_TMPFILE) can be dropped. Instead we can simply check verify that O_DIRECTORY is raised via if (!(flags & O_DIRECTORY)) and explain this in two comments. As Aleksa pointed out O_PATH is unaffected by this change since it always returned EINVAL if O_CREAT was specified - with or without O_DIRECTORY. Link: https://lore.kernel.org/lkml/20230320071442.172228-1-pedro.falcato@gmail.com Link: https://sources.debian.org/src/flatpak/1.14.4-1/subprojects/libglnx/glnx-dirfd.c/?hl=324#L324 [1] Link: https://sources.debian.org/src/flatpak-builder/1.2.3-1/subprojects/libglnx/glnx-shutil.c/?hl=251#L251 [2] Link: https://sources.debian.org/src/ostree/2022.7-2/libglnx/glnx-dirfd.c/?hl=324#L324 [3] Link: https://www.openwall.com/lists/oss-security/2014/11/26/14 [4] Reported-by: Pedro Falcato <pedro.falcato@gmail.com> Cc: Aleksa Sarai <cyphar@cyphar.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-03-21Merge tag 'nfsd-6.3-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - Fix a crash during NFS READs from certain client implementations - Address a minor kbuild regression in v6.3 * tag 'nfsd-6.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: nfsd: don't replace page in rq_pages if it's a continuation of last page NFS & NFSD: Update GSS dependencies
2023-03-21ext2: remove redundant assignment to pointer endColin Ian King
Pointer is assigned a value that is never read, the assignment is redundant and can be removed. Cleans up clang-scan warning: fs/ext2/xattr.c:555:3: warning: Value stored to 'end' is never read [deadcode.DeadStores] end = (char *)header + sb->s_blocksize; Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Message-Id: <20230317143420.419005-1-colin.i.king@gmail.com>
2023-03-20userfaultfd: move unprivileged_userfaultfd sysctl to its own fileZhangPeng
The sysctl_unprivileged_userfaultfd is part of userfaultfd, move it to its own file. Signed-off-by: ZhangPeng <zhangpeng362@huawei.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-03-20Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linuxLinus Torvalds
Pull fsverity fixes from Eric Biggers: "Fix two significant performance issues with fsverity" * tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux: fsverity: don't drop pagecache at end of FS_IOC_ENABLE_VERITY fsverity: Remove WQ_UNBOUND from fsverity read workqueue
2023-03-20Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linuxLinus Torvalds
Pull fscrypt fix from Eric Biggers: "Fix a bug where when a filesystem was being unmounted, the fscrypt keyring was destroyed before inodes have been released by the Landlock LSM. This bug was found by syzbot" * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux: fscrypt: check for NULL keyring in fscrypt_put_master_key_activeref() fscrypt: improve fscrypt_destroy_keyring() documentation fscrypt: destroy keyring after security_sb_delete()
2023-03-21zonefs: Fix error message in zonefs_file_dio_append()Damien Le Moal
Since the expected write location in a sequential file is always at the end of the file (append write), when an invalid write append location is detected in zonefs_file_dio_append(), print the invalid written location instead of the expected write location. Fixes: a608da3bd730 ("zonefs: Detect append writes at invalid locations") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
2023-03-21zonefs: Prevent uninitialized symbol 'size' warningDamien Le Moal
In zonefs_file_dio_append(), initialize the variable size to 0 to prevent compilation and static code analizers warning such as: New smatch warnings: fs/zonefs/file.c:441 zonefs_file_dio_append() error: uninitialized symbol 'size'. The warning is a false positive as size is never actually used uninitialized. No functional change. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Link: https://lore.kernel.org/r/202303191227.GL8Dprbi-lkp@intel.com/ Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
2023-03-20Merge tag 'nfs-for-6.3-2' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds
Pull NFS client fixes from Anna Schumaker: - Fix /proc/PID/io read_bytes accounting - Fix setting NLM file_lock start and end during decoding testargs - Fix timing for setting access cache timestamps * tag 'nfs-for-6.3-2' of git://git.linux-nfs.org/projects/anna/linux-nfs: NFS: Correct timing for assigning access cache timestamp lockd: set file_lock start and end when decoding nlm4 testargs NFS: Fix /proc/PID/io read_bytes for buffered reads
2023-03-20quota: make dquot_set_dqinfo return errors from ->write_infoYangtao Li
dquot_set_dqinfo() ignores the return code from the ->write_info call, which means that quotacalls like Q_SETINFO never see the error. This doesn't seem right, so fix that. Signed-off-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Jan Kara <jack@suse.cz> Message-Id: <20230227120216.31306-2-frank.li@vivo.com>
2023-03-20quota: fixup *_write_file_info() to return proper error codeYangtao Li
For v1_write_file_info function, when quota_write() returns 0, it should be considered an EIO error. And for v2_write_file_info(), fix to proper error return code instead of raw number. Signed-off-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Jan Kara <jack@suse.cz> Message-Id: <20230227120216.31306-1-frank.li@vivo.com>
2023-03-19xfs: test dir/attr hash when loading moduleDarrick J. Wong
Back in the 6.2-rc1 days, Eric Whitney reported a fstests regression in ext4 against generic/454. The cause of this test failure was the unfortunate combination of setting an xattr name containing UTF8 encoded emoji, an xattr hash function that accepted a char pointer with no explicit signedness, signed type extension of those chars to an int, and the 6.2 build tools maintainers deciding to mandate -funsigned-char across the board. As a result, the ondisk extended attribute structure written out by 6.1 and 6.2 were not the same. This discrepancy, in fact, had been noticeable if a filesystem with such an xattr were moved between any two architectures that don't employ the same signedness of a raw "char" declaration. The only reason anyone noticed is that x86 gcc defaults to signed, and no such -funsigned-char update was made to e2fsprogs, so e2fsck immediately started reporting data corruption. After a day and a half of discussing how to handle this use case (xattrs with bit 7 set anywhere in the name) without breaking existing users, Linus merged his own patch and didn't tell the maintainer. None of the ext4 developers realized this until AUTOSEL announced that the commit had been backported to stable. In the end, this problem could have been detected much earlier if there had been any useful tests of hash function(s) in use inside ext4 to make sure that they always produce the same outputs given the same inputs. The XFS dirent/xattr name hash takes a uint8_t*, so I don't think it's vulnerable to this problem. However, let's avoid all this drama by adding our own self test to check that the da hash produces the same outputs for a static pile of inputs on various platforms. This enables us to fix any breakage that may result in a controlled fashion. The buffer and test data are identical to the patches submitted to xfsprogs. Link: https://lore.kernel.org/linux-ext4/Y8bpkm3jA3bDm3eL@debian-BULLSEYE-live-builder-AMD64/ Link: https://lore.kernel.org/linux-xfs/ZBUKCRR7xvIqPrpX@destitution/T/#md38272cc684e2c0d61494435ccbb91f022e8dee4 Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>