path: root/fs
Age  Commit message  Author
2025-01-19Merge branch 'efivarfs' into nextArd Biesheuvel
2025-01-19efivarfs: fix error on write to new variable leaving remnantsJames Bottomley
Make variable cleanup go through the fops release mechanism and use zero inode size as the indicator to delete the file. Since all EFI variables must have an initial u32 attribute, zero size occurs either because the update deleted the variable or because an unsuccessful write after create caused the size never to be set in the first place. In the case of multiple racing opens and closes, the open is counted to ensure that the zero size check is done on the last close. Even though this fixes the bug that a create either not followed by a write or followed by a write that errored would leave a remnant file for the variable, the file will appear momentarily globally visible until the last close of the fd deletes it. This is safe because the normal filesystem operations will mediate any races; however, it is still possible for a directory listing at that instant between create and close to contain a zero size variable that doesn't exist in the EFI table. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
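The last-close rule described above can be sketched in userspace C. This is only an illustrative model under stated assumptions: `struct var_file`, `var_open()` and `var_release()` are hypothetical names, not the efivarfs code, and the locking that the real filesystem relies on is omitted.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a variable file; not the efivarfs structures. */
struct var_file {
    size_t size;          /* 0 means "never committed" or "deleted" */
    unsigned open_count;  /* number of outstanding opens */
    bool removed;         /* set once the file has been removed */
};

/* Count every open so the size check can wait for the last close. */
static void var_open(struct var_file *f)
{
    f->open_count++;
}

/*
 * Drop one open. Only the last close inspects the size, and a zero
 * size at that point means the file should go away.
 * Returns true if this close removed the file.
 */
static bool var_release(struct var_file *f)
{
    if (f->open_count == 0)
        return false;             /* unbalanced close, nothing to do */
    if (--f->open_count > 0)
        return false;             /* other opens are still outstanding */
    if (f->size == 0 && !f->removed) {
        f->removed = true;
        return true;
    }
    return false;
}
```

In this model a file that was opened twice and never written is removed only by the second `var_release()`, which mirrors the "zero size check is done on the last close" wording of the commit message.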
2025-01-19efivarfs: remove unused efivarfs_listJames Bottomley
Remove all function helpers and mentions of the efivarfs_list now that all consumers of the list have been removed and entry management goes exclusively through the inode. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-01-19efivarfs: move variable lifetime management into the inodesJames Bottomley
Make the inodes the default management vehicle for struct efivar_entry, so they are now all freed automatically if the file is removed and on unmount in kill_litter_super(). Remove the now superfluous iterator to free the entries after kill_litter_super(). Also fixes a bug where some entry freeing was missing causing efivarfs to leak memory. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-01-19efivarfs: prevent setting of zero size on the inodes in the cacheJames Bottomley
Current efivarfs uses simple_setattr which allows the setting of any size in the inode cache. This is wrong because a zero size file is used to indicate an "uncommitted" variable, so by simple means of truncating the file (as root) any variable may be turned to look like it's uncommitted. Fix by adding an efivarfs_setattr routine which does not allow updating of the cached inode size (which now only comes from the underlying variable). Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-01-19erofs: remove dead code in erofs_fc_parse_paramChen Linxuan
If an option is unknown to erofs, which means that the option is not in `erofs_fs_parameters`, `fs_parse` will return -ENOPARAM, which makes `erofs_fc_parse_param` return early. Signed-off-by: Chen Linxuan <chenlinxuan@uniontech.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/DB86A4E2BB2BB44E+20250117100635.335963-2-chenlinxuan@uniontech.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2025-01-18ubifs: skip dumping tnc tree when zroot is nullpangliyuan
Clearing the slab cache frees all znodes in memory and sets c->zroot.znode = NULL; dumping the TNC tree then accesses c->zroot.znode, which causes a null pointer dereference. Link: https://bugzilla.kernel.org/show_bug.cgi?id=219624#c0 Fixes: 1e51764a3c2a ("UBIFS: add new flash file system") Signed-off-by: pangliyuan <pangliyuan1@huawei.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2025-01-18ubifs: ubifs_dump_leb: remove return from end of void functionPintu Kumar
Noticed that there is a useless return statement at the end of void function ubifs_dump_leb(). Just removed it. Signed-off-by: Pintu Kumar <quic_pintu@quicinc.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2025-01-18ubifs: dump_lpt_leb: remove return at end of void functionPintu Kumar
Noticed that there is a useless return statement at the end of void function dump_lpt_leb(). Just removing it. Signed-off-by: Pintu Kumar <quic_pintu@quicinc.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2025-01-17make take_dentry_name_snapshot() locklessAl Viro
Use ->d_seq instead of grabbing ->d_lock; in case of shortname dentries that avoids any stores to shared data objects and in case of long names we are down to (unavoidable) atomic_inc on the external_name refcount. Makes the thing safer as well - the areas where ->d_seq is held odd are all nested inside the areas where ->d_lock is held, and the latter are much more numerous. NOTE: now that there is a lockless path where we might try to grab a reference to an already doomed external_name instance, it is no longer possible for external_name.u.count and external_name.u.head to share space (kudos to Linus for spotting that). To reduce the noise this commit just makes external_name.u a struct (instead of union); the next commit will dissolve it. Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-01-17dcache: back inline names with a struct-wrapped array of unsigned longAl Viro
... so that they can be copied with struct assignment (which generates better code) and accessed word-by-word. The type is union shortname_storage; it's a union of arrays of unsigned char and unsigned long. struct name_snapshot.inline_name turned into union shortname_storage; users (all in fs/dcache.c) adjusted. struct dentry.d_iname has some users outside of fs/dcache.c; to reduce the amount of noise in commit, it is replaced with union shortname_storage d_shortname and d_iname is turned into a macro that expands to d_shortname.string (similar to d_lock handling). That compat macro is temporary - most of the remaining instances will be taken out by debugfs series, and once that is merged and few others are taken care of this will go away. Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
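A minimal userspace sketch of the idea above: one buffer viewed both as bytes and as machine words, so that it can be copied with a plain assignment and compared word by word. The names `union shortname`, `INLINE_WORDS`, `shortname_copy()` and `shortname_equal()` are hypothetical stand-ins, and the size is an arbitrary assumption, not the kernel's value.

```c
#include <stddef.h>

/* Hypothetical size; chosen only for illustration. */
#define INLINE_WORDS 5
#define INLINE_LEN   (INLINE_WORDS * sizeof(unsigned long))

/* The same storage, seen either as a string or as machine words. */
union shortname {
    unsigned long words[INLINE_WORDS];
    unsigned char string[INLINE_LEN];
};

/* Whole-object copy through plain assignment. */
static void shortname_copy(union shortname *dst, const union shortname *src)
{
    *dst = *src;
}

/* Word-by-word comparison over the full inline buffer. */
static int shortname_equal(const union shortname *a, const union shortname *b)
{
    for (size_t i = 0; i < INLINE_WORDS; i++)
        if (a->words[i] != b->words[i])
            return 0;
    return 1;
}
```

Because the byte array is defined as a whole number of words, both views cover exactly the same storage, which is what makes the word-wise access safe in this sketch.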
2025-01-17make sure that DNAME_INLINE_LEN is a multiple of word sizeAl Viro
... calling the number of words DNAME_INLINE_WORDS. The next step will be to have a structure to hold inline name arrays (both in dentry and in name_snapshot) and use that to alias the existing arrays of unsigned char there. That will allow both full-structure copies and convenient word-by-word accesses. Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-01-16Merge tag 'mm-hotfixes-stable-2025-01-16-21-11' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "7 singleton hotfixes. 6 are MM. Two are cc:stable and the remainder address post-6.12 issues" * tag 'mm-hotfixes-stable-2025-01-16-21-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: ocfs2: check dir i_size in ocfs2_find_entry mailmap: update entry for Ethan Carter Edwards mm: zswap: move allocations during CPU init outside the lock mm: khugepaged: fix call hpage_collapse_scan_file() for anonymous vma mm: shmem: use signed int for version handling in casefold option alloc_tag: skip pgalloc_tag_swap if profiling is disabled mm: page_alloc: fix missed updates of lowmem_reserve in adjust_managed_page_count
2025-01-16Merge tag '6.13-rc7-SMB3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull smb client fixes from Steve French: - fix double free when reconnect racing with closing session - fix SMB1 reconnect with password rotation * tag '6.13-rc7-SMB3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: smb: client: fix double free of TCP_Server_Info::hostname cifs: support reconnect with alternate password for SMB1
2025-01-16fs/overlayfs/namei.c: get rid of include ../internal.hAl Viro
Added for the sake of vfs_path_lookup(), which is in linux/namei.h these days. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-01-17erofs: return SHRINK_EMPTY if no objects to freeChen Linxuan
Comments in file include/linux/shrinker.h say that `count_objects` of `struct shrinker` should return SHRINK_EMPTY when there are no objects to free. > If there are no objects to free, it should return SHRINK_EMPTY, > while 0 is returned in cases of the number of freeable items cannot > be determined or shrinker should skip this cache for this time > (e.g., their number is below shrinkable limit). Signed-off-by: Chen Linxuan <chenlinxuan@uniontech.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/149E6E64B5B6B5E8+20250116083303.199817-1-chenlinxuan@uniontech.com [ Gao Xiang: should have no impact since it's not memcg-aware. ] Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
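The convention quoted above can be illustrated with a tiny userspace sketch. `SKETCH_SHRINK_EMPTY` and `sketch_count_objects()` are hypothetical names; the sentinel value is a local stand-in assumed for illustration and this is not the erofs callback itself.

```c
/* Local stand-in for the kernel's SHRINK_EMPTY sentinel (assumed value). */
#define SKETCH_SHRINK_EMPTY (~0UL - 1)

/*
 * Report how many objects could be freed. An empty cache is reported
 * with the sentinel rather than 0, since 0 is reserved for "cannot
 * tell" or "skip this cache for now" in the quoted comment.
 */
static unsigned long sketch_count_objects(unsigned long nr_cached)
{
    return nr_cached ? nr_cached : SKETCH_SHRINK_EMPTY;
}
```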
2025-01-17erofs: convert z_erofs_bind_cache() to foliosGao Xiang
The managed cache uses a pseudo inode to keep (necessary) compressed data. Currently, it still uses zero-order folios, so this is just a trivial conversion, except that the use of the pagepool is temporarily dropped. Drop some obsoleted comments too. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20250114034429.431408-4-hsiangkao@linux.alibaba.com
2025-01-17erofs: tidy up zdata.cGao Xiang
All small code style adjustments, no logic changes: - z_erofs_decompress_frontend => z_erofs_frontend; - z_erofs_decompress_backend => z_erofs_backend; - Use Z_EROFS_DEFINE_FRONTEND() to replace DECOMPRESS_FRONTEND_INIT(); - `nr_folios` should be `nrpages` in z_erofs_readahead(); - Refine in-line comments. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20250114034429.431408-3-hsiangkao@linux.alibaba.com
2025-01-17erofs: get rid of `z_erofs_next_pcluster_t`Gao Xiang
It was originally intended for tagged pointer reservation. Now all encoded data can be represented uniformly with `struct z_erofs_pcluster` as described in commit bf1aa03980f4 ("erofs: sunset `struct erofs_workgroup`"), let's drop it too. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20250114034429.431408-2-hsiangkao@linux.alibaba.com
2025-01-17erofs: simplify z_erofs_load_compact_lcluster()Gao Xiang
- Get rid of unpack_compacted_index() and fold it into z_erofs_load_compact_lcluster(); - Avoid a goto. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20250114034429.431408-1-hsiangkao@linux.alibaba.com
2025-01-17erofs: fix potential return value overflow of z_erofs_shrink_scan()Gao Xiang
z_erofs_shrink_scan() could return small numbers due to the mistyped `freed`. Although I don't think it has any visible impact. Fixes: 3883a79abd02 ("staging: erofs: introduce VLE decompression support") Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20250114040058.459981-1-hsiangkao@linux.alibaba.com
2025-01-17erofs: shorten bvecs[] for file-backed mountsGao Xiang
BIO_MAX_VECS is too large for __GFP_NOFAIL allocation. We could use a mempool (since BIOs can always proceed), but it seems overly complicated for now. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20250107082825.74242-1-hsiangkao@linux.alibaba.com
2025-01-17erofs: micro-optimize superblock checksumGao Xiang
Just verify the remaining unknown on-disk data instead of allocating a temporary buffer for the whole superblock and zeroing out the checksum field since .magic(EROFS_SUPER_MAGIC_V1) is verified and .checksum(0) is fixed. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20241212023948.1143038-1-hsiangkao@linux.alibaba.com
2025-01-17fs: erofs: xattr.c change kzalloc to kcallocEthan Carter Edwards
Refactor xattr.c to use kcalloc instead of kzalloc when multiplying allocation size by count. This refactor prevents unintentional memory overflows. Discovered by checkpatch.pl. Signed-off-by: Ethan Carter Edwards <ethan@ethancedwards.com> Link: https://lore.kernel.org/r/i3CLJhMELKzBJr3DaRyv-hP_4m-3Twx0sgBWXW6naZlMtHrIeWr93xOFshX8qZHDrJeSjHMTiUOh8JmBZ9v0AB-S1lIYM_d-vasSRlsF_s4=@ethancedwards.com Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
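The overflow concern behind this change can be shown with a userspace sketch. `zalloc_array()` is a hypothetical helper that mimics the kind of check an array allocator performs before multiplying; it is not the kernel's kcalloc().

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Allocate a zeroed array of n elements of the given size.
 * Refuses the request when n * size would wrap around, which an
 * open-coded "alloc(n * size)" would silently turn into a short
 * allocation.
 */
static void *zalloc_array(size_t n, size_t size)
{
    void *p;

    if (size != 0 && n > SIZE_MAX / size)
        return NULL;              /* n * size would overflow */
    p = malloc(n * size);
    if (p)
        memset(p, 0, n * size);
    return p;
}
```

Without the check, a count large enough to wrap the multiplication would yield a buffer much smaller than the caller expects, and later indexing would run past its end.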
2025-01-16f2fs: fix to do sanity check correctly on i_inline_xattr_sizeChao Yu
syzbot reported an out-of-range access issue as below: UBSAN: array-index-out-of-bounds in fs/f2fs/f2fs.h:3292:19 index 18446744073709550491 is out of range for type '__le32[923]' (aka 'unsigned int[923]') CPU: 0 UID: 0 PID: 5338 Comm: syz.0.0 Not tainted 6.12.0-syzkaller-10689-g7af08b57bcb9 #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120 ubsan_epilogue lib/ubsan.c:231 [inline] __ubsan_handle_out_of_bounds+0x121/0x150 lib/ubsan.c:429 read_inline_xattr+0x273/0x280 lookup_all_xattrs fs/f2fs/xattr.c:341 [inline] f2fs_getxattr+0x57b/0x13b0 fs/f2fs/xattr.c:533 vfs_getxattr_alloc+0x472/0x5c0 fs/xattr.c:393 ima_read_xattr+0x38/0x60 security/integrity/ima/ima_appraise.c:229 process_measurement+0x117a/0x1fb0 security/integrity/ima/ima_main.c:353 ima_file_check+0xd9/0x120 security/integrity/ima/ima_main.c:572 security_file_post_open+0xb9/0x280 security/security.c:3121 do_open fs/namei.c:3830 [inline] path_openat+0x2ccd/0x3590 fs/namei.c:3987 do_file_open_root+0x3a7/0x720 fs/namei.c:4039 file_open_root+0x247/0x2a0 fs/open.c:1382 do_handle_open+0x85b/0x9d0 fs/fhandle.c:414 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f index: 18446744073709550491 (decimal, unsigned long long) = 0xfffffffffffffb9b (hexadecimal) = -1125 (decimal, long long) UBSAN detects that inline_xattr_addr() tries to access .i_addr[-1125]. 
With the testcase below, the bug can be reproduced easily: - mkfs.f2fs -f -O extra_attr,flexible_inline_xattr /dev/sdb - mount -o inline_xattr_size=512 /dev/sdb /mnt/f2fs - touch /mnt/f2fs/file - umount /mnt/f2fs - inject.f2fs --node --mb i_inline --nid 4 --val 0x1 /dev/sdb - inject.f2fs --node --mb i_inline_xattr_size --nid 4 --val 2048 /dev/sdb - mount /dev/sdb /mnt/f2fs - getfattr /mnt/f2fs/file The root cause is that if the metadata of the filesystem and inode were fuzzed as below: - extra_attr feature is enabled - flexible_inline_xattr feature is enabled - ri.i_inline_xattr_size = 2048 - F2FS_EXTRA_ATTR bit in ri.i_inline was not set then sanity_check_inode() will skip the sanity check on fi->i_inline_xattr_size, resulting in an invalid inline_xattr_size being used later; fix it. Meanwhile, let's also check the lower boundary for .i_inline_xattr_size w/ MIN_INLINE_XATTR_SIZE like we did in parse_options(). There is a related issue reported by syzbot; Qasim Ijaz has analyzed and fixed it in a very similar way [1]. As discussed, we all agree that it will be better to do the sanity check in sanity_check_inode() for the fix, so finally, let's fix these two related bugs w/ the current patch. Including the commit message from Qasim's patch as below, thanks a lot for his contribution. "In f2fs_getxattr(), the function lookup_all_xattrs() allocates a 12-byte (base_size) buffer for an inline extended attribute. However, when __find_inline_xattr() calls __find_xattr(), it uses the macro "list_for_each_xattr(entry, addr)", which starts by calling XATTR_FIRST_ENTRY(addr). This skips a 24-byte struct f2fs_xattr_header at the beginning of the buffer, causing an immediate out-of-bounds read in a 12-byte allocation. The subsequent !IS_XATTR_LAST_ENTRY(entry) check then dereferences memory outside the allocated region, triggering the slab-out-of bounds read. 
This patch prevents the out-of-bounds read by adding a check to bail out early if inline_size is too small and does not account for the header plus the 4-byte value that IS_XATTR_LAST_ENTRY reads." [1]: https://lore.kernel.org/linux-f2fs-devel/Z32y1rfBY9Qb5ZjM@qasdev.system/ Fixes: 6afc662e68b5 ("f2fs: support flexible inline xattr size") Reported-by: syzbot+69f5379a1717a0b982a1@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-f2fs-devel/674f4e7d.050a0220.17bd51.004f.GAE@google.com Reported-by: syzbot <syzbot+f5e74075e096e757bdbf@syzkaller.appspotmail.com> Closes: https://syzkaller.appspot.com/bug?extid=f5e74075e096e757bdbf Tested-by: syzbot <syzbot+f5e74075e096e757bdbf@syzkaller.appspotmail.com> Tested-by: Qasim Ijaz <qasdev00@gmail.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-01-16f2fs: remove blk_finish_plugJaegeuk Kim
Let's remove unclear blk_finish_plug. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-01-16f2fs: Optimize f2fs_truncate_data_blocks_range()Yi Sun
Function f2fs_invalidate_blocks() can process consecutive blocks at a time, so f2fs_truncate_data_blocks_range() is optimized to use the new functionality of f2fs_invalidate_blocks(). Add two variables, @blkstart and @blklen: @blkstart records the first address of the consecutive blocks, and @blklen records the number of consecutive blocks. Signed-off-by: Yi Sun <yi.sun@unisoc.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-01-16Merge tag 'for-6.13-rc7-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fix from David Sterba: - handle d_path() errors when canonicalizing device mapper paths during device scan * tag 'for-6.13-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: add the missing error handling inside get_canonical_dev_path
2025-01-16gfs2: use lockref_init for qd_lockrefChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250115094702.504610-9-hch@lst.de Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-16erofs: use lockref_init for pcl->lockrefChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250115094702.504610-8-hch@lst.de Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-16dcache: use lockref_init for d_lockrefChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250115094702.504610-7-hch@lst.de Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-16fs: Fix return type of do_mount() from long to intSentaro Onizuka
Fix the return type of do_mount() function from long to int to match its actual behavior. The function only returns int values, and all callers, including those in fs/namespace.c and arch/alpha/kernel/osf_sys.c, already treat the return value as int. This change improves type consistency across the filesystem code and aligns the function signature with its existing implementation and usage. Signed-off-by: Sentaro Onizuka <sentaro@amazon.com> Link: https://lore.kernel.org/r/20250113151400.55512-1-sentaro@amazon.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-16xfs: fix buffer lookup vs release raceChristoph Hellwig
Since commit 298f34224506 ("xfs: lockless buffer lookup") the buffer lookup fastpath is done without a hash-wide lock (then pag_buf_lock, now bc_lock) and only under RCU protection. But this means that nothing serializes lookups against the temporary 0 reference count for buffers that are added to the LRU after dropping the last regular reference, and a concurrent lookup would fail to find them. Fix this by doing all b_hold modifications under b_lock. We're already doing this for release so this "only" ~ doubles the b_lock round trips. We'll later look into the lockref infrastructure to optimize the number of lock round trips again. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-01-16xfs: check for dead buffers in xfs_buf_find_insertChristoph Hellwig
Commit 32dd4f9c506b ("xfs: remove a superflous hash lookup when inserting new buffers") converted xfs_buf_find_insert to use rhashtable_lookup_get_insert_fast and thus an operation that returns the existing buffer when an insert would duplicate the hash key. But this code path misses the check for a buffer with a reference count of zero, which could lead to reusing an about to be freed buffer. Fix this by using the same atomic_inc_not_zero pattern as xfs_buf_insert. Fixes: 32dd4f9c506b ("xfs: remove a superflous hash lookup when inserting new buffers") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Cc: stable@vger.kernel.org # v6.0 Signed-off-by: Carlos Maiolino <cem@kernel.org>
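The "increment unless zero" pattern named above can be sketched with C11 atomics. `ref_get_unless_zero()` is a hypothetical userspace analogue written for illustration; it is not the kernel's atomic_inc_not_zero() and not the xfs buffer code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Take a reference only if the object is still live.
 * A count of zero means the object is about to be freed, so the
 * caller must treat the lookup as a miss instead of reviving it.
 */
static bool ref_get_unless_zero(atomic_int *ref)
{
    int old = atomic_load(ref);

    while (old != 0) {
        /* On failure, old is reloaded with the current value. */
        if (atomic_compare_exchange_weak(ref, &old, old + 1))
            return true;
    }
    return false;
}
```

A lookup that finds an existing entry would call this helper and fall back to its "not found" path when it returns false.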
2025-01-15ksmbd: fix integer overflows on 32 bit systemsDan Carpenter
On 32bit systems the addition operations in ipc_msg_alloc() can potentially overflow leading to memory corruption. Add bounds checking using KSMBD_IPC_MAX_PAYLOAD to avoid overflow. Fixes: 0626e6641f6b ("cifsd: add server handler for central processing and tranport layers") Cc: stable@vger.kernel.org Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
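A hedged sketch of the kind of bounds check described above, using `uint32_t` to model a 32-bit size type. `SKETCH_MAX_PAYLOAD` and `msg_alloc_size()` are hypothetical; the limit is an arbitrary assumption and does not reflect the value of KSMBD_IPC_MAX_PAYLOAD.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical payload limit, chosen only for illustration. */
#define SKETCH_MAX_PAYLOAD 4096u

/*
 * Compute header + payload without wrapping a 32-bit size.
 * Returns false when the payload exceeds the limit or when the
 * sum would not fit; *out is written only on success.
 */
static bool msg_alloc_size(uint32_t hdr, uint32_t payload, uint32_t *out)
{
    if (payload > SKETCH_MAX_PAYLOAD)
        return false;
    if (hdr > UINT32_MAX - payload)
        return false;
    *out = hdr + payload;
    return true;
}
```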
2025-01-15ksmbd: browse interfaces list on FSCTL_QUERY_INTERFACE_INFO IOCTLNamjae Jeon
ksmbd.mount now passes both the interfaces list and the bind_interfaces_only flag to the ksmbd server. Previously, the interfaces list was sent only when bind_interfaces_only was enabled. The ksmbd server browses only the interfaces list given in ksmbd.conf on the FSCTL_QUERY_INTERFACE_INFO IOCTL. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-01-15ksmbd: Remove unused functionsDr. David Alan Gilbert
ksmbd_rpc_rap() was added in 2021 as part of commit 0626e6641f6b ("cifsd: add server handler for central processing and tranport layers") ksmbd_vfs_posix_lock_wait_timeout() was added in 2021 as part of commit f44158485826 ("cifsd: add file operations") both have remained unused. Remove them. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-01-15ocfs2: check dir i_size in ocfs2_find_entrySu Yue
syz reports an out of bounds read: ================================================================== BUG: KASAN: slab-out-of-bounds in ocfs2_match fs/ocfs2/dir.c:334 [inline] BUG: KASAN: slab-out-of-bounds in ocfs2_search_dirblock+0x283/0x6e0 fs/ocfs2/dir.c:367 Read of size 1 at addr ffff88804d8b9982 by task syz-executor.2/14802 CPU: 0 UID: 0 PID: 14802 Comm: syz-executor.2 Not tainted 6.13.0-rc4 #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Sched_ext: serialise (enabled+all), task: runnable_at=-10ms Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x229/0x350 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0x164/0x530 mm/kasan/report.c:489 kasan_report+0x147/0x180 mm/kasan/report.c:602 ocfs2_match fs/ocfs2/dir.c:334 [inline] ocfs2_search_dirblock+0x283/0x6e0 fs/ocfs2/dir.c:367 ocfs2_find_entry_id fs/ocfs2/dir.c:414 [inline] ocfs2_find_entry+0x1143/0x2db0 fs/ocfs2/dir.c:1078 ocfs2_find_files_on_disk+0x18e/0x530 fs/ocfs2/dir.c:1981 ocfs2_lookup_ino_from_name+0xb6/0x110 fs/ocfs2/dir.c:2003 ocfs2_lookup+0x30a/0xd40 fs/ocfs2/namei.c:122 lookup_open fs/namei.c:3627 [inline] open_last_lookups fs/namei.c:3748 [inline] path_openat+0x145a/0x3870 fs/namei.c:3984 do_filp_open+0xe9/0x1c0 fs/namei.c:4014 do_sys_openat2+0x135/0x1d0 fs/open.c:1402 do_sys_open fs/open.c:1417 [inline] __do_sys_openat fs/open.c:1433 [inline] __se_sys_openat fs/open.c:1428 [inline] __x64_sys_openat+0x15d/0x1c0 fs/open.c:1428 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf6/0x210 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f01076903ad Code: c3 e8 a7 2b 00 00 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f01084acfc8 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 RAX: ffffffffffffffda RBX: 
00007f01077cbf80 RCX: 00007f01076903ad RDX: 0000000000105042 RSI: 0000000020000080 RDI: ffffffffffffff9c RBP: 00007f01077cbf80 R08: 0000000000000000 R09: 0000000000000000 R10: 00000000000001ff R11: 0000000000000246 R12: 0000000000000000 R13: 00007f01077cbf80 R14: 00007f010764fc90 R15: 00007f010848d000 </TASK> ================================================================== And a general protection fault in ocfs2_prepare_dir_for_insert: ================================================================== loop0: detected capacity change from 0 to 32768 JBD2: Ignoring recovery information on journal ocfs2: Mounting device (7,0) on (node local, slot 0) with ordered data mode. Oops: general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN NOPTI KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f] CPU: 0 UID: 0 PID: 5096 Comm: syz-executor792 Not tainted 6.11.0-rc4-syzkaller-00002-gb0da640826ba #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 RIP: 0010:ocfs2_find_dir_space_id fs/ocfs2/dir.c:3406 [inline] RIP: 0010:ocfs2_prepare_dir_for_insert+0x3309/0x5c70 fs/ocfs2/dir.c:4280 Code: 00 00 e8 2a 25 13 fe e9 ba 06 00 00 e8 20 25 13 fe e9 4f 01 00 00 e8 16 25 13 fe 49 8d 7f 08 49 8d 5f 09 48 89 f8 48 c1 e8 03 <42> 0f b6 04 20 84 c0 0f 85 bd 23 00 00 48 89 d8 48 c1 e8 03 42 0f RSP: 0018:ffffc9000af9f020 EFLAGS: 00010202 RAX: 0000000000000001 RBX: 0000000000000009 RCX: ffff88801e27a440 RDX: 0000000000000000 RSI: 0000000000000400 RDI: 0000000000000008 RBP: ffffc9000af9f830 R08: ffffffff8380395b R09: ffffffff838090a7 R10: 0000000000000002 R11: ffff88801e27a440 R12: dffffc0000000000 R13: ffff88803c660878 R14: f700000000000088 R15: 0000000000000000 FS: 000055555a677380(0000) GS:ffff888020800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000560bce569178 CR3: 000000001de5a000 CR4: 0000000000350ef0 DR0: 0000000000000000 
DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ocfs2_mknod+0xcaf/0x2b40 fs/ocfs2/namei.c:292 vfs_mknod+0x36d/0x3b0 fs/namei.c:4088 do_mknodat+0x3ec/0x5b0 __do_sys_mknodat fs/namei.c:4166 [inline] __se_sys_mknodat fs/namei.c:4163 [inline] __x64_sys_mknodat+0xa7/0xc0 fs/namei.c:4163 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f2dafda3a99 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 17 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffe336a6658 EFLAGS: 00000246 ORIG_RAX: 0000000000000103 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f2dafda3a99 RDX: 00000000000021c0 RSI: 0000000020000040 RDI: 00000000ffffff9c RBP: 00007f2dafe1b5f0 R08: 0000000000004480 R09: 000055555a6784c0 R10: 0000000000000103 R11: 0000000000000246 R12: 00007ffe336a6680 R13: 00007ffe336a68a8 R14: 431bde82d7b634db R15: 00007f2dafdec03b </TASK> ================================================================== Both reports are caused by an invalid negative i_size of the dir inode. For ocfs2, a dir inode's i_size can't be negative or zero. Add a check in ocfs2_find_entry(), which is called by ocfs2_check_dir_for_entry(). This fixes the second report because ocfs2_check_dir_for_entry() must be called before ocfs2_prepare_dir_for_insert(). Also set an upper limit for dirs with OCFS2_INLINE_DATA_FL: the i_size can't be greater than the blocksize. 
Link: https://lkml.kernel.org/r/20250106140640.92260-1-glass.su@suse.com Reported-by: Jiacheng Xu <stitch@zju.edu.cn> Link: https://lore.kernel.org/ocfs2-devel/17a04f01.1ae74.19436d003fc.Coremail.stitch@zju.edu.cn/T/#u Reported-by: syzbot+5a64828fcc4c2ad9b04f@syzkaller.appspotmail.com Link: https://lore.kernel.org/all/0000000000005894f3062018caf1@google.com/T/ Signed-off-by: Su Yue <glass.su@suse.com> Reviewed-by: Heming Zhao <heming.zhao@suse.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
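The size rules stated in this commit message can be restated as a small userspace predicate. `dir_size_valid()` and its parameters are hypothetical names for illustration only; the sketch encodes just the two conditions from the message and is not the ocfs2 code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Validate a directory size read from disk.
 *  - it must be strictly positive;
 *  - for an inline-data directory it must not exceed the block size.
 */
static bool dir_size_valid(int64_t i_size, bool inline_data, uint32_t blocksize)
{
    if (i_size <= 0)
        return false;
    if (inline_data && i_size > (int64_t)blocksize)
        return false;
    return true;
}
```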
2025-01-15smb: client: fix double free of TCP_Server_Info::hostnamePaulo Alcantara
When shutting down the server in cifs_put_tcp_session(), cifsd thread might be reconnecting to multiple DFS targets before it realizes it should exit the loop, so @server->hostname can't be freed as long as cifsd thread isn't done. Otherwise the following can happen: RIP: 0010:__slab_free+0x223/0x3c0 Code: 5e 41 5f c3 cc cc cc cc 4c 89 de 4c 89 cf 44 89 44 24 08 4c 89 1c 24 e8 fb cf 8e 00 44 8b 44 24 08 4c 8b 1c 24 e9 5f fe ff ff <0f> 0b 41 f7 45 08 00 0d 21 00 0f 85 2d ff ff ff e9 1f ff ff ff 80 RSP: 0018:ffffb26180dbfd08 EFLAGS: 00010246 RAX: ffff8ea34728e510 RBX: ffff8ea34728e500 RCX: 0000000000800068 RDX: 0000000000800068 RSI: 0000000000000000 RDI: ffff8ea340042400 RBP: ffffe112041ca380 R08: 0000000000000001 R09: 0000000000000000 R10: 6170732e31303000 R11: 70726f632e786563 R12: ffff8ea34728e500 R13: ffff8ea340042400 R14: ffff8ea34728e500 R15: 0000000000800068 FS: 0000000000000000(0000) GS:ffff8ea66fd80000(0000) 000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffc25376080 CR3: 000000012a2ba001 CR4: PKRU: 55555554 Call Trace: <TASK> ? show_trace_log_lvl+0x1c4/0x2df ? show_trace_log_lvl+0x1c4/0x2df ? __reconnect_target_unlocked+0x3e/0x160 [cifs] ? __die_body.cold+0x8/0xd ? die+0x2b/0x50 ? do_trap+0xce/0x120 ? __slab_free+0x223/0x3c0 ? do_error_trap+0x65/0x80 ? __slab_free+0x223/0x3c0 ? exc_invalid_op+0x4e/0x70 ? __slab_free+0x223/0x3c0 ? asm_exc_invalid_op+0x16/0x20 ? __slab_free+0x223/0x3c0 ? extract_hostname+0x5c/0xa0 [cifs] ? extract_hostname+0x5c/0xa0 [cifs] ? __kmalloc+0x4b/0x140 __reconnect_target_unlocked+0x3e/0x160 [cifs] reconnect_dfs_server+0x145/0x430 [cifs] cifs_handle_standard+0x1ad/0x1d0 [cifs] cifs_demultiplex_thread+0x592/0x730 [cifs] ? __pfx_cifs_demultiplex_thread+0x10/0x10 [cifs] kthread+0xdd/0x100 ? 
__pfx_kthread+0x10/0x10 ret_from_fork+0x29/0x50 </TASK> Fixes: 7be3248f3139 ("cifs: To match file servers, make sure the server hostname matches") Reported-by: Jay Shin <jaeshin@redhat.com> Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-01-15bcachefs: Fix check_inode_hash_info_matches_root()Kent Overstreet
Can't use memcmp() when the struct contains padding. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
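Why memcmp() is unreliable here can be sketched in userspace C. `struct hash_info` and `hash_info_equal()` are hypothetical and only resemble the situation in the commit; the point is that padding bytes are not part of the logical value, so the comparison should go field by field.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * On common ABIs the compiler inserts padding after 'type' so that
 * 'seed' is suitably aligned. Those padding bytes have no defined
 * content and may differ between two logically equal objects.
 */
struct hash_info {
    uint8_t  type;
    uint64_t seed;
};

/* Compare only the named fields, never the raw bytes. */
static bool hash_info_equal(const struct hash_info *a,
                            const struct hash_info *b)
{
    return a->type == b->type && a->seed == b->seed;
}
```

Two objects whose backing memory was filled differently before the fields were assigned can compare unequal with memcmp() even though every field matches, whereas the field-wise helper reports them as equal.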
2025-01-15ceph: fix memory leak in ceph_mds_auth_match()Antoine Viallon
We now free the temporary target path substring allocation on every possible branch, instead of omitting the default branch. In some cases, a memory leak occurred, which could rapidly crash the system (depending on how many file accesses were attempted). This was detected in production because it caused a continuous memory growth, eventually triggering kernel OOM and completely hard-locking the kernel. Relevant kmemleak stacktrace: unreferenced object 0xffff888131e69900 (size 128): comm "git", pid 66104, jiffies 4295435999 hex dump (first 32 bytes): 76 6f 6c 75 6d 65 73 2f 63 6f 6e 74 61 69 6e 65 volumes/containe 72 73 2f 67 69 74 65 61 2f 67 69 74 65 61 2f 67 rs/gitea/gitea/g backtrace (crc 2f3bb450): [<ffffffffaa68fb49>] __kmalloc_noprof+0x359/0x510 [<ffffffffc32bf1df>] ceph_mds_check_access+0x5bf/0x14e0 [ceph] [<ffffffffc3235722>] ceph_open+0x312/0xd80 [ceph] [<ffffffffaa7dd786>] do_dentry_open+0x456/0x1120 [<ffffffffaa7e3729>] vfs_open+0x79/0x360 [<ffffffffaa832875>] path_openat+0x1de5/0x4390 [<ffffffffaa834fcc>] do_filp_open+0x19c/0x3c0 [<ffffffffaa7e44a1>] do_sys_openat2+0x141/0x180 [<ffffffffaa7e4945>] __x64_sys_open+0xe5/0x1a0 [<ffffffffac2cc2f7>] do_syscall_64+0xb7/0x210 [<ffffffffac400130>] entry_SYSCALL_64_after_hwframe+0x77/0x7f It can be triggered by mounting a subdirectory of a CephFS filesystem, and then trying to access files on this subdirectory with an auth token using a path-scoped capability: $ ceph auth get client.services [client.services] key = REDACTED caps mds = "allow rw fsname=cephfs path=/volumes/" caps mon = "allow r fsname=cephfs" caps osd = "allow rw tag cephfs data=cephfs" $ cat /proc/self/mounts services@[REDACTED].cephfs=/volumes/containers /ceph/containers ceph rw,noatime,name=services,secret=<hidden>,ms_mode=prefer-crc,mount_timeout=300,acl,mon_addr=[REDACTED]:3300,recover_session=clean 0 0 $ seq 1 1000000 | xargs -P32 --replace={} touch /ceph/containers/file-{} && \ seq 1 1000000 | xargs -P32 --replace={} cat 
/ceph/containers/file-{} [ idryomov: combine if statements, rename rc to path_matched and make it a bool, formatting ] Cc: stable@vger.kernel.org Fixes: 596afb0b8933 ("ceph: add ceph_mds_check_access() helper") Signed-off-by: Antoine Viallon <antoine@lesviallon.fr> Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
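The pattern the fix describes, freeing a temporary substring on every branch and reporting the result through a bool, can be sketched in userspace C. This is not the ceph code: auth_path_matches(), its parameters, and the component-boundary matching rule are hypothetical stand-ins chosen only to show a single exit path that always frees the copy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in, not the kernel function: report whether
 * `target` lies under the directory `scope` (both given without
 * leading or trailing slashes).
 */
bool auth_path_matches(const char *scope, const char *target)
{
	size_t scope_len, target_len;
	bool path_matched = false;
	char *prefix;

	assert(scope && target);
	scope_len = strlen(scope);
	target_len = strlen(target);

	if (scope_len == 0)
		return true;		/* empty scope matches everything */
	if (target_len < scope_len)
		return false;		/* nothing allocated yet */

	/* Temporary copy of the leading part of the target path. */
	prefix = malloc(scope_len + 1);
	if (!prefix)
		return false;
	memcpy(prefix, target, scope_len);
	prefix[scope_len] = '\0';

	/* Match only on a whole path component boundary. */
	if (strcmp(prefix, scope) == 0 &&
	    (target[scope_len] == '/' || target[scope_len] == '\0'))
		path_matched = true;

	/* Single exit: the copy is freed whether or not it matched. */
	free(prefix);
	return path_matched;
}
```

Routing both outcomes through one free() is what keeps a non-matching path from leaking the allocation on each open.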
2025-01-15saner replacement for debugfs_rename()Al Viro
Existing primitive has several problems: 1) calling conventions are clumsy - it returns a dentry reference that is either identical to its second argument or is an ERR_PTR(-E...); in both cases no refcount changes happen. Inconvenient for users and bug-prone; it would be better to have it return 0 on success and -E... on failure. 2) it allows cross-directory moves; however, no such callers have ever materialized and considering the way debugfs is used, it's unlikely to happen in the future. What's more, any such caller would have fun issues to deal with wrt interplay with recursive removal. It also makes the calling conventions clumsier... 3) tautological rename fails; the callers have no race-free way to deal with that. 4) new name must have been formed by the caller; quite a few callers have it done by sprintf/kasprintf/etc., ending up with considerable boilerplate. Proposed replacement: int debugfs_change_name(dentry, fmt, ...). All callers convert to that easily, and it's simpler internally. IMO debugfs_rename() should go; if we ever get a real-world use case for cross-directory moves in debugfs, we can always look into the right way to handle that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Link: https://lore.kernel.org/r/20250112080705.141166-21-viro@zeniv.linux.org.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
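The calling convention proposed above (printf-style name, 0 on success, -E... on failure) can be modelled in userspace C. This is only a mock under stated assumptions: struct entry, change_name(), and NAME_MAX_LEN are hypothetical, and treating a rename to the current name as success is one possible answer to point 3, not a statement about the kernel implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define NAME_MAX_LEN 32	/* hypothetical limit for this mock */

/* Hypothetical userspace stand-in for a named debugfs entry. */
struct entry {
	char name[NAME_MAX_LEN];
};

/*
 * Mock of the proposed convention: build the new name from
 * printf-style arguments, return 0 on success or -E... on failure.
 */
int change_name(struct entry *e, const char *fmt, ...)
{
	char buf[NAME_MAX_LEN];
	va_list ap;
	int len;

	assert(e && fmt);

	va_start(ap, fmt);
	len = vsnprintf(buf, sizeof(buf), fmt, ap);
	va_end(ap);

	if (len <= 0)
		return -EINVAL;		/* formatting error or empty name */
	if ((size_t)len >= sizeof(buf))
		return -ENAMETOOLONG;	/* name did not fit; entry untouched */

	/* Renaming to the current name is a no-op here, not an error. */
	if (strcmp(e->name, buf) == 0)
		return 0;

	memcpy(e->name, buf, (size_t)len + 1);
	return 0;
}
```

A caller that previously formatted a name with sprintf() and then compared the returned pointer would reduce to a single call such as change_name(&e, "eth%d", 1) followed by an integer error check.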
2025-01-15orangefs-debugfs: don't mess with ->d_nameAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Link: https://lore.kernel.org/r/20250112080705.141166-20-viro@zeniv.linux.org.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-15debugfs: allow to store an additional opaque pointer at file creationAl Viro
Set by debugfs_create_file_aux(name, mode, parent, data, aux, fops). Plain debugfs_create_file() has it set to NULL. Accessed by debugfs_get_aux(file). Convenience macros for numeric opaque data - debugfs_create_file_aux_num and debugfs_get_aux_num, resp. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20250112080705.141166-5-viro@zeniv.linux.org.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-15debugfs: don't mess with bits in ->d_fsdataAl Viro
The reason we need that crap is the dual use ->d_fsdata has there - it's both holding a debugfs_fsdata reference after the first debugfs_file_get() (actually, after the call of proxy ->open()) *and* it serves as a place to stash a reference to real file_operations from object creation to the first open. Oh, and it's triple use, actually - that stashed reference might be to debugfs_short_fops. Bugger that for a game of soldiers - just put the operations reference into debugfs-private augmentation of inode. And split debugfs_full_file_operations into full and short cases, so that debugfs_file_get() could tell one from another. Voila - ->d_fsdata holds NULL until the first (successful) debugfs_file_get() and a reference to struct debugfs_fsdata afterwards. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20250112080705.141166-4-viro@zeniv.linux.org.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-15debugfs: get rid of dynamically allocation proxy_opsAl Viro
All it takes is having full_proxy_open() collect the information about available methods and store it in debugfs_fsdata. Wrappers are called only after full_proxy_open() has succeeded calling debugfs_file_get(), so they are guaranteed to have ->d_fsdata already pointing to debugfs_fsdata. As a result, they can check whether a method is absent and bugger off early, without any atomic operations, etc. - same effect as what we'd have from NULL method. Which makes the entire proxy_fops contents unconditional, making it completely pointless - we can just put those methods (unconditionally) into debugfs_full_proxy_file_operations and forget about dynamic allocation, replace_fops, etc. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20250112080705.141166-3-viro@zeniv.linux.org.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-15debugfs: move ->automount into debugfs_inode_infoAl Viro
... and don't bother with debugfs_fsdata for those. Life's simpler that way... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20250112080705.141166-2-viro@zeniv.linux.org.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-15debugfs: separate cache for debugfs inodesAl Viro
Embed them into container (struct debugfs_inode_info, with nothing else in it at the moment), set the cache up, etc. Just the infrastructure changes letting us augment debugfs inodes here; adding stuff will come at the next step. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20250112080705.141166-1-viro@zeniv.linux.org.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
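Embedding a generic inode in a filesystem-private container, as described above, is usually paired with a container_of()-style accessor. The following is a userspace sketch of that idiom only: struct inode here is a one-field stand-in, and struct my_inode_info, its aux field, and MY_I() are hypothetical names, not the debugfs definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the generic VFS inode; the field is illustrative only. */
struct inode {
	unsigned long i_ino;
};

/*
 * Hypothetical container: filesystem-private fields live next to
 * the embedded generic inode. `aux` is an illustrative extra field.
 */
struct my_inode_info {
	struct inode vfs_inode;
	void *aux;
};

/* Userspace equivalent of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Recover the container from a pointer to its embedded inode. */
struct my_inode_info *MY_I(struct inode *inode)
{
	assert(inode);
	return container_of(inode, struct my_inode_info, vfs_inode);
}
```

Because generic code only ever sees the embedded struct inode, private fields can later be added to the container without touching any code outside the filesystem.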
2025-01-15afs: Fix the fallback handling for the YFS.RemoveFile2 RPC callDavid Howells
Fix a pair of bugs in the fallback handling for the YFS.RemoveFile2 RPC call: (1) Fix the abort code check to also look for RXGEN_OPCODE. The lack of this masks the second bug. (2) call->server is now not used for ordinary filesystem RPC calls that have an operation descriptor. Fix to use call->op->server instead. Fixes: e49c7b2f6de7 ("afs: Build an abstraction around an "operation" concept") Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/109541.1736865963@warthog.procyon.org.uk cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-14nfs: probe for LOCALIO when v3 client reconnects to serverMike Snitzer
Re-enabling NFSv3 LOCALIO is made more complex (than NFSv4) because v3 is stateless. As such, the heuristic used to identify a LOCALIO probe point is more ad hoc by nature: if/when NFSv3 client IO begins to complete again in terms of normal RPC-based NFSv3 server IO, attempt nfs_local_probe_async(). Care is taken to throttle the frequency of nfs_local_probe_async(), otherwise there could be a flood of repeat calls to nfs_local_probe_async(). The throttle is admin controlled using a new module parameter for nfsv3, e.g.: echo 512 > /sys/module/nfsv3/parameters/nfs3_localio_probe_throttle Probe for NFSv3 LOCALIO every N IO requests (512 in this case). Must be power-of-2, defaults to 0 (probing disabled). On systems that expect to use LOCALIO with NFSv3 the admin should configure the 'nfs3_localio_probe_throttle' module parameter. This commit backfills module parameter documentation in localio.rst. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
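One plausible reason for the power-of-2 requirement is that "every N requests" then reduces to a mask on a running counter. The userspace sketch below illustrates that idea under assumptions: should_probe(), is_power_of_2(), and the counter handling are hypothetical and are not taken from the NFS client code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true if n is a nonzero power of two. */
bool is_power_of_2(unsigned int n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical throttle check: given a running count of completed
 * IO requests, return true once every `throttle` calls.
 * A throttle of 0 disables probing and leaves the counter alone.
 */
bool should_probe(unsigned int *io_count, unsigned int throttle)
{
	assert(io_count);
	if (throttle == 0)
		return false;
	assert(is_power_of_2(throttle));

	/* With a power-of-2 throttle, the mask replaces a modulo. */
	return ((*io_count)++ & (throttle - 1)) == 0;
}
```

With a throttle of 512, the check fires on requests 0, 512, 1024, and so on, so a burst of completions triggers at most one probe attempt per 512 requests.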