summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2024-11-18erofs: simplify definition of the log functionsGou Hao
Use printk instead of pr_info/err to reduce redundant code. Signed-off-by: Gou Hao <gouhao@uniontech.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20241114013247.30821-1-gouhao@uniontech.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2024-11-18erofs: add sysfs node to drop internal cachesChunhai Guo
Add a sysfs node to drop compression-related caches, currently used to drop in-memory pclusters and cached compressed folios. Signed-off-by: Chunhai Guo <guochunhai@vivo.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20241113041148.749129-1-guochunhai@vivo.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2024-11-18erofs: free pclusters if no cached folio is attachedChunhai Guo
Once a pcluster is fully decompressed and there are no attached cached folios, its corresponding `struct z_erofs_pcluster` will be freed. This will significantly reduce the frequency of calls to erofs_shrink_scan() and the memory allocated for `struct z_erofs_pcluster`. The tables below show approximately a 96% reduction in the calls to erofs_shrink_scan() and in the memory allocated for `struct z_erofs_pcluster` after applying this patch. The results were obtained by performing a test to copy a 4.1GB partition on ARM64 Android devices running the 6.6 kernel with an 8-core CPU and 12GB of memory. 1. The reduction in calls to erofs_shrink_scan(): +-----------------+-----------+----------+---------+ | | w/o patch | w/ patch | diff | +-----------------+-----------+----------+---------+ | Average (times) | 11390 | 390 | -96.57% | +-----------------+-----------+----------+---------+ 2. The reduction in memory released by erofs_shrink_scan(): +-----------------+-----------+----------+---------+ | | w/o patch | w/ patch | diff | +-----------------+-----------+----------+---------+ | Average (Byte) | 133612656 | 4434552 | -96.68% | +-----------------+-----------+----------+---------+ Signed-off-by: Chunhai Guo <guochunhai@vivo.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20241112043235.546164-1-guochunhai@vivo.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2024-11-18erofs: sunset `struct erofs_workgroup`Gao Xiang
`struct erofs_workgroup` was introduced to provide a unique header for all physically indexed objects. However, after big pclusters and shared pclusters are implemented upstream, it seems that all EROFS encoded data (which requires transformation) can be represented with `struct z_erofs_pcluster` directly. Move all members into `struct z_erofs_pcluster` for simplicity. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20241021035323.3280682-3-hsiangkao@linux.alibaba.com
2024-11-18erofs: move erofs_workgroup operations into zdata.cGao Xiang
Move related helpers into zdata.c as an intermediate step of getting rid of `struct erofs_workgroup`, and rename: erofs_workgroup_put => z_erofs_put_pcluster erofs_workgroup_get => z_erofs_get_pcluster erofs_try_to_release_workgroup => erofs_try_to_release_pcluster erofs_shrink_workstation => z_erofs_shrink_scan Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20241021035323.3280682-2-hsiangkao@linux.alibaba.com
2024-11-18erofs: get rid of erofs_{find,insert}_workgroupGao Xiang
Just fold them into the only two callers since they are simple enough. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20241021035323.3280682-1-hsiangkao@linux.alibaba.com
2024-11-17smb: client: fix use-after-free of signing keyPaulo Alcantara
Customers have reported use-after-free in @ses->auth_key.response with SMB2.1 + sign mounts which occurs due to following race: task A task B cifs_mount() dfs_mount_share() get_session() cifs_mount_get_session() cifs_send_recv() cifs_get_smb_ses() compound_send_recv() cifs_setup_session() smb2_setup_request() kfree_sensitive() smb2_calc_signature() crypto_shash_setkey() *UAF* Fix this by ensuring that we have a valid @ses->auth_key.response by checking whether @ses->ses_status is SES_GOOD or SES_EXITING with @ses->ses_lock held. After commit 24a9799aa8ef ("smb: client: fix UAF in smb2_reconnect_server()"), we made sure to call ->logoff() only when @ses was known to be good (e.g. valid ->auth_key.response), so it's safe to access signing key when @ses->ses_status == SES_EXITING. Cc: stable@vger.kernel.org Reported-by: Jay Shin <jaeshin@redhat.com> Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-11-17smb: client: Use str_yes_no() helper functionThorsten Blum
Remove hard-coded strings by using the str_yes_no() helper function. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-11-17smb: client: memcpy() with surrounding object base addressKees Cook
Like commit f1f047bd7ce0 ("smb: client: Fix -Wstringop-overflow issues"), adjust the memcpy() destination address to be based off the surrounding object rather than based off the 4-byte "Protocol" member. This avoids a build-time warning when compiling under CONFIG_FORTIFY_SOURCE with GCC 15: In function 'fortify_memcpy_chk', inlined from 'CIFSSMBSetPathInfo' at ../fs/smb/client/cifssmb.c:5358:2: ../include/linux/fortify-string.h:571:25: error: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror=attribute-warning] 571 | __write_overflow_field(p_size_field, size); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Kees Cook <kees@kernel.org> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-11-17cifs: Remove pre-historic unused CIFSSMBCopyDr. David Alan Gilbert
CIFSSMBCopy() is unused, remove it. It seems to have been that way pre-git; looking in a historic archive, I think it landed around May 2004 in Linus' BKrev: 40ab7591J_OgkpHW-qhzZukvAUAw9g and was unused back then. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Acked-by: Tom Talpey <tom@talpey.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-11-17ksmbd: fix malformed unsupported smb1 negotiate responseNamjae Jeon
When mounting with vers=1.0, ksmbd should return unsupported smb1 negotiate response. But this response is malformed. [ 6010.586702] CIFS: VFS: Bad protocol string signature header 0x25000000 [ 6010.586708] 00000000: 25000000 25000000 424d53ff 00000072 ...%...%.SMBr... [ 6010.586711] 00000010: c8408000 00000000 00000000 00000000 ..@............. [ 6010.586713] 00000020: 00 00 b9 32 00 00 01 00 01 ...2..... Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-11-16Merge tag 'mm-hotfixes-stable-2024-11-16-15-33' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull hotfixes from Andrew Morton: "10 hotfixes, 7 of which are cc:stable. All singletons, please see the changelogs for details" * tag 'mm-hotfixes-stable-2024-11-16-15-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm: revert "mm: shmem: fix data-race in shmem_getattr()" ocfs2: uncache inode which has failed entering the group mm: fix NULL pointer dereference in alloc_pages_bulk_noprof mm, doc: update read_ahead_kb for MADV_HUGEPAGE fs/proc/task_mmu: prevent integer overflow in pagemap_scan_get_args() sched/task_stack: fix object_is_on_stack() for KASAN tagged pointers crash, powerpc: default to CRASH_DUMP=n on PPC_BOOK3S_32 mm/mremap: fix address wraparound in move_page_tables() tools/mm: fix compile error mm, swap: fix allocation and scanning race with swapoff
2024-11-16fs/9p: replace functions v9fs_cache_{register|unregister} with direct callsColin Ian King
The helper functions v9fs_cache_register and v9fs_cache_unregister are trivial helper functions that don't offer any extra functionality and are unncessary. Replace them with direct calls to v9fs_init_inode_cache and v9fs_destroy_inode_cache respectively to simplify the code. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Message-ID: <20241107095756.10261-1-colin.i.king@gmail.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2024-11-15hostfs: Fix the NULL vs IS_ERR() bug for __filemap_get_folio()ZhangPeng
The __filemap_get_folio() function returns error pointers. It never returns NULL. So use IS_ERR() to check it. Fixes: 1da86618bdce ("fs: Convert aops->write_begin to take a folio") Signed-off-by: ZhangPeng <zhangpeng362@huawei.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-11-15dlm: fix recovery of middle conversionsAlexander Aring
In one special case, recovery is unable to reliably rebuild lock state by simply recreating lkb structs as sent from the lock holders. That case is when the lkb's include conversions between PR and CW modes. The recovery code has always recognized this special case, but the implemention has always been broken, and would set invalid modes in recovered lkb's. Unpredictable or bogus errors could then be returned for further locking calls on these locks. This bug has gone unnoticed for so long due to some combination of: - applications never or infrequently converting between PR/CW - recovery not occuring during these conversions - if the recovery bug does occur, the caller may not notice, depending on what further locking calls are made, e.g. if the lock is simply unlocked it may go unnoticed However, a core analysis from a recent gfs2 bug report points to this broken code. PR = Protected Read CW = Concurrent Write PR and CW are incompatible PR and PR are compatible CW and CW are compatible Example 1 node C, resource R granted: PR node A granted: PR node B granted: NL node C granted: NL node D - A sends convert PR->CW to C - C fails before A gets a reply - recovery occurs At this point, A does not know if it still holds the lock in PR, or if its conversion to CW was granted: - If A's conversion to CW was granted, then another node's CW lock may also have been granted. - If A's conversion to CW was not granted, it still holds a PR lock, and other nodes may also hold PR locks. So, the new master of R cannot simply recreate the lock from A using granted mode PR and requested mode CW. The new master must look at all the recovered locks to determine the correct granted modes, and ensure that all the recovered locks are recreated in compatible states. The correct lock recovery steps in this example are: - node D becomes the new master of R - node B sends D its lkb, granted PR - node A sends D its lkb, convert PR->CW - D determines the correct lock state is: granted: PR node B convert: PR->CW node A The lkb sent by each node was recreated without any change on the new master node. Example 2 node C, resource R granted: PR node A granted: NL node C granted: NL node D waiting: CW node B - A sends convert PR->CW to C - C grants the conversion to CW for A - C grants the waiting request for CW to B - C sends granted message to B, but fails before it can send the granted message to A - B receives the granted message from C At this point: - A believes it is converting PR->CW - B believes it is holding a CW lock The correct lock recovery steps in this example are: - node D becomes the new master of R - node A sends D its lkb, convert PR->CW - node B sends D its lkb, granted CW - D determins the correct lock state is: granted: CW node B granted: CW node A The lkb sent by B is recreated without change, but the lkb sent by A is changed because the granted mode was not compatible. Fixes to make this work correctly: recover_convert_waiter: should not make any changes to a converting lkb that is still waiting for a reply message. It was previously setting grmode to IV, which is invalid state, so the lkb would not be handled correctly by other code. receive_rcom_lock_args: was checking the wrong lkb field (wait_type instead of status) to determine if the lkb is being converted, and in need of inspection for this special recovery. It was also setting grmode to IV in the lkb, causing it to be mishandled by other code. Now, this function just puts the lkb, directly as sent, onto the convert queue of the resource being recovered, and corrects it in recover_conversion() later, if needed. recover_conversion: the job of this function is to detect and correct lkb states for the special PR/CW conversions. The new code now checks for recovered lkbs on the granted queue with grmode PR or CW, and takes the real grmode from that. Then it looks for lkbs on the convert queue with an incompatible grmode (i.e. grmode PR when the real grmode is CW, or v.v.) These converting lkbs need to be fixed. They are fixed by temporarily setting their grmode to NL, so that grmodes are not incompatible and won't confuse other locking code. The converting lkb will then be granted at the end of recovery, replacing the temporary NL grmode. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2024-11-15Merge tag 'for-6.12-rc7-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fix from David Sterba: "One more fix that seems urgent and good to have in 6.12 final. It could potentially lead to unexpected transaction aborts, due to wrong comparison and order of processing of delayed refs" * tag 'for-6.12-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix incorrect comparison for delayed refs
2024-11-15ubifs: Fix uninitialized use of err in ubifs_jnl_write_inode()Nathan Chancellor
Clang warns (or errors with CONFIG_WERROR=y): fs/ubifs/journal.c:986:20: error: variable 'err' is uninitialized when used here [-Werror,-Wuninitialized] 986 | ubifs_ro_mode(c, err); | ^~~ Set err to -EPERM before the call to ubifs_ro_mode() and reuse it in the return statement to resolve the warning. Fixes: 957e1c4e1779 ("ubifs: ubifs_jnl_write_inode: Only check once for the limitation of xattr count") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-11-15ecryptfs: Fix spelling mistake "validationg" -> "validating"Colin Ian King
There is a spelling mistake in an error message literal string. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://lore.kernel.org/r/20241108112509.109891-1-colin.i.king@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-11-15ecryptfs: Convert ecryptfs to use the new mount APIEric Sandeen
Convert ecryptfs to the new mount API. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Link: https://lore.kernel.org/r/20241028143359.605061-3-sandeen@redhat.com Acked-by: Tyler Hicks <code@tyhicks.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-11-15ecryptfs: Factor out mount option validationEric Sandeen
Under the new mount API, mount options are parsed one at a time. Any validation that examines multiple options must be done after parsing is complete, so factor out a ecryptfs_validate_options() which can be called separately. To facilitate this, temporarily move the local variables that tracked whether various options have been set in the parsing function, into the ecryptfs_mount_crypt_stat structure so that they can be examined later. These will be moved to a more ephemeral struct in the mount api conversion patch to follow. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Link: https://lore.kernel.org/r/20241028143359.605061-2-sandeen@redhat.com Acked-by: Tyler Hicks <code@tyhicks.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-11-15fs: open_by_handle_at() support for decoding "explicit connectable" file handlesAmir Goldstein
Teach open_by_handle_at(2) about the type format of "explicit connectable" file handles that were created using the AT_HANDLE_CONNECTABLE flag to name_to_handle_at(2). When decoding an "explicit connectable" file handles, name_to_handle_at(2) should fail if it cannot open a "connected" fd with known path, which is accessible (to capable user) from mount fd path. Note that this does not check if the path is accessible to the calling user, just that it is accessible wrt the mount namesapce, so if there is no "connected" alias, or if parts of the path are hidden in the mount namespace, open_by_handle_at(2) will return -ESTALE. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20241011090023.655623-4-amir73il@gmail.com Fixes: 570df4e9c23f ("ceph: snapshot nfs re-export") Acked-by: Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-11-15fs: name_to_handle_at() support for "explicit connectable" file handlesAmir Goldstein
nfsd encodes "connectable" file handles for the subtree_check feature, which can be resolved to an open file with a connected path. So far, userspace nfs server could not make use of this functionality. Introduce a new flag AT_HANDLE_CONNECTABLE to name_to_handle_at(2). When used, the encoded file handle is "explicitly connectable". The "explicitly connectable" file handle sets bits in the high 16bit of the handle_type field, so open_by_handle_at(2) will know that it needs to open a file with a connected path. old kernels will now recognize the handle_type with high bits set, so "explicitly connectable" file handles cannot be decoded by open_by_handle_at(2) on old kernels. The flag AT_HANDLE_CONNECTABLE is not allowed together with either AT_HANDLE_FID or AT_EMPTY_PATH. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20241011090023.655623-3-amir73il@gmail.com Fixes: 570df4e9c23f ("ceph: snapshot nfs re-export") Acked-by: Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-11-15fs: prepare for "explicit connectable" file handlesAmir Goldstein
We would like to use the high 16bit of the handle_type field to encode file handle traits, such as "connectable". In preparation for this change, make sure that filesystems do not return a handle_type value with upper bits set and that the open_by_handle_at(2) syscall rejects these handle types. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20241011090023.655623-2-amir73il@gmail.com Fixes: 570df4e9c23f ("ceph: snapshot nfs re-export") Acked-by: Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-11-15ovl: convert ovl_real_fdget() callers to ovl_real_file()Amir Goldstein
Stop using struct fd to return a real file from ovl_real_fdget(), because we no longer return a temporary file object and the callers always get a borrowed file reference. Rename the helper to ovl_real_file(), return a borrowed reference of the real file that is referenced from the overlayfs file or an error. Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2024-11-15ovl: convert ovl_real_fdget_path() callers to ovl_real_file_path()Amir Goldstein
Stop using struct fd to return a real file from ovl_real_fdget_path(), because we no longer return a temporary file object and the callers always get a borrowed file reference. Rename the helper to ovl_real_file_path(), return a borrowed reference of the real file that is referenced from the overlayfs file or an error. Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2024-11-15ovl: store upper real file in ovl_file structAmir Goldstein
When an overlayfs file is opened as lower and then the file is copied up, every operation on the overlayfs open file will open a temporary backing file to the upper dentry and close it at the end of the operation. Store the upper real file along side the original (lower) real file in ovl_file instead of opening a temporary upper file on every operation. Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2024-11-15ovl: allocate a container struct ovl_file for ovl private contextAmir Goldstein
Instead of using ->private_data to point at realfile directly, so that we can add more context per ovl open file. Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2024-11-15ovl: do not open non-data lower file for fsyncAmir Goldstein
ovl_fsync() with !datasync opens a backing file from the top most dentry in the stack, checks if this dentry is non-upper and skips the fsync. In case of an overlay dentry stack with lower data and lower metadata above it, but without an upper metadata above it, the backing file is opened from the top most lower metadata dentry and never used. Refactor the helper ovl_real_fdget_meta() into ovl_real_fdget_path() and open code the checks for non-upper inode in ovl_fsync(), so in that case we can avoid the unneeded backing file open. Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2024-11-15ovl: Optimize override/revert credsVinicius Costa Gomes
Use override_creds_light() in ovl_override_creds() and revert_creds_light() in ovl_revert_creds(). The _light() functions do not change the 'usage' of the credentials in question, as they refer to the credentials associated with the mounter, which have a longer lifetime. In ovl_setup_cred_for_create(), do not need to modify the mounter credentials (returned by override_creds_light()) 'usage' counter. Add a warning to verify that we are indeed working with the mounter credentials (stored in the superblock). Failure in this assumption means that creds may leak. Suggested-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2024-11-14ocfs2: uncache inode which has failed entering the groupDmitry Antipov
Syzbot has reported the following BUG: kernel BUG at fs/ocfs2/uptodate.c:509! ... Call Trace: <TASK> ? __die_body+0x5f/0xb0 ? die+0x9e/0xc0 ? do_trap+0x15a/0x3a0 ? ocfs2_set_new_buffer_uptodate+0x145/0x160 ? do_error_trap+0x1dc/0x2c0 ? ocfs2_set_new_buffer_uptodate+0x145/0x160 ? __pfx_do_error_trap+0x10/0x10 ? handle_invalid_op+0x34/0x40 ? ocfs2_set_new_buffer_uptodate+0x145/0x160 ? exc_invalid_op+0x38/0x50 ? asm_exc_invalid_op+0x1a/0x20 ? ocfs2_set_new_buffer_uptodate+0x2e/0x160 ? ocfs2_set_new_buffer_uptodate+0x144/0x160 ? ocfs2_set_new_buffer_uptodate+0x145/0x160 ocfs2_group_add+0x39f/0x15a0 ? __pfx_ocfs2_group_add+0x10/0x10 ? __pfx_lock_acquire+0x10/0x10 ? mnt_get_write_access+0x68/0x2b0 ? __pfx_lock_release+0x10/0x10 ? rcu_read_lock_any_held+0xb7/0x160 ? __pfx_rcu_read_lock_any_held+0x10/0x10 ? smack_log+0x123/0x540 ? mnt_get_write_access+0x68/0x2b0 ? mnt_get_write_access+0x68/0x2b0 ? mnt_get_write_access+0x226/0x2b0 ocfs2_ioctl+0x65e/0x7d0 ? __pfx_ocfs2_ioctl+0x10/0x10 ? smack_file_ioctl+0x29e/0x3a0 ? __pfx_smack_file_ioctl+0x10/0x10 ? lockdep_hardirqs_on_prepare+0x43d/0x780 ? __pfx_lockdep_hardirqs_on_prepare+0x10/0x10 ? __pfx_ocfs2_ioctl+0x10/0x10 __se_sys_ioctl+0xfb/0x170 do_syscall_64+0xf3/0x230 entry_SYSCALL_64_after_hwframe+0x77/0x7f ... </TASK> When 'ioctl(OCFS2_IOC_GROUP_ADD, ...)' has failed for the particular inode in 'ocfs2_verify_group_and_input()', corresponding buffer head remains cached and subsequent call to the same 'ioctl()' for the same inode issues the BUG() in 'ocfs2_set_new_buffer_uptodate()' (trying to cache the same buffer head of that inode). Fix this by uncaching the buffer head with 'ocfs2_remove_from_cache()' on error path in 'ocfs2_group_add()'. Link: https://lkml.kernel.org/r/20241114043844.111847-1-dmantipov@yandex.ru Fixes: 7909f2bf8353 ("[PATCH 2/2] ocfs2: Implement group add for online resize") Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Reported-by: syzbot+453873f1588c2d75b447@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=453873f1588c2d75b447 Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Dmitry Antipov <dmantipov@yandex.ru> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mark@fasheh.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-14fs/proc/task_mmu: prevent integer overflow in pagemap_scan_get_args()Dan Carpenter
The "arg->vec_len" variable is a u64 that comes from the user at the start of the function. The "arg->vec_len * sizeof(struct page_region))" multiplication can lead to integer wrapping. Use size_mul() to avoid that. Also the size_add/mul() functions work on unsigned long so for 32bit systems we need to ensure that "arg->vec_len" fits in an unsigned long. Link: https://lkml.kernel.org/r/39d41335-dd4d-48ed-8a7f-402c57d8ea84@stanley.mountain Fixes: 52526ca7fdb9 ("fs/proc/task_mmu: implement IOCTL to get and optionally clear info about PTEs") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Cc: Andrei Vagin <avagin@google.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: David Hildenbrand <david@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michał Mirosław <mirq-linux@rere.qmqm.pl> Cc: Muhammad Usama Anjum <usama.anjum@collabora.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Peter Xu <peterx@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-14jffs2: Prevent rtime decompress memory corruptionKinsey Moore
The rtime decompression routine does not fully check bounds during the entirety of the decompression pass and can corrupt memory outside the decompression buffer if the compressed data is corrupted. This adds the required check to prevent this failure mode. Cc: stable@vger.kernel.org Signed-off-by: Kinsey Moore <kinsey.moore@oarcorp.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-11-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.12-rc8). Conflicts: tools/testing/selftests/net/.gitignore 252e01e68241 ("selftests: net: add netlink-dumps to .gitignore") be43a6b23829 ("selftests: ncdevmem: Move ncdevmem under drivers/net/hw") https://lore.kernel.org/all/20241113122359.1b95180a@canb.auug.org.au/ drivers/net/phy/phylink.c 671154f174e0 ("net: phylink: ensure PHY momentary link-fails are handled") 7530ea26c810 ("net: phylink: remove "using_mac_select_pcs"") Adjacent changes: drivers/net/ethernet/stmicro/stmmac/dwmac-intel-plat.c 5b366eae7193 ("stmmac: dwmac-intel-plat: fix call balance of tx_clk handling routines") e96321fad3ad ("net: ethernet: Switch back to struct platform_driver::remove()") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14jffs2: remove redundant check on outpos > posColin Ian King
The check for outpos > pos is always false because outpos is zero and pos is at least zero; outpos can never be greater than pos. The check is redundant and can be removed. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-11-14fs: jffs2: Fix inconsistent indentation in jffs2_mark_node_obsoleteSuraj Sonawane
Fix the indentation to ensure consistent code style and improve readability, and to fix this warnings: fs/jffs2/nodemgmt.c:635 jffs2_mark_node_obsolete() warn: inconsistent indenting fs/jffs2/nodemgmt.c:646 jffs2_mark_node_obsolete() warn: inconsistent indenting Signed-off-by: Suraj Sonawane <surajsonawane0215@gmail.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-11-14jffs2: Correct some typos in commentsShen Lichuan
Fixed some confusing spelling errors, the details are as follows: -in the code comments: wating -> waiting succefully -> successfully Signed-off-by: Shen Lichuan <shenlichuan@vivo.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-11-14jffs2: fix use of uninitialized variableQingfang Deng
When building the kernel with -Wmaybe-uninitialized, the compiler reports this warning: In function 'jffs2_mark_erased_block', inlined from 'jffs2_erase_pending_blocks' at fs/jffs2/erase.c:116:4: fs/jffs2/erase.c:474:9: warning: 'bad_offset' may be used uninitialized [-Wmaybe-uninitialized] 474 | jffs2_erase_failed(c, jeb, bad_offset); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ fs/jffs2/erase.c: In function 'jffs2_erase_pending_blocks': fs/jffs2/erase.c:402:18: note: 'bad_offset' was declared here 402 | uint32_t bad_offset; | ^~~~~~~~~~ When mtd->point() is used, jffs2_erase_pending_blocks can return -EIO without initializing bad_offset, which is later used at the filebad label in jffs2_mark_erased_block. Fix it by initializing this variable. Fixes: 8a0f572397ca ("[JFFS2] Return values of jffs2_block_check_erase error paths") Signed-off-by: Qingfang Deng <qingfang.deng@siflower.com.cn> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-11-14jffs2: Use str_yes_no() helper functionThorsten Blum
Remove hard-coded strings by using the str_yes_no() helper function. Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-11-14ubifs: authentication: Fix use-after-free in ubifs_tnc_end_commitWaqar Hameed
After an insertion in TNC, the tree might split and cause a node to change its `znode->parent`. A further deletion of other nodes in the tree (which also could free the nodes), the aforementioned node's `znode->cparent` could still point to a freed node. This `znode->cparent` may not be updated when getting nodes to commit in `ubifs_tnc_start_commit()`. This could then trigger a use-after-free when accessing the `znode->cparent` in `write_index()` in `ubifs_tnc_end_commit()`. This can be triggered by running rm -f /etc/test-file.bin dd if=/dev/urandom of=/etc/test-file.bin bs=1M count=60 conv=fsync in a loop, and with `CONFIG_UBIFS_FS_AUTHENTICATION`. KASAN then reports: BUG: KASAN: use-after-free in ubifs_tnc_end_commit+0xa5c/0x1950 Write of size 32 at addr ffffff800a3af86c by task ubifs_bgt0_20/153 Call trace: dump_backtrace+0x0/0x340 show_stack+0x18/0x24 dump_stack_lvl+0x9c/0xbc print_address_description.constprop.0+0x74/0x2b0 kasan_report+0x1d8/0x1f0 kasan_check_range+0xf8/0x1a0 memcpy+0x84/0xf4 ubifs_tnc_end_commit+0xa5c/0x1950 do_commit+0x4e0/0x1340 ubifs_bg_thread+0x234/0x2e0 kthread+0x36c/0x410 ret_from_fork+0x10/0x20 Allocated by task 401: kasan_save_stack+0x38/0x70 __kasan_kmalloc+0x8c/0xd0 __kmalloc+0x34c/0x5bc tnc_insert+0x140/0x16a4 ubifs_tnc_add+0x370/0x52c ubifs_jnl_write_data+0x5d8/0x870 do_writepage+0x36c/0x510 ubifs_writepage+0x190/0x4dc __writepage+0x58/0x154 write_cache_pages+0x394/0x830 do_writepages+0x1f0/0x5b0 filemap_fdatawrite_wbc+0x170/0x25c file_write_and_wait_range+0x140/0x190 ubifs_fsync+0xe8/0x290 vfs_fsync_range+0xc0/0x1e4 do_fsync+0x40/0x90 __arm64_sys_fsync+0x34/0x50 invoke_syscall.constprop.0+0xa8/0x260 do_el0_svc+0xc8/0x1f0 el0_svc+0x34/0x70 el0t_64_sync_handler+0x108/0x114 el0t_64_sync+0x1a4/0x1a8 Freed by task 403: kasan_save_stack+0x38/0x70 kasan_set_track+0x28/0x40 kasan_set_free_info+0x28/0x4c __kasan_slab_free+0xd4/0x13c kfree+0xc4/0x3a0 tnc_delete+0x3f4/0xe40 ubifs_tnc_remove_range+0x368/0x73c ubifs_tnc_remove_ino+0x29c/0x2e0 ubifs_jnl_delete_inode+0x150/0x260 ubifs_evict_inode+0x1d4/0x2e4 evict+0x1c8/0x450 iput+0x2a0/0x3c4 do_unlinkat+0x2cc/0x490 __arm64_sys_unlinkat+0x90/0x100 invoke_syscall.constprop.0+0xa8/0x260 do_el0_svc+0xc8/0x1f0 el0_svc+0x34/0x70 el0t_64_sync_handler+0x108/0x114 el0t_64_sync+0x1a4/0x1a8 The offending `memcpy()` in `ubifs_copy_hash()` has a use-after-free when a node becomes root in TNC but still has a `cparent` to an already freed node. More specifically, consider the following TNC: zroot / / zp1 / / zn Inserting a new node `zn_new` with a key smaller then `zn` will trigger a split in `tnc_insert()` if `zp1` is full: zroot / \ / \ zp1 zp2 / \ / \ zn_new zn `zn->parent` has now been moved to `zp2`, *but* `zn->cparent` still points to `zp1`. Now, consider a removal of all the nodes _except_ `zn`. Just when `tnc_delete()` is about to delete `zroot` and `zp2`: zroot \ \ zp2 \ \ zn `zroot` and `zp2` get freed and the tree collapses: zn `zn` now becomes the new `zroot`. `get_znodes_to_commit()` will now only find `zn`, the new `zroot`, and `write_index()` will check its `znode->cparent` that wrongly points to the already freed `zp1`. `ubifs_copy_hash()` thus gets wrongly called with `znode->cparent->zbranch[znode->iip].hash` that triggers the use-after-free! Fix this by explicitly setting `znode->cparent` to `NULL` in `get_znodes_to_commit()` for the root node. The search for the dirty nodes is bottom-up in the tree. Thus, when `find_next_dirty(znode)` returns NULL, the current `znode` _is_ the root node. Add an assert for this. Fixes: 16a26b20d2af ("ubifs: authentication: Add hashes to index nodes") Tested-by: Waqar Hameed <waqar.hameed@axis.com> Co-developed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Waqar Hameed <waqar.hameed@axis.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-11-14ubifs: xattr: remove unused anonymous enumPascal Eberhard
commit 2b88fc21cae9 ("ubifs: Switch to generic xattr handlers") removes usage of this anonymous enum. Delete the enum as well. Signed-off-by: Pascal Eberhard <pascal.eberhard@se.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-11-14Merge tag 'bcachefs-2024-11-13' of git://evilpiepirate.org/bcachefsLinus Torvalds
Pull bcachefs fixes from Kent Overstreet: "This fixes one minor regression from the btree cache fixes (in the scan_for_btree_nodes repair path) - and the shutdown path fix is the big one here, in terms of bugs closed: - Assorted tiny syzbot fixes - Shutdown path fix: "bch2_btree_write_buffer_flush_going_ro()" The shutdown path wasn't flushing the btree write buffer, leading to shutting down while we still had operations in flight. This fixes a whole slew of syzbot bugs, and undoubtedly other strange heisenbugs. * tag 'bcachefs-2024-11-13' of git://evilpiepirate.org/bcachefs: bcachefs: Fix assertion pop in bch2_ptr_swab() bcachefs: Fix journal_entry_dev_usage_to_text() overrun bcachefs: Allow for unknown key types in backpointers fsck bcachefs: Fix assertion pop in topology repair bcachefs: Fix hidden btree errors when reading roots bcachefs: Fix validate_bset() repair path bcachefs: Fix missing validation for bch_backpointer.level bcachefs: Fix bch_member.btree_bitmap_shift validation bcachefs: bch2_btree_write_buffer_flush_going_ro()
2024-11-14ubifs: Reduce kfree() calls in ubifs_purge_xattrs()Markus Elfring
Move a pair of kfree() calls behind the label “out_err” so that two statements can be better reused at the end of this function implementation. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-11-14ubifs: Call iput(xino) only once in ubifs_purge_xattrs()Markus Elfring
An iput(xino) call was immediately used after a return value check for a remove_xattr() call in this function implementation. Thus call such a function only once instead directly before the check. This issue was transformed by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-11-14ubifs: Correct the total block count by deducting journal reservationZhihao Cheng
Since commit e874dcde1cbf ("ubifs: Reserve one leb for each journal head while doing budget"), available space is calulated by deducting reservation for all journal heads. However, the total block count ( which is only used by statfs) is not updated yet, which will cause the wrong displaying for used space(total - available). Fix it by deducting reservation for all journal heads from total block count. Fixes: e874dcde1cbf ("ubifs: Reserve one leb for each journal head while doing budget") Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-11-14ubifs: Convert to use ERR_CAST()Shen Lichuan
As opposed to open-code, using the ERR_CAST macro clearly indicates that this is a pointer to an error value and a type conversion was performed. Signed-off-by: Shen Lichuan <shenlichuan@vivo.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-11-14ubifs: add support for FS_IOC_GETFSSYSFSPATHHongbo Li
In commit ae8c51175730 ("fs: add FS_IOC_GETFSSYSFSPATH"), a new fs ioctl was introduced to standardize exporting data from sysfs across filesystems. The returned path will always be of the form "$FSTYP/$SYSFS_IDENTIFIER", where the sysfs identifier may be a UUID or a device name. The ubifs is a file system based on char device, and the common method to fill s_sysfs_name (super_set_sysfs_name_bdev) is unavialable. So in order to support FS_IOC_GETFSSYSFSPATH ioctl, we fill the s_sysfs_name with ubi_volume_info member which keeps the format defined in macro UBIFS_DFS_DIR_NAME by using super_set_sysfs_name_generic. That's for ubifs, it will output "ubifs/<dev>". ``` $ ./ioctl_getfssysfs_path /mnt/ubifs/testfile path: ubifs/ubi0_0 $ ls /sys/fs/ubifs/ubi0_0/ errors_crc errors_magic errors_node ``` Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-11-14ubifs: remove unused ioctl flags GETFLAGS/SETFLAGSHongbo Li
In the ubifs, ubifs_fileattr_get and ubifs_fileattr_set have been implemented, GETFLAGS and SETFLAGS ioctl are not handled in filesystem's own ioctl helper. Additionally, these flags' cases are not handled in ubifs's ioctl helper, so we can remove them. Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-11-14ubifs: Display the inode number when orphan twice happensLiu Mingrui
Display the inode number in error message when the same orphan inode is added twice, which could provide more information for debugging. Signed-off-by: Liu Mingrui <liumingrui@huawei.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-11-14ubifs: Remove ineffective function ubifs_evict_xattr_inode()Zhihao Cheng
Function ubifs_evict_xattr_inode() is imported by commit 272eda8298dc ("ubifs: Correctly evict xattr inodes") to reclaim xattr inode when the host inode is deleted. The xattr inode is evicted in the host inode deleting process since commit 7959cf3a7506 ("ubifs: journal: Handle xattrs like files"). So the ineffective function ubifs_evict_xattr_inode() can be deleted safely. Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-11-14ubifs: ubifs_jnl_write_inode: Only check once for the limitation of xattr countZhihao Cheng
No need to check the limitation of xattr count every time in function ubifs_jnl_write_inode(), because the 'ui->xattr_cnt' won't be modified by others in the inode evicting process. Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>