summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2021-10-12fs/ntfs3: Check for NULL pointers in ni_try_remove_attr_listKonstantin Komarov
Check for potential NULL pointers. Print error message if found. Thread, that leads to this commit: https://lore.kernel.org/ntfs3/227c13e3-5a22-0cba-41eb-fcaf41940711@paragon-software.com/ Reported-by: Mohammad Rasim <mohammad.rasim96@gmail.com> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2021-10-11Merge tag 'for-5.15-rc5-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "A few more error handling fixes, stemming from code inspection, error injection or fuzzing" * tag 'for-5.15-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix abort logic in btrfs_replace_file_extents btrfs: check for error when looking up inode during dir entry replay btrfs: unify lookup return value when dir entry is missing btrfs: deal with errors when adding inode reference during log replay btrfs: deal with errors when replaying dir entry during log replay btrfs: deal with errors when checking if a dir entry exists during log replay btrfs: update refs for any root except tree log roots btrfs: unlock newly allocated extent buffer after error
2021-10-11f2fs: fix wrong condition to trigger background checkpoint correctlyChao Yu
In f2fs_balance_fs_bg(), it needs to check both NAT_ENTRIES and INO_ENTRIES memory usage to decide whether we should skip background checkpoint, otherwise we may always skip checking INO_ENTRIES memory usage, so that INO_ENTRIES may potentially cause high memory footprint. Fixes: 493720a48543 ("f2fs: fix to avoid REQ_TIME and CP_TIME collision") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-10-11f2fs: fix to use WHINT_MODEKeoseong Park
Since active_logs can be set to 2 or 4 or NR_CURSEG_PERSIST_TYPE(6), it cannot be set to NR_CURSEG_TYPE(8). That is, whint_mode is always off. Therefore, the condition is changed from NR_CURSEG_TYPE to NR_CURSEG_PERSIST_TYPE. Cc: Chao Yu <chao@kernel.org> Fixes: d0b9e42ab615 (f2fs: introduce inmem curseg) Reported-by: tanghuan <tanghuan@vivo.com> Signed-off-by: Keoseong Park <keosung.park@samsung.com> Signed-off-by: Fengnan Chang <changfengnan@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-10-11xfs: use kmem_cache_free() for kmem_cache objectsRustam Kovhaev
For kmalloc() allocations SLOB prepends the blocks with a 4-byte header, and it puts the size of the allocated blocks in that header. Blocks allocated with kmem_cache_alloc() allocations do not have that header. SLOB explodes when you allocate memory with kmem_cache_alloc() and then try to free it with kfree() instead of kmem_cache_free(). SLOB will assume that there is a header when there is none, read some garbage to size variable and corrupt the adjacent objects, which eventually leads to hang or panic. Let's make XFS work with SLOB by using proper free function. Fixes: 9749fee83f38 ("xfs: enable the xfs_defer mechanism to process extents to free") Signed-off-by: Rustam Kovhaev <rkovhaev@gmail.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-11xfs: Use kvcalloc() instead of kvzalloc()Gustavo A. R. Silva
Use 2-factor argument multiplication form kvcalloc() instead of kvzalloc(). Link: https://github.com/KSPP/linux/issues/162 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-10-11orangefs: Fix sb refcount leak when allocate sb info failed.Chenyuan Mi
The reference counting issue happens in one exception handling path of orangefs_mount(). When failing to allocate sb info, the function forgets to decrease the refcount of sb increased by sget(), causing a refcount leak. Fix this issue by jumping to the label "free_sb_and_op" instead of "free_op" Signed-off-by: Chenyuan Mi <cymi20@fudan.edu.cn> Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2021-10-11fs: orangefs: fix error return code of orangefs_revalidate_lookup()Jia-Ju Bai
When op_alloc() returns NULL to new_op, no error return code of orangefs_revalidate_lookup() is assigned. To fix this bug, ret is assigned with -ENOMEM in this case. Fixes: 8bb8aefd5afb ("OrangeFS: Change almost all instances of the string PVFS2 to OrangeFS.") Reported-by: TOTE Robot <oslab@tsinghua.edu.cn> Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2021-10-11orangefs: Remove redundant initialization of variable retColin Ian King
The variable ret is being initialized with a value that is never read, it is being updated later on. The assignment is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2021-10-11fs/ntfs3: Refactor ntfs_read_mftKonstantin Komarov
Don't save size of attribute reparse point as size of symlink. Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2021-10-11fs/ntfs3: Refactor ni_parse_reparseKonstantin Komarov
Change argument from void* to struct REPARSE_DATA_BUFFER* We copy data to buffer, so we can read it later in ntfs_read_mft. Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2021-10-11fs/ntfs3: Refactor ntfs_create_inodeKonstantin Komarov
Set size for symlink, so we don't need to calculate it on the fly. Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2021-10-11fs/ntfs3: Refactor ntfs_readlink_hlpKonstantin Komarov
Rename some variables. Returned err by default is EINVAL. Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2021-10-11fs/ntfs3: Rework ntfs_utf16_to_nlsKonstantin Komarov
Now ntfs_utf16_to_nls takes length as one of arguments. If length of symlink > 255, then we tried to convert length of symlink +- some random number. Now 255 symbols limit was removed. Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2021-10-11fs/ntfs3: Fix memory leak if fill_super failedKonstantin Komarov
In ntfs_init_fs_context we allocate memory in fc->s_fs_info. In case of failed mount we must free it in ntfs_fill_super. We can't do it in ntfs_fs_free, because ntfs_fs_free called with fc->s_fs_info == NULL. fc->s_fs_info became NULL in sget_fc. Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2021-10-11fs/ntfs3: Keep prealloc for all types of filesKonstantin Komarov
Before we haven't kept prealloc for sparse files because we thought that it will speed up create / write operations. It lead to situation, when user reserved some space for sparse file, filled volume, and wasn't able to write in reserved file. With this commit we keep prealloc. Now xfstest generic/274 pass. Fixes: be71b5cba2e6 ("fs/ntfs3: Add attrib operations") Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2021-10-11quota: correct error number in free_dqentry()Zhang Yi
Fix the error path in free_dqentry(), pass out the error number if the block to free is not correct. Fixes: 1ccd14b9c271 ("quota: Split off quota tree handling into a separate file") Link: https://lore.kernel.org/r/20211008093821.1001186-3-yi.zhang@huawei.com Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Cc: stable@kernel.org Signed-off-by: Jan Kara <jack@suse.cz>
2021-10-11quota: check block number when reading the block in quota fileZhang Yi
The block number in the quota tree on disk should be smaller than the v2_disk_dqinfo.dqi_blocks. If the quota file was corrupted, we may be allocating an 'allocated' block and that would lead to a loop in a tree, which will probably trigger oops later. This patch adds a check for the block number in the quota tree to prevent such potential issue. Link: https://lore.kernel.org/r/20211008093821.1001186-2-yi.zhang@huawei.com Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Cc: stable@kernel.org Signed-off-by: Jan Kara <jack@suse.cz>
2021-10-10NFS: Fix deadlocks in nfs_scan_commit_list()Trond Myklebust
Partially revert commit 2ce209c42c01 ("NFS: Wait for requests that are locked on the commit list"), since it can lead to deadlocks between commit requests and nfs_join_page_group(). For now we should assume that any locked requests on the commit list are either about to be removed and committed by another task, or the writes they describe are about to be retransmitted. In either case, we should not need to worry. Fixes: 2ce209c42c01 ("NFS: Wait for requests that are locked on the commit list") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-10-10NFS: Instrument i_size_write()Chuck Lever
Generate a trace event whenever the NFS client modifies the size of a file. These new events aid troubleshooting workloads that trigger races around size updates. There are four new trace points, all named nfs_size_something so they are easy to grep for or enable as a group with a single glob. Size updated on the server: kworker/u24:10-194 [010] 369.939174: nfs_size_update: fileid=00:28:2 fhandle=0x36fbbe51 version=1752899344277980615 cursize=250471 newsize=172083 Server-side size update reported via NFSv3 WCC attributes: fsx-1387 [006] 380.760686: nfs_size_wcc: fileid=00:28:2 fhandle=0x36fbbe51 version=1752899355909932456 cursize=146792 newsize=171216 File has been truncated locally: fsx-1387 [007] 369.437421: nfs_size_truncate: fileid=00:28:2 fhandle=0x36fbbe51 version=1752899231200117272 cursize=215244 newsize=0 File has been extended locally: fsx-1387 [007] 369.439213: nfs_size_grow: fileid=00:28:2 fhandle=0x36fbbe51 version=1752899343704248410 cursize=258048 newsize=262144 Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-10-10NFS: Remove unnecessary TRACE_DEFINE_ENUM()sChuck Lever
Clean up: TRACE_DEFINE_ENUM is unnecessary because the target symbols are all C macros, not enums. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-10-09Merge tag '5.15-rc4-ksmbd-fixes' of git://git.samba.org/ksmbdLinus Torvalds
Pull ksmbd fixes from Steve French: "Six fixes for the ksmbd kernel server, including two additional overflow checks, a fix for oops, and some cleanup (e.g. remove dead code for less secure dialects that has been removed)" * tag '5.15-rc4-ksmbd-fixes' of git://git.samba.org/ksmbd: ksmbd: fix oops from fuse driver ksmbd: fix version mismatch with out of tree ksmbd: use buf_data_size instead of recalculation in smb3_decrypt_req() ksmbd: remove the leftover of smb2.0 dialect support ksmbd: check strictly data area in ksmbd_smb2_check_message() ksmbd: add the check to vaildate if stream protocol length exceeds maximum value
2021-10-08tracefs: Have tracefs directories not set OTH permission bits by defaultSteven Rostedt (VMware)
The tracefs file system is by default mounted such that only root user can access it. But there are legitimate reasons to create a group and allow those added to the group to have access to tracing. By changing the permissions of the tracefs mount point to allow access, it will allow group access to the tracefs directory. There should not be any real reason to allow all access to the tracefs directory as it contains sensitive information. Have the default permission of directories being created not have any OTH (other) bits set, such that an admin that wants to give permission to a group has to first disable all OTH bits in the file system. Link: https://lkml.kernel.org/r/20210818153038.664127804@goodmis.org Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-10-08coredump: Limit coredumps to a single thread groupEric W. Biederman
Today when a signal is delivered with a handler of SIG_DFL whose default behavior is to generate a core dump not only that process but every process that shares the mm is killed. In the case of vfork this looks like a real world problem. Consider the following well defined sequence. if (vfork() == 0) { execve(...); _exit(EXIT_FAILURE); } If a signal that generates a core dump is received after vfork but before the execve changes the mm the process that called vfork will also be killed (as the mm is shared). Similarly if the execve fails after the point of no return the kernel delivers SIGSEGV which will kill both the exec'ing process and because the mm is shared the process that called vfork as well. As far as I can tell this behavior is a violation of people's reasonable expectations, POSIX, and is unnecessarily fragile when the system is low on memory. Solve this by making a userspace visible change to only kill a single process/thread group. This is possible because Jann Horn recently modified[1] the coredump code so that the mm can safely be modified while the coredump is happening. With LinuxThreads long gone I don't expect anyone to have a notice this behavior change in practice. To accomplish this move the core_state pointer from mm_struct to signal_struct, which allows different thread groups to coredump simultatenously. In zap_threads remove the work to kill anything except for the current thread group. v2: Remove core_state from the VM_BUG_ON_MM print to fix compile failure when CONFIG_DEBUG_VM is enabled. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> [1] a07279c9a8cd ("binfmt_elf, binfmt_elf_fdpic: use a VMA list snapshot") Fixes: d89f3847def4 ("[PATCH] thread-aware coredumps, 2.5.43-C3") History-tree: git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git Link: https://lkml.kernel.org/r/87y27mvnke.fsf@disp2133 Link: https://lkml.kernel.org/r/20211007144701.67592574@canb.auug.org.au Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-10-07Merge tag 'nfsd-5.15-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: "Bug fixes for NFSD error handling paths" * tag 'nfsd-5.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: NFSD: Keep existing listeners on portlist error SUNRPC: fix sign error causing rpcsec_gss drops nfsd: Fix a warning for nfsd_file_close_inode nfsd4: Handle the NFSv4 READDIR 'dircount' hint being zero nfsd: fix error handling of register_pernet_subsys() in init_nfsd()
2021-10-07btrfs: fix abort logic in btrfs_replace_file_extentsJosef Bacik
Error injection testing uncovered a case where we'd end up with a corrupt file system with a missing extent in the middle of a file. This occurs because the if statement to decide if we should abort is wrong. The only way we would abort in this case is if we got a ret != -EOPNOTSUPP and we called from the file clone code. However the prealloc code uses this path too. Instead we need to abort if there is an error, and the only error we _don't_ abort on is -EOPNOTSUPP and only if we came from the clone file code. CC: stable@vger.kernel.org # 5.10+ Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-07btrfs: check for error when looking up inode during dir entry replayFilipe Manana
At replay_one_name(), we are treating any error from btrfs_lookup_inode() as if the inode does not exists. Fix this by checking for an error and returning it to the caller. CC: stable@vger.kernel.org # 4.14+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-07btrfs: unify lookup return value when dir entry is missingFilipe Manana
btrfs_lookup_dir_index_item() and btrfs_lookup_dir_item() lookup for dir entries and both are used during log replay or when updating a log tree during an unlink. However when the dir item does not exists, btrfs_lookup_dir_item() returns NULL while btrfs_lookup_dir_index_item() returns PTR_ERR(-ENOENT), and if the dir item exists but there is no matching entry for a given name or index, both return NULL. This makes the call sites during log replay to be more verbose than necessary and it makes it easy to miss this slight difference. Since we don't need to distinguish between those two cases, make btrfs_lookup_dir_index_item() always return NULL when there is no matching directory entry - either because there isn't any dir entry or because there is one but it does not match the given name and index. Also rename the argument 'objectid' of btrfs_lookup_dir_index_item() to 'index' since it is supposed to match an index number, and the name 'objectid' is not very good because it can easily be confused with an inode number (like the inode number a dir entry points to). CC: stable@vger.kernel.org # 4.14+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-07btrfs: deal with errors when adding inode reference during log replayFilipe Manana
At __inode_add_ref(), we treating any error returned from btrfs_lookup_dir_item() or from btrfs_lookup_dir_index_item() as meaning that there is no existing directory entry in the fs/subvolume tree. This is not correct since we can get errors such as, for example, -EIO when reading extent buffers while searching the fs/subvolume's btree. So fix that and return the error to the caller when it is not -ENOENT. CC: stable@vger.kernel.org # 4.14+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-07btrfs: deal with errors when replaying dir entry during log replayFilipe Manana
At replay_one_one(), we are treating any error returned from btrfs_lookup_dir_item() or from btrfs_lookup_dir_index_item() as meaning that there is no existing directory entry in the fs/subvolume tree. This is not correct since we can get errors such as, for example, -EIO when reading extent buffers while searching the fs/subvolume's btree. So fix that and return the error to the caller when it is not -ENOENT. CC: stable@vger.kernel.org # 4.14+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-07btrfs: deal with errors when checking if a dir entry exists during log replayFilipe Manana
Currently inode_in_dir() ignores errors returned from btrfs_lookup_dir_index_item() and from btrfs_lookup_dir_item(), treating any errors as if the directory entry does not exists in the fs/subvolume tree, which is obviously not correct, as we can get errors such as -EIO when reading extent buffers while searching the fs/subvolume's tree. Fix that by making inode_in_dir() return the errors and making its only caller, add_inode_ref(), deal with returned errors as well. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-07btrfs: update refs for any root except tree log rootsJosef Bacik
I hit a stuck relocation on btrfs/061 during my overnight testing. This turned out to be because we had left over extent entries in our extent root for a data reloc inode that no longer existed. This happened because in btrfs_drop_extents() we only update refs if we have SHAREABLE set or we are the tree_root. This regression was introduced by aeb935a45581 ("btrfs: don't set SHAREABLE flag for data reloc tree") where we stopped setting SHAREABLE for the data reloc tree. The problem here is we actually do want to update extent references for data extents in the data reloc tree, in fact we only don't want to update extent references if the file extents are in the log tree. Update this check to only skip updating references in the case of the log tree. This is relatively rare, because you have to be running scrub at the same time, which is what btrfs/061 does. The data reloc inode has its extents pre-allocated, and then we copy the extent into the pre-allocated chunks. We theoretically should never be calling btrfs_drop_extents() on a data reloc inode. The exception of course is with scrub, if our pre-allocated extent falls inside of the block group we are scrubbing, then the block group will be marked read only and we will be forced to cow that extent. This means we will call btrfs_drop_extents() on that range when we COW that file extent. This isn't really problematic if we do this, the data reloc inode requires that our extent lengths match exactly with the extent we are copying, thankfully we validate the extent is correct with get_new_location(), so if we happen to COW only part of the extent we won't link it in when we do the relocation, so we are safe from any other shenanigans that arise because of this interaction with scrub. Fixes: aeb935a45581 ("btrfs: don't set SHAREABLE flag for data reloc tree") CC: stable@vger.kernel.org # 5.8+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-07btrfs: unlock newly allocated extent buffer after errorQu Wenruo
[BUG] There is a bug report that injected ENOMEM error could leave a tree block locked while we return to user-space: BTRFS info (device loop0): enabling ssd optimizations FAULT_INJECTION: forcing a failure. name failslab, interval 1, probability 0, space 0, times 0 CPU: 0 PID: 7579 Comm: syz-executor Not tainted 5.15.0-rc1 #16 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x8d/0xcf lib/dump_stack.c:106 fail_dump lib/fault-inject.c:52 [inline] should_fail+0x13c/0x160 lib/fault-inject.c:146 should_failslab+0x5/0x10 mm/slab_common.c:1328 slab_pre_alloc_hook.constprop.99+0x4e/0xc0 mm/slab.h:494 slab_alloc_node mm/slub.c:3120 [inline] slab_alloc mm/slub.c:3214 [inline] kmem_cache_alloc+0x44/0x280 mm/slub.c:3219 btrfs_alloc_delayed_extent_op fs/btrfs/delayed-ref.h:299 [inline] btrfs_alloc_tree_block+0x38c/0x670 fs/btrfs/extent-tree.c:4833 __btrfs_cow_block+0x16f/0x7d0 fs/btrfs/ctree.c:415 btrfs_cow_block+0x12a/0x300 fs/btrfs/ctree.c:570 btrfs_search_slot+0x6b0/0xee0 fs/btrfs/ctree.c:1768 btrfs_insert_empty_items+0x80/0xf0 fs/btrfs/ctree.c:3905 btrfs_new_inode+0x311/0xa60 fs/btrfs/inode.c:6530 btrfs_create+0x12b/0x270 fs/btrfs/inode.c:6783 lookup_open+0x660/0x780 fs/namei.c:3282 open_last_lookups fs/namei.c:3352 [inline] path_openat+0x465/0xe20 fs/namei.c:3557 do_filp_open+0xe3/0x170 fs/namei.c:3588 do_sys_openat2+0x357/0x4a0 fs/open.c:1200 do_sys_open+0x87/0xd0 fs/open.c:1216 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x34/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x46ae99 Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f46711b9c48 EFLAGS: 00000246 ORIG_RAX: 0000000000000055 RAX: ffffffffffffffda RBX: 000000000078c0a0 RCX: 000000000046ae99 RDX: 0000000000000000 RSI: 00000000000000a1 RDI: 0000000020005800 RBP: 00007f46711b9c80 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000017 R13: 0000000000000000 R14: 000000000078c0a0 R15: 00007ffc129da6e0 ================================================ WARNING: lock held when returning to user space! 5.15.0-rc1 #16 Not tainted ------------------------------------------------ syz-executor/7579 is leaving the kernel with locks still held! 1 lock held by syz-executor/7579: #0: ffff888104b73da8 (btrfs-tree-01/1){+.+.}-{3:3}, at: __btrfs_tree_lock+0x2e/0x1a0 fs/btrfs/locking.c:112 [CAUSE] In btrfs_alloc_tree_block(), after btrfs_init_new_buffer(), the new extent buffer @buf is locked, but if later operations like adding delayed tree ref fail, we just free @buf without unlocking it, resulting above warning. [FIX] Unlock @buf in out_free_buf: label. Reported-by: Hao Sun <sunhao.th@gmail.com> Link: https://lore.kernel.org/linux-btrfs/CACkBjsZ9O6Zr0KK1yGn=1rQi6Crh1yeCRdTSBxx9R99L4xdn-Q@mail.gmail.com/ CC: stable@vger.kernel.org # 5.4+ Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-07Merge tag 'misc-fixes-20211007' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull netfslib, cachefiles and afs fixes from David Howells: - Fix another couple of oopses in cachefiles tracing stemming from the possibility of passing in a NULL object pointer - Fix netfs_clear_unread() to set READ on the iov_iter so that source it is passed to doesn't do the wrong thing (some drivers look at the flag on iov_iter rather than other available information to determine the direction) - Fix afs_launder_page() to write back at the correct file position on the server so as not to corrupt data * tag 'misc-fixes-20211007' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: afs: Fix afs_launder_page() to set correct start file position netfs: Fix READ/WRITE confusion when calling iov_iter_xarray() cachefiles: Fix oops with cachefiles_cull() due to NULL object
2021-10-07ksmbd: fix oops from fuse driverNamjae Jeon
Marios reported kernel oops from fuse driver when ksmbd call mark_inode_dirty(). This patch directly update ->i_ctime after removing mark_inode_ditry() and notify_change will put inode to dirty list. Cc: Tom Talpey <tom@talpey.com> Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com> Cc: Ralph Böhme <slow@samba.org> Cc: Hyunchul Lee <hyc.lee@gmail.com> Reported-by: Marios Makassikis <mmakassikis@freebox.fr> Tested-by: Marios Makassikis <mmakassikis@freebox.fr> Acked-by: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-10-07ksmbd: fix version mismatch with out of treeNamjae Jeon
Fix version mismatch with out of tree, This updated version will be matched with ksmbd-tools. Cc: Tom Talpey <tom@talpey.com> Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com> Cc: Ralph Böhme <slow@samba.org> Cc: Steve French <smfrench@gmail.com> Cc: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-10-07ksmbd: use buf_data_size instead of recalculation in smb3_decrypt_req()Namjae Jeon
Tom suggested to use buf_data_size that is already calculated, to verify these offsets. Cc: Tom Talpey <tom@talpey.com> Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com> Cc: Ralph Böhme <slow@samba.org> Suggested-by: Tom Talpey <tom@talpey.com> Acked-by: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-10-07ksmbd: remove the leftover of smb2.0 dialect supportNamjae Jeon
Although ksmbd doesn't send SMB2.0 support in supported dialect list of smb negotiate response, There is the leftover of smb2.0 dialect. This patch remove it not to support SMB2.0 in ksmbd. Cc: Tom Talpey <tom@talpey.com> Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com> Cc: Ralph Böhme <slow@samba.org> Cc: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-10-07ksmbd: check strictly data area in ksmbd_smb2_check_message()Namjae Jeon
When invalid data offset and data length in request, ksmbd_smb2_check_message check strictly and doesn't allow to process such requests. Cc: Tom Talpey <tom@talpey.com> Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com> Cc: Ralph Böhme <slow@samba.org> Acked-by: Hyunchul Lee <hyc.lee@gmail.com> Reviewed-by: Ralph Boehme <slow@samba.org> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-10-06NFSD: Keep existing listeners on portlist errorBenjamin Coddington
If nfsd has existing listening sockets without any processes, then an error returned from svc_create_xprt() for an additional transport will remove those existing listeners. We're seeing this in practice when userspace attempts to create rpcrdma transports without having the rpcrdma modules present before creating nfsd kernel processes. Fix this by checking for existing sockets before calling nfsd_destroy(). Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-10-06coredump: Don't perform any cleanups before dumping coreEric W. Biederman
Rename coredump_exit_mm to coredump_task_exit and call it from do_exit before PTRACE_EVENT_EXIT, and before any cleanup work for a task happens. This ensures that an accurate copy of the process can be captured in the coredump as no cleanup for the process happens before the coredump completes. This also ensures that PTRACE_EVENT_EXIT will not be visited by any thread until the coredump is complete. Add a new flag PF_POSTCOREDUMP so that tasks that have passed through coredump_task_exit can be recognized and ignored in zap_process. Now that all of the coredumping happens before exit_mm remove code to test for a coredump in progress from mm_release. Replace "may_ptrace_stop()" with a simple test of "current->ptrace". The other tests in may_ptrace_stop all concern avoiding stopping during a coredump. These tests are no longer necessary as it is now guaranteed that fatal_signal_pending will be set if the code enters ptrace_stop during a coredump. The code in ptrace_stop is guaranteed not to stop if fatal_signal_pending returns true. Until this change "ptrace_event(PTRACE_EVENT_EXIT)" could call ptrace_stop without fatal_signal_pending being true, as signals are dequeued in get_signal before calling do_exit. This is no longer an issue as "ptrace_event(PTRACE_EVENT_EXIT)" is no longer reached until after the coredump completes. Link: https://lkml.kernel.org/r/874kaax26c.fsf@disp2133 Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-10-06exit: Factor coredump_exit_mm out of exit_mmEric W. Biederman
Separate the coredump logic from the ordinary exit_mm logic by moving the coredump logic out of exit_mm into it's own function coredump_exit_mm. Link: https://lkml.kernel.org/r/87a6k2x277.fsf@disp2133 Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-10-06exec: Check for a pending fatal signal instead of core_stateEric W. Biederman
Prevent exec continuing when a fatal signal is pending by replacing mmap_read_lock with mmap_read_lock_killable. This is always the right thing to do as userspace will never observe an exec complete when there is a fatal signal pending. With that change it becomes unnecessary to explicitly test for a core dump in progress. In coredump_wait zap_threads arranges under mmap_write_lock for all tasks that use a mm to also have SIGKILL pending, which means mmap_read_lock_killable will always return -EINTR when old_mm->core_state is present. Link: https://lkml.kernel.org/r/87fstux27w.fsf@disp2133 Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-10-06ksmbd: add the check to vaildate if stream protocol length exceeds maximum valueNamjae Jeon
This patch add MAX_STREAM_PROT_LEN macro and check if stream protocol length exceeds maximum value. opencode pdu size check in ksmbd_pdu_size_has_room(). Cc: Tom Talpey <tom@talpey.com> Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com> Cc: Ralph Böhme <slow@samba.org> Acked-by: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-10-05Merge tag 'warning-fixes-20211005' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull misc fs warning fixes from David Howells: "The first four patches fix kerneldoc warnings in fscache, afs, 9p and nfs - they're mostly just comment changes, though there's one place in 9p where a comment got detached from the function it was attached to (v9fs_fid_add) and has to switch places with a function that got inserted between (__add_fid). The patch on the end removes an unused symbol in fscache" * tag 'warning-fixes-20211005' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: fscache: Remove an unused static variable fscache: Fix some kerneldoc warnings shown up by W=1 9p: Fix a bunch of kerneldoc warnings shown up by W=1 afs: Fix kerneldoc warning shown up by W=1 nfs: Fix kerneldoc warning shown up by W=1
2021-10-05fs/sysfs/dir.c: replace S_IRWXU|S_IRUGO|S_IXUGO with 0755 sysfs_create_dir_ns()Luis Chamberlain
If one ends up expanding on this line checkpatch will complain that the combination S_IRWXU|S_IRUGO|S_IXUGO should just be replaced with the octal 0755. Do that. This makes no functional changes. Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/20210927163805.808907-9-mcgrof@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-05fs/ntfs3: Remove unnecessary functionsKonstantin Komarov
We don't need ntfs_xattr_get_acl and ntfs_xattr_set_acl. There are ntfs_get_acl_ex and ntfs_set_acl_ex. Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2021-10-05fs/kernfs/symlink.c: replace S_IRWXUGO with 0777 on kernfs_create_link()Luis Chamberlain
If one ends up extending this line checkpatch will complain about the use of S_IRWXUGO suggesting it is not preferred and that 0777 should be used instead. Take the tip from checkpatch and do that change before we do our subsequent changes. This makes no functional changes. Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/20210927163805.808907-8-mcgrof@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-05fs/ntfs3: Forbid FALLOC_FL_PUNCH_HOLE for normal filesKonstantin Komarov
FALLOC_FL_PUNCH_HOLE isn't allowed with normal files. Filesystem must remember info about hole, but for normal file we can only zero it and forget. Fixes: 4342306f0f0d ("fs/ntfs3: Add file operations and implementation") Now xfstests generic/016 generic/021 generic/022 pass. Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2021-10-05fs/proc/uptime.c: Fix idle time reporting in /proc/uptimeJosh Don
/proc/uptime reports idle time by reading the CPUTIME_IDLE field from the per-cpu kcpustats. However, on NO_HZ systems, idle time is not continually updated on idle cpus, leading this value to appear incorrectly small. /proc/stat performs an accounting update when reading idle time; we can use the same approach for uptime. With this patch, /proc/stat and /proc/uptime now agree on idle time. Additionally, the following shows idle time tick up consistently on an idle machine: (while true; do cat /proc/uptime; sleep 1; done) | awk '{print $2-prev; prev=$2}' Reported-by: Luigi Rizzo <lrizzo@google.com> Signed-off-by: Josh Don <joshdon@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lkml.kernel.org/r/20210827165438.3280779-1-joshdon@google.com