summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2023-04-14Merge tag 'btree-complain-bad-records-6.4_2023-04-11' of ↵Dave Chinner
git://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into guilt/xfs-for-next xfs: standardize btree record checking code [v24.5] While I was cleaning things up for 6.1, I noticed that the btree _query_range and _query_all functions don't perform the same checking that the _get_rec functions perform. In fact, they don't perform /any/ sanity checking, which means that callers aren't warned about impossible records. Therefore, hoist the record validation and complaint logging code into separate functions, and call them from any place where we convert an ondisk record into an incore record. For online scrub, we can replace checking code with a call to the record checking functions in libxfs, thereby reducing the size of the codebase. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>
2023-04-14Merge tag 'scrub-drain-intents-6.4_2023-04-11' of ↵Dave Chinner
git://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into guilt/xfs-for-next xfs: drain deferred work items when scrubbing [v24.5] The design doc for XFS online fsck contains a long discussion of the eventual consistency models in use for XFS metadata. In that chapter, we note that it is possible for scrub to collide with a chain of deferred space metadata updates, and proposes a lightweight solution: The use of a pending-intents counter so that scrub can wait for the system to drain all chains. This patchset implements that scrub drain. The first patch implements the basic mechanism, and the subsequent patches reduce the runtime overhead by converting the implementation to use sloppy counters and introducing jump labels to avoid walking into scrub hooks when it isn't running. This last paradigm repeats elsewhere in this megaseries. v23.1: make intent items take an active ref to the perag structure and document why we bump and drop the intent counts when we do Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>
2023-04-14Merge tag 'scrub-fix-legalese-6.4_2023-04-11' of ↵Dave Chinner
git://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into guilt/xfs-for-next xfs_scrub: fix licensing and copyright notices [v24.5] Fix various attribution problems in the xfs_scrub source code, such as the author's contact information, out of date SPDX tags, and a rough estimate of when the feature was under heavy development. The most egregious parts are the files that are missing license information completely. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>
2023-04-14Merge tag 'pass-perag-refs-6.4_2023-04-11' of ↵Dave Chinner
git://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into guilt/xfs-for-next xfs: pass perag references around when possible [v24.5] Avoid the cost of perag radix tree lookups by passing around active perag references when possible. v24.2: rework some of the naming and whatnot so there's less opencoding Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>
2023-04-14Merge tag 'intents-perag-refs-6.4_2023-04-11' of ↵Dave Chinner
git://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into guilt/xfs-for-next xfs: make intent items take a perag reference [v24.5] Now that we've cleaned up some code warts in the deferred work item processing code, let's make intent items take an active perag reference from their creation until they are finally freed by the defer ops machinery. This change facilitates the scrub drain in the next patchset and will make it easier for the future AG removal code to detect a busy AG in need of quiescing. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>
2023-04-12xfs: fix BUG_ON in xfs_getbmap()Ye Bin
There's issue as follows: XFS: Assertion failed: (bmv->bmv_iflags & BMV_IF_DELALLOC) != 0, file: fs/xfs/xfs_bmap_util.c, line: 329 ------------[ cut here ]------------ kernel BUG at fs/xfs/xfs_message.c:102! invalid opcode: 0000 [#1] PREEMPT SMP KASAN CPU: 1 PID: 14612 Comm: xfs_io Not tainted 6.3.0-rc2-next-20230315-00006-g2729d23ddb3b-dirty #422 RIP: 0010:assfail+0x96/0xa0 RSP: 0018:ffffc9000fa178c0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff888179a18000 RDX: 0000000000000000 RSI: ffff888179a18000 RDI: 0000000000000002 RBP: 0000000000000000 R08: ffffffff8321aab6 R09: 0000000000000000 R10: 0000000000000001 R11: ffffed1105f85139 R12: ffffffff8aacc4c0 R13: 0000000000000149 R14: ffff888269f58000 R15: 000000000000000c FS: 00007f42f27a4740(0000) GS:ffff88882fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000b92388 CR3: 000000024f006000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> xfs_getbmap+0x1a5b/0x1e40 xfs_ioc_getbmap+0x1fd/0x5b0 xfs_file_ioctl+0x2cb/0x1d50 __x64_sys_ioctl+0x197/0x210 do_syscall_64+0x39/0xb0 entry_SYSCALL_64_after_hwframe+0x63/0xcd Above issue may happen as follows: ThreadA ThreadB do_shared_fault __do_fault xfs_filemap_fault __xfs_filemap_fault filemap_fault xfs_ioc_getbmap -> Without BMV_IF_DELALLOC flag xfs_getbmap xfs_ilock(ip, XFS_IOLOCK_SHARED); filemap_write_and_wait do_page_mkwrite xfs_filemap_page_mkwrite __xfs_filemap_fault xfs_ilock(XFS_I(inode), XFS_MMAPLOCK_SHARED); iomap_page_mkwrite ... xfs_buffered_write_iomap_begin xfs_bmapi_reserve_delalloc -> Allocate delay extent xfs_ilock_data_map_shared(ip) xfs_getbmap_report_one ASSERT((bmv->bmv_iflags & BMV_IF_DELALLOC) != 0) -> trigger BUG_ON As xfs_filemap_page_mkwrite() only hold XFS_MMAPLOCK_SHARED lock, there's small window mkwrite can produce delay extent after file write in xfs_getbmap(). To solve above issue, just skip delalloc extents. Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2023-04-12xfs: verify buffer contents when we skip log replayDarrick J. Wong
syzbot detected a crash during log recovery: XFS (loop0): Mounting V5 Filesystem bfdc47fc-10d8-4eed-a562-11a831b3f791 XFS (loop0): Torn write (CRC failure) detected at log block 0x180. Truncating head block from 0x200. XFS (loop0): Starting recovery (logdev: internal) ================================================================== BUG: KASAN: slab-out-of-bounds in xfs_btree_lookup_get_block+0x15c/0x6d0 fs/xfs/libxfs/xfs_btree.c:1813 Read of size 8 at addr ffff88807e89f258 by task syz-executor132/5074 CPU: 0 PID: 5074 Comm: syz-executor132 Not tainted 6.2.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x1b1/0x290 lib/dump_stack.c:106 print_address_description+0x74/0x340 mm/kasan/report.c:306 print_report+0x107/0x1f0 mm/kasan/report.c:417 kasan_report+0xcd/0x100 mm/kasan/report.c:517 xfs_btree_lookup_get_block+0x15c/0x6d0 fs/xfs/libxfs/xfs_btree.c:1813 xfs_btree_lookup+0x346/0x12c0 fs/xfs/libxfs/xfs_btree.c:1913 xfs_btree_simple_query_range+0xde/0x6a0 fs/xfs/libxfs/xfs_btree.c:4713 xfs_btree_query_range+0x2db/0x380 fs/xfs/libxfs/xfs_btree.c:4953 xfs_refcount_recover_cow_leftovers+0x2d1/0xa60 fs/xfs/libxfs/xfs_refcount.c:1946 xfs_reflink_recover_cow+0xab/0x1b0 fs/xfs/xfs_reflink.c:930 xlog_recover_finish+0x824/0x920 fs/xfs/xfs_log_recover.c:3493 xfs_log_mount_finish+0x1ec/0x3d0 fs/xfs/xfs_log.c:829 xfs_mountfs+0x146a/0x1ef0 fs/xfs/xfs_mount.c:933 xfs_fs_fill_super+0xf95/0x11f0 fs/xfs/xfs_super.c:1666 get_tree_bdev+0x400/0x620 fs/super.c:1282 vfs_get_tree+0x88/0x270 fs/super.c:1489 do_new_mount+0x289/0xad0 fs/namespace.c:3145 do_mount fs/namespace.c:3488 [inline] __do_sys_mount fs/namespace.c:3697 [inline] __se_sys_mount+0x2d3/0x3c0 fs/namespace.c:3674 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f89fa3f4aca Code: 83 c4 08 5b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007fffd5fb5ef8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5 RAX: ffffffffffffffda RBX: 00646975756f6e2c RCX: 00007f89fa3f4aca RDX: 0000000020000100 RSI: 0000000020009640 RDI: 00007fffd5fb5f10 RBP: 00007fffd5fb5f10 R08: 00007fffd5fb5f50 R09: 000000000000970d R10: 0000000000200800 R11: 0000000000000206 R12: 0000000000000004 R13: 0000555556c6b2c0 R14: 0000000000200800 R15: 00007fffd5fb5f50 </TASK> The fuzzed image contains an AGF with an obviously garbage agf_refcount_level value of 32, and a dirty log with a buffer log item for that AGF. The ondisk AGF has a higher LSN than the recovered log item. xlog_recover_buf_commit_pass2 reads the buffer, compares the LSNs, and decides to skip replay because the ondisk buffer appears to be newer. Unfortunately, the ondisk buffer is corrupt, but recovery just read the buffer with no buffer ops specified: error = xfs_buf_read(mp->m_ddev_targp, buf_f->blf_blkno, buf_f->blf_len, buf_flags, &bp, NULL); Skipping the buffer leaves its contents in memory unverified. This sets us up for a kernel crash because xfs_refcount_recover_cow_leftovers reads the buffer (which is still around in XBF_DONE state, so no read verification) and creates a refcountbt cursor of height 32. This is impossible so we run off the end of the cursor object and crash. Fix this by invoking the verifier on all skipped buffers and aborting log recovery if the ondisk buffer is corrupt. It might be smarter to force replay the log item atop the buffer and then see if it'll pass the write verifier (like ext4 does) but for now let's go with the conservative option where we stop immediately. Link: https://syzkaller.appspot.com/bug?extid=7e9494b8b399902e994e Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2023-04-12xfs: _{attr,data}_map_shared should take ILOCK_EXCL until iread_extents is ↵Darrick J. Wong
completely done While fuzzing the data fork extent count on a btree-format directory with xfs/375, I observed the following (excerpted) splat: XFS: Assertion failed: xfs_isilocked(ip, XFS_ILOCK_EXCL), file: fs/xfs/libxfs/xfs_bmap.c, line: 1208 ------------[ cut here ]------------ WARNING: CPU: 0 PID: 43192 at fs/xfs/xfs_message.c:104 assfail+0x46/0x4a [xfs] Call Trace: <TASK> xfs_iread_extents+0x1af/0x210 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd] xchk_dir_walk+0xb8/0x190 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd] xchk_parent_count_parent_dentries+0x41/0x80 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd] xchk_parent_validate+0x199/0x2e0 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd] xchk_parent+0xdf/0x130 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd] xfs_scrub_metadata+0x2b8/0x730 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd] xfs_scrubv_metadata+0x38b/0x4d0 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd] xfs_ioc_scrubv_metadata+0x111/0x160 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd] xfs_file_ioctl+0x367/0xf50 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd] __x64_sys_ioctl+0x82/0xa0 do_syscall_64+0x2b/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 The cause of this is a race condition in xfs_ilock_data_map_shared, which performs an unlocked access to the data fork to guess which lock mode it needs: Thread 0 Thread 1 xfs_need_iread_extents <observe no iext tree> xfs_ilock(..., ILOCK_EXCL) xfs_iread_extents <observe no iext tree> <check ILOCK_EXCL> <load bmbt extents into iext> <notice iext size doesn't match nextents> xfs_need_iread_extents <observe iext tree> xfs_ilock(..., ILOCK_SHARED) <tear down iext tree> xfs_iunlock(..., ILOCK_EXCL) xfs_iread_extents <observe no iext tree> <check ILOCK_EXCL> *BOOM* Fix this race by adding a flag to the xfs_ifork structure to indicate that we have not yet read in the extent records and changing the predicate to look at the flag state, not if_height. The memory barrier ensures that the flag will not be set until the very end of the function. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2023-04-12xfs: remove WARN when dquot cache insertion failsDave Chinner
It just creates unnecessary bot noise these days. Reported-by: syzbot+6ae213503fb12e87934f@syzkaller.appspotmail.com Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>
2023-04-12xfs: don't consider future format versions validDave Chinner
In commit fe08cc504448 we reworked the valid superblock version checks. If it is a V5 filesystem, it is always valid, then we checked if the version was less than V4 (reject) and then checked feature fields in the V4 flags to determine if it was valid. What we missed was that if the version is not V4 at this point, we shoudl reject the fs. i.e. the check current treats V6+ filesystems as if it was a v4 filesystem. Fix this. cc: stable@vger.kernel.org Fixes: fe08cc504448 ("xfs: open code sb verifier feature checks") Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>
2023-04-11xfs: complain about bad file mapping records in the ondisk bmbtDarrick J. Wong
Similar to what we've just done for the other btrees, create a function to log corrupt bmbt records and call it whenever we encounter a bad record in the ondisk btree. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-04-11xfs: complain about bad records in query_range helpersDarrick J. Wong
For every btree type except for the bmbt, refactor the code that complains about bad records into a helper and make the ->query_range helpers call it so that corruptions found via that avenue are logged. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-04-11xfs: standardize ondisk to incore conversion for bmap btreesDarrick J. Wong
Fix all xfs_bmbt_disk_get_all callsites to call xfs_bmap_validate_extent and bubble up corruption reports. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-04-11xfs: standardize ondisk to incore conversion for rmap btreesDarrick J. Wong
Create a xfs_rmap_check_irec function to detect corruption in btree records. Fix all xfs_rmap_btrec_to_irec callsites to call the new helper and bubble up corruption reports. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-04-11xfs: return a failure address from xfs_rmap_irec_offset_unpackDarrick J. Wong
Currently, xfs_rmap_irec_offset_unpack returns only 0 or -EFSCORRUPTED. Change this function to return the code address of a failed conversion in preparation for the next patch, which standardizes localized record checking and reporting code. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-04-11xfs: standardize ondisk to incore conversion for refcount btreesDarrick J. Wong
Create a xfs_refcount_check_irec function to detect corruption in btree records. Fix all xfs_refcount_btrec_to_irec callsites to call the new helper and bubble up corruption reports. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-04-11xfs: standardize ondisk to incore conversion for inode btreesDarrick J. Wong
Create a xfs_inobt_check_irec function to detect corruption in btree records. Fix all xfs_inobt_btrec_to_irec callsites to call the new helper and bubble up corruption reports. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-04-11xfs: standardize ondisk to incore conversion for free space btreesDarrick J. Wong
Create a xfs_alloc_btrec_to_irec function to convert an ondisk record to an incore record, and a xfs_alloc_check_irec function to detect corruption. Replace all the open-coded logic with calls to the new helpers and bubble up corruption reports. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-04-11xfs: scrub should use ECHRNG to signal that the drain is neededDarrick J. Wong
In the previous patch, we added jump labels to the intent drain code so that regular filesystem operations need not pay the price of checking for someone (scrub) waiting on intents to drain from some part of the filesystem when that someone isn't running. However, I observed that xfs/285 now spends a lot more time pushing the AIL from the inode btree scrubber than it used to. This is because the inobt scrubber will try push the AIL to try to get logged inode cores written to the filesystem when it sees a weird discrepancy between the ondisk inode and the inobt records. This AIL push is triggered when the setup function sees TRY_HARDER is set; and the requisite EDEADLOCK return is initiated when the discrepancy is seen. The solution to this performance slow down is to use a different result code (ECHRNG) for scrub code to signal that it needs to wait for deferred intent work items to drain out of some part of the filesystem. When this happens, set a new scrub state flag (XCHK_NEED_DRAIN) so that setup functions will activate the jump label. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-04-11xfs: minimize overhead of drain wakeups by using jump labelsDarrick J. Wong
To reduce the runtime overhead even further when online fsck isn't running, use a static branch key to decide if we call wake_up on the drain. For compilers that support jump labels, the call to wake_up is replaced by a nop sled when nobody is waiting for intents to drain. From my initial microbenchmarking, every transition of the static key between the on and off states takes about 22000ns to complete; this is paid entirely by the xfs_scrub process. When the static key is off (which it should be when fsck isn't running), the nop sled adds an overhead of approximately 0.36ns to runtime code. The post-atomic lockless waiter check adds about 0.03ns, which is basically free. For the few compilers that don't support jump labels, runtime code pays the cost of calling wake_up on an empty waitqueue, which was observed to be about 30ns. However, most architectures that have sufficient memory and CPU capacity to run XFS also support jump labels, so this is not much of a worry. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-04-11xfs: clean up scrub context if scrub setup returns -EDEADLOCKDarrick J. Wong
It has been a longstanding convention that online scrub and repair functions can return -EDEADLOCK to signal that they weren't able to obtain some necessary resource. When this happens, the scrub framework is supposed to release all resources attached to the scrub context, set the TRY_HARDER flag in the scrub context flags, and try again. In this context, individual scrub functions are supposed to take all the resources they (incorrectly) speculated were not necessary. We're about to make it so that the functions that lock and wait for a filesystem AG can also return EDEADLOCK to signal that we need to try again with the drain waiters enabled. Therefore, refactor xfs_scrub_metadata to support this behavior for ->setup() functions. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-04-11xfs: allow queued AG intents to drain before scrubbingDarrick J. Wong
When a writer thread executes a chain of log intent items, the AG header buffer locks will cycle during a transaction roll to get from one intent item to the next in a chain. Although scrub takes all AG header buffer locks, this isn't sufficient to guard against scrub checking an AG while that writer thread is in the middle of finishing a chain because there's no higher level locking primitive guarding allocation groups. When there's a collision, cross-referencing between data structures (e.g. rmapbt and refcountbt) yields false corruption events; if repair is running, this results in incorrect repairs, which is catastrophic. Fix this by adding to the perag structure the count of active intents and make scrub wait until it has both AG header buffer locks and the intent counter reaches zero. One quirk of the drain code is that deferred bmap updates also bump and drop the intent counter. A fundamental decision made during the design phase of the reverse mapping feature is that updates to the rmapbt records are always made by the same code that updates the primary metadata. In other words, callers of bmapi functions expect that the bmapi functions will queue deferred rmap updates. Some parts of the reflink code queue deferred refcount (CUI) and bmap (BUI) updates in the same head transaction, but the deferred work manager completely finishes the CUI before the BUI work is started. As a result, the CUI drops the intent count long before the deferred rmap (RUI) update even has a chance to bump the intent count. The only way to keep the intent count elevated between the CUI and RUI is for the BUI to bump the counter until the RUI has been created. A second quirk of the intent drain code is that deferred work items must increment the intent counter as soon as the work item is added to the transaction. When a BUI completes and queues an RUI, the RUI must increment the counter before the BUI decrements it. The only way to accomplish this is to require that the counter be bumped as soon as the deferred work item is created in memory. In the next patches we'll improve on this facility, but this patch provides the basic functionality. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-04-11xfs: add a tracepoint to report incorrect extent refcountsDarrick J. Wong
Add a new tracepoint so that I can see exactly what and where we failed the refcount check. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-04-11xfs: update copyright years for scrub/ filesDarrick J. Wong
Update the copyright years in the scrub/ source code files. This isn't required, but it's helpful to remind myself just how long it's taken to develop this feature. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-04-11xfs: fix author and spdx headers on scrub/ filesDarrick J. Wong
Fix the spdx tags to match current practice, and update the author contact information. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-04-11xfs: create traced helper to get extra perag referencesDarrick J. Wong
There are a few places in the XFS codebase where a caller has either an active or a passive reference to a perag structure and wants to give a passive reference to some other piece of code. Btree cursor creation and inode walks are good examples of this. Replace the open-coded logic with a helper to do this. The new function adds a few safeguards -- it checks that there's at least one reference to the perag structure passed in, and it records the refcount bump in the ftrace information. This makes it much easier to debug perag refcounting problems. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-04-11xfs: give xfs_refcount_intent its own perag referenceDarrick J. Wong
Give the xfs_refcount_intent a passive reference to the perag structure data. This reference will be used to enable scrub intent draining functionality in subsequent patches. Any space being modified by a refcount intent is already allocated, so we need to be able to operate even if the AG is being shrunk or offlined. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-04-11xfs: give xfs_rmap_intent its own perag referenceDarrick J. Wong
Give the xfs_rmap_intent a passive reference to the perag structure data. This reference will be used to enable scrub intent draining functionality in subsequent patches. The space we're (reverse) mapping is already allocated, so we need to be able to operate even if the AG is being shrunk or offlined. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-04-11xfs: give xfs_extfree_intent its own perag referenceDarrick J. Wong
Give the xfs_extfree_intent an passive reference to the perag structure data. This reference will be used to enable scrub intent draining functionality in subsequent patches. The space being freed must already be allocated, so we need to able to run even if the AG is being offlined or shrunk. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-04-11xfs: pass per-ag references to xfs_free_extentDarrick J. Wong
Pass a reference to the per-AG structure to xfs_free_extent. Most callers already have one, so we can eliminate unnecessary lookups. The one exception to this is the EFI code, which the next patch will fix. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-04-11xfs: give xfs_bmap_intent its own perag referenceDarrick J. Wong
Give the xfs_bmap_intent an active reference to the perag structure data. This reference will be used to enable scrub intent draining functionality in subsequent patches. Later, shrink will use these passive references to know if an AG is quiesced or not. The reason why we take a passive ref for a file mapping operation is simple: we're committing to some sort of action involving space in an AG, so we want to indicate our interest in that AG. The space is already allocated, so we need to be able to operate on AGs that are offline or being shrunk. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-04-08Merge tag '6.3-rc5-smb3-cifs-client-fixes' of ↵Linus Torvalds
git://git.samba.org/sfrench/cifs-2.6 Pull cifs client fixes from Steve French: "Two cifs/smb3 client fixes, one for stable: - double lock fix for a cifs/smb1 reconnect path - DFS prefixpath fix for reconnect when server moved" * tag '6.3-rc5-smb3-cifs-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: double lock in cifs_reconnect_tcon() cifs: sanitize paths in cifs_update_super_prepath.
2023-04-08Merge tag 'mm-hotfixes-stable-2023-04-07-16-23' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM fixes from Andrew Morton: "28 hotfixes. 23 are cc:stable and the other five address issues which were introduced during this merge cycle. 20 are for MM and the remainder are for other subsystems" * tag 'mm-hotfixes-stable-2023-04-07-16-23' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (28 commits) maple_tree: fix a potential concurrency bug in RCU mode maple_tree: fix get wrong data_end in mtree_lookup_walk() mm/swap: fix swap_info_struct race between swapoff and get_swap_pages() nilfs2: fix sysfs interface lifetime mm: take a page reference when removing device exclusive entries mm: vmalloc: avoid warn_alloc noise caused by fatal signal nilfs2: initialize "struct nilfs_binfo_dat"->bi_pad field nilfs2: fix potential UAF of struct nilfs_sc_info in nilfs_segctor_thread() zsmalloc: document freeable stats zsmalloc: document new fullness grouping fsdax: force clear dirty mark if CoW mm/hugetlb: fix uffd wr-protection for CoW optimization path mm: enable maple tree RCU mode by default maple_tree: add RCU lock checking to rcu callback functions maple_tree: add smp_rmb() to dead node detection maple_tree: fix write memory barrier of nodes once dead for RCU mode maple_tree: remove extra smp_wmb() from mas_dead_leaves() maple_tree: fix freeing of nodes in rcu mode maple_tree: detect dead nodes in mas_start() maple_tree: be more cautious about dead nodes ...
2023-04-07Merge tag '6.3-rc5-ksmbd-server-fixes' of git://git.samba.org/ksmbdLinus Torvalds
Pull ksmbd server fixes from Steve French: "Four fixes, three for stable: - slab out of bounds fix - lock cancellation fix - minor cleanup to address clang warning - fix for xfstest 551 (wrong parms passed to kvmalloc)" * tag '6.3-rc5-ksmbd-server-fixes' of git://git.samba.org/ksmbd: ksmbd: fix slab-out-of-bounds in init_smb2_rsp_hdr ksmbd: delete asynchronous work from list ksmbd: remove unused is_char_allowed function ksmbd: do not call kvmalloc() with __GFP_NORETRY | __GFP_NO_WARN
2023-04-06cifs: double lock in cifs_reconnect_tcon()Dan Carpenter
This lock was supposed to be an unlock. Fixes: 6cc041e90c17 ("cifs: avoid races in parallel reconnects in smb1") Signed-off-by: Dan Carpenter <error27@gmail.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-04-05nilfs2: fix sysfs interface lifetimeRyusuke Konishi
The current nilfs2 sysfs support has issues with the timing of creation and deletion of sysfs entries, potentially leading to null pointer dereferences, use-after-free, and lockdep warnings. Some of the sysfs attributes for nilfs2 per-filesystem instance refer to metadata file "cpfile", "sufile", or "dat", but nilfs_sysfs_create_device_group that creates those attributes is executed before the inodes for these metadata files are loaded, and nilfs_sysfs_delete_device_group which deletes these sysfs entries is called after releasing their metadata file inodes. Therefore, access to some of these sysfs attributes may occur outside of the lifetime of these metadata files, resulting in inode NULL pointer dereferences or use-after-free. In addition, the call to nilfs_sysfs_create_device_group() is made during the locking period of the semaphore "ns_sem" of nilfs object, so the shrinker call caused by the memory allocation for the sysfs entries, may derive lock dependencies "ns_sem" -> (shrinker) -> "locks acquired in nilfs_evict_inode()". Since nilfs2 may acquire "ns_sem" deep in the call stack holding other locks via its error handler __nilfs_error(), this causes lockdep to report circular locking. This is a false positive and no circular locking actually occurs as no inodes exist yet when nilfs_sysfs_create_device_group() is called. Fortunately, the lockdep warnings can be resolved by simply moving the call to nilfs_sysfs_create_device_group() out of "ns_sem". This fixes these sysfs issues by revising where the device's sysfs interface is created/deleted and keeping its lifetime within the lifetime of the metadata files above. Link: https://lkml.kernel.org/r/20230330205515.6167-1-konishi.ryusuke@gmail.com Fixes: dd70edbde262 ("nilfs2: integrate sysfs support into driver") Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by: syzbot+979fa7f9c0d086fdc282@syzkaller.appspotmail.com Link: https://lkml.kernel.org/r/0000000000003414b505f7885f7e@google.com Reported-by: syzbot+5b7d542076d9bddc3c6a@syzkaller.appspotmail.com Link: https://lkml.kernel.org/r/0000000000006ac86605f5f44eb9@google.com Cc: Viacheslav Dubeyko <slava@dubeyko.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-05nilfs2: initialize "struct nilfs_binfo_dat"->bi_pad fieldTetsuo Handa
nilfs_btree_assign_p() and nilfs_direct_assign_p() are not initializing "struct nilfs_binfo_dat"->bi_pad field, causing uninit-value reports when being passed to CRC function. Link: https://lkml.kernel.org/r/20230326152146.15872-1-konishi.ryusuke@gmail.com Reported-by: syzbot <syzbot+048585f3f4227bb2b49b@syzkaller.appspotmail.com> Link: https://syzkaller.appspot.com/bug?extid=048585f3f4227bb2b49b Reported-by: Dipanjan Das <mail.dipanjan.das@gmail.com> Link: https://lkml.kernel.org/r/CANX2M5bVbzRi6zH3PTcNE_31TzerstOXUa9Bay4E6y6dX23_pg@mail.gmail.com Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: Alexander Potapenko <glider@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-05nilfs2: fix potential UAF of struct nilfs_sc_info in nilfs_segctor_thread()Ryusuke Konishi
The finalization of nilfs_segctor_thread() can race with nilfs_segctor_kill_thread() which terminates that thread, potentially causing a use-after-free BUG as KASAN detected. At the end of nilfs_segctor_thread(), it assigns NULL to "sc_task" member of "struct nilfs_sc_info" to indicate the thread has finished, and then notifies nilfs_segctor_kill_thread() of this using waitqueue "sc_wait_task" on the struct nilfs_sc_info. However, here, immediately after the NULL assignment to "sc_task", it is possible that nilfs_segctor_kill_thread() will detect it and return to continue the deallocation, freeing the nilfs_sc_info structure before the thread does the notification. This fixes the issue by protecting the NULL assignment to "sc_task" and its notification, with spinlock "sc_state_lock" of the struct nilfs_sc_info. Since nilfs_segctor_kill_thread() does a final check to see if "sc_task" is NULL with "sc_state_lock" locked, this can eliminate the race. Link: https://lkml.kernel.org/r/20230327175318.8060-1-konishi.ryusuke@gmail.com Reported-by: syzbot+b08ebcc22f8f3e6be43a@syzkaller.appspotmail.com Link: https://lkml.kernel.org/r/00000000000000660d05f7dfa877@google.com Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-05fsdax: force clear dirty mark if CoWShiyang Ruan
XFS allows CoW on non-shared extents to combat fragmentation[1]. The old non-shared extent could be mwrited before, its dax entry is marked dirty. This results in a WARNing: [ 28.512349] ------------[ cut here ]------------ [ 28.512622] WARNING: CPU: 2 PID: 5255 at fs/dax.c:390 dax_insert_entry+0x342/0x390 [ 28.513050] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache netfs nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables [ 28.515462] CPU: 2 PID: 5255 Comm: fsstress Kdump: loaded Not tainted 6.3.0-rc1-00001-g85e1481e19c1-dirty #117 [ 28.515902] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Arch Linux 1.16.1-1-1 04/01/2014 [ 28.516307] RIP: 0010:dax_insert_entry+0x342/0x390 [ 28.516536] Code: 30 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 48 8b 45 20 48 83 c0 01 e9 e2 fe ff ff 48 8b 45 20 48 83 c0 01 e9 cd fe ff ff <0f> 0b e9 53 ff ff ff 48 8b 7c 24 08 31 f6 e8 1b 61 a1 00 eb 8c 48 [ 28.517417] RSP: 0000:ffffc9000845fb18 EFLAGS: 00010086 [ 28.517721] RAX: 0000000000000053 RBX: 0000000000000155 RCX: 000000000018824b [ 28.518113] RDX: 0000000000000000 RSI: ffffffff827525a6 RDI: 00000000ffffffff [ 28.518515] RBP: ffffea00062092c0 R08: 0000000000000000 R09: ffffc9000845f9c8 [ 28.518905] R10: 0000000000000003 R11: ffffffff82ddb7e8 R12: 0000000000000155 [ 28.519301] R13: 0000000000000000 R14: 000000000018824b R15: ffff88810cfa76b8 [ 28.519703] FS: 00007f14a0c94740(0000) GS:ffff88817bd00000(0000) knlGS:0000000000000000 [ 28.520148] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 28.520472] CR2: 00007f14a0c8d000 CR3: 000000010321c004 CR4: 0000000000770ee0 [ 28.520863] PKRU: 55555554 [ 28.521043] Call Trace: [ 28.521219] <TASK> [ 28.521368] dax_fault_iter+0x196/0x390 [ 28.521595] dax_iomap_pte_fault+0x19b/0x3d0 [ 28.521852] __xfs_filemap_fault+0x234/0x2b0 [ 28.522116] __do_fault+0x30/0x130 [ 28.522334] do_fault+0x193/0x340 [ 28.522586] __handle_mm_fault+0x2d3/0x690 [ 28.522975] handle_mm_fault+0xe6/0x2c0 [ 28.523259] do_user_addr_fault+0x1bc/0x6f0 [ 28.523521] exc_page_fault+0x60/0x140 [ 28.523763] asm_exc_page_fault+0x22/0x30 [ 28.524001] RIP: 0033:0x7f14a0b589ca [ 28.524225] Code: c5 fe 7f 07 c5 fe 7f 47 20 c5 fe 7f 47 40 c5 fe 7f 47 60 c5 f8 77 c3 66 0f 1f 84 00 00 00 00 00 40 0f b6 c6 48 89 d1 48 89 fa <f3> aa 48 89 d0 c5 f8 77 c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 [ 28.525198] RSP: 002b:00007fff1dea1c98 EFLAGS: 00010202 [ 28.525505] RAX: 000000000000001e RBX: 000000000014a000 RCX: 0000000000006046 [ 28.525895] RDX: 00007f14a0c82000 RSI: 000000000000001e RDI: 00007f14a0c8d000 [ 28.526290] RBP: 000000000000006f R08: 0000000000000004 R09: 000000000014a000 [ 28.526681] R10: 0000000000000008 R11: 0000000000000246 R12: 028f5c28f5c28f5c [ 28.527067] R13: 8f5c28f5c28f5c29 R14: 0000000000011046 R15: 00007f14a0c946c0 [ 28.527449] </TASK> [ 28.527600] ---[ end trace 0000000000000000 ]--- To be able to delete this entry, clear its dirty mark before invalidate_inode_pages2_range(). [1] https://lore.kernel.org/linux-xfs/20230321151339.GA11376@frogsfrogsfrogs/ Link: https://lkml.kernel.org/r/1679653680-2-1-git-send-email-ruansy.fnst@fujitsu.com Fixes: f80e1668888f3 ("fsdax: invalidate pages when CoW") Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Darrick J. Wong <djwong@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-05cifs: sanitize paths in cifs_update_super_prepath.Thiago Rafael Becker
After a server reboot, clients are failing to move files with ENOENT. This is caused by DFS referrals containing multiple separators, which the server move call doesn't recognize. v1: Initial patch. v2: Move prototype to header. Link: https://bugzilla.redhat.com/show_bug.cgi?id=2182472 Fixes: a31080899d5f ("cifs: sanitize multiple delimiters in prepath") Actually-Fixes: 24e0a1eff9e2 ("cifs: switch to new mount api") Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Thiago Rafael Becker <tbecker@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-04-04Merge tag 'nfsd-6.3-5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - Fix a crash and a resource leak in NFSv4 COMPOUND processing - Fix issues with AUTH_SYS credential handling - Try again to address an NFS/NFSD/SUNRPC build dependency regression * tag 'nfsd-6.3-5' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: NFSD: callback request does not use correct credential for AUTH_SYS NFS: Remove "select RPCSEC_GSS_KRB5 sunrpc: only free unix grouplist after RCU settles nfsd: call op_release, even when op_func returns an error NFSD: Avoid calling OPDESC() with ops->opnum == OP_ILLEGAL
2023-04-04NFSD: callback request does not use correct credential for AUTH_SYSDai Ngo
Currently callback request does not use the credential specified in CREATE_SESSION if the security flavor for the back channel is AUTH_SYS. Problem was discovered by pynfs 4.1 DELEG5 and DELEG7 test with error: DELEG5 st_delegation.testCBSecParms : FAILURE expected callback with uid, gid == 17, 19, got 0, 0 Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Fixes: 8276c902bbe9 ("SUNRPC: remove uid and gid from struct auth_cred") Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-04-04NFS: Remove "select RPCSEC_GSS_KRB5Chuck Lever
If CONFIG_CRYPTO=n (e.g. arm/shmobile_defconfig): WARNING: unmet direct dependencies detected for RPCSEC_GSS_KRB5 Depends on [n]: NETWORK_FILESYSTEMS [=y] && SUNRPC [=y] && CRYPTO [=n] Selected by [y]: - NFS_V4 [=y] && NETWORK_FILESYSTEMS [=y] && NFS_FS [=y] As NFSv4 can work without crypto enabled, remove the RPCSEC_GSS_KRB5 dependency altogether. Trond says: > It is possible to use the NFSv4.1 client with just AUTH_SYS, and > in fact there are plenty of people out there using only that. The > fact that RFC5661 gets its knickers in a twist about RPCSEC_GSS > support is largely irrelevant to those people. > > The other issue is that ’select’ enforces the strict dependency > that if the NFS client is compiled into the kernel, then the > RPCSEC_GSS and kerberos code needs to be compiled in as well: they > cannot exist as modules. Fixes: e57d06527738 ("NFS & NFSD: Update GSS dependencies") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Niklas Söderlund <niklas.soderlund@ragnatech.se> Suggested-by: Trond Myklebust <trondmy@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-04-03Merge tag 'vfs.misc.fixes.v6.3-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping Pull vfs fix from Christian Brauner: "When a mount or mount tree is made shared the vfs allocates new peer group ids for all mounts that have no peer group id set. Only mounts that aren't marked with MNT_SHARED are relevant here as MNT_SHARED indicates that the mount has fully transitioned to a shared mount. The peer group id handling is done with namespace lock held. On failure, the peer group id settings of mounts for which a new peer group id was allocated need to be reverted and the allocated peer group id freed. The cleanup_group_ids() helper can identify the mounts to cleanup by checking whether a given mount has a peer group id set but isn't marked MNT_SHARED. The deallocation always needs to happen with namespace lock held to protect against concurrent modifications of the propagation settings. This fixes the one place where the namespace lock was dropped before calling cleanup_group_ids()" * tag 'vfs.misc.fixes.v6.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping: fs: drop peer group ids under namespace lock
2023-04-02ksmbd: fix slab-out-of-bounds in init_smb2_rsp_hdrNamjae Jeon
When smb1 mount fails, KASAN detect slab-out-of-bounds in init_smb2_rsp_hdr like the following one. For smb1 negotiate(56bytes) , init_smb2_rsp_hdr() for smb2 is called. The issue occurs while handling smb1 negotiate as smb2 server operations. Add smb server operations for smb1 (get_cmd_val, init_rsp_hdr, allocate_rsp_buf, check_user_session) to handle smb1 negotiate so that smb2 server operation does not handle it. [ 411.400423] CIFS: VFS: Use of the less secure dialect vers=1.0 is not recommended unless required for access to very old servers [ 411.400452] CIFS: Attempting to mount \\192.168.45.139\homes [ 411.479312] ksmbd: init_smb2_rsp_hdr : 492 [ 411.479323] ================================================================== [ 411.479327] BUG: KASAN: slab-out-of-bounds in init_smb2_rsp_hdr+0x1e2/0x1f4 [ksmbd] [ 411.479369] Read of size 16 at addr ffff888488ed0734 by task kworker/14:1/199 [ 411.479379] CPU: 14 PID: 199 Comm: kworker/14:1 Tainted: G OE 6.1.21 #3 [ 411.479386] Hardware name: ASUSTeK COMPUTER INC. Z10PA-D8 Series/Z10PA-D8 Series, BIOS 3801 08/23/2019 [ 411.479390] Workqueue: ksmbd-io handle_ksmbd_work [ksmbd] [ 411.479425] Call Trace: [ 411.479428] <TASK> [ 411.479432] dump_stack_lvl+0x49/0x63 [ 411.479444] print_report+0x171/0x4a8 [ 411.479452] ? kasan_complete_mode_report_info+0x3c/0x200 [ 411.479463] ? init_smb2_rsp_hdr+0x1e2/0x1f4 [ksmbd] [ 411.479497] kasan_report+0xb4/0x130 [ 411.479503] ? init_smb2_rsp_hdr+0x1e2/0x1f4 [ksmbd] [ 411.479537] kasan_check_range+0x149/0x1e0 [ 411.479543] memcpy+0x24/0x70 [ 411.479550] init_smb2_rsp_hdr+0x1e2/0x1f4 [ksmbd] [ 411.479585] handle_ksmbd_work+0x109/0x760 [ksmbd] [ 411.479616] ? _raw_spin_unlock_irqrestore+0x50/0x50 [ 411.479624] ? smb3_encrypt_resp+0x340/0x340 [ksmbd] [ 411.479656] process_one_work+0x49c/0x790 [ 411.479667] worker_thread+0x2b1/0x6e0 [ 411.479674] ? process_one_work+0x790/0x790 [ 411.479680] kthread+0x177/0x1b0 [ 411.479686] ? kthread_complete_and_exit+0x30/0x30 [ 411.479692] ret_from_fork+0x22/0x30 [ 411.479702] </TASK> Fixes: 39b291b86b59 ("ksmbd: return unsupported error on smb1 mount") Cc: stable@vger.kernel.org Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-04-02ksmbd: delete asynchronous work from listNamjae Jeon
When smb2_lock request is canceled by smb2_cancel or smb2_close(), ksmbd is missing deleting async_request_entry async_requests list. Because calling init_smb2_rsp_hdr() in smb2_lock() mark ->synchronous as true and then it will not be deleted in ksmbd_conn_try_dequeue_request(). This patch add release_async_work() to release the ones allocated for async work. Cc: stable@vger.kernel.org Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-04-02Merge tag 'for-6.3-rc4-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - scan block devices in non-exclusive mode to avoid temporary mkfs failures - fix race between quota disable and quota assign ioctls - fix deadlock when aborting transaction during relocation with scrub - ignore fiemap path cache when there are multiple paths for a node * tag 'for-6.3-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: ignore fiemap path cache when there are multiple paths for a node btrfs: fix deadlock when aborting transaction during relocation with scrub btrfs: scan device in non-exclusive mode btrfs: fix race between quota disable and quota assign ioctls
2023-04-01Merge tag '6.3-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull cifs client fixes from Steve French: "Four cifs/smb3 client (reconnect and DFS related) fixes, including two for stable: - DFS oops fix - DFS reconnect recursion fix - An SMB1 parallel reconnect fix - Trivial dead code removal in smb2_reconnect" * tag '6.3-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: get rid of dead check in smb2_reconnect() cifs: prevent infinite recursion in CIFSGetDFSRefer() cifs: avoid races in parallel reconnects in smb1 cifs: fix DFS traversal oops without CONFIG_CIFS_DFS_UPCALL
2023-03-31nfsd: call op_release, even when op_func returns an errorJeff Layton
For ops with "trivial" replies, nfsd4_encode_operation will shortcut most of the encoding work and skip to just marshalling up the status. One of the things it skips is calling op_release. This could cause a memory leak in the layoutget codepath if there is an error at an inopportune time. Have the compound processing engine always call op_release, even when op_func sets an error in op->status. With this change, we also need nfsd4_block_get_device_info_scsi to set the gd_device pointer to NULL on error to avoid a double free. Reported-by: Zhi Li <yieli@redhat.com> Link: https://bugzilla.redhat.com/show_bug.cgi?id=2181403 Fixes: 34b1744c91cc ("nfsd4: define ->op_release for compound ops") Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-03-31NFSD: Avoid calling OPDESC() with ops->opnum == OP_ILLEGALChuck Lever
OPDESC() simply indexes into nfsd4_ops[] by the op's operation number, without range checking that value. It assumes callers are careful to avoid calling it with an out-of-bounds opnum value. nfsd4_decode_compound() is not so careful, and can invoke OPDESC() with opnum set to OP_ILLEGAL, which is 10044 -- well beyond the end of nfsd4_ops[]. Reported-by: Jeff Layton <jlayton@kernel.org> Fixes: f4f9ef4a1b0a ("nfsd4: opdesc will be useful outside nfs4proc.c") Signed-off-by: Chuck Lever <chuck.lever@oracle.com>