path: root/fs
8 daysbtrfs: dump detailed info and specific messages on log replay failuresFilipe Manana
Currently debugging log replay failures can be harder than needed, since all we do now is abort a transaction, which gives us a line number, a stack trace and an error code. But that is most of the time not enough to give a clue about what went wrong. So add a new helper to abort log replay and provide contextual information: 1) Dump the current leaf of the log tree being processed and print the slot we are currently at and the key at that slot; 2) Dump the current subvolume tree leaf if we have any; 3) Print the current stage of log replay; 4) Print the id of the subvolume root associated with the log tree we are currently processing (as we can have multiple); 5) Print some error message to mention what we were trying to do when we got an error. Replace all transaction abort calls (btrfs_abort_transaction()) with the new helper btrfs_abort_log_replay(), which, besides dumping all that extra information, also aborts the current transaction. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
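To illustrate the idea, not the actual btrfs implementation, here is a minimal sketch of an "abort with context" helper: it records only the first error and prints the replay stage, subvolume root id and current slot before giving up. All names (struct log_replay_ctx, abort_log_replay(), the stage values) are hypothetical.

  #include <stdarg.h>
  #include <stdio.h>

  enum replay_stage { REPLAY_DIR_INDEXES, REPLAY_INODES, REPLAY_ALL };

  struct log_replay_ctx {                 /* hypothetical, for illustration only */
          enum replay_stage stage;        /* current log replay stage */
          unsigned long long root_id;     /* subvolume root being replayed */
          int slot;                       /* slot in the current log leaf */
          int error;                      /* first error recorded, 0 if none */
  };

  /* Record the first error and dump the context that led to it. */
  static void abort_log_replay(struct log_replay_ctx *ctx, int error,
                               const char *fmt, ...)
  {
          va_list args;

          if (ctx->error)                 /* only report the first failure */
                  return;
          ctx->error = error;

          fprintf(stderr, "log replay aborted: stage=%d root=%llu slot=%d error=%d: ",
                  ctx->stage, ctx->root_id, ctx->slot, error);
          va_start(args, fmt);
          vfprintf(stderr, fmt, args);
          va_end(args);
          fputc('\n', stderr);
  }

  int main(void)
  {
          struct log_replay_ctx ctx = { .stage = REPLAY_INODES, .root_id = 257, .slot = 12 };

          abort_log_replay(&ctx, -5, "failed to overwrite item for inode %llu", 280ULL);
          return 0;
  }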
8 daysbtrfs: abort transaction if we fail to update inode in log replay dir fixupFilipe Manana
If we fail to update the inode at link_to_fixup_dir(), we don't abort the transaction, we just propagate the error up the call chain, which makes it hard to pinpoint the error to the inode update. So abort the transaction if the inode update call fails, so that if it happens we know immediately. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: abort transaction if we fail to find dir item during log replayFilipe Manana
At __add_inode_ref(), if we get an error when trying to look up a dir item we don't abort the transaction but propagate the error up the call chain, so that somewhere else up in the call chain the transaction is aborted. This however makes it hard to know that the failure comes from looking up a dir item, so add a transaction abort in case we fail there, so that we immediately pinpoint where the problem comes from during log replay. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: remove pointless inode lookup when processing extrefs during log replayFilipe Manana
At unlink_extrefs_not_in_log() we do an inode lookup of the directory but we already have the directory inode accessible as a function argument, so the lookup is redundant. Remove it and use the directory inode passed in as an argument. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: stop passing inode object IDs to __add_inode_ref() in log replayFilipe Manana
There's no point in passing the inode and parent inode object IDs to __add_inode_ref() and its helpers because we can get them by calling btrfs_ino() against the inode and the directory inode, and we pass both inodes to __add_inode_ref() and its helpers. So remove the object ID parameters to reduce the number of arguments passed and to make things less confusing. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: add path for subvolume tree changes to struct walk_controlFilipe Manana
While replaying log trees we need to do searches and updates to subvolume trees and for that we use a path that we allocate in replay_one_buffer() and then pass it as a parameter to other functions deeper in the log replay call chain. Instead of passing it as a parameter, add it to struct walk_control since we pass a pointer to that structure to every log replay function. This reduces the number of arguments passed to the functions and it will be needed and important for an upcoming change that improves error reporting for log replay. Also name the new field in struct walk_control 'subvol_path' - while that is longer to type, the naming makes it clear it's used for subvolume tree operations, since many of the log replay functions operate both on subvolume and log trees, and for the log tree side we have struct walk_control::log_leaf to make it obvious that one is an extent buffer from a log tree. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: remove redundant path release when overwriting item during log replayFilipe Manana
At overwrite_item() we have a redundant btrfs_release_path() just before failing with -ENOMEM, as the caller who passed in the path will free it and therefore also release any refcounts and locks on the extent buffers of the path. So remove it. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: remove redundant path release when processing dentry during log replayFilipe Manana
At replay_one_name() we have a redundant btrfs_release_path() just before calling insert_one_name(), as some lines above we have already released the path with another btrfs_release_path() call. So remove it. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: avoid unnecessary path allocation when replaying a dir itemFilipe Manana
There's no need to allocate 'fixup_path' at replay_one_dir_item(), as the path passed as an argument is unused by the time link_to_fixup_dir() is called (replay_one_name() releases the path before it returns). Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: avoid path allocations when dropping extents during log replayFilipe Manana
We can avoid a path allocation in the btrfs_drop_extents() calls we have at replay_one_extent() and replay_one_buffer() by passing the path we already have in those contexts, as it's unused by the time they call btrfs_drop_extents(). Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: avoid unnecessary path allocation at fixup_inode_link_count()Filipe Manana
There's no need to allocate a path as our single caller already has a path that we can use. So pass the caller's path and use it. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: add current log leaf, key and slot to struct walk_controlFilipe Manana
A lot of the log replay functions get passed the current log leaf being processed as well as the current slot and the key at that slot. Instead of passing them as parameters, add them to struct walk_control so that we reduce the number of parameters. This is also going to be needed for further changes that improve error reporting during log replay. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
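Pulling the pieces from this and the related walk_control patches in this log together, the structure ends up carrying most of the per-replay state. A rough sketch of the resulting shape: field names are taken from the commit messages where they are mentioned (subvol_path, log_leaf, root, log, free, pin), while the rest (trans, stage, log_key, log_slot and all types) are assumptions for illustration and do not mirror the kernel sources.

  #include <stdbool.h>

  /* Forward declarations only; the real kernel types are not reproduced here. */
  struct btrfs_root;
  struct btrfs_trans_handle;
  struct btrfs_path;
  struct extent_buffer;

  struct btrfs_key {                      /* simplified objectid/type/offset key */
          unsigned long long objectid;
          unsigned char type;
          unsigned long long offset;
  };

  struct walk_control {
          bool free;                      /* Free the extent buffers while walking. */
          bool pin;                       /* Pin down the referenced extents. */
          int stage;                      /* Current log replay stage. */
          struct btrfs_trans_handle *trans;   /* Transaction used for replay (name assumed). */
          struct btrfs_root *root;        /* Subvolume root we are replaying to. */
          struct btrfs_root *log;         /* Log tree currently being walked. */
          struct btrfs_path *subvol_path; /* Path for subvolume tree searches and updates. */
          struct extent_buffer *log_leaf; /* Current log tree leaf being processed. */
          struct btrfs_key log_key;       /* Key at the current slot (name assumed). */
          int log_slot;                   /* Current slot in the log leaf (name assumed). */
  };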
8 daysbtrfs: use the inode item boolean everywhere in overwrite_item()Filipe Manana
We have this boolean 'inode_item' to tell if we are processing an inode item key, and we use it in a couple of places, while in another two places we open code the check by comparing the key type to the inode item type. Make this consistent and use the boolean everywhere. Also rename it from 'inode_item' to 'is_inode_item', which makes it more clear that it's a boolean and not an instance of struct btrfs_inode_item, and make it const too. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: use level argument in log tree walk callback replay_one_buffer()Filipe Manana
We already have the extent buffer's level in an argument, there's no need to first ensure the extent buffer's data is loaded (by calling btrfs_read_extent_buffer()) and then call btrfs_header_level() to check the level. So use the level argument and do the check before calling btrfs_read_extent_buffer(). Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: use level argument in log tree walk callback process_one_buffer()Filipe Manana
We already have the extent buffer's level in an argument, there's no need to call btrfs_header_level(). So use the level argument and make the code shorter. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: pass walk_control structure to overwrite_item()Filipe Manana
Instead of passing the transaction and subvolume root as arguments to overwrite_item(), pass the walk_control structure as we can grab them from the structure. This reduces the number of arguments passed and it's going to be needed by an incoming change that improves error reporting for log replay. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: pass walk_control structure to drop_one_dir_item() and helpersFilipe Manana
Instead of passing the transaction as an argument to drop_one_dir_item() and its helpers (link_to_fixup_dir() and unlink_inode_for_log_replay()), pass the walk_control structure as we can access the transaction from it and the subvolume root. This is going to be needed by an incoming change that improves error reporting for log replay and also reduces the number of arguments passed to link_to_fixup_dir(). Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: pass walk_control structure to replay_one_dir_item() and replay_one_name()Filipe Manana
Instead of passing the transaction, subvolume root and log tree as arguments, pass the walk_control structure as we can grab all of those from the structure. This reduces the number of arguments passed and it's going to be needed by an incoming change that improves error reporting for log replay. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: pass walk_control structure to add_inode_ref() and helpersFilipe Manana
Instead of passing the transaction, subvolume root and log tree as arguments to add_inode_ref() and its helpers (__add_inode_ref(), unlink_refs_not_in_log(), unlink_extrefs_not_in_log() and unlink_old_inode_refs()), pass the walk_control structure as we can access all of those from the structure. This reduces the number of arguments passed and it's going to be needed by an incoming change that improves error reporting for log replay. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: pass walk_control structure to replay_one_extent()Filipe Manana
Instead of passing the transaction and subvolume root as arguments to replay_one_extent(), pass the walk_control structure as we can grab all of those from the structure. This reduces the number of arguments passed and it's going to be needed by an incoming change that improves error reporting for log replay. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: pass walk_control structure to check_item_in_log()Filipe Manana
Instead of passing the transaction and log tree as arguments to check_item_in_log(), pass the walk_control structure as we can grab those from the structure. This reduces the number of arguments passed and it's going to be needed by an incoming change that improves error reporting for log replay. Notice that a NULL log root argument to check_item_in_log() makes it unconditionally delete a directory entry, so since the walk_control always has a non-NULL log root, we add an extra boolean to check_item_in_log() to tell it if it should unconditionally delete a directory entry, preserving the behaviour and also making it a bit more clear. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
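The change from a NULL pointer doubling as a flag to an explicit boolean can be illustrated with a small standalone sketch; check_item_old(), check_item_new() and struct tree are made-up names, not the btrfs functions.

  #include <stdbool.h>
  #include <stddef.h>
  #include <stdio.h>

  struct tree { const char *name; };

  /* Old style: a NULL log tree doubles as an "always delete" flag. */
  static void check_item_old(const struct tree *log, int key)
  {
          if (!log) {                     /* implicit flag hidden in a pointer */
                  printf("delete key %d unconditionally\n", key);
                  return;
          }
          printf("check key %d against %s\n", key, log->name);
  }

  /* New style: the log tree is always valid, the intent is an explicit bool. */
  static void check_item_new(const struct tree *log, int key, bool force_delete)
  {
          if (force_delete) {
                  printf("delete key %d unconditionally\n", key);
                  return;
          }
          printf("check key %d against %s\n", key, log->name);
  }

  int main(void)
  {
          struct tree log = { "log tree" };

          check_item_old(NULL, 42);
          check_item_new(&log, 42, true);
          return 0;
  }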
8 daysbtrfs: pass walk_control structure to replay_dir_deletes()Filipe Manana
Instead of passing the transaction, subvolume root and log tree as arguments to replay_dir_deletes(), pass the walk_control structure as we can grab all of those from the structure. This reduces the number of arguments passed and it's going to be needed by an incoming change that improves error reporting for log replay. This also requires changing fixup_inode_link_counts() and fixup_inode_link_count() to take that structure as an argument since fixup_inode_link_count() makes a call to replay_dir_deletes(). Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: move up the definition of struct walk_controlFilipe Manana
In upcoming changes we need to pass struct walk_control as an argument to replay_dir_deletes() and link_to_fixup_dir(), so we need to move its definition above the prototypes of those functions. So move it up right below the enum that defines log replay stages and before any functions and function prototypes are declared. Also fixup the comments while moving it so that they comply with the preferred code style (capitalize the first word in a sentence, end sentences with punctuation, make lines wider and closer to the 80 character limit). Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: pass walk_control structure to replay_xattr_deletes()Filipe Manana
Instead of passing the transaction, subvolume root and log tree as arguments to replay_xattr_deletes(), pass the walk_control structure as we can grab all of those from the structure. This reduces the number of arguments passed and it's going to be needed by an incoming change that improves error reporting for log replay. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: always drop log root tree reference in btrfs_replay_log()Filipe Manana
Currently we have this odd behaviour: 1) At btrfs_replay_log() we drop the reference of the log root tree if the call to btrfs_recover_log_trees() failed; 2) But if the call to btrfs_recover_log_trees() did not fail, we don't drop the reference in btrfs_replay_log() - we expect that btrfs_recover_log_trees() does it in case it returns success. Let's simplify this and make btrfs_replay_log() always drop the reference on the log root tree. Not only does this simplify the code, it is also what makes sense, since it's btrfs_replay_log() that grabbed the reference in the first place. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
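The ownership rule being applied here, the function that grabbed the reference is the one that drops it, on both success and failure, can be sketched generically (this is not the btrfs code, just an illustration with made-up names).

  #include <stdio.h>

  struct obj { int refs; };

  static void get_ref(struct obj *o) { o->refs++; }
  static void put_ref(struct obj *o) { o->refs--; }

  static int recover(struct obj *o) { (void)o; return -5; /* pretend -EIO */ }

  /* The caller grabbed the reference, so the caller always drops it. */
  static int replay(struct obj *log_tree)
  {
          int ret;

          get_ref(log_tree);
          ret = recover(log_tree);
          put_ref(log_tree);              /* dropped on both success and failure */
          return ret;
  }

  int main(void)
  {
          struct obj log = { .refs = 1 };

          replay(&log);
          printf("refs after replay: %d\n", log.refs);   /* still 1 */
          return 0;
  }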
8 daysbtrfs: stop setting log_root_tree->log_root to NULL in btrfs_recover_log_trees()Filipe Manana
There's no point in setting log_root_tree->log_root to NULL as it is already NULL, since we never assigned anything to it before. It's also meaningless, because a log root never has a value other than NULL in its ->log_root field; it can be non-NULL only for non-log roots. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: stop passing transaction parameter to log tree walk functionsFilipe Manana
It's unnecessary to pass a transaction parameter since struct walk_control already has a member that points to the transaction, so we can make the functions access the structure. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: deduplicate log root free in error paths from btrfs_recover_log_trees()Filipe Manana
Instead of duplicating the dropping of a log tree in case we jump to the 'error' label, move the dropping under the 'error' label and get rid of the unnecessary setting of the log root to NULL, since we return immediately after. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: add and use a log root field to struct walk_controlFilipe Manana
Instead of passing an extra log root parameter for the log tree walk functions and callbacks, add the log tree to struct walk_control and make those functions and callbacks extract the log root from that structure, reducing the number of parameters. This also simplifies further upcoming changes to report log tree replay failures. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: rename root to log in walk_down_log_tree() and walk_up_log_tree()Filipe Manana
Everywhere we have a log root we name it 'log' or 'log_root', except in walk_down_log_tree() and walk_up_log_tree() where we name it 'root', which is not only inconsistent but also confusing, since we typically use 'root' when naming variables that refer to a subvolume tree. So for clarity and consistency rename the 'root' argument to 'log'. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: rename replay_dest member of struct walk_control to rootFilipe Manana
Everywhere else we refer to a subvolume root we are replaying to simply as 'root', so rename 'replay_dest' to 'root' for consistency and a shorter, more meaningful name. While at it also update the comment to be more detailed and comply with the preferred style (first word in a sentence is capitalized and sentences end with punctuation). Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: use booleans in walk control structure for log replayFilipe Manana
The 'free' and 'pin' members of struct walk_control, used during log replay and when freeing a log tree, are defined as integers but in practice are used as booleans. Change their type to bool and while at it update their comments to be more detailed and comply with the preferred comment style (first word in a sentence is capitalized, sentences end with punctuation and the comment opening (/*) is on a line of its own). Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: cache max and min order inside btrfs_fs_infoQu Wenruo
Inside btrfs_fs_info we cache several bit shifts, like sectorsize_bits. Apply the same to the max and min folio orders so that we can skip the calculation every time the mapping order needs to be applied. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: introduce btrfs_bio_for_each_block_all() helperQu Wenruo
Currently if we want to iterate all blocks inside a bio, we do something like this: bio_for_each_segment_all(bvec, bio, iter_all) { for (off = 0; off < bvec->bv_len; off += sectorsize) { /* Iterate blocks using bv + off */ } } That's fine for now, but it will not handle future bs > ps, as bio_for_each_segment_all() is a single-page iterator that will always return a bvec no larger than a page. But for bs > ps cases, we need a full folio (which covers at least one block) so that we can work on the block. To address this problem and handle future bs > ps cases better: - Introduce a helper btrfs_bio_for_each_block_all() This helper will create a local bvec_iter, which has the size of the target bio. Then grab the physical address of the current location, and advance the iterator by the block size. - Use btrfs_bio_for_each_block_all() to replace existing call sites Including: * set_bio_pages_uptodate() in raid56 * verify_bio_data_sectors() in raid56 Both result in much easier to read code. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
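A simplified userspace analogue of the problem being solved: iterating in page-sized chunks only works while a block fits in a page, while iterating directly in block-sized steps works regardless of how blocks and pages relate. The names below (for_each_block(), process_block()) are made up; the real btrfs_bio_for_each_block_all() walks a bio and yields physical addresses.

  #include <stddef.h>
  #include <stdio.h>

  #define PAGE_SIZE 4096u

  /* Process one block starting at data + offset. */
  static void process_block(const unsigned char *data, size_t offset, size_t blocksize)
  {
          printf("block at offset %zu, size %zu\n", offset, blocksize);
          (void)data;
  }

  /* Iterate a buffer block by block, independent of how it maps to pages. */
  static void for_each_block(const unsigned char *data, size_t len, size_t blocksize)
  {
          for (size_t off = 0; off < len; off += blocksize)
                  process_block(data, off, blocksize);
  }

  int main(void)
  {
          static unsigned char buf[4 * PAGE_SIZE];

          for_each_block(buf, sizeof(buf), 4096);    /* bs == ps */
          for_each_block(buf, sizeof(buf), 16384);   /* bs > ps also works */
          return 0;
  }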
8 daysbtrfs: introduce btrfs_bio_for_each_block() helperQu Wenruo
Currently if we want to iterate a bio in block units, we do something like this: while (iter->bi_size) { struct bio_vec bv = bio_iter_iovec(); /* Do something using the bv */ bio_advance_iter_single(&bbio->bio, iter, sectorsize); } That's fine for now, but it will not handle future bs > ps, as bio_iter_iovec() returns a single-page bvec, meaning bv_len will not exceed the page size. This means code using that bv can only handle a block if bs <= ps. To address this problem and handle future bs > ps cases better: - Introduce a helper btrfs_bio_for_each_block() Instead of bio_vec, which has single-page and multi-page versions (and the multi-page version has quite some limits), use my favorite way to represent a block, phys_addr_t. For bs <= ps cases, nothing is changed, except for a very small overhead to convert the phys_addr_t to a folio, then use the proper folio helpers to handle the possible highmem cases. For bs > ps cases, all blocks will be backed by large folios, meaning every folio will cover at least one block. And still use proper folio helpers to handle highmem cases. With phys_addr_t, we will handle both large folios and highmem properly. So there is no better single variable to represent a btrfs block than phys_addr_t. - Extract the data block csum calculation into a helper The new helper, btrfs_calculate_block_csum(), will be utilized by btrfs_csum_one_bio(). - Use btrfs_bio_for_each_block() to replace existing call sites Including: * index_one_bio() from raid56.c Very straightforward. * btrfs_check_read_bio() Also update repair_one_sector() to grab the folio using phys_addr_t, and do extra checks to make sure the folio covers at least one block. We do not need to bother with bv_len at all now. * btrfs_csum_one_bio() Now we can move the highmem handling into a dedicated helper, calculate_block_csum(), and use the btrfs_bio_for_each_block() helper. There is one exception in btrfs_decompress_buf2page(), which is copying decompressed data into the original bio; it does not iterate in block size units, thus we don't need to bother. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: concentrate highmem handling for data verificationQu Wenruo
Currently for btrfs checksum verification, we do it in the following pattern: kaddr = kmap_local_*(); ret = btrfs_check_sector_csum(kaddr); kunmap_local(kaddr); It's OK for now, but it's still not following the patterns of the helpers inside linux/highmem.h, which never require a virtual memory address. Those highmem helpers mostly accept a folio, some offset/length inside the folio, and in the implementation they check if the folio needs partial kmap, and do the handling. Inspired by those formal highmem helpers, enhance the highmem handling of data checksum verification by: - Rename btrfs_check_sector_csum() to btrfs_check_block_csum() To follow the more common term "block" used in all other major filesystems. - Pass a physical address into btrfs_check_block_csum() and btrfs_data_csum_ok() The physical address is always available even for a highmem page, since it's the page frame number << PAGE_SHIFT plus the offset in the page. And with that physical address, we can grab the folio covering the page, and do extra checks to ensure it covers at least one block. This also allows us to do the kmap inside btrfs_check_block_csum(). This means all the extra HIGHMEM handling will be concentrated into btrfs_check_block_csum(), and no callers will need to bother with highmem by themselves. - Properly zero out the block if the csum mismatches Since btrfs_data_csum_ok() only gets a paddr, we cannot and should not use memzero_bvec(), which only accepts a single page bvec. Instead use the paddr to grab the folio and call folio_zero_range(). Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
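The shape of the change, moving the map/unmap of the data into the checksum helper so that callers only pass a location, can be shown with a small standalone sketch. A plain base pointer plus offset stands in for the kernel's phys_addr_t and kmap_local_*(), the checksum is a toy, and all names are illustrative.

  #include <stdint.h>
  #include <stdio.h>
  #include <string.h>

  /* Stand-ins for kmap_local_folio()/kunmap_local(); illustration only. */
  static void *map_block(unsigned char *base, size_t offset) { return base + offset; }
  static void unmap_block(void *kaddr) { (void)kaddr; }

  static uint32_t checksum(const void *data, size_t len)
  {
          const unsigned char *p = data;
          uint32_t sum = 0;

          while (len--)
                  sum = sum * 31 + *p++;  /* toy checksum, not the real CRC */
          return sum;
  }

  /*
   * New style: the caller passes a location (here base + offset, in the kernel
   * a phys_addr_t), and mapping/unmapping is concentrated inside the helper.
   */
  static int check_block_csum(unsigned char *base, size_t offset, size_t blocksize,
                              uint32_t expected)
  {
          void *kaddr = map_block(base, offset);
          int ok = checksum(kaddr, blocksize) == expected;

          unmap_block(kaddr);
          return ok;
  }

  int main(void)
  {
          unsigned char block[4096];

          memset(block, 0xab, sizeof(block));
          printf("csum ok: %d\n", check_block_csum(block, 0, sizeof(block),
                                                   checksum(block, sizeof(block))));
          return 0;
  }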
8 daysbtrfs: support all block sizes which is no larger than page sizeQu Wenruo
Currently if block size < page size, btrfs only supports one single config: 4K block size. This is mostly to reduce the test configurations, as 4K is going to be the default block size for all architectures. However all other major filesystems have no artificial limits on the supported block size, and some already support block sizes larger than the page size. Since the btrfs subpage block support has been there for a long time, it's time for us to enable support for all block sizes <= page size. So enable support for all block sizes, as long as they are no larger than the page size, for experimental builds. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: scrub: replace max_t()/min_t() with clamp() in scrub_throttle_dev_io()Thorsten Blum
Replace max_t() followed by min_t() with a single clamp(). As was pointed out by David Laight in https://lore.kernel.org/linux-btrfs/20250906122458.75dfc8f0@pumpkin/ the calculation may overflow u32 when the input value is too large, so clamp_t() is not used. In practice the expected values are in the range of megabytes to gigabytes (throughput limit) so the bug would not happen. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Reviewed-by: David Sterba <dsterba@suse.com> [ Use clamp() and add explanation. ] Signed-off-by: David Sterba <dsterba@suse.com>
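The reason clamp_t() was avoided can be demonstrated with a tiny example: clamp_t(u32, ...) casts its inputs to u32 before comparing, so a large 64-bit value gets truncated, while clamp() keeps the natural type. The macros below are simplified stand-ins for the kernel ones (no type checking, arguments evaluated more than once), for illustration only.

  #include <stdint.h>
  #include <stdio.h>

  /* Simplified stand-ins for the kernel's clamp()/clamp_t() macros. */
  #define clamp(val, lo, hi)      ((val) < (lo) ? (lo) : (val) > (hi) ? (hi) : (val))
  #define clamp_t(type, val, lo, hi) clamp((type)(val), (type)(lo), (type)(hi))

  int main(void)
  {
          uint64_t big = 8ull * 1024 * 1024 * 1024;   /* 8 GiB, does not fit in u32 */

          /* Truncated to u32 first: 8 GiB becomes 0 and clamps to the lower bound. */
          printf("clamp_t: %u\n", (unsigned)clamp_t(uint32_t, big, 1024u, 1u << 30));
          /* Kept as u64 and clamped correctly to the 1 GiB upper bound. */
          printf("clamp:   %llu\n", (unsigned long long)clamp(big, 1024ull, 1ull << 30));
          return 0;
  }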
8 daysbtrfs: fix typos in comments and stringsDavid Sterba
Annual typo fixing pass. Strangely, codespell found only about 30% of what is in this patch; the rest was done manually using a text spellchecker with a custom dictionary of acceptable terms. Reviewed-by: Neal Gompa <neal@gompa.dev> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: reduce compression workspace buffer space to block sizeQu Wenruo
Currently the compression workspace buffer size is always based on PAGE_SIZE, but btrfs has supported subpage block sizes for some time. This means that for a one-shot compression algorithm like lzo, we're wasting quite some memory if the block size is smaller than the page size, as LZO only works on one block (thus one-shot). On 64K page sized systems with 4K block size, it means we only need at most 8K buffer space for lzo, but in reality we're allocating a 64K buffer. So to reduce the memory usage, change all workspace buffers to base their size on the block size rather than the page size. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
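As a rough worked example of the saving, using the commonly cited LZO worst-case output bound of len + len/16 + 64 + 3 (an assumption here; the exact buffer sizing and rounding in btrfs may differ, the message above cites at most 8K for a 4K block):

  #include <stdio.h>

  /* Commonly cited LZO worst-case output bound; an assumption for illustration. */
  static unsigned long lzo_worst_compress(unsigned long len)
  {
          return len + len / 16 + 64 + 3;
  }

  int main(void)
  {
          /* Buffer sized for a whole 64K page vs for a single 4K block. */
          printf("64K page:  %lu bytes\n", lzo_worst_compress(64 * 1024));  /* 69699 */
          printf("4K block:  %lu bytes\n", lzo_worst_compress(4 * 1024));   /* 4419  */
          return 0;
  }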
8 daysbtrfs: rename btrfs_compress_op to btrfs_compress_levelsQu Wenruo
Since all workspace managers are per-fs, there is no need nor way to store them inside btrfs_compress_op::wsm anymore. With that said, we can do the following modifications: - Remove zstd_workspace_manager::ops Zstd always grabs the global btrfs_compress_op[]. - Remove the btrfs_compress_op::wsm member - Rename btrfs_compress_op to btrfs_compress_levels This should make it more clear that the btrfs_compress_levels structures are only there to indicate the levels of each compression algorithm. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: cleanup the per-module compression workspace managersQu Wenruo
Since all workspaces are handled by the per-fs workspace managers, we can safely remove the old per-module managers. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: migrate to use per-fs workspace managerQu Wenruo
There are several interfaces involved for each algorithm: - alloc workspace All algorithms allocate a workspace without needing the workspace manager, so no change is needed. - get workspace This involves checking the workspace manager to find a free one, and if not, allocating a new one. None and lzo share the same generic btrfs_get_workspace() helper, we only need to update that function to use the per-fs manager. For zlib it uses a wrapper around btrfs_get_workspace(), so no special work is needed. For zstd, update zstd_find_workspace() and zstd_get_workspace() to utilize the per-fs manager. - put workspace None/zlib/lzo share the same btrfs_put_workspace(), update that function to use the per-fs manager. For zstd, it's zstd_put_workspace(), with the same update. - zstd specific timer This is the timer to reclaim workspaces, change it to grab the per-fs workspace manager instead. Now all workspaces are managed by the per-fs manager. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
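A minimal standalone sketch of the get/put scheme described above: a per-instance manager hands out a cached workspace if one is free and otherwise allocates a new one sized for that instance's block size. All names are illustrative; the real manager also deals with preallocation and the zstd reclaim timer.

  #include <stdlib.h>
  #include <stdio.h>
  #include <pthread.h>

  struct workspace {
          struct workspace *next;
          unsigned char *buf;
          size_t buf_len;
  };

  /* One manager per filesystem instance, so buffers can match its block size. */
  struct workspace_manager {
          pthread_mutex_t lock;
          struct workspace *free_list;
          size_t block_size;
  };

  static struct workspace *alloc_workspace(size_t block_size)
  {
          struct workspace *ws = calloc(1, sizeof(*ws));

          if (!ws)
                  return NULL;
          ws->buf_len = block_size;
          ws->buf = malloc(ws->buf_len);
          if (!ws->buf) {
                  free(ws);
                  return NULL;
          }
          return ws;
  }

  /* Reuse a cached workspace if possible, otherwise allocate a new one. */
  static struct workspace *get_workspace(struct workspace_manager *wsm)
  {
          struct workspace *ws;

          pthread_mutex_lock(&wsm->lock);
          ws = wsm->free_list;
          if (ws)
                  wsm->free_list = ws->next;
          pthread_mutex_unlock(&wsm->lock);

          return ws ? ws : alloc_workspace(wsm->block_size);
  }

  /* Return a workspace to this manager's free list. */
  static void put_workspace(struct workspace_manager *wsm, struct workspace *ws)
  {
          pthread_mutex_lock(&wsm->lock);
          ws->next = wsm->free_list;
          wsm->free_list = ws;
          pthread_mutex_unlock(&wsm->lock);
  }

  int main(void)
  {
          struct workspace_manager wsm = { .block_size = 4096 };
          struct workspace *ws;

          pthread_mutex_init(&wsm.lock, NULL);
          ws = get_workspace(&wsm);
          if (!ws)
                  return 1;
          printf("got workspace with a %zu byte buffer\n", ws->buf_len);
          put_workspace(&wsm, ws);
          return 0;
  }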
8 daysbtrfs: add generic workspace manager initializationQu Wenruo
This involves: - Add (alloc|free)_workspace_manager helpers. These are the helpers to alloc/free the workspace_manager structure. The allocator will allocate a workspace_manager structure, initialize it, and pre-allocate one workspace for it. The freer will do the cleanup and set the manager pointer to NULL. - Call alloc_workspace_manager() inside btrfs_alloc_compress_wsm() - Call free_workspace_manager() inside btrfs_free_compress_wsm() For the none, zlib and lzo compression algorithms. For now the generic per-fs workspace managers won't really have any effect, and all compression is still going through the global workspace manager. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: add workspace manager initialization for zstdQu Wenruo
This involves: - Add zstd_alloc_workspace_manager() and zstd_free_workspace_manager() Those two functions will accept an fs_info pointer, and alloc/free the fs_info->compr_wsm[BTRFS_COMPRESS_ZSTD] pointer. - Add btrfs_alloc_compress_wsm() and btrfs_free_compress_wsm() Those are helpers allocating the workspace managers for all algorithms. For now only zstd is supported, and the timing is a little unusual: btrfs_alloc_compress_wsm() should only be called after the sectorsize has been initialized. Meanwhile btrfs_free_fs_info_compress() is called in btrfs_free_fs_info(). - Move the definition of btrfs_compression_type to "fs.h" The reason is that "compression.h" already includes "fs.h", thus we cannot just include "compression.h" to get the definition of BTRFS_NR_COMPRESS_TYPES to define fs_info::compr_wsm[]. For now the per-fs zstd workspace manager won't really have any effect, and all compression is still going through the global workspace manager. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: add an fs_info parameter for compression workspace managerQu Wenruo
[BACKGROUND] Currently btrfs shares workspaces and their managers across all filesystems. This is mostly fine as all those workspaces use page size based buffers, and btrfs only supports block size (bs) <= page size (ps). This means even if bs < ps, we at most waste some buffer space in the workspace, but everything will still work fine. The problem is that this limits our support for bs > ps cases. A workspace may then need a larger buffer to handle bs > ps, but since the pool has no way to distinguish different workspaces, a regular workspace (which is still using a buffer size based on ps) can be passed to a btrfs whose bs > ps. In that case the buffer is not large enough, and will cause various problems. [ENHANCEMENT] To prepare for the per-fs workspace migration, add an fs_info parameter to all workspace related functions. For now this new fs_info parameter is not yet utilized. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: keep folios locked inside run_delalloc_nocow()Qu Wenruo
[BUG] There is a very low chance that DEBUG_WARN() inside btrfs_writepage_cow_fixup() can be triggered when CONFIG_BTRFS_EXPERIMENTAL is enabled. This only happens after run_delalloc_nocow() failed. Unfortunately I haven't hit it for a while thus no real world dmesg for now. [CAUSE] There is a race window where after run_delalloc_nocow() failed, error handling can race with writeback thread. Before we hit run_delalloc_nocow(), there is an inode with the following dirty pages: (4K page size, 4K block size, no large folio) 0 4K 8K 12K 16K |/////////|///////////|///////////|////////////| The inode also have NODATACOW flag, and the above dirty range will go through different extents during run_delalloc_range(): 0 4K 8K 12K 16K | NOCOW | COW | COW | NOCOW | The race happen like this: writeback thread A | writeback thread B ----------------------------------+-------------------------------------- Writeback for folio 0 | run_delalloc_nocow() | |- nocow_one_range() | | For range [0, 4K), ret = 0 | | | |- fallback_to_cow() | | For range [4K, 8K), ret = 0 | | Folio 4K *UNLOCKED* | | | Writeback for folio 4K |- fallback_to_cow() | extent_writepage() | For range [8K, 12K), failure | |- writepage_delalloc() | | | |- btrfs_cleanup_ordered_extents()| | |- btrfs_folio_clear_ordered() | | | Folio 0 still locked, safe | | | | | Ordered extent already allocated. | | | Nothing to do. | | |- extent_writepage_io() | | |- btrfs_writepage_cow_fixup() |- btrfs_folio_clear_ordered() | | Folio 4K hold by thread B, | | UNSAFE! | |- btrfs_test_ordered() | | Cleared by thread A, | | | |- DEBUG_WARN(); This is only possible after run_delalloc_nocow() failure, as cow_file_range() will keep all folios and io tree range locked, until everything is finished or after error handling. The root cause is we allow fallback_to_cow() and nocow_one_range() to unlock the folios after a successful run, so that during error handling we're no longer safe to use btrfs_cleanup_ordered_extents() as the folios are already unlocked. [FIX] - Make fallback_to_cow() and nocow_one_range() to keep folios locked after a successful run For fallback_to_cow() we can pass COW_FILE_RANGE_KEEP_LOCKED flag into cow_file_range(). For nocow_one_range() we have to remove the PAGE_UNLOCK flag from extent_clear_unlock_delalloc(). - Unlock folios if everything is fine in run_delalloc_nocow() - Use extent_clear_unlock_delalloc() to handle range [@start, @cur_offset) inside run_delalloc_nocow() Since folios are still locked, we do not need cleanup_dirty_folios() to do the cleanup. extent_clear_unlock_delalloc() with "PAGE_START_WRITEBACK | PAGE_END_WRITEBACK" will clear the dirty flags. - Remove cleanup_dirty_folios() Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: make nocow_one_range() to do cleanup on errorQu Wenruo
Currently if we hit an error inside nocow_one_range(), we do not clear the page dirty flag, and let the caller handle it. This is very different from fallback_to_cow(): when that function fails, everything is cleaned up by cow_file_range(). Enhance the situation by: - Use a common error handling for nocow_one_range() If anything fails, use the same btrfs_cleanup_ordered_extents() and extent_clear_unlock_delalloc(). btrfs_cleanup_ordered_extents() is safe even if we haven't created a new ordered extent; in that case there should be no OE and that function will do nothing. The same applies to extent_clear_unlock_delalloc(), and since we're passing PAGE_UNLOCK | PAGE_START_WRITEBACK | PAGE_END_WRITEBACK, it will also clear the folio dirty flag during error handling. - Avoid touching the failed range of nocow_one_range() As the failed range will be cleaned up and unlocked by that function. Here we introduce a new variable @nocow_end to record the failed range, so that we can skip it during the error handling of run_delalloc_nocow(). Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
8 daysbtrfs: enhance error messages for delalloc range failureQu Wenruo
When running emulated write error tests like generic/475, we can hit error messages like this: BTRFS error (device dm-12 state EA): run_delalloc_nocow failed, root=596 inode=264 start=1605632 len=73728: -5 BTRFS error (device dm-12 state EA): failed to run delalloc range, root=596 ino=264 folio=1605632 submit_bitmap=0-7 start=1605632 len=73728: -5 These are normally buried by direct IO error messages. However, the above error messages are not enough to determine the real range that caused the error. Considering we can have multiple different extents in one delalloc range (e.g. some COW extents along with some NOCOW extents), just outputting the error at the end of run_delalloc_nocow() is not enough. To enhance the error messages: - Remove the rate limit on the existing error messages In the generic/475 example, most error messages are from direct IO, not really from the delalloc range. Considering how useful the delalloc range error messages are, we don't want them to be rate limited. - Add extra @cur_offset output for cow_file_range() - Add extra variable output for run_delalloc_nocow() This is especially important for run_delalloc_nocow(), as there are extra error paths where we can hit an error without going into nocow_one_range() or fallback_to_cow(). - Add an error message for nocow_one_range() That's the missing part. For fallback_to_cow(), we already have an error message from cow_file_range(). - Constify the @len and @end local variables of nocow_one_range() This makes it much easier to make sure @len and @end are not modified at runtime. Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
9 dayshfsplus: fix slab-out-of-bounds read in hfsplus_strcasecmp()Viacheslav Dubeyko
The hfsplus_strcasecmp() logic can trigger the issue: [ 117.317703][ T9855] ================================================================== [ 117.318353][ T9855] BUG: KASAN: slab-out-of-bounds in hfsplus_strcasecmp+0x1bc/0x490 [ 117.318991][ T9855] Read of size 2 at addr ffff88802160f40c by task repro/9855 [ 117.319577][ T9855] [ 117.319773][ T9855] CPU: 0 UID: 0 PID: 9855 Comm: repro Not tainted 6.17.0-rc6 #33 PREEMPT(full) [ 117.319780][ T9855] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 117.319783][ T9855] Call Trace: [ 117.319785][ T9855] <TASK> [ 117.319788][ T9855] dump_stack_lvl+0x1c1/0x2a0 [ 117.319795][ T9855] ? __virt_addr_valid+0x1c8/0x5c0 [ 117.319803][ T9855] ? __pfx_dump_stack_lvl+0x10/0x10 [ 117.319808][ T9855] ? rcu_is_watching+0x15/0xb0 [ 117.319816][ T9855] ? lock_release+0x4b/0x3e0 [ 117.319821][ T9855] ? __kasan_check_byte+0x12/0x40 [ 117.319828][ T9855] ? __virt_addr_valid+0x1c8/0x5c0 [ 117.319835][ T9855] ? __virt_addr_valid+0x4a5/0x5c0 [ 117.319842][ T9855] print_report+0x17e/0x7e0 [ 117.319848][ T9855] ? __virt_addr_valid+0x1c8/0x5c0 [ 117.319855][ T9855] ? __virt_addr_valid+0x4a5/0x5c0 [ 117.319862][ T9855] ? __phys_addr+0xd3/0x180 [ 117.319869][ T9855] ? hfsplus_strcasecmp+0x1bc/0x490 [ 117.319876][ T9855] kasan_report+0x147/0x180 [ 117.319882][ T9855] ? hfsplus_strcasecmp+0x1bc/0x490 [ 117.319891][ T9855] hfsplus_strcasecmp+0x1bc/0x490 [ 117.319900][ T9855] ? __pfx_hfsplus_cat_case_cmp_key+0x10/0x10 [ 117.319906][ T9855] hfs_find_rec_by_key+0xa9/0x1e0 [ 117.319913][ T9855] __hfsplus_brec_find+0x18e/0x470 [ 117.319920][ T9855] ? __pfx_hfsplus_bnode_find+0x10/0x10 [ 117.319926][ T9855] ? __pfx_hfs_find_rec_by_key+0x10/0x10 [ 117.319933][ T9855] ? __pfx___hfsplus_brec_find+0x10/0x10 [ 117.319942][ T9855] hfsplus_brec_find+0x28f/0x510 [ 117.319949][ T9855] ? __pfx_hfs_find_rec_by_key+0x10/0x10 [ 117.319956][ T9855] ? __pfx_hfsplus_brec_find+0x10/0x10 [ 117.319963][ T9855] ? __kmalloc_noprof+0x2a9/0x510 [ 117.319969][ T9855] ? hfsplus_find_init+0x8c/0x1d0 [ 117.319976][ T9855] hfsplus_brec_read+0x2b/0x120 [ 117.319983][ T9855] hfsplus_lookup+0x2aa/0x890 [ 117.319990][ T9855] ? __pfx_hfsplus_lookup+0x10/0x10 [ 117.320003][ T9855] ? d_alloc_parallel+0x2f0/0x15e0 [ 117.320008][ T9855] ? __lock_acquire+0xaec/0xd80 [ 117.320013][ T9855] ? __pfx_d_alloc_parallel+0x10/0x10 [ 117.320019][ T9855] ? __raw_spin_lock_init+0x45/0x100 [ 117.320026][ T9855] ? __init_waitqueue_head+0xa9/0x150 [ 117.320034][ T9855] __lookup_slow+0x297/0x3d0 [ 117.320039][ T9855] ? __pfx___lookup_slow+0x10/0x10 [ 117.320045][ T9855] ? down_read+0x1ad/0x2e0 [ 117.320055][ T9855] lookup_slow+0x53/0x70 [ 117.320065][ T9855] walk_component+0x2f0/0x430 [ 117.320073][ T9855] path_lookupat+0x169/0x440 [ 117.320081][ T9855] filename_lookup+0x212/0x590 [ 117.320089][ T9855] ? __pfx_filename_lookup+0x10/0x10 [ 117.320098][ T9855] ? strncpy_from_user+0x150/0x290 [ 117.320105][ T9855] ? getname_flags+0x1e5/0x540 [ 117.320112][ T9855] user_path_at+0x3a/0x60 [ 117.320117][ T9855] __x64_sys_umount+0xee/0x160 [ 117.320123][ T9855] ? __pfx___x64_sys_umount+0x10/0x10 [ 117.320129][ T9855] ? do_syscall_64+0xb7/0x3a0 [ 117.320135][ T9855] ? entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 117.320141][ T9855] ? entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 117.320145][ T9855] do_syscall_64+0xf3/0x3a0 [ 117.320150][ T9855] ? 
exc_page_fault+0x9f/0xf0 [ 117.320154][ T9855] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 117.320158][ T9855] RIP: 0033:0x7f7dd7908b07 [ 117.320163][ T9855] Code: 23 0d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 08 [ 117.320167][ T9855] RSP: 002b:00007ffd5ebd9698 EFLAGS: 00000202 ORIG_RAX: 00000000000000a6 [ 117.320172][ T9855] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7dd7908b07 [ 117.320176][ T9855] RDX: 0000000000000009 RSI: 0000000000000009 RDI: 00007ffd5ebd9740 [ 117.320179][ T9855] RBP: 00007ffd5ebda780 R08: 0000000000000005 R09: 00007ffd5ebd9530 [ 117.320181][ T9855] R10: 00007f7dd799bfc0 R11: 0000000000000202 R12: 000055e2008b32d0 [ 117.320184][ T9855] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 117.320189][ T9855] </TASK> [ 117.320190][ T9855] [ 117.351311][ T9855] Allocated by task 9855: [ 117.351683][ T9855] kasan_save_track+0x3e/0x80 [ 117.352093][ T9855] __kasan_kmalloc+0x8d/0xa0 [ 117.352490][ T9855] __kmalloc_noprof+0x288/0x510 [ 117.352914][ T9855] hfsplus_find_init+0x8c/0x1d0 [ 117.353342][ T9855] hfsplus_lookup+0x19c/0x890 [ 117.353747][ T9855] __lookup_slow+0x297/0x3d0 [ 117.354148][ T9855] lookup_slow+0x53/0x70 [ 117.354514][ T9855] walk_component+0x2f0/0x430 [ 117.354921][ T9855] path_lookupat+0x169/0x440 [ 117.355325][ T9855] filename_lookup+0x212/0x590 [ 117.355740][ T9855] user_path_at+0x3a/0x60 [ 117.356115][ T9855] __x64_sys_umount+0xee/0x160 [ 117.356529][ T9855] do_syscall_64+0xf3/0x3a0 [ 117.356920][ T9855] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 117.357429][ T9855] [ 117.357636][ T9855] The buggy address belongs to the object at ffff88802160f000 [ 117.357636][ T9855] which belongs to the cache kmalloc-2k of size 2048 [ 117.358827][ T9855] The buggy address is located 0 bytes to the right of [ 117.358827][ T9855] allocated 1036-byte region [ffff88802160f000, ffff88802160f40c) [ 117.360061][ T9855] [ 117.360266][ T9855] The buggy address belongs to the physical page: [ 117.360813][ T9855] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x21608 [ 117.361562][ T9855] head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 [ 117.362285][ T9855] flags: 0xfff00000000040(head|node=0|zone=1|lastcpupid=0x7ff) [ 117.362929][ T9855] page_type: f5(slab) [ 117.363282][ T9855] raw: 00fff00000000040 ffff88801a842f00 ffffea0000932000 dead000000000002 [ 117.364015][ T9855] raw: 0000000000000000 0000000080080008 00000000f5000000 0000000000000000 [ 117.364750][ T9855] head: 00fff00000000040 ffff88801a842f00 ffffea0000932000 dead000000000002 [ 117.365491][ T9855] head: 0000000000000000 0000000080080008 00000000f5000000 0000000000000000 [ 117.366232][ T9855] head: 00fff00000000003 ffffea0000858201 00000000ffffffff 00000000ffffffff [ 117.366968][ T9855] head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000008 [ 117.367711][ T9855] page dumped because: kasan: bad access detected [ 117.368259][ T9855] page_owner tracks the page as allocated [ 117.368745][ T9855] page last allocated via order 3, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN1 [ 117.370541][ T9855] post_alloc_hook+0x240/0x2a0 [ 117.370954][ T9855] get_page_from_freelist+0x2101/0x21e0 [ 117.371435][ T9855] __alloc_frozen_pages_noprof+0x274/0x380 [ 117.371935][ T9855] alloc_pages_mpol+0x241/0x4b0 [ 117.372360][ T9855] allocate_slab+0x8d/0x380 [ 117.372752][ T9855] ___slab_alloc+0xbe3/0x1400 [ 117.373159][ T9855] __kmalloc_cache_noprof+0x296/0x3d0 [ 117.373621][ 
T9855] nexthop_net_init+0x75/0x100 [ 117.374038][ T9855] ops_init+0x35c/0x5c0 [ 117.374400][ T9855] setup_net+0x10c/0x320 [ 117.374768][ T9855] copy_net_ns+0x31b/0x4d0 [ 117.375156][ T9855] create_new_namespaces+0x3f3/0x720 [ 117.375613][ T9855] unshare_nsproxy_namespaces+0x11c/0x170 [ 117.376094][ T9855] ksys_unshare+0x4ca/0x8d0 [ 117.376477][ T9855] __x64_sys_unshare+0x38/0x50 [ 117.376879][ T9855] do_syscall_64+0xf3/0x3a0 [ 117.377265][ T9855] page last free pid 9110 tgid 9110 stack trace: [ 117.377795][ T9855] __free_frozen_pages+0xbeb/0xd50 [ 117.378229][ T9855] __put_partials+0x152/0x1a0 [ 117.378625][ T9855] put_cpu_partial+0x17c/0x250 [ 117.379026][ T9855] __slab_free+0x2d4/0x3c0 [ 117.379404][ T9855] qlist_free_all+0x97/0x140 [ 117.379790][ T9855] kasan_quarantine_reduce+0x148/0x160 [ 117.380250][ T9855] __kasan_slab_alloc+0x22/0x80 [ 117.380662][ T9855] __kmalloc_noprof+0x232/0x510 [ 117.381074][ T9855] tomoyo_supervisor+0xc0a/0x1360 [ 117.381498][ T9855] tomoyo_env_perm+0x149/0x1e0 [ 117.381903][ T9855] tomoyo_find_next_domain+0x15ad/0x1b90 [ 117.382378][ T9855] tomoyo_bprm_check_security+0x11c/0x180 [ 117.382859][ T9855] security_bprm_check+0x89/0x280 [ 117.383289][ T9855] bprm_execve+0x8f1/0x14a0 [ 117.383673][ T9855] do_execveat_common+0x528/0x6b0 [ 117.384103][ T9855] __x64_sys_execve+0x94/0xb0 [ 117.384500][ T9855] [ 117.384706][ T9855] Memory state around the buggy address: [ 117.385179][ T9855] ffff88802160f300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 117.385854][ T9855] ffff88802160f380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 117.386534][ T9855] >ffff88802160f400: 00 04 fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 117.387204][ T9855] ^ [ 117.387566][ T9855] ffff88802160f480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 117.388243][ T9855] ffff88802160f500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 117.388918][ T9855] ================================================================== The issue takes place if the length field of struct hfsplus_unistr is bigger than HFSPLUS_MAX_STRLEN. The patch simply checks the length of comparing strings. And if the strings' length is bigger than HFSPLUS_MAX_STRLEN, then it is corrected to this value. v2 The string length correction has been added for hfsplus_strcmp(). Reported-by: Jiaming Zhang <r772577952@gmail.com> Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com> cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> cc: Yangtao Li <frank.li@vivo.com> cc: linux-fsdevel@vger.kernel.org cc: syzkaller@googlegroups.com Link: https://lore.kernel.org/r/20250919191243.1370388-1-slava@dubeyko.com Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
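The fix described above boils down to never trusting the on-disk length field when it is used to index the unicode array. A simplified standalone sketch of that pattern; the real hfsplus_unistr length is big-endian on disk and hfsplus_strcasecmp() also case-folds the characters, both of which are omitted here.

  #include <stdint.h>
  #include <stdio.h>

  #define HFSPLUS_MAX_STRLEN 255

  /* Simplified, host-endian version of the on-disk HFS+ unicode string. */
  struct hfsplus_unistr {
          uint16_t length;                        /* may be corrupted on disk */
          uint16_t unicode[HFSPLUS_MAX_STRLEN];
  };

  static int unistr_cmp(const struct hfsplus_unistr *a, const struct hfsplus_unistr *b)
  {
          uint16_t len1 = a->length;
          uint16_t len2 = b->length;

          /* Never trust the on-disk length: clamp it to the array size. */
          if (len1 > HFSPLUS_MAX_STRLEN)
                  len1 = HFSPLUS_MAX_STRLEN;
          if (len2 > HFSPLUS_MAX_STRLEN)
                  len2 = HFSPLUS_MAX_STRLEN;

          for (uint16_t i = 0; i < len1 && i < len2; i++) {
                  if (a->unicode[i] != b->unicode[i])
                          return a->unicode[i] < b->unicode[i] ? -1 : 1;
          }
          return (len1 > len2) - (len1 < len2);
  }

  int main(void)
  {
          struct hfsplus_unistr a = { .length = 2, .unicode = { 'h', 'i' } };
          struct hfsplus_unistr b = { .length = 60000 };  /* corrupted length */

          printf("cmp = %d\n", unistr_cmp(&a, &b));       /* no out-of-bounds read */
          return 0;
  }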