summaryrefslogtreecommitdiff
path: root/fs/btrfs/verity.c
AgeCommit message (Collapse)Author
3 daysMerge tag 'for-6.18-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs updates from David Sterba: "There are no new features, the changes are in the core code, notably tree-log error handling and reporting improvements, and initial support for block size > page size. Performance improvements: - search data checksums in the commit root (previous transaction) to avoid locking contention, this improves parallelism of read heavy/low write workloads, and also reduces transaction commit time; on real and reproducer workload the sync time went from minutes to tens of seconds (workload and numbers are in the changelog) Core: - tree-log updates: - error handling improvements, transaction aborts - add new error state 'O' (printed in status messages) when log replay fails and is aborted - reduced number of btrfs_path allocations when traversing the tree - 'block size > page size' support - basic implementation with limitations, under experimental build - limitations: no direct io, raid56, encoded read (standalone and in send ioctl), encoded write - preparatory work for compression, removing implicit assumptions of page and block sizes - compression workspaces are now per-filesystem, we cannot assume common block size for work memory among different filesystems - tree-checker now verifies INODE_EXTREF item (which is implementing hardlinks) - tree leaf pretty printer updates, there were missing data from items, keys/items - move config option CONFIG_BTRFS_REF_VERIFY to CONFIG_BTRFS_DEBUG, it's a debugging feature and not needed to be enabled separately - more struct btrfs_path auto free updates - use ref_tracker API for tracking delayed inodes, enabled by mount option 'ref_verify', allowing to better pinpoint leaking references - in zoned mode, avoid selecting data relocation zoned for ordinary data block groups - updated and enhanced error messages - lots of cleanups and refactoring" * tag 'for-6.18-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (113 commits) btrfs: use smp_mb__after_atomic() when forcing COW in create_pending_snapshot() btrfs: add unlikely annotations to branches leading to transaction abort btrfs: add unlikely annotations to branches leading to EIO btrfs: add unlikely annotations to branches leading to EUCLEAN btrfs: more trivial BTRFS_PATH_AUTO_FREE conversions btrfs: zoned: don't fail mount needlessly due to too many active zones btrfs: use kmalloc_array() for open-coded arithmetic in kmalloc() btrfs: enable experimental bs > ps support btrfs: add extra ASSERT()s to catch unaligned bios btrfs: fix symbolic link reading when bs > ps btrfs: prepare scrub to support bs > ps cases btrfs: prepare zlib to support bs > ps cases btrfs: prepare lzo to support bs > ps cases btrfs: prepare zstd to support bs > ps cases btrfs: prepare compression folio alloc/free for bs > ps cases btrfs: fix the incorrect max_bytes value for find_lock_delalloc_range() btrfs: remove pointless key offset setup in create_pending_snapshot() btrfs: annotate btrfs_is_testing() as unlikely and make it return bool btrfs: make the rule checking more readable for should_cow_block() btrfs: simplify inline extent end calculation at replay_one_extent() ...
11 daysbtrfs: add unlikely annotations to branches leading to transaction abortDavid Sterba
The unlikely() annotation is a static prediction hint that compiler may use to reorder code out of hot path. We use it elsewhere (namely tree-checker.c) for error branches that almost never happen. Transaction abort is one such error, the btrfs_abort_transaction() inlines code to check the state and print a warning, this ought to be out of the hot path. The most common pattern is when transaction abort is called after checking a return value and the control flow leads to a quick return. In other cases it may not be necessary to add unlikely() e.g. when the function returns anyway or the control flow is not changed noticeably. Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
11 daysbtrfs: add unlikely annotations to branches leading to EUCLEANDavid Sterba
The unlikely() annotation is a static prediction hint that compiler may use to reorder code out of hot path. We use it elsewhere (namely tree-checker.c) for error branches that almost never happen, where EUCLEAN (a corruption) is one of them. Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-08-21btrfs: move verity info pointer to fs-specific part of inodeEric Biggers
Move the fsverity_info pointer into the filesystem-specific part of the inode by adding the field btrfs_inode::i_verity_info and configuring fsverity_operations::inode_info_offs accordingly. This is a prerequisite for a later commit that removes inode::i_verity_info, saving memory and improving cache efficiency on filesystems that don't support fsverity. Co-developed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Link: https://lore.kernel.org/20250810075706.172910-12-ebiggers@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-18btrfs: pass struct btrfs_inode to btrfs_sync_inode_flags_to_i_flags()David Sterba
Pass a struct btrfs_inode to btrfs_sync_inode_flags_to_i_flags() as it's an internal interface. Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: add and use helper to verify the calling task has locked the inodeFilipe Manana
We have a few places that check if we have the inode locked by doing: ASSERT(inode_is_locked(vfs_inode)); This actually proved to be useful several times as if assertions are enabled (and by default they are in many distros) it immediately triggers a crash which is impossible for users to miss. However that doesn't check if the lock is held by the calling task, so the check passes if some other task locked the inode. Using one of the lockdep functions to check the lock is held, like lockdep_assert_held() for example, does check that the calling task holds the lock, and if that's not the case it produces a warning and stack trace in dmesg. However, despite the misleading "assert" in the name of the lockdep helpers, it does not trigger a crash/BUG_ON(), just a warning and splat in dmesg, which is easy to get unnoticed by users who may have lockdep enabled. So add a helper that does the ASSERT() and calls lockdep_assert_held() immediately after and use it every where we check the inode is locked. Like this if the lock is held by some other task we get the warning in dmesg which is caught by fstests, very helpful during development, and may also be occassionaly noticed by users with lockdep enabled. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: convert read_key_bytes() to take a folioLi Zetao
The old page API is being gradually replaced and converted to use folio to improve code readability and avoid repeated conversion between page and folio. Moreover, use kmap_local_folio() instead of kmap_local_page(), which is more consistent with folio usage. Signed-off-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: remove unused included headersDavid Sterba
With help of neovim, LSP and clangd we can identify header files that are not actually needed to be included in the .c files. This is focused only on removal (with minor fixups), further cleanups are possible but will require doing the header files properly with forward declarations, minimized includes and include-what-you-use care. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-12btrfs: remove redundant root argument from btrfs_update_inode()Filipe Manana
The root argument for btrfs_update_inode() always matches the root of the given inode, so remove the root argument and get it from the inode argument. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-09-13btrfs: convert btrfs_read_merkle_tree_page() to use a folioMatthew Wilcox (Oracle)
Remove a number of hidden calls to compound_head() by using a folio throughout. Also follow core kernel coding style by adding the folio to the page cache immediately after allocation instead of doing the read first, then adding it to the page cache. This ordering makes subsequent readers block waiting for the first reader instead of duplicating the work only to throw it away when they find out they lost the race. Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: David Sterba <dsterba@suse.com>
2023-01-01fsverity: pass pos and size to ->write_merkle_tree_blockEric Biggers
fsverity_operations::write_merkle_tree_block is passed the index of the block to write and the log base 2 of the block size. However, all implementations of it use these parameters only to calculate the position and the size of the block, in bytes. Therefore, make ->write_merkle_tree_block take 'pos' and 'size' parameters instead of 'index' and 'log_blocksize'. Suggested-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Dave Chinner <dchinner@redhat.com> Link: https://lore.kernel.org/r/20221214224304.145712-5-ebiggers@kernel.org
2022-12-05btrfs: move orphan prototypes into orphan.hJosef Bacik
Move these out of ctree.h into orphan.h to cut down on code in ctree.h. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2022-12-05btrfs: move verity prototypes into verity.hJosef Bacik
Move these out of ctree.h into verity.h to cut down on code in ctree.h. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2022-12-05btrfs: move ioctl prototypes into ioctl.hJosef Bacik
Move these out of ctree.h into ioctl.h to cut down on code in ctree.h. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2022-12-05btrfs: move accessor helpers into accessors.hJosef Bacik
This is a large patch, but because they're all macros it's impossible to split up. Simply copy all of the item accessors in ctree.h and paste them in accessors.h, and then update any files to include the header so everything compiles. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> [ reformat comments, style fixups ] Signed-off-by: David Sterba <dsterba@suse.com>
2022-12-05btrfs: move the printk helpers out of ctree.hJosef Bacik
We have a bunch of printk helpers that are in ctree.h. These have nothing to do with ctree.c, so move them into their own header. Subsequent patches will cleanup the printk helpers. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2022-12-05btrfs: move fs wide helpers out of ctree.hJosef Bacik
We have several fs wide related helpers in ctree.h. The bulk of these are the incompat flag test helpers, but there are things such as btrfs_fs_closing() and the read only helpers that also aren't directly related to the ctree code. Move these into a fs.h header, which will serve as the location for file system wide related helpers. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2022-09-26btrfs: send: add support for fs-verityBoris Burkov
Preserve the fs-verity status of a btrfs file across send/recv. There is no facility for installing the Merkle tree contents directly on the receiving filesystem, so we package up the parameters used to enable verity found in the verity descriptor. This gives the receive side enough information to properly enable verity again. Note that this means that receive will have to re-compute the whole Merkle tree, similar to how compression worked before encoded_write. Since the file becomes read-only after verity is enabled, it is important that verity is added to the send stream after any file writes. Therefore, when we process a verity item, merely note that it happened, then actually create the command in the send stream during 'finish_inode_if_needed'. This also creates V3 of the send stream format, without any format changes besides adding the new commands and attributes. Signed-off-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com>
2022-01-03btrfs: drop the _nr from the item helpersJosef Bacik
Now that all call sites are using the slot number to modify item values, rename the SETGET helpers to raw_item_*(), and then rework the _nr() helpers to be the btrfs_item_*() btrfs_set_item_*() helpers, and then rename all of the callers to the new helpers. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-09-17btrfs: fix transaction handle leak after verity rollback failureFilipe Manana
During a verity rollback, if we fail to update the inode or delete the orphan, we abort the transaction and return without releasing our transaction handle. Fix that by releasing the handle. Fixes: 146054090b0859 ("btrfs: initial fsverity support") Fixes: 705242538ff348 ("btrfs: verity metadata orphan items") Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23btrfs: verity metadata orphan itemsBoris Burkov
Writing out the verity data is too large of an operation to do in a single transaction. If we are interrupted before we finish creating fsverity metadata for a file, or fail to clean up already created metadata after a failure, we could leak the verity items that we already committed. To address this issue, we use the orphan mechanism. When we start enabling verity on a file, we also add an orphan item for that inode. When we are finished, we delete the orphan. However, if we are interrupted midway, the orphan will be present at mount and we can cleanup the half-formed verity state. There is a possible race with a normal unlink operation: if unlink and verity run on the same file in parallel, it is possible for verity to succeed and delete the still legitimate orphan added by unlink. Then, if we are interrupted and mount in that state, we will never clean up the inode properly. This is also possible for a file created with O_TMPFILE. Check nlink==0 before deleting to avoid this race. A final thing to note is that this is a resurrection of using orphans to signal an operation besides "delete this inode". The old case was to signal the need to do a truncate. That case still technically applies for mounting very old file systems, so we need to take some care to not clobber it. To that end, we just have to be careful that verity orphan cleanup is a no-op for non-verity files. Signed-off-by: Boris Burkov <boris@bur.io> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23btrfs: initial fsverity supportBoris Burkov
Add support for fsverity in btrfs. To support the generic interface in fs/verity, we add two new item types in the fs tree for inodes with verity enabled. One stores the per-file verity descriptor and btrfs verity item and the other stores the Merkle tree data itself. Verity checking is done in end_page_read just before a page is marked uptodate. This naturally handles a variety of edge cases like holes, preallocated extents, and inline extents. Some care needs to be taken to not try to verity pages past the end of the file, which are accessed by the generic buffered file reading code under some circumstances like reading to the end of the last page and trying to read again. Direct IO on a verity file falls back to buffered reads. Verity relies on PageChecked for the Merkle tree data itself to avoid re-walking up shared paths in the tree. For this reason, we need to cache the Merkle tree data. Since the file is immutable after verity is turned on, we can cache it at an index past EOF. Use the new inode ro_flags to store verity on the inode item, so that we can enable verity on a file, then rollback to an older kernel and still mount the file system and read the file. Since we can't safely write the file anymore without ruining the invariants of the Merkle tree, we mark a ro_compat flag on the file system when a file has verity enabled. Acked-by: Eric Biggers <ebiggers@google.com> Co-developed-by: Chris Mason <clm@fb.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Boris Burkov <boris@bur.io> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>