summaryrefslogtreecommitdiff
path: root/fs/btrfs/free-space-cache.c
AgeCommit message (Collapse)Author
2025-01-13btrfs: open code set_page_extent_mapped()David Sterba
The function set_page_extent_mapped() is now a simple wrapper so use the folio helper. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-01-13btrfs: free-space-cache: remove unnecessary calls to btrfs_mark_buffer_dirty()Filipe Manana
We have several places explicitly calling btrfs_mark_buffer_dirty() but that is not necessarily since the target leaf came from a path that was obtained for a btree search function that modifies the btree, something like btrfs_insert_empty_item() or anything else that ends up calling btrfs_search_slot() with a value of 1 for its 'cow' argument. These just make the code more verbose, confusing and add a little extra overhead and well as increase the module's text size, so remove them. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-01-13btrfs: move extent-tree function declarations out of ctree.hFilipe Manana
We have 3 functions that have their prototypes declared in ctree.h but they are defined at extent-tree.c and they are unrelated to the btree data structure. Move the prototypes out of ctree.h and into extent-tree.h. Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-11btrfs: convert btrfs_buffered_write() to use foliosQu Wenruo
The buffered write path is still heavily utilizing the page interface. Since we have converted it to do a page-by-page copying, it's much easier to convert all involved functions to folio interface, this involves: - btrfs_copy_from_user() - btrfs_drop_folio() - prepare_uptodate_page() - prepare_one_page() - lock_and_cleanup_extent_if_need() - btrfs_dirty_page() All function are changed to accept a folio parameter, and if the word "page" is in the function name, change that to "folio" too. The function btrfs_dirty_page() is exported for v1 space cache, convert v1 cache call site to convert its page to folio for the new interface. And there is a small enhancement for prepare_one_folio(), instead of manually waiting for the page writeback, let __filemap_get_folio() to handle that by using FGP_WRITEBEGIN, which implies (FGP_LOCK | FGP_WRITE | FGP_CREAT | FGP_STABLE). Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-11btrfs: make buffered write to copy one page a timeQu Wenruo
Currently the btrfs_buffered_write() is preparing multiple page a time, allowing a better performance. But the current trend is to support larger folio as an optimization, instead of implementing own multi-page optimization. This is inspired by generic_perform_write(), which is copying one folio a time. Such change will prepare us to migrate to implement the write_begin() and write_end() callbacks, and make every involved function a little easier. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-11btrfs: use str_yes_no() helper function in btrfs_dump_free_space()Thorsten Blum
Remove hard-coded strings by using the str_yes_no() and str_no_yes() helper functions. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-11btrfs: remove the dirty_page local variableQu Wenruo
Inside btrfs_buffered_write(), we have a local variable @dirty_pages, recording the number of pages we dirtied in the current iteration. However we do not really need that variable, since it can be calculated from @pos and @copied. In fact there is already a problem inside the short copy path, where we use @dirty_pages to calculate the range we need to release. But that usage assumes sectorsize == PAGE_SIZE, which is no longer true. Instead of keeping @dirty_pages and cause incorrect usage, just calculate the number of dirtied pages inside btrfs_dirty_pages(). Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-10-07btrfs: add cancellation points to trim loopsLuca Stefani
There are reports that system cannot suspend due to running trim because the task responsible for trimming the device isn't able to finish in time, especially since we have a free extent discarding phase, which can trim a lot of unallocated space. There are no limits on the trim size (unlike the block group part). Since trime isn't a critical call it can be interrupted at any time, in such cases we stop the trim, report the amount of discarded bytes and return an error. Link: https://bugzilla.kernel.org/show_bug.cgi?id=219180 Link: https://bugzilla.suse.com/show_bug.cgi?id=1229737 CC: stable@vger.kernel.org # 5.15+ Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-08-15btrfs: zoned: properly take lock to read/update block group's zoned variablesNaohiro Aota
__btrfs_add_free_space_zoned() references and modifies bg's alloc_offset, ro, and zone_unusable, but without taking the lock. It is mostly safe because they monotonically increase (at least for now) and this function is mostly called by a transaction commit, which is serialized by itself. Still, taking the lock is a safer and correct option and I'm going to add a change to reset zone_unusable while a block group is still alive. So, add locking around the operations. Fixes: 169e0da91a21 ("btrfs: zoned: track unusable bytes for zones") CC: stable@vger.kernel.org # 5.15+ Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-07-29btrfs: zoned: fix zone_unusable accounting on making block group read-write ↵Naohiro Aota
again When btrfs makes a block group read-only, it adds all free regions in the block group to space_info->bytes_readonly. That free space excludes reserved and pinned regions. OTOH, when btrfs makes the block group read-write again, it moves all the unused regions into the block group's zone_unusable. That unused region includes reserved and pinned regions. As a result, it counts too much zone_unusable bytes. Fortunately (or unfortunately), having erroneous zone_unusable does not affect the calculation of space_info->bytes_readonly, because free space (num_bytes in btrfs_dec_block_group_ro) calculation is done based on the erroneous zone_unusable and it reduces the num_bytes just to cancel the error. This behavior can be easily discovered by adding a WARN_ON to check e.g, "bg->pinned > 0" in btrfs_dec_block_group_ro(), and running fstests test case like btrfs/282. Fix it by properly considering pinned and reserved in btrfs_dec_block_group_ro(). Also, add a WARN_ON and introduce btrfs_space_info_update_bytes_zone_unusable() to catch a similar mistake. Fixes: 169e0da91a21 ("btrfs: zoned: track unusable bytes for zones") CC: stable@vger.kernel.org # 5.15+ Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-07-11btrfs: fix bitmap leak when loading free space cache on duplicate entryFilipe Manana
If we failed to link a free space entry because there's already a conflicting entry for the same offset, we free the free space entry but we don't free the associated bitmap that we had just allocated before. Fix that by freeing the bitmap before freeing the entry. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-07-11btrfs: switch btrfs_block_group::inode to struct btrfs_inodeDavid Sterba
The structure is internal so we should use struct btrfs_inode for that. Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com>
2024-07-11btrfs: remove super block argument from btrfs_iget_path()Filipe Manana
It's pointless to pass a super block argument to btrfs_iget_path() because we always pass a root and from it we can get the super block through: root->fs_info->sb So remove the super block argument. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-07-11btrfs: pass a btrfs_inode to btrfs_wait_ordered_range()Filipe Manana
Instead of passing a (VFS) inode pointer argument, pass a btrfs_inode instead, as this is generally what we do for internal APIs, making it more consistent with most of the code base. This will later allow to help to remove a lot of BTRFS_I() calls in btrfs_sync_file(). Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-07-11btrfs: pass a btrfs_inode to btrfs_fdatawrite_range()Filipe Manana
Instead of passing a (VFS) inode pointer argument, pass a btrfs_inode instead, as this is generally what we do for internal APIs, making it more consistent with most of the code base. This will later allow to help to remove a lot of BTRFS_I() calls in btrfs_sync_file(). Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-06-27Merge tag 'for-6.10-rc5-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - fix quota root leak after quota disable failure - fix condition when checking if a zone can be added as free - allocate inode in NOFS context during logging or tree-log replay - handle raid-stripe-tree lookup correctly during scrub * tag 'for-6.10-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: qgroup: fix quota root leak after quota disable failure btrfs: scrub: handle RST lookup error correctly btrfs: zoned: fix initial free space detection btrfs: use NOFS context when getting inodes during logging and log replay
2024-06-25btrfs: zoned: fix initial free space detectionNaohiro Aota
When creating a new block group, it calls btrfs_add_new_free_space() to add the entire block group range into the free space accounting. __btrfs_add_free_space_zoned() checks if size == block_group->length to detect the initial free space adding, and proceed that case properly. However, if the zone_capacity == zone_size and the over-write speed is fast enough, the entire zone can be over-written within one transaction. That confuses __btrfs_add_free_space_zoned() to handle it as an initial free space accounting. As a result, that block group becomes a strange state: 0 used bytes, 0 zone_unusable bytes, but alloc_offset == zone_capacity (no allocation anymore). The initial free space accounting can properly be checked by checking alloc_offset too. Fixes: 98173255bddd ("btrfs: zoned: calculate free space from zone capacity") CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-04-01btrfs: rename bitmap_set_bits() -> btrfs_bitmap_set_bits()Alexander Lobakin
bitmap_set_bits() does not start with the FS' prefix and may collide with a new generic helper one day. It operates with the FS-specific types, so there's no change those two could do the same thing. Just add the prefix to exclude such possible conflict. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Acked-by: David Sterba <dsterba@suse.com> Reviewed-by: Yury Norov <yury.norov@gmail.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-05btrfs: remove SLAB_MEM_SPREAD flag useChengming Zhou
The SLAB_MEM_SPREAD flag used to be implemented in SLAB, which was removed as of v6.8-rc1, so it became a dead flag since the commit 16a1d968358a ("mm/slab: remove mm/slab.c and slab_def.h"). And the series[1] went on to mark it obsolete to avoid confusion for users. Here we can just remove all its users, which has no functional change. [1] https://lore.kernel.org/all/20240223-slab-cleanup-flags-v2-1-02f1753e8303@suse.cz/ Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: use KMEM_CACHE() to create btrfs_free_space cacheKunwu Chan
Use the KMEM_CACHE() macro instead of kmem_cache_create() to simplify the creation of SLAB caches when the default values are used. Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: add helper to get fs_info from struct inode pointerDavid Sterba
Add a convenience helper to get a fs_info from a VFS inode pointer instead of open coding the chain or using btrfs_sb() that in some cases does one more pointer hop. This is implemented as a macro (still with type checking) so we don't need full definitions of struct btrfs_inode, btrfs_root or btrfs_fs_info. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: mark __btrfs_add_free_space staticLijuan Li
__btrfs_add_free_space is only used in free-space-cache.c, so mark it static. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Lijuan Li <lilijuan@iscas.ac.cn> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: remove unused included headersDavid Sterba
With help of neovim, LSP and clangd we can identify header files that are not actually needed to be included in the .c files. This is focused only on removal (with minor fixups), further cleanups are possible but will require doing the header files properly with forward declarations, minimized includes and include-what-you-use care. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-12-15btrfs: migrate subpage code to folio interfacesQu Wenruo
Although subpage itself is conflicting with higher folio, since subpage (sectorsize < PAGE_SIZE and nodesize < PAGE_SIZE) means we will never need higher order folio, there is a hidden pitfall: - btrfs_page_*() helpers Those helpers are an abstraction to handle both subpage and non-subpage cases, which means we're going to pass pages pointers to those helpers. And since those helpers are shared between data and metadata paths, it's unavoidable to let them to handle folios, including higher order folios). Meanwhile for true subpage case, we should only have a single page backed folios anyway, thus add a new ASSERT() for btrfs_subpage_assert() to ensure that. Also since those helpers are shared between both data and metadata, add some extra ASSERT()s for data path to make sure we only get single page backed folio for now. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-12btrfs: remove redundant root argument from btrfs_update_inode()Filipe Manana
The root argument for btrfs_update_inode() always matches the root of the given inode, so remove the root argument and get it from the inode argument. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-12btrfs: abort transaction on generation mismatch when marking eb as dirtyFilipe Manana
When marking an extent buffer as dirty, at btrfs_mark_buffer_dirty(), we check if its generation matches the running transaction and if not we just print a warning. Such mismatch is an indicator that something really went wrong and only printing a warning message (and stack trace) is not enough to prevent a corruption. Allowing a transaction to commit with such an extent buffer will trigger an error if we ever try to read it from disk due to a generation mismatch with its parent generation. So abort the current transaction with -EUCLEAN if we notice a generation mismatch. For this we need to pass a transaction handle to btrfs_mark_buffer_dirty() which is always available except in test code, in which case we can pass NULL since it operates on dummy extent buffers and all test roots have a single node/leaf (root node at level 0). Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-12btrfs: remove btrfs_crc32c wrapperJosef Bacik
This simply sends the same arguments into crc32c(), and is just used in a few places. Remove this wrapper and directly call crc32c() in these instances. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-12btrfs: move btrfs_crc32c_final into free-space-cache.cJosef Bacik
This is the only place this helper is used, take it out of ctree.h and move it into free-space-cache.c. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-08-21btrfs: zoned: no longer count fresh BG region as zone unusableNaohiro Aota
Now that we switched to write time activation, we no longer need to (and must not) count the fresh region as zone unusable. This commit is similar to revert of commit fa2068d7e922b434eb ("btrfs: zoned: count fresh BG region as zone unusable"). Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-08-21btrfs: print target number of bytes when dumping free spaceFilipe Manana
When dumping free space, with btrfs_dump_free_space(), we pass a bytes argument in order to count how many free space entries in the block group have a size greater than or equal to that number of bytes. We then print how many suitable entries we found, but we don't print the target number of bytes, we just say "bytes". Change the message to actually print the number of bytes, which makes debugging -ENOSPC issues a bit easier. Also sligthly change the odd grammar and terminology: the sentence is ending with 'is', which doesn't make sense, and the term 'blocks' is confusing as we are referring to free space entries within the block group's free space cache. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-08-21btrfs: make find_first_extent_bit() return a booleanFilipe Manana
Currently find_first_extent_bit() returns a 0 if it found a range in the given io tree and 1 if it didn't find any. There's no need to return any errors, so make the return value a boolean and invert the logic to make more sense: return true if it found a range and false if it didn't find any range. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19btrfs: move btrfs_check_trunc_cache_free_space into block-rsv.cJosef Bacik
This is completely related to block rsv's, move it out of the free space cache code and into block-rsv.c. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19btrfs: assert tree lock is held when removing free space entriesFilipe Manana
Removing a free space entry from an in memory space cache requires having the corresponding btrfs_free_space_ctl's 'tree_lock' held. We have several code paths that remove an entry, so add assertions where appropriate to verify we are holding the lock, as the lock is acquired by some other function up in the call chain, which makes it easy to miss in the future. Note: for this to work we need to lock the local btrfs_free_space_ctl at load_free_space_cache(), which was not being done because it's local, declared on the stack, so no other task has access to it. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19btrfs: assert tree lock is held when linking free spaceFilipe Manana
When linking a free space entry, at link_free_space(), the caller should be holding the spinlock 'tree_lock' of the given btrfs_free_space_ctl argument, which is necessary for manipulating the red black tree of free space entries (done by tree_insert_offset(), which already asserts the lock is held) and for manipulating the 'free_space', 'free_extents', 'discardable_extents' and 'discardable_bytes' counters of the given struct btrfs_free_space_ctl. So assert that the spinlock 'tree_lock' of the given btrfs_free_space_ctl is held by the current task. We have multiple code paths that end up calling link_free_space(), and all currently take the lock before calling it. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19btrfs: assert tree lock is held when searching for free space entriesFilipe Manana
When searching for a free space entry by offset, at tree_search_offset(), we are supposed to have the btrfs_free_space_ctl's 'tree_lock' held, so assert that. We have multiple callers of tree_search_offset(), and all currently hold the necessary lock before calling it. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19btrfs: assert proper locks are held at tree_insert_offset()Filipe Manana
There are multiple code paths leading to tree_insert_offset(), and each path takes the necessary locks before tree_insert_offset() is called, since they do other things that require those locks to be held. This makes it easy to miss the locking somewhere, so make tree_insert_offset() assert that the required locks are being held by the calling task. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19btrfs: simplify arguments to tree_insert_offset()Filipe Manana
For the in-memory component of space caching (free space cache and free space tree), three of the arguments passed to tree_insert_offset() can always be taken from the new free space entry that we are about to add. So simplify tree_insert_offset() to take the new entry instead of the 'offset', 'node' and 'bitmap' arguments. This will also allow to make further changes simpler. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19btrfs: use precomputed end offsets at do_trimming()Filipe Manana
The are two computations of end offsets at do_trimming() that are not necessary, as they were previously computed and stored in local const variables. So just use the variables instead, to make the source code shorter and easier to read. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19btrfs: avoid searching twice for previous node when merging free space entriesFilipe Manana
At try_merge_free_space(), avoid calling twice rb_prev() to find the previous node, as that requires looping through the red black tree, so store the result of the rb_prev() call and then use it. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19btrfs: avoid extra memory allocation when copying free space cacheFilipe Manana
At copy_free_space_cache(), we add a new entry to the block group's ctl before we free the entry from the temporary ctl. Adding a new entry requires the allocation of a new struct btrfs_free_space, so we can avoid a temporary extra allocation by freeing the entry from the temporary ctl before we add a new entry to the main ctl, which possibly also reduces the chances for a memory allocation failure in case of very high memory pressure. So just do that. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-09btrfs: fix space cache inconsistency after error loading it from diskFilipe Manana
When loading a free space cache from disk, at __load_free_space_cache(), if we fail to insert a bitmap entry, we still increment the number of total bitmaps in the btrfs_free_space_ctl structure, which is incorrect since we failed to add the bitmap entry. On error we then empty the cache by calling __btrfs_remove_free_space_cache(), which will result in getting the total bitmaps counter set to 1. A failure to load a free space cache is not critical, so if a failure happens we just rebuild the cache by scanning the extent tree, which happens at block-group.c:caching_thread(). Yet the failure will result in having the total bitmaps of the btrfs_free_space_ctl always bigger by 1 then the number of bitmap entries we have. So fix this by having the total bitmaps counter be incremented only if we successfully added the bitmap entry. Fixes: a67509c30079 ("Btrfs: add a io_ctl struct and helpers for dealing with the space cache") Reviewed-by: Anand Jain <anand.jain@oracle.com> CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-03-15btrfs: zoned: count fresh BG region as zone unusableNaohiro Aota
The naming of space_info->active_total_bytes is misleading. It counts not only active block groups but also full ones which are previously active but now inactive. That confusion results in a bug not counting the full BGs into active_total_bytes on mount time. For a background, there are three kinds of block groups in terms of activation. 1. Block groups never activated 2. Block groups currently active 3. Block groups previously active and currently inactive (due to fully written or zone finish) What we really wanted to exclude from "total_bytes" is the total size of BGs #1. They seem empty and allocatable but since they are not activated, we cannot rely on them to do the space reservation. And, since BGs #1 never get activated, they should have no "used", "reserved" and "pinned" bytes. OTOH, BGs #3 can be counted in the "total", since they are already full we cannot allocate from them anyway. For them, "total_bytes == used + reserved + pinned + zone_unusable" should hold. Tracking #2 and #3 as "active_total_bytes" (current implementation) is confusing. And, tracking #1 and subtract that properly from "total_bytes" every time you need space reservation is cumbersome. Instead, we can count the whole region of a newly allocated block group as zone_unusable. Then, once that block group is activated, release [0 .. zone_capacity] from the zone_unusable counters. With this, we can eliminate the confusing ->active_total_bytes and the code will be common among regular and the zoned mode. Also, no additional counter is needed with this approach. Fixes: 6a921de58992 ("btrfs: zoned: introduce space_info->active_total_bytes") CC: stable@vger.kernel.org # 6.1+ Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>
2022-12-05btrfs: pass btrfs_inode to btrfs_add_delayed_iputDavid Sterba
The function is for internal interfaces so we should use the btrfs_inode. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2022-12-05btrfs: simplify percent calculation helpers, rename div_factorDavid Sterba
The div_factor* helpers calculate fraction or percentage fraction. The name is a bit confusing, we use it only for percentage calculations and there are two helpers. There's a helper mult_frac that's for general fractions, that tries to be accurate but we multiply and divide by small numbers so we can use the div_u64 helper. Rename the div_factor* helpers and use 1..100 percentage range, also drop the case checking for percentage == 100, it's never hit. The conversions: * div_factor calculates tenths and the numbers need to be adjusted * div_factor_fine is direct replacement Signed-off-by: David Sterba <dsterba@suse.com>
2022-12-05btrfs: move super_block specific helpers into super.hJosef Bacik
This will make syncing fs.h to user space a little easier if we can pull the super block specific helpers out of fs.h and put them in super.h. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2022-12-05btrfs: move file prototypes to file.hJosef Bacik
Move these out of ctree.h into file.h to cut down on code in ctree.h. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2022-12-05btrfs: move file-item prototypes into their own headerJosef Bacik
Move these prototypes out of ctree.h and into file-item.h. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2022-12-05btrfs: update function commentsDavid Sterba
Update, reformat or reword function comments. This also removes the kdoc marker so we don't get reports when the function name is missing. Changes made: - remove kdoc markers - reformat the brief description to be a proper sentence - reword to imperative voice - align parameter list - fix typos Signed-off-by: David Sterba <dsterba@suse.com>
2022-12-05btrfs: move accessor helpers into accessors.hJosef Bacik
This is a large patch, but because they're all macros it's impossible to split up. Simply copy all of the item accessors in ctree.h and paste them in accessors.h, and then update any files to include the header so everything compiles. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> [ reformat comments, style fixups ] Signed-off-by: David Sterba <dsterba@suse.com>
2022-12-05btrfs: move BTRFS_FS_STATE* definitions and helpers to fs.hJosef Bacik
We're going to use fs.h to hold fs wide related helpers and definitions, move the FS_STATE enum and related helpers to fs.h, and then update all files that need these definitions to include fs.h. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>