path: root/fs/bcachefs/buckets.c
Age  Commit message  Author
2025-01-09  bcachefs: bch2_kvmalloc()  Kent Overstreet
Add a version of kvmalloc() that doesn't have the INT_MAX limit; large filesystems do hit this. We'll want to get rid of the in-memory bucket gens array, but we're not there quite yet. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-29  bcachefs: kill __bch2_extent_ptr_to_bp()  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-29  bcachefs: bch2_extent_ptr_to_bp() no longer depends on device  Kent Overstreet
bch_backpointer no longer contains the bucket_offset field, it's just a direct LBA mapping (with low bits to account for compressed extent splitting), so we don't need to refer to the device to construct it anymore. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21  bcachefs: alloc_data_type_set() happens in alloc trigger  Kent Overstreet
Originally, we ran insert triggers before overwrite so that if an extent was being moved (by fallocate insert/collapse range), the bucket sector count wouldn't hit 0 partway through, and so we don't trigger state changes caused by that too soon. But this is better solved by just moving the data type change to the alloc trigger itself, where it's already called. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21  bcachefs: Kill unnecessary mark_lock usage  Kent Overstreet
We can't hold mark_lock while calling fsck_err() - that's a deadlock, mark_lock is meant to be a leaf node lock. It's also unnecessary for gc_bucket() and bucket_gen(); rcu suffices since the bucket_gens array describes its size, and we can't race with device removal or resize during gc/fsck since that takes state lock. Reported-by: syzbot+38641fcbda1aaffefdd4@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21  bcachefs: Kill bch2_bucket_alloc_new_fs()  Kent Overstreet
The early-early allocation path, bch2_bucket_alloc_new_fs(), is no longer needed - and inconsistencies around new_fs_bucket_idx have been a frequent source of bugs. Reported-by: syzbot+592425844580a6598410@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21  bcachefs: bch_backpointer -> bkey_i_backpointer  Kent Overstreet
Since we no longer store backpointers in alloc keys, there's no reason not to pass around bkey_i_backpointers; this means we don't have to pass the bucket pos separately. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21  bcachefs: Kill FSCK_NEED_FSCK  Kent Overstreet
If we find an error that indicates that we need to run fsck, we can specify that directly with run_explicit_recovery_pass(). These are now log_fsck_err() calls: we're just logging in the superblock that an error occurred - and possibly doing an emergency shutdown, depending on policy. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21  bcachefs: Move bch_extent_rebalance code to rebalance.c  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21  bcachefs: Annotate struct bucket_gens with __counted_by()  Thorsten Blum
Add the __counted_by compiler attribute to the flexible array member b to improve access bounds-checking via CONFIG_UBSAN_BOUNDS and CONFIG_FORTIFY_SOURCE. Use struct_size() to calculate the number of bytes to be allocated. Update bucket_gens->nbuckets and bucket_gens->nbuckets_minus_first when resizing. Compile-tested only. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
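The allocation pattern described above can be sketched in userspace C. This is only an illustration under stated assumptions: `struct gens_demo`, `gens_demo_bytes()`, `gens_demo_alloc()` and the `COUNTED_BY()` macro are hypothetical names, and the size helper merely imitates the overflow-checked arithmetic that the kernel's struct_size() provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Expand to nothing where the compiler lacks the counted_by attribute. */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(member) __attribute__((counted_by(member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)
#endif

/* Hypothetical stand-in for struct bucket_gens; the layout is illustrative. */
struct gens_demo {
	size_t  nbuckets;
	uint8_t b[] COUNTED_BY(nbuckets);
};

/* Overflow-checked "header + n elements", in the spirit of struct_size(). */
static int gens_demo_bytes(size_t n, size_t *bytes)
{
	assert(bytes != NULL);
	if (n > (SIZE_MAX - sizeof(struct gens_demo)) / sizeof(uint8_t))
		return -1;
	*bytes = sizeof(struct gens_demo) + n * sizeof(uint8_t);
	return 0;
}

static struct gens_demo *gens_demo_alloc(size_t n)
{
	size_t bytes;
	struct gens_demo *g;

	if (gens_demo_bytes(n, &bytes))
		return NULL;
	g = calloc(1, bytes);
	if (g)
		g->nbuckets = n;	/* set the count before touching b[] */
	return g;
}
```

The count is assigned immediately after allocation because bounds checking on `b[]` is derived from the current value of `nbuckets`.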
2024-10-18  bcachefs: bch2_folio_reservation_get_partial() is now better behaved  Kent Overstreet
bch2_folio_reservation_get_partial(), on partial success, will now return a reservation that's aligned to the filesystem blocksize. This is a partial fix for fstests generic/299 - fio verify is badly behaved in the presence of short writes that aren't aligned to its blocksize. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
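As a rough illustration of block-aligned partial success, the sketch below rounds a partially satisfied reservation down to a whole number of blocks. `reservation_round_down()` is a hypothetical helper, not the kernel function, and it assumes the block size in sectors is a power of two; the actual alignment logic is not shown in this log.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: given how many sectors could actually be reserved,
 * return the largest amount that is a whole number of filesystem blocks.
 * block_sectors must be a nonzero power of two.
 */
static uint64_t reservation_round_down(uint64_t got_sectors,
				       uint64_t block_sectors)
{
	assert(block_sectors && !(block_sectors & (block_sectors - 1)));
	return got_sectors & ~(block_sectors - 1);
}
```

With 8-sector blocks, a partial reservation of 13 sectors would be trimmed to 8, so a caller never sees a short write that ends mid-block.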
2024-09-21  bcachefs: bch2_trigger_ptr() calculates sectors even when no device  Kent Overstreet
This is necessary for erasure coded pointers to devices that have been removed. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-21  bcachefs: EIO errcode cleanup  Kent Overstreet
We want to be using private errcodes whenever possible, for better error messages. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-21  bcachefs: bch2_dev_rcu_noerror()  Kent Overstreet
bch2_dev_rcu() now properly errors if the device is invalid. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-21  bcachefs: Move tabstop setup to bch2_dev_usage_to_text()  Kent Overstreet
No reason for it not to be where it's needed. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-09  bcachefs: Annotate bch_replicas_entry_{v0,v1} with __counted_by()  Thorsten Blum
Add the __counted_by compiler attribute to the flexible array members devs to improve access bounds-checking via CONFIG_UBSAN_BOUNDS and CONFIG_FORTIFY_SOURCE. Increment nr_devs before adding a new device to the devs array and adjust the array indexes accordingly. Add a helper macro for adding a new device. In bch2_journal_read(), explicitly set nr_devs to 0. Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-09  bcachefs: More BCH_SB_MEMBER_INVALID support  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-01  bcachefs: fix rebalance accounting  Kent Overstreet
Fixes: 49aa7830396b ("bcachefs: Fix rebalance_work accounting") Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-08-24  bcachefs: Fix rebalance_work accounting  Kent Overstreet
rebalance_work was keying off of the presence of rebalance_opts in the extent, but that was incorrect: we keep those around after rebalance for indirect extents, since the inode's options are not directly available. Fixes: 20ac515a9cc7 ("bcachefs: bch_acct_rebalance_work") Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-08-16  bcachefs: Fix locking in __bch2_trans_mark_dev_sb()  Kent Overstreet
We run this in full RW mode now, so we have to guard against the superblock buffer being reallocated. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-08-16  bcachefs: Fix forgetting to pass trans to fsck_err()  Kent Overstreet
Reported-by: syzbot+e3938cd6d761b78750e6@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-08-13  bcachefs: bcachefs_metadata_version_disk_accounting_inum  Kent Overstreet
This adds another disk accounting counter to track usage per inode number (any snapshot ID). This will be used for a couple of things: - It'll give us a way to tell the user how much space a given file is consuming in all snapshots; i.e. how much extra space it's consuming due to snapshot versioning. - It counts number of extents and total size of extents (both in btree keyspace sectors and actual disk usage), meaning it gives us average extent size: that is, it'll let us cheaply find fragmented files that should be defragmented. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
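To make the "average extent size" point concrete, here is a small sketch of the arithmetic. The struct and function names are hypothetical; the log does not show the real counter layout, and the fragmentation threshold is an arbitrary parameter chosen by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-inode counters; the real layout is not shown in this log. */
struct inum_acct_demo {
	uint64_t nr_extents;	/* number of extents */
	uint64_t sectors;	/* total extent size, btree keyspace sectors */
	uint64_t disk_sectors;	/* actual disk usage, in sectors */
};

/* Average extent size in sectors; 0 for an inode with no extents. */
static uint64_t avg_extent_sectors(const struct inum_acct_demo *a)
{
	return a->nr_extents ? a->sectors / a->nr_extents : 0;
}

/* Flag files whose average extent is smaller than the caller's threshold. */
static bool looks_fragmented(const struct inum_acct_demo *a,
			     uint64_t min_avg_sectors)
{
	return a->nr_extents > 1 && avg_extent_sectors(a) < min_avg_sectors;
}
```

Because both inputs are plain counters, the check needs no walk over the file's extents, which is what makes finding fragmented files cheap.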
2024-08-09  bcachefs: improve bch2_dev_usage_to_text()  Kent Overstreet
Add a line for capacity. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14  bcachefs: Fix missing BTREE_TRIGGER_bucket_invalidate flag  Kent Overstreet
This fixes an accounting mismatch for cached data. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14  bcachefs: Reduce the scope of gc_lock  Kent Overstreet
gc_lock is now only for synchronization between check_alloc_info and interior btree updates - nothing else. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14  bcachefs: fsck_err() may now take a btree_trans  Kent Overstreet
fsck_err() now optionally takes a btree_trans; if the current thread has one, it is required that it be passed. The next patch will use this to unlock when waiting for user input. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14  bcachefs: bch_acct_rebalance_work  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14  bcachefs: bch_acct_btree  Kent Overstreet
Add counters for how much disk space we're using per btree. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14  bcachefs: bch_acct_snapshot  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14  bcachefs: bch_acct_compression  Kent Overstreet
This adds per-compression-type accounting of compressed and uncompressed size as well as number of extents - meaning we can now see compression ratio (without walking the whole filesystem). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14  bcachefs: Convert gc to new accounting  Kent Overstreet
Rewrite fsck/gc for the new accounting scheme. This adds a second set of in-memory accounting counters for gc to use; like with other parts of gc, we run all triggers in TRIGGER_GC mode, then compare what we calculated to existing in-memory accounting at the end. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14  bcachefs: Kill fs_usage_online  Kent Overstreet
More dead code deletion. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14  bcachefs: Kill bch2_fs_usage_to_text()  Kent Overstreet
Dead code. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14  bcachefs: Delete journal-buf-sharded old style accounting  Kent Overstreet
More deletion of dead code. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14  bcachefs: kill bch2_fs_usage_read()  Kent Overstreet
With bch2_ioctl_fs_usage(), this is now dead code. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14  bcachefs: Kill bch2_fs_usage_initialize()  Kent Overstreet
Deleting code for the old disk accounting scheme. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14  bcachefs: dev_usage updated by new accounting  Kent Overstreet
Reading disk accounting now requires an eytzinger lookup (see: bch2_accounting_mem_read()), but the per-device counters are used frequently enough that we'd like to still be able to read them with just a percpu sum, as in the old code. This patch special-cases the device counters: when we update in-memory accounting, we also update the old-style percpu counters if it's a device counter update. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14  bcachefs: Disk space accounting rewrite  Kent Overstreet
Main part of the disk accounting rewrite. This is a wholesale rewrite of the existing disk space accounting, which relies on percpu counters that are sharded by journal buffer, and rolled up and added to each journal write. With the new scheme, every set of counters is a distinct key in the accounting btree; this fixes scaling limitations of the old scheme, where counters took up space in each journal entry and required multiple percpu counters. Now, in-memory accounting requires a single set of percpu counters - not multiple for each in-flight journal buffer - and in the future we'll probably also have counters that don't use in-memory percpu counters, since they're not strictly required. An accounting update is now a normal btree update, using the btree write buffer path. At transaction commit time, we apply accounting updates to the in-memory counters, which are percpu counters indexed in an eytzinger tree by the accounting key. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
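For readers unfamiliar with the eytzinger layout mentioned above, the following is a generic userspace sketch, not the bcachefs implementation. It stores sorted keys in breadth-first (implicit binary tree) order and looks one up; `eytz_build()` and `eytz_find()` are hypothetical names, keys are plain 64-bit integers rather than accounting keys, and slot 0 of the array is unused.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Fill out[1..n] (1-based) with the eytzinger layout of sorted[0..n-1]:
 * node k has children 2k and 2k+1, and an in-order walk of that implicit
 * tree visits the keys in sorted order. Call with i = 0 and k = 1;
 * returns the number of keys placed.
 */
static size_t eytz_build(const uint64_t *sorted, uint64_t *out,
			 size_t n, size_t i, size_t k)
{
	if (k <= n) {
		i = eytz_build(sorted, out, n, i, 2 * k);
		out[k] = sorted[i++];
		i = eytz_build(sorted, out, n, i, 2 * k + 1);
	}
	return i;
}

/* Return the 1-based slot holding key, or 0 if the key is absent. */
static size_t eytz_find(const uint64_t *tree, size_t n, uint64_t key)
{
	size_t k = 1;

	while (k <= n) {
		if (tree[k] == key)
			return k;
		k = 2 * k + (tree[k] < key);	/* descend left or right */
	}
	return 0;
}
```

The slot returned by the lookup could then index a parallel array of counters. The layout keeps the top levels of the tree adjacent in memory, which tends to be friendlier to caches than binary search over a sorted array.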
2024-07-14  bcachefs: BCH_DATA_unstriped  Kent Overstreet
Add a new pseudo data type, to track buckets that are members of a stripe, but have unstriped data in them. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14  bcachefs: bch_alloc->stripe_sectors  Kent Overstreet
Add a separate counter to bch_alloc_v4 for amount of striped data; this lets us separately track striped and unstriped data in a bucket, which lets us see when erasure coding has failed to update extents with stripe pointers, and also find buckets to continue updating if we crash mid way through creating a new stripe. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14  bcachefs: Use try_cmpxchg() family of functions instead of cmpxchg()  Uros Bizjak
Use try_cmpxchg() family of functions instead of cmpxchg (*ptr, old, new) == old. x86 CMPXCHG instruction returns success in ZF flag, so this change saves a compare after cmpxchg (and related move instruction in front of cmpxchg). Also, try_cmpxchg() implicitly assigns old *ptr value to "old" when cmpxchg fails. There is no need to re-read the value in the loop. No functional change intended. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Brian Foster <bfoster@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
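The loop shape this message describes has a close userspace analogue in C11 atomics, sketched below. This is an analogy rather than kernel code: atomic_compare_exchange_weak() likewise returns a success flag and, on failure, writes the current value back into the expected-value variable. `set_bits()` is a hypothetical example function.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Atomically OR `bits` into *ptr; returns the value seen before the update. */
static uint32_t set_bits(_Atomic uint32_t *ptr, uint32_t bits)
{
	uint32_t old;

	assert(ptr != NULL);
	old = atomic_load(ptr);

	/*
	 * No re-read inside the loop: a failed exchange stores the current
	 * value of *ptr into `old`, so the next attempt starts from fresh data.
	 */
	while (!atomic_compare_exchange_weak(ptr, &old, old | bits))
		;

	return old;
}
```

The older style would compare the value returned by the exchange against `old` and reload `*ptr` on each retry; the form above folds both steps into the call itself.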
2024-07-10  bcachefs: Fix RCU splat  Kent Overstreet
Reported-by: syzbot+e74fea078710bbca6f4b@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-10  bcachefs: Check for invalid bucket from bucket_gen(), gc_bucket()  Kent Overstreet
Turn more asserts into proper recoverable error paths. Reported-by: syzbot+246b47da27f8e7e7d6fb@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-10  bcachefs: Replace bucket_valid() asserts in bucket lookup with proper checks  Kent Overstreet
The bucket_gens array and gc_buckets array know their own size; we should be using those members, and returning an error. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
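A minimal sketch of the idea, assuming a simplified array that records its own bounds: the lookup validates the index against those members and returns an error instead of asserting. `struct gens_array_demo` and `bucket_gen_checked()` are hypothetical, and the error code is chosen only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical array that carries its own valid index range. */
struct gens_array_demo {
	size_t         first_bucket;	/* lowest valid bucket index */
	size_t         nbuckets;	/* one past the highest valid index */
	const uint8_t *gens;		/* nbuckets entries */
};

/*
 * Look up a bucket's generation. Returns 0 and fills *gen on success,
 * or -EINVAL if the bucket lies outside the range the array describes.
 */
static int bucket_gen_checked(const struct gens_array_demo *a,
			      size_t bucket, uint8_t *gen)
{
	if (bucket < a->first_bucket || bucket >= a->nbuckets)
		return -EINVAL;

	*gen = a->gens[bucket];
	return 0;
}
```

A caller can then report the bad bucket and continue, rather than crashing on input that may come from a corrupted filesystem.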
2024-06-10  bcachefs: Fix refcount leak in check_fix_ptrs()  Kent Overstreet
fsck_err() does a goto fsck_err on error; factor out check_fix_ptr() so that our error label can drop our device ref. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
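The shape of this fix can be sketched as follows: once the per-pointer work lives in its own function, the error label in that function is the single place where the reference is dropped. Everything here is hypothetical (`struct dev_demo`, `dev_get()`, `dev_put()`, `check_one_ptr()`), and a plain jump stands in for the goto that fsck_err() performs on error.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device with a plain (non-atomic) reference count. */
struct dev_demo {
	int refs;
};

static void dev_get(struct dev_demo *d) { d->refs++; }
static void dev_put(struct dev_demo *d) { d->refs--; }

/*
 * Check a single pointer while holding a device reference. Both the
 * success path and the error path fall through the same label, so the
 * reference cannot leak on an early exit.
 */
static int check_one_ptr(struct dev_demo *dev, bool found_error)
{
	int ret = 0;

	assert(dev != NULL);
	dev_get(dev);

	if (found_error) {
		ret = -EIO;
		goto err;
	}

	/* ... further checks would go here ... */
err:
	dev_put(dev);
	return ret;
}
```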
2024-05-28  bcachefs: Fix uninitialized var warning  Kent Overstreet
Can't actually be used uninitialized, but gcc was being silly. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-20  bcachefs: Fix ref in trans_mark_dev_sbs() error path  Kent Overstreet
Reported-by: syzbot+5c7f715a7107a608a544@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-20  bcachefs: Fix rcu splat in check_fix_ptrs()  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-09  bcachefs: kill bch2_dev_bkey_exists() in bch2_check_fix_ptrs()  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08  bcachefs: Kill bch2_dev_bkey_exists() in backpointer code  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>