summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-10-22bcachefs: EINTR -> BCH_ERR_transaction_restartKent Overstreet
Now that we have error codes, with subtypes, we can switch to our own error code for transaction restarts - and even better, a distinct error code for each transaction restart reason: clearer code and better debugging. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: btree_trans_too_many_iters() is now a transaction restartKent Overstreet
All transaction restarts need a tracepoint - this is essential for debugging Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Prevent a btree iter overflow in alloc pathKent Overstreet
In bch2_bucket_alloc_trans(), we're iterating over buckets - but not directly with an iterator, since we're iterating over the freespace btree. This means that we need to clear iter->path->preserve, otherwise we'll end up retaining a btree_path for every alloc key we touched - which is not what we want here. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Use bch2_err_str() in error messagesKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Improved errcodesKent Overstreet
Instead of overloading standard error codes (EINTR/EAGAIN), and defining short lists of error codes in multiple places that potentially end up overlapping & conflicting, we're now going to have one master list of error codes. Error codes are defined with an x-macro: thus we also have bch2_err_str() now. Also, error codes have a class field. Now, instead of checking for errors with ==, code should use bch2_err_matches(), which returns true if the error is equal to or a sub-error of the error class. This means we can define unique errors for every source location where an error is generated, which will help improve our error messages. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: We can handle missing btree roots for all alloc btreesKent Overstreet
We can rebuild alloc info if these btree roots are missing - no need to bail out and say the filesystem is unrecoverable Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix should_invalidate_buckets()Kent Overstreet
Like bch2_copygc_wait_amount, should_invalidate_buckets() needs to try to ensure that there are always more buckets free than the largest reserve. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: ec_stripe_bkey_insert() -> for_each_btree_key_norestart()Kent Overstreet
With the upcoming patches to add assertions for incorrect nested transaction restart handling, this code is now bogus. Switch it to for_each_btree_key_norestart() so that transaction restarts are only handled in one place. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Convert erasure coding to for_each_btree_key_commit()Kent Overstreet
The new for_each_btree_key2() macro handles transaction retries, allowing us to avoid nested transactions - which we want to avoid since they're tricky to do completely correctly and upcoming assertions are going to be checking for that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Add a counter for btree_trans restartsKent Overstreet
This will help us improve nested transactions - we need to add assertions that whenever an inner transaction handles a restart, it still returns -EINTR to the outer transaction. This also adds nested_lockrestart_do() and nested_commit_do() which use the new counters to correctly return -EINTR when the transaction was restarted. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Convert alloc code to for_each_btree_key_commit()Kent Overstreet
The new for_each_btree_key2() macro handles transaction retries, allowing us to avoid nested transactions - which we want to avoid since they're tricky to do completely correctly and upcoming assertions are going to be checking for that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Convert subvol code to for_each_btree_key_commit()Kent Overstreet
The new for_each_btree_key2() macro handles transaction retries, allowing us to avoid nested transactions - which we want to avoid since they're tricky to do completely correctly and upcoming assertions are going to be checking for that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Convert bch2_dev_usrdata_drop() to for_each_btree_key2()Kent Overstreet
The new for_each_btree_key2() macro handles transaction retries, allowing us to avoid nested transactions - which we want to avoid since they're tricky to do completely correctly and upcoming assertions are going to be checking for that. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Convert bch2_do_invalidates_work() to for_each_btree_key2()Kent Overstreet
The new for_each_btree_key2() macro handles transaction retries, allowing us to avoid nested transactions - which we want to avoid since they're tricky to do completely correctly and upcoming assertions are going to be checking for that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: bch2_trans_run()Kent Overstreet
This adds a new helper, bch2_trans_run(), that runs a function with a btree_transaction context but without handling transaction restarts. We're adding checks for nested transaction restart handling: when an inner transaction handles a transaction restart it will still have to return it to the outer transaction, or else assertions will be popped in the outer transaction. But some places don't need restart handling at the outer scope, so this helper does what they need. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Convert bch2_gc_done() for_each_btree_key2()Kent Overstreet
This converts bch2_gc_stripes_done() and bch2_gc_reflink_done() to the new for_each_btree_key_commit() macro. The new for_each_btree_key2() and for_each_btree_key_commit() macros handles transaction retries, allowing us to avoid nested transactions - which we want to avoid since they're tricky to do completely correctly and upcoming assertions are going to be checking for that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Convert more fsck code to for_each_btree_key2()Kent Overstreet
The new for_each_btree_key2() macro handles transaction retries, allowing us to avoid nested transactions - which we want to avoid since they're tricky to do completely correctly and upcoming assertions are going to be checking for that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Convert more quota code to for_each_btree_key2()Kent Overstreet
The new for_each_btree_key2() macro handles transaction retries, allowing us to avoid nested transactions - which we want to avoid since they're tricky to do completely correctly and upcoming assertions are going to be checking for that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Convert bch2_check_lrus() to for_each_btree_key_commit()Kent Overstreet
The new for_each_btree_key2() macro handles transaction retries, allowing us to avoid nested transactions - which we want to avoid since they're tricky to do completely correctly and upcoming assertions are going to be checking for that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Convert bch2_dev_freespace_init() to for_each_btree_key_commit()Kent Overstreet
The new for_each_btree_key2() macro handles transaction retries, allowing us to avoid nested transactions - which we want to avoid since they're tricky to do completely correctly and upcoming assertions are going to be checking for that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Convert bch2_do_discards_work() to for_each_btree_key2()Kent Overstreet
The new for_each_btree_key2() macro handles transaction retries, allowing us to avoid nested transactions - which we want to avoid since they're tricky to do completely correctly and upcoming assertions are going to be checking for that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Improve bucket_alloc_fail tracepointKent Overstreet
We should be printing the number of free buckets, not just the number of available buckets. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: bch2_mark_alloc(): Do wakeups after updating usageKent Overstreet
We have an obvious wake up race if we do the wakeup _before_ updating the counters the thing doing the waiting is reading. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: added lock held time statsDaniel Hill
We now record the length of time btree locks are held and expose this in debugfs. Enabled via CONFIG_BCACHEFS_LOCK_TIME_STATS. Signed-off-by: Daniel Hill <daniel@gluo.nz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: bch2_time_stats_to_text now indents properlyDaniel Hill
Printbufs indentation feature doesn't yet work with '\n' and '\t'. So we've replaced all instances of '\n' with prt_newline. Signed-off-by: Daniel Hill <daniel@gluo.nz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: lock time stats prep work.Daniel Hill
We need the caller name and a place to store our results, btree_trans provides this. Signed-off-by: Daniel Hill <daniel@gluo.nz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Unlock in bch2_trans_begin() if we've held locks more than 10usKent Overstreet
We try to ensure we never hold btree locks for too long - bcachefs tries to be soft realtime. This adds a check when restarting a transaction, where a transaction restart is cheap - if we've been holding locks for too long, drop and retake them. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: for_each_btree_key2()Kent Overstreet
This introduces two new macros for iterating through the btree, with transaction restart handling - for_each_btree_key2() - for_each_btree_key_commit() Every iteration is now in an implicit transaction, and - as with lockrestart_do() and commit_do() - returning -EINTR will cause the transaction to be restarted, at the same key. This patch converts a bunch of code that was open coding this to these new macros, saving a substantial amount of code. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Fix repair for extent past end of inodeKent Overstreet
When we find an extent past an inode's i_size, we need to do the deletion in the inode's snapshot (which will emit a whiteout if necessary); and we also need to note that we now have an a key at that position and snapshot, so that we don't go into an infinite loop. Also, switch to walking inodes in reverse older, oldest snapshot to newest, so that we emit the fewest whiteouts possible. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: When fsck finds redundant snapshot keys, trigger snapshots cleanupKent Overstreet
Fsck now checks for keys in different snapshot IDs that are now redundant due to other snapshots being deleted - it needs to for its own algorithms to not get confused. When it detects this it should re-run the post snapshot deletion cleanup - this patch does that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Improve fsck for subvols/snapshotsKent Overstreet
- Bunch of refactoring, and move some code out of bch2_snapshots_start() and into bch2_snapshots_check(), for constency with the rest of fsck - Interior snapshot nodes no longer point to a subvolume; this is so we don't end up with dangling subvol references when deleting or require scanning the full snapshots btree. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Improve snapshots_seenKent Overstreet
This makes the snapshots_seen data structure fsck private and improves it; we now also track the equivalence class for each snapshot id we've seen, which means we can detect when snapshot deletion hasn't finished or run correctly (which will otherwise confuse fsck). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Fix subvol/snapshot deleting in recoveryKent Overstreet
fsck doesn't want to run while we're cleaning up deleted snapshots - if that work needs to be done, we want it to have finished before fsck runs, otherwise fsck will get confused when it finds multiple keys in the same snapshot ID equivalence class (i.e. the mechanism that snapshot deletion uses for cleaning up redundant keys). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: fsck_inode_rm() shouldn't delete subvolsKent Overstreet
We should never see an inode marked as unlinked that's a subvolume root (or a directory) in fsck, but even if we do it's not correct for fsck to delete the subvolume: subvolumes are owned by dirents, and if we find a dangling subvolume (not marked as unlinked) we want fsck to reattach it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Switch data_update path to snapshot_id_listKent Overstreet
snapshots_seen is becoming private to fsck, and snapshot_id_list is actually what the data update path needs. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Fix snapshot deletionKent Overstreet
Snapshots being deleted won't in general have a corresponding subvolume: this fixes a spurious fsck error where we'd complain about a snapshot pointing to a missing subvolume - but the subvolume had been deleted, and the snapshot was pending deletion as well. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Rename __bch2_trans_do() -> commit_do()Kent Overstreet
Better/more descriptive naming, and prep for adding nested_lockrestart_do() and nested_commit_do(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Silence some fsck errors when reconstructing alloc infoKent Overstreet
There's no need to print fsck errors for errors that are expected, and the user has already opted to repair. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Put some repair messages behind opts->verboseKent Overstreet
These messages log the updates we're doing in bch2_check_fix_ptrs(), which is useful when debugging but not usually needed. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Silence unimportant tracepointsKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Fix move path when move_stats == NULLKent Overstreet
This isn't done very often, but it is legitimate Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Get ref on c->writes in move.cKent Overstreet
There's no point reading an extent in order to move it if the write is going to fail because we're shutting down. This patch changes the move path so that moving_io now owns a ref on c->writes - as a bonus, rebalance and copygc will now notice that we're shutting down and exit quicker. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: move.c refactoringKent Overstreet
- add bch2_moving_ctxt_(init|exit) - split out __bch2_evacutae_bucket() which takes an existing moving_ctxt, this will be used for improving copygc performance by pipelining across multiple buckets Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: data jobs, including rebalance wait for copygc.Daniel Hill
move_ratelimit() now has a bool that specifies whether we want to wait for copygc to finish. When copygc is running, we're probably low on free buckets instead of consuming the remaining buckets, we want to wait for copygc to finish. This should help with performance, and run away bucket fragmentation. Signed-off-by: Daniel Hill <daniel@gluo.nz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Redo data_update interfaceKent Overstreet
This patch significantly cleans up and simplifies the data_update interface. Instead of only being able to specify a single pointer by device to rewrite, we're now able to specify any or all of the pointers in the original extent to be rewrited, as a bitmask. data_cmd is no more: the various pred functions now just return true if the extent should be moved/updated. All the data_update path does is rewrite existing replicas, or add new ones. This fixes a bug where with background compression on replicated filesystems, where rebalance -> data_update would incorrectly drop the wrong old replica, and keep trying to recompress an extent pointer and each time failing to drop the right replica. Oops. Now, the data update path doesn't look at the io options to decide which pointers to keep and which to drop - it only goes off of the data_update_options passed to it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Fix bch2_check_alloc_key()Kent Overstreet
bch2_check_alloc_key() was failing to check buckets that didn't have alloc keys yet (because they'd never been used) - they still need to be added to the freespace btree. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Improve bch2_check_alloc_infoKent Overstreet
- In check_alloc_key(), previously we were re-initializing iterators for the need_discard and freespace btrees for every alloc key we checked. But this was causing us to redo lookups into the journal keys every time, since those lookups are cached in struct btree_iter. This initializes the iterators in bch2_check_alloc_info and passes them into check_alloc_key(). - Make the looping more consistent/efficient in bch2_check_alloc_info() Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Use BTREE_INSERT_LAZY_RW in bch2_check_alloc_info()Kent Overstreet
This runs before we go rw for journal replay, but after we're allowed to go rw. It might be time to consider killing BTREE_INSERT_LAZY_RW, though. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Bucket invalidate path improvementsKent Overstreet
- invalidate_one_bucket() now returns 1 when we don't have any buckets on this device to invalidate, ensuring we don't spin - the tracepoint invocation is moved to after the transaction commit, and we now include the number of cached sectors in the tracepoint Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Don't BUG_ON() inode link count underflowKent Overstreet
This switches that assertion to a bch2_trans_inconsistent() call, as it should be. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>