summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-10-22bcachefs: run_one_trigger() now checks journal keysKent Overstreet
Previously, when doing updates and running triggers before journal replay completes, triggers would see the incorrect key for the old key being overwritten - this patch updates the trigger code to check the journal keys when necessary, needed for the upcoming allocator rewrite. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Stash a copy of key being overwritten in btree_insert_entryKent Overstreet
We currently need to call bch2_btree_path_peek_slot() multiple times in the transaction commit path - and some of those need to be updated to also check the keys from journal replay, too. Let's consolidate this and stash the key being overwritten in btree_insert_entry. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: bch2_btree_path_set_pos()Kent Overstreet
bch2_btree_path_set_pos() is now available outside of btree_iter.c Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: btree_id_cached()Kent Overstreet
Add a new helper that returns true if the given btree ID uses the btree key cache. This enables some new cleanups, since the helper can check the options for whether caching is enabled on a given btree. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Improve btree_key_cache_flush_pos()Kent Overstreet
btree_key_cache_flush_pos() uses BTREE_ITER_CACHED_NOFILL - but it wasn't checking for !ck->valid. It does check for the entry being dirty, so it shouldn't matter, but this refactor it a bit and adds and assertion. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Fix freeing in bch2_dev_buckets_resize()Kent Overstreet
We were double-freeing old_buckets and not freeing old_buckets_gens: also, the code was supposed to free buckets, not old_buckets; old_buckets is only needed because we have to use rcu_assign_pointer() instead of swap(), and won't be set if we hit the error path. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Don't keep nodes in btree_reserve lockedKent Overstreet
These nodes aren't reachable by other threads, so there's no need to keep it locked - and this fixes a bug with the assertion in bch2_trans_unlock() firing on transaction restart. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Log message improvementsKent Overstreet
Change the error messages in bch2_inconsistent_error() and bch2_fatal_error() so we can distinguish them. Also, prefer bch2_fs_fatal_error() (which also logs an error message) to bch2_fatal_error(), and change a call to bch2_inconsistent_error() to bch2_fatal_error() when we can't continue. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Delete some dead codeKent Overstreet
__bch2_mark_replicas() is now only used in one place, so inline it into the caller. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Ignore cached data when calculating fragmentationKent Overstreet
Previously, bucket fragmentation was considered to be bucket size - total amount of live data, both dirty and cached. This meant that if a bucket was full but only a small amount of data in it was dirty - the rest cached, we'd get stuck: copygc wouldn't move the dirty data out of the bucket and the allocator wouldn't be able to invalidate and drop the cached data. This changes fragmentation to exclude cached data, so that copygc will evacuate these buckets and copygc/the allocator will always be able to make forward progress. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Don't use in-memory bucket array for alloc updatesKent Overstreet
More prep work for getting rid of the in-memory bucket array: now that we have BTREE_ITER_WITH_JOURNAL, the allocator code can do ntree lookups before journal replay is finished, and there's no longer any need for it to get allocation information from the in-memory bucket array. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Kill allocator short-circuit invalidateKent Overstreet
The allocator thread invalidates buckets (increments their generation number) prior to discarding them and putting them on freelists. We've had a short circuit path for some time to only update the in-memory bucket mark when doing the invalidate if we're not invalidating cached data, but that short-circuit path hasn't really been needed for quite some time (likely since the btree key cache code was added). We're deleting it now as part of deleting/converting code that uses the in memory bucket array. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: BTREE_INSERT_LAZY_RW is only for recovery pathKent Overstreet
BTREE_INSERT_LAZY_RW shouldn't do anything after the filesystem has finished starting up - otherwise, it might interfere with going read-only as part of shutting down. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Handle transaction restarts in __bch2_move_data()Kent Overstreet
We weren't checking for -EINTR in the main loop in __bch2_move_data - this code predates modern transaction restarts. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Simplify bch2_inode_delete_keys()Kent Overstreet
Had a bug report that implies bch2_inode_delete_keys() returned -EINTR before it completed, so this patch simplifies it and makes the flow control a little more conventional. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: iter->update_pathKent Overstreet
With BTREE_ITER_FILTER_SNAPSHOTS, we have to distinguish between the path where the key was found, and the path for inserting into the current snapshot. This adds a new field to struct btree_iter for saving a path for the current snapshot, and plumbs it through bch2_trans_update(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Refactor bch2_btree_iter()Kent Overstreet
This splits bch2_btree_iter() up into two functions: an inner function that handles BTREE_ITER_WITH_JOURNAL, BTREE_ITER_WITH_UPDATES, and iterating acrcoss leaf nodes, and an outer one that implements BTREE_ITER_FILTER_SNAPHSOTS. This is prep work for remember a btree_path at our update position in BTREE_ITER_FILTER_SNAPSHOTS mode. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Tracepoint improvementsKent Overstreet
This improves the transaction restart tracepoints - adding distinct tracepoints for all the locations and reasons a transaction might have been restarted, and ensures that there's a tracepoint for every transaction restart. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: New snapshot unit testKent Overstreet
This still needs to be expanded more, but this adds a basic test for BTREE_ITER_FILTER_SNAPSHOTS. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Fix an error path in bch2_snapshot_node_create()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Use BTREE_INSERT_USE_RESERVE in btree_update_key()Kent Overstreet
bch2_btree_update_key() is used in the btree node write path - before delivering the completion we have to update the parent pointer with the number of sectors written. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Refactor trigger codeKent Overstreet
This breaks bch2_trans_commit_run_triggers() up into multiple functions, and deletes a bit of duplication - prep work for triggers on alloc keys, which will need to run last. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Rename data_op_data_progress -> data_jobsKent Overstreet
Mild refactoring. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Fix check_pos_snapshot_overwritten for !snapshotsKent Overstreet
It shouldn't run if the btree being checked doesn't have snapshots. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: New data structure for buckets waiting on journal commitKent Overstreet
Implement a hash table, using cuckoo hashing, for empty buckets that are waiting on a journal commit before they can be reused. This replaces the journal_seq field of bucket_mark, and is part of eventually getting rid of the in memory bucket array. We may need to make bch2_bucket_needs_journal_commit() lockless, pending profiling and testing. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Also print out in-memory gen on stale dirty pointerKent Overstreet
We're trying to track down a bug that shows itself as newly-created extents having stale dirty pointers - possibly due to the in memory gen and the btree gen being inconsistent. This patch changes the error message to also print out the in memory bucket gen when this happens. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Improve path for when btree_gc needs another passKent Overstreet
btree_gc sometimes needs another pass when it corrects bucket generation numbers or data types - when it finds multiple pointers of different data types to the same bucket, it may want to keep the second one it found. When this happens, we now clear out bucket sector counts _without_ resetting the bucket generation/data types that we already found, instead of resetting them to what we have in the alloc btree. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Fix bch2_check_fix_ptrs()Kent Overstreet
The repair for for btree_ptrs was saying one thing and doing another - fortunately, that code can just be deleted. Also, when we update a btree node pointer, we also have to update node in memery, if it exists in the btree node cache - this fixes bch2_check_fix_ptrs() to do that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Fix an uninitialized variableKent Overstreet
Only userspace builds were complaining about it, oddly enough. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22Revert "bcachefs: Delete some obsolete journal_seq_blacklist code"Kent Overstreet
This reverts commit f95b61228efd04c9c158123da5827c96e9773b29. It turns out, we're seeing filesystems in the wild end up with blacklisted btree node bsets - this should not be happening, and until we understand why and fix it we need to keep this code around. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Log & error message improvementsKent Overstreet
- Add a shim uuid_unparse_lower() in the kernel, since %pU doesn't work in userspace - We don't need to print the bcachefs: or the filesystem name prefix in userspace - Improve a few error messages Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: BTREE_ITER_FILTER_SNAPSHOTS is selected automaticallyKent Overstreet
It doesn't have to be specified - this patch deletes the two instances where it was. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Switch to __func__for recording where btree_trans was initializedKent Overstreet
Symbol decoding, via %ps, isn't supported in userspace - this will also be faster when we're using trans->fn in the fast path, as with the new BCH_JSET_ENTRY_log journal messages. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Fix bch2_journal_seq_blacklist_add()Kent Overstreet
The old code correctly handled the case where we were blacklisting a range that exactly matched an existing entry, but not the case where the new range partially overlaps an existing entry. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Add verbose log messages for journal readKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Improved superblock-related error messagesKent Overstreet
This patch converts bch2_sb_validate() and the .validate methods for the various superblock sections to take printbuf, to which they can print detailed error messages, including printing the entire section that was invalid. This is a great improvement over the previous situation, where we could only return static strings that didn't have precise information about what was wrong. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Use kvmalloc() for array of sorted keys in journal replayKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Make eytzinger size parameter more conventionalKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Kill bch2_bset_fix_invalidated_key()Kent Overstreet
Was dead code, so delete it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Fix an assertionKent Overstreet
bch2_trans_commit() can legitimately return -ENOSPC with BTREE_INSERT_NOFAIL set if BTREE_INSERT_NOWAIT was also set. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: bch_dev->devKent Overstreet
Add a field to bch_dev for the dev_t of the underlying block device - this fixes a null ptr deref in tracepoints. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Simplify journal replayKent Overstreet
With BTREE_ITER_WITH_JOURNAL, there's no longer any restrictions on the order we have to replay keys from the journal in, and we can also start up journal reclaim right away - and delete a bunch of code. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22fixup! bcachefs: Factor out __bch2_btree_iter_set_pos()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: BTREE_ITER_WITH_JOURNALKent Overstreet
This adds a new btree iterator flag, BTREE_ITER_WITH_JOURNAL, that is automatically enabled when initializing a btree iterator before journal replay has completed - it overlays the contents of the journal with the btree. This lets us delete bch2_btree_and_journal_walk() and just use the normal btree iterator interface instead - which also lets us delete a significant amount of duplicated code. Note that BTREE_ITER_WITH_JOURNAL is still unoptimized in this patch - we're redoing the binary search over keys in the journal every time we call bch2_btree_iter_peek(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Tweak journal reclaim orderKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Make sure BCH_FS_FSCK_DONE gets setKent Overstreet
If we're not running fsck we still want to set BCH_FS_FSCK_DONE, so that bch2_fsck_err() calls are interpreted as bch2_inconsistent_error() calls(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Improve error messages in superblock write pathKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Log what we're doing when repairingKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Fix race between btree updates & journal replayKent Overstreet
Add a flag to indicate whether a journal replay key has been overwritten, and set/test it with appropriate btree locks held. This fixes a race between the allocator - invalidating buckets, and doing btree updates - and journal replay, which before this patch could clobber the allocator thread's update with an older version of the key from the journal. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: bch2_journal_entry_to_text()Kent Overstreet
This adds a _to_text() pretty printer for journal entries - including every subtype - which will shortly be used by the 'bcachefs list_journal' subcommand. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>