linux.git - Linus' kernel tree

Age	Commit message	Author
2023-10-22	bcachefs: data jobs, including rebalance wait for copygc.	Daniel Hill
	move_ratelimit() now has a bool that specifies whether we want to wait for copygc to finish. When copygc is running, we're probably low on free buckets; instead of consuming the remaining buckets, we want to wait for copygc to finish. This should help with performance and with runaway bucket fragmentation. Signed-off-by: Daniel Hill <daniel@gluo.nz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: Redo data_update interface	Kent Overstreet
	This patch significantly cleans up and simplifies the data_update interface. Instead of only being able to specify a single pointer by device to rewrite, we're now able to specify any or all of the pointers in the original extent to be rewritten, as a bitmask. data_cmd is no more: the various pred functions now just return true if the extent should be moved/updated. All the data_update path does is rewrite existing replicas, or add new ones. This fixes a bug with background compression on replicated filesystems, where rebalance -> data_update would drop the wrong old replica, then keep trying to recompress an extent pointer, each time failing to drop the right replica. Oops. Now, the data update path doesn't look at the io options to decide which pointers to keep and which to drop - it only goes off of the data_update_options passed to it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
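The bitmask idea described above can be illustrated with a small user-space sketch. This is not the bcachefs code: `struct sketch_extent`, `SKETCH_MAX_PTRS`, `ptrs_on_dev()` and `nr_ptrs_to_rewrite()` are hypothetical names invented for the example, and the extent is reduced to a list of device indices.

```c
#define SKETCH_MAX_PTRS 4	/* hypothetical limit, for illustration only */

/* Hypothetical stand-in for an extent: one device index per replica. */
struct sketch_extent {
	unsigned nr_ptrs;
	unsigned dev[SKETCH_MAX_PTRS];
};

/*
 * Build a mask selecting every pointer that lives on device @dev.
 * Bit i set means "rewrite pointer i of the original extent".
 */
static unsigned ptrs_on_dev(const struct sketch_extent *e, unsigned dev)
{
	unsigned mask = 0;

	for (unsigned i = 0; i < e->nr_ptrs; i++)
		if (e->dev[i] == dev)
			mask |= 1U << i;
	return mask;
}

/* Count how many replicas a given mask asks to have rewritten. */
static unsigned nr_ptrs_to_rewrite(const struct sketch_extent *e, unsigned mask)
{
	unsigned n = 0;

	for (unsigned i = 0; i < e->nr_ptrs; i++)
		if (mask & (1U << i))
			n++;
	return n;
}
```

With a mask, a caller can select one replica, several, or all of them in a single request, which a "single pointer by device" interface cannot express.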
2023-10-22	bcachefs: Fix bch2_check_alloc_key()	Kent Overstreet
	bch2_check_alloc_key() was failing to check buckets that didn't have alloc keys yet (because they'd never been used) - they still need to be added to the freespace btree. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Improve bch2_check_alloc_info	Kent Overstreet
	- In check_alloc_key(), previously we were re-initializing iterators for the need_discard and freespace btrees for every alloc key we checked. But this was causing us to redo lookups into the journal keys every time, since those lookups are cached in struct btree_iter. This initializes the iterators in bch2_check_alloc_info and passes them into check_alloc_key(). - Make the looping more consistent/efficient in bch2_check_alloc_info() Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Use BTREE_INSERT_LAZY_RW in bch2_check_alloc_info()	Kent Overstreet
	This runs before we go rw for journal replay, but after we're allowed to go rw. It might be time to consider killing BTREE_INSERT_LAZY_RW, though. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Bucket invalidate path improvements	Kent Overstreet
	- invalidate_one_bucket() now returns 1 when we don't have any buckets on this device to invalidate, ensuring we don't spin - the tracepoint invocation is moved to after the transaction commit, and we now include the number of cached sectors in the tracepoint Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Don't BUG_ON() inode link count underflow	Kent Overstreet
	This switches that assertion to a bch2_trans_inconsistent() call, as it should be. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Always descend to leaf nodes in btree_gc	Kent Overstreet
	If a btree node is unreadable, it's the topology repair that fixes that and it's kicked off by btree_gc, so btree_gc needs to touch every node and verify that they can be read. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: fix __dev_available().	Daniel Hill
	__dev_available() now calculates available buckets correctly. Previously it would almost always return 0 when we have cached data. Signed-off-by: Daniel Hill <daniel@gluo.nz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: Fix assertion in topology repair	Kent Overstreet
	If we were at the end of the node, when breaking out of the loop we'd pop the assertion on line 446 when cur wasn't NULL. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Make verbose option settable at runtime	Kent Overstreet
	-o verbose is very useful, and we're starting to use it more for runtime debug statements - making it possible to enable at runtime is a no brainer. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Improve "copygc requested to run" error message	Kent Overstreet
	This improves the "copygc requested to run but no buckets found" to show the device that requires copygc to be run on - we'll definitely need to improve this more. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Pull out data_update.c	Kent Overstreet
	This is the start of reorganizing the data IO paths. The plan is to also break apart io.c into data_read.c and data_write.c, and migrate_write will be renamed to the data_update path. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: Split out dev_buckets_free()	Kent Overstreet
	Previously, dev_buckets_available() only counted buckets that are eligible to be allocated right now - i.e. buckets that don't have cached data, or need discard, or need gc gens, etc. But most users of this function want to know how many buckets are eligible to be allocated from without moving data around - copygc, allocator striping, which means we should be including cached data buckets etc. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: btree key cache pcpu freedlist	Kent Overstreet
	Originally, the btree key cache code would always allocate new entries by reusing from the recently-freed list, if that list wasn't empty. But that behaviour was dropped, for lock contention reasons. But it seems that entries stranded on the freed list have been contributing to some of our oom issues, because long running btree transactions will prevent them from being freed. This patch re-adds allocating from the freed list, but it also adds percpu buffers to solve the lock contention issues - and the new percpu freed lists will improve the evict paths, too. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Make IO in flight by copygc/rebalance configurable	Kent Overstreet
	This adds a new option, move_bytes_in_flight, for configuring the amount of IO in flight by copygc/rebalance - users with many devices in their filesystem will want to increase this. In the future we should be smarter about this, but this is an easy improvement. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Check for extents with too many ptrs	Kent Overstreet
	We have a hardcoded maximum on number of pointers in an extent that's used by some other data structures - notably bch_devs_list - but we weren't actually checking for it. Oops. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Fix refcount leak in bch2_do_invalidates()	Kent Overstreet
	If we fail to queue the work item because it's already in process, we need to drop the ref we just took. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
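The pattern behind this fix can be sketched in user-space C. Everything here is a hypothetical stand-in (`struct sketch_dev`, `sketch_queue_work()`, `sketch_do_invalidates()`, `sketch_work_fn()`); the only behaviour borrowed from the kernel is that queueing a work item reports failure when the item is already queued.

```c
#include <stdbool.h>

struct sketch_dev {
	int  refs;		/* stand-in for a reference count */
	bool work_pending;	/* stand-in for a queued work item */
};

/* Mimics queueing work: returns false if the item was already queued. */
static bool sketch_queue_work(struct sketch_dev *d)
{
	if (d->work_pending)
		return false;
	d->work_pending = true;
	return true;
}

/* Take a ref for the work item; give it back if nothing was queued. */
static void sketch_do_invalidates(struct sketch_dev *d)
{
	d->refs++;
	if (!sketch_queue_work(d))
		d->refs--;	/* the already-queued item holds its own ref */
}

/* The work function drops the work item's ref when it finishes. */
static void sketch_work_fn(struct sketch_dev *d)
{
	d->work_pending = false;
	d->refs--;
}
```

Without the `d->refs--` on the failure branch, every redundant call would leave one reference that nothing ever drops.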
2023-10-22	bcachefs: Always use percpu_ref_tryget_live() on c->writes	Kent Overstreet
	If we're trying to get a ref and the refcount has been killed, it means we're doing an emergency shutdown - we always want tryget_live(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Improve checksum error messages	Kent Overstreet
	We're seeing checksum errors in the bch2_rechecksum_bio() path - give it a better error message to help track this down. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Improve an error message	Kent Overstreet
	When inserting a key type that's not valid for a given btree, we should print out which btree we were inserting into. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: Fix assertion in bch2_dev_list_add_dev()	Kent Overstreet
	We were only allowing 4 devices in a dev_list, not 16. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Increase max size for btree_trans bump allocator	Kent Overstreet
	With backpointers, alloc keys have gotten bigger, so we're needing more memory here. We're probably going to need to go with something more sophisticated than a bump allocator, but - let's see if we can avoid doing that just yet. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Add a persistent counter for bucket discards	Kent Overstreet
	Like the previous patch for bucket invalidates, add another counter for a core allocator path. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Fix btree node read retries	Kent Overstreet
	b->written wasn't being reset to 0 in the btree node read retry path, causing decrypting & validation of previously read bsets to not be re-run - ouch. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Add a persistent counter for bucket invalidation	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Call bch2_do_invalidates() when going read write	Kent Overstreet
	Like bch2_do_discards(), we should check if this needs to be done when going rw. Also, add some sysfs code for debugging bucket invalidation. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Improved human readable integer parsing	Kent Overstreet
	Printbufs recently switched to using string_get_size() for printing integers in human readable units. This updates __bch2_strtoh() to parse numbers printed by string_get_size() - we now have to handle floating point numbers, and new unit suffixes. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
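As a rough illustration of parsing a number with a fractional part and a unit suffix using integer arithmetic only, here is a user-space sketch. It is not __bch2_strtoh(): the function name `sketch_strtoh()`, the accepted suffix set, and the choice to treat every suffix as a power of 1024 are assumptions made for the example.

```c
#include <ctype.h>
#include <stdint.h>

/*
 * Parse strings such as "512", "4k", "1.5M" or "2.25 GiB" into bytes.
 * Suffixes are treated as powers of 1024 (an assumption of this sketch).
 * Returns 0 on success, -1 on malformed input. No overflow checking.
 */
static int sketch_strtoh(const char *s, uint64_t *res)
{
	uint64_t whole = 0, frac = 0, frac_div = 1, mult = 1;

	if (!isdigit((unsigned char) *s))
		return -1;
	while (isdigit((unsigned char) *s))
		whole = whole * 10 + (uint64_t) (*s++ - '0');

	if (*s == '.') {
		s++;
		if (!isdigit((unsigned char) *s))
			return -1;
		while (isdigit((unsigned char) *s)) {
			frac = frac * 10 + (uint64_t) (*s++ - '0');
			frac_div *= 10;
		}
	}

	while (*s == ' ')
		s++;

	switch (*s) {
	case 'k': case 'K': mult = 1ULL << 10; s++; break;
	case 'M':           mult = 1ULL << 20; s++; break;
	case 'G':           mult = 1ULL << 30; s++; break;
	case 'T':           mult = 1ULL << 40; s++; break;
	}

	/* Accept an optional "iB" or "B" tail after a unit letter. */
	if (mult > 1 && *s == 'i')
		s++;
	if (mult > 1 && *s == 'B')
		s++;
	if (*s)
		return -1;

	/* Integer arithmetic only: scale the fractional digits separately. */
	*res = whole * mult + frac * mult / frac_div;
	return 0;
}
```

Scaling the fractional digits separately avoids floating point entirely, which matters in kernel code; the cost is that very long fractions would overflow, which this sketch does not guard against.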
2023-10-22	bcachefs: Fix freespace initialization	Kent Overstreet
	bch2_dev_freespace_init() was using __bch2_trans_do() incorrectly, and calling bch2_bucket_do_index() with a stale alloc key. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Printbuf rework	Kent Overstreet
	This converts bcachefs to the modern printbuf interface/implementation, synced with the version to be submitted upstream. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: Fix btree node read error path	Kent Overstreet
	We were forgetting to clear the read_in_flight flag - oops. This also fixes it to not call bch2_fatal_error() before topology repair has had a chance to do its thing. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Fix btree_and_journal_iter	Kent Overstreet
	We had a bug where btree_and_journal_iter would return the same key twice - after deleting it (perhaps because it was present in both the btree and the journal?) This reworks btree_and_journal_iter to track the current position, much like btree_paths, which makes the logic considerably simpler and more robust. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Fix for cmd_list_journal	Kent Overstreet
	cmd_list_journal wasn't correctly listing the most recent journal entries as blacklisted - because in the recovery path when just reading the journal, we were failing to add those to the blacklist table. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Also log overwrites in journal	Kent Overstreet
	Lately we've been doing a lot of debugging by looking at the journal to see what was changed, and by what code path. This patch adds a new journal entry type for recording overwrites, so that we don't have to search backwards through the journal to see what was being overwritten in order to work out what the triggers were supposed to be doing. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Refactor journal entry adding	Kent Overstreet
	This takes copying the payload out of bch2_journal_add_entry(), which means we can use it for journal_transaction_name() - also prep work for journalling overwrites. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Add some missing error messages	Kent Overstreet
	bch2_opt_parse() was failing to generate error messages in error path. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Fix memory corruption in encryption path	Kent Overstreet
	When do_encrypt() was passed a vmalloc address and the buffer spanned more than a single page, we were encrypting/decrypting completely different pages than the ones intended. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: bch2_trans_reset_updates()	Kent Overstreet
	Factor out a new helper. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Fix error checking in bch2_fs_alloc()	Kent Overstreet
	One of the init calls had a ; instead of a ?:, and errors after that got dropped - oops. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
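The class of bug is easy to reproduce in a few lines. The sketch below uses the GNU C conditional with an omitted middle operand (`a ?: b`, supported by GCC and Clang), where the chain evaluates to the first nonzero result. The init functions and both `sketch_alloc_*()` helpers are hypothetical; the real bch2_fs_alloc() is not reproduced here.

```c
/* Hypothetical init steps: 0 on success, negative errno-style code on failure. */
static int sketch_calls;
static int init_ok(void)   { sketch_calls++; return 0; }
static int init_fail(void) { sketch_calls++; return -12; /* stand-in for -ENOMEM */ }

/* Correct: the chain stops at, and returns, the first nonzero result. */
static int sketch_alloc_good(void)
{
	return init_ok() ?:
	       init_fail() ?:
	       init_ok();
}

/*
 * Buggy: a stray ';' ends the chain after the first call, so the
 * remaining calls still run but their error is discarded.
 * (The void cast only silences compiler warnings in this sketch.)
 */
static int sketch_alloc_buggy(void)
{
	int ret = init_ok();
	(void) (init_fail() ?:
		init_ok());
	return ret;
}
```

Calling `sketch_alloc_good()` reports the failure from the second step, while `sketch_alloc_buggy()` returns 0 even though the same step failed.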
2023-10-22	bcachefs: Print message on btree node read retry success	Kent Overstreet
	Right now, we print an error message on btree node read error, and we print that we're retrying, but we don't explicitly say if the retry succeeded - this makes things a little clearer. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Fix journal_keys_search() overhead	Kent Overstreet
	Previously, on every btree_iter_peek() operation we were searching the journal keys, doing a full binary search - which was slow. This patch fixes that by saving our position in the journal keys, so that we only do a full binary search when moving our position backwards or a large jump forwards. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
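The general technique, caching the last search position in a sorted array and falling back to a binary search only when moving backwards or jumping far ahead, can be sketched as follows. This is a user-space illustration over plain integers, not the journal_keys code; `struct sketch_keys`, `sketch_search()` and the `SKETCH_MAX_STEPS` threshold are hypothetical.

```c
#include <stddef.h>

struct sketch_keys {
	const int *k;		/* keys, sorted ascending */
	size_t     nr;
	size_t     idx;		/* cached: first index with k[idx] >= last_pos */
	int        last_pos;
	int        valid;	/* nonzero once idx/last_pos are populated */
	unsigned   full_searches; /* instrumentation for the example */
};

/* Classic lower bound: first index whose key is >= pos. */
static size_t sketch_bsearch(const struct sketch_keys *j, int pos)
{
	size_t l = 0, r = j->nr;

	while (l < r) {
		size_t m = l + (r - l) / 2;

		if (j->k[m] < pos)
			l = m + 1;
		else
			r = m;
	}
	return l;
}

#define SKETCH_MAX_STEPS 8	/* hypothetical threshold for "a large jump" */

static size_t sketch_search(struct sketch_keys *j, int pos)
{
	if (j->valid && pos >= j->last_pos) {
		size_t i = j->idx;
		unsigned steps = 0;

		/* Moving forwards: walk a few entries from the cached spot. */
		while (i < j->nr && j->k[i] < pos && steps < SKETCH_MAX_STEPS) {
			i++;
			steps++;
		}
		if (i == j->nr || j->k[i] >= pos) {
			j->idx = i;
			j->last_pos = pos;
			return i;
		}
	}

	/* First use, moving backwards, or a large forward jump. */
	j->full_searches++;
	j->idx = sketch_bsearch(j, pos);
	j->last_pos = pos;
	j->valid = 1;
	return j->idx;
}
```

For an iterator that mostly advances in key order, nearly every lookup takes the short linear path, so the logarithmic search cost is paid only on the rare reposition.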
2023-10-22	bcachefs: Always print when doing journal replay in fsck	Kent Overstreet
	This logging improvement helps see when the previous fsck pass has completed. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Rename group to label for remaining strings.	Daniel Hill
	Signed-off-by: Daniel Hill <daniel@gluo.nz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: Fix encryption path on arm	Kent Overstreet
	flush_dcache_page() is not a noop on arm, but we were using virt_to_page() instead of vmalloc_to_page() for an address on the kernel stack - vmalloc memory, leading to an oops in flush_dcache_page(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Switch to key_type_user, not logon	Kent Overstreet
	The only difference between key_type_logon and key_type_user is that key_type_logon keys can't be read by userspace. However, userspace has actually been adding keys to both the logon and user keychains, because userspace fsck requires the keychain interface - so we might as well just use user and drop the logon keychain. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: LRU repair tweaks	Kent Overstreet
	- Drop old unneeded parameter for whether we're in initial GC - which was from when btree updates had to be done differently before we went RW. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Delete bch_writepage	Kent Overstreet
	Per Dave Chinner and the xfs folks, .writepage is no longer needed, and it's better not to define it if .writepages is the intended path. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Make bch_option compatible with Rust ffi	Brett Holman
	Rust FFI lacks support for unnamed structs and unions. The space saved in bch_option is not enough to be significant. Signed-off-by: Brett Holman <bholman.devel@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22	bcachefs: Put btree_trans_verify_sorted() behind debug_check_iterators	Kent Overstreet
	This is pretty expensive, and we've tested sufficiently with it now that it doesn't need to be on by default. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22	bcachefs: Fix extent merging	Kent Overstreet
	When merging extents, we have to check that we won't overflow size fields in any CRC entries - but the check for this was wrong, because in the loop it was in we weren't keeping a pointer to the (packed, encoded) CRC field. Fix this by moving it to its own loop. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
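A simplified picture of "check every CRC entry in its own loop before merging" might look like the sketch below. The field width, `struct sketch_crc` and `sketch_can_merge()` are hypothetical, and real bcachefs CRC entries are packed and carry more than one size field.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_CRC_SIZE_BITS 7	/* hypothetical field width */
#define SKETCH_CRC_SIZE_MAX  ((1U << SKETCH_CRC_SIZE_BITS) - 1)

/* Hypothetical, unpacked stand-in for one CRC entry. */
struct sketch_crc {
	uint32_t uncompressed_size;	/* in sectors */
};

/*
 * Check every CRC entry in a dedicated loop before merging anything:
 * the merged size of each pair must still fit in the size field.
 */
static bool sketch_can_merge(const struct sketch_crc *l,
			     const struct sketch_crc *r, unsigned nr)
{
	for (unsigned i = 0; i < nr; i++)
		if ((uint64_t) l[i].uncompressed_size +
		    r[i].uncompressed_size > SKETCH_CRC_SIZE_MAX)
			return false;
	return true;
}
```

Keeping the check in a separate pass means the merge either applies to all entries or to none, and the check always sees the entry it is meant to validate.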