Age  Commit message  Author
2023-10-22  bcachefs: Make sure bch2_trans_mark_update uses correct iter flags  Kent Overstreet
Now that bch2_btree_iter_peek_with_updates() has been removed in favor of BTREE_ITER_WITH_UPDATES, we need to make sure it's not used where we don't want it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22  bcachefs: Fix a memory leak in dio write path  Kent Overstreet
Commit c42bca92be928ce7dece5fc04cf68d0e37ee6718 "bio: don't copy bvec for direct IO" changed bio_iov_iter_get_pages() to point bio->bi_iovec at the incoming biovec, meaning if we already allocated one, it'll be leaked. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22  bcachefs: fix a possible bcachefs checksum mapping error opt-checksum enum to type-checksum enum  Janpieter Sollie
This fixes some rare cases where the metadata checksum option specified may map to the wrong actual checksum type. Signed-off-by: Janpieter Sollie <janpieter.sollie@edpnet.be> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Clear iter->should_be_locked in bch2_trans_reset  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22  bcachefs: Don't underflow c->sectors_available  Kent Overstreet
This rarely used error path should've been checking for underflow - oops. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22  bcachefs: Kill bch2_btree_iter_peek_cached()  Kent Overstreet
It's now been rolled into bch2_btree_iter_peek_slot() Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22  bcachefs: Allow shorter JSET_ENTRY_dev_usage entries  Kent Overstreet
If the last entry(ies) would be all zeros, there's no need to write them out - the read path already handles that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
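The idea of dropping trailing all-zero entries can be sketched as follows. This is a minimal illustration only, not the bcachefs journal code: `trimmed_len()` and `read_padded()` are hypothetical helpers, and the counters are modelled as a plain array of 64-bit values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write side: how many leading counters are worth storing.
 * Trailing all-zero counters are dropped. */
static size_t trimmed_len(const uint64_t *v, size_t n)
{
	while (n > 0 && v[n - 1] == 0)
		n--;
	return n;
}

/* Read side: copy the stored counters, then zero-fill the rest.
 * This sketch assumes the stored entry is never longer than dst. */
static void read_padded(uint64_t *dst, size_t dst_n,
			const uint64_t *src, size_t src_n)
{
	assert(src_n <= dst_n);
	memset(dst, 0, dst_n * sizeof(*dst));
	if (src_n)
		memcpy(dst, src, src_n * sizeof(*dst));
}
```

Because the reader zero-fills whatever was not stored, a trimmed entry reads back identically to the full one.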
2023-10-22  bcachefs: mount: fix null deref with null devname  Dan Robertson
- Fix null deref on mount when given a null device name.
- Move the dev_name checks to return EINVAL when it is invalid.
Signed-off-by: Dan Robertson <dan@dlrobertson.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Fix null ptr deref when splitting compressed extents  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22  bcachefs: Fix overflow in journal_replay_entry_early  Kent Overstreet
If the filesystem on disk was used by a version with a larger BCH_DATA_NR than the currently running version, we don't want this to cause a buffer overrun. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
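A minimal sketch of the kind of bounds check this describes, assuming the on-disk entry carries its own counter count. `DATA_NR` and `load_counters()` are hypothetical stand-ins, not the actual journal replay code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DATA_NR 4	/* hypothetical stand-in for this build's BCH_DATA_NR */

/* Copy at most DATA_NR counters from an on-disk entry that claims src_nr of
 * them; returns how many were copied. Extra on-disk counters are ignored. */
static unsigned load_counters(uint64_t dst[DATA_NR],
			      const uint64_t *src, unsigned src_nr)
{
	unsigned i, n = src_nr < DATA_NR ? src_nr : DATA_NR;

	assert(dst != NULL && src != NULL);
	for (i = 0; i < n; i++)
		dst[i] = src[i];
	return n;
}
```

Clamping to the in-memory array size means an entry written by a newer version cannot write past the end of the destination.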
2023-10-22  bcachefs: Always zero memory from bch2_trans_kmalloc()  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22  bcachefs: Merging for indirect extents  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22  bcachefs: Improved extent merging  Kent Overstreet
Previously, checksummed extents could only be merged when the checksum covered only the currently live data. xfstest generic/064 creates a test file, then uses finsert calls to split the extent, then collapse calls to see if they get merged. But without any reads to trigger the narrow_crcs path, each of the split extents will still have a checksum for the entire original extent. This patch improves the extent merge path so that if either of the extents we're attempting to merge has a checksum that covers the entire merged extent, we just use that checksum. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
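The coverage test described above can be sketched roughly as follows. This is an illustrative model, not the bcachefs implementation: `struct crc_span` and its fields are hypothetical, and the sketch assumes both extents reference the same checksummed on-disk region, with the right extent's live data directly following the left's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one extent's checksum metadata. */
struct crc_span {
	uint32_t csum_sectors;	/* sectors the checksum covers */
	uint32_t offset;	/* start of live data within that region */
	uint32_t live;		/* number of live sectors */
};

/* Does l's checksum also cover the sectors r would add after it? */
static bool left_covers(const struct crc_span *l, const struct crc_span *r)
{
	return (uint64_t)l->offset + l->live + r->live <= l->csum_sectors;
}

/* Does r's checksum also cover the sectors l would add before it? */
static bool right_covers(const struct crc_span *l, const struct crc_span *r)
{
	return r->offset >= l->live;
}

/* Merge is possible if either checksum spans the whole merged extent. */
static bool can_merge_checksummed(const struct crc_span *l,
				  const struct crc_span *r)
{
	assert(l != NULL && r != NULL);
	return left_covers(l, r) || right_covers(l, r);
}
```

For a 16-sector extent split in half without any narrowing, both halves still carry the 16-sector checksum, so either one can serve the merged extent.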
2023-10-22  bcachefs: Re-implement extent merging in transaction commit path  Kent Overstreet
We haven't had extent merging in quite some time. It used to be done by the btree code when sorting btree nodes, but that was eliminated as part of the work to separate extent handling from core btree code. This patch re-implements extent merging in the transaction commit path. We don't currently have the ability to merge reflink pointers, we need to do some work on the triggers code to be able to do that without ending up with incorrect refcounts. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22  bcachefs: Refactor extent_handle_overwrites()  Kent Overstreet
Prep work for extent merging Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Clean up key merging  Kent Overstreet
This patch simplifies the key merging code by getting rid of partial merges - it's simpler and saner if we just don't merge extents when they'd overflow k->size. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
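An all-or-nothing merge of this kind might look like the sketch below. It assumes a 32-bit size field; `merge_sizes()` is a hypothetical helper and not the function the patch touches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Grow the left key by r_size sectors, or refuse if the sum would not fit
 * in the (assumed 32-bit) size field. Nothing is merged partially. */
static bool merge_sizes(uint32_t *l_size, uint32_t r_size)
{
	assert(l_size != NULL);
	if (r_size > UINT32_MAX - *l_size)
		return false;
	*l_size += r_size;
	return true;
}
```

On refusal the left size is left untouched, so the caller simply keeps the two keys separate.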
2023-10-22  bcachefs: Kill trans->updates2  Kent Overstreet
Now that extent handling has been lifted to bch2_trans_update(), we don't need to keep two different lists of updates. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22  bcachefs: Simplify reflink trigger  Kent Overstreet
Now that we only mark entire extents, we can ditch the "reflink_p_frag_references" code. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22  bcachefs: Move extent_handle_overwrites() to bch2_trans_update()  Kent Overstreet
This lifts handling of overlapping extents out of __bch2_trans_commit() and moves it to where we first do the update - which means that BTREE_ITER_WITH_UPDATES can now work correctly in extents mode. Also, this patch reworks how extent triggers work: previously, on partial extent overwrite we would pass this information to the trigger, telling it what part of the extent was being overwritten. But, this approach has had too many subtle corner cases - now, we only mark whole extents, meaning on partial extent overwrite we unmark the old extent and mark the new extent. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
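The "only mark whole extents" rule can be illustrated with a toy model. Everything here is hypothetical (the `struct extent` type, the `mark()`/`unmark()` helpers, and the single `sectors_marked` counter standing in for trigger accounting); it shows only the shape of the approach, not the bcachefs trigger code.

```c
#include <assert.h>
#include <stdint.h>

struct extent { uint64_t start, end; };	/* half-open range of sectors */

static int64_t sectors_marked;		/* toy stand-in for trigger accounting */

static void mark(const struct extent *e)
{
	sectors_marked += (int64_t)(e->end - e->start);
}

static void unmark(const struct extent *e)
{
	sectors_marked -= (int64_t)(e->end - e->start);
}

/* Overwrite part of `old` with `ins`: unmark the whole old extent, then mark
 * each surviving piece as a new whole extent. Returns the number of surviving
 * pieces stored in out[]. The caller marks `ins` itself. */
static unsigned overwrite(const struct extent *old, const struct extent *ins,
			  struct extent out[2])
{
	unsigned n = 0;

	/* This sketch only handles ranges that actually overlap. */
	assert(ins->start < old->end && old->start < ins->end);

	unmark(old);
	if (old->start < ins->start) {
		out[n].start = old->start;
		out[n].end = ins->start;
		mark(&out[n++]);
	}
	if (ins->end < old->end) {
		out[n].start = ins->end;
		out[n].end = old->end;
		mark(&out[n++]);
	}
	return n;
}
```

The trigger never needs to know which part of an extent was overwritten: it only ever sees complete extents being removed or added.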
2023-10-22  bcachefs: bch2_btree_iter_peek_slot() now saves initial position when searching  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Kill __bch2_btree_iter_peek_slot_extents()  Kent Overstreet
This codepath won't just be for extents in the future, it'll also be for BTREE_ITER_FILTER_SNAPSHOTS mode. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: bch2_btree_iter_peek_slot() now supports BTREE_ITER_WITH_UPDATES  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: BTREE_ITER_WITH_UPDATES  Kent Overstreet
This drops bch2_btree_iter_peek_with_updates() and replaces it with a new flag, BTREE_ITER_WITH_UPDATES, and also reworks bch2_btree_iter_peek_slot() to respect it too. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Child btree iterators  Kent Overstreet
This adds the ability for btree iterators to own child iterators - to be used by an upcoming rework of bch2_btree_iter_peek_slot(), so we can scan forwards while maintaining our current position. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22  bcachefs: Drop all btree locks when submitting btree node reads  Kent Overstreet
As a rule we don't want to be holding btree locks while submitting IO - this will improve overall filesystem latency. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22  bcachefs: More topology repair code  Kent Overstreet
This improves the handling of overlapping btree nodes; now, we handle the case where one btree node completely overwrites another. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22  bcachefs: Fix a buffer overrun  Kent Overstreet
In make_extent_indirect(), we were allocating too small of a buffer for the new indirect extent. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22  bcachefs: Don't mark superblocks past end of usable space  Kent Overstreet
bcachefs-tools recently started putting a backup superblock at the end of the device. This causes a problem if the bucket size doesn't divide the device size - but we can fix it by just skipping marking that part. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
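The arithmetic behind skipping the unusable tail can be sketched like this. It assumes usable space is the device size rounded down to a whole number of buckets; `markable_sectors()` is a hypothetical helper, not the function changed by the patch.

```c
#include <assert.h>
#include <stdint.h>

/* How many sectors of the range [start, end) lie inside whole buckets.
 * Anything past the last complete bucket is skipped rather than marked. */
static uint64_t markable_sectors(uint64_t start, uint64_t end,
				 uint64_t dev_sectors, uint64_t bucket_sectors)
{
	uint64_t usable_end;

	assert(bucket_sectors != 0 && start <= end);
	usable_end = dev_sectors / bucket_sectors * bucket_sectors;
	if (start >= usable_end)
		return 0;
	if (end > usable_end)
		end = usable_end;
	return end - start;
}
```

With a 1000-sector device and 128-sector buckets, usable space ends at sector 896, so a backup superblock at sectors 992 to 1000 contributes nothing to mark.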
2023-10-22  bcachefs: Fix a spurious debug mode assertion  Kent Overstreet
When we switched to using bch2_btree_bset_insert_key() for extents it turned out it started leaving invalid keys around - of type deleted but nonzero size - but this is fine (if ugly) because they're never written out. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22  bcachefs: Fix unitialized use of a value  Brett Holman
Signed-off-by: Brett Holman <bpholman5@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: do not compile acl mod on minimal config  Dan Robertson
Do not compile the acl.o target if BCACHEFS_POSIX_ACL is not enabled. Signed-off-by: Dan Robertson <dan@dlrobertson.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: btree_iter->should_be_locked  Kent Overstreet
Add a field to struct btree_iter for tracking whether it should be locked - this fixes spurious transaction restarts in bch2_trans_relock(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22  bcachefs: Improve btree iterator tracepoints  Kent Overstreet
This patch adds some new tracepoints to the btree iterator code, and adds new fields to the existing tracepoints - primarily for the iterator position. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22  bcachefs: Preallocate transaction mem  Kent Overstreet
This helps avoid transaction restarts. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22  bcachefs: Check for errors from bch2_trans_update()  Kent Overstreet
Upcoming refactoring is going to change bch2_trans_update() to start returning transaction restarts. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22  bcachefs; Check for allocator thread shutdown  Kent Overstreet
We were missing a kthread_should_stop() check in the loop in bch2_invalidate_buckets(), very occasionally leading to us getting stuck while shutting down. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22  bcachefs: Journal space calculation fix  Kent Overstreet
When devices have different bucket sizes, we may accumulate a journal write that doesn't fit on some of our devices - previously, we'd underflow when calculating space on that device and then everything would get weird. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22  bcachefs: Don't fragment extents when making them indirect  Kent Overstreet
This fixes a "disk usage increased without a reservation" bug, when reflinking compressed extents. Also, there's no good reason for reflink to be fragmenting extents anyways. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22  bcachefs: Fsck for reflink refcounts  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22  bcachefs: Assorted endianness fixes  Kent Overstreet
Found by sparse Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22  bcachefs: Fix a deadlock  Kent Overstreet
Waiting on a btree node write with btree locks held can deadlock, if the write errors: the write error path has to do a btree update to drop the pointer to the replica that errored. The interior update path has to wait on in flight btree writes before freeing nodes on disk. Previously, this was done in bch2_btree_interior_update_will_free_node(), and could deadlock; now, we just stash a pointer to the node and do it in btree_update_nodes_written(), just prior to the transactional part of the update. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Split out btree_error_wq  Kent Overstreet
We can't use btree_update_wq because btree updates may be waiting on btree writes to complete. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Fix pathalogical behaviour with inode sharding by cpu ID  Kent Overstreet
If the transaction restarts on a different CPU, it could end up needing to read in a different btree node, which makes another transaction restart more likely... Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22  bcachefs: Fix journal write error path  Kent Overstreet
Journal write errors were racing with the submission path - potentially causing writes to other replicas to not get submitted. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22  bcachefs: Reflink refcount fix  Kent Overstreet
__bch2_trans_mark_reflink_p wasn't always correctly returning the number of sectors processed - the new logic is a bit more straightforward overall too. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22  bcachefs: Add an option to control sharding new inode numbers  Kent Overstreet
We're seeing a bug where inode creates end up spinning in bch2_inode_create - disabling sharding will simplify what we're testing. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22  bcachefs: Don't use bch_write_op->cl for delivering completions  Kent Overstreet
We already had op->end_io as an alternative mechanism to op->cl.parent for delivering write completions; this switches all code paths to using op->end_io. Two reasons:
- op->end_io is more efficient, due to fewer atomic ops; this completes the conversion that was originally only done for the direct IO path.
- We'll be restructuring the write path to use a different mechanism for punting to process context; refactoring to not use op->cl will make that easier.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Kill bch_write_op.index_update_fn  Kent Overstreet
This deletes bch_write_op.index_update_fn: indirect function calls have gotten considerably more expensive post spectre/meltdown, and we only have two different index_update_fns - this patch adds a flag to specify which one to use (normal vs. data move path). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
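Replacing a two-way function pointer with a flag might look roughly like the sketch below. The names (`WRITE_DATA_MOVE`, `struct write_op`, and the two update functions) are hypothetical and chosen for illustration; they are not the identifiers used in the patch.

```c
#include <assert.h>
#include <stddef.h>

#define WRITE_DATA_MOVE (1U << 0)	/* hypothetical flag name */

struct write_op {
	unsigned flags;
	unsigned normal_updates;	/* counters so the sketch is observable */
	unsigned move_updates;
};

static void index_update_normal(struct write_op *op) { op->normal_updates++; }
static void index_update_move(struct write_op *op)   { op->move_updates++; }

/* A direct, predictable branch replaces the indirect call through a
 * per-op function pointer. */
static void index_update(struct write_op *op)
{
	assert(op != NULL);
	if (op->flags & WRITE_DATA_MOVE)
		index_update_move(op);
	else
		index_update_normal(op);
}
```

With only two possible targets, the branch costs a single bit in the flags word, and the compiler sees both callees directly.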
2023-10-22  bcachefs: Inline fastpath of bch2_disk_reservation_add()  Kent Overstreet
The fastpath now doesn't even disable preemption - instead we use a (non locked) cmpxchg. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
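The retry-loop shape of a cmpxchg-based fast path can be sketched in portable C. Note the differences from what the commit describes: portable C has no non-locked cmpxchg, so this sketch uses a C11 atomic compare-and-swap on a single shared counter, and `reserve_fast()` is a hypothetical name. It shows only the control flow, not the kernel code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Try to take `want` sectors from the counter without any lock.
 * Returns false when too few remain, so the caller can fall back to a
 * slow path (not shown). */
static bool reserve_fast(_Atomic uint64_t *available, uint64_t want)
{
	uint64_t old;

	assert(available != NULL);
	old = atomic_load_explicit(available, memory_order_relaxed);
	do {
		if (old < want)
			return false;
		/* On failure, `old` is reloaded with the current value. */
	} while (!atomic_compare_exchange_weak_explicit(available, &old,
							old - want,
							memory_order_relaxed,
							memory_order_relaxed));
	return true;
}
```

The counter is only modified when the compare-and-swap succeeds, so a failed or refused reservation leaves it unchanged.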
2023-10-22  bcachefs: Don't use uuid in tracepoints  Kent Overstreet
%pU for printing out pointers to uuids doesn't work in perf trace Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>