summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-10-22bcachefs: mount: fix null deref with null devnameDan Robertson
- Fix null deref on mount when given a null device name. - Move the dev_name checks to return EINVAL when it is invalid. Signed-off-by: Dan Robertson <dan@dlrobertson.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix null ptr deref when splitting compressed extentsKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Fix overflow in journal_replay_entry_earlyKent Overstreet
If filesystem on disk was used by a version with a larger BCH_DATA_NR thas the currently running version, we don't want this to cause a buffer overrun. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Always zero memory from bch2_trans_kmalloc()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Merging for indirect extentsKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Improved extent mergingKent Overstreet
Previously, checksummed extents could only be merged when the checksum covered only the currently live data. xfstest generic/064 creates a test file, then uses finsert calls to split the extent, then collapse calls to see if they get merged. But without any reads to trigger the narrow_crcs path, each of the split extents will still have a checksum for the entire original extent. This patch improves the extent merge path so that if either of the extents we're attempting to merge has a checksum that covers the entire merged extent, we just use that checksum. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Re-implement extent merging in transaction commit pathKent Overstreet
We haven't had extent merging in quite some time. It used to be done by the btree code when sorting btree nodes, but that was eliminated as part of the work to separate extent handling from core btree code. This patch re-implements extent merging in the transaction commit path. We don't currently have the ability to merge reflink pointers, we need to do some work on the triggers code to be able to do that without ending up with incorrect refcounts. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Refactor extent_handle_overwrites()Kent Overstreet
Prep work for extent merging Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Clean up key mergingKent Overstreet
This patch simplifies the key merging code by getting rid of partial merges - it's simpler and saner if we just don't merge extents when they'd overflow k->size. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Kill trans->updates2Kent Overstreet
Now that extent handling has been lifted to bch2_trans_update(), we don't need to keep two different lists of updates. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Simplify reflink triggerKent Overstreet
Now that we only mark entire extents, we can ditch the "reflink_p_frag_references" code. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Move extent_handle_overwrites() to bch2_trans_update()Kent Overstreet
This lifts handling of overlapping extents out of __bch2_trans_commit() and moves it to where we first do the update - which means that BTREE_ITER_WITH_UPDATES can now work correctly in extents mode. Also, this patch reworks how extent triggers work: previously, on partial extent overwrite we would pass this information to the trigger, telling it what part of the extent was being overwritten. But, this approach has had too many subtle corner cases - now, we only mark whole extents, meaning on partial extent overwrite we unmark the old extent and mark the new extent. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: bch2_btree_iter_peek_slot() now saves initial position when searchingKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Kill __bch2_btree_iter_peek_slot_extents()Kent Overstreet
This codepath won't just be for extents in the future, it'll also be for BTREE_ITER_FILTER_SNAPSHOTS mode. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: bch2_btree_iter_peek_slot() now supports BTREE_ITER_WITH_UPDATESKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: BTREE_ITER_WITH_UPDATESKent Overstreet
This drops bch2_btree_iter_peek_with_updates() and replaces it with a new flag, BTREE_ITER_WITH_UPDATES, and also reworks bch2_btree_iter_peek_slot() to respect it too. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Child btree iteratorsKent Overstreet
This adds the ability for btree iterators to own child iterators - to be used by an upcoming rework of bch2_btree_iter_peek_slot(), so we can scan forwards while maintaining our current position. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Drop all btree locks when submitting btree node readsKent Overstreet
As a rule we don't want to be holding btree locks while submitting IO - this will improve overall filesystem latency. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: More topology repair codeKent Overstreet
This improves the handling of overlapping btree nodes; now, we handle the case where one btree node completely overwrites another. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Fix a buffer overrunKent Overstreet
In make_extent_indirect(), we were allocating too small of a buffer for the new indirect extent. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Don't mark superblocks past end of usable spaceKent Overstreet
bcachefs-tools recently started putting a backup superblock at the end of the device. This causes a problem if the bucket size doesn't divide the device size - but we can fix it by just skipping marking that part. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Fix a spurious debug mode assertionKent Overstreet
When we switched to using bch2_btree_bset_insert_key() for extents it turned out it started leaving invalid keys around - of type deleted but nonzero size - but this is fine (if ugly) because they're never written out. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Fix unitialized use of a valueBrett Holman
Signed-off-by: Brett Holman <bpholman5@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: do not compile acl mod on minimal configDan Robertson
Do not compile the acl.o target if BCACHEFS_POSIX_ACL is not enabled. Signed-off-by: Dan Robertson <dan@dlrobertson.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: btree_iter->should_be_lockedKent Overstreet
Add a field to struct btree_iter for tracking whether it should be locked - this fixes spurious transaction restarts in bch2_trans_relock(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Improve btree iterator tracepointsKent Overstreet
This patch adds some new tracepoints to the btree iterator code, and adds new fields to the existing tracepoints - primarily for the iterator position. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Preallocate transaction memKent Overstreet
This helps avoid transaction restarts. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Check for errors from bch2_trans_update()Kent Overstreet
Upcoming refactoring is going to change bch2_trans_update() to start returning transaction restarts. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs; Check for allocator thread shutdownKent Overstreet
We were missing a kthread_should_stop() check in the loop in bch2_invalidate_buckets(), very occasionally leading to us getting stuck while shutting down. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Journal space calculation fixKent Overstreet
When devices have different bucket sizes, we may accumulate a journal write that doesn't fit on some of our devices - previously, we'd underflow when calculating space on that device and then everything would get weird. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Don't fragment extents when making them indirectKent Overstreet
This fixes a "disk usage increased without a reservation" bug, when reflinking compressed extents. Also, there's no good reason for reflink to be fragmenting extents anyways. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Fsck for reflink refcountsKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Assorted endianness fixesKent Overstreet
Found by sparse Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Fix a deadlockKent Overstreet
Waiting on a btree node write with btree locks held can deadlock, if the write errors: the write error path has to do do a btree update to drop the pointer to the replica that errored. The interior update path has to wait on in flight btree writes before freeing nodes on disk. Previously, this was done in bch2_btree_interior_update_will_free_node(), and could deadlock; now, we just stash a pointer to the node and do it in btree_update_nodes_written(), just prior to the transactional part of the update. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Split out btree_error_wqKent Overstreet
We can't use btree_update_wq becuase btree updates may be waiting on btree writes to complete. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix pathalogical behaviour with inode sharding by cpu IDKent Overstreet
If the transactior restarts on a different CPU, it could end up needing to read in a different btree node, which makes another transaction restart more likely... Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Fix journal write error pathKent Overstreet
Journal write errors were racing with the submission path - potentially causing writes to other replicas to not get submitted. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Reflink refcount fixKent Overstreet
__bch2_trans_mark_reflink_p wasn't always correctly returning the number of sectors processed - the new logic is a bit more straightforward overall too. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Add an option to control sharding new inode numbersKent Overstreet
We're seeing a bug where inode creates end up spinning in bch2_inode_create - disabling sharding will simplify what we're testing. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Don't use bch_write_op->cl for delivering completionsKent Overstreet
We already had op->end_io as an alternative mechanism to op->cl.parent for delivering write completions; this switches all code paths to using op->end_io. Two reasons: - op->end_io is more efficient, due to fewer atomic ops, this completes the conversion that was originally only done for the direct IO path. - We'll be restructing the write path to use a different mechanism for punting to process context, refactoring to not use op->cl will make that easier. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Kill bch_write_op.index_update_fnKent Overstreet
This deletes bch_write_op.index_update_fn: indirect function calls have gotten considerably more expensive post spectre/meltdown, and we only have two different index_update_fns - this patch adds a flag to specify which one to use (normal vs. data move path). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Inline fastpath of bch2_disk_reservation_add()Kent Overstreet
The fastpath now doesn't even disable preemption - instead we use a (non locked) cmpxchg. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Don't use uuid in tracepointsKent Overstreet
%pU for printing out pointers to uuids doesn't work in perf trace Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Add a tracepoint for copygc waitingKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Add a cond_resched call to the copygc main loopKent Overstreet
We seem to have a bug where the copygc thread ends up spinning and making the system unusable - this will at least prevent it from locking up the machine, and it's a good thing to have anyways. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Fix a null ptr derefKent Overstreet
bch2_btree_iter_peek() won't always return a key - whoops. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Fix an issue with inconsistent btree writes after unclean shutdownKent Overstreet
After unclean shutdown, btree writes may have completed on one device and not others - and this inconsistency could lead us to writing new bsets with a gap in our btree node in one of our replicas. Fortunately, this is only an issue with bsets that are newer than the most recent journal flush, and we already have a mechanism for detecting and blacklisting those. We just need to make sure to start new btree writes after the most recent _non_ blacklisted bset. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Improve FS_IOC_GOINGDOWN ioctlKent Overstreet
We weren't interpreting the flags argument at all. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Add a workqueue for btree io completionsKent Overstreet
Also, clean up workqueue usage - we shouldn't be using system workqueues, pretty much everything we do needs to be on our own WQ_MEM_RECLAIM workqueues. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: rewrote prefetch asm in gas syntax for clang compatibilityBrett Holman
Signed-off-by: Brett Holman <bpholman5@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>