summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-10-22bcachefs: Add a tracepoint for copygc waitingKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Add a cond_resched call to the copygc main loopKent Overstreet
We seem to have a bug where the copygc thread ends up spinning and making the system unusable - this will at least prevent it from locking up the machine, and it's a good thing to have anyways. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Fix a null ptr derefKent Overstreet
bch2_btree_iter_peek() won't always return a key - whoops. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Fix an issue with inconsistent btree writes after unclean shutdownKent Overstreet
After unclean shutdown, btree writes may have completed on one device and not others - and this inconsistency could lead us to writing new bsets with a gap in our btree node in one of our replicas. Fortunately, this is only an issue with bsets that are newer than the most recent journal flush, and we already have a mechanism for detecting and blacklisting those. We just need to make sure to start new btree writes after the most recent _non_ blacklisted bset. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Improve FS_IOC_GOINGDOWN ioctlKent Overstreet
We weren't interpreting the flags argument at all. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Add a workqueue for btree io completionsKent Overstreet
Also, clean up workqueue usage - we shouldn't be using system workqueues, pretty much everything we do needs to be on our own WQ_MEM_RECLAIM workqueues. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: rewrote prefetch asm in gas syntax for clang compatibilityBrett Holman
Signed-off-by: Brett Holman <bpholman5@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Add a debug mode that always reads from every btree replicaKent Overstreet
There's a new module parameter, verify_all_btree_replicas, that enables reading from every btree replica when reading in btree nodes and comparing them against each other. We've been seeing some strange btree corruption - this will hopefully aid in tracking it down and catching it more often. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Don't repair btree nodes until after interior journal replay is doneKent Overstreet
We need the btree to be in a consistent state before we can rewrite btree nodes. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Fix an uninitialized varKent Overstreet
this fixes a valgrind complaint Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Fix for buffered writes getting -ENOSPCKent Overstreet
Buffered writes may have to increase their disk reservation at btree update time, due to compression and erasure coding being unpredictable: O_DIRECT writes should be checking for -ENOSPC, but buffered writes have already been accepted and should not. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Fix inode backpointers in RENAME_OVERWRITEKent Overstreet
When we delete the dirent an inode points to, we need to zero out the backpointer fields - this was missed in the RENAME_OVERWRITE case. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Make bch2_remap_range respect O_SYNCKent Overstreet
Caught by xfstest generic/628 Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Split extents if necessary in bch2_trans_update()Kent Overstreet
Currently, we handle multiple overlapping extents in the same transaction commit by doing fixups in bch2_trans_update() - this patch extents that to split updates when necessary. The next patch that changes the reflink code to not fragment extents when making them indirect will require this. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Ratelimiting for writeback IOsKent Overstreet
Writeback throttling is a kernel config option and not always enabled. When it's not enabled we need a fallback, to avoid unbounded memory pinning and work item backlogs. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: statfs resports incorrect avail blocksDan Robertson
The current implementation of bch_statfs does not scale the number of available blocks provided in f_bavail by the reserve factor. This causes an allocation of a file of this size to fail. Signed-off-by: Dan Robertson <dan@dlrobertson.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix for bch2_bkey_pack_pos() not initializing len/version fieldsKent Overstreet
This bug led to push_whiteout() generating whiteouts that failed bch2_bkey_invalid() due to nonzero length fields - oops. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Fix a memcpy callKent Overstreet
Not supposed to pass a null ptr to memcpy (even if the size is 0). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Fix bch2_extent_can_insert() callKent Overstreet
It was being skipped when hole punching, leading to problems when splitting compressed extents. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Make sure to pass a disk reservation to bch2_extent_update()Kent Overstreet
It's needed when we split an existing compressed extent - we get a null ptr deref without it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: made changes to support clang, fixed a couple bugsBrett Holman
fs/bcachefs/bset.c edited prefetch macro to add clang support fs/bcachefs/btree_iter.c bugfix: initialize iter->real_pos in bch2_btree_iter_init for later use fs/bcachefs/io.c bugfix: eliminated undefined behavior (negative bitshift) fs/bcachefs/buckets.c bugfix: invert sign to handle 64bit abs() Signed-off-by: Brett Holman <bpholman5@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix locking in __bch2_set_nr_journal_buckets()Kent Overstreet
We weren't holding mark_lock correctly - it's needed for the new_fs path. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: properly initialize used valuesDan Robertson
- Ensure the second key value in bch_hash_info is initialized to zero if the info type is of type BCH_STR_HASH_SIPHASH. - Initialize the possibly returned value in bch2_inode_create. Assuming bch2_btree_iter_peek returns bkey_s_c_null, the uninitialized value of ret could be returned to the user as an error pointer. - Fix compiler warning in initialization of bkey_s_c_stripe fs/bcachefs/buckets.c:1646:35: warning: suggest braces around initialization of subobject [-Wmissing-braces] struct bkey_s_c_stripe new_s = { NULL }; ^~~~ Signed-off-by: Dan Robertson <dan@dlrobertson.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Repair code for multiple types of data in same bucketKent Overstreet
bch2_check_fix_ptrs() is awkward, we need to find a way to improve it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix out of bounds read in fs usage ioctlDan Robertson
Fix a possible read out of bounds if bch2_ioctl_fs_usage is called when replica_entries_bytes is set to a value that is smaller than the size of bch_replicas_usage. Signed-off-by: Dan Robertson <dan@dlrobertson.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix null deref in bch2_ioctl_read_superDan Robertson
Do not attempt to cleanup the returned value of bch2_device_lookup if the returned value was an error pointer. We currently check to see if the returned value is null and run the cleanup otherwise. As a result, we attempt to run the cleanup on a error pointer. Signed-off-by: Dan Robertson <dan@dlrobertson.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix possible null deref on mountDan Robertson
Ensure that the block device pointer in a superblock handle is not null before dereferencing it in bch2_dev_to_fs. The block device pointer may be null when mounting a new bcachefs filesystem given another mounted bcachefs filesystem exists that has at least one device that is offline. Signed-off-by: Dan Robertson <dan@dlrobertson.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix error in parsing of mount optionsDan Robertson
When parsing the mount options duplicate the given options. This is required as the options are parsed twice and strsep is used in parsing. The options will be modified into a possibly invalid options set for the second round of parsing if the options are not duplicated before parsing. Signed-off-by: Dan Robertson <dan@dlrobertson.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: avoid out-of-bounds in split_devsStijn Tintel
Calling mount with an empty source string causes an out-of-bounds error in split_devs. Check the length of the source string to avoid this. Signed-off-by: Stijn Tintel <stijn@linux-ipv6.be> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Make sure to use BTREE_ITER_PREFETCH in fsckKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix bch2_btree_iter_peek_with_updates()Kent Overstreet
By not re-fetching the next update we were going into an infinite loop. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix reflink triggerKent Overstreet
The trigger for reflink pointers wasn't always incrementing/decrementing the refcounts correctly - this patch fixes that logic. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix some refcounting bugsKent Overstreet
We really need debug mode assertions that ca->ref and ca->io_ref are used correctly. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix oob write in __bch2_btree_node_writeDan Robertson
Fix a possible out of bounds write in __bch2_btree_node_write when the data buffer padding is cleared up to the block size. The out of bounds write is possible if the data buffers size is not a multiple of the block size. Signed-off-by: Dan Robertson <dan@dlrobertson.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix usage of last_seq + encryptionKent Overstreet
jset->last_seq is in the region that's encrypted - on journal write completion, we were using it and getting garbage. This patch shadows it to fix. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Clean up bch2_btree_and_journal_walk()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Mark newly allocated btree nodes as accessedKent Overstreet
This was a major oversight - this means under memory pressure we can end up reading in a btree node, then having it evicted before we get to use it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix time handlingKent Overstreet
There were some overflows in the time conversion functions - fix this by converting tv_sec and tv_nsec separately. Also, set sb->time_min and sb->time_max. Fixes xfstest generic/258. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Add a tracepoint for when we block on journal reclaimKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Make sure to initialize j->last_flushedKent Overstreet
If the journal reclaim thread makes it to the timeout without ever initializing j->last_flushed, we could end up sleeping for a very long time. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Ensure that fpunch updates inode timestampsKent Overstreet
Fixes xfstests generic/059 Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Change copygc wait amount to be min of per device waitsKent Overstreet
We're seeing a filesystem get stuck when all devices but one have no more reclaimable buckets - because the copygc wait amount is curretly filesystem wide. This patch should fix that, possibly at the expensive of running too much when only one or a few devices is full and the rebalance thread needs to move data around. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Change bch2_btree_key_cache_count() to exclude dirty keysKent Overstreet
We're seeing livelocks that appear to be due to bch2_btree_key_cache_scan repeatedly scanning and blocking other tasks from using the key cache lock - we probably shouldn't be reporting objects that can't actually be freed yet. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Call bch2_inconsistent_error() on missing stripe/indirect extentKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: New tracepoint for bch2_trans_get_iter()Kent Overstreet
Trying to debug an issue where after traverse_all() we shouldn't have to traverse any iterators... yet we are Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix __bch2_trans_get_iter()Kent Overstreet
We need to also set iter->uptodate to indicate it needs to be traversed. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Evict btree nodes we're deletingKent Overstreet
There was a bug that led to duplicate btree node pointers being inserted at the wrong level. The new topology repair code can fix that, except that the btree cache code gets confused when we read in a btree node from the pointer that was at the wrong level. This patch evicts nodes that we're deleting to, which nicely solves the problem. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: New check_nlinks algorithm for snapshotsKent Overstreet
With snapshots, using a radix tree for the table of link counts won't work anymore because we also need to distinguish between inodes with different snapshot IDs. Instead, this patch builds up a sorted array of inodes that have hardlinks that we can binary search on - taking advantage of the fact that with inode backpointers, the check_nlinks() pass _only_ needs to concern itself with inodes that have hardlinks now. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix a null ptr derefKent Overstreet
Fix a few memory safety issues, found by asan in userspace. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: New and improved topology repair codeKent Overstreet
This splits out btree topology repair into a separate pass, and makes some improvements: - When we have to pick which of two overlapping nodes to drop keys from, we use the btree node header sequence number to preserve the newer node - the gc code has been changed so that it doesn't bail out if we're continuing/ignoring on fsck error - this way the dump tool can skip running the repair pass but still walk all reachable metadata - add a new superblock flag indicating when a filesystem is known to have btree topology issues, and the topology repair pass should be run - changing the start/end of a node might mean keys in that node have to be deleted: this patch handles that better by splitting it out into a separate function and running it explicitly in the topology repair code, previously those keys were only being dropped when the btree node was read in. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>