path: root/fs
Age  Commit message  Author
2024-01-01  bcachefs: Unwritten journal buffers are always dirty  Kent Overstreet
Ensure that journal bufs that haven't been written can't be reclaimed from the journal pin fifo, and can thus have new pins taken. Prep work for changing the btree write buffer to pull keys from the journal directly. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: bch2_trans_node_add no longer uses trans_for_each_path()  Kent Overstreet
In the future we'll be making trans->paths resizable and potentially having _many_ more paths (for fsck); we need to start fixing algorithms that walk each path in a transaction where possible. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: Improve trans->extra_journal_entries  Kent Overstreet
Instead of using a darray, we now allocate journal entries for the transaction commit path with our normal bump allocator - with an inlined fastpath, and using btree_transaction_stats to remember how much to initially allocate so as to avoid transaction restarts. This is prep work for converting write buffer updates to use this mechanism. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
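The allocation strategy described above can be sketched in userspace C. This is only an illustration under stated assumptions: `struct bump`, `struct bump_stats`, and the helper names are hypothetical and are not the bcachefs API. In the kernel, a failed fastpath may force a transaction restart; here the slowpath simply grows the buffer and counts how often that happened.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-transaction-type statistic: peak bytes used so far. */
struct bump_stats {
	size_t max_mem;
};

/* Hypothetical bump allocator: one buffer, allocations only advance `used`. */
struct bump {
	char	*buf;
	size_t	 used;
	size_t	 size;
	unsigned grows;	/* slowpath growths: the events we want to avoid */
};

static int bump_init(struct bump *b, const struct bump_stats *s)
{
	/* Start from the remembered peak so the fastpath usually suffices. */
	b->size  = s->max_mem ? s->max_mem : 64;
	b->used  = 0;
	b->grows = 0;
	b->buf   = malloc(b->size);
	return b->buf ? 0 : -1;
}

/*
 * Returns the offset of the new allocation (not a pointer, because growing
 * the buffer may move it), or -1 if memory could not be obtained.
 */
static long bump_alloc(struct bump *b, size_t bytes)
{
	size_t off = b->used;

	assert(b->buf);

	if (off + bytes > b->size) {		/* slowpath: grow the buffer */
		size_t new_size = b->size * 2;
		char *n;

		while (new_size < off + bytes)
			new_size *= 2;
		n = realloc(b->buf, new_size);
		if (!n)
			return -1;
		b->buf  = n;
		b->size = new_size;
		b->grows++;
	}

	b->used = off + bytes;			/* fastpath: just bump */
	return (long) off;
}

static void bump_exit(struct bump *b, struct bump_stats *s)
{
	/* Remember the peak so the next transaction allocates enough up front. */
	if (b->used > s->max_mem)
		s->max_mem = b->used;
	free(b->buf);
	b->buf = NULL;
}
```

Running the same workload twice against one `struct bump_stats` shows the point: the first run grows the buffer, while the second starts large enough and never takes the slowpath.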
2024-01-01  bcachefs; kill bch2_btree_key_cache_flush()  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: kill btree_path->(alloc_seq|downgrade_seq)  Kent Overstreet
These were for extra info in tracepoints for debugging a specialized issue - we do not want to bloat btree_path for this, at least in release builds. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: Fix snapshot.c assertion for online fsck  Kent Overstreet
c->curr_recovery_pass can go backwards; this adds a non-rewinding version, c->recovery_pass_done. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
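A minimal sketch of the distinction, assuming a simplified state with two integer fields. The struct, the helper names, and the use of -1 for "nothing completed yet" are hypothetical; only the idea of a position that can rewind next to a high-water mark that cannot comes from the message above.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for the two fields named above. */
struct recovery_state {
	int curr_pass;	/* may be rewound to re-run earlier passes */
	int pass_done;	/* high-water mark: never decreases */
};

static void pass_finished(struct recovery_state *r, int pass)
{
	if (pass > r->pass_done)
		r->pass_done = pass;
	r->curr_pass = pass + 1;
}

/* Rewinding moves the current position only; the high-water mark stays. */
static void rewind_to(struct recovery_state *r, int pass)
{
	assert(pass >= 0);
	r->curr_pass = pass;
}

/* Assertions should test the mark that cannot go backwards. */
static int pass_has_completed(const struct recovery_state *r, int pass)
{
	return r->pass_done >= pass;
}
```

An assertion written against `curr_pass` could fire spuriously after a rewind, whereas one written against `pass_done` keeps holding.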
2024-01-01  bcachefs: six lock: fix typos  Randy Dunlap
Fix a few typos in the six.h header file. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Brian Foster <bfoster@redhat.com> Cc: linux-bcachefs@vger.kernel.org Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: reserve path idx 0 for sentinal  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: Rename for_each_btree_key2() -> for_each_btree_key()  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: Kill for_each_btree_key()  Kent Overstreet
for_each_btree_key() handles transaction restarts, like for_each_btree_key2(), but only calls bch2_trans_begin() after a transaction restart - for_each_btree_key2() wraps every loop iteration in a transaction. The for_each_btree_key() behaviour is problematic when it leads to holding the SRCU lock that prevents key cache reclaim for an unbounded amount of time - there's no real need to keep it around. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: continue now works in for_each_btree_key2()  Kent Overstreet
continue now works as in any other loop Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: Fix bch2_read_btree()  Kent Overstreet
In the debugfs code, we had an incorrect use of drop_locks_do(); on transaction restart we don't want to restart the current loop iteration, since we've already emitted the current key to the buffer for userspace. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: Fix open coded set_btree_iter_dontneed()  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: BCH_IOCTL_FSCK_ONLINE  Kent Overstreet
This adds a new ioctl for running fsck on a mounted, in use filesystem. This reuses the fsck_thread code from the previous patch for running fsck on an offline, unmounted filesystem, so that log messages for the fsck thread are redirected to userspace. Only one running fsck instance is allowed at a time; a new semaphore (since the lock will be taken by one thread and released by another) is added for this. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
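The reason for a semaphore rather than a mutex can be illustrated with POSIX primitives in userspace. This is a sketch under assumptions, not the kernel code: all names are hypothetical, and POSIX `sem_t` and pthreads stand in for the kernel's semaphore and kthread. The requesting thread takes the slot, and the worker thread releases it when it finishes, which a mutex does not permit because a mutex must be unlocked by the thread that locked it.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

/* Hypothetical slot: count 1 means no fsck is currently running. */
static sem_t fsck_running;

static int fsck_slot_init(void)
{
	return sem_init(&fsck_running, 0, 1);
}

/*
 * Called by the requesting thread (the ioctl caller in the real design).
 * Returns 0 if the slot was taken, -EBUSY if an instance is already running.
 */
static int fsck_try_begin(void)
{
	if (sem_trywait(&fsck_running) == 0)
		return 0;
	return errno == EAGAIN ? -EBUSY : -errno;
}

/*
 * Runs in a different thread from the one that took the slot, and releases
 * the slot itself when done. A semaphore has no owner, so this is allowed.
 */
static void *fsck_thread_fn(void *arg)
{
	int *passes_run = arg;

	assert(passes_run != NULL);
	*passes_run = 3;		/* stand-in for the actual fsck work */
	sem_post(&fsck_running);
	return NULL;
}
```

A caller would invoke `fsck_try_begin()` and, on success, start `fsck_thread_fn` with `pthread_create()`; a second request made while the worker is still running gets `-EBUSY`.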
2024-01-01  bcachefs: BCH_IOCTL_FSCK_OFFLINE  Kent Overstreet
This adds a new ioctl for running fsck on a list of devices. Normally, if we wish to use the kernel's implementation of fsck we'd run it at mount time with -o fsck. This ioctl lets us run fsck without mounting, so that userspace bcachefs-tools can transparently switch to the kernel's implementation of fsck when appropriate - primarily if the kernel version of bcachefs better matches the filesystem on disk. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: bch2_run_online_recovery_passes()  Kent Overstreet
Add a new helper for running online recovery passes - i.e. online fsck. This is a subset of our normal recovery passes, and does not - for now - use or follow c->curr_recovery_pass. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: Mark recovery passses that are safe to run online  Kent Overstreet
Online fsck is coming, and many of our recovery/fsck passes are already safe to run while the filesystem is in use - mark which ones. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: Add ability to redirect log output  Kent Overstreet
Upcoming patches are going to add two new ioctls for running fsck in the kernel, but pretending that we're running our normal userspace fsck. This patch adds some plumbing for redirecting our normal log messages away from the dmesg log to a thread_with_file file descriptor - via a struct log_output, which will be consumed by the fsck f_op's read method. The new ioctls will allow for running fsck in the kernel against an offline filesystem (without mounting it), and an online filesystem. For an offline filesystem we need a way to pass in a pointer to the log_output, which is done via a new hidden opts.h option. For online fsck, we can set c->output directly, but only want to redirect log messages from the thread running fsck - hence the new c->output_filter method. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: thread_with_file  Kent Overstreet
Abstract out a new helper from the data job code, for connecting a kthread to a file descriptor. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: c->ro_ref  Kent Overstreet
Add a new refcount for async ops that don't necessarily need the fs to be RW, with similar lifetime/rules otherwise as c->writes. To be used by online fsck. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: Improve error message when finding wrong btree node  Kent Overstreet
single_device.merge_torture_flakey is, very rarely, finding a btree node that doesn't match the key that points to it: this patch improves the error message to print out more fields from the btree node header, so that we can see what else does or does not match the key. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: return from fsync on writeback error to avoid early shutdown  Brian Foster
When investigating transient failures of generic/441 on bcachefs, it was determined that the cause of the failure was a combination of unconditional emergency shutdown and racing between background journal activity and the test switchover from a working device mapper table to an error injecting table. Part of the reason for this sequence of events is that bcachefs aggressively flushes as much as possible during fsync(), regardless of errors. While this is reasonable behavior, it is technically unnecessary because once an error is returned from fsync(), the caller cannot make any assumptions about the resilience of data. Tweak the bch2_fsync() logic to return an error on failure of any of the steps involved in the flush. Note that this change alone does not prevent generic/441 failure, but in combination with a test tweak to avoid racing during the dm-error table switchover it avoids the unnecessary shutdowns and allows the test to pass reliably on bcachefs. Signed-off-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
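The control-flow change described above, stopping at the first failing flush step instead of running every step regardless, can be sketched as follows. The step functions and `struct flush_ctx` are hypothetical stand-ins; this is not the body of bch2_fsync().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical context that records how many flush steps actually ran. */
struct flush_ctx {
	int steps_run;
};

typedef int (*flush_step_fn)(struct flush_ctx *);

/* Stand-ins for steps such as data writeback or a journal flush. */
static int step_ok(struct flush_ctx *c)  { c->steps_run++; return 0; }
static int step_eio(struct flush_ctx *c) { c->steps_run++; return -EIO; }

/*
 * Run the steps in order and return as soon as one fails. Once fsync() is
 * going to report an error, the caller can assume nothing about durability,
 * so the remaining steps are not needed.
 */
static int fsync_sketch(struct flush_ctx *c, const flush_step_fn *steps, size_t nr)
{
	assert(c && steps);

	for (size_t i = 0; i < nr; i++) {
		int ret = steps[i](c);

		if (ret)
			return ret;
	}
	return 0;
}
```

With the steps `{ step_ok, step_eio, step_ok }`, the third step never runs and the caller sees `-EIO`.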
2024-01-01  bcachefs: BCH_ERR_opt_parse_error  Kent Overstreet
Continuing the project of replacing generic error codes with more specific ones. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: Refactor trans->paths_allocated to be standard bitmap  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: Move reflink_p triggers into reflink.c  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: Remove obsolete comment about zstd  Richard Davies
Remove the obsolete comment about zstd, since the approach changed during development of commit bbc3a46065d08f9ab3412b1f26bbfa778c444833. Signed-off-by: Richard Davies <richard@arachsys.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: Include btree_trans in more tracepoints  Kent Overstreet
This gives us more context information - e.g. which codepath is invoking btree node reads. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: remove sb lock and flags update on explicit shutdown  Brian Foster
bcachefs grabs s_umount and sets SB_RDONLY when the fs is shut down via the ioctl() interface. This has a couple of issues related to interactions between shutdown and freeze: 1. The flags == FSOP_GOING_FLAGS_DEFAULT case is a deadlock vector because freeze_bdev() calls into freeze_super(), which also acquires s_umount. 2. If an explicit shutdown occurs while the sb is frozen, SB_RDONLY alters the thaw path as if the sb was read-only at freeze time. This effectively leaks the frozen state and leaves the sb frozen indefinitely. The usage of SB_RDONLY here goes back to the initial bcachefs commit and AFAICT is simply historical behavior. This behavior is unique to bcachefs relative to the handful of other filesystems that support the shutdown ioctl(). Typically, SB_RDONLY is reserved for the proper remount path, which itself is restricted from modifying frozen superblocks in reconfigure_super(). Drop the unnecessary sb lock and flags update in bch2_ioc_goingdown() to address both of these issues. Signed-off-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: Make backpointer fsck wb flush check more rigorous  Kent Overstreet
backpointers fsck now always runs in rw mode - the btree is being modified while it runs, by e.g. copygc, rebalance, the discard worker, the invalidate worker. We could find a missing backpointer, flush the btree write buffer, and then on the next iteration find a new key at the exact same position - which will most likely need another write buffer flush. Hence, we have to check for an exact match on last_flushed, not just the pos. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
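The difference between comparing only the position and comparing the whole key can be sketched as below. The key layout, the `version` field used to tell two keys at the same position apart, and the helper names are all hypothetical; the real code operates on bkeys.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified key: a position plus something that identifies the key itself. */
struct sketch_key {
	unsigned long long inode;
	unsigned long long offset;
	unsigned	   version;
};

struct flush_state {
	struct sketch_key last_flushed;
	int		  have_last;
	int		  flushes;
};

/* Exact match: position AND the rest of the key must be equal. */
static int key_eq(const struct sketch_key *a, const struct sketch_key *b)
{
	return a->inode == b->inode &&
	       a->offset == b->offset &&
	       a->version == b->version;
}

/*
 * Returns 1 if the caller should flush the write buffer and recheck,
 * 0 if this exact key was already flushed for, so the miss is real.
 */
static int need_flush(struct flush_state *s, const struct sketch_key *k)
{
	assert(s && k);

	if (s->have_last && key_eq(&s->last_flushed, k))
		return 0;

	s->last_flushed = *k;
	s->have_last = 1;
	s->flushes++;
	return 1;
}
```

A position-only comparison would skip the second flush when a new key appears at the same position, which is the case the message above describes.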
2024-01-01  bcachefs: On missing backpointer to interior node, flush interior updates  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: remove redundant condition from data_update_index_update  Daniel Hill
Signed-off-by: Daniel Hill <daniel@gluo.nz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: copygc shouldn't try moving buckets on error  Daniel Hill
Co-developed-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Daniel Hill <daniel@gluo.nz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: Explicity go RW for fsck  Kent Overstreet
This eliminates a lot of BCH_TRANS_COMMIT_lazy_rw flags, and is less error prone. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: copygc should wakeup on shutdown if disabled  Daniel Hill
Signed-off-by: Daniel Hill <daniel@gluo.nz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: rebalance should wakeup on shutdown if disabled  Daniel Hill
Signed-off-by: Daniel Hill <daniel@gluo.nz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: remove dead bch2_evacuate_bucket()  Daniel Hill
Signed-off-by: Daniel Hill <daniel@gluo.nz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: Replace zero-length arrays with flexible-array members  Gustavo A. R. Silva
Fake flexible arrays (zero-length and one-element arrays) are deprecated, and should be replaced by flexible-array members. So, replace zero-length arrays with flexible-array members in multiple structures. Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
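For readers unfamiliar with the distinction, here is a small standalone C example of a flexible-array member. `struct entry` and its helpers are hypothetical and are not one of the structures touched by the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * The deprecated "fake" forms would declare the trailing array as
 * vals[0] or vals[1]. The C99 flexible-array member uses empty
 * brackets and must be the last member of the struct.
 */
struct entry {
	unsigned	   nr;
	unsigned long long vals[];
};

/* Allocate the header plus room for nr trailing elements, zero-filled. */
static struct entry *entry_alloc(unsigned nr)
{
	struct entry *e = calloc(1, sizeof(*e) + (size_t) nr * sizeof(e->vals[0]));

	if (e)
		e->nr = nr;
	return e;
}

static unsigned long long entry_sum(const struct entry *e)
{
	unsigned long long sum = 0;

	assert(e);
	for (unsigned i = 0; i < e->nr; i++)
		sum += e->vals[i];
	return sum;
}
```

Because the compiler knows `vals` has no fixed size, it can reject misuse, such as placing the member anywhere but last, that the zero-length form lets through.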
2024-01-01  bcachefs: more write buffer refactoring  Kent Overstreet
Prep work for the big rewrite - no functional changes in this patch. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: wb_flush_one_slowpath()  Kent Overstreet
A bit of refactoring for better inlining in the main btree write buffer flush path. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: ONLY_SPECIFIED_DEVS doesn't mean ignore durability anymore  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: Don't open code bch2_dev_exists2()  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: Improve trace_trans_restart_would_deadlock  Kent Overstreet
In the CI, we're seeing tests failing due to excessive would_deadlock transaction restarts - the tracepoint now includes the lock cycle that occurred. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: Improve trace_trans_restart_too_many_iters()  Kent Overstreet
We now include the list of paths in use. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: count_event()  Kent Overstreet
Small helper for event counters. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: bch2_btree_write_buffer_flush() -> bch2_btree_write_buffer_tryflush()  Kent Overstreet
More accurate naming. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: bch2_btree_write_buffer_flush_locked()  Kent Overstreet
Minor refactoring - improved naming, and move the responsibility for flush_lock to the caller instead of having it be shared. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: Clean up btree write buffer write ref handling  Kent Overstreet
__bch2_btree_write_buffer_flush() now assumes a write ref is already held (as called by the transaction commit path); and the wrappers bch2_write_buffer_flush() and flush_sync() take an explicit write ref. This means internally the write buffer code can always use BTREE_INSERT_NOCHECK_RW, instead of passing flags around, as the previous code did, and hoping the NOCHECK_RW flag was always carried along correctly. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: delete useless commit_do()  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: kill journal->preres_wait  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: Improve btree write buffer tracepoints  Kent Overstreet
- add a tracepoint for write_buffer_flush_sync; this is expensive
- fix the write_buffer_flush_slowpath tracepoint
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>