Age | Commit message (Collapse) | Author |
|
check_directory_structure runs after check_dirents, so it expects that
it won't see any inodes with missing backpointers - normally.
But online fsck can't run check_dirents yet, or the user might only be
running a specific pass, so we need to be careful that this isn't an
error. If an inode is unreachable, that's handled by a separate pass.
Also, add a new 'bch2_inode_has_backpointer()' helper, since we were
doing this inconsistently.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
'inode_has_wrong_backpointer'; we have more specific errors for every
case afterwards.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add a flag for tracking whether a directory has case-insensitive
descendents - so that overlayfs can disallow mounting, even though the
filesystem supports case insensitivity.
This is a new on disk format version, with a (cheap) upgrade to ensure
the flag is correctly set on existing inodes.
Create, rename and fssetxattr are all plumbed to ensure the new flag is
set, and we've got new fsck code that hooks into check_inode(0.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Move a fsck.c helper into inode.c, eliminate some duplicate and organize
the inode lookup helpers.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Factor out a small common helper.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Changing the casefold option requires extra checks/work - factor out a
helper from bch2_fileattr_set() for the xattr code to use.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Prep work for calling bch2_check_dirent_target() from bch2_lookup().
- Add an inline wrapper, if the target and backpointer match we can skip
the function call.
- We don't (yet?) want to remove the dirent we did the lookup from (when
we find a directory or subvol with multiple valid dirents pointing to
it), we can defer on that until later. For now, add an "are we in
fsck?" parameter.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
bch_extent_rebalance
Previously, bch2_bkey_sectors_need_rebalance() called
bch2_target_accepts_data(), checking whether the target is writable.
However, this means that adding or removing devices from a target would
change the value of bch2_bkey_sectors_need_rebalance() for an existing
extent; this needs to be invariant so that the extent trigger can
correctly maintain rebalance_work accounting.
Instead, check target_accepts_data() in io_opts_to_rebalance_opts(),
before creating the bch_extent_rebalance entry.
This fixes (one?) cause of rebalance_work accounting being off.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Persistent cursors for inode allocation.
A free inodes btree would add substantial overhead to inode allocation
and freeing - a "next num to allocate" cursor is always going to be
faster.
We just need it to be persistent, to avoid scanning the inodes btree
from the start on startup.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This adds a new inode field, bi_depth, for directory inodes: this allows
us to make the check_directory_structure pass much more efficient.
Currently, to ensure the filesystem is fully connect and has no loops,
for every directory we follow backpointers until we find the root. But
by adding a depth counter, it sufficies to only check the parent of each
directory, and check that the parent's bi_depth is smaller.
(fsck doesn't require that bi_depth = parent->bi_depth + 1; if a rename
causes bi_depth off, but the chain to the root is still strictly
decreasing, then the algorithm still works and there's no need for fsck
to fixup the bi_depth fields).
We've already checked backpointers, so we know that every directory
(excluding the root)has a valid parent: if bi_depth is always
decreasing, every chain must terminate, and terminate at the root
directory.
bi_depth will not necessarily be correct when fsck runs, due to
directory renames - we can't change bi_depth on every child directory
when renaming a directory. That's ok; fsck will silently fix the
bi_depth field as needed, and future fsck runs will be much faster.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add a new parameter to bkey validate functions, and use it to improve
invalid bkey error messages: we can now print the btree and depth it
came from, or if it came from the journal, or is a btree root.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Previously, BCHFS_IOC_REINHERIT_ATTRS didn't trigger rebalance scans
when changing rebalance options - it had been missed, only the xattr
interface triggered them.
Ideally they'd be done by the transactional trigger, but unpacking the
inode to get the options is too heavy to be done in the low level
trigger - the inode trigger is run on every extent update, since the
bch_inode.bi_journal_seq has to be updated for fsync.
bch2_write_inode() is a good compromise, it already unpacks and repacks
and is not run in any super-fast paths.
Additionally, creating the new rebalance entry to trigger the scan is
now done in the same transaction as the inode update that changed the
options.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Trivial cleanup - add a normal BITMASK() helper for bch_inode_unpacked.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
There's an inherent race in taking a snapshot while an unlinked file is
open, and then reattaching it in the child snapshot.
In the interior snapshot node the file will appear unlinked, as though
it should be deleted - it's not referenced by anything in that snapshot
- but we can't delete it, because the file data is referenced by the
child snapshot.
This was being handled incorrectly with
propagate_key_to_snapshot_leaves() - but that doesn't resolve the
fundamental inconsistency of "this file looks like it should be deleted
according to normal rules, but - ".
To fix this, we need to fix the rule for when an inode is deleted. The
previous rule, ignoring snapshots (there was no well-defined rule
for with snapshots) was:
Unlinked, non open files are deleted, either at recovery time or
during online fsck
The new rule is:
Unlinked, non open files, that do not exist in child snapshots, are
deleted.
To make this work transactionally, we add a new inode flag,
BCH_INODE_has_child_snapshot; it overrides BCH_INODE_unlinked when
considering whether to delete an inode, or put it on the deleted list.
For transactional consistency, clearing it handled by the inode trigger:
when deleting an inode we check if there are parent inodes which can now
have the BCH_INODE_has_child_snapshot flag cleared.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
These shouldn't always be fatal errors - logged op resume, in
particular, and we want it as a parameter there.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
It was initially believed that it would be better to be explicit about
the snapshot we're updating when writing inodes in fsck; however, it
turns out that passing around the snapshot separately is more error
prone and we're usually updating the inode in the same snapshow we read
it from.
This is different from normal filesystem paths, where we do the update
in the snapshot of the subvolume we're in.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
this allows for various cleanups in fsck
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
bkey_fsck_err() was added as an interface that looks like fsck_err(),
but previously all it did was ensure that the appropriate error counter
was incremented in the superblock.
This is a cleanup and bugfix patch that converts it to a wrapper around
fsck_err(). This is needed to fix an issue with the upgrade path to
disk_accounting_v3, where the "silent fix" error list now includes
bkey_fsck errors; fsck_err() handles this in a unified way, and since we
need to change printing of bkey fsck errors from the caller to the inner
bkey_fsck_err() calls, this ends up being a pretty big change.
Als,, rename .invalid() methods to .validate(), for clarity, while we're
changing the function signature anyways (to drop the printbuf argument).
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Unnecessary here, and this broke the rust bindings:
error[E0588]: packed type cannot transitively contain a `#[repr(align)]` type
--> /build/source/target/release/build/bch_bindgen-9445b24c90aca2a3/out/bcachefs.rs:29025:1
|
29025 | pub struct bkey_i_inode_v3 {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^
|
note: `bch_inode_v3` has a `#[repr(align)]` attribute
--> /build/source/target/release/build/bch_bindgen-9445b24c90aca2a3/out/bcachefs.rs:8949:1
|
8949 | pub struct bch_inode_v3 {
| ^^^^^^^^^^^^^^^^^^^^^^^
error[E0588]: packed type cannot transitively contain a `#[repr(align)]` type
--> /build/source/target/release/build/bch_bindgen-9445b24c90aca2a3/out/bcachefs.rs:32826:1
|
32826 | pub struct bkey_inode_buf {
| ^^^^^^^^^^^^^^^^^^^^^^^^^
|
note: `bch_inode_v3` has a `#[repr(align)]` attribute
--> /build/source/target/release/build/bch_bindgen-9445b24c90aca2a3/out/bcachefs.rs:8949:1
|
8949 | pub struct bch_inode_v3 {
| ^^^^^^^^^^^^^^^^^^^^^^^
note: `bkey_inode_buf` contains a field of type `bkey_i_inode_v3`
--> /build/source/target/release/build/bch_bindgen-9445b24c90aca2a3/out/bcachefs.rs:32827:9
|
32827 | pub inode: bkey_i_inode_v3,
| ^^^^^
note: ...which contains a field of type `bch_inode_v3`
--> /build/source/target/release/build/bch_bindgen-9445b24c90aca2a3/out/bcachefs.rs:29027:9
|
29027 | pub v: bch_inode_v3,
| ^
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We're about to start using bch_validate_flags for superblock section
validation - it's no longer bkey specific.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
When building with clang's -Wincompatible-function-pointer-types-strict
(a warning designed to catch potential kCFI failures at build time),
there are several warnings along the lines of:
fs/bcachefs/bkey_methods.c:118:2: error: incompatible function pointer types initializing 'int (*)(struct btree_trans *, enum btree_id, unsigned int, struct bkey_s_c, struct bkey_s, enum btree_iter_update_trigger_flags)' with an expression of type 'int (struct btree_trans *, enum btree_id, unsigned int, struct bkey_s_c, struct bkey_s, unsigned int)' [-Werror,-Wincompatible-function-pointer-types-strict]
118 | BCH_BKEY_TYPES()
| ^~~~~~~~~~~~~~~~
fs/bcachefs/bcachefs_format.h:394:2: note: expanded from macro 'BCH_BKEY_TYPES'
394 | x(inode, 8) \
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
fs/bcachefs/bkey_methods.c:117:41: note: expanded from macro 'x'
117 | #define x(name, nr) [KEY_TYPE_##name] = bch2_bkey_ops_##name,
| ^~~~~~~~~~~~~~~~~~~~
<scratch space>:277:1: note: expanded from here
277 | bch2_bkey_ops_inode
| ^~~~~~~~~~~~~~~~~~~
fs/bcachefs/inode.h:26:13: note: expanded from macro 'bch2_bkey_ops_inode'
26 | .trigger = bch2_trigger_inode, \
| ^~~~~~~~~~~~~~~~~~
There are several functions that did not have their flags parameter
converted to 'enum btree_iter_update_trigger_flags' in the recent
unification, which will cause kCFI failures at runtime because the
types, while ABI compatible (hence no warning from the non-strict
version of this warning), do not match exactly.
Fix up these functions (as well as a few other obvious functions that
should have it, even if there are no warnings currently) to resolve the
warnings and potential kCFI runtime failures.
Fixes: 31e4ef3280c8 ("bcachefs: iter/update/trigger/str_hash flag cleanup")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Combine iter/update/trigger/str_hash flags into a single enum, and
x-macroize them for a to_text() function later.
These flags are all for a specific iter/key/update context, so it makes
sense to group them together - iter/update/trigger flags were already
given distinct bits, this cleans up and unifies that handling.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
prep work for improving logging/error messages
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Subvolumes and subvolume root inodes point to each other: this verifies
the subvolume -> inode -> subvolme path.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Prep work for disk space accounting rewrite: we're going to want to use
a single callback for both of our current triggers, so we need to change
them to have the same type signature first.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Prep work for disk space accounting rewrite: we're going to want to use
a single callback for both of our current triggers, so we need to change
them to have the same type signature first.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This lets us use bch2_prt_bitflags to print them out.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
- the fsck_err() check for the filesystem being clean was incorrect,
causing us to always fail to delete unlinked inodes
- if a snapshot had been taken, the unlinked inode needs to be
propagated to snapshot leaves so the unlink can happen there - fixed.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This patch adds a superblock error counter for every distinct fsck
error; this means that when analyzing filesystems out in the wild we'll
be able to see what sorts of inconsistencies are being found and repair,
and hence what bugs to look for.
Errors validating bkeys are not yet considered distinct fsck errors, but
this patch adds a new helper, bkey_fsck_err(), in order to add distinct
error types for them as well.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
New helper for new rebalance code
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
bch2_quota_read(), when scanning for inodes, may attempt to look up
inodes that have been deleted in the main subvolume - this is not an
error.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add a new bitset btree for inodes pending deletion; this means we no
longer have to scan the full inodes btree after an unclean shutdown.
Specifically, this adds:
- a trigger to update the deleted_inodes btree based on changes to the
inodes btree
- a new recovery pass
- and check_inodes is now only a fsck pass.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Prep work for the new deleted inodes btree
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
bit of reorg
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
As part of the forward compatibility patch series, we need to allow for
new key types without complaining loudly when running an old version.
This patch changes the flags parameter of bkey_invalid to an enum, and
adds a new flag to indicate we're being called from the transaction
commit path.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This adds a new field to bkey_ops for the minimum size of the value,
which standardizes that check and also enforces the new rule (previously
done somewhat ad-hoc) that we can extend value types by adding new
fields on to the end.
To make that work we do _not_ initialize min_val_size with sizeof,
instead we initialize it to the size of the first version of those
values.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Move bi_size and bi_sectors into the non-varint portion of the inode, so
that the write path can update them without going through the relatively
expensive unpack/pack operations.
Other changes:
- Add a field for the offset of the varint section, so we can add new
non-varint fields without needing a new inode type, like alloc_v3
- Move bi_mode into the flags field, so that the varint section can be
u64 aligned
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This improves io_opts() and makes it a non-inline function - it's big
enough that it probably shouldn't be.
Also, bch_io_opts no longer needs fields for whether options are
defined, so we can slim it down a bit.
We'd like to stop passing around the full bch_io_opts, but that'll be
tricky because of bch2_rebalance_add_key().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
It turns out the *_defined entries of bch_io_opts are only used in one
place - in the xattr get path - and there we immediately convert to a
bch_opts struct, which also has the *_defined entries.
This patch changes bch2_inode_opts_to_opts() to go directly from
bch_inode_unpacked to bch_opts, which is a minor simplification and will
also let us slim down struct bch_io_opts in another patch.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Fixes for various checkpatch errors.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Long ago, bkey_unpack_key() was added to bset.h instead of bkey.h
because bkey.h didn't include btree_types.h, which it needs for the
compiled unpack function.
This patch finally moves it to the proper location.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|