summaryrefslogtreecommitdiff
path: root/fs/bcachefs/io_read.h
AgeCommit message (Collapse)Author
2025-06-11bcachefs: Read error message now prints if self healingKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-02bcachefs: bch_err_throw()Kent Overstreet
Add a tracepoint for any time we return an error and unwind. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21bcachefs: Make various async objs visible in debugfsKent Overstreet
Add async objs list for - promote_op - bch_read_bio - btree_read_bio - btree_write_bio This gets us introspection on in-flight async ops, and because under the hood it uses fast_lists (percpu slot buffer on top of a radix tree), it'll be fast enough to enable in production. This will be very helpful for debugging "something got stuck" issues, which have been cropping up from time to time (in the CI, especially with folio writeback). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21bcachefs: bch2_read_bio_to_textKent Overstreet
Pretty printer for struct bch_read_bio. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21bcachefs: Be precise about bch_io_failuresKent Overstreet
If the extent we're reading from changes, due to be being overwritten or moved (possibly partially) - we need to reset bch_io_failures so that we don't accidentally mark a new extent as poisoned prematurely. This means we have to separately track (in the retry path) the extent we previously read from. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-25bcachefs: Fix silent short reads in data read retry pathKent Overstreet
__bch2_read, before calling __bch2_read_extent(), sets bvec_iter.bi_size to "the size we can read from the current extent" with a swap, and restores it to "the size for the total read" after the read_extent call with another swap. But we neglected to do the restore before the "if (ret) goto err;" - which is a problem if we're retrying those errors. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-24bcachefs: __bch2_read() now takes a btree_transKent Overstreet
Next patch will be checking if the extent we're reading from matches the IO failure we saw before marking the failure. For this to work, __bch2_read() needs to take the same transaction context that bch2_rbio_retry() uses to do that check. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-24bcachefs: BCH_READ_data_update -> bch_read_bio.data_updateKent Overstreet
Read flags are codepath dependent and change as they're passed around, while the fields in rbio._state are mostly fixed properties of that particular object. Losing track of BCH_READ_data_update would be bad, and previously it was not obvious if it was always correctly set in the rbio, so this is a safety cleanup. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-16bcachefs: Return errors to top level bch2_rbio_retry()Kent Overstreet
Next patch will be adding an additional retry loop for checksum errors, so that we can rule out transient errors before marking an extent as poisoned. Prerequisite to this is returning errors to bch2_rbio_retry(); this will also let us add a "successful retry" message. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-16bcachefs: BCH_ERR_data_read_buffer_too_smallKent Overstreet
Now that the read path uses proper error codes, we can get rid of the weird rbio->hole signalling to the move path that the read didn't happen. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-16bcachefs: Convert read path to standard error codesKent Overstreet
Kill the READ_ERR/READ_RETRY/READ_RETRY_AVOID enums, and add standard error codes that describe precisely which error occured. This is going to be used for the data move path, to move but poison extents with checksum errors. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-16bcachefs: Don't create bch_io_failures unless it's neededKent Overstreet
Only needed in retry path, no point in wasting stack space. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-14bcachefs: ScrubKent Overstreet
Add a new data op to walk all data and metadata in a filesystem, checking if it can be read successfully, and on error repairing from another copy if possible. - New helper: bch2_dev_idx_is_online(), so that we can bail out and report to userspace when we're unable to scrub because the device is offline - data_update_opts, which controls the data move path, now understands scrub: data is only read, not written. The read path is responsible for rewriting on read error, as with other reads. - scrub_pred skips data extents that don't have checksums - bch_ioctl_data has a new scrub member, which has a data_types field for data types to check - i.e. all data types, or only metadata. - Add new entries to bch_move_stats so that we can report numbers for corrected and uncorrected errors - Add a new enum to bch_ioctl_data_event for explicitly reporting completion and return code (i.e. device offline) Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-14bcachefs: bch2_bkey_pick_read_device() can now specify a deviceKent Overstreet
To be used for scrub, where we want the read to come from a specific device. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-14bcachefs: promote_op uses embedded bch_read_bioKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-14bcachefs: rbio_init() cleanupKent Overstreet
Move more initialization to rbio_init(), to assist in further cleanups. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-14bcachefs: rbio_init_fragment()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-14bcachefs: x-macroize BCH_READ flagsKent Overstreet
Will be adding a bch2_read_bio_to_text(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-14bcachefs: kill bch_read_bio.devs_haveKent Overstreet
Dead code. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21bcachefs: Don't delete reflink pointers to missing indirect extentsKent Overstreet
To avoid tragic loss in the event of transient errors (i.e., a btree node topology error that was later corrected by btree node scan), we can't delete reflink pointers to correct errors. This adds a new error bit to bch_reflink_p, indicating that it is known to point to a missing indirect extent, and the error has already been reported. Indirect extent lookups now use bch2_lookup_indirect_extent(), which on error reports it as a fsck_err() and sets the error bit, and clears it if necessary on succesful lookup. This also gets rid of the bch2_inconsistent_error() call in __bch2_read_indirect_extent, and in the reflink_p trigger: part of the online self healing project. An on disk format change isn't necessary here: setting the error bit will be interpreted by older versions as pointing to a different index, which will also be missing - which is fine. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Break up io.cKent Overstreet
More reorganization, this splits up io.c into - io_read.c - io_misc.c - fallocate, fpunch, truncate - io_write.c Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>