linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2025-03-13	bcachefs: fix tiny leak in bch2_dev_add()	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-02-06	bcachefs: fix deadlock in journal_entry_open()	Jeongjun Park
	In the previous commit b3d82c2f2761, code was added to prevent journal sequence overflow. Among them, the code added to journal_entry_open() uses the bch2_fs_fatal_err_on() function to handle errors. However, __journal_res_get() , which calls journal_entry_open() , calls journal_entry_open() while holding journal->lock , but bch2_fs_fatal_err_on() internally tries to acquire journal->lock , which results in a deadlock. So we need to add a locked helper to handle fatal errors even when the journal->lock is held. Fixes: b3d82c2f2761 ("bcachefs: Guard against journal seq overflow") Signed-off-by: Jeongjun Park <aha310510@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09	bcachefs: bch2_fs_btree_gc_init()	Kent Overstreet
	Now returns errors, prep work for check_allocations_done_lock Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09	bcachefs: bcachefs_metadata_version_persistent_inode_cursors	Kent Overstreet
	Persistent cursors for inode allocation. A free inodes btree would add substantial overhead to inode allocation and freeing - a "next num to allocate" cursor is always going to be faster. We just need it to be persistent, to avoid scanning the inodes btree from the start on startup. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21	bcachefs: Kill unnecessary mark_lock usage	Kent Overstreet
	We can't hold mark_lock while calling fsck_err() - that's a deadlock, mark_lock is meant to be a leaf node lock. It's also unnecessary for gc_bucket() and bucket_gen(); rcu suffices since the bucket_gens array describes its size, and we can't race with device removal or resize during gc/fsck since that takes state lock. Reported-by: syzbot+38641fcbda1aaffefdd4@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21	bcachefs: Kill bch2_bucket_alloc_new_fs()	Kent Overstreet
	The early-early allocation path, bch2_bucket_alloc_new_fs(), is no longer needed - and inconsistencies around new_fs_bucket_idx have been a frequent source of bugs. Reported-by: syzbot+592425844580a6598410@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21	bcachefs: fix O(n^2) issue with whiteouts in journal keys	Kent Overstreet
	The journal_keys array can't be substantially modified after we go RW, because lookups need to be able to check it locklessly - thus we're limited on what we can do when a key in the journal has been overwritten. This is a problem when there's many overwrites to skip over for peek() operations. To fix this, add tracking of ranges of overwrites: we create a range entry when there's more than one contiguous whiteout. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21	bcachefs: Fix shutdown message	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21	bcachefs: Don't use page allocator for sb_read_scratch	Kent Overstreet
	Kill another unnecessary dependency on PAGE_SIZE Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21	bcachefs: Simplify code in bch2_dev_alloc()	Youling Tang
	- Remove unnecessary variable 'ret'. - Remove unnecessary bch2_dev_free() operations. Signed-off-by: Youling Tang <tangyouling@kylinos.cn> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21	bcachefs: Add assert for use of journal replay keys for updates	Kent Overstreet
	The journal replay keys mechanism can only be used for updates in early recovery, when still single threaded. Add some asserts to make sure we never accidentally use it elsewhere. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21	bcachefs: copygc_enabled, rebalance_enabled now opts.h options	Kent Overstreet
	They can now be set at mount time Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21	bcachefs: Add locking for bch_fs.curr_recovery_pass	Kent Overstreet
	Recovery can rewind in certain situations - when we discover we need to run a pass that doesn't normally run. This can happen from another thread for btree node read errors, so we need a bit of locking. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21	bcachefs: remove superfluous ; after statements	Colin Ian King
	There are a several statements with two following semicolons, replace these with just one semicolon. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-11-07	bcachefs: bch2_btree_write_buffer_flush_going_ro()	Kent Overstreet
	The write buffer needs to be specifically flushed when going RO: keys in the journal that haven't yet been moved to the write buffer don't have a journal pin yet. This fixes numerous syzbot bugs, all with symptoms of still doing writes after we've got RO. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-18	bcachefs: Don't use commit_do() unnecessarily	Kent Overstreet
	Using commit_do() to call alloc_sectors_start_trans() breaks when we're randomly injecting transaction restarts - the restart in the commit causes us to leak the lock that alloc_sectorS_start_trans() takes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-14	bcachefs: Fix sysfs warning in fstests generic/730,731	Kent Overstreet
	sysfs warns if we're removing a symlink from a directory that's no longer in sysfs; this is triggered by fstests generic/730, which simulates hot removal of a block device. This patch is however not a correct fix, since checking kobj->state_in_sysfs on a kobj owned by another subsystem is racy. A better fix would be to add the appropriate check to sysfs_remove_link() - and sysfs_create_link() as well. But kobject_add_internal()/kobject_del() do not as of today have locking that would support that. Note that the block/holder.c code appears to be subject to this race as well. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-21	bcachefs: btree cache counters should be size_t	Kent Overstreet
	32 bits won't overflow any time soon, but size_t is the correct type for counting objects in memory. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-21	bcachefs: bch2_sb_member_alloc()	Kent Overstreet
	refactoring Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-21	bcachefs: bch2_dev_remove_alloc() -> alloc_background.c	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-09	bcachefs: promote_whole_extents is now a normal option	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-09	bcachefs: switch to rhashtable for vfs inodes hash	Kent Overstreet
	the standard vfs inode hash table suffers from painful lock contention - this is long overdue Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-30	bcachefs: Fix double free of ca->buckets_nouse	Kent Overstreet
	Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Fixes: ffcbec6076 ("bcachefs: Kill opts.buckets_nouse") Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14	bcachefs: Improve startup message	Kent Overstreet
	We're not always mounting when we start the filesystem Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14	bcachefs: Plumb more logging through stdio redirect	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14	bcachefs: bch2_verify_accounting_clean()	Kent Overstreet
	Verify that the in-memory accounting verifies the on-disk accounting after a clean shutdown. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14	bcachefs: Convert gc to new accounting	Kent Overstreet
	Rewrite fsck/gc for the new accounting scheme. This adds a second set of in-memory accounting counters for gc to use; like with other parts of gc we run all trigger in TRIGGER_GC mode, then compare what we calculated to existing in-memory accounting at the end. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14	bcachefs: Kill replicas_journal_res	Kent Overstreet
	More dead code deletion Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14	bcachefs: Delete journal-buf-sharded old style accounting	Kent Overstreet
	More deletion of dead code. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14	bcachefs: kill bch2_fs_usage_read()	Kent Overstreet
	With bch2_ioctl_fs_usage(), this is now dead code. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14	bcachefs: Disk space accounting rewrite	Kent Overstreet
	Main part of the disk accounting rewrite. This is a wholesale rewrite of the existing disk space accounting, which relies on percepu counters that are sharded by journal buffer, and rolled up and added to each journal write. With the new scheme, every set of counters is a distinct key in the accounting btree; this fixes scaling limitations of the old scheme, where counters took up space in each journal entry and required multiple percpu counters. Now, in memory accounting requires a single set of percpu counters - not multiple for each in flight journal buffer - and in the future we'll probably also have counters that don't use in memory percpu counters, they're not strictly required. An accounting update is now a normal btree update, using the btree write buffer path. At transaction commit time, we apply accounting updates to the in memory counters, which are percpu counters indexed in an eytzinger tree by the accounting key. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14	bcachefs: allow passing full device path for target options	Thomas Bertschinger
	The output of mount options such as "metadata_target" in `/proc/mounts` uses the full path to the device. mount(8) from util-linux uses the output from `/proc/mounts` to pass existing mount options when performing a remount, so bcachefs should accept as input the same form that it prints as output. Without this change: $ mount -t bcachefs -o metadata_target=vdb /dev/vdb /mnt $ strace mount -o remount /mnt ... fsconfig(4, FSCONFIG_SET_STRING, "metadata_target", "/dev/vdb", 0) = -1 EINVAL (Invalid argument) ... Signed-off-by: Thomas Bertschinger <tahbertschinger@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-28	bcachefs: Don't use the new_fs() bucket alloc path on an initialized fs	Kent Overstreet
	On a new filesystem or device we have to allocate the journal with a bump allocator, because allocation info isn't ready yet - but when hot-adding a device that doesn't have a journal, we don't want to use that path. Reported-by: syzbot+24a867cb90d8315cccff@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-28	bcachefs: Switch online_reserved shutdown assert to WARN()	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-25	bcachefs: Discard, invalidate workers are now per device	Kent Overstreet
	There's no reason for discards to be single threaded across all devices; this will improve performance on multi device setups. Additionally, making them per-device simplifies the refcounting on bch_dev->io_ref; we now hold it for the duration that the discard path is running, which fixes a race between the discard path and device removal. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-23	bcachefs: Add missing recalc_capacity() call	Kent Overstreet
	This fixes filesystem size not changing on device removal. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-21	bcachefs: Replace bare EEXIST with private error codes	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-19	bcachefs: Fix initialization order for srcu barrier	Kent Overstreet
	btree_iter_init() needs to happen before key_cache_init(), to initialize btree_trans_barrier Reported-by: syzbot+3cca837c2183f8f6fcaf@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-10	bcachefs: Split out btree_write_submit_wq	Kent Overstreet
	Split the workqueues for btree read completions and btree write submissions; we don't want concurrency control on btree read completions, but we do want concurrency control on write submissions, else blocking in submit_bio() will cause a ton of kworkers to be allocated. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-26	bcachefs: Fix debug assert	Kent Overstreet
	Reported-by: syzbot+a8074a75b8d73328751e@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-22	bcachefs: Fix shutdown ordering	Kent Overstreet
	the btree key cache uses the srcu struct created/destroyed by btree_iter.c; btree_iter needs to be exited last. Reported-by: syzbot+3af9daea347788b15213@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08	bcachefs: move replica_set from bch_dev to bch_fs	Kent Overstreet
	This is needed for the next patch - the write submit path has to be able to allocate a replica bio even when we weren't able to get a ref on the device. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08	bcachefs: Debug asserts for ca->ref	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08	bcachefs: New helpers for device refcounts	Kent Overstreet
	This will be used in the next patch for adding some new debug mode asserts. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08	bcachefs: x-macroize journal flags enums	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08	bcachefs: On device add, prefer unused slots	Kent Overstreet
	We can't strictly guarantee that no pointers refer to nonexistent devices - we attempt to, but we need to be safe when the filesystem is corrupt. Therefore, change device_add to try to pick a slot that's never been used, or the slot that's been unused the longest. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08	bcachefs: Kill opts.buckets_nouse	Kent Overstreet
	Now explicitly allocate and free the buckets_nouse bitmap - this is going to be used for online fsck. To go RW when we haven't check allocations, we'll do a much slimmed down version that just initializes the buckets_nouse bitmaps. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08	bcachefs: journal seq blacklist gc no longer has to walk btree	Kent Overstreet
	Since btree_ptr_v2, we no longer require the journal seq blacklist table for skipping blacklisted bsets (btree node entries); the pointer to a given node indicates how much data is present. Therefore there's no longer any need for journal seq blacklist gc to walk the btree - we can prune entries older than journal last_seq. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08	bcachefs: Move gc of bucket.oldest_gen to workqueue	Kent Overstreet
	This is a nice cleanup - and we've also been having problems with kthread creation in the mount path. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08	bcachefs: assert that online_reserved == 0 on shutdown	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>