summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-04-17btrfs: add a wbc pointer to struct btrfs_bio_ctrlChristoph Hellwig
Instead of passing down the wbc pointer the deep call chain, just add it to the btrfs_bio_ctrl structure. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17btrfs: remove the sync_io flag in struct btrfs_bio_ctrlChristoph Hellwig
The sync_io flag is equivalent to wbc->sync_mode == WB_SYNC_ALL, so just check for that and remove the separate flag. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17btrfs: store the bio opf in struct btrfs_bio_ctrlChristoph Hellwig
The bio op and flags never change over the life time of a bio_ctrl, so move it in there instead of passing it down the deep call chain all the way down to alloc_new_bio. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17btrfs: remove the force_bio_submit to submit_extent_pageChristoph Hellwig
If force_bio_submit, submit_extent_page simply calls submit_one_bio as the first thing. This can just be moved to the only caller that sets force_bio_submit to true. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17btrfs: don't set force_bio_submit in read_extent_buffer_subpageChristoph Hellwig
When read_extent_buffer_subpage calls submit_extent_page, it does so on a freshly initialized btrfs_bio_ctrl structure that can't have a valid bio to submit. Clear the force_bio_submit parameter to false as there is nothing to submit. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17btrfs: open code btrfs_bin_search()Anand Jain
btrfs_bin_search() is a simple wrapper that searches for the whole slots by calling btrfs_generic_bin_search() with the starting slot/first_slot preset to 0. This simple wrapper can be open coded as btrfs_bin_search(). Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17btrfs: dev-replace: properly follow its read modeQu Wenruo
[BUG] Although dev replace ioctl has a way to specify the mode on whether we should read from the source device, it's not properly followed. # mkfs.btrfs -f -d raid1 -m raid1 $dev1 $dev2 # mount $dev1 $mnt # xfs_io -f -c "pwrite 0 32M" $mnt/file # sync # btrfs replace start -r -f 1 $dev3 $mnt And one extra trace is added to scrub_submit(), showing the detail about the bio: btrfs-11569 [005] ... 37.0270: scrub_submit.part.0: devid=1 logical=22036480 phy=22036480 len=16384 btrfs-11569 [005] ... 37.0273: scrub_submit.part.0: devid=1 logical=30457856 phy=30457856 len=32768 btrfs-11569 [005] ... 37.0274: scrub_submit.part.0: devid=1 logical=30507008 phy=30507008 len=49152 btrfs-11569 [005] ... 37.0274: scrub_submit.part.0: devid=1 logical=30605312 phy=30605312 len=32768 btrfs-11569 [005] ... 37.0275: scrub_submit.part.0: devid=1 logical=30703616 phy=30703616 len=65536 btrfs-11569 [005] ... 37.0281: scrub_submit.part.0: devid=1 logical=298844160 phy=298844160 len=131072 ... btrfs-11569 [005] ... 37.0762: scrub_submit.part.0: devid=1 logical=322961408 phy=322961408 len=131072 btrfs-11569 [005] ... 37.0762: scrub_submit.part.0: devid=1 logical=323092480 phy=323092480 len=131072 One can see that all the reads are submitted to devid 1, even if we have specified "-r" option to avoid reading from the source device. [CAUSE] The dev-replace read mode is only set but not followed by scrub code at all. In fact, only common read path is properly following the read mode, but scrub itself has its own read path, thus not following the mode. [FIX] Here we enhance scrub_find_good_copy() to also follow the read mode. The idea is pretty simple, in the first loop, we avoid the following devices: - Missing devices This is the existing condition - The source device if the replace wants to avoid it. And if above loop found no candidate (e.g. replace a single device), then we discard the 2nd condition, and try again. Since we're here, also enhance the function scrub_find_good_copy() by: - Remove the forward declaration - Makes it return int To indicates errors, e.g. no good mirror found. - Add extra error messages Now with the same trace, "btrfs replace start -r" works as expected: btrfs-1213 [000] ... 991.9059: scrub_submit.part.0: devid=2 logical=22036480 phy=1064960 len=16384 btrfs-1213 [000] ... 991.9062: scrub_submit.part.0: devid=2 logical=30457856 phy=9486336 len=32768 btrfs-1213 [000] ... 991.9063: scrub_submit.part.0: devid=2 logical=30507008 phy=9535488 len=49152 btrfs-1213 [000] ... 991.9064: scrub_submit.part.0: devid=2 logical=30605312 phy=9633792 len=32768 btrfs-1213 [000] ... 991.9065: scrub_submit.part.0: devid=2 logical=30703616 phy=9732096 len=65536 btrfs-1213 [000] ... 991.9073: scrub_submit.part.0: devid=2 logical=298844160 phy=277872640 len=131072 btrfs-1213 [000] ... 991.9075: scrub_submit.part.0: devid=2 logical=298975232 phy=278003712 len=131072 btrfs-1213 [000] ... 991.9078: scrub_submit.part.0: devid=2 logical=299106304 phy=278134784 len=131072 ... btrfs-1213 [000] ... 991.9474: scrub_submit.part.0: devid=2 logical=318504960 phy=297533440 len=131072 btrfs-1213 [000] ... 991.9476: scrub_submit.part.0: devid=2 logical=318636032 phy=297664512 len=131072 btrfs-1213 [000] ... 991.9479: scrub_submit.part.0: devid=2 logical=318767104 phy=297795584 len=131072 Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17btrfs: fold finish_compressed_bio_write into btrfs_finish_compressed_write_workChristoph Hellwig
Fold finish_compressed_bio_write into its only caller as there is no reason to keep them separate. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17btrfs: don't clear page->mapping in btrfs_free_compressed_pagesChristoph Hellwig
No one ever set ->mapping on these pages, so don't bother clearing it. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17btrfs: factor out a btrfs_free_compressed_pages helperChristoph Hellwig
Share the code to free the compressed pages and the array to hold them into a common helper. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17btrfs: factor out a btrfs_add_compressed_bio_pages helperChristoph Hellwig
Factor out a common helper to add the compressed_bio pages to the bio that is shared by the compressed read and write path. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17btrfs: use the bbio file offset in add_ra_bio_pagesChristoph Hellwig
struct btrfs_bio now has a file_offset field set up by all submitters. Use that value combined with the bio size in add_ra_bio_pages to calculate the last offset in the bio. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17btrfs: use the bbio file offset in btrfs_submit_compressed_readChristoph Hellwig
struct btrfs_bio now has a file_offset field set up by all submitters. Use that in btrfs_submit_compressed_read instead of recalculating the value. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17btrfs: remove redundant free_extent_map in btrfs_submit_compressed_readChristoph Hellwig
em can't be non-NULL after the free_extent_map label. Also remove the now pointless clearing of em to NULL after freeing it. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17btrfs: embed a btrfs_bio into struct compressed_bioChristoph Hellwig
Embed a btrfs_bio into struct compressed_bio. This avoids potential (so far theoretical) deadlocks due to nesting of btrfs_bioset allocations for the original read bio and the compressed bio, and avoids an extra memory allocation in the I/O path. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17btrfs: replace btrfs_io_context::raid_map with a fixed u64 valueQu Wenruo
In btrfs_io_context structure, we have a pointer raid_map, which indicates the logical bytenr for each stripe. But considering we always call sort_parity_stripes(), the result raid_map[] is always sorted, thus raid_map[0] is always the logical bytenr of the full stripe. So why we waste the space and time (for sorting) for raid_map? This patch will replace btrfs_io_context::raid_map with a single u64 number, full_stripe_start, by: - Replace btrfs_io_context::raid_map with full_stripe_start - Replace call sites using raid_map[0] to use full_stripe_start - Replace call sites using raid_map[i] to compare with nr_data_stripes. The benefits are: - Less memory wasted on raid_map It's sizeof(u64) * num_stripes vs sizeof(u64). It'll always save at least one u64, and the benefit grows larger with num_stripes. - No more weird alloc_btrfs_io_context() behavior As there is only one fixed size + one variable length array. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17btrfs: use an efficient way to represent source of duplicated stripesQu Wenruo
For btrfs dev-replace, we have to duplicate writes to the source device into the target device. For non-RAID56, all writes into the same mapped ranges are sharing the same content, thus they don't really need to bother anything. (E.g. in btrfs_submit_bio() for non-RAID56 range we just submit the same write to all involved devices). But for RAID56, all stripes contain different content, thus we must have a clear mapping of which stripe is duplicated from which original stripe. Currently we use a complex way using tgtdev_map[] array, e.g: num_tgtdevs = 1 tgtdev_map[0] = 0 <- Means stripes[0] is not involved in replace. tgtdev_map[1] = 3 <- Means stripes[1] is involved in replace, and it's duplicated to stripes[3]. tgtdev_map[2] = 0 <- Means stripes[2] is not involved in replace. But this is wasting some space, and ignores one important thing for dev-replace, there is at most one running replace. Thus we can change it to a fixed array to represent the mapping: replace_nr_stripes = 1 replace_stripe_src = 1 <- Means stripes[1] is involved in replace. thus the extra stripe is a copy of stripes[1] By this we can save some space for bioc on RAID56 chunks with many devices. And we get rid of one variable sized array from bioc. Thus the patch involves the following changes: - Replace @num_tgtdevs and @tgtdev_map[] with @replace_nr_stripes and @replace_stripe_src. @num_tgtdevs is just renamed to @replace_nr_stripes. While the mapping is completely changed. - Add extra ASSERT()s for RAID56 code - Only add two more extra stripes for dev-replace cases. As we have an upper limit on how many dev-replace stripes we can have. - Unify the behavior of handle_ops_on_dev_replace() Previously handle_ops_on_dev_replace() go two different paths for WRITE and GET_READ_MIRRORS. Now unify them by always going the WRITE path first (with at most 2 replace stripes), then if we're doing GET_READ_MIRRORS and we have 2 extra stripes, just drop one stripe. - Remove the @real_stripes argument from alloc_btrfs_io_context() As we don't need the old variable length array any more. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17btrfs: reduce type width of btrfs_io_contextsQu Wenruo
That structure is our ultimate object for all __btrfs_map_block() related functions. We have some hard to understand members, like tgtdev_map, but without any comments. This patch will improve the situation: - Add extra comments for num_stripes, mirror_num, num_tgtdevs and tgtdev_map[] Especially for the last two members, add a dedicated (thus very long) comments for them, with example to explain it. - Shrink those int members to u16. In fact our on-disk format is only using u16 for num_stripes, thus no need to use int at all. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17btrfs: simplify the bioc argument for handle_ops_on_dev_replace()Qu Wenruo
There is no memory re-allocation for handle_ops_on_dev_replace(), thus we don't need to pass a btrfs_io_context pointer. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17btrfs: reduce div64 calls by limiting the number of stripes of a chunk to u32Qu Wenruo
There are quite some div64 calls inside btrfs_map_block() and its variants. Such calls are for @stripe_nr, where @stripe_nr is the number of stripes before our logical bytenr inside a chunk. However we can eliminate such div64 calls by just reducing the width of @stripe_nr from 64 to 32. This can be done because our chunk size limit is already 10G, with fixed stripe length 64K. Thus a U32 is definitely enough to contain the number of stripes. With such width reduction, we can get rid of slower div64, and extra warning for certain 32bit arch. This patch would do: - Add a new tree-checker chunk validation on chunk length Make sure no chunk can reach 256G, which can also act as a bitflip checker. - Reduce the width from u64 to u32 for @stripe_nr variables - Replace unnecessary div64 calls with regular modulo and division 32bit division and modulo are much faster than 64bit operations, and we are finally free of the div64 fear at least in those involved functions. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17btrfs: replace map_lookup->stripe_len by BTRFS_STRIPE_LENQu Wenruo
Currently btrfs doesn't support stripe lengths other than 64KiB. This is already set in the tree-checker. There is really no meaning to record that fixed value in map_lookup for now, and can all be replaced with BTRFS_STRIPE_LEN. Furthermore we can use the fix stripe length to do the following optimization: - Use BTRFS_STRIPE_LEN_SHIFT to replace some 64bit division Now we only need to do a right shift. And the value of BTRFS_STRIPE_LEN itself is already too large for bit shift, thus if we accidentally use BTRFS_STRIPE_LEN to do bit shift, a compiler warning would be triggered. Thus this bit shift optimization would be safe. - Use BTRFS_STRIPE_LEN_MASK to calculate the offset inside a stripe Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17btrfs: move all btree inode initialization into btrfs_init_btree_inodeChristoph Hellwig
Move the remaining code that deals with initializing the btree inode into btrfs_init_btree_inode instead of splitting it between that helpers and its only caller. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17btrfs: switch search_file_offset_in_bio to return boolAnand Jain
Function search_file_offset_in_bio() finds the file offset in the file_offset_ret, and we use the return value to indicate if it is successful, so use bool. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17btrfs: avoid reusing return variable in nested block in btrfs_lookup_bio_sumsAnand Jain
The function btrfs_lookup_bio_sums() and a nested if statement declare ret respectively as blk_status_t and int. There is no need to store the return value of search_file_offset_in_bio() to ret as this is a one-time call. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17btrfs: open code btrfs_csum_ptrJohannes Thumshirn
Remove btrfs_csum_ptr() and fold it into it's only caller. Reviewed-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17btrfs: raid56: no need for irqsafe lockingChristoph Hellwig
These days all the operations that take locks in the raid56.c code are run from user context (mostly workqueues). Drop all the irqsafe locking that is not required any more. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17btrfs: abort the transaction if we get an error during snapshot dropJosef Bacik
We were seeing weird errors when we were testing our btrfs backports before we had the incorrect level check fix. These errors appeared to be improper error handling, but error injection testing uncovered that the errors were a result of corruption that occurred from improper error handling during snapshot delete. With snapshot delete if we encounter any errors during walk_down or walk_up we'll simply return an error, we won't abort the transaction. This is problematic because we will be dropping references for nodes and leaves along the way, and if we fail in the middle we will leave the file system corrupt because we don't know where we left off in the drop. Fix this by making sure we abort if we hit any errors during the walk down or walk up operations, as we have no idea what operations could have been left half done at this point. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17btrfs: handle errors in walk_down_tree properlyJosef Bacik
We can get errors in walk_down_proc as we try and lookup extent info for the snapshot dropping to act on. However if we get an error we simply return 1 which indicates we're done with walking down, which will lead us to improperly continue with the snapshot drop with the incorrect information. Instead break if we get any error from walk_down_proc or do_walk_down, and handle the case of ret == 1 by returning 0, otherwise return the ret value that we have. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17btrfs: drop root refs properly when orphan cleanup failsJosef Bacik
When we mount the file system we do something like this: while (1) { lookup fs roots; for (i = 0; i < num_roots; i++) { ret = btrfs_orphan_cleanup(roots[i]); if (ret) break; btrfs_put_root(roots[i]); } } for (; i < num_roots; i++) btrfs_put_root(roots[i]); As you can see if we break in that inner loop we just go back to the outer loop and lose the fact that we have to drop references on the remaining roots we looked up. Fix this by making an out label and jumping to that on error so we don't leak a reference to the roots we looked up. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17btrfs: add missing iputs on orphan cleanup failureJosef Bacik
We missed a couple of iput()s in the orphan cleanup failure paths, add them so we don't get refcount errors. The iput needs to be done in the check and not under a common label due to the way the code is structured. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17btrfs: handle errors from btrfs_read_node_slot in splitJosef Bacik
While investigating a problem with error injection I tripped over curious behavior in the node/leaf splitting code. If we get an EIO when trying to read either the left or right leaf/node for splitting we'll simply treat the node as if it were full and continue on. The end result of this isn't too bad, we simply end up allocating a block when we may have pushed items into the adjacent blocks. However this does essentially allow us to continue to modify a file system that we've gotten errors on, either from a bad disk or csum mismatch or other corruption. This isn't particularly safe, so instead handle these btrfs_read_node_slot() usages differently. We allow you to pass in any slot, the idea being that we save some code if the slot number is outside of the range of the parent. This means we treat all errors the same, when in reality we only want to ignore -ENOENT. Fix this by changing how we call btrfs_read_node_slot(), which is to only call it for slots we know are valid. This way if we get an error back from reading the block we can properly pass the error up the chain. This was validated with the error injection testing I was doing. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17btrfs: replace BUG_ON with ASSERT in btrfs_read_node_slotJosef Bacik
In btrfs_read_node_slot() we have a BUG_ON() that can be converted to an ASSERT(), it's from an extent buffer and the level is validated at the time it's read from disk. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17btrfs: use btrfs_handle_fs_error in btrfs_fill_superJosef Bacik
While trying to track down a lost EIO problem I hit the following assertion while doing my error injection testing BTRFS warning (device nvme1n1): transaction 1609 (with 180224 dirty metadata bytes) is not committed assertion failed: !found, in fs/btrfs/disk-io.c:4456 ------------[ cut here ]------------ kernel BUG at fs/btrfs/messages.h:169! invalid opcode: 0000 [#1] PREEMPT SMP NOPTI CPU: 0 PID: 1445 Comm: mount Tainted: G W 6.2.0-rc5+ #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014 RIP: 0010:btrfs_assertfail.constprop.0+0x18/0x1a RSP: 0018:ffffb95fc3b0bc68 EFLAGS: 00010286 RAX: 0000000000000034 RBX: ffff9941c2ac2000 RCX: 0000000000000000 RDX: 0000000000000001 RSI: ffffffffb6741f7d RDI: 00000000ffffffff RBP: ffff9941c2ac2428 R08: 0000000000000000 R09: ffffb95fc3b0bb38 R10: 0000000000000003 R11: ffffffffb71438a8 R12: ffff9941c2ac2428 R13: ffff9941c2ac2450 R14: ffff9941c2ac2450 R15: 000000000002c000 FS: 00007fcea2d07800(0000) GS:ffff9941fbc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f00cc7c83a8 CR3: 000000010c686000 CR4: 0000000000350ef0 Call Trace: <TASK> close_ctree+0x426/0x48f btrfs_mount_root.cold+0x7e/0xee ? legacy_parse_param+0x2b/0x220 legacy_get_tree+0x2b/0x50 vfs_get_tree+0x29/0xc0 vfs_kern_mount.part.0+0x73/0xb0 btrfs_mount+0x11d/0x3d0 ? legacy_parse_param+0x2b/0x220 legacy_get_tree+0x2b/0x50 vfs_get_tree+0x29/0xc0 path_mount+0x438/0xa40 __x64_sys_mount+0xe9/0x130 do_syscall_64+0x3e/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc This is because the error injection did an EIO for the root inode lookup and we simply jumped to closing the ctree. However because we didn't mark the file system as having an error we skipped all of the broken transaction cleanup stuff, and thus triggered this ASSERT(). Fix this by calling btrfs_handle_fs_error() in this case so we have the error set on the file system. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17wifi: mt76: connac: add nss calculation into mt76_connac2_mac_tx_rate_val()Ryder Lee
Take nss calculation into account since this function always wrongly returns 0. Signed-off-by: Ryder Lee <ryder.lee@mediatek.com> Signed-off-by: Felix Fietkau <nbd@nbd.name>
2023-04-17wifi: mt76: connac: fix txd multicast rate settingRyder Lee
The vif->bss_conf.mcast_rate should be applied to multicast data frame only. Fixes: 182071cdd594 ("mt76: connac: move connac2_mac_write_txwi in mt76_connac module") Signed-off-by: Ryder Lee <ryder.lee@mediatek.com> Signed-off-by: Felix Fietkau <nbd@nbd.name>
2023-04-17wifi: mt76: mt7921e: stop chip reset worker in unregister hookQuan Zhou
If the chip reset worker is triggered during the remove process, the chip DMA may not be properly pushed back to the idle state. This can lead to corruption of the DMA flow due to the chip reset. Therefore, it is necessary to stop the chip reset before the DMA is finalized. To avoid resetting the chip after the reset worker is cancelled, use __mt7921_mcu_drv_pmctrl() instead of mt7921_mcu_drv_pmctrl(). It is safe to ignore the pm mutex because the pm worker and wake worker have already been cancelled. Fixes: 033ae79b3830 ("mt76: mt7921: refactor init.c to be bus independent") Co-developed-by: Sean Wang <sean.wang@mediatek.com> Signed-off-by: Sean Wang <sean.wang@mediatek.com> Co-developed-by: Deren Wu <deren.wu@mediatek.com> Signed-off-by: Deren Wu <deren.wu@mediatek.com> Co-developed-by: Wang Zhao <wang.zhao@mediatek.com> Signed-off-by: Wang Zhao <wang.zhao@mediatek.com> Signed-off-by: Quan Zhou <quan.zhou@mediatek.com> Signed-off-by: Felix Fietkau <nbd@nbd.name>
2023-04-17wifi: mt76: mt7921e: improve reliability of dma resetQuan Zhou
The hardware team has advised the driver that it is necessary to first put WFDMA into an idle state before resetting the WFDMA. Otherwise, the WFDMA may enter an unknown state where it cannot be polled with the right state successfully. To ensure that the DMA can work properly while a stressful cold reboot test was being made, we have reordered the programming sequence in the driver based on the hardware team's guidance. The patch would modify the WFDMA disabling flow from "DMA reset -> disabling DMASHDL -> disabling WFDMA -> polling and waiting until DMA idle" to "disabling WFDMA -> polling and waiting for DMA idle -> disabling DMASHDL -> DMA reset. Where he polling and waiting until WFDMA is idle is coordinated with the operation of disabling WFDMA. Even while WFDMA is being disabled, it can still handle Tx/Rx requests. The additional polling allows sufficient time for WFDMA to process the last T/Rx request. When the idle state of WFDMA is reached, it is a reliable indication that DMASHDL is also idle to ensure it is safe to disable it and perform the DMA reset. Fixes: 0a1059d0f060 ("mt76: mt7921: move mt7921_dma_reset in dma.c") Co-developed-by: Sean Wang <sean.wang@mediatek.com> Signed-off-by: Sean Wang <sean.wang@mediatek.com> Co-developed-by: Deren Wu <deren.wu@mediatek.com> Signed-off-by: Deren Wu <deren.wu@mediatek.com> Co-developed-by: Wang Zhao <wang.zhao@mediatek.com> Signed-off-by: Wang Zhao <wang.zhao@mediatek.com> Signed-off-by: Quan Zhou <quan.zhou@mediatek.com> Signed-off-by: Felix Fietkau <nbd@nbd.name>
2023-04-17wifi: mt76: mt7921: fix missing unwind goto in `mt7921u_probe`Jiefeng Li
`mt7921u_dma_init` can only return zero or negative number according to its definition. When it returns non-zero number, there exists an error and this function should handle this error rather than return directly. Fixes: 0d2afe09fad5 ("mt76: mt7921: add mt7921u driver") Signed-off-by: Jiefeng Li <jiefeng_li@hust.edu.cn> Reviewed-by: Dongliang Mu <dzm91@hust.edu.cn> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Signed-off-by: Felix Fietkau <nbd@nbd.name>
2023-04-17mt76: mt7921: fix kernel panic by accessing unallocated eeprom.dataSean Wang
The MT7921 driver no longer uses eeprom.data, but the relevant code has not been removed completely since commit 16d98b548365 ("mt76: mt7921: rely on mcu_get_nic_capability"). This could result in potential invalid memory access. To fix the kernel panic issue in mt7921, it is necessary to avoid accessing unallocated eeprom.data which can lead to invalid memory access. Furthermore, it is possible to entirely eliminate the mt7921_mcu_parse_eeprom function and solely depend on mt7921_mcu_parse_response to divide the RxD header. [2.702735] BUG: kernel NULL pointer dereference, address: 0000000000000550 [2.702740] #PF: supervisor write access in kernel mode [2.702741] #PF: error_code(0x0002) - not-present page [2.702743] PGD 0 P4D 0 [2.702747] Oops: 0002 [#1] PREEMPT SMP NOPTI [2.702755] RIP: 0010:mt7921_mcu_parse_response+0x147/0x170 [mt7921_common] [2.702758] RSP: 0018:ffffae7c00fef828 EFLAGS: 00010286 [2.702760] RAX: ffffa367f57be024 RBX: ffffa367cc7bf500 RCX: 0000000000000000 [2.702762] RDX: 0000000000000550 RSI: 0000000000000000 RDI: ffffa367cc7bf500 [2.702763] RBP: ffffae7c00fef840 R08: ffffa367cb167000 R09: 0000000000000005 [2.702764] R10: 0000000000000000 R11: ffffffffc04702e4 R12: ffffa367e8329f40 [2.702766] R13: 0000000000000000 R14: 0000000000000001 R15: ffffa367e8329f40 [2.702768] FS: 000079ee6cf20c40(0000) GS:ffffa36b2f940000(0000) knlGS:0000000000000000 [2.702769] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [2.702775] CR2: 0000000000000550 CR3: 00000001233c6004 CR4: 0000000000770ee0 [2.702776] PKRU: 55555554 [2.702777] Call Trace: [2.702782] mt76_mcu_skb_send_and_get_msg+0xc3/0x11e [mt76 <HASH:1bc4 5>] [2.702785] mt7921_run_firmware+0x241/0x853 [mt7921_common <HASH:6a2f 6>] [2.702789] mt7921e_mcu_init+0x2b/0x56 [mt7921e <HASH:d290 7>] [2.702792] mt7921_register_device+0x2eb/0x5a5 [mt7921_common <HASH:6a2f 6>] [2.702795] ? mt7921_irq_tasklet+0x1d4/0x1d4 [mt7921e <HASH:d290 7>] [2.702797] mt7921_pci_probe+0x2d6/0x319 [mt7921e <HASH:d290 7>] [2.702799] pci_device_probe+0x9f/0x12a Fixes: 16d98b548365 ("mt76: mt7921: rely on mcu_get_nic_capability") Signed-off-by: Sean Wang <sean.wang@mediatek.com> Signed-off-by: Felix Fietkau <nbd@nbd.name>
2023-04-17wifi: mt76: move mcu_uni_event and mcu_reg_event in common codeLorenzo Bianconi
mcu_uni_event and mcu_reg_event structs are shared between mt7921 and mt7615 drivers, so move them in connac lib. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Felix Fietkau <nbd@nbd.name>
2023-04-17wifi: mt76: mt7996: enable coredump supportRyder Lee
Host triggered and catastrophic event triggered firmware core dumping for basic firmware issues triage, including state reporting, function calltrace and MCU memory dump. Signed-off-by: Ryder Lee <ryder.lee@mediatek.com> Signed-off-by: Felix Fietkau <nbd@nbd.name>
2023-04-17wifi: mt76: mt7996: add full system reset knobs into debugfsRyder Lee
Add testing points into debugfs to trigger firmware assert and enable full system recovery. Also rename knob "fw_ser" to a clear-cut name "sys_recovery". Co-developed-by: Bo Jiao <Bo.Jiao@mediatek.com> Signed-off-by: Bo Jiao <Bo.Jiao@mediatek.com> Signed-off-by: Ryder Lee <ryder.lee@mediatek.com> Signed-off-by: Felix Fietkau <nbd@nbd.name>
2023-04-17wifi: mt76: mt7996: enable full system reset supportBo Jiao
Add mt7996_reset() and refactor mt7996_mac_reset_work() to support full system recovery. Signed-off-by: Bo Jiao <Bo.Jiao@mediatek.com> Signed-off-by: Ryder Lee <ryder.lee@mediatek.com> Signed-off-by: Felix Fietkau <nbd@nbd.name>
2023-04-17wifi: mt76: mt7921: enable p2p supportSean Wang
Introduce p2p-go/p2p-client support to mt7921 driver CONNECTION_P2P_GC/GO is not supported with the current firmware so we added mt76_dev to mt76_connac_mcu_sta_basic_tlv signature to use CONNECTION_INFRA_STA/AP instead for p2p-client and p2p-go respectively to make it work. Signed-off-by: Sean Wang <sean.wang@mediatek.com> Signed-off-by: Felix Fietkau <nbd@nbd.name>
2023-04-17Merge tag 'scmi-updates-6.4' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into soc/drivers Arm SCMI updates for v6.4 The main and only new addition this time around is the support for unidirectional mailbox channels. SCMI communicates between the agent and the platform using one bidirectional 'a2p' channel used by the agent to send SCMI commands and synchronously receive the related replies, and an optional 'p2a' unidirectional channel used to asynchronously receive delayed responses and notifications emitted from the platform. In order to support platforms that support only unidirectional mailbox hardware channels, the existing bindings are extended to support the same. Both bidirectional and unidirectional channels support for the SCMI mailbox can coexist. The correct and effective combination of defined 'mboxes' and 'shmem' descriptors determines the type of the mailbox channel. This also contains a fix for the transfers allocation on Rx channel especially when the base protocol doesn't use Rx channel while some of the protocols can have dedicated Rx channels. * tag 'scmi-updates-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: firmware: arm_scmi: Add support for unidirectional mailbox channels dt-bindings: firmware: arm,scmi: Support mailboxes unidirectional channels firmware: arm_scmi: Fix xfers allocation on Rx channel firmware: arm_scmi: Use the bitmap API to allocate bitmaps firmware: arm_scmi: Fix device node validation for mailbox transport firmware: arm_scmi: Fix raw coexistence mode behaviour on failure path firmware: arm_scmi: Remove duplicate include header inclusion firmware: arm_scmi: Return a literal instead of a variable firmware: arm_scmi: Clean up a return statement in scmi_probe Link: https://lore.kernel.org/r/20230417145743.1904318-1-sudeep.holla@arm.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-04-17Merge tag 'vexpress-update-6.4' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into soc/drivers Armv7/v8 Vexpress update for v6.4 Addition of explicit of_platform.h header inclusion and removal of soon to be removed of_device.h * tag 'vexpress-update-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: bus: vexpress-config: Add explicit of_platform.h include Link: https://lore.kernel.org/r/20230417145724.1904259-1-sudeep.holla@arm.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-04-17Merge tag 'memory-controller-drv-6.4-2' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-mem-ctrl into soc/drivers Memory controller drivers for v6.4, part two 1. Tegra210 EMC: correct reading of MR18 register. 2. MediaTek SMI: add support for MT8365. * tag 'memory-controller-drv-6.4-2' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-mem-ctrl: memory: mtk-smi: mt8365: Add SMI Support dt-bindings: memory-controllers: mediatek,smi-larb: add mt8365 dt-bindings: memory-controllers: mediatek,smi-common: add mt8365 memory: tegra: read values from correct device Link: https://lore.kernel.org/r/20230416143248.308942-1-krzysztof.kozlowski@linaro.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-04-17wifi: mt76: mt7921: Replace fake flex-arrays with flexible-array membersGustavo A. R. Silva
Zero-length arrays as fake flexible arrays are deprecated and we are moving towards adopting C99 flexible-array members instead. Address the following warnings found with GCC-13 and -fstrict-flex-arrays=3 enabled: drivers/net/wireless/mediatek/mt76/mt7921/acpi_sar.c:266:25: warning: array subscript 0 is outside array bounds of ‘struct mt7921_asar_dyn_limit_v2[0]’ [-Warray-bounds=] drivers/net/wireless/mediatek/mt76/mt7921/acpi_sar.c:263:25: warning: array subscript 0 is outside array bounds of ‘struct mt7921_asar_dyn_limit[0]’ [-Warray-bounds=] drivers/net/wireless/mediatek/mt76/mt7921/acpi_sar.c:223:28: warning: array subscript <unknown> is outside array bounds of ‘struct mt7921_asar_geo_limit_v2[0]’ [-Warray-bounds=] drivers/net/wireless/mediatek/mt76/mt7921/acpi_sar.c:220:28: warning: array subscript <unknown> is outside array bounds of ‘struct mt7921_asar_geo_limit[0]’ [-Warray-bounds=] drivers/net/wireless/mediatek/mt76/mt7921/acpi_sar.c:334:37: warning: array subscript i is outside array bounds of ‘u8[0]’ {aka ‘unsigned char[]’} [-Warray-bounds=] Notice that the DECLARE_FLEX_ARRAY() helper allows for flexible-array members in unions. This helps with the ongoing efforts to tighten the FORTIFY_SOURCE routines on memcpy() and help us make progress towards globally enabling -fstrict-flex-arrays=3 [1]. Link: https://github.com/KSPP/linux/issues/21 Link: https://github.com/KSPP/linux/issues/272 Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-October/602902.html [1] Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Felix Fietkau <nbd@nbd.name>
2023-04-17wifi: mt76: Replace zero-length array with flexible-array memberGustavo A. R. Silva
Zero-length arrays are deprecated [1] and have to be replaced by C99 flexible-array members. This helps with the ongoing efforts to tighten the FORTIFY_SOURCE routines on memcpy() and help to make progress towards globally enabling -fstrict-flex-arrays=3 [2] Link: https://github.com/KSPP/linux/issues/78 [1] Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-October/602902.html [2] Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Felix Fietkau <nbd@nbd.name>
2023-04-17wifi: mt76: mt7921: add Netgear AXE3000 (A8000) supportReese Russell
Issue: Though the Netgear AXE3000 (A8000) is based on the mt7921 chipset because of the unique USB VID:PID combination this device does not initialize/register. Thus making it not plug and play. Fix: Adds support for the Netgear AXE3000 (A8000) based on the Mediatek mt7921au chipset. The method of action is adding the USD VID/PID pair to the mt7921u_device_table[] array. Notes: A retail sample of the Netgear AXE3000 (A8000) yeilds the following from lsusb D 0846:9060 NetGear, Inc. Wireless_Device. This pair 0846:9060 VID:PID has been reported by other users on Github. Signed-off-by: Reese Russell <git@qrsnap.io> Signed-off-by: Felix Fietkau <nbd@nbd.name>