summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-06-19btrfs: remove level argument from btrfs_set_block_flagsJosef Bacik
We just pass in btrfs_header_level(eb) for the level, and we're passing in the eb already, so simply get the level from the eb inside of btrfs_set_block_flags. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19btrfs: move btrfs_check_trunc_cache_free_space into block-rsv.cJosef Bacik
This is completely related to block rsv's, move it out of the free space cache code and into block-rsv.c. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19btrfs: scrub: use recovered data stripes as cache to avoid unnecessary readQu Wenruo
For P/Q stripe scrub, we have quite some duplicated read IO: - Data stripes read for verification This is triggered by the scrub_submit_initial_read() inside scrub_raid56_parity_stripe(). - Data stripes read (again) for P/Q stripe verification This is triggered by scrub_assemble_read_bios() from scrub_rbio(). Although we can have hit rbio cache and avoid unnecessary read, the chance is very low, as scrub would easily flush the whole rbio cache. This means, even we're just scrubbing a single P/Q stripe, we would read the data stripes twice for the best case scenario. If we need to recover some data stripes, it would cause more reads on the same data stripes, again and again. However before we call raid56_parity_submit_scrub_rbio() we already have all data stripes repaired and their contents ready to use. But RAID56 cache is unaware about the scrub cache, thus RAID56 layer itself still needs to re-read the data stripes. To avoid such cache miss, this patch would: - Introduce a new helper, raid56_parity_cache_data_pages() This function would grab the pages from an array, and copy the content to the rbio, marking all the involved sectors uptodate. The page copy is unavoidable because of the cache pages of rbio are all self managed, thus can not utilize outside pages without screwing up the lifespan. - Use the repaired data stripes as cache inside scrub_raid56_parity_stripe() By this, we ensure all the data sectors of the scrub rbio are already uptodate, and no need to read them again from disk. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19btrfs: assert tree lock is held when removing free space entriesFilipe Manana
Removing a free space entry from an in memory space cache requires having the corresponding btrfs_free_space_ctl's 'tree_lock' held. We have several code paths that remove an entry, so add assertions where appropriate to verify we are holding the lock, as the lock is acquired by some other function up in the call chain, which makes it easy to miss in the future. Note: for this to work we need to lock the local btrfs_free_space_ctl at load_free_space_cache(), which was not being done because it's local, declared on the stack, so no other task has access to it. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19btrfs: assert tree lock is held when linking free spaceFilipe Manana
When linking a free space entry, at link_free_space(), the caller should be holding the spinlock 'tree_lock' of the given btrfs_free_space_ctl argument, which is necessary for manipulating the red black tree of free space entries (done by tree_insert_offset(), which already asserts the lock is held) and for manipulating the 'free_space', 'free_extents', 'discardable_extents' and 'discardable_bytes' counters of the given struct btrfs_free_space_ctl. So assert that the spinlock 'tree_lock' of the given btrfs_free_space_ctl is held by the current task. We have multiple code paths that end up calling link_free_space(), and all currently take the lock before calling it. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19btrfs: assert tree lock is held when searching for free space entriesFilipe Manana
When searching for a free space entry by offset, at tree_search_offset(), we are supposed to have the btrfs_free_space_ctl's 'tree_lock' held, so assert that. We have multiple callers of tree_search_offset(), and all currently hold the necessary lock before calling it. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19btrfs: assert proper locks are held at tree_insert_offset()Filipe Manana
There are multiple code paths leading to tree_insert_offset(), and each path takes the necessary locks before tree_insert_offset() is called, since they do other things that require those locks to be held. This makes it easy to miss the locking somewhere, so make tree_insert_offset() assert that the required locks are being held by the calling task. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19btrfs: simplify arguments to tree_insert_offset()Filipe Manana
For the in-memory component of space caching (free space cache and free space tree), three of the arguments passed to tree_insert_offset() can always be taken from the new free space entry that we are about to add. So simplify tree_insert_offset() to take the new entry instead of the 'offset', 'node' and 'bitmap' arguments. This will also allow to make further changes simpler. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19btrfs: use precomputed end offsets at do_trimming()Filipe Manana
The are two computations of end offsets at do_trimming() that are not necessary, as they were previously computed and stored in local const variables. So just use the variables instead, to make the source code shorter and easier to read. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19btrfs: avoid searching twice for previous node when merging free space entriesFilipe Manana
At try_merge_free_space(), avoid calling twice rb_prev() to find the previous node, as that requires looping through the red black tree, so store the result of the rb_prev() call and then use it. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19btrfs: avoid extra memory allocation when copying free space cacheFilipe Manana
At copy_free_space_cache(), we add a new entry to the block group's ctl before we free the entry from the temporary ctl. Adding a new entry requires the allocation of a new struct btrfs_free_space, so we can avoid a temporary extra allocation by freeing the entry from the temporary ctl before we add a new entry to the main ctl, which possibly also reduces the chances for a memory allocation failure in case of very high memory pressure. So just do that. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19btrfs: simplify transid initialization in btrfs_ioctl_wait_syncTom Rix
A small code simplification, move the default value of transid to its initialization and remove the else-statement. Signed-off-by: Tom Rix <trix@redhat.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19btrfs: output affected files when relocation failsQu Wenruo
[PROBLEM] When relocation fails (mostly due to checksum mismatch), we only got very cryptic error messages like: BTRFS info (device dm-4): relocating block group 13631488 flags data BTRFS warning (device dm-4): csum failed root -9 ino 257 off 0 csum 0x373e1ae3 expected csum 0x98757625 mirror 1 BTRFS error (device dm-4): bdev /dev/mapper/test-scratch1 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0 BTRFS info (device dm-4): balance: ended with status: -5 The end user has to decipher the above messages and use various tools to locate the affected files and find a way to fix the problem (mostly deleting the file). This is not an easy work even for experienced developer, not to mention the end users. [SCRUB IS DOING BETTER] By contrast, scrub is providing much better error messages: BTRFS error (device dm-4): unable to fixup (regular) error at logical 13631488 on dev /dev/mapper/test-scratch1 physical 13631488 BTRFS warning (device dm-4): checksum error at logical 13631488 on dev /dev/mapper/test-scratch1, physical 13631488, root 5, inode 257, offset 0, length 4096, links 1 (path: file) BTRFS info (device dm-4): scrub: finished on devid 1 with status: 0 Which provides the affected files directly to the end user. [IMPROVEMENT] Instead of the generic data checksum error messages, which is not doing a good job for data reloc inodes, this patch introduce a scrub like backref walking based solution. When a sector fails its checksum for data reloc inode, we go the following workflow: - Get the real logical bytenr For data reloc inode, the file offset is the offset inside the block group. Thus the real logical bytenr is @file_off + @block_group->start. - Do an extent type check If it's tree blocks it's much easier to handle, just go through all the tree block backref. - Do a backref walk and inode path resolution for data extents This is mostly the same as scrub. But unfortunately we can not reuse the same function as the output format is different. Now the new output would be more user friendly: BTRFS info (device dm-4): relocating block group 13631488 flags data BTRFS warning (device dm-4): csum failed root -9 ino 257 off 0 logical 13631488 csum 0x373e1ae3 expected csum 0x98757625 mirror 1 BTRFS warning (device dm-4): checksum error at logical 13631488 mirror 1 root 5 inode 257 offset 0 length 4096 links 1 (path: file) BTRFS error (device dm-4): bdev /dev/mapper/test-scratch1 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0 BTRFS info (device dm-4): balance: ended with status: -5 Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19btrfs: remove hipri_workers workqueueChristoph Hellwig
Now that btrfs_wq_submit_bio is never called for synchronous I/O, the hipri_workers workqueue is not used anymore and can be removed. Reviewed-by: Chris Mason <clm@fb.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19btrfs: determine synchronous writers from bio or writeback controlChristoph Hellwig
The writeback_control structure already passes down the information about a writeback being synchronous from the core VM code, and thus information is propagated into the bio REQ_SYNC flag through the wbc_to_write_flags helper. Use that information to decide if checksums calculation is offloaded to a workqueue instead of btrfs_inode::sync_writers field that not only bloats the inode but also has too wide scope, being inode wide instead of limited to the actual writeback request. The sync writes were set in: - btrfs_do_write_iter - regular IO, sync status is set - start_ordered_ops - ordered write start, writeback with WB_SYNC_ALL mode - btrfs_write_marked_extents - write marked extents, writeback with WB_SYNC_ALL mode Reviewed-by: Chris Mason <clm@fb.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> [ update changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19btrfs: submit IO synchronously for fast checksum implementationsChristoph Hellwig
Most modern hardware supports very fast accelerated crc32c calculation. If that is supported the CPU overhead of the checksum calculation is very limited, and offloading the calculation to special worker threads has a lot of overhead for no gain. E.g. on an Intel Optane device is actually very much slows down even 1M buffered writes with fio: Unpatched: write: IOPS=3316, BW=3316MiB/s (3477MB/s)(200GiB/61757msec); 0 zone resets With synchronous CRCs: write: IOPS=4882, BW=4882MiB/s (5119MB/s)(200GiB/41948msec); 0 zone resets With a lot of variation during the unpatched run going down as low as 1100MB/s, while the synchronous CRC version has about the same peak write speed but much lower dips, and fewer kworkers churning around. Both tests had fio saturated at 100% CPU. (thanks to Jens Axboe via Chris Mason for the benchmarking) Reviewed-by: Chris Mason <clm@fb.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19btrfs: use SECTOR_SHIFT to convert LBA to physical offsetAnand Jain
Using SECTOR_SHIFT to convert LBA to physical address makes it more readable. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19btrfs: use SECTOR_SHIFT to convert physical offset to LBAAnand Jain
Use SECTOR_SHIFT while converting a physical address to an LBA, makes it more readable. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19btrfs: improve leaf dump and error handlingQu Wenruo
Improve the leaf dump behavior by: - Always dump the leaf first, then the error message - Output the slot number if possible Especially in __btrfs_free_extent() the leaf dump of extent tree can be pretty large. With an extra slot number it's much easier to locate the problem. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19btrfs: print-tree: pass const extent buffer pointerQu Wenruo
Since print-tree infrastructure only prints the content of a tree block, we can make them to accept const extent buffer pointer. This removes a forced type convert in extent-tree, where we convert a const extent buffer pointer to regular one, just to avoid compiler warning. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19btrfs: export bitmap_test_range_all_{set,zero}Naohiro Aota
bitmap_test_range_all_{set,zero} defined in subpage.c are useful for other components. Move them to misc.h and use them in zoned.c. Also, as find_next{,_zero}_bit take/return "unsigned long" instead of "unsigned int", convert the type to "unsigned long". While at it, also rewrite the "if (...) return true; else return false;" pattern and add const to the input bitmap. Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19btrfs: tag as unlikely the key comparison when checking sibling keysFilipe Manana
When checking siblings keys, before moving keys from one node/leaf to a sibling node/leaf, it's very unexpected to have the last key of the left sibling greater than or equals to the first key of the right sibling, as that means we have a (serious) corruption that breaks the key ordering properties of a b+tree. Since this is unexpected, surround the comparison with the unlikely macro, which helps the compiler generate better code for the most expected case (no existing b+tree corruption). This is also what we do for other unexpected cases of invalid key ordering (like at btrfs_set_item_key_safe()). Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19btrfs: make btrfs_free_device() staticFilipe Manana
The function btrfs_free_device() is never used outside of volumes.c, so make it static and remove its prototype declaration at volumes.h. Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19btrfs: don't commit transaction for every subvol createSweet Tea Dorminy
Recently a Meta-internal workload encountered subvolume creation taking up to 2s each, significantly slower than directory creation. As they were hoping to be able to use subvolumes instead of directories, and were looking to create hundreds, this was a significant issue. After Josef investigated, it turned out to be due to the transaction commit currently performed at the end of subvolume creation. This change improves the workload by not doing transaction commit for every subvolume creation, and merely requiring a transaction commit on fsync. In the worst case, of doing a subvolume create and fsync in a loop, this should require an equal amount of time to the current scheme; and in the best case, the internal workload creating hundreds of subvolumes before fsyncing is greatly improved. While it would be nice to be able to use the log tree and use the normal fsync path, log tree replay can't deal with new subvolume inodes presently. It's possible that there's some reason that the transaction commit is necessary for correctness during subvolume creation; however, git logs indicate that the commit dates back to the beginning of subvolume creation, and there are no notes on why it would be necessary. Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Neal Gompa <neal@gompa.dev> Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19btrfs: unexport btrfs_prev_leaf()Filipe Manana
btrfs_prev_leaf() is not used outside ctree.c, so there's no need to export it at ctree.h - just make it static at ctree.c and move its definition above btrfs_search_slot_for_read(), since that function calls it. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19ASoC: intel: sof_sdw: Fixup typo in device link checkingCharles Keepax
The loop checking for multiple different devices on a single sdw link contains a typo accidentally using i twice instead of j. Correct to the correct index variable. Fixes: dc5a3e60a4b5 ("ASoC: Intel: sof_sdw: append codec type to dai link name") Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://lore.kernel.org/r/20230614142116.1059677-1-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-06-19mmc: usdhi60rol0: fix deferred probingSergey Shtylyov
The driver overrides the error codes returned by platform_get_irq_byname() to -ENODEV, so if it returns -EPROBE_DEFER, the driver will fail the probe permanently instead of the deferred probing. Switch to propagating error codes upstream. Fixes: 9ec36cafe43b ("of/irq: do irq resolution in platform_get_irq") Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Link: https://lore.kernel.org/r/20230617203622.6812-13-s.shtylyov@omp.ru Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-06-19mmc: sunxi: fix deferred probingSergey Shtylyov
The driver overrides the error codes and IRQ0 returned by platform_get_irq() to -EINVAL, so if it returns -EPROBE_DEFER, the driver will fail the probe permanently instead of the deferred probing. Switch to propagating the error codes upstream. Since commit ce753ad1549c ("platform: finally disallow IRQ0 in platform_get_irq() and its ilk") IRQ0 is no longer returned by those APIs, so we now can safely ignore it... Fixes: 2408a08583d2 ("mmc: sunxi-mmc: Handle return value of platform_get_irq") Cc: stable@vger.kernel.org # v5.19+ Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com> Link: https://lore.kernel.org/r/20230617203622.6812-12-s.shtylyov@omp.ru Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-06-19mmc: sh_mmcif: fix deferred probingSergey Shtylyov
The driver overrides the error codes returned by platform_get_irq() to -ENXIO, so if it returns -EPROBE_DEFER, the driver will fail the probe permanently instead of the deferred probing. Switch to propagating the error codes upstream. Fixes: 9ec36cafe43b ("of/irq: do irq resolution in platform_get_irq") Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Link: https://lore.kernel.org/r/20230617203622.6812-11-s.shtylyov@omp.ru Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-06-19mmc: sdhci-spear: fix deferred probingSergey Shtylyov
The driver overrides the error codes and IRQ0 returned by platform_get_irq() to -EINVAL, so if it returns -EPROBE_DEFER, the driver will fail the probe permanently instead of the deferred probing. Switch to propagating the error codes upstream. Since commit ce753ad1549c ("platform: finally disallow IRQ0 in platform_get_irq() and its ilk") IRQ0 is no longer returned by those APIs, so we now can safely ignore it... Fixes: 682798a596a6 ("mmc: sdhci-spear: Handle return value of platform_get_irq") Cc: stable@vger.kernel.org # v5.19+ Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20230617203622.6812-10-s.shtylyov@omp.ru Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-06-19mmc: sdhci-acpi: fix deferred probingSergey Shtylyov
The driver overrides the error codes returned by platform_get_irq() to -EINVAL, so if it returns -EPROBE_DEFER, the driver will fail the probe permanently instead of the deferred probing. Switch to propagating the error codes upstream. Fixes: 1b7ba57ecc86 ("mmc: sdhci-acpi: Handle return value of platform_get_irq") Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20230617203622.6812-9-s.shtylyov@omp.ru Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-06-19mmc: owl: fix deferred probingSergey Shtylyov
The driver overrides the error codes returned by platform_get_irq() to -EINVAL, so if it returns -EPROBE_DEFER, the driver will fail the probe permanently instead of the deferred probing. Switch to propagating the error codes upstream. Fixes: ff65ffe46d28 ("mmc: Add Actions Semi Owl SoCs SD/MMC driver") Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Link: https://lore.kernel.org/r/20230617203622.6812-8-s.shtylyov@omp.ru Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-06-19mmc: omap_hsmmc: fix deferred probingSergey Shtylyov
The driver overrides the error codes returned by platform_get_irq() to -ENXIO, so if it returns -EPROBE_DEFER, the driver will fail the probe permanently instead of the deferred probing. Switch to propagating the error codes upstream. Fixes: 9ec36cafe43b ("of/irq: do irq resolution in platform_get_irq") Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Link: https://lore.kernel.org/r/20230617203622.6812-7-s.shtylyov@omp.ru Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-06-19mmc: omap: fix deferred probingSergey Shtylyov
The driver overrides the error codes returned by platform_get_irq() to -ENXIO, so if it returns -EPROBE_DEFER, the driver will fail the probe permanently instead of the deferred probing. Switch to propagating the error codes upstream. Fixes: 9ec36cafe43b ("of/irq: do irq resolution in platform_get_irq") Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Link: https://lore.kernel.org/r/20230617203622.6812-6-s.shtylyov@omp.ru Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-06-19mmc: mvsdio: fix deferred probingSergey Shtylyov
The driver overrides the error codes returned by platform_get_irq() to -ENXIO, so if it returns -EPROBE_DEFER, the driver will fail the probe permanently instead of the deferred probing. Switch to propagating the error codes upstream. Fixes: 9ec36cafe43b ("of/irq: do irq resolution in platform_get_irq") Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Link: https://lore.kernel.org/r/20230617203622.6812-5-s.shtylyov@omp.ru Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-06-19mmc: mtk-sd: fix deferred probingSergey Shtylyov
The driver overrides the error codes returned by platform_get_irq() to -EINVAL, so if it returns -EPROBE_DEFER, the driver will fail the probe permanently instead of the deferred probing. Switch to propagating the error codes upstream. Fixes: 208489032bdd ("mmc: mediatek: Add Mediatek MMC driver") Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Link: https://lore.kernel.org/r/20230617203622.6812-4-s.shtylyov@omp.ru Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-06-19mmc: meson-gx: fix deferred probingSergey Shtylyov
The driver overrides the error codes and IRQ0 returned by platform_get_irq() to -EINVAL, so if it returns -EPROBE_DEFER, the driver will fail the probe permanently instead of the deferred probing. Switch to propagating the error codes upstream. Since commit ce753ad1549c ("platform: finally disallow IRQ0 in platform_get_irq() and its ilk") IRQ0 is no longer returned by those APIs, so we now can safely ignore it... Fixes: cbcaac6d7dd2 ("mmc: meson-gx-mmc: Fix platform_get_irq's error checking") Cc: stable@vger.kernel.org # v5.19+ Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20230617203622.6812-3-s.shtylyov@omp.ru Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-06-19mmc: bcm2835: fix deferred probingSergey Shtylyov
The driver overrides the error codes and IRQ0 returned by platform_get_irq() to -EINVAL, so if it returns -EPROBE_DEFER, the driver will fail the probe permanently instead of the deferred probing. Switch to propagating the error codes upstream. Since commit ce753ad1549c ("platform: finally disallow IRQ0 in platform_get_irq() and its ilk") IRQ0 is no longer returned by those APIs, so we now can safely ignore it... Fixes: 660fc733bd74 ("mmc: bcm2835: Add new driver for the sdhost controller.") Cc: stable@vger.kernel.org # v5.19+ Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Link: https://lore.kernel.org/r/20230617203622.6812-2-s.shtylyov@omp.ru Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-06-19mmc: litex_mmc: set PROBE_PREFER_ASYNCHRONOUSJisheng Zhang
mmc host drivers should have enabled the asynchronous probe option, but it seems like we didn't set it for litex_mmc when introducing litex mmc support, so let's set it now. Tested with linux-on-litex-vexriscv on sipeed tang nano 20K fpga. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Acked-by: Gabriel Somlo <gsomlo@gmail.com> Fixes: 92e099104729 ("mmc: Add driver for LiteX's LiteSDCard interface") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230617085319.2139-1-jszhang@kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-06-19Merge tag 'mlx5-fixes-2023-06-16' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-fixes-2023-06-16 This series provides bug fixes to mlx5 driver. Please pull and let me know if there is any problem. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-19net: qca_spi: Avoid high load if QCA7000 is not availableStefan Wahren
In case the QCA7000 is not available via SPI (e.g. in reset), the driver will cause a high load. The reason for this is that the synchronization is never finished and schedule() is never called. Since the synchronization is not timing critical, it's safe to drop this from the scheduling condition. Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com> Fixes: 291ab06ecf67 ("net: qualcomm: new Ethernet over SPI driver for QCA7000") Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-18Linux 6.4-rc7v6.4-rc7Linus Torvalds
2023-06-18Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Four fixes, all in drivers: three fairly obvious small ones and a large one in aacraid to add block queue completion mapping and fix a CPU offline hang" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: lpfc: Fix incorrect big endian type assignment in bsg loopback path scsi: target: core: Fix error path in target_setup_session() scsi: storvsc: Always set no_report_opcodes scsi: aacraid: Reply queue mapping to CPUs based on IRQ affinity
2023-06-18Merge tag 'ata-6.4-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata Pull ata fix from Damien Le Moal: - Avoid deadlocks on resume from sleep by delaying scsi rescan until the scsi device is also fully resumed. * tag 'ata-6.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: ata: libata-scsi: Avoid deadlock on rescan after device resume
2023-06-18Merge tag 'parisc-for-6.4-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc fix from Helge Deller: - Drop redundant register definitions to fix build with latest binutils * tag 'parisc-for-6.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Delete redundant register definitions in <asm/assembly.h>
2023-06-18net: phy: Manual remove LEDs to ensure correct orderingAndrew Lunn
If the core is left to remove the LEDs via devm_, it is performed too late, after the PHY driver is removed from the PHY. This results in dereferencing a NULL pointer when the LED core tries to turn the LED off before destroying the LED. Manually unregister the LEDs at a safe point in phy_remove. Cc: stable@vger.kernel.org Reported-by: Florian Fainelli <f.fainelli@gmail.com> Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Fixes: 01e5b728e9e4 ("net: phy: Add a binding for PHY LEDs") Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-18mm/mmap: Fix error path in do_vmi_align_munmap()Liam R. Howlett
The error unrolling was leaving the VMAs detached in many cases and leaving the locked_vm statistic altered, and skipping the unrolling entirely in the case of the vma tree write failing. Fix the error path by re-attaching the detached VMAs and adding the necessary goto for the failed vma tree write, and fix the locked_vm statistic by only updating after the vma tree write succeeds. Fixes: 763ecb035029 ("mm: remove the vma linked list") Reported-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-06-18svcrdma: Fix stale commentChuck Lever
Commit 7d81ee8722d6 ("svcrdma: Single-stage RDMA Read") changed the behavior of svc_rdma_recvfrom() but neglected to update the documenting comment. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-18NFSD: Distinguish per-net namespace initializationChuck Lever
I find the naming of nfsd_init_net() and nfsd_startup_net() to be confusingly similar. Rename the namespace initialization and tear- down ops and add comments to distinguish their separate purposes. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-18nfsd: move init of percpu reply_cache_stats counters back to nfsd_init_netJeff Layton
Commit f5f9d4a314da ("nfsd: move reply cache initialization into nfsd startup") moved the initialization of the reply cache into nfsd startup, but didn't account for the stats counters, which can be accessed before nfsd is ever started. The result can be a NULL pointer dereference when someone accesses /proc/fs/nfsd/reply_cache_stats while nfsd is still shut down. This is a regression and a user-triggerable oops in the right situation: - non-x86_64 arch - /proc/fs/nfsd is mounted in the namespace - nfsd is not started in the namespace - unprivileged user calls "cat /proc/fs/nfsd/reply_cache_stats" Although this is easy to trigger on some arches (like aarch64), on x86_64, calling this_cpu_ptr(NULL) evidently returns a pointer to the fixed_percpu_data. That struct looks just enough like a newly initialized percpu var to allow nfsd_reply_cache_stats_show to access it without Oopsing. Move the initialization of the per-net+per-cpu reply-cache counters back into nfsd_init_net, while leaving the rest of the reply cache allocations to be done at nfsd startup time. Kudos to Eirik who did most of the legwork to track this down. Cc: stable@vger.kernel.org # v6.3+ Fixes: f5f9d4a314da ("nfsd: move reply cache initialization into nfsd startup") Reported-and-tested-by: Eirik Fuller <efuller@redhat.com> Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2215429 Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>