Age | Commit message | Author
2023-02-13 | btrfs: remove duplicate include header in extent-tree.c | ye xingchen
extent-tree.h is included more than once, added in a0231804affe ("btrfs: move extent-tree helpers into their own header file"). Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-13 | btrfs: scrub: improve tree block error reporting | Qu Wenruo
[BUG]
When debugging a scrub related metadata error, it turns out that our metadata error reporting is not ideal. The only 3 error messages are:

- BTRFS error (device dm-2): bdev /dev/mapper/test-scratch1 errs: wr 0, rd 0, flush 0, corrupt 0, gen 1
  Showing we have metadata generation mismatch errors.
- BTRFS error (device dm-2): unable to fixup (regular) error at logical 7110656 on dev /dev/mapper/test-scratch1
  Showing which tree blocks are corrupted.
- BTRFS warning (device dm-2): checksum/header error at logical 24772608 on dev /dev/mapper/test-scratch2, physical 3801088: metadata node (level 1) in tree 5
  Showing which physical range the corrupted metadata is at.

We have to combine the above 3 to know we have corrupted metadata with a generation mismatch. And this is already the better case; if we have other problems, like an fsid mismatch, we cannot even know the cause.

[CAUSE]
The problem is caused by the fact that scrub_checksum_tree_block() never outputs any error message. It just returns two bits for scrub: sblock->header_error and sblock->generation_error. Later we report the error in scrub_print_warning(), but since we only have two bits, there is not much we can do to print any detailed errors.

[FIX]
This patch will do the following to enhance the error reporting of metadata scrub:

- Add extra warning (ratelimited) for every error we hit
  This can help us to distinguish the different types of errors. Some errors can help us to know what's going wrong immediately, like bytenr mismatch.
- Re-order the checks
  Currently we check bytenr first, then immediately generation. This can lead to false generation mismatch reports when the real problem is an fsid mismatch.

Here is the new output for the bug I'm debugging (we forgot to writeback tree blocks for commit roots):

BTRFS warning (device dm-2): tree block 24117248 mirror 1 has bad fsid, has b77cd862-f150-4c71-90ec-7baf0544d83f want 17df6abf-23cd-445f-b350-5b3e40bfd2fc
BTRFS warning (device dm-2): tree block 24117248 mirror 0 has bad fsid, has b77cd862-f150-4c71-90ec-7baf0544d83f want 17df6abf-23cd-445f-b350-5b3e40bfd2fc

Now we can immediately know that some tree blocks didn't even get written back, rather than seeing the original confusing generation mismatch.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-13 | btrfs: don't use size classes for zoned file systems | Boris Burkov
When a file system has ZNS devices which are constrained by a maximum number of active block groups, then not being able to use all the block groups for every allocation is not ideal, and could cause us to loop a ton with mixed size allocations. In general, since zoned doesn't write into gaps behind where block groups are writing, it is not susceptible to the same sort of fragmentation that size classes are designed to solve, so we can skip size classes for zoned file systems in general, even though there would probably be no harm for SMR devices. Signed-off-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-13 | btrfs: load block group size class when caching | Boris Burkov
Since the size class is an artifact of an arbitrary anti fragmentation strategy, it doesn't really make sense to persist it. Furthermore, most of the size class logic assumes fresh block groups. That is of course not a reasonable assumption -- we will be upgrading kernels with existing filesystems whose block groups are not classified. To work around those issues, implement logic to compute the size class of the block groups as we cache them in. To perfectly assess the state of a block group, we would have to read the entire extent tree (since the free space cache mashes together contiguous extent items) which would be prohibitively expensive for larger file systems with more extents. We can do it relatively cheaply by implementing a simple heuristic of sampling a handful of extents and picking the smallest one we see. In the happy case where the block group was classified, we will only see extents of the correct size. In the unhappy case, we will hopefully find one of the smaller extents, but there is no perfect answer anyway. Autorelocation will eventually churn up the block group if there is significant freeing anyway. There was no regression in mount performance at end state of the fsperf test suite, and the delay until the block group is marked cached is minimized by the constant number of extent samples. Signed-off-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com>
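The sampling heuristic described above could be sketched roughly as follows. This is only an illustration: the function name, the evenly spaced sampling positions and the flat array of extent lengths are assumptions made for the example, not the kernel's actual data structures or sample placement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: look at no more than `samples` extent lengths,
 * spread evenly over the array, and return the smallest one seen.
 * Returns 0 when there is nothing to sample.
 */
uint64_t smallest_sampled_extent(const uint64_t *lens, size_t n, size_t samples)
{
	uint64_t min = UINT64_MAX;
	size_t step, i, taken;

	assert(lens != NULL || n == 0);
	if (n == 0 || samples == 0)
		return 0;

	/* Constant number of samples regardless of how many extents exist. */
	step = n / samples;
	if (step == 0)
		step = 1;

	for (i = 0, taken = 0; i < n && taken < samples; i += step, taken++) {
		if (lens[i] < min)
			min = lens[i];
	}
	return min;
}
```

Because only a bounded number of entries is visited, the result can miss the true minimum, which matches the commit's remark that there is no perfect answer in the unclassified case.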
2023-02-13 | btrfs: introduce size class to block group allocator | Boris Burkov
The aim of this patch is to reduce the fragmentation of block groups under certain unhappy workloads. It is particularly effective when the size of extents correlates with their lifetime, which is something we have observed causing fragmentation in the fleet at Meta.

This patch categorizes extents into size classes:

- x < 128KiB: "small"
- 128KiB < x < 8MiB: "medium"
- x > 8MiB: "large"

and as much as possible reduces allocations of extents into block groups that don't match the size class. This takes advantage of any (possible) correlation between size and lifetime and also leaves behind predictable re-usable gaps when extents are freed; small writes don't gum up bigger holes.

Size classes are implemented in the following way:

- Mark each new block group with a size class of the first allocation that goes into it.
- Add two new passes to ffe: "unset size class" and "wrong size class". First, try only matching block groups, then try unset ones, then allow allocation of new ones, and finally allow mismatched block groups.
- Filtering is done just by skipping inappropriate ones, there is no special size class indexing.

Other solutions I considered were:

- A best fit allocator with an rb-tree. This worked well, as small writes didn't leak big holes from large freed extents, but led to regressions in ffe and write performance due to lock contention on the rb-tree with every allocation possibly updating it in parallel. Perhaps something clever could be done to do the updates in the background while being "right enough".
- A fixed size "working set". This prevents freeing an extent drastically changing where writes currently land, and seems like a good option too. Doesn't take advantage of size in any way.
- The same size class idea, but implemented with xarray marks. This turned out to be slower than looping the linked list and skipping wrong block groups, and is also less flexible since we must have only 3 size classes (max #marks). With the current approach we can have as many as we like.

Performance testing was done via: https://github.com/josefbacik/fsperf
Of particular relevance are the new fragmentation specific tests.

A brief summary of the testing results:

- Neutral results on existing tests. There are some minor regressions and improvements here and there, but nothing that truly stands out as notable.
- Improvement on new tests where size class and extent lifetime are correlated. Fragmentation in these cases is completely eliminated and write performance is generally a little better. There is also significant improvement where extent sizes are just a bit larger than the size class boundaries.
- Regression on one new test: where the allocations are sized intentionally a hair under the borders of the size classes. Results are neutral on the test that intentionally attacks this new scheme by mixing extent size and lifetime.

The full dump of the performance results can be found here: https://bur.io/fsperf/size-class-2022-11-15.txt (there are ANSI escape codes, so best to curl and view in terminal)

Here is a snippet from the full results for a new test which mixes buffered writes appending to a long lived set of files and large short lived fallocates:

bufferedappendvsfallocate results
metric                          baseline       current         stdev      diff
==============================================================================
avg_commit_ms                      31.13         29.20          2.67    -6.22%
bg_count                              14         15.60             0    11.43%
commits                            11.10         12.20          0.32     9.91%
elapsed                            27.30         26.40          2.98    -3.30%
end_state_mount_ns           11122551.90   10635118.90     851143.04    -4.38%
end_state_umount_ns             1.36e+09      1.35e+09   12248056.65    -1.07%
find_free_extent_calls         116244.30     114354.30        964.56    -1.63%
find_free_extent_ns_max        599507.20    1047168.20     103337.08    74.67%
find_free_extent_ns_mean         3607.19       3672.11        101.20     1.80%
find_free_extent_ns_min              500           512          6.67     2.40%
find_free_extent_ns_p50             2848          2876         37.65     0.98%
find_free_extent_ns_p95             4916          5000         75.45     1.71%
find_free_extent_ns_p99         20734.49      20920.48       1670.93     0.90%
frag_pct_max                       61.67             0          8.05  -100.00%
frag_pct_mean                      43.59             0          6.10  -100.00%
frag_pct_min                       25.91             0         16.60  -100.00%
frag_pct_p50                       42.53             0          7.25  -100.00%
frag_pct_p95                       61.67             0          8.05  -100.00%
frag_pct_p99                       61.67             0          8.05  -100.00%
fragmented_bg_count                 6.10             0          1.45  -100.00%
max_commit_ms                      49.80            46          5.37    -7.63%
sys_cpu                             2.59          2.62          0.29     1.39%
write_bw_bytes                  1.62e+08      1.68e+08   17975843.50     3.23%
write_clat_ns_mean              57426.39      54475.95       2292.72    -5.14%
write_clat_ns_p50               46950.40      42905.60       2101.35    -8.62%
write_clat_ns_p99              148070.40     143769.60       2115.17    -2.90%
write_io_kbytes                  4194304       4194304             0     0.00%
write_iops                       2476.15       2556.10        274.29     3.23%
write_lat_ns_max              2101667.60    2251129.50     370556.59     7.11%
write_lat_ns_mean               59374.91      55682.00       2523.09    -6.22%
write_lat_ns_min                17353.10         16250       1646.08    -6.36%

There are some mixed improvements/regressions in most metrics along with an elimination of fragmentation in this workload.

On the balance, the drastic 1->0 improvement in the happy cases seems worth the mix of regressions and improvements we do observe.

Some considerations for future work:

- Experimenting with more size classes
- More hinting/search ordering work to approximate a best-fit allocator

Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
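As a rough illustration of the classification and the pass ordering described in this commit message, a minimal sketch might look like the following. All identifiers here are hypothetical and not taken from the kernel source. The message does not say which class an extent of exactly 128KiB or 8MiB falls into, so assigning boundary values to the smaller class is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; only the 128KiB and 8MiB boundaries come from the commit. */
enum size_class {
	SIZE_CLASS_NONE,	/* block group not yet classified */
	SIZE_CLASS_SMALL,
	SIZE_CLASS_MEDIUM,
	SIZE_CLASS_LARGE,
};

/* Search passes in the order the commit message lists them. */
enum search_pass {
	PASS_MATCHING,		/* only block groups of the wanted class */
	PASS_UNSET,		/* also block groups with no class yet */
	PASS_ALLOC_NEW,		/* a new block group may be allocated */
	PASS_MISMATCHED,	/* finally accept any class */
};

#define KIB(n) ((uint64_t)(n) << 10)
#define MIB(n) ((uint64_t)(n) << 20)

/* Assumption: a size exactly on a boundary goes to the smaller class. */
enum size_class classify_extent(uint64_t bytes)
{
	if (bytes <= KIB(128))
		return SIZE_CLASS_SMALL;
	if (bytes <= MIB(8))
		return SIZE_CLASS_MEDIUM;
	return SIZE_CLASS_LARGE;
}

/* Filtering by skipping: no index, just a per-block-group yes/no. */
bool block_group_usable(enum search_pass pass, enum size_class bg,
			enum size_class want)
{
	assert(want != SIZE_CLASS_NONE);	/* a request always has a size */
	if (bg == want)
		return true;
	if (bg == SIZE_CLASS_NONE)
		return pass >= PASS_UNSET;
	return pass >= PASS_MISMATCHED;
}
```

A caller would loop over the block group list once per pass and skip every group for which `block_group_usable()` returns false, which is the "filtering by skipping" approach the message prefers over a dedicated index.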
2023-02-13 | btrfs: add more find_free_extent tracepoints | Boris Burkov
find_free_extent is a complicated function. It consists (at least) of:

- a hint that jumps into the middle of a for loop macro
- a middle loop trying every raid level
- an outer loop ascending through ffe loop levels
- complicated logic for skipping some of those ffe loop levels
- multiple underlying in-bg allocators (zoned, cluster, no cluster)

Which is all to say that more tracing is helpful for debugging its behavior. Add two new tracepoints: at the entrance to the block_groups loop (hit for every raid level and every ffe_ctl loop) and at the point we seriously consider a block_group for allocation. This way we can see the whole path through the algorithm, including hints, multiple loops, etc.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-13 | btrfs: pass find_free_extent_ctl to allocator tracepoints | Boris Burkov
The allocator tracepoints currently have a pile of values from ffe_ctl. In modifying the allocator and adding more tracepoints, I found myself adding to the already long argument list of the tracepoints. It makes it a lot simpler to just send in the ffe_ctl itself. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Boris Burkov <boris@bur.io> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-13 | btrfs: remove the wait argument to btrfs_start_ordered_extent | Christoph Hellwig
Given that wait is always set to 1, remove the argument. Last use of wait with 0 was in 0c304304feab ("Btrfs: remove csum_bytes_left"). Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-13 | btrfs: use a single variable to track return value for log_dir_items() | Filipe Manana
We currently use 'ret' and 'err' to track the return value for log_dir_items(), which is confusing and likely the cause for previous bugs where log_dir_items() did not return an error when it should, fixed in previous patches. So change this and use only a single variable, 'ret', to track the return value. This is simpler and makes it similar to most of the existing code. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-13 | btrfs: use a negative value for BTRFS_LOG_FORCE_COMMIT | Filipe Manana
Currently we use the value 1 for BTRFS_LOG_FORCE_COMMIT, but that value has a few inconveniences: 1) If it's ever used by btrfs_log_inode(), or any function down the call chain, we have to remember to btrfs_set_log_full_commit(), which is repetitive and has a chance to be forgotten in future use cases. btrfs_log_inode_parent() only calls btrfs_set_log_full_commit() when it gets a negative value from btrfs_log_inode(); 2) Down the call chain of btrfs_log_inode(), we may have functions that need to force a log commit, but can return either an error (negative value), false (0) or true (1). So they are forced to return some random negative to force a log commit - using BTRFS_LOG_FORCE_COMMIT would make the intention more clear. Currently the only example is flush_dir_items_batch(). So turn BTRFS_LOG_FORCE_COMMIT into a negative value. The chosen value is -(MAX_ERRNO + 1), so that it does not overlap any errno value and makes it easier to debug. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
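The choice of -(MAX_ERRNO + 1) can be illustrated with a small userspace sketch. MAX_ERRNO is 4095 in the kernel's include/linux/err.h; the macro `LOG_FORCE_COMMIT` and the two helper functions below are hypothetical stand-ins written for this example, not the btrfs definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MAX_ERRNO 4095				/* as in include/linux/err.h */
#define LOG_FORCE_COMMIT (-(MAX_ERRNO + 1))	/* stand-in for BTRFS_LOG_FORCE_COMMIT */

/* The sentinel sits just below the errno range, so it cannot collide. */
static_assert(LOG_FORCE_COMMIT < -MAX_ERRNO, "sentinel must not overlap errno values");

/* True only for values that look like a real -errno. */
bool is_real_errno(int ret)
{
	return ret < 0 && ret >= -MAX_ERRNO;
}

/*
 * Hypothetical caller-side check: any negative return, sentinel
 * included, leads to a full commit, so no extra bookkeeping is needed
 * at the call sites that return the sentinel.
 */
bool needs_full_commit(int ret)
{
	return ret < 0;
}
```

With a positive sentinel such as 1, the caller would need a separate branch to tell "true" from "force a commit"; making it negative lets the existing `ret < 0` path handle it.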
2023-02-13 | btrfs: use PAGE_{ALIGN, ALIGNED, ALIGN_DOWN} macro | Yushan Zhou
The header file linux/mm.h provides PAGE_ALIGN, PAGE_ALIGNED, PAGE_ALIGN_DOWN macros. Use these macros to make code more concise. Signed-off-by: Yushan Zhou <katrinzhou@tencent.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
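For readers without the kernel headers at hand, the behavior of the three macros can be approximated in userspace as below. The `MY_` prefixed names and the fixed 4096-byte page size are assumptions of this sketch; the real macros use the architecture's PAGE_SIZE.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for illustration: 4 KiB pages. */
#define PAGE_SIZE_ASSUMED 4096ULL

static_assert((PAGE_SIZE_ASSUMED & (PAGE_SIZE_ASSUMED - 1)) == 0,
	      "the mask trick below needs a power-of-two page size");

/* Round up to the next page boundary (no change if already aligned). */
#define MY_PAGE_ALIGN(x) \
	(((uint64_t)(x) + PAGE_SIZE_ASSUMED - 1) & ~(PAGE_SIZE_ASSUMED - 1))

/* Round down to the containing page boundary. */
#define MY_PAGE_ALIGN_DOWN(x) \
	((uint64_t)(x) & ~(PAGE_SIZE_ASSUMED - 1))

/* Non-zero when x already sits on a page boundary. */
#define MY_PAGE_ALIGNED(x) \
	(((uint64_t)(x) & (PAGE_SIZE_ASSUMED - 1)) == 0)
```

Open-coded expressions such as `(x + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)` are what the named macros replace in a cleanup of this kind.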
2023-02-13 | btrfs: go to matching label when cleaning em in btrfs_submit_direct | Peng Hao
When btrfs_get_chunk_map fails to allocate a new em the cleanup does not need to be done so the goto target is out_err, which is consistent with current coding style. Signed-off-by: Peng Hao <flyingpeng@tencent.com> Reviewed-by: David Sterba <dsterba@suse.com> [ update changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-13 | btrfs: turn on -Wmaybe-uninitialized | Josef Bacik
We had a recent bug that would have been caught by a newer compiler with -Wmaybe-uninitialized and would have saved us a month of failing tests that I didn't have time to investigate. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-13 | btrfs: fix uninitialized variable warning in run_one_async_start | Josef Bacik
With -Wmaybe-uninitialized compiler complains about ret being possibly uninitialized, which isn't possible as the WQ_ constants are set only from our code, however we can handle the default case and get rid of the warning. The value is set to BLK_STS_IOERR so it does not issue any IO and could be potentially detected, but this is basically a "cannot happen" error. To catch any problems during development use the assert. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> [ set the error in default: ] Signed-off-by: David Sterba <dsterba@suse.com>
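The pattern described here, a `default:` branch that both silences the warning and fails safely, might look like this in a reduced form. The enum, the status values and the `DEV_ASSERT` macro are hypothetical stand-ins for the WQ_ constants, BLK_STS_IOERR and the btrfs ASSERT(); they are not the kernel's definitions.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel constants named in the commit. */
enum wq_kind { WQ_KIND_A, WQ_KIND_B };
#define STS_OK    0
#define STS_IOERR 10

/* Active only in development builds, similar in spirit to ASSERT(). */
#ifdef DEBUG_BUILD
#define DEV_ASSERT(cond) assert(cond)
#else
#define DEV_ASSERT(cond) ((void)0)
#endif

int run_one_start(enum wq_kind kind)
{
	int ret;	/* deliberately not initialized at declaration */

	switch (kind) {
	case WQ_KIND_A:
		ret = STS_OK;
		break;
	case WQ_KIND_B:
		ret = STS_OK;
		break;
	default:
		/*
		 * "Cannot happen", but without this branch the compiler
		 * sees a path where ret is returned unset.
		 */
		DEV_ASSERT(0);
		ret = STS_IOERR;
		break;
	}
	return ret;
}
```

Returning an I/O error in the impossible case means a production build refuses to submit anything instead of acting on garbage, while a development build stops at the assertion.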
2023-02-13 | btrfs: zoned: fix uninitialized variable warning in btrfs_get_dev_zones | Naohiro Aota
Fix an uninitialized warning we get with -Wmaybe-uninitialized where it thought zno may have been uninitialized, in both cases it depends on zinfo->zone_cache but we know the value won't change between checks. Reported-by: Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/linux-btrfs/af6c527cbd8bdc782e50bd33996ee83acc3a16fb.1671221596.git.josef@toxicpanda.com/ Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-13 | btrfs: fix uninitialized variable warning in btrfs_sb_log_location | Josef Bacik
We only have 3 possible mirrors, and we have ASSERT()'s to make sure we're not passing in an invalid super mirror into this function, so technically this value isn't uninitialized. However -Wmaybe-uninitialized will complain, so set it to U64_MAX so if we don't have ASSERT()'s turned on it'll error out later on when it sees the zone is beyond our maximum zones. Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-13 | btrfs: fix uninitialized variable warnings in __set_extent_bit and convert_extent_bit | Josef Bacik
We will pass in the parent and p pointer into our tree_search function to avoid doing a second search when inserting a new extent state into the tree. However because this is conditional upon passing in these pointers the compiler seems to think these values can be uninitialized if we're using -Wmaybe-uninitialized. Fix this by initializing these values. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-13 | btrfs: fix uninitialized variable warning in btrfs_update_block_group | Josef Bacik
reclaim isn't set in the alloc case, however we only care about reclaim in the !alloc case. This isn't an actual problem, however -Wmaybe-uninitialized will complain, so initialize reclaim to quiet the compiler. Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-13 | btrfs: fix uninitialized variable warning in get_inode_gen | Josef Bacik
Anybody that calls get_inode_gen() can have an uninitialized gen if there's an error. This isn't a big deal because all the users just exit if they get an error, however it makes -Wmaybe-uninitialized complain, so fix this up to always initialize the passed in gen, this quiets all of the uninitialized warnings in send.c. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
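The "always initialize the out-parameter" idea can be shown with a toy lookup function. The struct, the table-based lookup and the function name are invented for this sketch and bear no relation to how send.c actually finds an inode.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory record standing in for an on-disk inode item. */
struct fake_inode {
	uint64_t ino;
	uint64_t gen;
};

/*
 * Returns 0 and stores the generation on success, -ENOENT otherwise.
 * *gen is written on every path, so a caller's variable is never left
 * uninitialized even if the caller ignores the return value.
 */
int lookup_gen(const struct fake_inode *table, size_t n, uint64_t ino,
	       uint64_t *gen)
{
	size_t i;

	assert(gen != NULL);
	*gen = 0;	/* defined value for the error path */

	for (i = 0; i < n; i++) {
		if (table[i].ino == ino) {
			*gen = table[i].gen;
			return 0;
		}
	}
	return -ENOENT;
}
```

Initializing inside the callee fixes every call site at once, which is why a single change of this kind can quiet a whole file's worth of warnings.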
2023-02-13 | btrfs: fix uninitialized variable warning in btrfs_cleanup_ordered_extents | Josef Bacik
We can conditionally pass in a locked page, and then we'll use that page range to skip marking errors as that will happen in another layer. However this causes the compiler to complain because it doesn't understand we only use these values when we have the page. Make the compiler stop complaining by setting these values to 0. Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-13 | btrfs: move btrfs_abort_transaction to transaction.c | Josef Bacik
While trying to sync messages.[ch] I ended up with this dependency on messages.h in the rest of btrfs-progs code base because it's where btrfs_abort_transaction() was now held. We want to keep messages.[ch] limited to the kernel code, and the btrfs_abort_transaction() code better fits in the transaction code and not in messages. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> [ move the __cold attributes ] Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-13 | btrfs: directly pass in fs_info to btrfs_merge_delayed_refs | Johannes Thumshirn
Now that none of the functions called by btrfs_merge_delayed_refs() needs a btrfs_trans_handle, directly pass in a btrfs_fs_info to btrfs_merge_delayed_refs(). Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-13 | btrfs: drop trans parameter of insert_delayed_ref | Johannes Thumshirn
Now that drop_delayed_ref() doesn't need a btrfs_trans_handle, drop it from insert_delayed_ref() as well. Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-13 | btrfs: remove trans parameter of merge_ref | Johannes Thumshirn
Now that drop_delayed_ref() doesn't get the btrfs_trans_handle passed in anymore, we can get rid of it in merge_ref() as well. Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-13 | btrfs: drop unused trans parameter of drop_delayed_ref | Johannes Thumshirn
drop_delayed_ref() doesn't use the btrfs_trans_handle it gets passed in, so remove it. Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-13 | Merge tag 'platform-drivers-x86-v6.2-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 | Linus Torvalds
Pull x86 platform drivers fix from Hans de Goede: "Intel vsec driver Meteor Lake PCI ids addition"

* tag 'platform-drivers-x86-v6.2-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86/intel/vsec: Add support for Meteor Lake
2023-02-13 | drm: Disable dynamic debug as broken | Ville Syrjälä
CONFIG_DRM_USE_DYNAMIC_DEBUG breaks debug prints for (at least modular) drm drivers. The debug prints can be reinstated by manually frobbing /sys/module/drm/parameters/debug after the fact, but at that point the damage is done and all debugs from driver probe are lost. This makes drivers totally undebuggable. There's a more complete fix in progress [1], with further details, but we need this fixed in stable kernels. Mark the feature as broken and disable it by default, with hopes distros follow suit and disable it as well. [1] https://lore.kernel.org/r/20230125203743.564009-1-jim.cromie@gmail.com Fixes: 84ec67288c10 ("drm_print: wrap drm_*_dbg in dyndbg descriptor factory macro") Cc: Jim Cromie <jim.cromie@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v6.1+ Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Jim Cromie <jim.cromie@gmail.com> Acked-by: Maxime Ripard <maxime@cerno.tech> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230207143337.2126678-1-jani.nikula@intel.com
2023-02-13 | cpuidle: psci: Do not suspend topology CPUs on PREEMPT_RT | Krzysztof Kozlowski
The runtime Power Management of CPU topology is not compatible with PREEMPT_RT: 1. Core cpuidle path disables IRQs. 2. Core cpuidle calls cpuidle-psci. 3. cpuidle-psci in __psci_enter_domain_idle_state() calls pm_runtime_put_sync_suspend() and pm_runtime_get_sync() which use spinlocks (which are sleeping on PREEMPT_RT). Deep sleep modes are not a priority of Realtime kernels because the latencies might become unpredictable. On the other hand the PSCI CPU idle power domain is a parent of other devices and power domain controllers, thus it cannot be simply skipped (e.g. on Qualcomm SM8250). Disable the idle callbacks in cpuidle-psci and mark the domain as always on. This is a trade-off between making PREEMPT_RT working and still having a proper power domain hierarchy in the system. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Tested-by: Adrien Thierry <athierry@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-13 | MIPS: loongson32: Drop obsolete cpufreq platform device | Keguang Zhang
The obsolete cpufreq driver was removed, drop the platform device and data accordingly. Link: https://lore.kernel.org/all/20230112135342.3927338-1-keguang.zhang@gmail.com Signed-off-by: Keguang Zhang <keguang.zhang@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-13 | powercap: intel_rapl: Fix handling for large time window | Zhang Rui
When setting the power limit time window, software updates the 'y' bits and 'f' bits in the power limit register, and the value hardware takes follows the formula below:

Time window = 2 ^ y * (1 + f / 4) * Time_Unit

When handling large time window input from userspace, using left shifting breaks in two cases:

1. When ilog2(value) is bigger than 31, in expression "1 << y", left shifting by more than 31 bits has undefined behavior. This breaks 'y'. For example, on an Alderlake platform, "1 << 32" returns 1.
2. When ilog2(value) equals 31, "1 << 31" returns a negative value because '1' is recognized as signed int. And this breaks 'f'.

Given that 'y' has 5 bits and hardware can never take a value larger than 31, fix the first problem by clamping the time window to the maximum possible value that the hardware can take. Fix the second problem by using unsigned bit left shift.

Note that hardware has its own maximum time window limitation, which may be lower than the time window value retrieved from the power limit register. When this happens, hardware clamps the input to its maximum time window limitation. That is why a software clamp is preferred to handle the problem on hand.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[ rjw: Adjusted the comment added by this change ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
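The two fixes described above (clamping 'y' and shifting an unsigned value) can be sketched with a standalone encoder and decoder for the formula. The function names are invented for this example, the input is taken to be already expressed in multiples of Time_Unit, and the 2-bit width of 'f' is an assumption inferred from the f / 4 term; this is not the driver's code.

```c
#include <assert.h>
#include <stdint.h>

#define Y_MAX 31u	/* 'y' has 5 bits, per the commit message */
#define F_MAX 3u	/* assumption: 'f' has 2 bits, matching f / 4 */

/* floor(log2(v)) for v > 0 */
unsigned ilog2_u64(uint64_t v)
{
	unsigned r = 0;

	while (v >>= 1)
		r++;
	return r;
}

/* Split a window (in time units) into y and f, clamping oversized input. */
void encode_window(uint64_t units, unsigned *y, unsigned *f)
{
	unsigned e;

	assert(units > 0);
	e = ilog2_u64(units);
	if (e > Y_MAX) {
		/* Larger than the register can express: use the maximum. */
		*y = Y_MAX;
		*f = F_MAX;
		return;
	}
	*y = e;
	/* 1ULL keeps the shift unsigned and 64-bit, so e == 31 is safe. */
	*f = (unsigned)((4 * (units - (1ULL << e))) >> e);
}

/* Time window = 2^y * (1 + f/4), in time units (integer arithmetic). */
uint64_t decode_window(unsigned y, unsigned f)
{
	assert(y <= Y_MAX && f <= F_MAX);
	return ((1ULL << y) * (4 + f)) / 4;
}
```

For instance, an input of 6 units encodes to y = 2, f = 2 and decodes back to 4 * (1 + 2/4) = 6, while any input of 2^32 units or more is clamped to y = 31, f = 3.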
2023-02-13 | block: ublk: check IO buffer based on flag need_get_data | Liu Xiaodong
Currently, uring_cmd with UBLK_IO_FETCH_REQ or UBLK_IO_COMMIT_AND_FETCH_REQ is always checked for whether the userspace server has provided an IO buffer, even when flag UBLK_F_NEED_GET_DATA is configured. This is an excessive check. If UBLK_F_NEED_GET_DATA is configured, FETCH_REQ doesn't need to provide an IO buffer; COMMIT_AND_FETCH_REQ also doesn't need to do that if the IO type is not READ. Check ub_cmd->addr together with ublk_need_get_data() and IO type in ublk_ch_uring_cmd(). With this fix, the userspace server doesn't need to preserve buffers for every ublk_io when flag UBLK_F_NEED_GET_DATA is configured, which saves memory. Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com> Fixes: c86019ff75c1 ("ublk_drv: add support for UBLK_IO_NEED_GET_DATA") Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20230210141356.112321-1-xiaodong.liu@intel.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-02-13 | sched/core: Fix a missed update of user_cpus_ptr | Waiman Long
Since commit 8f9ea86fdf99 ("sched: Always preserve the user requested cpumask"), a successful call to sched_setaffinity() should always save the user requested cpu affinity mask in a task's user_cpus_ptr. However, when the given cpu mask is the same as the current one, user_cpus_ptr is not updated. Fix this by saving the user mask in this case too. Fixes: 8f9ea86fdf99 ("sched: Always preserve the user requested cpumask") Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20230203181849.221943-1-longman@redhat.com
2023-02-13 | freezer,umh: Fix call_usermode_helper_exec() vs SIGKILL | Peter Zijlstra
Tetsuo-San noted that commit f5d39b020809 ("freezer,sched: Rewrite core freezer logic") broke call_usermodehelper_exec() for the KILLABLE case. Specifically it was missed that the second, unconditional, wait_for_completion() was not optional and ensures the on-stack completion is unused before going out-of-scope. Fixes: f5d39b020809 ("freezer,sched: Rewrite core freezer logic") Reported-by: syzbot+6cd18e123583550cf469@syzkaller.appspotmail.com Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Debugged-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/Y90ar35uKQoUrLEK@hirez.programming.kicks-ass.net
2023-02-13 | Documentation: powerclamp: Fix numbered lists formatting | Bagas Sanjaya
Texts in numbered lists are rendered as a continuous paragraph when there should have been breaks between the first line of text at the beginning of a list item and the description. Fix this by adding appropriate line breaks and indenting the rest of the lines to match the first line of the numbered list item. Fixes: d6d71ee4a14ae6 ("PM: Introduce Intel PowerClamp Driver") Fixes: 6bbe6f5732faea ("docs: thermal: convert to ReST") Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-13 | Documentation: powerclamp: Escape wildcard in cpumask description | Bagas Sanjaya
kernel test robot reported htmldocs warning: Documentation/admin-guide/thermal/intel_powerclamp.rst:328: WARNING: Inline emphasis start-string without end-string. The mistaken asterisk in /proc/irq/*/smp_affinity is rendered as hyperlink as the result. Escape the asterisk to fix above warning. Link: https://lore.kernel.org/linux-doc/202302122247.N4S791c4-lkp@intel.com/ Fixes: ebf51971021881 ("thermal: intel: powerclamp: Add two module parameters") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-13 | Documentation: admin-guide: Add toctree entry for thermal docs | Bagas Sanjaya
kernel test robot reported htmldocs warnings: Documentation/admin-guide/index.rst:62: WARNING: toctree contains reference to nonexisting document 'admin-guide/thermal' Documentation/admin-guide/thermal/intel_powerclamp.rst: WARNING: document isn't included in any toctree Add toctree entry for thermal/ docs to fix these warnings. Link: https://lore.kernel.org/linux-doc/202302121759.MmJgDTxc-lkp@intel.com/ Fixes: 707bf8e1dfd51d ("Documentation: admin-guide: Move intel_powerclamp documentation") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-13 | cpuidle: driver: Update microsecond values of state parameters as needed | Rafael J. Wysocki
If the cpuidle driver provides the target residency and exit latency in nanoseconds, the corresponding values in microseconds need to be set to reflect the provided numbers in order for the sysfs interface to show them correctly, so make __cpuidle_driver_init() do that. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
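A reduced sketch of keeping the two units in sync is shown below. The struct and function are hypothetical, and the rule that a non-zero microsecond value takes precedence is an assumption of this example; the commit message only states that the microsecond fields must be filled in when the driver supplied nanoseconds.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_USEC 1000ULL

/* Hypothetical reduced state: one parameter kept in both units. */
struct idle_param {
	uint64_t us;	/* value shown through sysfs */
	uint64_t ns;	/* value used internally */
};

/*
 * Make both fields agree. If only nanoseconds were provided, derive the
 * microsecond value so that user-visible output is not left at zero.
 */
void sync_idle_param(struct idle_param *p)
{
	assert(p != 0);
	if (p->us > 0)
		p->ns = p->us * NSEC_PER_USEC;	/* assumed precedence */
	else
		p->us = p->ns / NSEC_PER_USEC;	/* the case this commit covers */
}
```

The same helper would be applied once to the target residency and once to the exit latency of each idle state during driver initialization.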
2023-02-13Merge tag 'qcom-arm64-defconfig-for-6.3-2' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into soc/defconfig More ARM64 defconfig updates for v6.3 Here are two more defconfig updates for 6.3, enabling the SM8450 Display clock controller driver, as well as the SDAM driver, a driver exposing SRAM on newer Qualcomm PMICs to other devices. * tag 'qcom-arm64-defconfig-for-6.3-2' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: arm64: defconfig: enable Qualcomm SDAM nvmem driver arm64: defconfig: enable SM8450 DISPCC clock driver arm64: defconfig: enable the clock driver for Qualcomm SA8775P platforms arm64: defconfig: enable Visionox VTDR6130 DSI Panel driver arm64: defconfig: enable SM8550 DISPCC clock driver arm64: defconfig: enable Qualcomm PCIe modem drivers arm64: defconfig: Enable SC8280XP Display Clock Controller arm64: defconfig: Enable GCC, TCSRCC, pinctrl and interconnect for SM8550 arm64: defconfig: enable crypto userspace API arm64: defconfig: build SDM_LPASSCC_845 as a module arm64: defconfig: enable camera on Thundercomm RB5 platform arm64: defconfig: build PINCTRL_SM8250_LPASS_LPI as module arm64: defconfig: Enable Qualcomm EUD Link: https://lore.kernel.org/r/20230210181516.2021902-1-andersson@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-02-13Merge tag 'qcom-arm64-for-6.3-2' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/dt More Qualcomm ARM64 DT updates for 6.3 The new Qualcomm QDU1000 and QRU1000 platforms, and the IDP device on these are introduced. New support for a couple of USB modem sticks from THWC are introduced, so is support for Xiaomi Mi Pad 5 Pro and the Pro SKU of the Herobrine device. The Core Bus Fabric (CBF) is introduced on MSM8996. Interconnect paths for UFS are also described. A few fixes related to the power-grid of herobrine, on SC7280, are introduced. QFPROM is introduced on IPQ8074 and Interconnect providers are added for SDM670. On SDM845 the duplicated wcd9340 audio codec description is moved from devices to a common file, audio devices are added to the OnePlus 6 and 6T. On SM6115 debug UART, SMP2P, watchdog nodes are introduced, and the platform is switched to use #address/size-cells of 2, in line with most other platforms. Camera control interface and clock controllers are added for SM6350, and the CCI interface is enabled on the Fairphone FP4. On SM8350 the interconnect reference of SDHCI controller is corrected, DSI1 PHY clocks are properly described as sources for the Display clock controller and DSI1 is wired up to the display controller. The firmware paths are corrected for the Sony Xperia Nagara platform. The GPR bus, audio services and LPASS pinctrl nodes are added for the SM8550 platform. Additionally a few small typos/errors are corrected. gpio-ranges are corrected across MSM8953, SM6115 and SC8280XP and a range of DT validation issues are corrected. 
* tag 'qcom-arm64-for-6.3-2' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: (81 commits) arm64: dts: qcom: sc7280: Power herobrine's 3.3 eDP/TS rail more properly arm64: dts: qcom: pmk8550: fix PON compatible arm64: dts: qcom: sm8550: fix DSI controller compatible arm64: dts: qcom: sc7280: Hook up the touchscreen IO rail on evoker arm64: dts: qcom: sc7280: Hook up the touchscreen IO rail on villager arm64: dts: qcom: sc7280: Add 3ms ramp to herobrine's pp3300_left_in_mlb arm64: dts: qcom: sc7280: On QCard, regulator L3C should be 1.8V arm64: dts: qcom: sc8280xp: correct LPASS GPIO gpio-ranges arm64: dts: qcom: msm8992-lg-bullhead: Enable regulators arm64: dts: qcom: sm6115: correct TLMM gpio-ranges arm64: dts: qcom: msm8953: correct TLMM gpio-ranges arm64: dts: qcom: msm8992-lg-bullhead: Correct memory overlaps with the SMEM and MPSS memory regions arm64: dts: qcom: sm8350-hdk: correct LT9611 pin function arm64: dts: qcom: sm8350-hdk: align pin config node names with bindings arm64: dts: qcom: sm6350: Use specific qmpphy compatible arm64: dts: qcom: sm6115: Add smp2p nodes arm64: dts: qcom: sm7225-fairphone-fp4: Enable CCI busses arm64: dts: qcom: sm6350: Add CCI nodes arm64: dts: qcom: sm6350: Add camera clock controller dt-bindings: clock: add QCOM SM6350 camera clock bindings ... Link: https://lore.kernel.org/r/20230210192908.2039976-1-andersson@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-02-13Merge tag 'qcom-dts-for-6.3-2' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/dt More Qualcomm ARM32 DTS updates for 6.3 This adds backlight, notification LED, vibrator, volume keys and hall sensor to the OnePlus One, and provides a range of Devicetree validation fixes across various platforms. * tag 'qcom-dts-for-6.3-2' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: (22 commits) ARM: dts: qcom: align OPP table names with DT schema ARM: dts: qcom: msm8974-oneplus-bacon: Add notification LED ARM: dts: qcom: msm8974-oneplus-bacon: Add backlight ARM: dts: qcom: msm8974-oneplus-bacon: Add volume keys and hall sensor ARM: dts: qcom: msm8974-oneplus-bacon: Add vibrator ARM: dts: qcom: pm8941: Add vibrator node ARM: dts: qcom: sdx55: correct TLMM gpio-ranges dt-bindings: arm: qcom: add the sa8775p-ride board ARM: dts: qcom: apq8064: add second DSI host and PHY ARM: dts: qcom: apq8060-dragonboard: align MPP pin node names with DT schema dt-bindings: arm: qcom: Add Xiaomi Mi Pad 5 Pro (xiaomi-elish) ARM: dts: qcom-sdx65: align RPMh regulator nodes with bindings ARM: dts: qcom-sdx55: align RPMh regulator nodes with bindings ARM: dts: qcom: use "okay" for status ARM: dts: qcom: sdx65: Add Qcom SMMU-500 as the fallback for IOMMU node ARM: dts: qcom: sdx55: Add Qcom SMMU-500 as the fallback for IOMMU node ARM: dts: qcom: apq8064: use hdmi_phy for the MMCC's hdmipll clock ARM: dts: qcom: apq8064: add #clock-cells to the HDMI PHY node ARM: dts: qcom: ipq8064: move reg-less nodes outside soc node dt-bindings: qcom: Document msm8916-thwc-uf896 and ufi001c ... Link: https://lore.kernel.org/r/20230210185846.2032601-1-andersson@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-02-13Merge tag 'samsung-dt-6.3-2' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux into arm/dt Samsung DTS ARM changes for v6.3, part two Several cleanups pointed out by `make dtbs_check`: 1. Align LED status node name with bindings. 2. Drop redundant properties. 3. Move i2c-gpio node out of soc to top-level, as soc node is expected to have only MMIO nodes. 4. Correct SPI NOR flash compatible in SMDK5250 and SMDKv310. 5. Align GPIO property names in WM1811-family codec nodes with bindings. 6. Correct MAX98090 codec DAI cells in Snow. * tag 'samsung-dt-6.3-2' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux: ARM: dts: exynos: correct max98090 DAI argument in Snow ARM: dts: s5pv210: add "gpios" suffix to wlf,ldo1ena on Aries ARM: dts: exynos: add "gpios" suffix to wlf,ldo1ena on Arndale ARM: dts: exynos: add "gpios" suffix to wlf,ldo1ena on Midas ARM: dts: exynos: correct SPI nor compatible in SMDK5250 ARM: dts: exynos: correct SPI nor compatible in SMDKv310 ARM: dts: exynos: move I2C10 out of soc node on Arndale ARM: dts: exynos: drop redundant address/size cells from I2C10 on Arndale ARM: dts: exynos: drop default status from I2C10 on Arndale ARM: dts: exynos: align status led name with bindings on Origen4210 Link: https://lore.kernel.org/r/20230211113103.58894-1-krzysztof.kozlowski@linaro.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-02-13platform/x86/amd/pmf: Add depends on CONFIG_POWER_SUPPLYShyam Sundar S K
It is reported that amd_pmf driver is missing "depends on" for CONFIG_POWER_SUPPLY causing the following build error. ld: drivers/platform/x86/amd/pmf/core.o: in function `amd_pmf_remove': core.c:(.text+0x10): undefined reference to `power_supply_unreg_notifier' ld: drivers/platform/x86/amd/pmf/core.o: in function `amd_pmf_probe': core.c:(.text+0x38f): undefined reference to `power_supply_reg_notifier' make[1]: *** [scripts/Makefile.vmlinux:34: vmlinux] Error 1 make: *** [Makefile:1248: vmlinux] Error 2 Add this to the Kconfig file. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217028 Fixes: c5258d39fc4c ("platform/x86/amd/pmf: Add helper routine to update SPS thermals") Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Link: https://lore.kernel.org/r/20230213121457.1764463-1-Shyam-sundar.S-k@amd.com Cc: stable@vger.kernel.org Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-02-13mm: Remove get_kernel_pages()Ira Weiny
The only caller to get_kernel_pages() [shm_get_kernel_pages()] has been updated to not need it. Remove get_kernel_pages(). Cc: Mel Gorman <mgorman@suse.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "Fabio M. De Francesco" <fmdefrancesco@gmail.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Andrew Morton <akpm@linux-foundation.org> Acked-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Andrew Morton <akpm@linux-foudation.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sumit Garg <sumit.garg@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
2023-02-13tee: Remove call to get_kernel_pages()Ira Weiny
The kernel pages used by shm_get_kernel_pages() are allocated using GFP_KERNEL through the following call stack: trusted_instantiate() trusted_payload_alloc() -> GFP_KERNEL <trusted key op> tee_shm_register_kernel_buf() register_shm_helper() shm_get_kernel_pages() Where <trusted key op> is one of: trusted_key_unseal() trusted_key_get_random() trusted_key_seal() Because the pages can't be from highmem get_kernel_pages() boils down to a get_page() call. Remove the get_kernel_pages() call and open code the get_page(). In case a highmem page does slip through warn on once for a kmap'ed address. Cc: Jens Wiklander <jens.wiklander@linaro.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "Fabio M. De Francesco" <fmdefrancesco@gmail.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sumit Garg <sumit.garg@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
2023-02-13tee: Remove vmalloc page supportIra Weiny
The kernel pages used by shm_get_kernel_pages() are allocated using GFP_KERNEL through the following call stack: trusted_instantiate() trusted_payload_alloc() -> GFP_KERNEL <trusted key op> tee_shm_register_kernel_buf() register_shm_helper() shm_get_kernel_pages() Where <trusted key op> is one of: trusted_key_unseal() trusted_key_get_random() trusted_key_seal() Remove the vmalloc page support from shm_get_kernel_pages(). Replace with a warn on once. Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "Fabio M. De Francesco" <fmdefrancesco@gmail.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Jens Wiklander <jens.wiklander@linaro.org> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sumit Garg <sumit.garg@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
2023-02-13highmem: Enhance is_kmap_addr() to check kmap_local_page() mappingsIra Weiny
is_kmap_addr() is only looking at the kmap() address range which may cause check_heap_object() to miss checking an overflow on a kmap_local_page() page. Add a check for the kmap_local_page() address range to is_kmap_addr(). Cc: Matthew Wilcox <willy@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "Fabio M. De Francesco" <fmdefrancesco@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Andrew Morton <akpm@linux-foudation.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
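The effect of widening the check can be sketched in user-space C. This is only an illustration under stated assumptions: the four address-window constants are made-up values (the real ranges are architecture specific), and in_window(), is_kmap_addr_old() and is_kmap_addr_new() are hypothetical names, not the kernel helper itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical address windows, chosen only for illustration. */
#define PKMAP_START      0xff800000u  /* kmap() window start */
#define PKMAP_END        0xffa00000u  /* kmap() window end (exclusive) */
#define KMAP_LOCAL_START 0xffc00000u  /* kmap_local_page() window start */
#define KMAP_LOCAL_END   0xffe00000u  /* kmap_local_page() window end (exclusive) */

/* True if addr lies in the half-open range [start, end). */
static bool in_window(uintptr_t addr, uintptr_t start, uintptr_t end)
{
	return addr >= start && addr < end;
}

/* Before: only the kmap() window is recognized. */
static bool is_kmap_addr_old(uintptr_t addr)
{
	return in_window(addr, PKMAP_START, PKMAP_END);
}

/* After: the kmap_local_page() window is recognized as well. */
static bool is_kmap_addr_new(uintptr_t addr)
{
	return in_window(addr, PKMAP_START, PKMAP_END) ||
	       in_window(addr, KMAP_LOCAL_START, KMAP_LOCAL_END);
}
```

An address inside the second window is rejected by the old predicate and accepted by the new one, which is the case a caller such as check_heap_object() would otherwise miss.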
2023-02-13clocksource/drivers/timer-sun4i: Add CLOCK_EVT_FEAT_DYNIRQYangtao Li
Add CLOCK_EVT_FEAT_DYNIRQ so that the IRQ affinity can be set at runtime to the cores that need to be woken up; otherwise, for example, core0 has to send an IPI to wake up core1. With CLOCK_EVT_FEAT_DYNIRQ set, the broadcast timer can wake up the cores directly, so no IPI is needed. Systems with cpuidle enabled benefit in particular from this feature. Signed-off-by: Yangtao Li <frank.li@vivo.com> Link: https://lore.kernel.org/r/20230209040239.24710-1-frank.li@vivo.com Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-02-13clocksource/drivers/em_sti: Mark driver as non-removableUwe Kleine-König
The comment in the remove callback suggests that the driver is not supposed to be unbound. However returning an error code in the remove callback doesn't accomplish that. Instead set the suppress_bind_attrs property (which makes it impossible to unbind the driver via sysfs). The only remaining way to unbind a em_sti device would be module unloading, but that doesn't apply here, as the driver cannot be built as a module. Also drop the useless remove callback. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20230207193010.469495-1-u.kleine-koenig@pengutronix.de Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-02-13clocksource/drivers/sh_tmu: Mark driver as non-removableUwe Kleine-König
The comment in the remove callback suggests that the driver is not supposed to be unbound. However returning an error code in the remove callback doesn't accomplish that. Instead set the suppress_bind_attrs property (which makes it impossible to unbind the driver via sysfs). The only remaining way to unbind a sh_tmu device would be module unloading, but that doesn't apply here, as the driver cannot be built as a module. Also drop the useless remove callback. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20230207193614.472060-1-u.kleine-koenig@pengutronix.de Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-02-13clocksource/drivers/riscv: Patch riscv_clock_next_event() jump before first useMatt Evans
A static key is used to select between SBI and Sstc timer usage in riscv_clock_next_event(), but currently the direction is resolved after cpuhp_setup_state() is called (which sets the next event). The first event will therefore fall through the sbi_set_timer() path; this breaks Sstc-only systems. So, apply the jump patching before first use. Fixes: 9f7a8ff6391f ("RISC-V: Prefer sstc extension if available") Signed-off-by: Matt Evans <mev@rivosinc.com> Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com> Acked-by: Palmer Dabbelt <palmer@rivosinc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/CDDAB2D0-264E-42F3-8E31-BA210BEB8EC1@rivosinc.com Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>