path: root/drivers/cpuidle
Age | Commit message | Author
2023-02-27 | Merge tag 'soc-drivers-6.3' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC driver updates from Arnd Bergmann: "As usual, there are lots of minor driver changes across SoC platforms from NXP, Amlogic, AMD Zynq, Mediatek, Qualcomm, Apple and Samsung. These usually add support for additional chip variations in existing drivers, but also add features or bugfixes. The SCMI firmware subsystem gains a unified raw userspace interface through debugfs, which can be used for validation purposes. Newly added drivers include: - New power management drivers for StarFive JH7110, Allwinner D1 and Renesas RZ/V2M - A driver for Qualcomm battery and power supply status - A SoC device driver for identifying Nuvoton WPCM450 chips - A regulator coupler driver for Mediatek MT81xxv" * tag 'soc-drivers-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (165 commits) power: supply: Introduce Qualcomm PMIC GLINK power supply soc: apple: rtkit: Do not copy the reg state structure to the stack soc: sunxi: SUN20I_PPU should depend on PM memory: renesas-rpc-if: Remove redundant division of dummy soc: qcom: socinfo: Add IDs for IPQ5332 and its variant dt-bindings: arm: qcom,ids: Add IDs for IPQ5332 and its variant dt-bindings: power: qcom,rpmpd: add RPMH_REGULATOR_LEVEL_LOW_SVS_L1 firmware: qcom_scm: Move qcom_scm.h to include/linux/firmware/qcom/ MAINTAINERS: Update qcom CPR maintainer entry dt-bindings: firmware: document Qualcomm SM8550 SCM dt-bindings: firmware: qcom,scm: add qcom,scm-sa8775p compatible soc: qcom: socinfo: Add Soc IDs for IPQ8064 and variants dt-bindings: arm: qcom,ids: Add Soc IDs for IPQ8064 and variants soc: qcom: socinfo: Add support for new field in revision 17 soc: qcom: smd-rpm: Add IPQ9574 compatible soc: qcom: pmic_glink: remove redundant calculation of svid soc: qcom: stats: Populate all subsystem debugfs files dt-bindings: soc: qcom,rpmh-rsc: Update to allow for generic nodes soc: qcom: pmic_glink: add CONFIG_NET/CONFIG_OF dependencies soc: qcom: pmic_glink: 
Introduce altmode support ...
2023-02-21 | Merge tag 'pm-6.3-rc1' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "These add EPP support to the AMD P-state cpufreq driver, add support for new platforms to the Intel RAPL power capping driver, intel_idle and the Qualcomm cpufreq driver, enable thermal cooling for Tegra194, drop the custom cpufreq driver for loongson1 that is not necessary any more (and the corresponding cpufreq platform device), fix assorted issues and clean up code. Specifics: - Add EPP support to the AMD P-state cpufreq driver (Perry Yuan, Wyes Karny, Arnd Bergmann, Bagas Sanjaya) - Drop the custom cpufreq driver for loongson1 that is not necessary any more and the corresponding cpufreq platform device (Keguang Zhang) - Remove "select SRCU" from system sleep, cpufreq and OPP Kconfig entries (Paul E. McKenney) - Enable thermal cooling for Tegra194 (Yi-Wei Wang) - Register module device table and add missing compatibles for cpufreq-qcom-hw (Nícolas F. R. A. Prado, Abel Vesa and Luca Weiss) - Various dt binding updates for qcom-cpufreq-nvmem and opp-v2-kryo-cpu (Christian Marangi) - Make kobj_type structure in the cpufreq core constant (Thomas Weißschuh) - Make cpufreq_unregister_driver() return void (Uwe Kleine-König) - Make the TEO cpuidle governor check CPU utilization in order to refine idle state selection (Kajetan Puchalski) - Make Kconfig select the haltpoll cpuidle governor when the haltpoll cpuidle driver is selected and replace a default_idle() call in that driver with arch_cpu_idle() to allow MWAIT to be used (Li RongQing) - Add Emerald Rapids Xeon support to the intel_idle driver (Artem Bityutskiy) - Add ARCH_SUSPEND_POSSIBLE dependencies for ARMv4 cpuidle drivers to avoid randconfig build failures (Arnd Bergmann) - Make kobj_type structures used in the cpuidle sysfs interface constant (Thomas Weißschuh) - Make the cpuidle driver registration code update microsecond values of idle state parameters in accordance with their nanosecond values 
if they are provided (Rafael Wysocki) - Make the PSCI cpuidle driver prevent topology CPUs from being suspended on PREEMPT_RT (Krzysztof Kozlowski) - Document that pm_runtime_force_suspend() cannot be used with DPM_FLAG_SMART_SUSPEND (Richard Fitzgerald) - Add EXPORT macros for exporting PM functions from drivers (Richard Fitzgerald) - Remove /** from non-kernel-doc comments in hibernation code (Randy Dunlap) - Fix possible name leak in powercap_register_zone() (Yang Yingliang) - Add Meteor Lake and Emerald Rapids support to the intel_rapl power capping driver (Zhang Rui) - Modify the idle_inject power capping facility to support 100% idle injection (Srinivas Pandruvada) - Fix large time windows handling in the intel_rapl power capping driver (Zhang Rui) - Fix memory leaks with using debugfs_lookup() in the generic PM domains and Energy Model code (Greg Kroah-Hartman) - Add missing 'cache-unified' property in the example for kryo OPP bindings (Rob Herring) - Fix error checking in opp_migrate_dentry() (Qi Zheng) - Let qcom,opp-fuse-level be a 2-long array for qcom SoCs (Konrad Dybcio) - Modify some power management utilities to use the canonical ftrace path (Ross Zwisler) - Correct spelling problems for Documentation/power/ as reported by codespell (Randy Dunlap)" * tag 'pm-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (53 commits) Documentation: amd-pstate: disambiguate user space sections cpufreq: amd-pstate: Fix invalid write to MSR_AMD_CPPC_REQ dt-bindings: opp: opp-v2-kryo-cpu: enlarge opp-supported-hw maximum dt-bindings: cpufreq: qcom-cpufreq-nvmem: make cpr bindings optional dt-bindings: cpufreq: qcom-cpufreq-nvmem: specify supported opp tables PM: Add EXPORT macros for exporting PM functions cpuidle: psci: Do not suspend topology CPUs on PREEMPT_RT MIPS: loongson32: Drop obsolete cpufreq platform device powercap: intel_rapl: Fix handling for large time window cpuidle: driver: Update microsecond values of state parameters as 
needed cpuidle: sysfs: make kobj_type structures constant cpuidle: add ARCH_SUSPEND_POSSIBLE dependencies PM: EM: fix memory leak with using debugfs_lookup() PM: domains: fix memory leak with using debugfs_lookup() cpufreq: Make kobj_type structure constant cpufreq: davinci: Fix clk use after free cpufreq: amd-pstate: avoid uninitialized variable use cpufreq: Make cpufreq_unregister_driver() return void OPP: fix error checking in opp_migrate_dentry() dt-bindings: cpufreq: cpufreq-qcom-hw: Add SM8550 compatible ...
2023-02-13 | cpuidle: psci: Do not suspend topology CPUs on PREEMPT_RT | Krzysztof Kozlowski
The runtime Power Management of CPU topology is not compatible with PREEMPT_RT:
1. Core cpuidle path disables IRQs.
2. Core cpuidle calls cpuidle-psci.
3. cpuidle-psci in __psci_enter_domain_idle_state() calls pm_runtime_put_sync_suspend() and pm_runtime_get_sync() which use spinlocks (which are sleeping on PREEMPT_RT).

Deep sleep modes are not a priority of Realtime kernels because the latencies might become unpredictable. On the other hand the PSCI CPU idle power domain is a parent of other devices and power domain controllers, thus it cannot be simply skipped (e.g. on Qualcomm SM8250).

Disable the idle callbacks in cpuidle-psci and mark the domain as always on. This is a trade-off between making PREEMPT_RT work and still having a proper power domain hierarchy in the system.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Adrien Thierry <athierry@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-13 | cpuidle: driver: Update microsecond values of state parameters as needed | Rafael J. Wysocki
If the cpuidle driver provides the target residency and exit latency in nanoseconds, the corresponding values in microseconds need to be set to reflect the provided numbers in order for the sysfs interface to show them correctly, so make __cpuidle_driver_init() do that. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
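The unit synchronization described above can be sketched in user-space C. This is a minimal illustration, not the kernel's __cpuidle_driver_init(): the struct `idle_state` and the function `sync_state_units()` are hypothetical stand-ins, and the rule that a nonzero microsecond value takes precedence over the nanosecond one is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NSEC_PER_USEC 1000LL

/* Hypothetical, simplified stand-in for an idle state description. */
struct idle_state {
    unsigned int exit_latency;      /* microseconds */
    unsigned int target_residency;  /* microseconds */
    int64_t exit_latency_ns;        /* nanoseconds */
    int64_t target_residency_ns;    /* nanoseconds */
};

/*
 * Keep the microsecond and nanosecond fields consistent.
 * Assumption of this sketch: a nonzero microsecond value wins; otherwise
 * the microsecond value is derived from the nanosecond one, so that an
 * interface reporting microseconds shows what the driver provided.
 */
void sync_state_units(struct idle_state *s)
{
    assert(s != NULL);

    if (s->target_residency > 0)
        s->target_residency_ns = (int64_t)s->target_residency * NSEC_PER_USEC;
    else if (s->target_residency_ns < 0)
        s->target_residency_ns = 0;  /* clamp nonsensical negative input */
    else
        s->target_residency = (unsigned int)(s->target_residency_ns / NSEC_PER_USEC);

    if (s->exit_latency > 0)
        s->exit_latency_ns = (int64_t)s->exit_latency * NSEC_PER_USEC;
    else if (s->exit_latency_ns < 0)
        s->exit_latency_ns = 0;
    else
        s->exit_latency = (unsigned int)(s->exit_latency_ns / NSEC_PER_USEC);
}
```

With this shape, a state that only fills in the nanosecond fields still ends up with matching microsecond fields after initialization.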
2023-02-09 | cpuidle: sysfs: make kobj_type structures constant | Thomas Weißschuh
Since commit ee6d3dd4ed48 ("driver core: make kobj_type constant.") the driver core allows the usage of const struct kobj_type. Take advantage of this to constify the structure definitions to prevent modification at runtime. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-09 | cpuidle: add ARCH_SUSPEND_POSSIBLE dependencies | Arnd Bergmann
Some ARMv4 processors don't support suspend, which leads to a build failure with the tegra and qualcomm cpuidle driver:

WARNING: unmet direct dependencies detected for ARM_CPU_SUSPEND
  Depends on [n]: ARCH_SUSPEND_POSSIBLE [=n]
  Selected by [y]:
  - ARM_TEGRA_CPUIDLE [=y] && CPU_IDLE [=y] && (ARM [=y] || ARM64) && (ARCH_TEGRA [=n] || COMPILE_TEST [=y]) && !ARM64 && MMU [=y]

arch/arm/kernel/sleep.o: in function `__cpu_suspend':
(.text+0x68): undefined reference to `cpu_sa110_suspend_size'
(.text+0x68): undefined reference to `cpu_fa526_suspend_size'

Add an explicit dependency to make randconfig builds avoid this combination.

Fixes: faae6c9f2e68 ("cpuidle: tegra: Enable compile testing")
Fixes: a871be6b8eee ("cpuidle: Convert Qualcomm SPM driver to a generic CPUidle driver")
Link: https://lore.kernel.org/all/20211013160125.772873-1-arnd@kernel.org/
Cc: All applicable <stable@vger.kernel.org>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-08 | firmware: qcom_scm: Move qcom_scm.h to include/linux/firmware/qcom/ | Elliot Berman
Move include/linux/qcom_scm.h to include/linux/firmware/qcom/qcom_scm.h. This moves one of the few remaining Qualcomm-specific headers into a more appropriate subdirectory under include/. Suggested-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com> Reviewed-by: Guru Das Srinagesh <quic_gurus@quicinc.com> Acked-by: Mukesh Ojha <quic_mojha@quicinc.com> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20230203210956.3580811-1-quic_eberman@quicinc.com
2023-01-31 | cpuidle: Fix poll_idle() noinstr annotation | Peter Zijlstra
The instrumentation_begin()/end() annotations in poll_idle() were complete nonsense. Specifically they caused tracing to happen in the middle of noinstr code, resulting in RCU splats. Now that local_clock() is noinstr, mark up the rest and let it rip. Fixes: 00717eb8c955 ("cpuidle: Annotate poll_idle()") Reported-by: kernel test robot <oliver.sang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/oe-lkp/202301192148.58ece903-oliver.sang@intel.com Link: https://lore.kernel.org/r/20230126151323.819534689@infradead.org
2023-01-20 | cpuidle-haltpoll: Replace default_idle() with arch_cpu_idle() | Li RongQing
When a KVM guest has MWAIT, mwait_idle() is used as the default idle function. However, the cpuidle-haltpoll driver calls default_idle() from default_enter_idle() directly and that one uses HLT instead of MWAIT, which may affect performance adversely, because MWAIT is preferred to HLT as explained by the changelog of commit aebef63cf7ff ("x86: Remove vendor checks from prefer_mwait_c1_over_halt"). Make default_enter_idle() call arch_cpu_idle(), which can use MWAIT, instead of default_idle() to address this issue. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Li RongQing <lirongqing@baidu.com> [ rjw: Changelog rewrite ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-01-18 | cpuidle, arm64: Fix the ARM64 cpuidle logic | Peter Zijlstra
The recent cpuidle changes started triggering RCU splats on Juno development boards:

| =============================
| WARNING: suspicious RCU usage
| -----------------------------
| include/trace/events/ipi.h:19 suspicious rcu_dereference_check() usage!

Fix cpuidle on ARM64:
- ... by introducing a new 'is_rcu' flag to the cpuidle helpers & make ARM64 use it, as ARM64 wants to keep RCU active longer and wants to do the ct_cpuidle_enter()/exit() dance itself.
- Also update the PSCI driver accordingly.
- This also removes the last known RCU_NONIDLE() user as a bonus.

Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/Y8Z31UbzG3LJgAXE@hirez.programming.kicks-ass.net
2023-01-18 | cpuidle: mvebu: Fix duplicate flags assignment | Arnd Bergmann
The added '.flags' value is sometimes ignored here because it gets overwritten by another initialization:

drivers/cpuidle/cpuidle-mvebu-v7.c:24:33: error: initialized field overwritten [-Werror=override-init]
   24 | #define MVEBU_V7_FLAG_DEEP_IDLE 0x10000
      |                                 ^~~~~~~
drivers/cpuidle/cpuidle-mvebu-v7.c:69:43: note: in expansion of macro 'MVEBU_V7_FLAG_DEEP_IDLE'
...

Merge the two fields into one.

Fixes: 4ce40e9dbe83 ("cpuidle, armada: Push RCU-idle into driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230117164642.1672784-1-arnd@kernel.org
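The class of bug behind this warning can be illustrated with a small stand-alone C sketch. In C, when a designated initializer names the same member twice, the later initializer overrides the earlier one, so one of the two flag values is silently dropped. The names `demo_state`, `DEMO_FLAG_RCU_IDLE` and `DEMO_FLAG_DEEP_IDLE` are hypothetical; only the 0x10000 value is taken from the warning above, and the other flag value is illustrative.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_FLAG_RCU_IDLE  0x80u     /* illustrative value only */
#define DEMO_FLAG_DEEP_IDLE 0x10000u  /* value quoted in the warning above */

/* Hypothetical, simplified stand-in for an idle state description. */
struct demo_state {
    const char *name;
    unsigned int flags;
};

/*
 * Buggy shape (shown as a comment so this file builds cleanly):
 *
 *     .flags = DEMO_FLAG_RCU_IDLE,
 *     .flags = DEMO_FLAG_DEEP_IDLE,   <- overrides the line above
 *
 * Fixed shape: a single .flags initializer that ORs both values.
 */
const struct demo_state demo_deep_state = {
    .name  = "demo deep idle",
    .flags = DEMO_FLAG_RCU_IDLE | DEMO_FLAG_DEEP_IDLE,
};

/* Return nonzero when every bit of 'flag' is set in the state's flags. */
int demo_state_has(const struct demo_state *s, unsigned int flag)
{
    assert(s != NULL);
    return (s->flags & flag) == flag;
}
```

Building with -Woverride-init (part of GCC's -Wextra) flags the buggy shape at compile time, which is how the problem surfaced here.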
2023-01-13 | cpuidle-haltpoll: select haltpoll governor | Li RongQing
The haltpoll cpuidle driver should select the haltpoll governor, so as to ensure that they work together. Signed-off-by: Li RongQing <lirongqing@baidu.com> [ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-01-13 | cpuidle: Add comments about noinstr/__cpuidle usage | Peter Zijlstra
Add a few words on noinstr / __cpuidle usage. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20230112195542.397238052@infradead.org
2023-01-13 | cpuidle, arch: Mark all ct_cpuidle_enter() callers __cpuidle | Peter Zijlstra
For all cpuidle drivers that use CPUIDLE_FLAG_RCU_IDLE, ensure that all functions that call ct_cpuidle_enter() are marked __cpuidle. ( due to lack of noinstr validation on these platforms it is entirely possible this isn't complete ) Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20230112195542.274096325@infradead.org
2023-01-13 | cpuidle: Ensure ct_cpuidle_enter() is always called from noinstr/__cpuidle | Peter Zijlstra
Tracing (kprobes included) and other compiler instrumentation relies on a normal kernel runtime. Therefore all functions that disable RCU should be noinstr, as should all functions that are called while RCU is disabled. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20230112195542.212914195@infradead.org
2023-01-13 | cpuidle: Annotate poll_idle() | Peter Zijlstra
The __cpuidle functions will become a noinstr class, as such they need explicit annotations. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Tony Lindgren <tony@atomide.com> Tested-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/20230112195540.312601331@infradead.org
2023-01-13 | cpuidle: Fix ct_idle_*() usage | Peter Zijlstra
The whole disable-RCU, enable-IRQS dance is very intricate since changing IRQ state is traced, which depends on RCU. Add two helpers for the cpuidle case that mirror the entry code: ct_cpuidle_enter() ct_cpuidle_exit() And fix all the cases where the enter/exit dance was buggy. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Tony Lindgren <tony@atomide.com> Tested-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/20230112195540.130014793@infradead.org
2023-01-13 | cpuidle, dt: Push RCU-idle into driver | Peter Zijlstra
Doing RCU-idle outside the driver, only to then temporarily enable it again before going idle is suboptimal. Notably: this converts all dt_init_idle_driver() and __CPU_PM_CPU_IDLE_ENTER() users for they are inextricably intertwined. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Tony Lindgren <tony@atomide.com> Tested-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/20230112195540.068981667@infradead.org
2023-01-13 | cpuidle, armada: Push RCU-idle into driver | Peter Zijlstra
Doing RCU-idle outside the driver, only to then temporarily enable it again before going idle is suboptimal. Notably the cpu_pm_*() calls implicitly re-enable RCU for a bit. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Tony Lindgren <tony@atomide.com> Tested-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/20230112195539.946630819@infradead.org
2023-01-13 | cpuidle, psci: Push RCU-idle into driver | Peter Zijlstra
Doing RCU-idle outside the driver, only to then temporarily enable it again, at least twice, before going idle is suboptimal. Notably once implicitly through the cpu_pm_*() calls and once explicitly doing ct_irq_*_irqon(). Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Kajetan Puchalski <kajetan.puchalski@arm.com> Tested-by: Tony Lindgren <tony@atomide.com> Tested-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Guo Ren <guoren@kernel.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/20230112195539.760296658@infradead.org
2023-01-13 | cpuidle, tegra: Push RCU-idle into driver | Peter Zijlstra
Doing RCU-idle outside the driver, only to then temporarily enable it again, at least twice, before going idle is suboptimal. Notably once implicitly through the cpu_pm_*() calls and once explicitly doing RCU_NONIDLE(). Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Tony Lindgren <tony@atomide.com> Tested-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/20230112195539.699546331@infradead.org
2023-01-13 | cpuidle, riscv: Push RCU-idle into driver | Peter Zijlstra
Doing RCU-idle outside the driver, only to then temporarily enable it again, at least twice, before going idle is suboptimal. That is, once implicitly through the cpu_pm_*() calls and once explicitly doing ct_irq_*_irqon(). Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Tony Lindgren <tony@atomide.com> Tested-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/20230112195539.637185846@infradead.org
2023-01-13 | cpuidle: Move IRQ state validation | Peter Zijlstra
Make cpuidle_enter_state() consistent with the s2idle variant and verify ->enter() always returns with interrupts disabled. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Tony Lindgren <tony@atomide.com> Tested-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/20230112195539.576412812@infradead.org
2023-01-13 | cpuidle/poll: Ensure IRQs stay disabled after cpuidle_state::enter() calls | Peter Zijlstra
Make cpuidle_state::enter() methods IRQ state invariant on exit. Additionally make sure to use raw_local_irq_*() methods since this cpuidle callback will be called with RCU already disabled. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Tony Lindgren <tony@atomide.com> Tested-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/20230112195539.515253662@infradead.org
2023-01-10 | cpuidle: teo: Introduce util-awareness | Kajetan Puchalski
Modern interactive systems, such as recent Android phones, tend to have power efficient shallow idle states. Selecting deeper idle states on a device while a latency-sensitive workload is running can adversely impact performance due to increased latency. Additionally, if the CPU wakes up from a deeper sleep before its target residency as is often the case, it results in a waste of energy on top of that. At the moment, none of the available idle governors take any scheduling information into account. They also tend to overestimate the idle duration quite often, which causes them to select excessively deep idle states, thus leading to increased wakeup latency and lower performance with no power saving. For 'menu' while web browsing on Android for instance, those types of wakeups ('too deep') account for over 24% of all wakeups. At the same time, on some platforms idle state 0 can be power efficient enough to warrant wanting to prefer it over idle state 1. This is because the power usage of the two states can be so close that sufficient amounts of too deep state 1 sleeps can completely offset the state 1 power saving to the point where it would've been more power efficient to just use state 0 instead. This is, of course, for systems where state 0 is not a polling state, such as arm-based devices. Sleeps that happened in state 0 while they could have used state 1 ('too shallow') only save less power than they otherwise could have. Too deep sleeps, on the other hand, harm performance and nullify the potential power saving from using state 1 in the first place. While taking this into account, it is clear that on balance it is preferable for an idle governor to have more too shallow sleeps instead of more too deep sleeps on those kinds of platforms. This patch specifically tunes TEO to prefer shallower idle states in order to reduce wakeup latency and achieve better performance. 
To this end, before selecting the next idle state it uses the avg_util signal of a CPU's runqueue in order to determine to what extent the CPU is being utilized. This util value is then compared to a threshold defined as a percentage of the CPU's capacity (capacity >> 6, i.e. ~1.5% in the current implementation). If the util is above the threshold, the index of the idle state selected by TEO metrics will be reduced by 1, thus selecting a shallower state. If the util is below the threshold, the governor defaults to the TEO metrics mechanism to try to select the deepest available idle state based on the closest timer event and its own correctness.

The main goal of this is to reduce latency and increase performance for some workloads. Under some workloads it will result in an increase in power usage (Geekbench 5) while for other workloads it will also result in a decrease in power usage compared to TEO (PCMark Web, Jankbench, Speedometer).

It can provide drastically decreased latency and performance benefits in certain types of workloads that are sensitive to latency.

Example test results:

1. GB5 (better score, latency & more power usage)

| metric                                | menu           | teo               | teo-util-aware    |
| ------------------------------------- | -------------- | ----------------- | ----------------- |
| gmean score                           | 2826.5 (0.0%)  | 2764.8 (-2.18%)   | 2865 (1.36%)      |
| gmean power usage [mW]                | 2551.4 (0.0%)  | 2606.8 (2.17%)    | 2722.3 (6.7%)     |
| gmean too deep %                      | 14.99%         | 9.65%             | 4.02%             |
| gmean too shallow %                   | 2.5%           | 5.96%             | 14.59%            |
| gmean task wakeup latency (asynctask) | 78.16μs (0.0%) | 61.60μs (-21.19%) | 54.45μs (-30.34%) |

2. Jankbench (better score, latency & less power usage)

| metric                                | menu           | teo               | teo-util-aware    |
| ------------------------------------- | -------------- | ----------------- | ----------------- |
| gmean frame duration                  | 13.9 (0.0%)    | 14.7 (6.0%)       | 12.6 (-9.0%)      |
| gmean jank percentage                 | 1.5 (0.0%)     | 2.1 (36.99%)      | 1.3 (-17.37%)     |
| gmean power usage [mW]                | 144.6 (0.0%)   | 136.9 (-5.27%)    | 121.3 (-16.08%)   |
| gmean too deep %                      | 26.00%         | 11.00%            | 2.54%             |
| gmean too shallow %                   | 4.74%          | 11.89%            | 21.93%            |
| gmean wakeup latency (RenderThread)   | 139.5μs (0.0%) | 116.5μs (-16.49%) | 91.11μs (-34.7%)  |
| gmean wakeup latency (surfaceflinger) | 124.0μs (0.0%) | 151.9μs (22.47%)  | 87.65μs (-29.33%) |

Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
[ rjw: Comment edits and white space adjustments ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
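The threshold check described in this commit message can be sketched as follows. This is a simplified user-space illustration, not the TEO governor code: `util_threshold()` and `adjust_for_utilization()` are hypothetical names, and the guard that leaves index 0 untouched is an assumption of the sketch. Only the `capacity >> 6` threshold and the "reduce the index by 1" rule come from the text above.

```c
#include <assert.h>

#define UTIL_THRESHOLD_SHIFT 6  /* capacity >> 6, i.e. 1/64 of capacity */

/* Utilization threshold derived from the CPU's capacity. */
unsigned long util_threshold(unsigned long cpu_capacity)
{
    return cpu_capacity >> UTIL_THRESHOLD_SHIFT;
}

/*
 * Given the idle state index picked by the regular metrics, step one
 * state shallower when the CPU counts as utilized. Index 0 is already
 * the shallowest state, so it is returned unchanged (sketch assumption).
 */
int adjust_for_utilization(int candidate_idx, unsigned long util,
                           unsigned long cpu_capacity)
{
    assert(candidate_idx >= 0);

    if (util > util_threshold(cpu_capacity) && candidate_idx > 0)
        return candidate_idx - 1;

    return candidate_idx;
}
```

For a CPU with capacity 1024 the threshold is 16, so any util value above 16 makes the sketch pick the next shallower state.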
2023-01-10 | cpuidle: teo: Optionally skip polling states in teo_find_shallower_state() | Kajetan Puchalski
Add a no_poll flag to teo_find_shallower_state() that will let the function optionally not consider polling states. This allows the caller to guard against the function inadvertently resulting in TEO putting the CPU in a polling state when that behaviour is undesirable. Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
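A search with such a flag might look like the sketch below. It is a user-space approximation under stated assumptions: `demo_state` and `find_shallower_state()` are hypothetical, and the exact loop structure of the real teo_find_shallower_state() is not shown in this log. The sketch walks from the candidate towards shallower states, skipping disabled states and, when `no_poll` is set, polling states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified idle state description. */
struct demo_state {
    bool disabled;               /* state may not be used */
    bool polling;                /* state busy-polls instead of sleeping */
    int64_t target_residency_ns; /* minimum worthwhile idle duration */
};

/*
 * Starting below state_idx, look for a shallower usable state whose
 * target residency fits within duration_ns. With no_poll set, polling
 * states are never chosen. If nothing usable is found, the original
 * index is returned.
 */
int find_shallower_state(const struct demo_state *states, int state_idx,
                         int64_t duration_ns, bool no_poll)
{
    assert(states != NULL && state_idx >= 0);

    for (int i = state_idx - 1; i >= 0; i--) {
        if (states[i].disabled || (no_poll && states[i].polling))
            continue;

        state_idx = i;
        if (states[i].target_residency_ns <= duration_ns)
            break;
    }
    return state_idx;
}
```

The flag leaves the decision to the caller: a caller that would rather keep a slightly too deep sleeping state than fall back to busy polling passes `no_poll = true`.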
2022-12-19 | Merge tag 'powerpc-6.2-1' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Add powerpc qspinlock implementation optimised for large system scalability and paravirt. See the merge message for more details - Enable objtool to be built on powerpc to generate mcount locations - Use a temporary mm for code patching with the Radix MMU, so the writable mapping is restricted to the patching CPU - Add an option to build the 64-bit big-endian kernel with the ELFv2 ABI - Sanitise user registers on interrupt entry on 64-bit Book3S - Many other small features and fixes Thanks to Aboorva Devarajan, Angel Iglesias, Benjamin Gray, Bjorn Helgaas, Bo Liu, Chen Lifu, Christoph Hellwig, Christophe JAILLET, Christophe Leroy, Christopher M. Riedl, Colin Ian King, Deming Wang, Disha Goel, Dmitry Torokhov, Finn Thain, Geert Uytterhoeven, Gustavo A. R. Silva, Haowen Bai, Joel Stanley, Jordan Niethe, Julia Lawall, Kajol Jain, Laurent Dufour, Li zeming, Miaoqian Lin, Michael Jeanson, Nathan Lynch, Naveen N. Rao, Nayna Jain, Nicholas Miehlbradt, Nicholas Piggin, Pali Rohár, Randy Dunlap, Rohan McLure, Russell Currey, Sathvika Vasireddy, Shaomin Deng, Stephen Kitt, Stephen Rothwell, Thomas Weißschuh, Tiezhu Yang, Uwe Kleine-König, Xie Shaowen, Xiu Jianfeng, XueBing Chen, Yang Yingliang, Zhang Jiaming, ruanjinjie, Jessica Yu, and Wolfram Sang. 
* tag 'powerpc-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (181 commits) powerpc/code-patching: Fix oops with DEBUG_VM enabled powerpc/qspinlock: Fix 32-bit build powerpc/prom: Fix 32-bit build powerpc/rtas: mandate RTAS syscall filtering powerpc/rtas: define pr_fmt and convert printk call sites powerpc/rtas: clean up includes powerpc/rtas: clean up rtas_error_log_max initialization powerpc/pseries/eeh: use correct API for error log size powerpc/rtas: avoid scheduling in rtas_os_term() powerpc/rtas: avoid device tree lookups in rtas_os_term() powerpc/rtasd: use correct OF API for event scan rate powerpc/rtas: document rtas_call() powerpc/pseries: unregister VPA when hot unplugging a CPU powerpc/pseries: reset the RCU watchdogs after a LPM powerpc: Take in account addition CPU node when building kexec FDT powerpc: export the CPU node count powerpc/cpuidle: Set CPUIDLE_FLAG_POLLING for snooze state powerpc/dts/fsl: Fix pca954x i2c-mux node names cxl: Remove unnecessary cxl_pci_window_alignment() selftests/powerpc: Fix resource leaks ...
2022-12-06 | powerpc/cpuidle: Set CPUIDLE_FLAG_POLLING for snooze state | Aboorva Devarajan
During a comparative study of cpuidle governors, it was noticed that the menu governor does not select the CEDE state in some scenarios even when the sleep duration of the CPU exceeds the target residency of the CEDE idle state. This is because the CPU exits the snooze "polling" state when the snooze time limit is reached in snooze_loop(), which is not a real wakeup; it just means that the polling state selection was not adequate.

cpuidle governors rely on the CPUIDLE_FLAG_POLLING flag being set for polling states to handle the condition mentioned above. Hence, set the CPUIDLE_FLAG_POLLING flag for the snooze state (polling state) in the powerpc arch to make the cpuidle governor work as expected.

Reference Commits:
- Timeout enabled for snooze state: commit 78eaa10f027c ("cpuidle: powernv/pseries: Auto-promotion of snooze to deeper idle state")
- commit dc2251bf98c6 ("cpuidle: Eliminate the CPUIDLE_DRIVER_STATE_START symbol")
- Fix wakeup stats in governor for polling states: commit 5f26bdceb9c0 ("cpuidle: menu: Fix wakeup statistics updates for polling state")

Signed-off-by: Aboorva Devarajan <aboorvad@linux.vnet.ibm.com>
Tested-by: Vishal Chourasia <vishalc@linux.vnet.ibm.com>
Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
Reviewed-by: Vishal Chourasia <vishalc@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221114145611.37669-1-aboorvad@linux.vnet.ibm.com
2022-10-28 | cpuidle: dt: Clarify a comment and simplify code in dt_init_idle_driver() | Ulf Hansson
The drv->state_count is assigned the total number of available states, so let's make that clear. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-10-28 | cpuidle: dt: Return the correct numbers of parsed idle states | Ulf Hansson
While we correctly skip initializing an idle state from a disabled idle state node in DT, the value returned from dt_init_idle_driver() doesn't get adjusted accordingly. Instead, the number of found idle state nodes is returned, while the callers expect the number of idle states successfully initialized from DT. This leads to cpuidle drivers unnecessarily continuing to initialize their idle state specific data. Moreover, in the case when all idle states have been disabled in DT, we would end up registering a cpuidle driver, rather than relying on the default arch specific idle call. Fixes: 9f14da345599 ("drivers: cpuidle: implement DT based idle states infrastructure") Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
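The difference between "nodes found" and "states initialized" can be shown with a small sketch. It is not the dt_init_idle_driver() implementation: `demo_idle_node` and `demo_init_idle_states()` are hypothetical, and a plain boolean stands in for the device tree node's availability. The point is only that the return value counts states actually set up, so a caller can treat 0 as "do not register a driver".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an idle state node found in the device tree. */
struct demo_idle_node {
    const char *name;
    bool available;  /* false models a node disabled in DT */
};

/*
 * Fill state_names from the available nodes and return how many states
 * were initialized, which may be fewer than node_count.
 */
int demo_init_idle_states(const struct demo_idle_node *nodes, int node_count,
                          const char **state_names, int max_states)
{
    int initialized = 0;

    assert(nodes != NULL && state_names != NULL);

    for (int i = 0; i < node_count && initialized < max_states; i++) {
        if (!nodes[i].available)
            continue;  /* skipped nodes must not be counted */

        state_names[initialized++] = nodes[i].name;
    }
    return initialized;
}
```

Returning `node_count` here instead of `initialized` would reproduce the reported problem: with every node disabled, the caller would still see a positive count and go on to register a driver.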
2022-10-28 | cpuidle: psci: Extend information in log about OSI/PC mode | Ulf Hansson
It's useful to understand whether we are using OS-initiated (OSI) mode or Platform Coordinated (PC) mode, when initializing the CPU PM domains. Therefore, let's extend the print in the log after a successful probe with this information. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-10-09Merge tag 'riscv-for-linus-6.1-mw1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: - Improvements to the CPU topology subsystem, which fix some issues where RISC-V would report bad topology information. - The default NR_CPUS has increased to XLEN, and the maximum configurable value is 512. - The CD-ROM filesystems have been enabled in the defconfig. - Support for THP_SWAP has been added for rv64 systems. There are also a handful of cleanups and fixes throughout the tree. * tag 'riscv-for-linus-6.1-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: enable THP_SWAP for RV64 RISC-V: Print SSTC in canonical order riscv: compat: s/failed/unsupported if compat mode isn't supported RISC-V: Increase range and default value of NR_CPUS cpuidle: riscv-sbi: Fix CPU_PM_CPU_IDLE_ENTER_xyz() macro usage perf: RISC-V: throttle perf events perf: RISC-V: exclude invalid pmu counters from SBI calls riscv: enable CD-ROM file systems in defconfig riscv: topology: fix default topology reporting arm64: topology: move store_cpu_topology() to shared code
2022-10-06Merge tag 'arm-drivers-6.1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM driver updates from Arnd Bergmann: "The drivers branch for 6.1 is a bit larger than for most releases. Most of the changes come from SoC maintainers for the drivers/soc subsystem: - A new driver for error handling on the NVIDIA Tegra 'control backbone' bus. - A new driver for Qualcomm LLCC/DDR bandwidth measurement - New Rockchip rv1126 and rk3588 power domain drivers - DT binding updates for memory controllers, older Rockchip SoCs, various Mediatek devices, Qualcomm SCM firmware - Minor updates to Hisilicon LPC bus, the Allwinner SRAM driver, the Apple rtkit firmware driver, Tegra firmware - Minor updates for SoC drivers (Samsung, Mediatek, Renesas, Tegra, Qualcomm, Broadcom, NXP, ...) There are also some separate subsystem with downstream maintainers that merge updates this way: - Various updates and new drivers in the memory controller subsystem for Mediatek and Broadcom SoCs - Small set of changes in preparation to add support for FF-A v1.1 specification later, in the Arm FF-A firmware subsystem - debugfs support in the PSCI firmware subsystem" * tag 'arm-drivers-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (149 commits) ARM: remove check for CONFIG_DEBUG_LL_SER3 firmware/psci: Add debugfs support to ease debugging firmware/psci: Print a warning if PSCI doesn't accept PC mode dt-bindings: memory: snps,dw-umctl2-ddrc: Extend schema with IRQs/resets/clocks props dt-bindings: memory: snps,dw-umctl2-ddrc: Replace opencoded numbers with macros dt-bindings: memory: snps,dw-umctl2-ddrc: Use more descriptive device name dt-bindings: memory: synopsys,ddrc-ecc: Detach Zynq DDRC controller support soc: sunxi: sram: Add support for the D1 system control soc: sunxi: sram: Export the LDO control register soc: sunxi: sram: Save a pointer to the OF match data soc: sunxi: sram: Return void from the release function soc: apple: rtkit: Add apple_rtkit_poll soc: imx: add i.MX93 media blk ctrl driver 
soc: imx: add i.MX93 SRC power domain driver soc: imx: imx8m-blk-ctrl: Use genpd_xlate_onecell soc: imx: imx8mp-blk-ctrl: handle PCIe PHY resets soc: imx: imx8m-blk-ctrl: add i.MX8MP VPU blk ctrl soc: imx: add i.MX8MP HDMI blk ctrl HDCP/HRV_MWR soc: imx: add icc paths for i.MX8MP hsio/hdmi blk ctrl soc: imx: add icc paths for i.MX8MP media blk ctrl ...
2022-09-28firmware/psci: Print a warning if PSCI doesn't accept PC modeDmitry Baryshkov
The function psci_pd_try_set_osi_mode() will print an error if enabling OSI mode fails. To ease debugging PSCI issues print corresponding message if switching to PC mode fails too. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20220926110249.666813-1-dmitry.baryshkov@linaro.org' Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-09-23cpuidle: riscv-sbi: Fix CPU_PM_CPU_IDLE_ENTER_xyz() macro usageAnup Patel
Currently, we are using CPU_PM_CPU_IDLE_ENTER_PARAM() for all SBI HSM suspend types, so retentive suspend types are also treated as non-retentive and the kernel will do redundant additional work for these states. BIT[31] of the SBI HSM suspend type allows us to differentiate between retentive and non-retentive suspend types, so we should use this bit to call the appropriate CPU_PM_CPU_IDLE_ENTER_xyz() macro. Fixes: 6abf32f1d9c5 ("cpuidle: Add RISC-V SBI CPU idle driver") Signed-off-by: Anup Patel <apatel@ventanamicro.com> Link: https://lore.kernel.org/r/20220718084553.2056169-1-apatel@ventanamicro.com/ Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
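The bit test described above can be sketched as a stand-alone helper. The `SKETCH_` and `sketch_` names are made up for this example, and the enum merely labels which family of enter macro a driver would pick; the real driver calls the CPU_PM_CPU_IDLE_ENTER_xyz() macros directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 31 of the SBI HSM suspend type; the macro name is illustrative. */
#define SKETCH_HSM_SUSP_NON_RET_BIT (UINT32_C(1) << 31)

/* Which kind of enter path a driver would choose (labels are hypothetical). */
enum sketch_enter_path {
	SKETCH_ENTER_RETENTION,		/* CPU state is retained */
	SKETCH_ENTER_NON_RETENTION,	/* CPU state is lost, full save/restore */
};

/* A suspend type is retentive when bit 31 is clear. */
bool sketch_suspend_is_retentive(uint32_t suspend_type)
{
	return (suspend_type & SKETCH_HSM_SUSP_NON_RET_BIT) == 0;
}

/* Pick the enter path from the suspend type instead of using one for all. */
enum sketch_enter_path sketch_pick_enter_path(uint32_t suspend_type)
{
	return sketch_suspend_is_retentive(suspend_type)
		? SKETCH_ENTER_RETENTION
		: SKETCH_ENTER_NON_RETENTION;
}
```

Treating every suspend type as non-retentive is still functionally safe, which is presumably why the original code worked; the cost is the redundant work the commit message mentions.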
2022-09-03cpuidle: Remove redundant check in cpuidle_switch_governor()Yu Liao
gov has already been NULL checked at the beginning of cpuidle_switch_governor, so remove redundant check. While at it, use pr_info() instead of printk() to address the following checkpatch warning: WARNING: Prefer [subsystem eg: netdev]_info([subsystem]dev, ... then dev_info(dev, ... then pr_info(... to printk(KERN_INFO ... Signed-off-by: Yu Liao <liaoyu15@huawei.com> [ rjw: Subject and changelog edits, added empty line after if () ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-08-31cpuidle: powernv: move from strlcpy() with unused retval to strscpy()Wolfram Sang
Follow the advice of the below link and prefer 'strscpy' in this subsystem. Conversion is 1:1 because the return value is not used. Generated by a coccinelle script. Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/ Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-08-31cpuidle: coupled: Drop duplicate word from a commentJason Wang
The double `are' is duplicated in the comment, remove one. Signed-off-by: Jason Wang <wangborong@cdjrlc.com> [ rjw: New subject ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-08-08Merge tag 'pm-5.20-rc1-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more power management updates from Rafael Wysocki: "These are ARM cpufreq updates and operating performance points (OPP) updates plus one cpuidle update adding a new trace point. Specifics: - Fix return error code in mtk_cpu_dvfs_info_init (Yang Yingliang). - Minor cleanups and support for new boards for Qcom cpufreq drivers (Bryan O'Donoghue, Konrad Dybcio, Pierre Gondois, and Yicong Yang). - Fix sparse warnings for Tegra cpufreq driver (Viresh Kumar). - Make dev_pm_opp_set_regulators() accept NULL terminated list (Viresh Kumar). - Add dev_pm_opp_set_config() and friends and migrate other users and helpers to using them (Viresh Kumar). - Add support for multiple clocks for a device (Viresh Kumar and Krzysztof Kozlowski). - Configure resources before adding OPP table for Venus (Stanimir Varbanov). - Keep reference count up for opp->np and opp_table->np while they are still in use (Liang He). - Minor OPP cleanups (Viresh Kumar and Yang Li). 
- Add a trace event for cpuidle to track missed (too deep or too shallow) wakeups (Kajetan Puchalski)" * tag 'pm-5.20-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (55 commits) cpuidle: Add cpu_idle_miss trace event venus: pm_helpers: Fix warning in OPP during probe OPP: Don't drop opp->np reference while it is still in use OPP: Don't drop opp_table->np reference while it is still in use cpufreq: tegra194: Staticize struct tegra_cpufreq_soc instances dt-bindings: cpufreq: cpufreq-qcom-hw: Add SM6375 compatible dt-bindings: opp: Add msm8939 to the compatible list dt-bindings: opp: Add missing compat devices dt-bindings: opp: opp-v2-kryo-cpu: Fix example binding checks cpufreq: Change order of online() CB and policy->cpus modification cpufreq: qcom-hw: Remove deprecated irq_set_affinity_hint() call cpufreq: qcom-hw: Disable LMH irq when disabling policy cpufreq: qcom-hw: Reset cancel_throttle when policy is re-enabled cpufreq: qcom-cpufreq-hw: use HZ_PER_KHZ macro in units.h cpufreq: mediatek: fix error return code in mtk_cpu_dvfs_info_init() OPP: Remove dev{m}_pm_opp_of_add_table_noclk() PM / devfreq: tegra30: Register config_clks helper OPP: Allow config_clks helper for single clk case OPP: Provide a simple implementation to configure multiple clocks OPP: Assert clk_count == 1 for single clk helpers ...
2022-08-04Merge tag 'spdx-6.0-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx Pull SPDX updates from Greg KH: "Here is the set of SPDX comment updates for 6.0-rc1. Nothing huge here, just a number of updated SPDX license tags and cleanups based on the review of a number of common patterns in GPLv2 boilerplate text. Also included in here are a few other minor updates, two USB files, and one Documentation file update to get the SPDX lines correct. All of these have been in the linux-next tree for a very long time" * tag 'spdx-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx: (28 commits) Documentation: samsung-s3c24xx: Add blank line after SPDX directive x86/crypto: Remove stray comment terminator treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_406.RULE treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_398.RULE treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_391.RULE treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_390.RULE treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_385.RULE treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_320.RULE treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_319.RULE treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_318.RULE treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_298.RULE treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_292.RULE treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_179.RULE treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_168.RULE (part 2) treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_168.RULE (part 1) treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_160.RULE treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_152.RULE treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_149.RULE treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_147.RULE 
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_133.RULE ...
2022-08-03cpuidle: Add cpu_idle_miss trace eventKajetan Puchalski
Add a trace event for cpuidle to track missed (too deep or too shallow) wakeups. After each wakeup, CPUIdle already computes whether the entered state was optimal, above or below the desired one and updates the relevant counters. This patch makes it possible to trace those events in addition to just reading the counters. The patterns of types and percentages of misses across different workloads appear to be very consistent. This makes the trace event very useful for comparing the relative correctness of different CPUIdle governors for different types of workloads, or for finding the optimal governor for a given device. Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
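The "too deep or too shallow" classification mentioned above can be sketched roughly as follows. This is a simplified illustration with hypothetical names: it assumes states are ordered from shallowest to deepest by target residency, and it ignores disabled states and exit latency, both of which the real accounting would need to consider.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Outcome of comparing the entered state with the observed idle duration. */
enum sketch_miss {
	SKETCH_HIT,		/* the entered state was a reasonable choice */
	SKETCH_TOO_DEEP,	/* woke up before the target residency was met */
	SKETCH_TOO_SHALLOW,	/* slept long enough for the next deeper state */
};

/*
 * Classify one wakeup. target_residency_ns[] is assumed to be sorted in
 * ascending order, index 0 being the shallowest state.
 */
enum sketch_miss sketch_classify(const uint64_t *target_residency_ns,
				 size_t n_states, size_t entered,
				 uint64_t idle_ns)
{
	assert(entered < n_states);

	/* A shallower state exists and the sleep was too short for this one. */
	if (entered > 0 && idle_ns < target_residency_ns[entered])
		return SKETCH_TOO_DEEP;

	/* A deeper state exists and the sleep would have satisfied it. */
	if (entered + 1 < n_states &&
	    idle_ns >= target_residency_ns[entered + 1])
		return SKETCH_TOO_SHALLOW;

	return SKETCH_HIT;
}
```

A trace consumer comparing governors would count the two miss kinds per workload, which is the use case the commit message describes.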
2022-08-02Merge tag 'rcu.2022.07.26a' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull RCU updates from Paul McKenney: - Documentation updates - Miscellaneous fixes - Callback-offload updates, perhaps most notably a new RCU_NOCB_CPU_DEFAULT_ALL Kconfig option that causes all CPUs to be offloaded at boot time, regardless of kernel boot parameters. This is useful to battery-powered systems such as ChromeOS and Android. In addition, a new RCU_NOCB_CPU_CB_BOOST kernel boot parameter prevents offloaded callbacks from interfering with real-time workloads and with energy-efficiency mechanisms - Polled grace-period updates, perhaps most notably making these APIs account for both normal and expedited grace periods - Tasks RCU updates, perhaps most notably reducing the CPU overhead of RCU tasks trace grace periods by more than a factor of two on a system with 15,000 tasks. The reduction is expected to increase with the number of tasks, so it seems reasonable to hypothesize that a system with 150,000 tasks might see a 20-fold reduction in CPU overhead - Torture-test updates - Updates that merge RCU's dyntick-idle tracking into context tracking, thus reducing the overhead of transitioning to kernel mode from either idle or nohz_full userspace execution for kernels that track context independently of RCU. 
This is expected to be helpful primarily for kernels built with CONFIG_NO_HZ_FULL=y * tag 'rcu.2022.07.26a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (98 commits) rcu: Add irqs-disabled indicator to expedited RCU CPU stall warnings rcu: Diagnose extended sync_rcu_do_polled_gp() loops rcu: Put panic_on_rcu_stall() after expedited RCU CPU stall warnings rcutorture: Test polled expedited grace-period primitives rcu: Add polled expedited grace-period primitives rcutorture: Verify that polled GP API sees synchronous grace periods rcu: Make Tiny RCU grace periods visible to polled APIs rcu: Make polled grace-period API account for expedited grace periods rcu: Switch polled grace-period APIs to ->gp_seq_polled rcu/nocb: Avoid polling when my_rdp->nocb_head_rdp list is empty rcu/nocb: Add option to opt rcuo kthreads out of RT priority rcu: Add nocb_cb_kthread check to rcu_is_callbacks_kthread() rcu/nocb: Add an option to offload all CPUs on boot rcu/nocb: Fix NOCB kthreads spawn failure with rcu_nocb_rdp_deoffload() direct call rcu/nocb: Invert rcu_state.barrier_mutex VS hotplug lock locking order rcu/nocb: Add/del rdp to iterate from rcuog itself rcu/tree: Add comment to describe GP-done condition in fqs loop rcu: Initialize first_gp_fqs at declaration in rcu_gp_fqs() rcu/kvfree: Remove useless monitor_todo flag rcu: Cleanup RCU urgency state for offline CPU ...
2022-08-02Merge tag 'pm-5.20-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "These are mostly minor improvements all over including new CPU IDs for the Intel RAPL driver, an Energy Model rework to use micro-Watt as the power unit, cpufreq fixes and cleanups, cpuidle updates, devfreq updates, documentation cleanups and a new version of the pm-graph suite of utilities. Specifics: - Make cpufreq_show_cpus() more straightforward (Viresh Kumar). - Drop unnecessary CPU hotplug locking from store() used by cpufreq sysfs attributes (Viresh Kumar). - Make the ACPI cpufreq driver support the boost control interface on Zhaoxin/Centaur processors (Tony W Wang-oc). - Print a warning message on attempts to free an active cpufreq policy which should never happen (Viresh Kumar). - Fix grammar in the Kconfig help text for the loongson2 cpufreq driver (Randy Dunlap). - Use cpumask_var_t for an on-stack CPU mask in the ondemand cpufreq governor (Zhao Liu). - Add trace points for guest_halt_poll_ns grow/shrink to the haltpoll cpuidle driver (Eiichi Tsukata). - Modify intel_idle to treat C1 and C1E as independent idle states on Sapphire Rapids (Artem Bityutskiy). - Extend support for wakeirq to callback wrappers used during system suspend and resume (Ulf Hansson). - Defer waiting for device probe before loading a hibernation image till the first actual device access to avoid possible deadlocks reported by syzbot (Tetsuo Handa). - Unify device_init_wakeup() for PM_SLEEP and !PM_SLEEP (Bjorn Helgaas). - Add Raptor Lake-P to the list of processors supported by the Intel RAPL driver (George D Sworo). - Add Alder Lake-N and Raptor Lake-P to the list of processors for which Power Limit4 is supported in the Intel RAPL driver (Sumeet Pawnikar). - Make pm_genpd_remove() check genpd_debugfs_dir against NULL before attempting to remove it (Hsin-Yi Wang). 
- Change the Energy Model code to represent power in micro-Watts and adjust its users accordingly (Lukasz Luba). - Add new devfreq driver for Mediatek CCI (Cache Coherent Interconnect) (Johnson Wang). - Convert the Samsung Exynos SoC Bus bindings to DT schema of exynos-bus.c (Krzysztof Kozlowski). - Address kernel-doc warnings by adding the description for unused function parameters in devfreq core (Mauro Carvalho Chehab). - Use NULL to pass a null pointer rather than zero according to the function prototype in imx-bus.c (Colin Ian King). - Print error message instead of error integer value in tegra30-devfreq.c (Dmitry Osipenko). - Add checks to prevent setting negative frequency QoS limits for CPUs (Shivnandan Kumar). - Update the pm-graph suite of utilities to the latest revision 5.9 including multiple improvements (Todd Brandt). - Drop pme_interrupt reference from the PCI power management documentation (Mario Limonciello)" * tag 'pm-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (27 commits) powercap: RAPL: Add Power Limit4 support for Alder Lake-N and Raptor Lake-P PM: QoS: Add check to make sure CPU freq is non-negative PM: hibernate: defer device probing when resuming from hibernation intel_idle: make SPR C1 and C1E be independent cpufreq: ondemand: Use cpumask_var_t for on-stack cpu mask cpufreq: loongson2: fix Kconfig "its" grammar pm-graph v5.9 cpufreq: Warn users while freeing active policy cpufreq: scmi: Support the power scale in micro-Watts in SCMI v3.1 firmware: arm_scmi: Get detailed power scale from perf Documentation: EM: Switch to micro-Watts scale PM: EM: convert power field to micro-Watts precision and align drivers PM / devfreq: tegra30: Add error message for devm_devfreq_add_device() PM / devfreq: imx-bus: use NULL to pass a null pointer rather than zero PM / devfreq: shut up kernel-doc warnings dt-bindings: interconnect: samsung,exynos-bus: convert to dtschema PM / devfreq: mediatek: Introduce MediaTek CCI 
devfreq driver dt-bindings: interconnect: Add MediaTek CCI dt-bindings PM: domains: Ensure genpd_debugfs_dir exists before remove PM: runtime: Extend support for wakeirq for force_suspend|resume ...
2022-07-05context_tracking: Take IRQ eqs entrypoints over RCUFrederic Weisbecker
The RCU dynticks counter is going to be merged into the context tracking subsystem. Prepare with moving the IRQ extended quiescent states entrypoints to context tracking. For now those are dumb redirection to existing RCU calls. [ paulmck: Apply Stephen Rothwell feedback from -next. ] [ paulmck: Apply Nathan Chancellor feedback. ] Acked-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com> Cc: Uladzislau Rezki <uladzislau.rezki@sony.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Nicolas Saenz Julienne <nsaenz@kernel.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com> Cc: Yu Liao <liaoyu15@huawei.com> Cc: Phil Auld <pauld@redhat.com> Cc: Paul Gortmaker<paul.gortmaker@windriver.com> Cc: Alex Belits <abelits@marvell.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Nicolas Saenz Julienne <nsaenzju@redhat.com> Tested-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
2022-07-05context_tracking: Take idle eqs entrypoints over RCUFrederic Weisbecker
The RCU dynticks counter is going to be merged into the context tracking subsystem. Start with moving the idle extended quiescent states entrypoints to context tracking. For now those are dumb redirections to existing RCU calls. [ paulmck: Apply kernel test robot feedback. ] Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com> Cc: Uladzislau Rezki <uladzislau.rezki@sony.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Nicolas Saenz Julienne <nsaenz@kernel.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com> Cc: Yu Liao <liaoyu15@huawei.com> Cc: Phil Auld <pauld@redhat.com> Cc: Paul Gortmaker<paul.gortmaker@windriver.com> Cc: Alex Belits <abelits@marvell.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Nicolas Saenz Julienne <nsaenzju@redhat.com> Tested-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
2022-06-23cpuidle: cpuidle-arm: remove arm64 supportMichael Walle
Since commit 788961462f34 ("ARM: psci: cpuidle: Enable PSCI CPUidle driver") the generic ARM cpuidle driver doesn't probe anymore because arm_cpuidle_init() will always return -EOPNOTSUPP. That is, because the mentioned commit removes the only .cpu_suspend and .cpu_init_idle provider. Signed-off-by: Michael Walle <michael@walle.cc> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20220529181329.2345722-2-michael@walle.cc Signed-off-by: Will Deacon <will@kernel.org>
2022-06-14cpuidle: haltpoll: Add trace points for guest_halt_poll_ns grow/shrinkEiichi Tsukata
Add trace points as are implemented in KVM host halt polling. This helps tune guest halt polling params. Signed-off-by: Eiichi Tsukata <eiichi.tsukata@nutanix.com> Acked-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-06-10treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_56.RULE ↵Thomas Gleixner
(part 2) Based on the normalized pattern: this file is licensed under the terms of the gnu general public license version 2 this program is licensed as is without any warranty of any kind whether express or implied extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-23Merge branches 'pm-em' and 'pm-cpuidle'Rafael J. Wysocki
Merge Energy Model support updates and cpuidle updates for 5.19-rc1: - Update the Energy Model support code to allow the Energy Model to be artificial, which means that the power values may not be on a uniform scale with other devices providing power information, and update the cpufreq_cooling and devfreq_cooling thermal drivers to support artificial Energy Models (Lukasz Luba). - Make DTPM check the Energy Model type (Lukasz Luba). - Fix policy counter decrementation in cpufreq if Energy Model is in use (Pierre Gondois). - Add AlderLake processor support to the intel_idle driver (Zhang Rui). - Fix regression leading to no genpd governor in the PSCI cpuidle driver and fix the riscv-sbi cpuidle driver to allow a genpd governor to be used (Ulf Hansson). * pm-em: PM: EM: Decrement policy counter powercap: DTPM: Check for Energy Model type thermal: cooling: Check Energy Model type in cpufreq_cooling and devfreq_cooling Documentation: EM: Add artificial EM registration description PM: EM: Remove old debugfs files and print all 'flags' PM: EM: Change the order of arguments in the .active_power() callback PM: EM: Use the new .get_cost() callback while registering EM PM: EM: Add artificial EM flag PM: EM: Add .get_cost() callback * pm-cpuidle: cpuidle: riscv-sbi: Fix code to allow a genpd governor to be used cpuidle: psci: Fix regression leading to no genpd governor intel_idle: Add AlderLake support
2022-05-23Merge branches 'pm-core', 'pm-sleep' and 'powercap'Rafael J. Wysocki
Merge PM core changes, updates related to system sleep and power capping updates for 5.19-rc1: - Export dev_pm_ops instead of suspend() and resume() in the IIO chemical scd30 driver (Jonathan Cameron). - Add namespace variants of EXPORT[_GPL]_SIMPLE_DEV_PM_OPS and PM-runtime counterparts (Jonathan Cameron). - Move symbol exports in the IIO chemical scd30 driver into the IIO_SCD30 namespace (Jonathan Cameron). - Avoid device PM-runtime usage count underflows (Rafael Wysocki). - Allow dynamic debug to control printing of PM messages (David Cohen). - Fix some kernel-doc comments in hibernation code (Yang Li, Haowen Bai). - Preserve ACPI-table override during hibernation (Amadeusz Sławiński). - Improve support for suspend-to-RAM for PSCI OSI mode (Ulf Hansson). - Make Intel RAPL power capping driver support the RaptorLake and AlderLake N processors (Zhang Rui, Sumeet Pawnikar). - Remove redundant store to value after multiply in the RAPL power capping driver (Colin Ian King). * pm-core: PM: runtime: Avoid device usage count underflows iio: chemical: scd30: Move symbol exports into IIO_SCD30 namespace PM: core: Add NS varients of EXPORT[_GPL]_SIMPLE_DEV_PM_OPS and runtime pm equiv iio: chemical: scd30: Export dev_pm_ops instead of suspend() and resume() * pm-sleep: cpuidle: PSCI: Improve support for suspend-to-RAM for PSCI OSI mode PM: runtime: Allow to call __pm_runtime_set_status() from atomic context PM: hibernate: Don't mark comment as kernel-doc x86/ACPI: Preserve ACPI-table override during hibernation PM: hibernate: Fix some kernel-doc comments PM: sleep: enable dynamic debug support within pm_pr_dbg() PM: sleep: Narrow down -DDEBUG on kernel/power/ files * powercap: powercap: intel_rapl: remove redundant store to value after multiply powercap: intel_rapl: add support for ALDERLAKE_N powercap: RAPL: Add Power Limit4 support for RaptorLake powercap: intel_rapl: add support for RaptorLake