summaryrefslogtreecommitdiff
path: root/drivers/idle/intel_idle.c
AgeCommit message (Collapse)Author
2017-11-13Merge branch 'pm-cpuidle'Rafael J. Wysocki
* pm-cpuidle: intel_idle: Graceful probe failure when MWAIT is disabled cpuidle: Avoid assignment in if () argument cpuidle: Clean up cpuidle_enable_device() error handling a bit cpuidle: ladder: Add per CPU PM QoS resume latency support ARM: cpuidle: Refactor rollback operations if init fails ARM: cpuidle: Correct driver unregistration if init fails intel_idle: replace conditionals with static_cpu_has(X86_FEATURE_ARAT) cpuidle: fix broadcast control when broadcast can not be entered Conflicts: drivers/idle/intel_idle.c
2017-11-09intel_idle: Graceful probe failure when MWAIT is disabledLen Brown
When MWAIT is disabled, intel_idle refuses to probe. But it may mis-lead the user by blaming this on the model number: intel_idle: does not run on family 6 modesl 79 So defer the check for MWAIT until after the model# white-list check succeeds, and if the MWAIT check fails, tell the user how to fix it: intel_idle: Please enable MWAIT in BIOS SETUP Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-11-04Revert "x86/mm: Stop calling leave_mm() in idle code"Andy Lutomirski
This reverts commit 43858b4f25cf0adc5c2ca9cf5ce5fdf2532941e5. The reason I removed the leave_mm() calls in question is because the heuristic wasn't needed after that patch. With the original version of my PCID series, we never flushed a "lazy cpu" (i.e. a CPU running kernel thread) due a flush on the loaded mm. Unfortunately, that caused architectural issues, so now I've reinstated these flushes on non-PCID systems in: commit b956575bed91 ("x86/mm: Flush more aggressively in lazy TLB mode"). That, in turn, gives us a power management and occasionally performance regression as compared to old kernels: a process that goes into a deep idle state on a given CPU and gets its mm flushed due to activity on a different CPU will wake the idle CPU. Reinstate the old ugly heuristic: if a CPU goes into ACPI C3 or an intel_idle state that is likely to cause a TLB flush gets its mm switched to init_mm before going idle. FWIW, this heuristic is lousy. Whether we should change CR3 before idle isn't a good hint except insofar as the performance hit is a bit lower if the TLB is getting flushed by the idle code anyway. What we really want to know is whether we anticipate being idle long enough that the mm is likely to be flushed before we wake up. This is more a matter of the expected latency than the idle state that gets chosen. This heuristic also completely fails on systems that don't know whether the TLB will be flushed (e.g. AMD systems?). OTOH it may be a bit obsolete anyway -- PCID systems don't presently benefit from this heuristic at all. We also shouldn't do this callback from innermost bit of the idle code due to the RCU nastiness it causes. All the information need is available before rcu_idle_enter() needs to happen. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 43858b4f25cf "x86/mm: Stop calling leave_mm() in idle code" Link: http://lkml.kernel.org/r/c513bbd4e653747213e05bc7062de000bf0202a5.1509793738.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-11intel_idle: replace conditionals with static_cpu_has(X86_FEATURE_ARAT)Jason Baron
If the 'arat' cpu flag is set, then the conditionals in intel_idle() that guard calling tick_broadcast_enter()/exit() will never be true. Use static_cpu_has(X86_FEATURE_ARAT) to create a fast path to replace the conditional. Signed-off-by: Jason Baron <jbaron@akamai.com> Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-09-05Merge tag 'pm-4.14-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "This time (again) cpufreq gets the majority of changes which mostly are driver updates (including a major consolidation of intel_pstate), some schedutil governor modifications and core cleanups. There also are some changes in the system suspend area, mostly related to diagnostics and debug messages plus some renames of things related to suspend-to-idle. One major change here is that suspend-to-idle is now going to be preferred over S3 on systems where the ACPI tables indicate to do so and provide requsite support (the Low Power Idle S0 _DSM in particular). The system sleep documentation and the tools related to it are updated too. The rest is a few cpuidle changes (nothing major), devfreq updates, generic power domains (genpd) framework updates and a few assorted modifications elsewhere. Specifics: - Drop the P-state selection algorithm based on a PID controller from intel_pstate and make it use the same P-state selection method (based on the CPU load) for all types of systems in the active mode (Rafael Wysocki, Srinivas Pandruvada). - Rework the cpufreq core and governors to make it possible to take cross-CPU utilization updates into account and modify the schedutil governor to actually do so (Viresh Kumar). - Clean up the handling of transition latency information in the cpufreq core and untangle it from the information on which drivers cannot do dynamic frequency switching (Viresh Kumar). - Add support for new SoCs (MT2701/MT7623 and MT7622) to the mediatek cpufreq driver and update its DT bindings (Sean Wang). - Modify the cpufreq dt-platdev driver to autimatically create cpufreq devices for the new (v2) Operating Performance Points (OPP) DT bindings and update its whitelist of supported systems (Viresh Kumar, Shubhrajyoti Datta, Marc Gonzalez, Khiem Nguyen, Finley Xiao). - Add support for Ux500 to the cpufreq-dt driver and drop the obsolete dbx500 cpufreq driver (Linus Walleij, Arnd Bergmann). - Add new SoC (R8A7795) support to the cpufreq rcar driver (Khiem Nguyen). - Fix and clean up assorted issues in the cpufreq drivers and core (Arvind Yadav, Christophe Jaillet, Colin Ian King, Gustavo Silva, Julia Lawall, Leonard Crestez, Rob Herring, Sudeep Holla). - Update the IO-wait boost handling in the schedutil governor to make it less aggressive (Joel Fernandes). - Rework system suspend diagnostics to make it print fewer messages to the kernel log by default, add a sysfs knob to allow more suspend-related messages to be printed and add Low Power S0 Idle constraints checks to the ACPI suspend-to-idle code (Rafael Wysocki, Srinivas Pandruvada). - Prefer suspend-to-idle over S3 on ACPI-based systems with the ACPI_FADT_LOW_POWER_S0 flag set and the Low Power Idle S0 _DSM interface present in the ACPI tables (Rafael Wysocki). - Update documentation related to system sleep and rename a number of items in the code to make it cleare that they are related to suspend-to-idle (Rafael Wysocki). - Export a variable allowing device drivers to check the target system sleep state from the core system suspend code (Florian Fainelli). - Clean up the cpuidle subsystem to handle the polling state on x86 in a more straightforward way and to use %pOF instead of full_name (Rafael Wysocki, Rob Herring). - Update the devfreq framework to fix and clean up a few minor issues (Chanwoo Choi, Rob Herring). - Extend diagnostics in the generic power domains (genpd) framework and clean it up slightly (Thara Gopinath, Rob Herring). - Fix and clean up a couple of issues in the operating performance points (OPP) framework (Viresh Kumar, Waldemar Rymarkiewicz). - Add support for RV1108 to the rockchip-io Adaptive Voltage Scaling (AVS) driver (David Wu). - Fix the usage of notifiers in CPU power management on some platforms (Alex Shi). - Update the pm-graph system suspend/hibernation and boot profiling utility (Todd Brandt). - Make it possible to run the cpupower utility without CPU0 (Prarit Bhargava)" * tag 'pm-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (87 commits) cpuidle: Make drivers initialize polling state cpuidle: Move polling state initialization code to separate file cpuidle: Eliminate the CPUIDLE_DRIVER_STATE_START symbol cpufreq: imx6q: Fix imx6sx low frequency support cpufreq: speedstep-lib: make several arrays static, makes code smaller PM: docs: Delete the obsolete states.txt document PM: docs: Describe high-level PM strategies and sleep states PM / devfreq: Fix memory leak when fail to register device PM / devfreq: Add dependency on PM_OPP PM / devfreq: Move private devfreq_update_stats() into devfreq PM / devfreq: Convert to using %pOF instead of full_name PM / AVS: rockchip-io: add io selectors and supplies for RV1108 cpufreq: ti: Fix 'of_node_put' being called twice in error handling path cpufreq: dt-platdev: Drop few entries from whitelist cpufreq: dt-platdev: Automatically create cpufreq device with OPP v2 ARM: ux500: don't select CPUFREQ_DT cpuidle: Convert to using %pOF instead of full_name cpufreq: Convert to using %pOF instead of full_name PM / Domains: Convert to using %pOF instead of full_name cpufreq: Cap the default transition delay value to 10 ms ...
2017-09-04Merge branch 'pm-sleep'Rafael J. Wysocki
* pm-sleep: ACPI / PM: Check low power idle constraints for debug only PM / s2idle: Rename platform operations structure PM / s2idle: Rename ->enter_freeze to ->enter_s2idle PM / s2idle: Rename freeze_state enum and related items PM / s2idle: Rename PM_SUSPEND_FREEZE to PM_SUSPEND_TO_IDLE ACPI / PM: Prefer suspend-to-idle over S3 on some systems platform/x86: intel-hid: Wake up Dell Latitude 7275 from suspend-to-idle PM / suspend: Define pr_fmt() in suspend.c PM / suspend: Use mem_sleep_labels[] strings in messages PM / sleep: Put pm_test under CONFIG_PM_SLEEP_DEBUG PM / sleep: Check pm_wakeup_pending() in __device_suspend_noirq() PM / core: Add error argument to dpm_show_time() PM / core: Split dpm_suspend_noirq() and dpm_resume_noirq() PM / s2idle: Rearrange the main suspend-to-idle loop PM / timekeeping: Print debug messages when requested PM / sleep: Mark suspend/hibernation start and finish PM / sleep: Do not print debug messages by default PM / suspend: Export pm_suspend_target_state
2017-08-30cpuidle: Make drivers initialize polling stateRafael J. Wysocki
Make the drivers that want to include the polling state into their states table initialize it explicitly and drop the initialization of it (which in fact is conditional, but that is not obvious from the code) from the core. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-08-11PM / s2idle: Rename ->enter_freeze to ->enter_s2idleRafael J. Wysocki
Rename the ->enter_freeze cpuidle driver callback to ->enter_s2idle to make it clear that it is used for entering suspend-to-idle and rename the related functions, variables and so on accordingly. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-18Merge branch 'x86/boot' into x86/mm, to pick up interacting changesIngo Molnar
The SME patches we are about to apply add some E820 logic, so merge in pending E820 code changes first, to have a single code base. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-05x86/mm: Stop calling leave_mm() in idle codeAndy Lutomirski
Now that lazy TLB suppresses all flush IPIs (as opposed to all but the first), there's no need to leave_mm() when going idle. This means we can get rid of the rcuidle hack in switch_mm_irqs_off() and we can unexport leave_mm(). This also removes acpi_unlazy_tlb() from the x86 and ia64 headers, since it has no callers any more. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Nadav Amit <nadav.amit@gmail.com> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/03c699cfd6021e467be650d6b73deaccfe4b4bd7.1498751203.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-29intel_idle: Use more common logging styleJoe Perches
Remove #define PREFIX and add #define pr_fmt to use more common logging. Miscellanea: o Add missing newline to format o Convert a single printk without KERN_<LEVEL> to pr_info Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-05-01x86/intel_idle: add Gemini Lake supportDavid E. Box
Gemini Lake uses the same C-states as Broxton and also uses the IRTL MSR's to determine maximum C-state latency. Signed-off-by: David E. Box <david.e.box@linux.intel.com> Acked-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-02Merge tag 'pm-turbostat-4.11-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull turbostat utility updates from Rafael Wysocki: "Power management turbostat utility updates. These update turbostat significantly and in particular: - default output is now verbose, --debug is no longer required to get all counters. As a result, some options have been added to specify exactly what output is wanted. - added --quiet to skip system configuration output - added --list, --show and --hide parameters - added --cpu parameter - enhanced Baytrail SoC support - added Gemini Lake SoC support - added sysfs C-state columns Also the symbol definitions in arch/x86/include/asm/intel-family.h and arch/x86/include/asm/msr-index.h are updated and the intel_idle and intel_pstate drivers are modified to use the updated symbols. Credits to Len Brown for all of these changes" * tag 'pm-turbostat-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (44 commits) tools/power turbostat: version 17.02.24 tools/power turbostat: bugfix: --add u32 was printed as u64 tools/power turbostat: show error on exec tools/power turbostat: dump p-state software config tools/power turbostat: show package number, even without --debug tools/power turbostat: support "--hide C1" etc. tools/power turbostat: move --Package and --processor into the --cpu option tools/power turbostat: turbostat.8 update tools/power turbostat: update --list feature tools/power turbostat: use wide columns to display large numbers tools/power turbostat: Add --list option to show available header names tools/power turbostat: fix zero IRQ count shown in one-shot command mode tools/power turbostat: add --cpu parameter tools/power turbostat: print sysfs C-state stats tools/power turbostat: extend --add option to accept /sys path tools/power turbostat: skip unused counters on BDX tools/power turbostat: fix decoding for GLM, DNV, SKX turbo-ratio limits tools/power turbostat: skip unused counters on SKX tools/power turbostat: Denverton: use HW CC1 counter, skip C3, C7 tools/power turbostat: initial Gemini Lake SOC support ...
2017-03-01intel_idle: use new name for MSR_PKG_CST_CONFIG_CONTROLLen Brown
previously known as MSR_NHM_SNB_PKG_CST_CFG_CTL Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01intel_idle: stop exposing platform acronyms in sysfsLen Brown
Cosmetic only -- no functional change in this patch. sysfs before: state4/desc:MWAIT 0x20 state4/name:C6-HSW sysfs after: state4/desc:MWAIT 0x20 state4/name:C6 We remove the platform acronyms from the end of the state name (-HSW in this case) for three reasonse. 1. more consistency with acpi_idle, which prints C1, C2, C3 etc. 2. users know what platform they are on already an acronym for the processor code name here seems to cause more confusion than clarity. 3. less clutter in "cpupower monitor" output, which truncates the names to 4 columns. The precise definition of the state continues to be available in "desc". Reported-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-12-01intel_idle: Convert to hotplug state machineSebastian Andrzej Siewior
Install the callbacks via the state machine and let the core invoke the callbacks on the already online CPUs. The two smp_call_function_single() invocations in intel_idle_cpu_init() have been removed because intel_idle_cpu_init() is now invoked via the hotplug callback which runs on the target CPU. The IRQ-off calling convention for auto_demotion_disable() and c1e_promotion_disable() has not been preserved because only those two modify the MSR during CPU intialization. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-12-01intel_idle: Remove superfluous SMP fuction callAnna-Maria Gleixner
Since commit 1cf4f629d9d2 ("cpu/hotplug: Move online calls to hotplugged cpu") the CPU_ONLINE and CPU_DOWN_PREPARE notifiers are always run on the hot plugged CPU, and as of commit 3b9d6da67e11 ("cpu/hotplug: Fix rollback during error-out in __cpu_disable()") the CPU_DOWN_FAILED notifier also runs on the hot plugged CPU. This patch converts the SMP functional calls into direct calls. smp_function_call_single() executes the function with interrupts disabled. This calling convention is not preserved, because tick_broadcast_enable() and tick_braodcast_disable() handle interrupts themselves. Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-12-01x86/intel_idle: Add Knights Mill CPUIDPiotr Luc
Add Knights Mill (KNM) to the list of CPUIDs supported by intel_idle. Signed-off-by: Piotr Luc <piotr.luc@intel.com> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-01x86/intel_idle: Add CPU model 0x4a (Atom Z34xx series)Andy Shevchenko
Add CPU ID for Atom Z34xx processors. Datasheets indicate support for this, detailed information about potential quirks or limitations are missing, though. So we just reuse the definition from official BSP code. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2016-10-07nmi_backtrace: generate one-line reports for idle cpusChris Metcalf
When doing an nmi backtrace of many cores, most of which are idle, the output is a little overwhelming and very uninformative. Suppress messages for cpus that are idling when they are interrupted and just emit one line, "NMI backtrace for N skipped: idling at pc 0xNNN". We do this by grouping all the cpuidle code together into a new .cpuidle.text section, and then checking the address of the interrupted PC to see if it lies within that section. This commit suitably tags x86 and tile idle routines, and only adds in the minimal framework for other architectures. Link: http://lkml.kernel.org/r/1472487169-14923-5-git-send-email-cmetcalf@mellanox.com Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Daniel Thompson <daniel.thompson@linaro.org> [arm] Tested-by: Petr Mladek <pmladek@suse.com> Cc: Aaron Tomlin <atomlin@redhat.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Russell King <linux@arm.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-30Merge branch 'x86-cpu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cpufeature updates from Thomas Gleixner: - a workaround for the MONITOR instruction erratum of Goldmont CPUs - small fixes and cleanups here and there * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu: Add workaround for MONITOR instruction erratum on Goldmont based CPUs x86/cpu: Rename "WESTMERE2" family to "NEHALEM_G" x86/amd_nb: Clean up init path x86/cpufeature: Add helper macro for mask check macros x86/cpufeature: Make sure DISABLED/REQUIRED macros are updated x86/cpufeature: Update cpufeaure macros
2016-07-09intel_idle: correct BXT supportJan Beulich
Commit 5dcef69486 ("intel_idle: add BXT support") added an 8-element lookup array with just a 2-bit value used for lookups. As per the SDM that bit field is really 3 bits wide. While this is supposedly benign here, future re-use of the code for other CPUs might expose the issue. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-09intel_idle: re-work bxt_idle_state_table_update() and its helperJan Beulich
Since irtl_ns_units[] has itself zero entries, make sure the caller recognized those cases along with the MSR read returning zero, as zero is not a valid value for exit_latency and target_residency. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-01x86/cpu: Rename "WESTMERE2" family to "NEHALEM_G"Dave Hansen
Len Brown noticed something was amiss in our INTEL_FAM6_* definitions. It seems like model 0x1F was a Nehalem part, marketed as "Intel Core i7 and i5 Processors" (according to the SDM). But, although it was a Nehalem 0x1F had some uncore events which were shared with Westmere. Len also mentioned he thought it was called "Havendale", which Wikipedia says was graphics-oriented and canceled: https://en.wikipedia.org/wiki/Nehalem_(microarchitecture) So either way, it's probably not imporant what we call it, but call it Nehalem to be accurate, and add a "G" since it seems graphics-related. If it were canceled that would be a good reason why it's so sparsely and inconsistently referred to in the code. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dave Hansen <dave@sr71.net> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20160629192737.949C41A8@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-23idle_intel: Add DenvertonJacob Pan
Denverton is an Intel Atom based micro server which shares the same Goldmont architecture as Broxton. The available C-states on Denverton is a subset of Broxton with only C1, C1e, and C6. Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-06-23drivers/idle: make intel_idle.c driver more explicitly non-modularPaul Gortmaker
The Kconfig for this driver is currently declared with: config INTEL_IDLE bool "Cpuidle Driver for Intel Processors" ...meaning that it currently is not being built as a module by anyone. This was done in commit 6ce9cd8669fa1195fdc21643370e34523c7ac988 ("intel_idle: disable module support") since "...the module capability is cauing more trouble than it is worth." This was done over 5y ago, and Daniel adds that: ...the modular support has been removed from almost all the cpuidle drivers and the cpuidle framework is no longer assuming driver could be unloaded. Removing the modular dead code in the driver makes sense as this what have been done in the others drivers. So lets remove the modular code that is essentially orphaned, so that when reading the driver there is no doubt it is builtin-only. Since module_init translates to device_initcall in the non-modular case, the init ordering remains unchanged with this commit. At a later date we might want to consider whether subsys_init or another init category seems more appropriate than device_init. We replace module.h with moduleparam.h since the file does declare some module parameters, and leaving them as such is currently the easiest way to remain compatible with existing boot arg use cases. Note that MODULE_DEVICE_TABLE is a no-op for non-modular code. Also note that we can't remove intel_idle_cpuidle_devices_uninit() as that is still used for unwind purposes if the init fails. We also delete the MODULE_LICENSE tag etc. since all that information is already contained at the top of the file in the comments. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-06-08x86/intel_idle: Use Intel family macros for intel_idleDave Hansen
Use the new INTEL_FAM6_* macros for intel_idle.c. Also fix up some of the macros to be consistent with how some of the intel_idle code refers to the model. There's on oddity here: model 0x1F is uniquely referred to here and nowhere else that I could find. 0x1E/0x1F are just spelled out as "Intel Core i7 and i5 Processors" in the SDM or as "Intel processors based on the Nehalem, Westmere microarchitectures" in the RDPMC section. Comments between tables 19-19 and 19-20 in the SDM seem to point to 0x1F being some kind of Westmere, so let's call it "WESTMERE2". Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: jacob.jun.pan@intel.com Cc: linux-pm@vger.kernel.org Link: http://lkml.kernel.org/r/20160603001932.EE978EB9@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-09intel_idle: add BXT supportLen Brown
Broxton has all the HSW C-states, except C3. BXT C-state timing is slightly different. Here we trust the IRTL MSRs as authority on maximum C-state latency, and override the driver's tables with the values found in the associated IRTL MSRs. Further we set the target_residency to 1x maximum latency, trusting the hardware demotion logic. Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07intel_idle: Add KBL supportLen Brown
KBL is similar to SKL Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07intel_idle: Add SKX supportLen Brown
SKX is similar to BDX Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07intel_idle: Clean up all registered devices on exit.Richard Cochran
This driver registers cpuidle devices when a CPU comes online, but it leaves the registrations in place when a CPU goes offline. The module exit code only unregisters the currently online CPUs, leaving the devices for offline CPUs dangling. This patch changes the driver to clean up all registrations on exit, even those from CPUs that are offline. Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07intel_idle: Propagate hot plug errors.Richard Cochran
If a cpuidle registration error occurs during the hot plug notifier callback, we should really inform the hot plug machinery instead of just ignoring the error. This patch changes the callback to properly return on error. Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07intel_idle: Don't overreact to a cpuidle registration failure.Richard Cochran
The helper function, intel_idle_cpu_init, registers one new device with the cpuidle layer. If the registration should fail, that function immediately calls intel_idle_cpuidle_devices_uninit() to unregister every last CPU's device. However, it makes no sense to do so, when called from the hot plug notifier callback. This patch moves the call to intel_idle_cpuidle_devices_uninit() outside of the helper function to the one call site that actually needs to perform the de-registrations. Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07intel_idle: Setup the timer broadcast only on successful driver load.Richard Cochran
This driver sets the broadcast tick quite early on during probe and does not clean up again in cast of failure. This patch moves the setup call after the registration, placing the on_each_cpu() calls within the global CPU lock region. Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07intel_idle: Avoid a double free of the per-CPU data.Richard Cochran
The helper function, intel_idle_cpuidle_devices_uninit, frees the globally allocated per-CPU data. However, this function is invoked from the hot plug notifier callback at a time when freeing that data is not safe. If the call to cpuidle_register_driver() should fail (say, due to lack of memory), then the driver will free its per-CPU region. On the *next* CPU_ONLINE event, the driver will happily use the region again and even free it again if the failure repeats. This patch fixes the issue by moving the call to free_percpu() outside of the helper function at the two call sites that actually need to free the per-CPU data. Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07intel_idle: Fix dangling registration on error path.Richard Cochran
In the module_init() method, if the per-CPU allocation fails, then the active cpuidle registration is not cleaned up. This patch fixes the issue by attempting the allocation before registration, and then cleaning it up again on registration failure. Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07intel_idle: Fix deallocation order on the driver exit path.Richard Cochran
In the module_exit() method, this driver first frees its per-CPU pointer, then unregisters a callback making use of the pointer. Furthermore, the function, intel_idle_cpuidle_devices_uninit, is racy against CPU hot plugging as it calls for_each_online_cpu(). This patch corrects the issues by unregistering first on the exit path while holding the hot plug lock. Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07intel_idle: Remove redundant initialization calls.Richard Cochran
The function, intel_idle_cpuidle_driver_init, makes calls on each CPU to auto_demotion_disable() and c1e_promotion_disable(). These calls are redundant, as intel_idle_cpu_init() does the same calls just a bit later on. They are also premature, as the driver registration may yet fail. This patch removes the redundant code. Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07intel_idle: Fix a helper function's return value.Richard Cochran
The function, intel_idle_cpuidle_driver_init, delivers no error codes at all. This patch changes the function to return 'void' instead of returning zero. Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07intel_idle: remove useless return from void function.Richard Cochran
Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-23intel_idle: Support for Intel Xeon Phi Processor x200 Product FamilyDasaratharaman Chandramouli
Enables "Intel(R) Xeon Phi(TM) Processor x200 Product Family" support, formerly code-named KNL. It is based on modified Intel Atom Silvermont microarchitecture. Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> [micah.barany@intel.com: adjusted values of residency and latency] Signed-off-by: Micah Barany <micah.barany@intel.com> [hubert.chrzaniuk@intel.com: removed deprecated CPUIDLE_FLAG_TIME_VALID flag] Signed-off-by: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com> Signed-off-by: Pawel Karczewski <pawel.karczewski@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2016-03-23intel_idle: prevent SKL-H boot failure when C8+C9+C10 enabledLen Brown
Some SKL-H configurations require "intel_idle.max_cstate=7" to boot. While that is an effective workaround, it disables C10. This patch detects the problematic configuration, and disables C8 and C9, keeping C10 enabled. Note that enabling SGX in BIOS SETUP can also prevent this issue, if the system BIOS provides that option. https://bugzilla.kernel.org/show_bug.cgi?id=109081 "Freezes with Intel i7 6700HQ (Skylake), unless intel_idle.max_cstate=7" Signed-off-by: Len Brown <len.brown@intel.com> Cc: stable@vger.kernel.org
2015-09-10intel_idle: Skylake Client Support - updatedLen Brown
Addition of PC9 state, and minor tweaks to existing PC6 and PC8 states. Signed-off-by: Len Brown <len.brown@intel.com>
2015-08-15intel_idle: Skylake Client SupportLen Brown
Skylake Client CPU idle Power states (C-states) are similar to the previous generation, Broadwell. However, Skylake does get its own table with updated worst-case latency and average energy-break-even residency values. Signed-off-by: Len Brown <len.brown@intel.com>
2015-07-26intel_idle: allow idle states to be freeze-mode specificLen Brown
intel_idle uses a NULL "enter" field in a cpuidle state to recognize the invalid entry terminating a variable-length array. Linux-4.0 added support for the system-wide "freeze" state in cpuidle drivers via the new "enter_freeze" field. The natural way to expose a deep idle state for freeze, but not for run-time idle is to supply "enter_freeze" without "enter"; so we update the driver to accept such states. Signed-off-by: Len Brown <len.brown@intel.com>
2015-04-14Merge tag 'pm+acpi-4.1-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management and ACPI updates from Rafael Wysocki: "These are mostly fixes and cleanups all over, although there are a few items that sort of fall into the new feature category. First off, we have new callbacks for PM domains that should help us to handle some issues related to device initialization in a better way. There also is some consolidation in the unified device properties API area allowing us to use that inferface for accessing data coming from platform initialization code in addition to firmware-provided data. We have some new device/CPU IDs in a few drivers, support for new chips and a new cpufreq driver too. Specifics: - Generic PM domains support update including new PM domain callbacks to handle device initialization better (Russell King, Rafael J Wysocki, Kevin Hilman) - Unified device properties API update including a new mechanism for accessing data provided by platform initialization code (Rafael J Wysocki, Adrian Hunter) - ARM cpuidle update including ARM32/ARM64 handling consolidation (Daniel Lezcano) - intel_idle update including support for the Silvermont Core in the Baytrail SOC and for the Airmont Core in the Cherrytrail and Braswell SOCs (Len Brown, Mathias Krause) - New cpufreq driver for Hisilicon ACPU (Leo Yan) - intel_pstate update including support for the Knights Landing chip (Dasaratharaman Chandramouli, Kristen Carlson Accardi) - QorIQ cpufreq driver update (Tang Yuantian, Arnd Bergmann) - powernv cpufreq driver update (Shilpasri G Bhat) - devfreq update including Tegra support changes (Tomeu Vizoso, MyungJoo Ham, Chanwoo Choi) - powercap RAPL (Running-Average Power Limit) driver update including support for Intel Broadwell server chips (Jacob Pan, Mathias Krause) - ACPI device enumeration update related to the handling of the special PRP0001 device ID allowing DT-style 'compatible' property to be used for ACPI device identification (Rafael J Wysocki) - ACPI EC driver update including limited _DEP support (Lan Tianyu, Lv Zheng) - ACPI backlight driver update including a new mechanism to allow native backlight handling to be forced on non-Windows 8 systems and a new quirk for Lenovo Ideapad Z570 (Aaron Lu, Hans de Goede) - New Windows Vista compatibility quirk for Sony VGN-SR19XN (Chen Yu) - Assorted ACPI fixes and cleanups (Aaron Lu, Martin Kepplinger, Masanari Iida, Mika Westerberg, Nan Li, Rafael J Wysocki) - Fixes related to suspend-to-idle for the iTCO watchdog driver and the ACPI core system suspend/resume code (Rafael J Wysocki, Chen Yu) - PM tracing support for the suspend phase of system suspend/resume transitions (Zhonghui Fu) - Configurable delay for the system suspend/resume testing facility (Brian Norris) - PNP subsystem cleanups (Peter Huewe, Rafael J Wysocki)" * tag 'pm+acpi-4.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (74 commits) ACPI / scan: Fix NULL pointer dereference in acpi_companion_match() ACPI / scan: Rework modalias creation when "compatible" is present intel_idle: mark cpu id array as __initconst powercap / RAPL: mark rapl_ids array as __initconst powercap / RAPL: add ID for Broadwell server intel_pstate: Knights Landing support intel_pstate: remove MSR test cpufreq: fix qoriq uniprocessor build ACPI / scan: Take the PRP0001 position in the list of IDs into account ACPI / scan: Simplify acpi_match_device() ACPI / scan: Generalize of_compatible matching device property: Introduce firmware node type for platform data device property: Make it possible to use secondary firmware nodes PM / watchdog: iTCO: stop watchdog during system suspend cpufreq: hisilicon: add acpu driver ACPI / EC: Call acpi_walk_dep_device_list() after installing EC opregion handler cpufreq: powernv: Report cpu frequency throttling intel_idle: Add support for the Airmont Core in the Cherrytrail and Braswell SOCs intel_idle: Update support for Silvermont Core in Baytrail SOC PM / devfreq: tegra: Register governor on module init ...
2015-04-11Merge branch 'cpuidle' of ↵Rafael J. Wysocki
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux into pm-cpuidle Pull intel_idle material for v4.1 from Len Brown. * 'cpuidle' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: intel_idle: Add support for the Airmont Core in the Cherrytrail and Braswell SOCs intel_idle: Update support for Silvermont Core in Baytrail SOC
2015-04-11intel_idle: mark cpu id array as __initconstMathias Krause
The CPU ids are only tested in intel_idle_probe() which is itself an __init function. For the MODULE_DEVICE_TABLE() file2alias doesn't care about the section, just about the symbol name. So it's safe to mark the cpu id array as __initconst so its memory can be released after initialization is done. Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-04-03intel_idle: Use explicit broadcast oneshot control functionThomas Gleixner
Replace the clockevents_notify() call with an explicit function call. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Len Brown <lenb@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20714596.QMfNNPbuyU@vostro.rjw.lan Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-03intel_idle: Use explicit broadcast control functionThomas Gleixner
Replace the clockevents_notify() call with an explicit function call. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Len Brown <lenb@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/3878165.rXNXrtVNuy@vostro.rjw.lan Signed-off-by: Ingo Molnar <mingo@kernel.org>