summaryrefslogtreecommitdiff
path: root/kernel/power
AgeCommit message (Collapse)Author
2017-11-15mm: remove __GFP_COLDMel Gorman
As the page free path makes no distinction between cache hot and cold pages, there is no real useful ordering of pages in the free list that allocation requests can take advantage of. Juding from the users of __GFP_COLD, it is likely that a number of them are the result of copying other sites instead of actually measuring the impact. Remove the __GFP_COLD parameter which simplifies a number of paths in the page allocator. This is potentially controversial but bear in mind that the size of the per-cpu pagelists versus modern cache sizes means that the whole per-cpu list can often fit in the L3 cache. Hence, there is only a potential benefit for microbenchmarks that alloc/free pages in a tight loop. It's even worse when THP is taken into account which has little or no chance of getting a cache-hot page as the per-cpu list is bypassed and the zeroing of multiple pages will thrash the cache anyway. The truncate microbenchmarks are not shown as this patch affects the allocation path and not the free path. A page fault microbenchmark was tested but it showed no sigificant difference which is not surprising given that the __GFP_COLD branches are a miniscule percentage of the fault path. Link: http://lkml.kernel.org/r/20171018075952.10627-9-mgorman@techsingularity.net Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-13Merge branch 'pm-sleep'Rafael J. Wysocki
* pm-sleep: freezer: Fix typo in freezable_schedule_timeout() comment PM / s2idle: Clear the events_check_enabled flag PM / sleep: Remove pm_complete_with_resume_check() PM: ARM: locomo: Drop suspend and resume bus type callbacks PM: Use a more common logging style PM: Document rules on using pm_runtime_resume() in system suspend callbacks
2017-11-13Merge branches 'pm-cpufreq-sched' and 'pm-opp'Rafael J. Wysocki
* pm-cpufreq-sched: cpufreq: schedutil: Reset cached_raw_freq when not in sync with next_freq * pm-opp: PM / OPP: Add dev_pm_opp_{un}register_get_pstate_helper() PM / OPP: Support updating performance state of device's power domain PM / OPP: add missing of_node_put() for of_get_cpu_node() PM / OPP: Rename dev_pm_opp_register_put_opp_helper() PM / OPP: Add missing of_node_put(np) PM / OPP: Move error message to debug level PM / OPP: Use snprintf() to avoid kasprintf() and kfree() PM / OPP: Move the OPP directory out of power/
2017-11-08PM / s2idle: Clear the events_check_enabled flagRajat Jain
Problem: This flag does not get cleared currently in the suspend or resume path in the following cases: * In case some driver's suspend routine returns an error. * Successful s2idle case * etc? Why is this a problem: What happens is that the next suspend attempt could fail even though the user did not enable the flag by writing to /sys/power/wakeup_count. This is 1 use case how the issue can be seen (but similar use case with driver suspend failure can be thought of): 1. Read /sys/power/wakeup_count 2. echo count > /sys/power/wakeup_count 3. echo freeze > /sys/power/wakeup_count 4. Let the system suspend, and wakeup the system using some wake source that calls pm_wakeup_event() e.g. power button or something. 5. Note that the combined wakeup count would be incremented due to the pm_wakeup_event() in the resume path. 6. After resuming the events_check_enabled flag is still set. At this point if the user attempts to freeze again (without writing to /sys/power/wakeup_count), the suspend would fail even though there has been no wake event since the past resume. Address that by clearing the flag just before a resume is completed, so that it is always cleared for the corner cases mentioned above. Signed-off-by: Rajat Jain <rajatja@google.com> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-14Merge branch 'pm-domains' into pm-oppRafael J. Wysocki
2017-10-09Merge back suspend/resume/hibernate material for v4.15.Rafael J. Wysocki
2017-10-03PM: Use a more common logging styleJoe Perches
Convert printks to pr_<level>. Miscellanea: o Use pr_fmt with "PM:" and remove "PM: " from format strings o Coalesce format strings and realign format arguments o Convert an embedded incorrect function name to "%s: ", __func__ o Convert a couple multi-line formats to multiple pr_<level> calls Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-10-03PM / OPP: Move the OPP directory out of power/Viresh Kumar
The drivers/base/power/ directory is special and contains code related to power management core like system suspend/resume, hibernation, etc. It was fine to keep the OPP code inside it when we had just one file for it, but it is growing now and already has a directory for itself. Lets move it directly under drivers/ directory, just like cpufreq and cpuidle. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-09-29PM / s2idle: Invoke the ->wake() platform callback earlierRafael J. Wysocki
The role of the ->wake() platform callback for suspend-to-idle is to deal with possible spurious wakeups, among other things. The ACPI implementation of it, acpi_s2idle_wake(), additionally checks the conditions for entering the Low Power S0 Idle state by the platform and reports the ones that have not been met. However, the ->wake() platform callback is invoked after calling dpm_noirq_resume_devices(), which means that the power states of some devices may have changed since s2idle_enter() returned, so some unmet Low Power S0 Idle conditions may be reported incorrectly as a result of that. To avoid these false positives, reorder the invocations of the dpm_noirq_resume_devices() routine and the ->wake() platform callback in s2idle_loop(). Fixes: 726fb6b4f2a8 (ACPI / PM: Check low power idle constraints for debug only) Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-09-12Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Three fixes: - fix a suspend/resume cpusets bug - fix a !CONFIG_NUMA_BALANCING bug - fix a kerneldoc warning" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/fair: Fix nuisance kernel-doc warning sched/cpuset/pm: Fix cpuset vs. suspend-resume bugs sched/fair: Fix wake_affine_llc() balancing rules
2017-09-07Merge branch 'for-4.14/block' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block layer updates from Jens Axboe: "This is the first pull request for 4.14, containing most of the code changes. It's a quiet series this round, which I think we needed after the churn of the last few series. This contains: - Fix for a registration race in loop, from Anton Volkov. - Overflow complaint fix from Arnd for DAC960. - Series of drbd changes from the usual suspects. - Conversion of the stec/skd driver to blk-mq. From Bart. - A few BFQ improvements/fixes from Paolo. - CFQ improvement from Ritesh, allowing idling for group idle. - A few fixes found by Dan's smatch, courtesy of Dan. - A warning fixup for a race between changing the IO scheduler and device remova. From David Jeffery. - A few nbd fixes from Josef. - Support for cgroup info in blktrace, from Shaohua. - Also from Shaohua, new features in the null_blk driver to allow it to actually hold data, among other things. - Various corner cases and error handling fixes from Weiping Zhang. - Improvements to the IO stats tracking for blk-mq from me. Can drastically improve performance for fast devices and/or big machines. - Series from Christoph removing bi_bdev as being needed for IO submission, in preparation for nvme multipathing code. - Series from Bart, including various cleanups and fixes for switch fall through case complaints" * 'for-4.14/block' of git://git.kernel.dk/linux-block: (162 commits) kernfs: checking for IS_ERR() instead of NULL drbd: remove BIOSET_NEED_RESCUER flag from drbd_{md_,}io_bio_set drbd: Fix allyesconfig build, fix recent commit drbd: switch from kmalloc() to kmalloc_array() drbd: abort drbd_start_resync if there is no connection drbd: move global variables to drbd namespace and make some static drbd: rename "usermode_helper" to "drbd_usermode_helper" drbd: fix race between handshake and admin disconnect/down drbd: fix potential deadlock when trying to detach during handshake drbd: A single dot should be put into a sequence. drbd: fix rmmod cleanup, remove _all_ debugfs entries drbd: Use setup_timer() instead of init_timer() to simplify the code. drbd: fix potential get_ldev/put_ldev refcount imbalance during attach drbd: new disk-option disable-write-same drbd: Fix resource role for newly created resources in events2 drbd: mark symbols static where possible drbd: Send P_NEG_ACK upon write error in protocol != C drbd: add explicit plugging when submitting batches drbd: change list_for_each_safe to while(list_first_entry_or_null) drbd: introduce drbd_recv_header_maybe_unplug ...
2017-09-07sched/cpuset/pm: Fix cpuset vs. suspend-resume bugsPeter Zijlstra
Cpusets vs. suspend-resume is _completely_ broken. And it got noticed because it now resulted in non-cpuset usage breaking too. On suspend cpuset_cpu_inactive() doesn't call into cpuset_update_active_cpus() because it doesn't want to move tasks about, there is no need, all tasks are frozen and won't run again until after we've resumed everything. But this means that when we finally do call into cpuset_update_active_cpus() after resuming the last frozen cpu in cpuset_cpu_active(), the top_cpuset will not have any difference with the cpu_active_mask and this it will not in fact do _anything_. So the cpuset configuration will not be restored. This was largely hidden because we would unconditionally create identity domains and mobile users would not in fact use cpusets much. And servers what do use cpusets tend to not suspend-resume much. An addition problem is that we'd not in fact wait for the cpuset work to finish before resuming the tasks, allowing spurious migrations outside of the specified domains. Fix the rebuild by introducing cpuset_force_rebuild() and fix the ordering with cpuset_wait_for_hotplug(). Reported-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: deb7aa308ea2 ("cpuset: reorganize CPU / memory hotplug handling") Link: http://lkml.kernel.org/r/20170907091338.orwxrqkbfkki3c24@hirez.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-04Merge branch 'pm-sleep'Rafael J. Wysocki
* pm-sleep: ACPI / PM: Check low power idle constraints for debug only PM / s2idle: Rename platform operations structure PM / s2idle: Rename ->enter_freeze to ->enter_s2idle PM / s2idle: Rename freeze_state enum and related items PM / s2idle: Rename PM_SUSPEND_FREEZE to PM_SUSPEND_TO_IDLE ACPI / PM: Prefer suspend-to-idle over S3 on some systems platform/x86: intel-hid: Wake up Dell Latitude 7275 from suspend-to-idle PM / suspend: Define pr_fmt() in suspend.c PM / suspend: Use mem_sleep_labels[] strings in messages PM / sleep: Put pm_test under CONFIG_PM_SLEEP_DEBUG PM / sleep: Check pm_wakeup_pending() in __device_suspend_noirq() PM / core: Add error argument to dpm_show_time() PM / core: Split dpm_suspend_noirq() and dpm_resume_noirq() PM / s2idle: Rearrange the main suspend-to-idle loop PM / timekeeping: Print debug messages when requested PM / sleep: Mark suspend/hibernation start and finish PM / sleep: Do not print debug messages by default PM / suspend: Export pm_suspend_target_state
2017-08-23block: replace bi_bdev with a gendisk pointer and partitions indexChristoph Hellwig
This way we don't need a block_device structure to submit I/O. The block_device has different life time rules from the gendisk and request_queue and is usually only available when the block device node is open. Other callers need to explicitly create one (e.g. the lightnvm passthrough code, or the new nvme multipathing code). For the actual I/O path all that we need is the gendisk, which exists once per block device. But given that the block layer also does partition remapping we additionally need a partition index, which is used for said remapping in generic_make_request. Note that all the block drivers generally want request_queue or sometimes the gendisk, so this removes a layer of indirection all over the stack. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-18ACPI / PM: Check low power idle constraints for debug onlySrinivas Pandruvada
For SoC to achieve its lowest power platform idle state a set of hardware preconditions must be met. These preconditions or constraints can be obtained by issuing a device specific method (_DSM) with function "1". Refer to the document provided in the link below. Here during initialization (from attach() callback of LPS0 device), invoke function 1 to get the device constraints. Each enabled constraint is stored in a table. The devices in this table are used to check whether they were in required minimum state, while entering suspend. This check is done from platform freeze wake() callback, only when /sys/power/pm_debug_messages attribute is non zero. If any constraint is not met and device is ACPI power managed then it prints the device information to kernel logs. Also if debug is enabled in acpi/sleep.c, the constraint table and state of each device on wake is dumped in kernel logs. Since pm_debug_messages_on setting is used as condition to check constraints outside kernel/power/main.c, pm_debug_messages_on is changed to a global variable. Link: http://www.uefi.org/sites/default/files/resources/Intel_ACPI_Low_Power_S0_Idle.pdf Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-08-11PM / s2idle: Rename platform operations structureRafael J. Wysocki
Rename struct platform_freeze_ops to platform_s2idle_ops to make it clear that the callbacks in it are used during suspend-to-idle suspend/resume transitions and rename the related functions, variables and so on accordingly. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-08-11PM / s2idle: Rename freeze_state enum and related itemsRafael J. Wysocki
Rename the freeze_state enum representing the suspend-to-idle state machine states to s2idle_states and rename the related variables and functions accordingly. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-08-11PM / s2idle: Rename PM_SUSPEND_FREEZE to PM_SUSPEND_TO_IDLERafael J. Wysocki
To make it clear that the symbol in question refers to suspend-to-idle, rename it from PM_SUSPEND_FREEZE to PM_SUSPEND_TO_IDLE. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-08-10mm: fix global NR_SLAB_.*CLAIMABLE counter readsJohannes Weiner
As Tetsuo points out: "Commit 385386cff4c6 ("mm: vmstat: move slab statistics from zone to node counters") broke "Slab:" field of /proc/meminfo . It shows nearly 0kB" In addition to /proc/meminfo, this problem also affects the slab counters OOM/allocation failure info dumps, can cause early -ENOMEM from overcommit protection, and miscalculate image size requirements during suspend-to-disk. This is because the patch in question switched the slab counters from the zone level to the node level, but forgot to update the global accessor functions to read the aggregate node data instead of the aggregate zone data. Use global_node_page_state() to access the global slab counters. Fixes: 385386cff4c6 ("mm: vmstat: move slab statistics from zone to node counters") Link: http://lkml.kernel.org/r/20170801134256.5400-1-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Stefan Agner <stefan@agner.ch> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-05ACPI / PM: Prefer suspend-to-idle over S3 on some systemsRafael J. Wysocki
Modify the ACPI system sleep support setup code to select suspend-to-idle as the default system sleep state if (1) the ACPI_FADT_LOW_POWER_S0 flag is set in the FADT and (2) the Low Power Idle S0 _DSM interface has been discovered and (3) the default sleep state was not selected from the kernel command line. The main motivation for this change is that systems where the (1) and (2) conditions are met typically ship with OSes that don't exercise the S3 path in the platform firmware which remains untested and turns out to be non-functional at least in some cases. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Mario Limonciello <mario.limonciello@dell.com>
2017-07-24PM / suspend: Define pr_fmt() in suspend.cRafael J. Wysocki
Define a common prefix ("PM:") for messages printed by the code in kernel/power/suspend.c. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
2017-07-24PM / suspend: Use mem_sleep_labels[] strings in messagesRafael J. Wysocki
Some messages in suspend.c currently print state names from pm_states[], but that may be confusing if the mem_sleep sysfs attribute is changed to anything different from "mem", because in those cases the messages will say either "freeze" or "standby" after writing "mem" to /sys/power/state. To avoid the confusion, use mem_sleep_labels[] strings in those messages instead. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
2017-07-24PM / sleep: Put pm_test under CONFIG_PM_SLEEP_DEBUGRafael J. Wysocki
The pm_test sysfs attribute is under CONFIG_PM_DEBUG, but it doesn't make sense to provide it if CONFIG_PM_SLEEP is unset, so put it under CONFIG_PM_SLEEP_DEBUG instead. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-24PM / sleep: Check pm_wakeup_pending() in __device_suspend_noirq()Rafael J. Wysocki
Restore the pm_wakeup_pending() check in __device_suspend_noirq() removed by commit eed4d47efe95 (ACPI / sleep: Ignore spurious SCI wakeups from suspend-to-idle) as that allows the function to return earlier if there's a wakeup event pending already (so that it may spend less time on carrying out operations that will be reversed shortly anyway) and rework the main suspend-to-idle loop to take that optimization into account. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-24PM / s2idle: Rearrange the main suspend-to-idle loopRafael J. Wysocki
As a preparation for subsequent changes, rearrange the core suspend-to-idle code by moving the initial invocation of dpm_suspend_noirq() into s2idle_loop(). This also causes debug messages from that code to appear in a less confusing order. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-23PM / timekeeping: Print debug messages when requestedRafael J. Wysocki
The messages printed by tk_debug_account_sleep_time() are basically useful for system sleep debugging, so print them only when the other debug messages from the core suspend/hibernate code are enabled. While at it, make it clear that the messages from tk_debug_account_sleep_time() are about timekeeping suspend duration, because in general timekeeping may be suspeded and resumed for multiple times during one system suspend-resume cycle. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-22PM / sleep: Mark suspend/hibernation start and finishRafael J. Wysocki
Regardless of whether or not debug messages from the core system suspend/hibernation code are enabled, it is useful to know when system-wide transitions start and finish (or fail), so print "info" messages at these points. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Mark Salyzyn <salyzyn@android.com>
2017-07-22PM / sleep: Do not print debug messages by defaultRafael J. Wysocki
Debug messages from the system suspend/hibernation infrastructure can fill up the entire kernel log buffer in some cases and anyway they are only useful for debugging. They depend on CONFIG_PM_DEBUG, but that is set as a rule as some generally useful diagnostic facilities depend on it too. For this reason, avoid printing those messages by default, but make it possible to turn them on as needed with the help of a new sysfs attribute under /sys/power/. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-22PM / suspend: Export pm_suspend_target_stateFlorian Fainelli
Have the core suspend/resume framework store the system-wide suspend state (suspend_state_t) we are about to enter, and expose it to drivers via pm_suspend_target_state in order to retrieve that. The state is assigned in suspend_devices_and_enter(). This is useful for platform specific drivers that may need to take a slightly different suspend/resume path based on the system's suspend/resume state being entered. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-10Merge tag 'pm-extra-4.13-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more power management updates from Rafael Wysocki: "These revert one recent change in the generic power domains framework, fix a recently introduced build issue in there and constify attribute_group structures in some places. Specifics: - Revert a recent change in the generic power domains (genpd) framework that led to regressions and turned out the be misguided (Rafael Wysocki). - Fix a recently introduced build issue in the generic power domains (genpd) framework (Arnd Bergmann). - Constify attribute_group structures in the PM core, the cpufreq stats code and in intel_pstate (Arvind Yadav)" * tag 'pm-extra-4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: intel_pstate: constify attribute_group structures cpufreq: cpufreq_stats: constify attribute_group structures PM / sleep: constify attribute_group structures PM / Domains: provide pm_genpd_poweroff_noirq() stub Revert "PM / Domains: Handle safely genpd_syscore_switch() call on non-genpd device"
2017-07-10Merge branches 'pm-domains', 'pm-sleep' and 'pm-cpufreq'Rafael J. Wysocki
* pm-domains: PM / Domains: provide pm_genpd_poweroff_noirq() stub Revert "PM / Domains: Handle safely genpd_syscore_switch() call on non-genpd device" * pm-sleep: PM / sleep: constify attribute_group structures * pm-cpufreq: cpufreq: intel_pstate: constify attribute_group structures cpufreq: cpufreq_stats: constify attribute_group structures
2017-07-06kernel/power/snapshot.c: use linux/set_memory.hMichael Ellerman
This header always exists, so doesn't require an ifdef around its inclusion. When CONFIG_ARCH_HAS_SET_MEMORY=y it includes the asm header, otherwise it provides empty versions of the set_memory_xx() routines. Link: http://lkml.kernel.org/r/1498717781-29151-2-git-send-email-mpe@ellerman.id.au Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Laura Abbott <labbott@redhat.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-04Merge tag 'pm-4.13-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "The big ticket items here are the rework of suspend-to-idle in order to add proper support for power button wakeup from it on recent Dell laptops and the rework of interfaces exporting the current CPU frequency on x86. In addition to that, support for a few new pieces of hardware is added, the PCI/ACPI device wakeup infrastructure is simplified significantly and the wakeup IRQ framework is fixed to unbreak the IRQ bus locking infrastructure. Also, there are some functional improvements for intel_pstate, tools updates and small fixes and cleanups all over. Specifics: - Rework suspend-to-idle to allow it to take wakeup events signaled by the EC into account on ACPI-based platforms in order to properly support power button wakeup from suspend-to-idle on recent Dell laptops (Rafael Wysocki). That includes the core suspend-to-idle code rework, support for the Low Power S0 _DSM interface, and support for the ACPI INT0002 Virtual GPIO device from Hans de Goede (required for USB keyboard wakeup from suspend-to-idle to work on some machines). - Stop trying to export the current CPU frequency via /proc/cpuinfo on x86 as that is inaccurate and confusing (Len Brown). - Rework the way in which the current CPU frequency is exported by the kernel (over the cpufreq sysfs interface) on x86 systems with the APERF and MPERF registers by always using values read from these registers, when available, to compute the current frequency regardless of which cpufreq driver is in use (Len Brown). - Rework the PCI/ACPI device wakeup infrastructure to remove the questionable and artificial distinction between "devices that can wake up the system from sleep states" and "devices that can generate wakeup signals in the working state" from it, which allows the code to be simplified quite a bit (Rafael Wysocki). - Fix the wakeup IRQ framework by making it use SRCU instead of RCU which doesn't allow sleeping in the read-side critical sections, but which in turn is expected to be allowed by the IRQ bus locking infrastructure (Thomas Gleixner). - Modify some computations in the intel_pstate driver to avoid rounding errors resulting from them (Srinivas Pandruvada). - Reduce the overhead of the intel_pstate driver in the HWP (hardware-managed P-states) mode and when the "performance" P-state selection algorithm is in use by making it avoid registering scheduler callbacks in those cases (Len Brown). - Rework the energy_performance_preference sysfs knob in intel_pstate by changing the values that correspond to different symbolic hint names used by it (Len Brown). - Make it possible to use more than one cpuidle driver at the same time on ARM (Daniel Lezcano). - Make it possible to prevent the cpuidle menu governor from using the 0 state by disabling it via sysfs (Nicholas Piggin). - Add support for FFH (Fixed Functional Hardware) MWAIT in ACPI C1 on AMD systems (Yazen Ghannam). - Make the CPPC cpufreq driver take the lowest nonlinear performance information into account (Prashanth Prakash). - Add support for hi3660 to the cpufreq-dt driver, fix the imx6q driver and clean up the sfi, exynos5440 and intel_pstate drivers (Colin Ian King, Krzysztof Kozlowski, Octavian Purdila, Rafael Wysocki, Tao Wang). - Fix a few minor issues in the generic power domains (genpd) framework and clean it up somewhat (Krzysztof Kozlowski, Mikko Perttunen, Viresh Kumar). - Fix a couple of minor issues in the operating performance points (OPP) framework and clean it up somewhat (Viresh Kumar). - Fix a CONFIG dependency in the hibernation core and clean it up slightly (Balbir Singh, Arvind Yadav, BaoJun Luo). - Add rk3228 support to the rockchip-io adaptive voltage scaling (AVS) driver (David Wu). - Fix an incorrect bit shift operation in the RAPL power capping driver (Adam Lessnau). - Add support for the EPP field in the HWP (hardware managed P-states) control register, HWP.EPP, to the x86_energy_perf_policy tool and update msr-index.h with HWP.EPP values (Len Brown). - Fix some minor issues in the turbostat tool (Len Brown). - Add support for AMD family 0x17 CPUs to the cpupower tool and fix a minor issue in it (Sherry Hurwitz). - Assorted cleanups, mostly related to the constification of some data structures (Arvind Yadav, Joe Perches, Kees Cook, Krzysztof Kozlowski)" * tag 'pm-4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (69 commits) cpufreq: Update scaling_cur_freq documentation cpufreq: intel_pstate: Clean up after performance governor changes PM: hibernate: constify attribute_group structures. cpuidle: menu: allow state 0 to be disabled intel_idle: Use more common logging style PM / Domains: Fix missing default_power_down_ok comment PM / Domains: Fix unsafe iteration over modified list of domains PM / Domains: Fix unsafe iteration over modified list of domain providers PM / Domains: Fix unsafe iteration over modified list of device links PM / Domains: Handle safely genpd_syscore_switch() call on non-genpd device PM / Domains: Call driver's noirq callbacks PM / core: Drop run_wake flag from struct dev_pm_info PCI / PM: Simplify device wakeup settings code PCI / PM: Drop pme_interrupt flag from struct pci_dev ACPI / PM: Consolidate device wakeup settings code ACPI / PM: Drop run_wake from struct acpi_device_wakeup_flags PM / QoS: constify *_attribute_group. PM / AVS: rockchip-io: add io selectors and supplies for rk3228 powercap/RAPL: prevent overridding bits outside of the mask PM / sysfs: Constify attribute groups ...
2017-07-04PM / sleep: constify attribute_group structuresArvind Yadav
attribute_groups are not supposed to change at runtime. All functions working with attribute_groups provided by <linux/sysfs.h> work with const attribute_group. So mark the non-const structs as const. File size before: text data bss dec hex filename 3802 624 32 4458 116a kernel/power/main.o File size After adding 'const': text data bss dec hex filename 3866 560 32 4458 116a kernel/power/main.o Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-03Merge branch 'acpi-pm'Rafael J. Wysocki
* acpi-pm: PM / core: Drop run_wake flag from struct dev_pm_info PCI / PM: Simplify device wakeup settings code PCI / PM: Drop pme_interrupt flag from struct pci_dev ACPI / PM: Consolidate device wakeup settings code ACPI / PM: Drop run_wake from struct acpi_device_wakeup_flags ACPI / sleep: EC-based wakeup from suspend-to-idle on recent systems platform: x86: intel-hid: Wake up the system from suspend-to-idle platform: x86: intel-vbtn: Wake up the system from suspend-to-idle ACPI / PM: Ignore spurious SCI wakeups from suspend-to-idle platform/x86: Add driver for ACPI INT0002 Virtual GPIO device PCI / PM: Restore PME Enable if skipping wakeup setup PM / sleep: Print timing information if debug is enabled ACPI / PM: Clean up device wakeup enable/disable code ACPI / PM: Change log level of wakeup-related message USB / PCI / PM: Allow the PCI core to do the resume cleanup ACPI / PM: Run wakeup notify handlers synchronously Conflicts: drivers/base/power/main.c
2017-06-29PM: hibernate: constify attribute_group structures.Arvind Yadav
attribute_groups are not supposed to change at runtime. All functions working with attribute_groups provided by <linux/sysfs.h> work with const attribute_group. So mark the non-const structs as const. File size before: text data bss dec hex filename 6332 488 308 7128 1bd8 kernel/power/hibernate.o File size After adding 'const': text data bss dec hex filename 6396 424 308 7128 1bd8 kernel/power/hibernate.o Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-06-27PM / hibernate: Drop redundant parameter of swsusp_alloc()BaoJun Luo
The first parameter of swsusp_alloc is not used, so drop it. Signed-off-by: BaoJun Luo <baojun.luo@samsung.com> [ rjw: Subject & changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-06-27PM / hibernate: Use CONFIG_HAVE_SET_MEMORY for include conditionBalbir Singh
Kbuild reported a build failure when CONFIG_STRICT_KERNEL_RWX was enabled on powerpc. We don't yet have ARCH_HAS_SET_MEMORY and ppc32 saw a build failure. I've only done a basic compile test with a config that has hibernation enabled. Fixes: 50327ddfbc92 (kernel/power/snapshot.c: use set_memory.h header) Reported-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Balbir Singh <bsingharora@gmail.com> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-06-15ACPI / PM: Ignore spurious SCI wakeups from suspend-to-idleRafael J. Wysocki
The ACPI SCI (System Control Interrupt) is set up as a wakeup IRQ during suspend-to-idle transitions and, consequently, any events signaled through it wake up the system from that state. However, on some systems some of the events signaled via the ACPI SCI while suspended to idle should not cause the system to wake up. In fact, quite often they should just be discarded. Arguably, systems should not resume entirely on such events, but in order to decide which events really should cause the system to resume and which are spurious, it is necessary to resume up to the point when ACPI SCIs are actually handled and processed, which is after executing dpm_resume_noirq() in the system resume path. For this reasons, add a loop around freeze_enter() in which the platforms can process events signaled via multiplexed IRQ lines like the ACPI SCI and add suspend-to-idle hooks that can be used for this purpose to struct platform_freeze_ops. In the ACPI case, the ->wake hook is used for checking if the SCI has triggered while suspended and deferring the interrupt-induced system wakeup until the events signaled through it are actually processed sufficiently to decide whether or not the system should resume. In turn, the ->sync hook allows all of the relevant event queues to be flushed so as to prevent events from being missed due to race conditions. In addition to that, some ACPI code processing wakeup events needs to be modified to use the "hard" version of wakeup triggers, so that it will cause a system resume to happen on device-induced wakeup events even if the "soft" mechanism to prevent the system from suspending is not enabled. However, to preserve the existing behavior with respect to suspend-to-RAM, this only is done in the suspend-to-idle case and only if an SCI has occurred while suspended. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-06-12Merge tag 'v4.12-rc5' into for-4.13/blockJens Axboe
We've already got a few conflicts and upcoming work depends on some of the changes that have gone into mainline as regression fixes for this series. Pull in 4.12-rc5 to resolve these conflicts and make it easier on down stream trees to continue working on 4.13 changes. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-09block: switch bios to blk_status_tChristoph Hellwig
Replace bi_error with a new bi_status to allow for a clear conversion. Note that device mapper overloaded bi_error with a private value, which we'll have to keep arround at least for now and thus propagate to a proper blk_status_t value. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-06-09Merge branches 'intel_pstate' and 'pm-sleep'Rafael J. Wysocki
* intel_pstate: cpufreq: intel_pstate: Avoid division by 0 in min_perf_pct_min() * pm-sleep: Revert "ACPI / sleep: Ignore spurious SCI wakeups from suspend-to-idle"
2017-06-07Revert "ACPI / sleep: Ignore spurious SCI wakeups from suspend-to-idle"Rafael J. Wysocki
Revert commit eed4d47efe95 (ACPI / sleep: Ignore spurious SCI wakeups from suspend-to-idle) as it turned out to be premature and triggered a number of different issues on various systems. That includes, but is not limited to, premature suspend-to-RAM aborts on Dell XPS 13 (9343) reported by Dominik. The issue the commit in question attempted to address is real and will need to be taken care of going forward, but evidently more work is needed for this purpose. Reported-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-05-22Merge branches 'pm-sleep' and 'powercap'Rafael J. Wysocki
* pm-sleep: PM / hibernate: Declare variables as static RTC: rtc-cmos: Fix wakeup from suspend-to-idle PM / wakeup: Fix up wakeup_source_report_event() * powercap: PowerCap: Fix an error code in powercap_register_zone()
2017-05-14PM / hibernate: Declare variables as staticPushkar Jambhlekar
Fixing sparse warnings: 'symbol not declared. Should it be static?' Signed-off-by: Pushkar Jambhlekar <pushkar.iit@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-05-10Merge tag 'pm-extra-4.12-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more power management updates from Rafael Wysocki: "These add new CPU IDs to a couple of drivers, fix a possible NULL pointer dereference in the cpuidle core, update DT-related things in the generic power domains framework and finally update the suspend/resume infrastructure to improve the handling of wakeups from suspend-to-idle. Specifics: - Add Intel Gemini Lake CPU IDs to the intel_idle and intel_rapl drivers (David Box). - Add a NULL pointer check to the cpuidle core to prevent it from crashing on platforms with incomplete cpuidle configuration (Fei Li). - Fix DT-related documentation in the generic power domains (genpd) framework and add a MAINTAINERS entry for DT-related material in genpd (Viresh Kumar). - Update the system suspend/resume infrastructure to improve the handling of aborts of suspend transitions in progress in the wakeup framework and rework the suspend-to-idle core loop to make it possible to filter out spurious wakeup events (specifically the ones coming from ACPI) without resuming all the way up to user space every time (Rafael Wysocki)" * tag 'pm-extra-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / sleep: Ignore spurious SCI wakeups from suspend-to-idle PM / wakeup: Integrate mechanism to abort transitions in progress x86/intel_idle: add Gemini Lake support cpuidle: check dev before usage in cpuidle_use_deepest_state() powercap: intel_rapl: Add support for Gemini Lake PM / Domains: Add DT file to MAINTAINERS PM / Domains: Fix DT example
2017-05-08kernel/power/snapshot.c: use set_memory.h headerLaura Abbott
set_memory_* functions have moved to set_memory.h. Switch to this explicitly. Link: http://lkml.kernel.org/r/1488920133-27229-13-git-send-email-labbott@redhat.com Signed-off-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-05ACPI / sleep: Ignore spurious SCI wakeups from suspend-to-idleRafael J. Wysocki
The ACPI SCI (System Control Interrupt) is set up as a wakeup IRQ during suspend-to-idle transitions and, consequently, any events signaled through it wake up the system from that state. However, on some systems some of the events signaled via the ACPI SCI while suspended to idle should not cause the system to wake up. In fact, quite often they should just be discarded. Arguably, systems should not resume entirely on such events, but in order to decide which events really should cause the system to resume and which are spurious, it is necessary to resume up to the point when ACPI SCIs are actually handled and processed, which is after executing dpm_resume_noirq() in the system resume path. For this reasons, add a loop around freeze_enter() in which the platforms can process events signaled via multiplexed IRQ lines like the ACPI SCI and add suspend-to-idle hooks that can be used for this purpose to struct platform_freeze_ops. In the ACPI case, the ->wake hook is used for checking if the SCI has triggered while suspended and deferring the interrupt-induced system wakeup until the events signaled through it are actually processed sufficiently to decide whether or not the system should resume. In turn, the ->sync hook allows all of the relevant event queues to be flushed so as to prevent events from being missed due to race conditions. In addition to that, some ACPI code processing wakeup events needs to be modified to use the "hard" version of wakeup triggers, so that it will cause a system resume to happen on device-induced wakeup events even if the "soft" mechanism to prevent the system from suspending is not enabled (that also helps to catch device-induced wakeup events occurring during suspend transitions in progress). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-03Merge branch 'WIP.sched-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull sched.h split-up from Ingo Molnar: "The point of these changes is to significantly reduce the <linux/sched.h> header footprint, to speed up the kernel build and to have a cleaner header structure. After these changes the new <linux/sched.h>'s typical preprocessed size goes down from a previous ~0.68 MB (~22K lines) to ~0.45 MB (~15K lines), which is around 40% faster to build on typical configs. Not much changed from the last version (-v2) posted three weeks ago: I eliminated quirks, backmerged fixes plus I rebased it to an upstream SHA1 from yesterday that includes most changes queued up in -next plus all sched.h changes that were pending from Andrew. I've re-tested the series both on x86 and on cross-arch defconfigs, and did a bisectability test at a number of random points. I tried to test as many build configurations as possible, but some build breakage is probably still left - but it should be mostly limited to architectures that have no cross-compiler binaries available on kernel.org, and non-default configurations" * 'WIP.sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (146 commits) sched/headers: Clean up <linux/sched.h> sched/headers: Remove #ifdefs from <linux/sched.h> sched/headers: Remove the <linux/topology.h> include from <linux/sched.h> sched/headers, hrtimer: Remove the <linux/wait.h> include from <linux/hrtimer.h> sched/headers, x86/apic: Remove the <linux/pm.h> header inclusion from <asm/apic.h> sched/headers, timers: Remove the <linux/sysctl.h> include from <linux/timer.h> sched/headers: Remove <linux/magic.h> from <linux/sched/task_stack.h> sched/headers: Remove <linux/sched.h> from <linux/sched/init.h> sched/core: Remove unused prefetch_stack() sched/headers: Remove <linux/rculist.h> from <linux/sched.h> sched/headers: Remove the 'init_pid_ns' prototype from <linux/sched.h> sched/headers: Remove <linux/signal.h> from <linux/sched.h> sched/headers: Remove <linux/rwsem.h> from <linux/sched.h> sched/headers: Remove the runqueue_is_locked() prototype sched/headers: Remove <linux/sched.h> from <linux/sched/hotplug.h> sched/headers: Remove <linux/sched.h> from <linux/sched/debug.h> sched/headers: Remove <linux/sched.h> from <linux/sched/nohz.h> sched/headers: Remove <linux/sched.h> from <linux/sched/stat.h> sched/headers: Remove the <linux/gfp.h> include from <linux/sched.h> sched/headers: Remove <linux/rtmutex.h> from <linux/sched.h> ...