summaryrefslogtreecommitdiff
path: root/drivers/base
AgeCommit message (Collapse)Author
2017-11-13Merge branch 'timers-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner: "Yet another big pile of changes: - More year 2038 work from Arnd slowly reaching the point where we need to think about the syscalls themself. - A new timer function which allows to conditionally (re)arm a timer only when it's either not running or the new expiry time is sooner than the armed expiry time. This allows to use a single timer for multiple timeout requirements w/o caring about the first expiry time at the call site. - A new NMI safe accessor to clock real time for the printk timestamp work. Can be used by tracing, perf as well if required. - A large number of timer setup conversions from Kees which got collected here because either maintainers requested so or they simply got ignored. As Kees pointed out already there are a few trivial merge conflicts and some redundant commits which was unavoidable due to the size of this conversion effort. - Avoid a redundant iteration in the timer wheel softirq processing. - Provide a mechanism to treat RTC implementations depending on their hardware properties, i.e. don't inflict the write at the 0.5 seconds boundary which originates from the PC CMOS RTC to all RTCs. No functional change as drivers need to be updated separately. - The usual small updates to core code clocksource drivers. Nothing really exciting" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (111 commits) timers: Add a function to start/reduce a timer pstore: Use ktime_get_real_fast_ns() instead of __getnstimeofday() timer: Prepare to change all DEFINE_TIMER() callbacks netfilter: ipvs: Convert timers to use timer_setup() scsi: qla2xxx: Convert timers to use timer_setup() block/aoe: discover_timer: Convert timers to use timer_setup() ide: Convert timers to use timer_setup() drbd: Convert timers to use timer_setup() mailbox: Convert timers to use timer_setup() crypto: Convert timers to use timer_setup() drivers/pcmcia: omap1: Fix error in automated timer conversion ARM: footbridge: Fix typo in timer conversion drivers/sgi-xp: Convert timers to use timer_setup() drivers/pcmcia: Convert timers to use timer_setup() drivers/memstick: Convert timers to use timer_setup() drivers/macintosh: Convert timers to use timer_setup() hwrng/xgene-rng: Convert timers to use timer_setup() auxdisplay: Convert timers to use timer_setup() sparc/led: Convert timers to use timer_setup() mips: ip22/32: Convert timers to use timer_setup() ...
2017-11-13Merge branch 'sched-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The main updates in this cycle were: - Group balancing enhancements and cleanups (Brendan Jackman) - Move CPU isolation related functionality into its separate kernel/sched/isolation.c file, with related 'housekeeping_*()' namespace and nomenclature et al. (Frederic Weisbecker) - Improve the interactive/cpu-intense fairness calculation (Josef Bacik) - Improve the PELT code and related cleanups (Peter Zijlstra) - Improve the logic of pick_next_task_fair() (Uladzislau Rezki) - Improve the RT IPI based balancing logic (Steven Rostedt) - Various micro-optimizations: - better !CONFIG_SCHED_DEBUG optimizations (Patrick Bellasi) - better idle loop (Cheng Jian) - ... plus misc fixes, cleanups and updates" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits) sched/core: Optimize sched_feat() for !CONFIG_SCHED_DEBUG builds sched/sysctl: Fix attributes of some extern declarations sched/isolation: Document isolcpus= boot parameter flags, mark it deprecated sched/isolation: Add basic isolcpus flags sched/isolation: Move isolcpus= handling to the housekeeping code sched/isolation: Handle the nohz_full= parameter sched/isolation: Introduce housekeeping flags sched/isolation: Split out new CONFIG_CPU_ISOLATION=y config from CONFIG_NO_HZ_FULL sched/isolation: Rename is_housekeeping_cpu() to housekeeping_cpu() sched/isolation: Use its own static key sched/isolation: Make the housekeeping cpumask private sched/isolation: Provide a dynamic off-case to housekeeping_any_cpu() sched/isolation, watchdog: Use housekeeping_cpumask() instead of ad-hoc version sched/isolation: Move housekeeping related code to its own file sched/idle: Micro-optimize the idle loop sched/isolcpus: Fix "isolcpus=" boot parameter handling when !CONFIG_CPUMASK_OFFSTACK x86/tsc: Append the 'tsc=' description for the 'tsc=unstable' boot parameter sched/rt: Simplify the IPI based RT balancing logic block/ioprio: Use a helper to check for RT prio sched/rt: Add a helper to test for a RT task ...
2017-11-13Merge branch 'locking-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core locking updates from Ingo Molnar: "The main changes in this cycle are: - Another attempt at enabling cross-release lockdep dependency tracking (automatically part of CONFIG_PROVE_LOCKING=y), this time with better performance and fewer false positives. (Byungchul Park) - Introduce lockdep_assert_irqs_enabled()/disabled() and convert open-coded equivalents to lockdep variants. (Frederic Weisbecker) - Add down_read_killable() and use it in the VFS's iterate_dir() method. (Kirill Tkhai) - Convert remaining uses of ACCESS_ONCE() to READ_ONCE()/WRITE_ONCE(). Most of the conversion was Coccinelle driven. (Mark Rutland, Paul E. McKenney) - Get rid of lockless_dereference(), by strengthening Alpha atomics, strengthening READ_ONCE() with smp_read_barrier_depends() and thus being able to convert users of lockless_dereference() to READ_ONCE(). (Will Deacon) - Various micro-optimizations: - better PV qspinlocks (Waiman Long), - better x86 barriers (Michael S. Tsirkin) - better x86 refcounts (Kees Cook) - ... plus other fixes and enhancements. (Borislav Petkov, Juergen Gross, Miguel Bernal Marin)" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits) locking/x86: Use LOCK ADD for smp_mb() instead of MFENCE rcu: Use lockdep to assert IRQs are disabled/enabled netpoll: Use lockdep to assert IRQs are disabled/enabled timers/posix-cpu-timers: Use lockdep to assert IRQs are disabled/enabled sched/clock, sched/cputime: Use lockdep to assert IRQs are disabled/enabled irq_work: Use lockdep to assert IRQs are disabled/enabled irq/timings: Use lockdep to assert IRQs are disabled/enabled perf/core: Use lockdep to assert IRQs are disabled/enabled x86: Use lockdep to assert IRQs are disabled/enabled smp/core: Use lockdep to assert IRQs are disabled/enabled timers/hrtimer: Use lockdep to assert IRQs are disabled/enabled timers/nohz: Use lockdep to assert IRQs are disabled/enabled workqueue: Use lockdep to assert IRQs are disabled/enabled irq/softirqs: Use lockdep to assert IRQs are disabled/enabled locking/lockdep: Add IRQs disabled/enabled assertion APIs: lockdep_assert_irqs_enabled()/disabled() locking/pvqspinlock: Implement hybrid PV queued/unfair locks locking/rwlocks: Fix comments x86/paravirt: Set up the virt_spin_lock_key after static keys get initialized block, locking/lockdep: Assign a lock_class per gendisk used for wait_for_completion() workqueue: Remove now redundant lock acquisitions wrt. workqueue flushes ...
2017-11-13Merge tag 'regmap-v4.15' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap Pull regmap updates from Mark Brown: "After several quiet kernel releases we've got a couple of new features in regmap, support for using hwspinlocks as the lock for the internal data structures and a helper for polling on regmap_fields. The Kconfig dependencies on hwspinlocks were annoyingly difficult to squash between things behaving surprisingly and randconfig, I could've squashed those commits down but might've have caused hassle with other trees trying to use the new support. - support for using a hwspinlock to protect the regmap - an iopoll style helper for regmap_field" * tag 'regmap-v4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: regmap: Fix unused warning regmap: Try to work around Kconfig exploding on HWSPINLOCK regmap: Clean up hwspinlock on regmap exit regmap: Also protect hwspinlock in error handling path regmap: Add a config option for hwspinlock regmap: Add hardware spinlock support regmap: avoid -Wint-in-bool-context warning regmap: add iopoll-like polling macro for regmap_field regmap: constify regmap_bus structures regmap: Avoid namespace collision within macro & tidy up
2017-11-08Merge branch 'linus' into sched/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-07Merge branch 'linus' into locking/core, to resolve conflictsIngo Molnar
Conflicts: include/linux/compiler-clang.h include/linux/compiler-gcc.h include/linux/compiler-intel.h include/uapi/linux/stddef.h Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-06Merge remote-tracking branches 'regmap/topic/const' and ↵Mark Brown
'regmap/topic/hwspinlock' into regmap-next
2017-11-06regmap: Fix unused warningBaolin Wang
This patch fixes the warning of label 'err_map' defined but not used. Signed-off-by: Baolin Wang <baolin.wang@linaro.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2017-11-06regmap: Try to work around Kconfig exploding on HWSPINLOCKMark Brown
Trying to work with hwspinlock from built in code is painful as it can be built modular. Invert the test for REGMAP_HWSPINLOCK for now so we end up requiring users to depend on HWSPINLOCK=y in order to turn on the hwspinlock code. Signed-off-by: Mark Brown <broonie@kernel.org>
2017-11-03regmap: Clean up hwspinlock on regmap exitMark Brown
We should free any hwspinlocks when we destroy the regmap, do so. Signed-off-by: Mark Brown <broonie@kernel.org>
2017-11-03regmap: Also protect hwspinlock in error handling pathMark Brown
The previous patch to allow the hwspinlock code to be disabled missed handling the free in the error path, do so using the better IS_ENABLED() pattern as suggested by Baolin. While we're at it also check that we have a hardware spinlock before freeing it - the core code reports an error when freeing an invalid lock. Suggested-by: Baolin Wang <baolin.wang@linaro.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2017-11-03regmap: Add a config option for hwspinlockMark Brown
Unlike other lock types hwspinlocks are optional and can be built modular so we can't use them unconditionally in regmap so add a config option that drivers that want to use hwspinlocks with regmap can select which will ensure that hwspinlock is built in. Signed-off-by: Mark Brown <broonie@kernel.org>
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01regmap: Add hardware spinlock supportBaolin Wang
On some platforms, when reading or writing some special registers through regmap, we should acquire one hardware spinlock to synchronize between the multiple subsystems. Thus this patch adds the hardware spinlock support for regmap. Signed-off-by: Baolin Wang <baolin.wang@linaro.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2017-10-27sched/isolation: Move isolcpus= handling to the housekeeping codeFrederic Weisbecker
We want to centralize the isolation features, to be done by the housekeeping subsystem and scheduler domain isolation is a significant part of it. No intended behaviour change, we just reuse the housekeeping cpumask and core code. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Christoph Lameter <cl@linux.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luiz Capitulino <lcapitulino@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Wanpeng Li <kernellwp@gmail.com> Link: http://lkml.kernel.org/r/1509072159-31808-11-git-send-email-frederic@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-25locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns ↵Mark Rutland
to READ_ONCE()/WRITE_ONCE() Please do not apply this to mainline directly, instead please re-run the coccinelle script shown below and apply its output. For several reasons, it is desirable to use {READ,WRITE}_ONCE() in preference to ACCESS_ONCE(), and new code is expected to use one of the former. So far, there's been no reason to change most existing uses of ACCESS_ONCE(), as these aren't harmful, and changing them results in churn. However, for some features, the read/write distinction is critical to correct operation. To distinguish these cases, separate read/write accessors must be used. This patch migrates (most) remaining ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following coccinelle script: ---- // Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and // WRITE_ONCE() // $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch virtual patch @ depends on patch @ expression E1, E2; @@ - ACCESS_ONCE(E1) = E2 + WRITE_ONCE(E1, E2) @ depends on patch @ expression E; @@ - ACCESS_ONCE(E) + READ_ONCE(E) ---- Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: davem@davemloft.net Cc: linux-arch@vger.kernel.org Cc: mpe@ellerman.id.au Cc: shuah@kernel.org Cc: snitzer@redhat.com Cc: thor.thayer@linux.intel.com Cc: tj@kernel.org Cc: viro@zeniv.linux.org.uk Cc: will.deacon@arm.com Link: http://lkml.kernel.org/r/1508792849-3115-19-git-send-email-paulmck@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-24PM / QoS: Fix device resume latency PM QoSRafael J. Wysocki
The special value of 0 for device resume latency PM QoS means "no restriction", but there are two problems with that. First, device resume latency PM QoS requests with 0 as the value are always put in front of requests with positive values in the priority lists used internally by the PM QoS framework, causing 0 to be chosen as an effective constraint value. However, that 0 is then interpreted as "no restriction" effectively overriding the other requests with specific restrictions which is incorrect. Second, the users of device resume latency PM QoS have no way to specify that *any* resume latency at all should be avoided, which is an artificial limitation in general. To address these issues, modify device resume latency PM QoS to use S32_MAX as the "no constraint" value and 0 as the "no latency at all" one and rework its users (the cpuidle menu governor, the genpd QoS governor and the runtime PM framework) to follow these changes. Also add a special "n/a" value to the corresponding user space I/F to allow user space to indicate that it cannot accept any resume latencies at all for the given device. Fixes: 85dc0b8a4019 (PM / QoS: Make it possible to expose PM QoS latency constraints) Link: https://bugzilla.kernel.org/show_bug.cgi?id=197323 Reported-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Alex Shi <alex.shi@linaro.org> Cc: All applicable <stable@vger.kernel.org>
2017-10-13mm: only display online cpus of the numa nodeZhen Lei
When I execute numactl -H (which reads /sys/devices/system/node/nodeX/cpumap and displays cpumask_of_node for each node), I get different result on X86 and arm64. For each numa node, the former only displayed online CPUs, and the latter displayed all possible CPUs. Unfortunately, both Linux documentation and numactl manual have not described it clear. I sent a mail to ask for help, and Michal Hocko replied that he preferred to print online cpus because it doesn't really make much sense to bind anything on offline nodes. Will said: "I suspect the vast majority (if not all) code that reads this file was developed for x86, so having the same behaviour for arm64 sounds like something we should do ASAP before people try to special case with things like #ifdef __aarch64__. I'd rather have this in 4.14 if possible." Link: http://lkml.kernel.org/r/1506678805-15392-2-git-send-email-thunder.leizhen@huawei.com Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Tianhong Ding <dingtianhong@huawei.com> Cc: Hanjun Guo <guohanjun@huawei.com> Cc: Libin <huawei.libin@huawei.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-11ACPI: properties: Align return codes of __acpi_node_get_property_reference()Sakari Ailus
acpi_fwnode_get_reference_args(), the function implementing ACPI support for fwnode_property_get_reference_args(), returns directly error codes from __acpi_node_get_property_reference(). The latter uses different error codes than the OF implementation. In particular, the OF implementation uses -ENOENT to indicate that the property is not found, a reference entry is empty and there are no more references. Document and align the error codes for property for fwnode_property_get_reference_args() so that they match with of_parse_phandle_with_args(). Fixes: 3e3119d3088f (device property: Introduce fwnode_property_get_reference_args) Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-10-10device property: Track owner device of device propertyJarkko Nikula
Deletion of subdevice will remove device properties associated to parent when they share the same firmware node after commit 478573c93abd (driver core: Don't leak secondary fwnode on device removal). This was observed with a driver adding subdevice that driver wasn't able to read device properties after rmmod/modprobe cycle. Consider the lifecycle of it: parent device registration ACPI_COMPANION_SET() device_add_properties() pset_copy_set() set_secondary_fwnode(dev, &p->fwnode) device_add() parent probe read device properties ACPI_COMPANION_SET(subdevice, ACPI_COMPANION(parent)) device_add(subdevice) parent remove device_del(subdevice) device_remove_properties() set_secondary_fwnode(dev, NULL); pset_free() Parent device will have its primary firmware node pointing to an ACPI node and secondary firmware node point to device properties. ACPI_COMPANION_SET() call in parent probe will set the subdevice's firmware node to point to the same 'struct fwnode_handle' and the associated secondary firmware node, i.e. the device properties as the parent. When subdevice is deleted in parent remove that will remove those device properties and attempt to read device properties in next parent probe call will fail. Fix this by tracking the owner device of device properties and delete them only when owner device is being deleted. Fixes: 478573c93abd (driver core: Don't leak secondary fwnode on device removal) Cc: 4.9+ <stable@vger.kernel.org> # 4.9+ Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-10-05timer: Remove init_timer_on_stack() in favor of timer_setup_on_stack()Kees Cook
Remove uses of init_timer_on_stack() with open-coded function and data assignments that could be expressed using timer_setup_on_stack(). Several were removed from the stack entirely since there was a one-to-one mapping of parent structure to timer, those are switched to using timer_setup() instead. All related callbacks were adjusted to use from_timer(). Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mips@linux-mips.org Cc: Petr Mladek <pmladek@suse.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Sebastian Reichel <sre@kernel.org> Cc: Kalle Valo <kvalo@qca.qualcomm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Wim Van Sebroeck <wim@iguana.be> Cc: linux1394-devel@lists.sourceforge.net Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: linux-s390@vger.kernel.org Cc: linux-wireless@vger.kernel.org Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com> Cc: linux-scsi@vger.kernel.org Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Ursula Braun <ubraun@linux.vnet.ibm.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Harish Patil <harish.patil@cavium.com> Cc: Stephen Boyd <sboyd@codeaurora.org> Cc: Michael Reed <mdr@sgi.com> Cc: Manish Chopra <manish.chopra@cavium.com> Cc: Len Brown <len.brown@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linux-pm@vger.kernel.org Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Tejun Heo <tj@kernel.org> Cc: Julian Wiedmann <jwi@linux.vnet.ibm.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Mark Gross <mark.gross@intel.com> Cc: linux-watchdog@vger.kernel.org Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Stefan Richter <stefanr@s5r6.in-berlin.de> Cc: Guenter Roeck <linux@roeck-us.net> Cc: netdev@vger.kernel.org Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linuxppc-dev@lists.ozlabs.org Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Link: https://lkml.kernel.org/r/1507159627-127660-4-git-send-email-keescook@chromium.org
2017-10-03Merge tag 'driver-core-4.14-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg KH: "Here are a few small fixes for 4.14-rc4. The removal of DRIVER_ATTR() was almost completed by 4.14-rc1, but one straggler made it in through some other tree (odds are, one of mine...) So there's a simple removal of the last user, and then finally the macro is removed from the tree. There's a fix for old crazy udev instances that insist on reloading a module when it is removed from the kernel due to the new uevents for bind/unbind. This fixes the reported regression, hopefully some year in the future we can drop the workaround, once users update to the latest version, but I'm not holding my breath. And then there's a build fix for a linker warning, and a buffer overflow fix to match the PCI fixes you took through the PCI tree in the same area. All of these have been in linux-next for a few weeks while I've been traveling, sorry for the delay" * tag 'driver-core-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: driver core: remove DRIVER_ATTR fpga: altera-cvp: remove DRIVER_ATTR() usage driver core: platform: Don't read past the end of "driver_override" buffer base: arch_topology: fix section mismatch build warnings driver core: suppress sending MODALIAS in UNBIND uevents
2017-09-26PM / OPP: Call notifier without holding opp_table->lockViresh Kumar
The notifier callbacks may want to call some OPP helper routines which may try to take the same opp_table->lock again and cause a deadlock. One such usecase was reported by Chanwoo Choi, where calling dev_pm_opp_disable() leads us to the devfreq's OPP notifier handler, which further calls dev_pm_opp_find_freq_floor() and it deadlocks. We don't really need the opp_table->lock to be held across the notifier call though, all we want to make sure is that the 'opp' doesn't get freed while being used from within the notifier chain. We can do it with help of dev_pm_opp_get/put() as well. Let's do it. Cc: 4.11+ <stable@vger.kernel.org> # 4.11+ Fixes: 5b650b388844 "PM / OPP: Take kref from _find_opp_table()" Reported-by: Chanwoo Choi <cw00.choi@samsung.com> Tested-by: Chanwoo Choi <cw00.choi@samsung.com> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-09-22Merge tag 'pm-4.14-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix a cpufreq regression introduced by recent changes related to the generic DT driver, an initialization time memory leak in cpuidle on ARM, a PM core bug that may cause system suspend/resume to fail on some systems, a request type validation issue in the PM QoS framework and two documentation-related issues. Specifics: - Fix a regression in cpufreq on systems using DT as the source of CPU configuration information where two different code paths attempt to create the cpufreq-dt device object (there can be only one) and fix up the "compatible" matching for some TI platforms on top of that (Viresh Kumar, Dave Gerlach). - Fix an initialization time memory leak in cpuidle on ARM which occurs if the cpuidle driver initialization fails (Stefan Wahren). - Fix a PM core function that checks whether or not there are any system suspend/resume callbacks for a device, but forgets to check legacy callbacks which then may be skipped incorrectly and the system may crash and/or the device may become unusable after a suspend-resume cycle (Rafael Wysocki). - Fix request type validation for latency tolerance PM QoS requests which may lead to unexpected behavior (Jan Schönherr). - Fix a broken link to PM documentation from a header file and a typo in a PM document (Geert Uytterhoeven, Rafael Wysocki)" * tag 'pm-4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: ti-cpufreq: Support additional am43xx platforms ARM: cpuidle: Avoid memleak if init fail cpufreq: dt-platdev: Add some missing platforms to the blacklist PM: core: Fix device_pm_check_callbacks() PM: docs: Drop an excess character from devices.rst PM / QoS: Use the correct variable to check the QoS request type driver core: Fix link to device power management documentation
2017-09-22Merge branches 'pm-core', 'pm-qos' and 'pm-docs'Rafael J. Wysocki
* pm-core: PM: core: Fix device_pm_check_callbacks() * pm-qos: PM / QoS: Use the correct variable to check the QoS request type * pm-docs: PM: docs: Drop an excess character from devices.rst driver core: Fix link to device power management documentation
2017-09-19PM: core: Fix device_pm_check_callbacks()Rafael J. Wysocki
The device_pm_check_callbacks() function doesn't check legacy ->suspend and ->resume callback pointers under the device's bus type, class and driver, so in some cases it may set the no_pm_callbacks flag for the device incorrectly and then the callbacks may be skipped during system suspend/resume, which shouldn't happen. Fixes: aa8e54b55947 (PM / sleep: Go direct_complete if driver has no callbacks) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: 4.5+ <stable@vger.kernel.org> # 4.5+
2017-09-18driver core: platform: Don't read past the end of "driver_override" bufferNicolai Stange
When printing the driver_override parameter when it is 4095 and 4094 bytes long, the printing code would access invalid memory because we need count+1 bytes for printing. Reject driver_override values of these lengths in driver_override_store(). This is in close analogy to commit 4efe874aace5 ("PCI: Don't read past the end of sysfs "driver_override" buffer") from Sasha Levin. Fixes: 3d713e0e382e ("driver core: platform: add device binding path 'driver_override'") Cc: stable@vger.kernel.org # v3.17+ Signed-off-by: Nicolai Stange <nstange@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-18base: arch_topology: fix section mismatch build warningsSudeep Holla
Commit 2ef7a2953c81 ("arm, arm64: factorize common cpu capacity default code") introduced init_cpu_capacity_callback and init_cpu_capacity_notifier which are referenced from initcall and are missing __init{,data} annotations resulting the below section mismatch build warnings. "WARNING: vmlinux.o(.text+0xbab790): Section mismatch in reference from the function init_cpu_capacity_callback() to the variable .init.text:$x The function init_cpu_capacity_callback() references the variable __init $x. This is often because init_cpu_capacity_callback lacks a __init annotation or the annotation of $x is wrong." This patch fixes the above build warnings by adding the required annotations. Fixes: 2ef7a2953c81 ("arm, arm64: factorize common cpu capacity default code") Cc: Juri Lelli <juri.lelli@arm.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-18PM / QoS: Use the correct variable to check the QoS request typeJan H. Schönherr
Use the actual function argument for the validation of the request type, instead of the type field in a fresh (supposedly zero-initialized) request structure. Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-09-17dma-coherent: fix rmem_dma_device_init regressionArnd Bergmann
My recent bug fix introduced another bug, which caused rmem_dma_device_init to always fail, as rmem->priv is never set to anything. This restores the previous behavior, calling dma_init_coherent_memory() whenever ->priv is NULL. Fixes: d35b0996fef3 ("dma-coherent: fix dma_declare_coherent_memory() logic error") Reported-by: Roy Pledge <roy.pledge@nxp.com> Tested-by: Roy Pledge <roy.pledge@nxp.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-09-16firmware: Restore support for built-in firmwareMarkus Trippelsdorf
Commit 5620a0d1aac ("firmware: delete in-kernel firmware") removed the entire firmware directory. Unfortunately it thereby also removed the support for built-in firmware. This restores the ability to build firmware directly into the kernel by pruning the original Makefile to the necessary minimum. The default for EXTRA_FIRMWARE_DIR is now the standard directory /lib/firmware/. Fixes: 5620a0d1aac ("firmware: delete in-kernel firmware") Signed-off-by: Markus Trippelsdorf <markus@trippelsdorf.de> Acked-by: Greg K-H <gregkh@linuxfoundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-12Merge tag 'dma-mapping-4.14' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds
Pull dma-mapping updates from Christoph Hellwig: - removal of the old dma_alloc_noncoherent interface - remove unused flags to dma_declare_coherent_memory - restrict OF DMA configuration to specific physical busses - use the iommu mailing list for dma-mapping questions and patches * tag 'dma-mapping-4.14' of git://git.infradead.org/users/hch/dma-mapping: dma-coherent: fix dma_declare_coherent_memory() logic error ARM: imx: mx31moboard: Remove unused 'dma' variable dma-coherent: remove an unused variable MAINTAINERS: use the iommu list for the dma-mapping subsystem dma-coherent: remove the DMA_MEMORY_MAP and DMA_MEMORY_IO flags dma-coherent: remove the DMA_MEMORY_INCLUDES_CHILDREN flag of: restrict DMA configuration dma-mapping: remove dma_alloc_noncoherent and dma_free_noncoherent i825xx: switch to switch to dma_alloc_attrs au1000_eth: switch to dma_alloc_attrs sgiseeq: switch to dma_alloc_attrs dma-mapping: reduce dma_mapping_error inline bloat
2017-09-10Revert "firmware: add sanity check on shutdown/suspend"Linus Torvalds
This reverts commit 81f95076281fdd3bc382e004ba1bce8e82fccbce. It causes random failures of firmware loading at resume time (well, random for me, it seems to be more reliable for others) because the firmware disabling is not actually synchronous with any particular resume event, and at least the btusb driver that uses a workqueue to load the firmware at resume seems to occasionally hit the "firmware loading is disabled" logic because the firmware loader hasn't gotten the resume event yet. Some kind of sanity check for not trying to load firmware when it's not possible might be a good thing, but this commit was not it. Greg seems to have silently suffered the same issue, and pointed to the likely culprit, and Gabriel C verified the revert fixed it for him too. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Pointed-at-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Tested-by: Gabriel C <nix.or.die@gmail.com> Cc: Luis R. Rodriguez <mcgrof@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-08treewide: make "nr_cpu_ids" unsignedAlexey Dobriyan
First, number of CPUs can't be negative number. Second, different signnnedness leads to suboptimal code in the following cases: 1) kmalloc(nr_cpu_ids * sizeof(X)); "int" has to be sign extended to size_t. 2) while (loff_t *pos < nr_cpu_ids) MOVSXD is 1 byte longed than the same MOV. Other cases exist as well. Basically compiler is told that nr_cpu_ids can't be negative which can't be deduced if it is "int". Code savings on allyesconfig kernel: -3KB add/remove: 0/0 grow/shrink: 25/264 up/down: 261/-3631 (-3370) function old new delta coretemp_cpu_online 450 512 +62 rcu_init_one 1234 1272 +38 pci_device_probe 374 399 +25 ... pgdat_reclaimable_pages 628 556 -72 select_fallback_rq 446 369 -77 task_numa_find_cpu 1923 1807 -116 Link: http://lkml.kernel.org/r/20170819114959.GA30580@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-08mm: change the call sites of numa statistics itemsKemi Wang
Patch series "Separate NUMA statistics from zone statistics", v2. Each page allocation updates a set of per-zone statistics with a call to zone_statistics(). As discussed in 2017 MM summit, these are a substantial source of overhead in the page allocator and are very rarely consumed. This significant overhead in cache bouncing caused by zone counters (NUMA associated counters) update in parallel in multi-threaded page allocation (pointed out by Dave Hansen). A link to the MM summit slides: http://people.netfilter.org/hawk/presentations/MM-summit2017/MM-summit2017-JesperBrouer.pdf To mitigate this overhead, this patchset separates NUMA statistics from zone statistics framework, and update NUMA counter threshold to a fixed size of MAX_U16 - 2, as a small threshold greatly increases the update frequency of the global counter from local per cpu counter (suggested by Ying Huang). The rationality is that these statistics counters don't need to be read often, unlike other VM counters, so it's not a problem to use a large threshold and make readers more expensive. With this patchset, we see 31.3% drop of CPU cycles(537-->369, see below) for per single page allocation and reclaim on Jesper's page_bench03 benchmark. Meanwhile, this patchset keeps the same style of virtual memory statistics with little end-user-visible effects (only move the numa stats to show behind zone page stats, see the first patch for details). I did an experiment of single page allocation and reclaim concurrently using Jesper's page_bench03 benchmark on a 2-Socket Broadwell-based server (88 processors with 126G memory) with different size of threshold of pcp counter. Benchmark provided by Jesper D Brouer(increase loop times to 10000000): https://github.com/netoptimizer/prototype-kernel/tree/master/kernel/mm/bench Threshold CPU cycles Throughput(88 threads) 32 799 241760478 64 640 301628829 125 537 358906028 <==> system by default 256 468 412397590 512 428 450550704 4096 399 482520943 20000 394 489009617 30000 395 488017817 65533 369(-31.3%) 521661345(+45.3%) <==> with this patchset N/A 342(-36.3%) 562900157(+56.8%) <==> disable zone_statistics This patch (of 3): In this patch, NUMA statistics is separated from zone statistics framework, all the call sites of NUMA stats are changed to use numa-stats-specific functions, it does not have any functionality change except that the number of NUMA stats is shown behind zone page stats when users *read* the zone info. E.g. cat /proc/zoneinfo ***Base*** ***With this patch*** nr_free_pages 3976 nr_free_pages 3976 nr_zone_inactive_anon 0 nr_zone_inactive_anon 0 nr_zone_active_anon 0 nr_zone_active_anon 0 nr_zone_inactive_file 0 nr_zone_inactive_file 0 nr_zone_active_file 0 nr_zone_active_file 0 nr_zone_unevictable 0 nr_zone_unevictable 0 nr_zone_write_pending 0 nr_zone_write_pending 0 nr_mlock 0 nr_mlock 0 nr_page_table_pages 0 nr_page_table_pages 0 nr_kernel_stack 0 nr_kernel_stack 0 nr_bounce 0 nr_bounce 0 nr_zspages 0 nr_zspages 0 numa_hit 0 *nr_free_cma 0* numa_miss 0 numa_hit 0 numa_foreign 0 numa_miss 0 numa_interleave 0 numa_foreign 0 numa_local 0 numa_interleave 0 numa_other 0 numa_local 0 *nr_free_cma 0* numa_other 0 ... ... vm stats threshold: 10 vm stats threshold: 10 ... ... The next patch updates the numa stats counter size and threshold. [akpm@linux-foundation.org: coding-style fixes] Link: http://lkml.kernel.org/r/1503568801-21305-2-git-send-email-kemi.wang@intel.com Signed-off-by: Kemi Wang <kemi.wang@intel.com> Reported-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Christopher Lameter <cl@linux.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Ying Huang <ying.huang@intel.com> Cc: Aaron Lu <aaron.lu@intel.com> Cc: Tim Chen <tim.c.chen@intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-06mm, memory_hotplug: remove zone restrictionsMichal Hocko
Historically we have enforced that any kernel zone (e.g ZONE_NORMAL) has to precede the Movable zone in the physical memory range. The purpose of the movable zone is, however, not bound to any physical memory restriction. It merely defines a class of migrateable and reclaimable memory. There are users (e.g. CMA) who might want to reserve specific physical memory ranges for their own purpose. Moreover our pfn walkers have to be prepared for zones overlapping in the physical range already because we do support interleaving NUMA nodes and therefore zones can interleave as well. This means we can allow each memory block to be associated with a different zone. Loosen the current onlining semantic and allow explicit onlining type on any memblock. That means that online_{kernel,movable} will be allowed regardless of the physical address of the memblock as long as it is offline of course. This might result in moveble zone overlapping with other kernel zones. Default onlining then becomes a bit tricky but still sensible. echo online > memoryXY/state will online the given block to 1) the default zone if the given range is outside of any zone 2) the enclosing zone if such a zone doesn't interleave with any other zone 3) the default zone if more zones interleave for this range where default zone is movable zone only if movable_node is enabled otherwise it is a kernel zone. Here is an example of the semantic with (movable_node is not present but it work in an analogous way). We start with following memblocks, all of them offline: memory34/valid_zones:Normal Movable memory35/valid_zones:Normal Movable memory36/valid_zones:Normal Movable memory37/valid_zones:Normal Movable memory38/valid_zones:Normal Movable memory39/valid_zones:Normal Movable memory40/valid_zones:Normal Movable memory41/valid_zones:Normal Movable Now, we online block 34 in default mode and block 37 as movable root@test1:/sys/devices/system/node/node1# echo online > memory34/state root@test1:/sys/devices/system/node/node1# echo online_movable > memory37/state memory34/valid_zones:Normal memory35/valid_zones:Normal Movable memory36/valid_zones:Normal Movable memory37/valid_zones:Movable memory38/valid_zones:Normal Movable memory39/valid_zones:Normal Movable memory40/valid_zones:Normal Movable memory41/valid_zones:Normal Movable As we can see all other blocks can still be onlined both into Normal and Movable zones and the Normal is default because the Movable zone spans only block37 now. root@test1:/sys/devices/system/node/node1# echo online_movable > memory41/state memory34/valid_zones:Normal memory35/valid_zones:Normal Movable memory36/valid_zones:Normal Movable memory37/valid_zones:Movable memory38/valid_zones:Movable Normal memory39/valid_zones:Movable Normal memory40/valid_zones:Movable Normal memory41/valid_zones:Movable Now the default zone for blocks 37-41 has changed because movable zone spans that range. root@test1:/sys/devices/system/node/node1# echo online_kernel > memory39/state memory34/valid_zones:Normal memory35/valid_zones:Normal Movable memory36/valid_zones:Normal Movable memory37/valid_zones:Movable memory38/valid_zones:Normal Movable memory39/valid_zones:Normal memory40/valid_zones:Movable Normal memory41/valid_zones:Movable Note that the block 39 now belongs to the zone Normal and so block38 falls into Normal by default as well. For completness root@test1:/sys/devices/system/node/node1# for i in memory[34]? do echo online > $i/state 2>/dev/null done memory34/valid_zones:Normal memory35/valid_zones:Normal memory36/valid_zones:Normal memory37/valid_zones:Movable memory38/valid_zones:Normal memory39/valid_zones:Normal memory40/valid_zones:Movable memory41/valid_zones:Movable Implementation wise the change is quite straightforward. We can get rid of allow_online_pfn_range altogether. online_pages allows only offline nodes already. The original default_zone_for_pfn will become default_kernel_zone_for_pfn. New default_zone_for_pfn implements the above semantic. zone_for_pfn_range is slightly reorganized to implement kernel and movable online type explicitly and MMOP_ONLINE_KEEP becomes a catch all default behavior. Link: http://lkml.kernel.org/r/20170714121233.16861-3-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Reza Arbab <arbab@linux.vnet.ibm.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Yasuaki Ishimatsu <yasu.isimatu@gmail.com> Cc: Xishi Qiu <qiuxishi@huawei.com> Cc: Kani Toshimitsu <toshi.kani@hpe.com> Cc: <slaoub@gmail.com> Cc: Daniel Kiper <daniel.kiper@oracle.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: <linux-api@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-06mm, memory_hotplug: display allowed zones in the preferred orderingMichal Hocko
Prior to commit f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") we used to allow to change the valid zone types of a memory block if it is adjacent to a different zone type. This fact was reflected in memoryNN/valid_zones by the ordering of printed zones. The first one was default (echo online > memoryNN/state) and the other one could be onlined explicitly by online_{movable,kernel}. This behavior was removed by the said patch and as such the ordering was not all that important. In most cases a kernel zone would be default anyway. The only exception is movable_node handled by "mm, memory_hotplug: support movable_node for hotpluggable nodes". Let's reintroduce this behavior again because later patch will remove the zone overlap restriction and so user will be allowed to online kernel resp. movable block regardless of its placement. Original behavior will then become significant again because it would be non-trivial for users to see what is the default zone to online into. Implementation is really simple. Pull out zone selection out of move_pfn_range into zone_for_pfn_range helper and use it in show_valid_zones to display the zone for default onlining and then both kernel and movable if they are allowed. Default online zone is not duplicated. Link: http://lkml.kernel.org/r/20170714121233.16861-2-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Mel Gorman <mgorman@suse.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Reza Arbab <arbab@linux.vnet.ibm.com> Cc: Yasuaki Ishimatsu <yasu.isimatu@gmail.com> Cc: Xishi Qiu <qiuxishi@huawei.com> Cc: Kani Toshimitsu <toshi.kani@hpe.com> Cc: <slaoub@gmail.com> Cc: Daniel Kiper <daniel.kiper@oracle.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-05Merge tag 'devprop-4.14-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull device properties framework updates from Rafael Wysocki: "These introduce fwnode operations for all of the separate types of 'firmware nodes' that can be handled by the device properties framework, make the framework use const fwnode arguments all over, add a helper for the consolidated handling of node references and switch over the framework to the new UUID API. Specifics: - Introduce fwnode operations for all of the separate types of 'firmware nodes' that can be handled by the device properties framework and drop the type field from struct fwnode_handle (Sakari Ailus, Arnd Bergmann). - Make the device properties framework use const fwnode arguments where possible (Sakari Ailus). - Add a helper for the consolidated handling of node references to the device properties framework (Sakari Ailus). - Switch over the ACPI part of the device properties framework to the new UUID API (Andy Shevchenko)" * tag 'devprop-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: device property: Switch to use new generic UUID API device property: export irqchip_fwnode_ops device property: Introduce fwnode_property_get_reference_args device property: Constify fwnode property API device property: Constify argument to pset fwnode backend ACPI: Constify internal fwnode arguments ACPI: Constify acpi_bus helper functions, switch to macros ACPI: Prepare for constifying acpi_get_next_subnode() fwnode argument device property: Get rid of struct fwnode_handle type field ACPI: Use IS_ERR_OR_NULL() instead of non-NULL check in is_acpi_data_node()
2017-09-05Merge tag 'pm-4.14-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "This time (again) cpufreq gets the majority of changes which mostly are driver updates (including a major consolidation of intel_pstate), some schedutil governor modifications and core cleanups. There also are some changes in the system suspend area, mostly related to diagnostics and debug messages plus some renames of things related to suspend-to-idle. One major change here is that suspend-to-idle is now going to be preferred over S3 on systems where the ACPI tables indicate to do so and provide requsite support (the Low Power Idle S0 _DSM in particular). The system sleep documentation and the tools related to it are updated too. The rest is a few cpuidle changes (nothing major), devfreq updates, generic power domains (genpd) framework updates and a few assorted modifications elsewhere. Specifics: - Drop the P-state selection algorithm based on a PID controller from intel_pstate and make it use the same P-state selection method (based on the CPU load) for all types of systems in the active mode (Rafael Wysocki, Srinivas Pandruvada). - Rework the cpufreq core and governors to make it possible to take cross-CPU utilization updates into account and modify the schedutil governor to actually do so (Viresh Kumar). - Clean up the handling of transition latency information in the cpufreq core and untangle it from the information on which drivers cannot do dynamic frequency switching (Viresh Kumar). - Add support for new SoCs (MT2701/MT7623 and MT7622) to the mediatek cpufreq driver and update its DT bindings (Sean Wang). - Modify the cpufreq dt-platdev driver to autimatically create cpufreq devices for the new (v2) Operating Performance Points (OPP) DT bindings and update its whitelist of supported systems (Viresh Kumar, Shubhrajyoti Datta, Marc Gonzalez, Khiem Nguyen, Finley Xiao). - Add support for Ux500 to the cpufreq-dt driver and drop the obsolete dbx500 cpufreq driver (Linus Walleij, Arnd Bergmann). - Add new SoC (R8A7795) support to the cpufreq rcar driver (Khiem Nguyen). - Fix and clean up assorted issues in the cpufreq drivers and core (Arvind Yadav, Christophe Jaillet, Colin Ian King, Gustavo Silva, Julia Lawall, Leonard Crestez, Rob Herring, Sudeep Holla). - Update the IO-wait boost handling in the schedutil governor to make it less aggressive (Joel Fernandes). - Rework system suspend diagnostics to make it print fewer messages to the kernel log by default, add a sysfs knob to allow more suspend-related messages to be printed and add Low Power S0 Idle constraints checks to the ACPI suspend-to-idle code (Rafael Wysocki, Srinivas Pandruvada). - Prefer suspend-to-idle over S3 on ACPI-based systems with the ACPI_FADT_LOW_POWER_S0 flag set and the Low Power Idle S0 _DSM interface present in the ACPI tables (Rafael Wysocki). - Update documentation related to system sleep and rename a number of items in the code to make it cleare that they are related to suspend-to-idle (Rafael Wysocki). - Export a variable allowing device drivers to check the target system sleep state from the core system suspend code (Florian Fainelli). - Clean up the cpuidle subsystem to handle the polling state on x86 in a more straightforward way and to use %pOF instead of full_name (Rafael Wysocki, Rob Herring). - Update the devfreq framework to fix and clean up a few minor issues (Chanwoo Choi, Rob Herring). - Extend diagnostics in the generic power domains (genpd) framework and clean it up slightly (Thara Gopinath, Rob Herring). - Fix and clean up a couple of issues in the operating performance points (OPP) framework (Viresh Kumar, Waldemar Rymarkiewicz). - Add support for RV1108 to the rockchip-io Adaptive Voltage Scaling (AVS) driver (David Wu). - Fix the usage of notifiers in CPU power management on some platforms (Alex Shi). - Update the pm-graph system suspend/hibernation and boot profiling utility (Todd Brandt). - Make it possible to run the cpupower utility without CPU0 (Prarit Bhargava)" * tag 'pm-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (87 commits) cpuidle: Make drivers initialize polling state cpuidle: Move polling state initialization code to separate file cpuidle: Eliminate the CPUIDLE_DRIVER_STATE_START symbol cpufreq: imx6q: Fix imx6sx low frequency support cpufreq: speedstep-lib: make several arrays static, makes code smaller PM: docs: Delete the obsolete states.txt document PM: docs: Describe high-level PM strategies and sleep states PM / devfreq: Fix memory leak when fail to register device PM / devfreq: Add dependency on PM_OPP PM / devfreq: Move private devfreq_update_stats() into devfreq PM / devfreq: Convert to using %pOF instead of full_name PM / AVS: rockchip-io: add io selectors and supplies for RV1108 cpufreq: ti: Fix 'of_node_put' being called twice in error handling path cpufreq: dt-platdev: Drop few entries from whitelist cpufreq: dt-platdev: Automatically create cpufreq device with OPP v2 ARM: ux500: don't select CPUFREQ_DT cpuidle: Convert to using %pOF instead of full_name cpufreq: Convert to using %pOF instead of full_name PM / Domains: Convert to using %pOF instead of full_name cpufreq: Cap the default transition delay value to 10 ms ...
2017-09-05dma-coherent: fix dma_declare_coherent_memory() logic errorArnd Bergmann
A recent change interprets the return code of dma_init_coherent_memory as an error value, but it is instead a boolean, where 'true' indicates success. This leads causes the caller to always do the wrong thing, and also triggers a compile-time warning about it: drivers/base/dma-coherent.c: In function 'dma_declare_coherent_memory': drivers/base/dma-coherent.c:99:15: error: 'mem' may be used uninitialized in this function [-Werror=maybe-uninitialized] I ended up changing the code a little more, to give use the usual error handling, as this seemed the best way to fix up the warning and make the code look reasonable at the same time. Fixes: 2436bdcda53f ("dma-coherent: remove the DMA_MEMORY_MAP and DMA_MEMORY_IO flags") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-09-04dma-coherent: remove an unused variableChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
2017-09-04Merge branch 'pm-sleep'Rafael J. Wysocki
* pm-sleep: ACPI / PM: Check low power idle constraints for debug only PM / s2idle: Rename platform operations structure PM / s2idle: Rename ->enter_freeze to ->enter_s2idle PM / s2idle: Rename freeze_state enum and related items PM / s2idle: Rename PM_SUSPEND_FREEZE to PM_SUSPEND_TO_IDLE ACPI / PM: Prefer suspend-to-idle over S3 on some systems platform/x86: intel-hid: Wake up Dell Latitude 7275 from suspend-to-idle PM / suspend: Define pr_fmt() in suspend.c PM / suspend: Use mem_sleep_labels[] strings in messages PM / sleep: Put pm_test under CONFIG_PM_SLEEP_DEBUG PM / sleep: Check pm_wakeup_pending() in __device_suspend_noirq() PM / core: Add error argument to dpm_show_time() PM / core: Split dpm_suspend_noirq() and dpm_resume_noirq() PM / s2idle: Rearrange the main suspend-to-idle loop PM / timekeeping: Print debug messages when requested PM / sleep: Mark suspend/hibernation start and finish PM / sleep: Do not print debug messages by default PM / suspend: Export pm_suspend_target_state
2017-09-04Merge branches 'pm-core', 'pm-opp', 'pm-domains', 'pm-cpu' and 'pm-avs'Rafael J. Wysocki
* pm-core: PM / wakeup: Set power.can_wakeup if wakeup_sysfs_add() fails * pm-opp: PM / OPP: Fix get sharing CPUs when hotplug is used PM / OPP: OF: Use pr_debug() instead of pr_err() while adding OPP table * pm-domains: PM / Domains: Convert to using %pOF instead of full_name PM / Domains: Extend generic power domain debugfs PM / Domains: Add time accounting to various genpd states * pm-cpu: PM / CPU: replace raw_notifier with atomic_notifier * pm-avs: PM / AVS: rockchip-io: add io selectors and supplies for RV1108
2017-09-01dma-coherent: remove the DMA_MEMORY_MAP and DMA_MEMORY_IO flagsChristoph Hellwig
DMA_MEMORY_IO was never used in the tree, so remove it. That means there is no need for the DMA_MEMORY_MAP flag either now, so remove it as well and change dma_declare_coherent_memory to return a normal errno value. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com>
2017-09-01dma-coherent: remove the DMA_MEMORY_INCLUDES_CHILDREN flagChristoph Hellwig
This flag was never implemented or used. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
2017-08-31driver core: bus: Fix a potential double freeChristophe JAILLET
The .release function of driver_ktype is 'driver_release()'. This function frees the container_of this kobject. So, this memory must not be freed explicitly in the error handling path of 'bus_add_driver()'. Otherwise a double free will occur. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-28Do not disable driver and bus shutdown hook when class shutdown hook is set.Michal Suchanek
As seen from the implementation of the single class shutdown hook this is not very sound design. Rename the class shutdown hook to shutdown_pre to make it clear it runs before the driver shutdown hook. Signed-off-by: Michal Suchanek <msuchanek@suse.de> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-28base: topology: constify attribute_group structures.Arvind Yadav
attribute_group are not supposed to change at runtime. All functions working with attribute_group provided by <linux/sysfs.h> work with const attribute_group. So mark the non-const structs as const. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-28base: Convert to using %pOF instead of full_nameRob Herring
Now that we have a custom printf format specifier, convert users of full_name to use %pOF instead. This is preparation to remove storing of the full path string for each node. Signed-off-by: Rob Herring <robh@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-25PM / Domains: Convert to using %pOF instead of full_nameRob Herring
Now that we have a custom printf format specifier, convert users of full_name to use %pOF instead. This is preparation to remove storing of the full path string for each node. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>