Age  Commit message  Author
2016-03-09  ACPI / EC: Deny write access unless requested by module param  [Oleg Drokin]
In debugfs it's not enough to just set the file mode to read-only to deny write access to a file; instead, don't provide the write method at all unless write access is really requested. Signed-off-by: Oleg Drokin <green@linuxhacker.ru> Acked-by: Thomas Renninger <trenn@suse.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
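For illustration, a minimal sketch of that pattern (the ec_read()/ec_write() helpers and the write_support flag are assumptions, not the driver's actual identifiers): pick a file_operations without a .write method unless writes were explicitly enabled.
    /* Hypothetical sketch: only the fops selection matters here. */
    static const struct file_operations ec_ro_fops = {
            .owner  = THIS_MODULE,
            .open   = simple_open,
            .read   = ec_read,              /* assumed read helper */
            .llseek = default_llseek,
    };

    static const struct file_operations ec_rw_fops = {
            .owner  = THIS_MODULE,
            .open   = simple_open,
            .read   = ec_read,
            .write  = ec_write,             /* assumed write helper */
            .llseek = default_llseek,
    };

    static void ec_create_debugfs_sketch(struct dentry *dev_dir, void *ec_data,
                                         bool write_support)
    {
            debugfs_create_file("io", write_support ? 0600 : 0400, dev_dir,
                                ec_data,
                                write_support ? &ec_rw_fops : &ec_ro_fops);
    }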
2016-03-09  ACPI / fan: Make struct dev_pm_ops const  [Kaiyen Chang]
Silence the following checkpatch warning: WARNING: struct dev_pm_ops should normally be const. Signed-off-by: Kaiyen Chang <kaiyen.chang@intel.com> Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
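A generic sketch of the constified pattern checkpatch asks for (the suspend/resume stubs below are placeholders, not the fan driver's actual callbacks):
    #include <linux/pm.h>

    static int fan_suspend_sketch(struct device *dev) { return 0; }   /* stub */
    static int fan_resume_sketch(struct device *dev)  { return 0; }   /* stub */

    static const struct dev_pm_ops fan_pm_ops_sketch = {
            SET_SYSTEM_SLEEP_PM_OPS(fan_suspend_sketch, fan_resume_sketch)
    };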
2016-03-09  Merge tag 'for-v4.5-rc/omap-critical-fixes-a' of git://git.kernel.org/pub/scm/linux/kernel/git/pjw/omap-pending into fixes  [Olof Johansson]
ARM: OMAP2+: critical DRA7xx fix for v4.5-rc: Force the DRA7xx Ethernet internal clock source to stay enabled per TI erratum i877: http://www.ti.com/lit/er/sprz429h/sprz429h.pdf Otherwise, if the Ethernet internal clock source is disabled, the chip will age prematurely, and the RGMII I/O timing will soon fail to meet the delay time and skew specifications for 1000Mbps Ethernet. This fix should go in as soon as possible. Basic build, boot, and PM test results are available here: http://www.pwsan.com/omap/testlogs/omap-critical-fixes-for-v4.5-rc/20160307014209/
* tag 'for-v4.5-rc/omap-critical-fixes-a' of git://git.kernel.org/pub/scm/linux/kernel/git/pjw/omap-pending:
  ARM: dts: dra7: do not gate cpsw clock due to errata i877
  ARM: OMAP2+: hwmod: Introduce ti,no-idle dt property
Signed-off-by: Olof Johansson <olof@lixom.net>
2016-03-09  Merge tag 'pci-v4.5-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci  [Linus Torvalds]
Pull PCI fix from Bjorn Helgaas: "Here's another fix for v4.5. It fixes an ARM regression in v4.0 that causes many boxes to crash on boot, including cns3xxx, dove, footbridge, iop13xx, iop32x, iop33x, ixp4xx, ks8695, mv78xx0, orion5x, pxa, sa1100, etc. The change is in code that's only built for ARM and ARM64. Summary: Enumeration: Allow generic PCI domains without bridge "parent" pointer (Krzysztof Hałasa)"
* tag 'pci-v4.5-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: Allow a NULL "parent" pointer in pci_bus_assign_domain_nr()
2016-03-09Revert "cpufreq: postfix policy directory with the first CPU in related_cpus"Viresh Kumar
Revert commit 3510fac45492 (cpufreq: postfix policy directory with the first CPU in related_cpus). Earlier, the policy->kobj was added to the kobject core, before ->init() callback was called for the cpufreq drivers. Which allowed those drivers to add or remove, driver dependent, sysfs files/directories to the same kobj from their ->init() and ->exit() callbacks. That isn't possible anymore after commit 3510fac45492. Now, there is no other clean alternative that people can adopt. Its better to revert the earlier commit to allow cpufreq drivers to create/remove sysfs files from ->init() and ->exit() callbacks. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-09  pmem: don't allocate unused major device number  [NeilBrown]
When alloc_disk(0) or alloc_disk_node(0, XX) is used, the ->major number is completely ignored: all devices are allocated with a major of BLOCK_EXT_MAJOR. So there is no point allocating pmem_major. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
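As a rough illustration (a hypothetical helper, not the pmem driver's code), a zero-minors allocation needs neither register_blkdev() nor a driver-private major:
    #include <linux/genhd.h>

    /* Hypothetical sketch: queue/fops setup and add_disk() are omitted. */
    static struct gendisk *sketch_alloc_zero_minor_disk(void)
    {
            struct gendisk *disk = alloc_disk(0);   /* ->major will be ignored */

            if (disk)
                    disk->first_minor = 0;
            /* when added, the block layer hands out an extended devt under
             * BLOCK_EXT_MAJOR, so no register_blkdev() call is needed */
            return disk;
    }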
2016-03-09  ACPI: Change NFIT driver to insert new resource  [Toshi Kani]
ACPI 6 defines persistent memory (PMEM) ranges in multiple firmware interfaces, e820, EFI, and the ACPI NFIT table. This EFI change, however, leads to hitting a bug in the grub bootloader, which treats EFI_PERSISTENT_MEMORY type as regular memory and corrupts stored user data [1]. Therefore, BIOS may set a generic reserved type in e820 and EFI to cover PMEM ranges. The kernel can initialize PMEM ranges from the ACPI NFIT table alone. This scheme causes a problem in the iomem table, though. On x86, for instance, e820_reserve_resources() initializes top-level entries (iomem_resource.child) from the e820 table at early boot-time. This creates a "reserved" entry for a PMEM range, which does not allow region_intersects() to check with PMEM type. Change acpi_nfit_register_region() to call acpi_nfit_insert_resource(), which calls insert_resource() to insert a PMEM entry from NFIT when the iomem table does not have a PMEM entry already. That is, when a PMEM range is marked as reserved type in e820, it inserts a "Persistent Memory" entry, which results in the following layout:
    + "Persistent Memory"
      + "reserved"
This allows the EINJ driver, which calls region_intersects() to check PMEM ranges, to keep working even if BIOS sets reserved type (or sets nothing) for PMEM ranges in e820 and EFI. [1]: https://lists.gnu.org/archive/html/grub-devel/2015-11/msg00209.html Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-03-09  resource: Export insert_resource and remove_resource  [Toshi Kani]
insert_resource() and remove_resource() are called by producers of resources, such as FW modules and bus drivers. These modules may be implemented as loadable modules. Export insert_resource() and remove_resource() so that they can be called from such modules. link: https://lkml.org/lkml/2016/3/8/872 Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
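A minimal sketch of the kind of loadable producer this enables (the module, resource name, and address range below are invented for illustration):
    #include <linux/init.h>
    #include <linux/ioport.h>
    #include <linux/module.h>

    static struct resource example_pmem_res = {
            .name  = "Persistent Memory",
            .start = 0x100000000ULL,            /* made-up range */
            .end   = 0x1ffffffffULL,
            .flags = IORESOURCE_MEM,
    };

    static int __init example_producer_init(void)
    {
            /* publish the range in the iomem table */
            return insert_resource(&iomem_resource, &example_pmem_res);
    }

    static void __exit example_producer_exit(void)
    {
            /* undo the insertion, moving any children back up */
            remove_resource(&example_pmem_res);
    }

    module_init(example_producer_init);
    module_exit(example_producer_exit);
    MODULE_LICENSE("GPL");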
2016-03-09  resource: Add remove_resource interface  [Toshi Kani]
insert_resource() and insert_resource_conflict() are called by resource producers to insert a new resource. When there is any conflict, they move conflicting resources down to the children of the new resource. There is no destructor of these interfaces, however. Add remove_resource(), which removes a resource previously inserted by insert_resource() or insert_resource_conflict(), and moves the children up to where they were before. __release_resource() is changed to have @release_child, so that this function can be used for remove_resource() as well. Also add comments to clarify that these functions are intended for producers of resources to avoid any confusion with request/release_resource() for consumers. Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-03-09  resource: Change __request_region to inherit from immediate parent  [Toshi Kani]
__request_region() sets 'flags' of a new resource from @parent as it inherits the parent's attribute. When a target resource has a conflict, this function inserts the new resource entry under the conflicting entry by updating @parent. In this case, the new resource entry needs to inherit the attribute from the updated parent. This conflict is a typical case since __request_region() is used to allocate a new resource from a specific resource range. For instance, request_mem_region() calls __request_region() with @parent set to &iomem_resource, which is the root entry of the whole iomem range. When this request results in inserting a new entry "DEV-A" under "BUS-1", "DEV-A" needs to inherit from the immediate parent "BUS-1" as it holds specific attributes for the range.
    root (&iomem_resource)
      :
      + "BUS-1"
        + "DEV-A"
Change __request_region() to set 'flags' and 'desc' of a new entry from the immediate parent. Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
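For illustration, a hypothetical consumer of such a sub-range (the address and names are made up); the point is that the new "DEV-A" entry now inherits 'flags' and 'desc' from its immediate parent rather than from the iomem root it was requested against:
    #include <linux/errno.h>
    #include <linux/ioport.h>

    static int claim_dev_a_window_sketch(void)
    {
            /* request part of a range previously inserted for "BUS-1" */
            if (!request_mem_region(0xfed00000, 0x1000, "DEV-A"))
                    return -EBUSY;
            /* ... program the device ... */
            release_mem_region(0xfed00000, 0x1000);
            return 0;
    }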
2016-03-09  tracing: Fix check for cpu online when event is disabled  [Steven Rostedt (Red Hat)]
Commit f37755490fe9b ("tracepoints: Do not trace when cpu is offline") added a check to make sure that tracepoints only get called when the cpu is online, as it uses rcu_read_lock_sched() for protection. Commit 3a630178fd5f3 ("tracing: generate RCU warnings even when tracepoints are disabled") added lockdep checks (including rcu checks) for events that are not enabled to catch possible RCU issues that would only be triggered if a trace event was enabled. Commit f37755490fe9b only stopped the warnings when the trace event was enabled but did not prevent warnings if the trace event was called when disabled. To fix this, the cpu online check is moved to where the condition is added to the trace event. This will place the cpu online check in all places that it may be used now and in the future. Cc: stable@vger.kernel.org # v3.18+ Fixes: f37755490fe9b ("tracepoints: Do not trace when cpu is offline") Fixes: 3a630178fd5f3 ("tracing: generate RCU warnings even when tracepoints are disabled") Reported-by: Sudeep Holla <sudeep.holla@arm.com> Tested-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
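Illustration only (not the real tracepoint macros): the effect is that the online check becomes part of the condition attached to the trace event, so it is evaluated even when the event itself is disabled:
    /* hypothetical condition wrapper; the real change lives in the
     * __DECLARE_TRACE()/TRACE_EVENT() machinery in <linux/tracepoint.h> */
    #define TRACE_COND_WITH_ONLINE_CHECK(user_cond) \
            (cpu_online(raw_smp_processor_id()) && (user_cond))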
2016-03-09  arm64: hugetlb: partial revert of 66b3923a1a0f  [Will Deacon]
Commit 66b3923a1a0f ("arm64: hugetlb: add support for PTE contiguous bit") introduced support for huge pages using the contiguous bit in the PTE as opposed to block mappings, which may be slightly unwieldy (512M) in 64k page configurations. Unfortunately, this support has resulted in some late regressions when running the libhugetlbfs test suite with 64k pages and CONFIG_DEBUG_VM as a result of a BUG:
  | readback (2M: 64): ------------[ cut here ]------------
  | kernel BUG at fs/hugetlbfs/inode.c:446!
  | Internal error: Oops - BUG: 0 [#1] SMP
  | Modules linked in:
  | CPU: 7 PID: 1448 Comm: readback Not tainted 4.5.0-rc7 #148
  | Hardware name: linux,dummy-virt (DT)
  | task: fffffe0040964b00 ti: fffffe00c2668000 task.ti: fffffe00c2668000
  | PC is at remove_inode_hugepages+0x44c/0x480
  | LR is at remove_inode_hugepages+0x264/0x480
Rather than revert the entire patch, simply avoid advertising the contiguous huge page sizes for now while people are actively working on a fix. This patch can then be reverted once things have been sorted out. Cc: David Woods <dwoods@ezchip.com> Reported-by: Steve Capper <steve.capper@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-03-09  gpio: tps65912: fix bad merge  [Linus Walleij]
I screwed up while merging the immutable branch for TPS65912, so fix up the result so it is unbroken again. Cc: Lee Jones <lee.jones@linaro.org> Cc: Andrew F. Davis <afd@ti.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-03-09Revert "gpio: lp3943: Drop pin_used and lp3943_gpio_request/lp3943_gpio_free"Linus Walleij
This reverts commit 3fab91ea284a3b795327dda915a3c150a49e4be2.
2016-03-09  arm64: account for sparsemem section alignment when choosing vmemmap offset  [Ard Biesheuvel]
Commit dfd55ad85e4a ("arm64: vmemmap: use virtual projection of linear region") fixed an issue where the struct page array would overflow into the adjacent virtual memory region if system RAM was placed so high up in physical memory that its addresses were not representable in the build time configured virtual address size. However, the fix failed to take into account that the vmemmap region needs to be relatively aligned with respect to the sparsemem section size, so that a sequence of page structs corresponding with a sparsemem section in the linear region appears naturally aligned in the vmemmap region. So round up vmemmap to sparsemem section size. Since this essentially moves the projection of the linear region up in memory, also revert the reduction of the size of the vmemmap region. Cc: <stable@vger.kernel.org> Fixes: dfd55ad85e4a ("arm64: vmemmap: use virtual projection of linear region") Tested-by: Mark Langsdorf <mlangsdo@redhat.com> Tested-by: David Daney <david.daney@cavium.com> Tested-by: Robert Richter <rrichter@cavium.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-03-09  cpufreq: Reduce cpufreq_update_util() overhead a bit  [Rafael J. Wysocki]
Use the observation that cpufreq_update_util() is only called by the scheduler with rq->lock held, so the callers of cpufreq_set_update_util_data() can use synchronize_sched() instead of synchronize_rcu() to wait for cpufreq_update_util() to complete. Moreover, if they are updated to do that, the rcu_read_(un)lock() calls in cpufreq_update_util() might be replaced with rcu_read_(un)lock_sched(), respectively, but those aren't really necessary, because the scheduler calls that function from RCU-sched read-side critical sections already. In addition to that, if cpufreq_set_update_util_data() checks the func field in the struct update_util_data before setting the per-CPU pointer to it, the data->func check may be dropped from cpufreq_update_util() as well. Make the above changes to reduce the overhead from cpufreq_update_util() in the scheduler paths invoking it and to make the cleanup after removing its callbacks somewhat less heavy-weight. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
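A minimal sketch of the teardown ordering this relies on (the helper name is invented; cpufreq_set_update_util_data() and synchronize_sched() are the interfaces named above):
    static void clear_util_hook_and_wait_sketch(int cpu)
    {
            /* detach the per-CPU hook first */
            cpufreq_set_update_util_data(cpu, NULL);
            /*
             * The scheduler invokes cpufreq_update_util() under rq->lock,
             * i.e. within an RCU-sched read-side critical section, so
             * synchronize_sched() is enough to wait out in-flight callers.
             */
            synchronize_sched();
            /* now the struct update_util_data can be freed safely */
    }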
2016-03-09  cpufreq: Select IRQ_WORK if CPU_FREQ_GOV_COMMON is set  [Rafael J. Wysocki]
Commit 0eb463be3436 (cpufreq: governor: Replace timers with utilization update callbacks) made CPU_FREQ select IRQ_WORK, but that's not necessary, as it is sufficient for IRQ_WORK to be selected by CPU_FREQ_GOV_COMMON, so modify the cpufreq Kconfig to that effect. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-03-09  dma, mm/pat: Rename dma_*_writecombine() to dma_*_wc()  [Luis R. Rodriguez]
Rename dma_*_writecombine() to dma_*_wc(), so that the naming is coherent across the various write-combining APIs. Keep the old names for compatibility for a while; these can be removed at a later time. A guard is left in place to enable backporting of the rename, and the later removal of the old mapping defines, seamlessly. Build tested successfully with allmodconfig. The following Coccinelle SmPL patch was used for this simple transformation:
    @ rename_dma_alloc_writecombine @
    expression dev, size, dma_addr, gfp;
    @@
    -dma_alloc_writecombine(dev, size, dma_addr, gfp)
    +dma_alloc_wc(dev, size, dma_addr, gfp)

    @ rename_dma_free_writecombine @
    expression dev, size, cpu_addr, dma_addr;
    @@
    -dma_free_writecombine(dev, size, cpu_addr, dma_addr)
    +dma_free_wc(dev, size, cpu_addr, dma_addr)

    @ rename_dma_mmap_writecombine @
    expression dev, vma, cpu_addr, dma_addr, size;
    @@
    -dma_mmap_writecombine(dev, vma, cpu_addr, dma_addr, size)
    +dma_mmap_wc(dev, vma, cpu_addr, dma_addr, size)
We also keep the old names as compatibility helpers, and guard against their definition to make backporting easier. Generated-by: Coccinelle SmPL Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: airlied@linux.ie Cc: akpm@linux-foundation.org Cc: benh@kernel.crashing.org Cc: bhelgaas@google.com Cc: bp@suse.de Cc: dan.j.williams@intel.com Cc: daniel.vetter@ffwll.ch Cc: dhowells@redhat.com Cc: julia.lawall@lip6.fr Cc: konrad.wilk@oracle.com Cc: linux-fbdev@vger.kernel.org Cc: linux-pci@vger.kernel.org Cc: luto@amacapital.net Cc: mst@redhat.com Cc: tomi.valkeinen@ti.com Cc: toshi.kani@hp.com Cc: vinod.koul@intel.com Cc: xen-devel@lists.xensource.com Link: http://lkml.kernel.org/r/1453516462-4844-1-git-send-email-mcgrof@do-not-panic.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
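A sketch of the kind of compatibility guard described above (illustrative; not necessarily the exact lines added to the headers): the old names map to the new ones unless something has already defined them.
    #ifndef dma_alloc_writecombine
    #define dma_alloc_writecombine dma_alloc_wc
    #define dma_free_writecombine  dma_free_wc
    #define dma_mmap_writecombine  dma_mmap_wc
    #endif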
2016-03-09  x86/defconfigs/32: Set CONFIG_FRAME_WARN to the Kconfig default  [Borislav Petkov]
Sync it to the Kconfig default for 32-bit. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: tim.gardner@canonical.com Link: http://lkml.kernel.org/r/20160309134821.GD6564@pd.tnic Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-09  perf tools: Omit unnecessary cast in perf_pmu__parse_scale  [Jiri Olsa]
There's no need to use a const char pointer; we can use a char pointer from the beginning and omit the unnecessary cast. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20160308184230.GB7897@krava.redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-09  cpufreq: Remove 'policy->governor_enabled'  [Viresh Kumar]
The entire sequence of events (like INIT/START or STOP/EXIT) for which cpufreq_governor() is called is guaranteed to be protected by policy->rwsem now. The additional checks that were added earlier (as we were forced to drop policy->rwsem before calling cpufreq_governor() for the EXIT event) aren't required anymore. On top of that, they weren't really sufficient: they only took care of the START/STOP events, not INIT/EXIT, and the state machine was never maintained properly by them. Kill the unnecessary checks and the policy->governor_enabled field. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-09  cpufreq: Rename __cpufreq_governor() to cpufreq_governor()  [Viresh Kumar]
The __ prefix at the beginning of the routine's name isn't really necessary at all. Rename it to cpufreq_governor() instead. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-09  cpufreq: Relocate handle_update() to kill its declaration  [Viresh Kumar]
handle_update() is declared at the top of the file because its user appears before its definition. Relocate the routine to get rid of the forward declaration. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-09  cpufreq: governor: Drop unnecessary checks from show() and store()  [Viresh Kumar]
The show() and store() routines in the cpufreq-governor core don't need to check if the struct governor_attr they want to use really provides the callbacks they need as expected (if that's not the case, it means a bug in the code anyway), so change them to avoid doing that. Also change the error value to -EBUSY, if the governor is getting removed and we aren't allowed to store any more changes. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
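A hypothetical sketch of the resulting show() path (structure and field names follow the common governor code but are assumptions here): the attribute callback is invoked directly, with no "does ->show exist?" check.
    static ssize_t governor_show_sketch(struct kobject *kobj,
                                        struct attribute *attr, char *buf)
    {
            struct dbs_data *dbs_data =
                    container_of(kobj, struct dbs_data, kobj);
            struct governor_attr *gattr =
                    container_of(attr, struct governor_attr, attr);

            /* a missing ->show would be a bug in the governor itself */
            return gattr->show(dbs_data, buf);
    }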
2016-03-09  cpufreq: governor: Fix race in dbs_update_util_handler()  [Rafael J. Wysocki]
There is a scenario that may lead to undesired results in dbs_update_util_handler(). Namely, if two CPUs sharing a policy enter the function at the same time, pass the sample delay check and then one of them is stalled until dbs_work_handler() (queued up by the other CPU) clears the work counter, it may update the work counter and queue up another work item prematurely. To prevent that from happening, use the observation that the CPU queuing up a work item in dbs_update_util_handler() updates the last sample time. This means that if another CPU was stalling after passing the sample delay check and now successfully updated the work counter as a result of the race described above, it will see the new value of the last sample time which is different from what it used in the sample delay check before. If that happens, the sample delay check passed previously is not valid any more, so the CPU should not continue. Fixes: f17cbb53783c (cpufreq: governor: Avoid atomic operations in hot paths) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-03-09  cpufreq: governor: Make gov_set_update_util() static  [Rafael J. Wysocki]
The gov_set_update_util() routine is only used internally by the common governor code and it doesn't need to be exported, so make it static. No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-03-09  cpufreq: governor: Narrow down the dbs_data_mutex coverage  [Rafael J. Wysocki]
Since cpufreq_governor_dbs() is now always called with policy->rwsem held, it cannot be executed twice in parallel for the same policy. Thus it is not necessary to hold dbs_data_mutex around the invocations of cpufreq_governor_start/stop/limits() from it, as those functions never modify any data that can be shared between different policies. However, cpufreq_governor_dbs() may be executed twice in parallel for different policies using the same gov->gdbs_data object, and dbs_data_mutex is still necessary to protect that object against concurrent updates. For this reason, narrow down the dbs_data_mutex locking to cpufreq_governor_init/exit() where it is needed and rename the mutex to gov_dbs_data_mutex to reflect its purpose. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-03-09  cpufreq: governor: Make dbs_data_mutex static  [Rafael J. Wysocki]
That mutex is only used by cpufreq_governor_dbs() and it doesn't need to be exported to modules, so make it static and drop the export incantation. No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-03-09  cpufreq: governor: Relocate definitions of tuners structures  [Rafael J. Wysocki]
Move the definitions of struct od_dbs_tuners and struct cs_dbs_tuners from the common governor header to the ondemand and conservative governor code, respectively, as they don't need to be in the common header any more. No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-03-09  cpufreq: governor: Move per-CPU data to the common code  [Rafael J. Wysocki]
After previous changes there is only one piece of code in the ondemand governor making references to per-CPU data structures, but it can be easily modified to avoid doing that, so modify it accordingly and move the definition of per-CPU data used by the ondemand and conservative governors to the common code. Next, change that code to access the per-CPU data structures directly rather than via a governor callback. This causes the ->get_cpu_cdbs governor callback to become unnecessary, so drop it along with the macro and function definitions related to it. Finally, drop the definitions of struct od_cpu_dbs_info_s and struct cs_cpu_dbs_info_s that aren't necessary any more. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-03-09  cpufreq: governor: Make governor private data per-policy  [Rafael J. Wysocki]
Some fields in struct od_cpu_dbs_info_s and struct cs_cpu_dbs_info_s are only used for a limited set of CPUs. Namely, if a policy is shared between multiple CPUs, those fields will only be used for one of them (policy->cpu). This means that they really are per-policy rather than per-CPU and holding room for them in per-CPU data structures is generally wasteful. Also moving those fields into per-policy data structures will allow some significant simplifications to be made going forward. For this reason, introduce struct cs_policy_dbs_info and struct od_policy_dbs_info to hold those fields. Define each of the new structures as an extension of struct policy_dbs_info (such that struct policy_dbs_info is embedded in each of them) and introduce new ->alloc and ->free governor callbacks to allocate and free those structures, respectively, such that ->alloc() will return a pointer to the struct policy_dbs_info embedded in the allocated data structure and ->free() will take that pointer as its argument. With that, modify the code accessing the data fields in question in per-CPU data objects to look for them in the new structures via the struct policy_dbs_info pointer available to it and drop them from struct od_cpu_dbs_info_s and struct cs_cpu_dbs_info_s. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
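A condensed sketch of the embedding pattern described above, using the conservative governor as the example (field and helper names follow the series but should be read as illustrative):
    #include <linux/slab.h>

    struct cs_policy_dbs_info {
            struct policy_dbs_info policy_dbs;   /* must stay embedded */
            unsigned int down_skip;
            unsigned int requested_freq;
    };

    static inline struct cs_policy_dbs_info *
    to_cs_dbs_info(struct policy_dbs_info *policy_dbs)
    {
            return container_of(policy_dbs, struct cs_policy_dbs_info,
                                policy_dbs);
    }

    /* ->alloc() returns a pointer to the embedded policy_dbs_info ... */
    static struct policy_dbs_info *cs_alloc_sketch(void)
    {
            struct cs_policy_dbs_info *dbs_info;

            dbs_info = kzalloc(sizeof(*dbs_info), GFP_KERNEL);
            return dbs_info ? &dbs_info->policy_dbs : NULL;
    }

    /* ... and ->free() takes that same pointer back */
    static void cs_free_sketch(struct policy_dbs_info *policy_dbs)
    {
            kfree(to_cs_dbs_info(policy_dbs));
    }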
2016-03-09  cpufreq: ondemand: Rework the handling of powersave bias updates  [Rafael J. Wysocki]
The ondemand_powersave_bias_init() function used for resetting data fields related to the powersave bias tunable of the ondemand governor works by walking all of the online CPUs in the system and updating the od_cpu_dbs_info_s structures for all of them. However, if governor tunables are per policy, the update should not touch the CPUs that are not associated with the given dbs_data. Moreover, since the data fields in question are only ever used for policy->cpu in each policy governed by ondemand, the update can be limited to those specific CPUs. Rework the code to take the above observations into account. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-03-09  cpufreq: governor: Fix CPU load information updates via ->store  [Rafael J. Wysocki]
The ->store() callbacks of some tunable sysfs attributes of the ondemand and conservative governors trigger immediate updates of the CPU load information for all CPUs "governed" by the given dbs_data by walking the cpu_dbs_info structures for all online CPUs in the system and updating them. This is questionable for two reasons. First, it may lead to a lot of extra overhead on a system with many CPUs if the given dbs_data is only associated with a few of them. Second, if governor tunables are per-policy, the CPUs associated with the other sets of governor tunables should not be updated. To address this issue, use the observation that in all of the places in question the update operation may be carried out in the same way (because all of the tunables involved are now located in struct dbs_data and readily available to the common code) and make the code in those places invoke the same (new) helper function that will carry out the update correctly. That new function always checks the ignore_nice_load tunable value and updates the CPUs' prev_cpu_nice data fields if that's set, which wasn't done by the original code in store_io_is_busy(), but it should have been done in there too. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-03-09  cpufreq: ondemand: Drop one more callback from struct od_ops  [Rafael J. Wysocki]
The ->powersave_bias_init_cpu callback in struct od_ops is only used in one place and that invocation may be replaced with a direct call to the function pointed to by that callback, so change the code accordingly and drop the callback. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-03-09  cpufreq: governor: Drop unused governor callback and data fields  [Rafael J. Wysocki]
After some previous changes, the ->get_cpu_dbs_info_s governor callback and the "governor" field in struct dbs_governor (whose value represents the governor type) are not used any more, so drop them. Also drop the unused gov_ops field from struct dbs_governor. No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-03-09  cpufreq: governor: Add a ->start callback for governors  [Rafael J. Wysocki]
To avoid having to check the governor type explicitly in the common code in order to initialize data structures specific to the governor type properly, add a ->start callback to struct dbs_governor and use it to initialize those data structures for the ondemand and conservative governors. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-03-09  cpufreq: governor: Move io_is_busy to struct dbs_data  [Rafael J. Wysocki]
The io_is_busy governor tunable is only used by the ondemand governor and is located in the ondemand-specific data structure, but it is looked at by the common governor code that has to do ugly things to get to that value, so move it to struct dbs_data and modify ondemand accordingly. Since the conservative governor never touches that field, it will be always 0 for that governor and it won't have any effect on the results of computations in that case. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-03-09  cpufreq: governor: Close dbs_data update race condition  [Rafael J. Wysocki]
It is possible for a dbs_data object to be updated after its usage counter has become 0. That may happen if governor_store() runs (via a governor tunable sysfs attribute write) in parallel with cpufreq_governor_exit() called for the last cpufreq policy associated with the dbs_data in question. In that case, if governor_store() acquires dbs_data->mutex right after cpufreq_governor_exit() has released it, the ->store() callback invoked by it may operate on dbs_data with no users. Although sysfs will cause the kobject_put() in cpufreq_governor_exit() to block until governor_store() has returned, that situation may lead to some unexpected results, depending on the implementation of the ->store() callback, and therefore it should be avoided. To that end, modify governor_store() to check the dbs_data's usage count before invoking the ->store() callback and return an error if it is 0 at that point. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
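A sketch of the check described above (field names such as usage_count follow the series but are assumptions here):
    static ssize_t governor_store_sketch(struct kobject *kobj,
                                         struct attribute *attr,
                                         const char *buf, size_t count)
    {
            struct dbs_data *dbs_data =
                    container_of(kobj, struct dbs_data, kobj);
            struct governor_attr *gattr =
                    container_of(attr, struct governor_attr, attr);
            ssize_t ret = -EBUSY;

            mutex_lock(&dbs_data->mutex);
            /* refuse the write once the dbs_data has no users left */
            if (dbs_data->usage_count)
                    ret = gattr->store(dbs_data, buf, count);
            mutex_unlock(&dbs_data->mutex);

            return ret;
    }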
2016-03-09  cpufreq: ondemand: Drop unused callback from struct od_ops  [Rafael J. Wysocki]
The ->freq_increase callback in struct od_ops is never invoked, so drop it. No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-03-09  cpufreq: ondemand: Simplify od_update() slightly  [Rafael J. Wysocki]
Drop some lines of code from od_update() by arranging the statements in there in a more logical way. No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-03-09  cpufreq: governor: Use microseconds in sample delay computations  [Rafael J. Wysocki]
Do not convert microseconds to jiffies and the other way around in governor computations related to the sampling rate and sample delay and drop delay_for_sampling_rate() which isn't of any use then. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-03-09  cpufreq: ondemand: Simplify conditionals in od_dbs_timer()  [Rafael J. Wysocki]
Reduce the indentation level in the conditionals in od_dbs_timer() and drop the delay variable from it. No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-03-09  cpufreq: governor: Move rate_mult to struct policy_dbs  [Rafael J. Wysocki]
The rate_mult field in struct od_cpu_dbs_info_s is used by the code shared with the conservative governor, and to access it that code has to do an ugly governor type check. However, first of all it is only ever used for policy->cpu, so it is per-policy rather than per-CPU, and second, it is initialized to 1 by cpufreq_governor_start(), so if the conservative governor never modifies it, it will have no effect on the results of any computations. For these reasons, move rate_mult to struct policy_dbs_info (as a common field). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-03-09  cpufreq: governor: Reset sample delay in store_sampling_rate()  [Rafael J. Wysocki]
If store_sampling_rate() updates the sample delay when the ondemand governor is in the middle of its high/low dance (OD_SUB_SAMPLE sample type is set), the governor will still do the bottom half of the previous sample which may take too much time. To prevent that from happening, change store_sampling_rate() to always reset the sample delay to 0 which also is consistent with the new behavior of cpufreq_governor_limits(). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-03-09  cpufreq: governor: Get rid of the ->gov_check_cpu callback  [Rafael J. Wysocki]
The way the ->gov_check_cpu governor callback is used by the ondemand and conservative governors is not really straightforward. Namely, the governor calls dbs_check_cpu(), which updates the load information for the policy and then invokes ->gov_check_cpu() for the governor. To get rid of that entanglement, notice that cpufreq_governor_limits() doesn't need to call dbs_check_cpu() directly. Instead, it can simply reset the sample delay to 0, which will cause a sample to be taken immediately. The result of that is practically equivalent to calling dbs_check_cpu() except that it will trigger a full update of governor internal state and not just the ->gov_check_cpu() part. Following that observation, make cpufreq_governor_limits() reset the sample delay and turn dbs_check_cpu() into a function that will simply evaluate the load and return the result, called dbs_update(). That function can now be called by governors from the routines that previously were pointed to by ->gov_check_cpu, and those routines can be called directly by each governor instead of dbs_check_cpu(). This way ->gov_check_cpu becomes unnecessary, so drop it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-03-09  cpufreq: governor: Clean up load-related computations  [Rafael J. Wysocki]
Clean up some load-related computations in dbs_check_cpu() and cpufreq_governor_start() to get rid of unnecessary operations and type casts and make the code easier to read. No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-03-09  cpufreq: governor: Fix nice contribution computation in dbs_check_cpu()  [Rafael J. Wysocki]
The contribution of the CPU nice time to the idle time in dbs_check_cpu() is computed in a bogus way, as the code may subtract current and previous nice values for different CPUs. That doesn't matter for cases when cpufreq policies are not shared, but may lead to problems otherwise. Fix the computation and simplify it to avoid taking unnecessary steps. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-03-09  cpufreq: governor: Avoid atomic operations in hot paths  [Rafael J. Wysocki]
Rework the handling of work items by dbs_update_util_handler() and dbs_work_handler() so the former (which is executed in scheduler paths) only uses atomic operations when absolutely necessary. That is, when the policy is shared and dbs_update_util_handler() has already decided that this is the time to queue up a work item. In particular, this avoids the atomic ops entirely on platforms where policy objects are never shared. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-03-09  cpufreq: governor: Simplify gov_cancel_work() slightly  [Rafael J. Wysocki]
The atomic work counter incrementation in gov_cancel_work() is not necessary any more, because work items won't be queued up after gov_clear_update_util() anyway, so drop it along with the comment about how it may be missed by the gov_clear_update_util(). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-03-09  cpufreq: governor: Avoid irq_work_queue_on() crash on non-SMP ARM  [Rafael J. Wysocki]
As it turns out, irq_work_queue_on() will crash if invoked on non-SMP ARM platforms, but in fact it is not necessary to use that function in the cpufreq governor code (as it doesn't matter to that code which CPU will handle the irq_work), so change it to always use irq_work_queue(). Fixes: 8fb47ff100af (cpufreq: governor: Replace timers with utilization update callbacks) Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net> Reported-and-tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
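A minimal sketch of the change in queueing style (the wrapper is hypothetical; irq_work_queue() and irq_work_queue_on() are the kernel interfaces named above):
    #include <linux/irq_work.h>

    static void queue_governor_irq_work_sketch(struct irq_work *irq_work)
    {
            /*
             * It does not matter which CPU runs the irq_work here, so queue
             * it on the local CPU with irq_work_queue() instead of calling
             * irq_work_queue_on(irq_work, cpu), which is what trips up
             * non-SMP ARM builds.
             */
            irq_work_queue(irq_work);
    }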