summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2014-10-03Merge back cpufreq material for v3.18.Rafael J. Wysocki
2014-10-03cpufreq: cpufreq-dt: fix potential double put of cpu OF nodeLucas Stach
If cpufreq_generic_init() fails we jump into the resource cleanup path which contains a of_node_put() call. Another instance of this has already been called at that time resulting a double decrement of the refcount. Fix this by calling of_node_put() only after we are sure that nothing has gone wrong. Fixes: d2f31f1da54f "cpufreq: cpu0: Move per-cluster initialization code to ->init()" Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-10-03GFS2: Use gfs2_rbm_incr in rgblk_freeBob Peterson
This patch speeds up GFS2 unlink operations by using function gfs2_rbm_incr rather than continuously calculating the rbm. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2014-10-03cpufreq: cpu0: rename driver and internals to 'cpufreq_dt'Viresh Kumar
The naming convention of this driver was always under the scanner, people complained that it should have a more generic name than cpu0, as it manages all CPUs that are sharing clock lines. Also, in future it will be modified to support any number of clusters with separate clock/voltage lines. Lets rename it to 'cpufreq_dt' from 'cpufreq_cpu0'. Tested-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-10-03KEYS: handle error code encoded in pointerDmitry Kasatkin
If hexlen is odd then function returns an error. Use IS_ERR to check for error, otherwise invalid pointer is used and kernel gives oops: [ 132.816522] BUG: unable to handle kernel paging request at ffffffffffffffea [ 132.819902] IP: [<ffffffff812bfc20>] asymmetric_key_id_same+0x14/0x36 [ 132.820302] PGD 1a12067 PUD 1a14067 PMD 0 [ 132.820302] Oops: 0000 [#1] SMP [ 132.820302] Modules linked in: bridge(E) stp(E) llc(E) evdev(E) serio_raw(E) i2c_piix4(E) button(E) fuse(E) [ 132.820302] CPU: 0 PID: 2993 Comm: cat Tainted: G E 3.16.0-kds+ #2847 [ 132.820302] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 132.820302] task: ffff88004249a430 ti: ffff880056640000 task.ti: ffff880056640000 [ 132.820302] RIP: 0010:[<ffffffff812bfc20>] [<ffffffff812bfc20>] asymmetric_key_id_same+0x14/0x36 [ 132.820302] RSP: 0018:ffff880056643930 EFLAGS: 00010246 [ 132.820302] RAX: 0000000000000000 RBX: ffffffffffffffea RCX: ffff880056643ae0 [ 132.820302] RDX: 000000000000005e RSI: ffffffffffffffea RDI: ffff88005bac9300 [ 132.820302] RBP: ffff880056643948 R08: 0000000000000003 R09: 00000007504aa01a [ 132.820302] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88005d68ca40 [ 132.820302] R13: 0000000000000101 R14: 0000000000000000 R15: ffff88005bac5280 [ 132.820302] FS: 00007f67a153c740(0000) GS:ffff88005da00000(0000) knlGS:0000000000000000 [ 132.820302] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 132.820302] CR2: ffffffffffffffea CR3: 000000002e663000 CR4: 00000000000006f0 [ 132.820302] Stack: [ 132.820302] ffffffff812bfc66 ffff880056643ae0 ffff88005bac5280 ffff880056643958 [ 132.820302] ffffffff812bfc9d ffff880056643980 ffffffff812971d9 ffff88005ce930c1 [ 132.820302] ffff88005ce930c0 0000000000000000 ffff8800566439c8 ffffffff812fb753 [ 132.820302] Call Trace: [ 132.820302] [<ffffffff812bfc66>] ? asymmetric_match_key_ids+0x24/0x42 [ 132.820302] [<ffffffff812bfc9d>] asymmetric_key_cmp+0x19/0x1b [ 132.820302] [<ffffffff812971d9>] keyring_search_iterator+0x74/0xd7 [ 132.820302] [<ffffffff812fb753>] assoc_array_subtree_iterate+0x67/0xd2 [ 132.820302] [<ffffffff81297165>] ? key_default_cmp+0x20/0x20 [ 132.820302] [<ffffffff812fbaa1>] assoc_array_iterate+0x19/0x1e [ 132.820302] [<ffffffff81297332>] search_nested_keyrings+0xf6/0x2b6 [ 132.820302] [<ffffffff810728da>] ? sched_clock_cpu+0x91/0xa2 [ 132.820302] [<ffffffff810860d2>] ? mark_held_locks+0x58/0x6e [ 132.820302] [<ffffffff810a137d>] ? current_kernel_time+0x77/0xb8 [ 132.820302] [<ffffffff81297871>] keyring_search_aux+0xe1/0x14c [ 132.820302] [<ffffffff812977fc>] ? keyring_search_aux+0x6c/0x14c [ 132.820302] [<ffffffff8129796b>] keyring_search+0x8f/0xb6 [ 132.820302] [<ffffffff812bfc84>] ? asymmetric_match_key_ids+0x42/0x42 [ 132.820302] [<ffffffff81297165>] ? key_default_cmp+0x20/0x20 [ 132.820302] [<ffffffff812ab9e3>] asymmetric_verify+0xa4/0x214 [ 132.820302] [<ffffffff812ab90e>] integrity_digsig_verify+0xb1/0xe2 [ 132.820302] [<ffffffff812abe41>] ? evm_verifyxattr+0x6a/0x7a [ 132.820302] [<ffffffff812b0390>] ima_appraise_measurement+0x160/0x370 [ 132.820302] [<ffffffff81161db2>] ? d_absolute_path+0x5b/0x7a [ 132.820302] [<ffffffff812ada30>] process_measurement+0x322/0x404 Reported-by: Dmitry Kasatkin <d.kasatkin@samsung.com> Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com> Signed-off-by: David Howells <dhowells@redhat.com>
2014-10-03perf callchain: Move callchain_param to util object in to fix python testJiri Olsa
In following commit we changed the location of callchains data: 72a128aa083a7f4cc4f800718aaae05d9c698e26 perf tools: Move callchain config from record_opts to callchain_param Now all callchains stuff stays in callchain_param struct, which adds its dependency for evsel.c object and breaks python perf.so usage (unresolved callchain_param). Moving callchain_param into callchain.c and adding it into python-ext-sources unleash just another dependency hell, so I ended up adding callchain_param into util.c for now. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Milian Wolff <mail@milianw.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1412179229-19466-2-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-10-03perf kvm stat live: Use fdarray object instead of pollfdJiri Olsa
The reason is that we don't need to count the number of file descriptors because it's already handled in fdarray object. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Milian Wolff <mail@milianw.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1412179229-19466-5-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-10-03perf kvm stat live: Use perf_evlist__add_pollfd return fd positionJiri Olsa
With the interface changed in following commit: 2171a9256862 tools lib fd array: Allow associating an integer cookie with each entry the perf_evlist__add_pollfd function now returns the fd position in the pollfd array. Hence we no longer need to count the fd position, because we get it as the return value. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Milian Wolff <mail@milianw.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1412179229-19466-4-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-10-03perf kvm stat live: Fix perf_evlist__add_pollfd error handlingJiri Olsa
With the interface changed in following commit: 2171a9256862 tools lib fd array: Allow associating an integer cookie with each entry the perf_evlist__add_pollfd function now returns the fd position in the pollfd array. We need to change this function's error check condition. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Milian Wolff <mail@milianw.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1412179229-19466-3-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-10-03mmc, sdhci, bcm-kona, LLVMLinux: Remove use of __initconstBehan Webster
The __initconst is in the wrong place, and when moved to the correct place it uncovers an error where the variable is used by non-init data structures. Instead merely make them const and put the const in the right spot. Signed-off-by: Behan Webster <behanw@converseincode.com> Reviewed-by: Mark Charlebois <charlebm@gmail.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Matt Porter <mporter@linaro.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2014-10-03mmc: sdhci-pci: Fix Braswell eMMC timeout clock frequencyAdrian Hunter
Braswell eMMC host controller specifies an incorrect timeout clock frequncy in the capabilities registers. The correct value is 1 MHz. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2014-10-03mmc: sdhci: Let a driver override timeout clock frequencyAdrian Hunter
Let a driver override the timeout clock frequency by populating it before calling sdhci_add_host(). Note the value will otherwise be zero because sdhci_host is zeroed when allocated. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2014-10-03mmc: sdhci-pci: Add Bay Trail and Braswell SD card detectAdrian Hunter
Add support for card detect for Bay Trail and Braswell SD Card host controllers in PCI mode. This uses the gpio descriptor API which can find gpio descriptors, for example, on an ACPI comapnion device. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2014-10-03netfilter: nft_masq: register/unregister notifiers on module init/exitArturo Borrero
We have to register the notifiers in the masquerade expression from the the module _init and _exit path. This fixes crashes when removing the masquerade rule with no ipt_MASQUERADE support in place (which was masking the problem). Fixes: 9ba1f72 ("netfilter: nf_tables: add new nft_masq expression") Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-10-03mmc: sdhci-pci: Set SDHCI_QUIRK2_STOP_WITH_TC for Intel BYT host controllersAdrian Hunter
Add quirk SDHCI_QUIRK2_STOP_WITH_TC for Intel BYT host controllers. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2014-10-03mmc: sdhci-acpi: Add a HID and UID for a SD Card host controllerAdrian Hunter
Add a HID (INT33BB) and UID (3) for a SD Card host controller. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2014-10-03mmc: sdhci-acpi: Set SDHCI_QUIRK2_STOP_WITH_TC for Intel host controllersAdrian Hunter
Add quirk SDHCI_QUIRK2_STOP_WITH_TC for Intel host controllers. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2014-10-03mmc: sdhci: Add quirk for always getting TC with stop cmdAdrian Hunter
Add a quirk for a host controller that always sets a Transfer Complete interrupt status for the stop command even when a busy response is not indicated. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2014-10-03xen: eliminate scalability issues from initrd handlingJuergen Gross
Size restrictions native kernels wouldn't have resulted from the initrd getting mapped into the initial mapping. The kernel doesn't really need the initrd to be mapped, so use infrastructure available in Xen to avoid the mapping and hence the restriction. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2014-10-03xen: sync some headers with xen treeJuergen Gross
To be able to use an initially unmapped initrd with xen the following header files must be synced to a newer version from the xen tree: include/xen/interface/elfnote.h include/xen/interface/xen.h As the KEXEC and DUMPCORE related ELFNOTES are not relevant for the kernel they are omitted from elfnote.h. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2014-10-03xen: make pvscsi frontend dependant on xenbus frontendJuergen Gross
The pvscsi frontend driver requires the xenbus frontend driver. Reflect this in Kconfig. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2014-10-03arm{,64}/xen: Remove "EXPERIMENTAL" in the description of the Xen optionsJulien Grall
The Xen ARM API is stable since Xen 4.4 and everything has been upstreamed in Linux for ARM and ARM64. Therefore we can drop "EXPERIMENTAL" from the Xen option in the both Kconfig. Signed-off-by: Julien Grall <julien.grall@linaro.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org
2014-10-03xen-scsifront: don't deadlock if the ring becomes fullDavid Vrabel
scsifront_action_handler() will deadlock on host->host_lock, if the ring is full and it has to wait for entries to become available. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com>
2014-10-03ASoC: tegra: add mic detect gpio to tegra_max98090Dylan Reid
Add an optional mic detect gpio property. If specified in device tree there will be a mic jack created for the given gpio. This will be used by the Tegra-based Chromebooks. Signed-off-by: Dylan Reid <dgreid@chromium.org> Reviewed-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2014-10-03ASoC: sgtl5000: Do a sanity check on SYS_MCLKFabio Estevam
According to the sgtl5000 datasheet the valid range for SYS_MCLK is from 8 to 27 MHz. Add a sanity check prior to enabling SYS_MCLK. Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2014-10-03ASoC: sgtl5000: Improve the error message on slave mode settingFabio Estevam
For sgtl5000 to operate in slave mode it can only work in "Synchronous SYS_MCLK input" mode. In this mode only the following rates can be supported: 256*Fs, 384*Fs, 512*Fs. Improve the error message to give a better indication as to why the clocking failed for slave mode: [ 12.515399] sgtl5000 1-000a: PLL not supported in slave mode [ 12.524124] sgtl5000 1-000a: 233 ratio is not supported. SYS_MCLK needs to be 256, 384 or 512 * fs [ 12.535938] sgtl5000 1-000a: ASoC: can't set sgtl5000 hw params: -22 Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2014-10-03ASoC: rt286: Add depends on I2CBard Liao
rt286 use I2C as its I/O. So the driver can only available when I2C is selected. Signed-off-by: Bard Liao <bardliao@realtek.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2014-10-03spi: spi-mxs: fix a tiny typo in a commentMichael Heimpold
Signed-off-by: Michael Heimpold <mhei@heimpold.de> Signed-off-by: Mark Brown <broonie@kernel.org>
2014-10-03[SCSI] uas: disable use of blk-mq I/O pathChristoph Hellwig
The uas driver uses the block layer tag for USB3 stream IDs. With blk-mq we can get larger tag numbers that the queue depth, which breaks this assumption. A fix is under way for 3.18, but sits on top of large changes so can't easily be backported. Set the disable_blk_mq path so that a uas device can't easily crash the system when using blk-mq for SCSI. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2014-10-03m68k: Reformat arch/m68k/mm/hwtest.cGeert Uytterhoeven
No functional changes Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2014-10-03m68k: Disable/restore interrupts in hwreg_present()/hwreg_write()Geert Uytterhoeven
hwreg_present() and hwreg_write() temporarily change the VBR register to another vector table. This table contains a valid bus error handler only, all other entries point to arbitrary addresses. If an interrupt comes in while the temporary table is active, the processor will start executing at such an arbitrary address, and the kernel will crash. While most callers run early, before interrupts are enabled, or explicitly disable interrupts, Finn Thain pointed out that macsonic has one callsite that doesn't, causing intermittent boot crashes. There's another unsafe callsite in hilkbd. Fix this for good by disabling and restoring interrupts inside hwreg_present() and hwreg_write(). Explicitly disabling interrupts can be removed from the callsites later. Reported-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: stable@vger.kernel.org
2014-10-03powerpc: Enable CONFIG_CRASH_DUMP=y for ppc64_defconfigMichael Ellerman
It pulls in more code, including causing us to build a relocatable kernel, which is good for testing. The resulting kernel is still usable as a non-crash dump kernel. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-03powerpc/kdump: crash_dump.c needs to include io.hMichael Ellerman
For __ioremap(). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-03powerpc: Don't build powernv for other platform defconfigsMichael Ellerman
Because powernv arrived after these other platforms, the defconfigs didn't have PPC_POWERNV disabled, and being default y it gets turned on. If we're going to bother having defconfigs for the specific platforms then they should only build the code required for those platforms. The grab bag of everything config is ppc64_defconfig. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-03powerpc/pci: remove duplicate declaration of pci_bus_find_capabilityWei Yang
pci_bus_find_capability() is decleared in pci.h, so it is not necessary to do it again. This patch removes it. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-03Merge tag 'irqchip-core-3.18-2' of ↵Thomas Gleixner
git://git.infradead.org/users/jcooper/linux into irq/core irqchip core changes for v3.18 (round 2) from Jason Cooper * atmel: - Add sama5d4 support - Correct # irqs for sama5d3 * broadcom: - Add bcm7120 l2 interrupt controller and DT binding * gic-v3: - Add CPU PM notifier - Add enable/disable support to gic_enable_redist
2014-10-02Revert "serial/core: Initialize the console pm state"Greg Kroah-Hartman
This reverts commit a86713b1536c818972675e6dd8c6e738f0379f1d. Kevin Hilman writes: Multiple boot failures on ARM[1] were bisected down to this patch. How was this patch tested, and on which platforms? Also, the changelog states that this should be done only for UART_CAP_SLEEP, but the patch does it for every UART. Greg, I suggest this patch be dropped from tty-next until it has been better described and tested. [1] http://lists.linaro.org/pipermail/kernel-build-reports/2014-October/005550.html Reported-by: Kevin Hilman <khilman@kernel.org> Cc: Sudhir Sreedharan <ssreedharan@mvista.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-03powerpc/iommu/ddw: Fix endiannessAlexey Kardashevskiy
rtas_call() accepts and returns values in CPU endianness. The ddw_query_response and ddw_create_response structs members are defined and treated as BE but as they are passed to rtas_call() as (u32 *) and they get byteswapped automatically, the data is CPU-endian. This fixes ddw_query_response and ddw_create_response definitions and use. of_read_number() is designed to work with device tree cells - it assumes the input is big-endian and returns data in CPU-endian. However due to the ddw_create_response struct fix, create.addr_hi/lo are already CPU-endian so do not byteswap them. ddw_avail is a pointer to the "ibm,ddw-applicable" property which contains 3 cells which are big-endian as it is a device tree. rtas_call() accepts a RTAS token in CPU-endian. This makes use of of_property_read_u32_array to byte swap and avoid the need for a number of be32_to_cpu calls. Cc: stable@vger.kernel.org # v3.13+ Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> [aik: folded Anton's patch with of_property_read_u32_array] Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-03locking/lockdep: Revert qrwlock recusive stuffPeter Zijlstra
Commit f0bab73cb539 ("locking/lockdep: Restrict the use of recursive read_lock() with qrwlock") changed lockdep to try and conform to the qrwlock semantics which differ from the traditional rwlock semantics. In particular qrwlock is fair outside of interrupt context, but in interrupt context readers will ignore all fairness. The problem modeling this is that read and write side have different lock state (interrupts) semantics but we only have a single representation of these. Therefore lockdep will get confused, thinking the lock can cause interrupt lock inversions. So revert it for now; the old rwlock semantics were already imperfectly modeled and the qrwlock extra won't fit either. If we want to properly fix this, I think we need to resurrect the work by Gautham did a few years ago that split the read and write state of locks: http://lwn.net/Articles/332801/ FWIW the locking selftest that would've failed (and was reported by Borislav earlier) is something like: RL(X1); /* IRQ-ON */ LOCK(A); UNLOCK(A); RU(X1); IRQ_ENTER(); RL(X1); /* IN-IRQ */ RU(X1); IRQ_EXIT(); At which point it would report that because A is an IRQ-unsafe lock we can suffer the following inversion: CPU0 CPU1 lock(A) lock(X1) lock(A) <IRQ> lock(X1) And this is 'wrong' because X1 can recurse (assuming the above lock are in fact read-lock) but lockdep doesn't know about this. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Waiman Long <Waiman.Long@hp.com> Cc: ego@linux.vnet.ibm.com Cc: bp@alien8.de Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20140930132600.GA7444@worktop.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-03locking/rwsem: Avoid double checking before try acquiring write lockJason Low
Commit 9b0fc9c09f1b ("rwsem: skip initial trylock in rwsem_down_write_failed") checks for if there are known active lockers in order to avoid write trylocking using expensive cmpxchg() when it likely wouldn't get the lock. However, a subsequent patch was added such that we directly check for sem->count == RWSEM_WAITING_BIAS right before trying that cmpxchg(). Thus, commit 9b0fc9c09f1b now just adds overhead. This patch modifies it so that we only do a check for if count == RWSEM_WAITING_BIAS. Also, add a comment on why we do an "extra check" of count before the cmpxchg(). Signed-off-by: Jason Low <jason.low2@hp.com> Acked-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Aswin Chandramouleeswaran <aswin@hp.com> Cc: Chegu Vinod <chegu_vinod@hp.com> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1410913017.2447.22.camel@j-VirtualBox Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-03locking,arch: Use ACCESS_ONCE() instead of cast to volatile in atomic_read()Pranith Kumar
Use the much more reader friendly ACCESS_ONCE() instead of the cast to volatile. This is purely a stylistic change. Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no> Acked-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-arch@vger.kernel.org Link: http://lkml.kernel.org/r/1411482607-20948-1-git-send-email-bobby.prani@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-03perf/x86: Tone down kernel messages when the PMU check fails in a virtual ↵Wei Huang
environment PMU checking can fail due to various reasons. On native machine, this is mostly caused by faulty hardware and it is reasonable to use KERN_ERR in reporting. However, when kernel is running on virtualized environment, this checking can fail if virtual PMU is not supported (e.g. KVM on AMD host). It is annoying to see an error message on splash screen, even though we know such failure is benign on virtualized environment. This patch checks if the kernel is running in a virtualized environment. If so, it will use KERN_INFO in reporting, which reduces the syslog priority of them. This patch was tested successfully on KVM. Signed-off-by: Wei Huang <wei@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Link: http://lkml.kernel.org/r/1411617314-24659-1-git-send-email-wei@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-03perf/x86/intel/uncore: Fix minor race in box set upAndi Kleen
I was looking for the trinity oops cause in the uncore driver. (so far didn't found it) However I found this tiny race: when a box is set up two threads on the same CPU, they may be setting up the box in parallel (e.g. with kernel preemption). This could lead to the reference count being increasing too much. Always recheck there is no existing cpu reference inside the lock. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: eranian@google.com Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Link: http://lkml.kernel.org/r/1411424826-15629-1-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-03sched/dl: Use dl_bw_of() under rcu_read_lock_sched()Kirill Tkhai
rq->rd is freed using call_rcu_sched(), so rcu_read_lock() to access it is not enough. We should use either rcu_read_lock_sched() or preempt_disable(). Reported-by: Sasha Levin <sasha.levin@oracle.com> Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Kirill Tkhai <ktkhai@parallels.com Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Fixes: 66339c31bc39 "sched: Use dl_bw_of() under RCU read lock" Link: http://lkml.kernel.org/r/1412065417.20287.24.camel@tkhai Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-03sched/fair: Delete resched_cpu() from idle_balance()Kirill Tkhai
We already reschedule env.dst_cpu in attach_tasks()->check_preempt_curr() if this is necessary. Furthermore, a higher priority class task may be current on dest rq, we shouldn't disturb it. Signed-off-by: Kirill Tkhai <ktkhai@parallels.com> Cc: Juri Lelli <juri.lelli@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/20140930210441.5258.55054.stgit@localhost Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-03sched, time: Fix build error with 64 bit cputime_t on 32 bit systemsRik van Riel
On 32 bit systems cmpxchg cannot handle 64 bit values, so some additional magic is required to allow a 32 bit system with CONFIG_VIRT_CPU_ACCOUNTING_GEN=y enabled to build. Make sure the correct cmpxchg function is used when doing an atomic swap of a cputime_t. Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rik van Riel <riel@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: umgwanakikbuti@gmail.com Cc: fweisbec@gmail.com Cc: srao@redhat.com Cc: lwoodman@redhat.com Cc: atheurer@redhat.com Cc: oleg@redhat.com Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: linux390@de.ibm.com Cc: linux-arch@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-s390@vger.kernel.org Link: http://lkml.kernel.org/r/20140930155947.070cdb1f@annuminas.surriel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-03sched: Improve sysbench performance by fixing spurious active migrationVincent Guittot
Since commit caeb178c60f4 ("sched/fair: Make update_sd_pick_busiest() ...") sd_pick_busiest returns a group that can be neither imbalanced nor overloaded but is only more loaded than others. This change has been introduced to ensure a better load balance in system that are not overloaded but as a side effect, it can also generate useless active migration between groups. Let take the example of 3 tasks on a quad cores system. We will always have an idle core so the load balance will find a busiest group (core) whenever an ILB is triggered and it will force an active migration (once above nr_balance_failed threshold) so the idle core becomes busy but another core will become idle. With the next ILB, the freshly idle core will try to pull the task of a busy CPU. The number of spurious active migration is not so huge in quad core system because the ILB is not triggered so much. But it becomes significant as soon as you have more than one sched_domain level like on a dual cluster of quad cores where the ILB is triggered every tick when you have more than 1 busy_cpu We need to ensure that the migration generate a real improveùent and will not only move the avg_load imbalance on another CPU. Before caeb178c60f4f93f1b45c0bc056b5cf6d217b67f, the filtering of such use case was ensured by the following test in f_b_g: if ((local->idle_cpus < busiest->idle_cpus) && busiest->sum_nr_running <= busiest->group_weight) This patch modified the condition to take into account situation where busiest group is not overloaded: If the diff between the number of idle cpus in 2 groups is less than or equal to 1 and the busiest group is not overloaded, moving a task will not improve the load balance but just move it. A test with sysbench on a dual clusters of quad cores gives the following results: command: sysbench --test=cpu --num-threads=5 --max-time=5 run The HZ is 200 which means that 1000 ticks has fired during the test. With Mainline, perf gives the following figures: Samples: 727 of event 'sched:sched_migrate_task' Event count (approx.): 727 Overhead Command Shared Object Symbol ........ ............... ............. .............. 12.52% migration/1 [unknown] [.] 00000000 12.52% migration/5 [unknown] [.] 00000000 12.52% migration/7 [unknown] [.] 00000000 12.10% migration/6 [unknown] [.] 00000000 11.83% migration/0 [unknown] [.] 00000000 11.83% migration/3 [unknown] [.] 00000000 11.14% migration/4 [unknown] [.] 00000000 10.87% migration/2 [unknown] [.] 00000000 2.75% sysbench [unknown] [.] 00000000 0.83% swapper [unknown] [.] 00000000 0.55% ktps65090charge [unknown] [.] 00000000 0.41% mmcqd/1 [unknown] [.] 00000000 0.14% perf [unknown] [.] 00000000 With this patch, perf gives the following figures Samples: 20 of event 'sched:sched_migrate_task' Event count (approx.): 20 Overhead Command Shared Object Symbol ........ ............... ............. .............. 80.00% sysbench [unknown] [.] 00000000 10.00% swapper [unknown] [.] 00000000 5.00% ktps65090charge [unknown] [.] 00000000 5.00% migration/1 [unknown] [.] 00000000 Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by: Rik van Riel <riel@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1412170735-5356-1-git-send-email-vincent.guittot@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-03sched/x86: Fix up typo in topology detectionDave Hansen
Commit: cebf15eb09a2 ("x86, sched: Add new topology for multi-NUMA-node CPUs") some code to try to detect the situation where we have a NUMA node inside of the "DIE" sched domain. It detected this by looking for cpus which match_die() but do not match NUMA nodes via topology_same_node(). I wrote it up as: if (match_die(c, o) == !topology_same_node(c, o)) which actually seemed to work some of the time, albiet accidentally. It should have been doing an &&, not an ==. This code essentially chopped off the "DIE" domain on one of Andrew Morton's systems. He reported that this patch fixed his issue. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reported-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Dave Hansen <dave@sr71.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: David Rientjes <rientjes@google.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Jan Kiszka <jan.kiszka@siemens.com> Cc: Lan Tianyu <tianyu.lan@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Toshi Kani <toshi.kani@hp.com> Link: http://lkml.kernel.org/r/20140930214546.FD481CFF@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-03perf: Fix perf bug in fork()Peter Zijlstra
Oleg noticed that a cleanup by Sylvain actually uncovered a bug; by calling perf_event_free_task() when failing sched_fork() we will not yet have done the memset() on ->perf_event_ctxp[] and will therefore try and 'free' the inherited contexts, which are still in use by the parent process. This is bad and might explain some outstanding fuzzer failures ... Suggested-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Sylvain 'ythier' Hitier <sylvain.hitier@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Aaron Tomlin <atomlin@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Daeseok Youn <daeseok.youn@gmail.com> Cc: David Rientjes <rientjes@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Rik van Riel <riel@redhat.com> Cc: Vladimir Davydov <vdavydov@parallels.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20140929101201.GE5430@worktop Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-03perf: Fix unclone_ctx() vs. lockingPeter Zijlstra
The idiot who did 4a1c0f262f88 ("perf: Fix lockdep warning on process exit") forgot to pay attention and fix all similar cases. Do so now. In particular, unclone_ctx() must be called while holding ctx->lock, therefore all such sites are broken for the same reason. Pull the put_ctx() call out from under ctx->lock. Reported-by: Sasha Levin <sasha.levin@oracle.com> Probably-also-reported-by: Vince Weaver <vincent.weaver@maine.edu> Fixes: 4a1c0f262f88 ("perf: Fix lockdep warning on process exit") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Cong Wang <cwang@twopensource.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140930172308.GI4241@worktop.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>