summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2015-10-20IB/cma: Use inner P_Key to determine netdevHaggai Eran
When discussing the patches to demux ids in rdma_cm instead of ib_cm, it was decided that it is best to use the P_Key value in the packet headers. However, the mlx5 and ipath drivers are currently unable to send correct P_Key values in GMP headers. They always send using a single P_Key that is set during the GSI QP initialization. Change the rdma_cm code to look at the P_Key value that is part of the packet payload as a workaround. Once the drivers are fixed this patch can be reverted. Fixes: 4c21b5bcef73 ("IB/cma: Add net_dev and private data checks to RDMA CM") Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-20IB/ucma: check workqueue allocation before usageSasha Levin
Allocating a workqueue might fail, which wasn't checked so far and would lead to NULL ptr derefs when an attempt to use it was made. Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-20IB/cma: Potential NULL dereference in cma_id_from_eventHaggai Eran
If the lookup of a listening ID failed for an AF_IB request, the code would try to call dev_put() on a NULL net_dev. Fixes: be688195bd08 ("IB/cma: Fix net_dev reference leak with failed requests") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-20IB/core: Fix use after free of ifaMatan Barak
When using ifup/ifdown while executing enum_netdev_ipv4_ips, ifa could become invalid and cause use after free error. Fixing it by protecting with RCU lock. Fixes: 03db3a2d81e6 ('IB/core: Add RoCE GID table management') Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-20clkdev: fix clk_add_alias() with a NULL alias device nameRussell King
clk_add_alias() was not correctly handling the case where alias_dev_name was NULL: rather than producing an entry with a NULL dev_id pointer, it would produce a device name of (null). Fix this. Cc: <stable@vger.kernel.org> Fixes: 2568999835d7 ("clkdev: add clkdev_create() helper") Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi> Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-10-20arm/arm64: KVM: Fix disabled distributor operationChristoffer Dall
We currently do a single update of the vgic state when the distributor enable/disable control register is accessed and then bypass updating the state for as long as the distributor remains disabled. This is incorrect, because updating the state does not consider the distributor enable bit, and this you can end up in a situation where an interrupt is marked as pending on the CPU interface, but not pending on the distributor, which is an impossible state to be in, and triggers a warning. Consider for example the following sequence of events: 1. An interrupt is marked as pending on the distributor - the interrupt is also forwarded to the CPU interface 2. The guest turns off the distributor (it's about to do a reboot) - we stop updating the CPU interface state from now on 3. The guest disables the pending interrupt - we remove the pending state from the distributor, but don't touch the CPU interface, see point 2. Since the distributor disable bit really means that no interrupts should be forwarded to the CPU interface, we modify the code to keep updating the internal VGIC state, but always set the CPU interface pending bits to zero when the distributor is disabled. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-10-20arm/arm64: KVM: Clear map->active on pend/active clearChristoffer Dall
When a guest reboots or offlines/onlines CPUs, it is not uncommon for it to clear the pending and active states of an interrupt through the emulated VGIC distributor. However, since the architected timers are defined by the architecture to be level triggered and the guest rightfully expects them to be that, but we emulate them as edge-triggered, we have to mimic level-triggered behavior for an edge-triggered virtual implementation. We currently do not signal the VGIC when the map->active field is true, because it indicates that the guest has already been signalled of the interrupt as required. Normally this field is set to false when the guest deactivates the virtual interrupt through the sync path. We also need to catch the case where the guest deactivates the interrupt through the emulated distributor, again allowing guests to boot even if the original virtual timer signal hit before the guest's GIC initialization sequence is run. Reviewed-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-10-20arm/arm64: KVM: Fix arch timer behavior for disabled interruptsChristoffer Dall
We have an interesting issue when the guest disables the timer interrupt on the VGIC, which happens when turning VCPUs off using PSCI, for example. The problem is that because the guest disables the virtual interrupt at the VGIC level, we never inject interrupts to the guest and therefore never mark the interrupt as active on the physical distributor. The host also never takes the timer interrupt (we only use the timer device to trigger a guest exit and everything else is done in software), so the interrupt does not become active through normal means. The result is that we keep entering the guest with a programmed timer that will always fire as soon as we context switch the hardware timer state and run the guest, preventing forward progress for the VCPU. Since the active state on the physical distributor is really part of the timer logic, it is the job of our virtual arch timer driver to manage this state. The timer->map->active boolean field indicates whether we have signalled this interrupt to the vgic and if that interrupt is still pending or active. As long as that is the case, the hardware doesn't have to generate physical interrupts and therefore we mark the interrupt as active on the physical distributor. We also have to restore the pending state of an interrupt that was queued to an LR but was retired from the LR for some reason, while remaining pending in the LR. Cc: Marc Zyngier <marc.zyngier@arm.com> Reported-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-10-20KVM: arm: use GIC support unconditionallyArnd Bergmann
The vgic code on ARM is built for all configurations that enable KVM, but the parent_data field that it references is only present when CONFIG_IRQ_DOMAIN_HIERARCHY is set: virt/kvm/arm/vgic.c: In function 'kvm_vgic_map_phys_irq': virt/kvm/arm/vgic.c:1781:13: error: 'struct irq_data' has no member named 'parent_data' This flag is implied by the GIC driver, and indeed the VGIC code only makes sense if a GIC is present. This changes the CONFIG_KVM symbol to always select GIC, which avoids the issue. Fixes: 662d9715840 ("arm/arm64: KVM: Kill CONFIG_KVM_ARM_{VGIC,TIMER}") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-10-20KVM: arm/arm64: Fix memory leak if timer initialization failsPavel Fedin
Jump to correct label and free kvm_host_cpu_state Reviewed-by: Wei Huang <wei@redhat.com> Signed-off-by: Pavel Fedin <p.fedin@samsung.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-10-20KVM: arm/arm64: Do not inject spurious interruptsPavel Fedin
When lowering a level-triggered line from userspace, we forgot to lower the pending bit on the emulated CPU interface and we also did not re-compute the pending_on_cpu bitmap for the CPU affected by the change. Update vgic_update_irq_pending() to fix the two issues above and also raise a warning in vgic_quue_irq_to_lr if we encounter an interrupt pending on a CPU which is neither marked active nor pending. [ Commit text reworked completely - Christoffer ] Signed-off-by: Pavel Fedin <p.fedin@samsung.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-10-20tracing: Have stack tracer force RCU to be watchingSteven Rostedt (Red Hat)
The stack tracer was triggering the WARN_ON() in module.c: static void module_assert_mutex_or_preempt(void) { #ifdef CONFIG_LOCKDEP if (unlikely(!debug_locks)) return; WARN_ON(!rcu_read_lock_sched_held() && !lockdep_is_held(&module_mutex)); #endif } The reason is that the stack tracer traces all function calls, and some of those calls happen while exiting or entering user space and idle. Some of these functions are called after RCU had already stopped watching, as RCU does not watch userspace or idle CPUs. If a max stack is hit, then the save_stack_trace() is called, which will check module addresses and call module_assert_mutex_or_preempt(), and then trigger the warning. Sad part is, the warning itself will also do a stack trace and tigger the same warning. That probably should be fixed. The warning was added by 0be964be0d45 "module: Sanitize RCU usage and locking" but this bug has probably been around longer. But it's unlikely to cause much harm, but the new warning causes the system to lock up. Cc: stable@vger.kernel.org # 4.2+ Cc: Peter Zijlstra <peterz@infradead.org> Cc:"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-10-20ASoC: wm8904: Correct number of EQ registersCharles Keepax
There are 24 EQ registers not 25, I suspect this bug came about because the registers start at EQ1 not zero. The bug is relatively harmless as the extra register written is an unused one. Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org
2015-10-20ALSA: hda - Fix deadlock at error in building PCMTakashi Iwai
The HDA codec driver issues snd_hda_codec_reset() at the error path of PCM build. This was needed in the earlier code base, but the recent rewrite to use the standard bus binding made this a deadlock: modprobe D 0000000000000005 0 720 716 0x00000080 Call Trace: [<ffffffff816a5dbe>] schedule+0x3e/0x90 [<ffffffff816a61a5>] schedule_preempt_disabled+0x15/0x20 [<ffffffff816a7ae5>] __mutex_lock_slowpath+0xb5/0x120 [<ffffffff816a7b6b>] mutex_lock+0x1b/0x30 [<ffffffff8148656b>] device_release_driver+0x1b/0x30 [<ffffffff81485c15>] bus_remove_device+0x105/0x180 [<ffffffff814822b9>] device_del+0x139/0x260 [<ffffffffa05e0ec5>] snd_hdac_device_unregister+0x25/0x30 [snd_hda_core] [<ffffffffa074fa6a>] snd_hda_codec_reset+0x2a/0x70 [snd_hda_codec] [<ffffffffa075007b>] snd_hda_codec_build_pcms+0x18b/0x1b0 [snd_hda_codec] [<ffffffffa074a44e>] hda_codec_driver_probe+0xbe/0x140 [snd_hda_codec] [<ffffffff81486ac4>] driver_probe_device+0x1f4/0x460 [<ffffffff81486dc0>] __driver_attach+0x90/0xa0 [<ffffffff81484844>] bus_for_each_dev+0x64/0xa0 [<ffffffff814862de>] driver_attach+0x1e/0x20 [<ffffffff81485e7b>] bus_add_driver+0x1eb/0x280 [<ffffffff81487680>] driver_register+0x60/0xe0 [<ffffffffa074a0da>] __hda_codec_driver_register+0x5a/0x60 [snd_hda_codec] [<ffffffffa070a01e>] realtek_driver_init+0x1e/0x1000 [snd_hda_codec_realtek] [<ffffffff810002f3>] do_one_initcall+0xb3/0x200 [<ffffffff816a1fc5>] do_init_module+0x60/0x1f8 [<ffffffff810ee5c3>] load_module+0x1653/0x1bd0 [<ffffffff810eed48>] SYSC_finit_module+0x98/0xc0 [<ffffffff810eed8e>] SyS_finit_module+0xe/0x10 [<ffffffff816aa032>] entry_SYSCALL_64_fastpath+0x16/0x75 The simple fix is just to remove this call, since we don't need to think about unbinding at there any longer. Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=948758 Cc: <stable@vger.kernel.org> # v4.1+ Signed-off-by: Takashi Iwai <tiwai@suse.de>
2015-10-20pinctrl: sh-pfc: Remove obsolete r8a7778 platform_device_id entryGeert Uytterhoeven
Since the removal of the r8a7778 legacy SoC code in commit 4baadb9e05c68962 ("ARM: shmobile: r8a7778: remove obsolete setup code"), r8a7778 is only supported in generic DT-only ARM multi-platform builds. The driver doesn't need to match platform devices by name anymore, hence remove the corresponding platform_device_id entry. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Linus Walleij <linus.walleij@linaro.org>
2015-10-20pinctrl: sh-pfc: Remove obsolete r8a7779 platform_device_id entryGeert Uytterhoeven
Since the removal of the r8a7779 legacy SoC code in commit c99cd90d98a98aa1 ("ARM: shmobile: r8a7779: Remove legacy SoC code"), r8a7779 is only supported in generic DT-only ARM multi-platform builds. The driver doesn't need to match platform devices by name anymore, hence remove the corresponding platform_device_id entry. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Linus Walleij <linus.walleij@linaro.org>
2015-10-20pinctrl: sh-pfc: Stop including <linux/platform_data/gpio-rcar.h>Geert Uytterhoeven
This header file will be removed soon. Copy the helper macro RCAR_GP_PIN(), which is used by the pinctrl drivers only, to sh_pfc.h, and drop the #include. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Linus Walleij <linus.walleij@linaro.org>
2015-10-20usb: renesas_usbhs: Remove unneeded #include <linux/platform_data/gpio-rcar.h>Geert Uytterhoeven
This header file will be removed soon. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Felipe Balbi <balbi@ti.com>
2015-10-20pinctrl: sh-pfc: Rename .gpio_data[] to .pinmux_data[]Geert Uytterhoeven
The sh_pfc_soc_info.gpio_data[] array contains not only GPIO data, but also various other pinmux-related data (functions and marks). Every single driver already calls its local array pinmux_data[]. Hence rename the sh_pfc_soc_info member to "pinmux_data". Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
2015-10-20pinctrl: sh-pfc: r8a7778: Add bias (pull-up) pinconf supportUlrich Hecht
On this SoC there is no simple mapping of GP pins to pull-up register bits, so we need a table. Signed-off-by: Ulrich Hecht <ulrich.hecht+renesas@gmail.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
2015-10-20pinctrl: sh-pfc: Add macros defining GP ports with config flagsUlrich Hecht
PORT_GP_CFG_1 and PORT_GP_CFG_32 work like their non-CFG counterparts but accept an extra argument with config flags. Signed-off-by: Ulrich Hecht <ulrich.hecht+renesas@gmail.com> Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
2015-10-20pinctrl: sh-pfc: r8a7794: Add VIN pin groupsKoji Matsuoka
Add VIN0/1 pin groups to R8A7794 PFC driver. Sergei: rebased, renamed, added changelog, gathered 12 VIN1 data pins into a single pin group, added "vin1_data10" pin group, used 'union vin_data' and VIN_DATA_PIN_GROUP() macro to describe VIN1 pins, reversed the order of the VIN1 pin groups, removed unneeded empty lines, fixed VIN1 separator comment. Signed-off-by: Koji Matsuoka <koji.matsuoka.xm@renesas.com> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
2015-10-20pinctrl: sh-pfc: r8a779[01]: Move 'union vin_data' to shared header fileSergei Shtylyov
R8A7790/1 PFC drivers use almost identical 'union vin_data' and completely identical VIN_DATA_PIN_GROUP() macro; we thus can move them into the shared header file... Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
2015-10-20crypto: api - Only abort operations on fatal signalHerbert Xu
Currently a number of Crypto API operations may fail when a signal occurs. This causes nasty problems as the caller of those operations are often not in a good position to restart the operation. In fact there is currently no need for those operations to be interrupted by user signals at all. All we need is for them to be killable. This patch replaces the relevant calls of signal_pending with fatal_signal_pending, and wait_for_completion_interruptible with wait_for_completion_killable, respectively. Cc: stable@vger.kernel.org Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-10-20perf build: Add fixdep to .gitignoreYunlong Song
Commit 7c422f5572667fef0db38d2046ecce69dcf0afc8 ("tools build: Build fixdep helper from perf and basic libs") dynamically creates fixdep during the perf building. Add it to .gitignore. Signed-off-by: Yunlong Song <yunlong.song@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 7c422f557266 ("tools build: Build fixdep helper from perf and basic libs") Link: http://lkml.kernel.org/r/1444899116-8220-1-git-send-email-yunlong.song@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-20Merge branch 'clockevents/4.4' of ↵Thomas Gleixner
http://git.linaro.org/people/daniel.lezcano/linux into timers/core clockevent updates from Daniel Lezcano: - Remove unneeded memset in em_sti, sh_cmt and h8300 because there are already zeroed by a kzalloc (Alexey Klimov) - Optimize code by replacing this_cpu_ptr by container_of on the exynos_mct (Alexey Klimov) - Get immune from a spurious interrupt when enabling the mtk_timer (Daniel Lezcano) - Use the dynamic irq affinity to optimize wakeup and useless IPI timer on the imx timer (Lucas Stach) - Add new timer for Tango SoCs (Marc Gonzalez) - Implement the timer delay for armada-370-xp (Russell King) - Use GPT as clock source (Yingjoe Chen)
2015-10-20Merge branch 'fortglx/4.4/time' of ↵Thomas Gleixner
https://git.linaro.org/people/john.stultz/linux into timers/core Time updates from John Stultz: - More 2038 work from Arnd Bergmann around ntp and pps
2015-10-20x86/mm, kasan: Silence KASAN warnings in get_wchan()Andrey Ryabinin
get_wchan() is racy by design, it may access volatile stack of running task, thus it may access redzone in a stack frame and cause KASAN to warn about this. Use READ_ONCE_NOCHECK() to silence these warnings. Reported-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wolfram Gloger <wmglo@dent.med.uni-muenchen.de> Cc: kasan-dev <kasan-dev@googlegroups.com> Link: http://lkml.kernel.org/r/1445243838-17763-3-git-send-email-aryabinin@virtuozzo.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-20compiler, atomics, kasan: Provide READ_ONCE_NOCHECK()Andrey Ryabinin
Some code may perform racy by design memory reads. This could be harmless, yet such code may produce KASAN warnings. To hide such accesses from KASAN this patch introduces READ_ONCE_NOCHECK() macro. KASAN will not check the memory accessed by READ_ONCE_NOCHECK(). The KernelThreadSanitizer (KTSAN) is going to ignore it as well. This patch creates __read_once_size_nocheck() a clone of __read_once_size(). The only difference between them is 'no_sanitized_address' attribute appended to '*_nocheck' function. This attribute tells the compiler that instrumentation of memory accesses should not be applied to that function. We declare it as static '__maybe_unsed' because GCC is not capable to inline such function: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67368 With KASAN=n READ_ONCE_NOCHECK() is just a clone of READ_ONCE(). Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wolfram Gloger <wmglo@dent.med.uni-muenchen.de> Cc: kasan-dev <kasan-dev@googlegroups.com> Link: http://lkml.kernel.org/r/1445243838-17763-2-git-send-email-aryabinin@virtuozzo.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-20perf record: Add ability to sample call branchesStephane Eranian
This patch add a new branch type sampling filter to perf record. It is named 'call' and maps to PERF_SAMPLE_BRANCH_CALL. It samples direct call branches only, unlike 'any_call' which includes indirect calls as well. $ perf record -j call -e cycles ..... The man page is updated accordingly. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: khandual@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/1444720151-10275-5-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-20perf/powerpc: Add support for PERF_SAMPLE_BRANCH_CALLStephane Eranian
The patch catches PERF_SAMPLE_BRANCH_CALL because it is not clear whether this is actually supported by the hardware. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: khandual@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/1444720151-10275-4-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-20perf/x86: Add support for PERF_SAMPLE_BRANCH_CALLStephane Eranian
This patch enables the suport for the PERF_SAMPLE_BRANCH_CALL for Intel x86 processors. When the processor support LBR filtering this the selection is done in hardware. Otherwise, the filter is applied by software. Note that we chose to include zero length calls because they also represent calls. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: khandual@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/1444720151-10275-3-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-20perf: Add PERF_SAMPLE_BRANCH_CALLStephane Eranian
Add a new branch sample type to cover only call branches (function calls). The current ANY_CALL included direct, indirect calls and far jumps. We want to be able to differentiate indirect from direct calls. Therefore we introduce PERF_SAMPLE_BRANCH_CALL. The implementation is up to each architecture. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: khandual@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/1444720151-10275-2-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-20perf/x86: Fix time_shift in perf_event_mmap_pageAdrian Hunter
Commit: b20112edeadf ("perf/x86: Improve accuracy of perf/sched clock") allowed the time_shift value in perf_event_mmap_page to be as much as 32. Unfortunately the documented algorithms for using time_shift have it shifting an integer, whereas to work correctly with the value 32, the type must be u64. In the case of perf tools, Intel PT decodes correctly but the timestamps that are output (for example by perf script) have lost 32-bits of granularity so they look like they are not changing at all. Fix by limiting the shift to 31 and adjusting the multiplier accordingly. Also update the documentation of perf_event_mmap_page so that new code based on it will be more future-proof. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: b20112edeadf ("perf/x86: Improve accuracy of perf/sched clock") Link: http://lkml.kernel.org/r/1445001845-13688-2-git-send-email-adrian.hunter@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-20sched: Don't scan all-offline ->cpus_allowed twice if !CONFIG_CPUSETSOleg Nesterov
If CONFIG_CPUSETS=n then "case cpuset" changes the state and runs the already failed for_each_cpu() loop again for no reason. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: heiko.carstens@de.ibm.com Link: http://lkml.kernel.org/r/20151010185315.GA24100@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-20sched: Move cpu_active() tests from stop_two_cpus() into migrate_swap_stop()Peter Zijlstra
The cpu_active() tests are not fundamentally part of stop_two_cpus(), move then into the scheduler where they belong. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-20sched: Start stopper earlyPeter Zijlstra
Ensure the stopper thread is active 'early', because the load balancer pretty much assumes that its available. And when 'online && active' the load-balancer is fully available. Not only the numa balancing stop_two_cpus() caller relies on it, but also the self migration stuff does, and at CPU_ONLINE time the cpu really is 'free' to run anything. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: heiko.carstens@de.ibm.com Link: http://lkml.kernel.org/r/20151009160054.GA10176@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-20stop_machine: Kill cpu_stop_threads->setup() and cpu_stop_unpark()Oleg Nesterov
Now that we always use stop_machine_unpark() to wake the stopper threas up, we can kill ->setup() and fold cpu_stop_unpark() into stop_machine_unpark(). And we do not need stopper->lock to set stopper->enabled = true. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: heiko.carstens@de.ibm.com Link: http://lkml.kernel.org/r/20151009160051.GA10169@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-20stop_machine: Kill smp_hotplug_thread->pre_unpark, introduce ↵Oleg Nesterov
stop_machine_unpark() 1. Change smpboot_unpark_thread() to check ->selfparking, just like smpboot_park_thread() does. 2. Introduce stop_machine_unpark() which sets ->enabled and calls kthread_unpark(). 3. Change smpboot_thread_call() and cpu_stop_init() to call stop_machine_unpark() by hand. This way: - IMO the ->selfparking logic becomes more consistent. - We can kill the smp_hotplug_thread->pre_unpark() method. - We can easily unpark the stopper thread earlier. Say, we can move stop_machine_unpark() from smpboot_thread_call() to sched_cpu_active() as Peter suggests. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: heiko.carstens@de.ibm.com Link: http://lkml.kernel.org/r/20151009160049.GA10166@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-20stop_machine: Change cpu_stop_queue_two_works() to rely on stopper->enabledOleg Nesterov
Change cpu_stop_queue_two_works() to ensure that both CPU's have stopper->enabled == T or fail otherwise. This way stop_two_cpus() no longer needs to check cpu_active() to avoid the deadlock. This patch doesn't remove these checks, we will do this later. Note: we need to take both stopper->lock's at the same time, but this will also help to remove lglock from stop_machine.c, so I hope this is fine. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: heiko.carstens@de.ibm.com Link: http://lkml.kernel.org/r/20151008170141.GA25537@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-20stop_machine: Introduce __cpu_stop_queue_work() and cpu_stop_queue_two_works()Oleg Nesterov
Preparation to simplify the review of the next change. Add two simple helpers, __cpu_stop_queue_work() and cpu_stop_queue_two_works() which simply take a bit of code from their callers. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: heiko.carstens@de.ibm.com Link: http://lkml.kernel.org/r/20151008145134.GA18146@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-20stop_machine: Ensure that a queued callback will be called before ↵Oleg Nesterov
cpu_stop_park() cpu_stop_queue_work() checks stopper->enabled before it queues the work, but ->enabled == T can only guarantee cpu_stop_signal_done() if we race with cpu_down(). This is not enough for stop_two_cpus() or stop_machine(), they will deadlock if multi_cpu_stop() won't be called by one of the target CPU's. stop_machine/stop_cpus are fine, they rely on stop_cpus_mutex. But stop_two_cpus() has to check cpu_active() to avoid the same race with hotplug, and this check is very unobvious and probably not even correct if we race with cpu_up(). Change cpu_down() pass to clear ->enabled before cpu_stopper_thread() flushes the pending ->works and returns with KTHREAD_SHOULD_PARK set. Note also that smpboot_thread_call() calls cpu_stop_unpark() which sets enabled == T at CPU_ONLINE stage, so this CPU can't go away until cpu_stopper_thread() is called at least once. This all means that if cpu_stop_queue_work() succeeds, we know that work->fn() will be called. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: heiko.carstens@de.ibm.com Link: http://lkml.kernel.org/r/20151008145131.GA18139@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-20Merge branch 'sched/urgent' into sched/core, to pick up fixes and resolve ↵Ingo Molnar
conflicts Conflicts: kernel/sched/fair.c Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-20gpio: add a real time compliance checklistLinus Walleij
Add some information about real time compliance to the driver document. Inspired by Grygorii Strashko's real time compliance patches. Cc: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2015-10-20ARM, locking/atomics: Implement _relaxed variants of atomic[64]_{inc,dec}Will Deacon
Now that the core code supports acquire/release/relaxed versions of the atomic_inc family, implement only the _relaxed flavours in the ARM backend so that we get all of the others for free. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Davidlohr Bueso <dbueso@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1444227038-12533-1-git-send-email-will.deacon@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-20Merge tag 'v4.3-rc6' into locking/core, to pick up fixes before applying new ↵Ingo Molnar
changes Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-20sched/deadline: Fix migration of SCHED_DEADLINE tasksLuca Abeni
Commit: 9d5142624256 ("sched/deadline: Reduce rq lock contention by eliminating locking of non-feasible target") broke select_task_rq_dl() and find_lock_later_rq(), because it introduced a comparison between the local task's deadline and dl.earliest_dl.curr of the remote queue. However, if the remote runqueue does not contain any SCHED_DEADLINE task its earliest_dl.curr is 0 (always smaller than the deadline of the local task) and the remote runqueue is not selected for pushing. As a result, if an application creates multiple SCHED_DEADLINE threads, they will never be pushed to runqueues that do not already contain SCHED_DEADLINE tasks. This patch fixes the issue by checking if dl.dl_nr_running == 0. Signed-off-by: Luca Abeni <luca.abeni@unitn.it> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Juri Lelli <juri.lelli@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wanpeng Li <wanpeng.li@linux.intel.com> Fixes: 9d5142624256 ("sched/deadline: Reduce rq lock contention by eliminating locking of non-feasible target") Link: http://lkml.kernel.org/r/1444982781-15608-1-git-send-email-luca.abeni@unitn.it Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-20nohz: Revert "nohz: Set isolcpus when nohz_full is set"Frederic Weisbecker
This reverts: 8cb9764fc88b ("nohz: Set isolcpus when nohz_full is set") We assumed that full-nohz users always want scheduler isolation on full dynticks CPUs, therefore we included full-nohz CPUs on cpu_isolated_map. This means that tasks run by default on CPUs outside the nohz_full range unless their affinity is explicity overwritten. This suits pure isolation workloads but when the machine is needed to run common workloads, the available sets of CPUs to run common tasks becomes reduced. We reach an extreme case when CONFIG_NO_HZ_FULL_ALL is enabled as it leaves only CPU 0 for non-isolation tasks, which makes people think that their supercomputer regressed to 90's UP - which is true in a sense. Some full-nohz users appear to be interested in running normal workloads either before or after an isolation workload. Full-nohz isn't optimized toward normal workloads but it's still better than UP performance. We are reaching a limitation in kernel presets here. Lets revert this cpu_isolated_map inclusion and let userspace do its own scheduler isolation using cpusets or explicit affinity settings. Reported-by: Ingo Molnar <mingo@kernel.org> Reported-by: Mike Galbraith <umgwanakikbuti@gmail.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Christoph Lameter <cl@linux.com> Cc: Dave Jones <davej@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paul E . McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Link: http://lkml.kernel.org/r/1444663283-30068-1-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-20sched/fair: Update task group's load_avg after task migrationYuyang Du
When cfs_rq has cfs_rq->removed_load_avg set (when a task migrates from this cfs_rq), we need to update its contribution to the group's load_avg. This should not increase tg's update too much, because in most cases, the cfs_rq has already decayed its load_avg. Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Yuyang Du <yuyang.du@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1444699103-20272-2-git-send-email-yuyang.du@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-20sched/fair: Fix overly small weight for interactive group entitiesYuyang Du
Commit: 9d89c257dfb9 ("sched/fair: Rewrite runnable load and utilization average tracking") led to an overly small weight for interactive group entities. The bad case can be easily reproduced when a number of CPU hogs compete for the CPUs at the same time (thanks to Mike). This is largly because the task group's load average tracking cross CPUs lags behind the real changes. To fix this we accelerate the group share distribution process by using the load.weight of the cfs_rq. This may increase the entire group's share, but we have to do so to protect the (fragile) interactive tasks, especially from CPU hogs. Reported-by: Mike Galbraith <umgwanakikbuti@gmail.com> Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Tested-by: Mike Galbraith <umgwanakikbuti@gmail.com> Signed-off-by: Yuyang Du <yuyang.du@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1444699103-20272-1-git-send-email-yuyang.du@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>