summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2014-07-16Merge tag 'sound-3.16-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Things seem to calm down so far, just a small few HD-audio fixes (regression fixes and a new codec ID addition) popping up" * tag 'sound-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda - Fix broken PM due to incomplete i915 initialization ALSA: hda - Revert stream assignment order for Intel controllers ALSA: hda - Add new GPU codec ID 0x10de0070 to snd-hda ALSA: hda: Fix build warning
2014-07-16ftrace: Allow archs to specify if they need a separate function graph trampolineSteven Rostedt (Red Hat)
Currently if an arch supports function graph tracing, the core code will just assign the function graph trampoline to the function graph addr that gets called. But as the old method for function graph tracing always calls the function trampoline first and that calls the function graph trampoline, some archs may have the function graph trampoline dependent on operations that were done in the function trampoline. This causes function graph tracer to break on those archs. Instead of having the default be to set the function graph ftrace_ops to the function graph trampoline, have it instead just set it to zero which will keep it from jumping to a trampoline that is not set up to be jumped directly too. Link: http://lkml.kernel.org/r/53BED155.9040607@nvidia.com Reported-by: Tuomas Tynkkynen <ttynkkynen@nvidia.com> Tested-by: Tuomas Tynkkynen <ttynkkynen@nvidia.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-07-16x86/debug: Drop several unnecessary CFI annotationsJan Beulich
With the conversion of the register saving code from macros to functions, and with those functions not clobbering most of the registers they spill, there's no need to annotate most of the spill operations; the only exceptions being %rbx (always modified) and %rcx (modified on the error_kernelspace: path). Also remove a bogus commented out annotation - there's no register %orig_rax after all. Signed-off-by: Jan Beulich <jbeulich@suse.com> Link: http://lkml.kernel.org/r/53AAE69A020000780001D3C7@mail.emea.novell.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16cpufreq: cpu0: OPPs can be populated at runtimeViresh Kumar
OPPs can be populated statically, via DT, or added at run time with dev_pm_opp_add(). While this driver handles the first case correctly, it would fail to populate OPPs added at runtime. Because call to of_init_opp_table() would fail as there are no OPPs in DT and probe will return early. To fix this, remove error checking and call dev_pm_opp_init_cpufreq_table() unconditionally. Update bindings as well. Suggested-by: Stephen Boyd <sboyd@codeaurora.org> Tested-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-07-16cpufreq: kirkwood: Reinstate cpufreq driver for ARCH_KIRKWOODQuentin Armitage
Commit ff1f0018cf66080d8e6f59791e552615648a033a ("drivers: Enable building of Kirkwood drivers for mach-mvebu") added Kirkwood into mach-mvebu, adding MACH_KIRKWOOD to ARCH_KIRKWOOD in the KConfig files. The change for ARM_KIRKWOOD_CPUFREQ replaced ARCH_KIRKWOOD with MACH_KIRKWOOD, whereas all the other changes were ARCH_KIRKWOOD || MACH_KIRKWOOD. As a consequence of this change, the cpufreq driver is no longer enabled for ARCH_KIRKWOOD. This patch reinstates ARM_KIRKWOOD_CPUFREQ for ARCH_KIRKWOOD. Signed-off-by: Quentin Armitage <quentin@armitage.org.uk> Acked-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-07-16sched: Allow wait_on_bit_action() functions to support a timeoutNeilBrown
It is currently not possible for various wait_on_bit functions to implement a timeout. While the "action" function that is called to do the waiting could certainly use schedule_timeout(), there is no way to carry forward the remaining timeout after a false wake-up. As false-wakeups a clearly possible at least due to possible hash collisions in bit_waitqueue(), this is a real problem. The 'action' function is currently passed a pointer to the word containing the bit being waited on. No current action functions use this pointer. So changing it to something else will be a little noisy but will have no immediate effect. This patch changes the 'action' function to take a pointer to the "struct wait_bit_key", which contains a pointer to the word containing the bit so nothing is really lost. It also adds a 'private' field to "struct wait_bit_key", which is initialized to zero. An action function can now implement a timeout with something like static int timed_out_waiter(struct wait_bit_key *key) { unsigned long waited; if (key->private == 0) { key->private = jiffies; if (key->private == 0) key->private -= 1; } waited = jiffies - key->private; if (waited > 10 * HZ) return -EAGAIN; schedule_timeout(waited - 10 * HZ); return 0; } If any other need for context in a waiter were found it would be easy to use ->private for some other purpose, or even extend "struct wait_bit_key". My particular need is to support timeouts in nfs_release_page() to avoid deadlocks with loopback mounted NFS. While wait_on_bit_timeout() would be a cleaner interface, it will not meet my need. I need the timeout to be sensitive to the state of the connection with the server, which could change. So I need to use an 'action' interface. Signed-off-by: NeilBrown <neilb@suse.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steve French <sfrench@samba.org> Cc: David Howells <dhowells@redhat.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140707051604.28027.41257.stgit@notabene.brown Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16sched: Remove proliferation of wait_on_bit() action functionsNeilBrown
The current "wait_on_bit" interface requires an 'action' function to be provided which does the actual waiting. There are over 20 such functions, many of them identical. Most cases can be satisfied by one of just two functions, one which uses io_schedule() and one which just uses schedule(). So: Rename wait_on_bit and wait_on_bit_lock to wait_on_bit_action and wait_on_bit_lock_action to make it explicit that they need an action function. Introduce new wait_on_bit{,_lock} and wait_on_bit{,_lock}_io which are *not* given an action function but implicitly use a standard one. The decision to error-out if a signal is pending is now made based on the 'mode' argument rather than being encoded in the action function. All instances of the old wait_on_bit and wait_on_bit_lock which can use the new version have been changed accordingly and their action functions have been discarded. wait_on_bit{_lock} does not return any specific error code in the event of a signal so the caller must check for non-zero and interpolate their own error code as appropriate. The wait_on_bit() call in __fscache_wait_on_invalidate() was ambiguous as it specified TASK_UNINTERRUPTIBLE but used fscache_wait_bit_interruptible as an action function. David Howells confirms this should be uniformly "uninterruptible" The main remaining user of wait_on_bit{,_lock}_action is NFS which needs to use a freezer-aware schedule() call. A comment in fs/gfs2/glock.c notes that having multiple 'action' functions is useful as they display differently in the 'wchan' field of 'ps'. (and /proc/$PID/wchan). As the new bit_wait{,_io} functions are tagged "__sched", they will not show up at all, but something higher in the stack. So the distinction will still be visible, only with different function names (gds2_glock_wait versus gfs2_glock_dq_wait in the gfs2/glock.c case). Since first version of this patch (against 3.15) two new action functions appeared, on in NFS and one in CIFS. CIFS also now uses an action function that makes the same freezer aware schedule call as NFS. Signed-off-by: NeilBrown <neilb@suse.de> Acked-by: David Howells <dhowells@redhat.com> (fscache, keys) Acked-by: Steven Whitehouse <swhiteho@redhat.com> (gfs2) Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steve French <sfrench@samba.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140707051603.28027.72349.stgit@notabene.brown Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16Merge tag 'v3.16-rc5' into sched/core, to refresh the branch before applying ↵Ingo Molnar
bigger tree-wide changes Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16Merge branch 'liblockdep-fixes' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/sashal/linux into locking/urgent Pull liblockdep fixes from Sasha Levin. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16locking/rwsem: Add CONFIG_RWSEM_SPIN_ON_OWNERDavidlohr Bueso
Just like with mutexes (CONFIG_MUTEX_SPIN_ON_OWNER), encapsulate the dependencies for rwsem optimistic spinning. No logical changes here as it continues to depend on both SMP and the XADD algorithm variant. Signed-off-by: Davidlohr Bueso <davidlohr@hp.com> Acked-by: Jason Low <jason.low2@hp.com> [ Also make it depend on ARCH_SUPPORTS_ATOMIC_RMW. ] Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1405112406-13052-2-git-send-email-davidlohr@hp.com Cc: aswin@hp.com Cc: Chris Mason <clm@fb.com> Cc: Davidlohr Bueso <davidlohr@hp.com> Cc: Josef Bacik <jbacik@fusionio.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Waiman Long <Waiman.Long@hp.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16locking/mutex: Disable optimistic spinning on some architecturesPeter Zijlstra
The optimistic spin code assumes regular stores and cmpxchg() play nice; this is found to not be true for at least: parisc, sparc32, tile32, metag-lock1, arc-!llsc and hexagon. There is further wreckage, but this in particular seemed easy to trigger, so blacklist this. Opt in for known good archs. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Reported-by: Mikulas Patocka <mpatocka@redhat.com> Cc: David Miller <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: James Bottomley <James.Bottomley@hansenpartnership.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Jason Low <jason.low2@hp.com> Cc: Waiman Long <waiman.long@hp.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: John David Anglin <dave.anglin@bell.net> Cc: James Hogan <james.hogan@imgtec.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Davidlohr Bueso <davidlohr@hp.com> Cc: stable@vger.kernel.org Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: sparclinux@vger.kernel.org Link: http://lkml.kernel.org/r/20140606175316.GV13930@laptop.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16locking/rwsem: Reduce the size of struct rw_semaphoreJason Low
Recent optimistic spinning additions to rwsem provide significant performance benefits on many workloads on large machines. The cost of it was increasing the size of the rwsem structure by up to 128 bits. However, now that the previous patches in this series bring the overhead of struct optimistic_spin_queue to 32 bits, this patch reorders some fields in struct rw_semaphore such that we can reduce the overhead of the rwsem structure by 64 bits (on 64 bit systems). The extra overhead required for rwsem optimistic spinning would now be up to 8 additional bytes instead of up to 16 bytes. Additionally, the size of rwsem would now be more in line with mutexes. Signed-off-by: Jason Low <jason.low2@hp.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Scott Norton <scott.norton@hp.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Waiman Long <waiman.long@hp.com> Cc: Davidlohr Bueso <davidlohr@hp.com> Cc: Rik van Riel <riel@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Aswin Chandramouleeswaran <aswin@hp.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Chris Mason <clm@fb.com> Cc: Josef Bacik <jbacik@fusionio.com> Link: http://lkml.kernel.org/r/1405358872-3732-6-git-send-email-jason.low2@hp.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16locking/rwsem: Rename 'activity' to 'count'Peter Zijlstra
There are two definitions of struct rw_semaphore, one in linux/rwsem.h and one in linux/rwsem-spinlock.h. For some reason they have different names for the initial field. This makes it impossible to use C99 named initialization for __RWSEM_INITIALIZER() -- or we have to duplicate that entire thing along with the structure definitions. The simpler patch is renaming the rwsem-spinlock variant to match the regular rwsem. This allows us to switch to C99 named initialization. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/n/tip-bmrZolsbGmautmzrerog27io@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16cpufreq: imx6q: Select PM_OPPNicolas Del Piano
PM_OPP is a library used by several of the existing cpufreq drivers. ARM IMX6Q cpufreq driver uses this library for its functionality. Thus, it should be selected in Kconfig. Reported-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar> Signed-off-by: Nicolas Del Piano <ndel314@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-07-16cpufreq: sa1110: set memory type for h3600Linus Walleij
The Compaq iPAQ h3600 also has the K4S281632b-1H memory type. Verified by prying apart a broken board. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-07-16kprobes/x86: Don't try to resolve kprobe faults from userspaceAndy Lutomirski
This commit: commit 6f6343f53d133bae516caf3d254bce37d8774625 Author: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Date: Thu Apr 17 17:17:33 2014 +0900 kprobes/x86: Call exception handlers directly from do_int3/do_debug appears to have inadvertently dropped a check that the int3 came from kernel mode. Trying to dereference addr when addr is user-controlled is completely bogus. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Link: http://lkml.kernel.org/r/c4e339882c121aa76254f2adde3fcbdf502faec2.1405099506.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16ACPI / video: Add use_native_backlight quirk for HP ProBook 4540sHans de Goede
As reported here: https://bugzilla.redhat.com/show_bug.cgi?id=1025690 This is yet another model which needs this quirk. Link: https://bugzilla.redhat.com/show_bug.cgi?id=1025690 Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-07-16Merge tag 'perf-core-for-mingo' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf into perf/core Pull perf/core improvements and fixes from Jiri Olsa: * Add IO mode into timechart command (Stanislav Fomichev) Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16Merge tag 'perf-core-for-mingo' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: User visible changes: o Prep patches to support 'perf kvm stat' on s390 (Alexander Yarygin) o Add pagefault statistics in 'trace' (Stanislav Fomichev) o Add header for columns in 'top' and 'report' TUI browsers (Jiri Olsa) o Add pagefault statistics in 'trace' (Stanislav Fomichev) Build fixes: o Fix build on 32-bit systems (Arnaldo Carvalho de Melo) Cleanups: o Convert open coded equivalents to asprintf() (Andy Shevchenko) Plumbing changes: o Allow reserving a row for header purposes in the hists browser (Arnaldo Carvalho de Melo) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16sched/numa: Revert "Use effective_load() to balance NUMA loads"Peter Zijlstra
Due to divergent trees, Rik find that this patch is no longer required. Requested-by: Rik van Riel <riel@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/n/tip-u6odkgkw8wz3m7orgsjfo5pi@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16sched: Fix static_key race with sched_feat()Jason Baron
As pointed out by Andi Kleen, the usage of static keys can be racy in sched_feat_disable() vs. sched_feat_enable(). Currently, we first check the value of keys->enabled, and subsequently update the branch direction. This, can be racy and can potentially leave the keys in an inconsistent state. Take the i_mutex around these calls to resolve the race. Reported-by: Andi Kleen <andi@firstfloor.org> Signed-off-by: Jason Baron <jbaron@akamai.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: rostedt@goodmis.org Link: http://lkml.kernel.org/r/9d7780c83db26683955cd01e6bc654ee2586e67f.1404315388.git.jbaron@akamai.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16sched: Remove extra static_key*() function indirectionJason Baron
I think its a bit simpler without having to follow an extra layer of static inline fuctions. No functional change just cosmetic. Signed-off-by: Jason Baron <jbaron@akamai.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: rostedt@goodmis.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/2ce52233ce200faad93b6029d90f1411cd926667.1404315388.git.jbaron@akamai.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16sched/rt: Fix replenish_dl_entity() comments to match the current upstream codexiaofeng.yan
Signed-off-by: xiaofeng.yan <xiaofeng.yan@huawei.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1404712744-16986-1-git-send-email-xiaofeng.yan@huawei.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16sched: Transform resched_task() into resched_curr()Kirill Tkhai
We always use resched_task() with rq->curr argument. It's not possible to reschedule any task but rq's current. The patch introduces resched_curr(struct rq *) to replace all of the repeating patterns. The main aim is cleanup, but there is a little size profit too: (before) $ size kernel/sched/built-in.o text data bss dec hex filename 155274 16445 7042 178761 2ba49 kernel/sched/built-in.o $ size vmlinux text data bss dec hex filename 7411490 1178376 991232 9581098 92322a vmlinux (after) $ size kernel/sched/built-in.o text data bss dec hex filename 155130 16445 7042 178617 2b9b9 kernel/sched/built-in.o $ size vmlinux text data bss dec hex filename 7411362 1178376 991232 9580970 9231aa vmlinux I was choosing between resched_curr() and resched_rq(), and the first name looks better for me. A little lie in Documentation/trace/ftrace.txt. I have not actually collected the tracing again. With a hope the patch won't make execution times much worse :) Signed-off-by: Kirill Tkhai <tkhai@yandex.ru> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20140628200219.1778.18735.stgit@localhost Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16sched/deadline: Kill task_struct->pi_top_taskOleg Nesterov
Remove task_struct->pi_top_task. The only user, rt_mutex_setprio(), can use a local. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Juri Lelli <juri.lelli@gmail.com> Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Daeseok Youn <daeseok.youn@gmail.com> Cc: Dario Faggioli <raistlin@linux.it> Cc: Davidlohr Bueso <davidlohr@hp.com> Cc: David Rientjes <rientjes@google.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthew Dempsky <mdempsky@chromium.org> Cc: Michal Simek <michal.simek@xilinx.com> Cc: Oleg Nesterov <oleg@redhat.com> Link: http://lkml.kernel.org/r/20140606165206.GB29465@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16sched: Fix possible divide by zero in avg_atom() calculationMateusz Guzik
proc_sched_show_task() does: if (nr_switches) do_div(avg_atom, nr_switches); nr_switches is unsigned long and do_div truncates it to 32 bits, which means it can test non-zero on e.g. x86-64 and be truncated to zero for division. Fix the problem by using div64_ul() instead. As a side effect calculations of avg_atom for big nr_switches are now correct. Signed-off-by: Mateusz Guzik <mguzik@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1402750809-31991-1-git-send-email-mguzik@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16perf/x86/intel: Avoid spamming kernel log for BTS buffer failureDavid Rientjes
It's unnecessary to excessively spam the kernel log anytime the BTS buffer cannot be allocated, so make this allocation __GFP_NOWARN. The user probably will want to at least find some artifact that the allocation has failed in the past, probably due to fragmentation because of its large size, when it's not allocated at bootstrap. Thus, add a WARN_ONCE() so something is left behind for them to understand why perf commnads that require PEBS is not working properly. Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1406301600460.26302@chino.kir.corp.google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16perf: Add vm_ops->name call for mmap event name retrievalJiri Olsa
The following patch added another way to get mmap name: 78d683e838a6 ("mm, fs: Add vm_ops->name as an alternative to arch_vma_name") The vdso vma mapping already switch to this and we no longer get vdso name via arch_vma_name function. Adding this way to the perf mmap event name retrieval code. Caught this via perf test: $ sudo ./perf test -v 7 7: Validate PERF_RECORD_* events & perf_sample fields : --- start --- SNIP PERF_RECORD_MMAP for [vdso] missing! test child finished with 255 ---- end ---- Validate PERF_RECORD_* events & perf_sample fields: FAILED! Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1405353439-14211-1-git-send-email-jolsa@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16perf/x86/amd: Try to fix some mem allocation failure handlingZhouyi Zhou
According to Peter's advice, put the failure handling to a goto chain. Compiled in x86_64, could you check if there is anything that I missed. Signed-off-by: Zhouyi Zhou <yizhouzhou@ict.ac.cn> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1402459743-20513-1-git-send-email-zhouzhouyi@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16locking/spinlocks/mcs: Micro-optimize osq_unlock()Jason Low
In the unlock function of the cancellable MCS spinlock, the first thing we do is to retrive the current CPU's osq node. However, due to the changes made in the previous patch, in the common case where the lock is not contended, we wouldn't need to access the current CPU's osq node anymore. This patch optimizes this by only retriving this CPU's osq node after we attempt the initial cmpxchg to unlock the osq and found that its contended. Signed-off-by: Jason Low <jason.low2@hp.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Scott Norton <scott.norton@hp.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Waiman Long <waiman.long@hp.com> Cc: Davidlohr Bueso <davidlohr@hp.com> Cc: Rik van Riel <riel@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Aswin Chandramouleeswaran <aswin@hp.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1405358872-3732-5-git-send-email-jason.low2@hp.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16locking/spinlocks/mcs: Introduce and use init macro and function for osq locksJason Low
Currently, we initialize the osq lock by directly setting the lock's values. It would be preferable if we use an init macro to do the initialization like we do with other locks. This patch introduces and uses a macro and function for initializing the osq lock. Signed-off-by: Jason Low <jason.low2@hp.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Scott Norton <scott.norton@hp.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Waiman Long <waiman.long@hp.com> Cc: Davidlohr Bueso <davidlohr@hp.com> Cc: Rik van Riel <riel@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Aswin Chandramouleeswaran <aswin@hp.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Chris Mason <clm@fb.com> Cc: Josef Bacik <jbacik@fusionio.com> Link: http://lkml.kernel.org/r/1405358872-3732-4-git-send-email-jason.low2@hp.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16locking/spinlocks/mcs: Convert osq lock to atomic_t to reduce overheadJason Low
The cancellable MCS spinlock is currently used to queue threads that are doing optimistic spinning. It uses per-cpu nodes, where a thread obtaining the lock would access and queue the local node corresponding to the CPU that it's running on. Currently, the cancellable MCS lock is implemented by using pointers to these nodes. In this patch, instead of operating on pointers to the per-cpu nodes, we store the CPU numbers in which the per-cpu nodes correspond to in atomic_t. A similar concept is used with the qspinlock. By operating on the CPU # of the nodes using atomic_t instead of pointers to those nodes, this can reduce the overhead of the cancellable MCS spinlock by 32 bits (on 64 bit systems). Signed-off-by: Jason Low <jason.low2@hp.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Scott Norton <scott.norton@hp.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Waiman Long <waiman.long@hp.com> Cc: Davidlohr Bueso <davidlohr@hp.com> Cc: Rik van Riel <riel@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Aswin Chandramouleeswaran <aswin@hp.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Chris Mason <clm@fb.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Josef Bacik <jbacik@fusionio.com> Link: http://lkml.kernel.org/r/1405358872-3732-3-git-send-email-jason.low2@hp.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16locking/spinlocks/mcs: Rename optimistic_spin_queue() to optimistic_spin_node()Jason Low
Currently, the per-cpu nodes structure for the cancellable MCS spinlock is named "optimistic_spin_queue". However, in a follow up patch in the series we will be introducing a new structure that serves as the new "handle" for the lock. It would make more sense if that structure is named "optimistic_spin_queue". Additionally, since the current use of the "optimistic_spin_queue" structure are "nodes", it might be better if we rename them to "node" anyway. This preparatory patch renames all current "optimistic_spin_queue" to "optimistic_spin_node". Signed-off-by: Jason Low <jason.low2@hp.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Scott Norton <scott.norton@hp.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Waiman Long <waiman.long@hp.com> Cc: Davidlohr Bueso <davidlohr@hp.com> Cc: Rik van Riel <riel@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Aswin Chandramouleeswaran <aswin@hp.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Chris Mason <clm@fb.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Josef Bacik <jbacik@fusionio.com> Link: http://lkml.kernel.org/r/1405358872-3732-2-git-send-email-jason.low2@hp.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16locking/rwsem: Allow conservative optimistic spinning when readers have lockJason Low
Commit 4fc828e24cd9 ("locking/rwsem: Support optimistic spinning") introduced a major performance regression for workloads such as xfs_repair which mix read and write locking of the mmap_sem across many threads. The result was xfs_repair ran 5x slower on 3.16-rc2 than on 3.15 and using 20x more system CPU time. Perf profiles indicate in some workloads that significant time can be spent spinning on !owner. This is because we don't set the lock owner when readers(s) obtain the rwsem. In this patch, we'll modify rwsem_can_spin_on_owner() such that we'll return false if there is no lock owner. The rationale is that if we just entered the slowpath, yet there is no lock owner, then there is a possibility that a reader has the lock. To be conservative, we'll avoid spinning in these situations. This patch reduced the total run time of the xfs_repair workload from about 4 minutes 24 seconds down to approximately 1 minute 26 seconds, back to close to the same performance as on 3.15. Retesting of AIM7, which were some of the workloads used to test the original optimistic spinning code, confirmed that we still get big performance gains with optimistic spinning, even with this additional regression fix. Davidlohr found that while the 'custom' workload took a performance hit of ~-14% to throughput for >300 users with this additional patch, the overall gain with optimistic spinning is still ~+45%. The 'disk' workload even improved by ~+15% at >1000 users. Tested-by: Dave Chinner <dchinner@redhat.com> Acked-by: Davidlohr Bueso <davidlohr@hp.com> Signed-off-by: Jason Low <jason.low2@hp.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1404532172.2572.30.camel@j-VirtualBox Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16perf/x86/intel: Protect LBR and extra_regs against KVM lyingKan Liang
With -cpu host, KVM reports LBR and extra_regs support, if the host has support. When the guest perf driver tries to access LBR or extra_regs MSR, it #GPs all MSR accesses,since KVM doesn't handle LBR and extra_regs support. So check the related MSRs access right once at initialization time to avoid the error access at runtime. For reproducing the issue, please build the kernel with CONFIG_KVM_INTEL = y (for host kernel). And CONFIG_PARAVIRT = n and CONFIG_KVM_GUEST = n (for guest kernel). Start the guest with -cpu host. Run perf record with --branch-any or --branch-filter in guest to trigger LBR Run perf stat offcore events (E.g. LLC-loads/LLC-load-misses ...) in guest to trigger offcore_rsp #GP Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com> Cc: Mark Davies <junk@eslaf.co.uk> Cc: Paul Mackerras <paulus@samba.org> Cc: Stephane Eranian <eranian@google.com> Cc: Yan, Zheng <zheng.z.yan@intel.com> Link: http://lkml.kernel.org/r/1405365957-20202-1-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16perf: Fix lockdep warning on process exitPeter Zijlstra
Sasha Levin reported: > While fuzzing with trinity inside a KVM tools guest running the latest -next > kernel I've stumbled on the following spew: > > ====================================================== > [ INFO: possible circular locking dependency detected ] > 3.15.0-next-20140613-sasha-00026-g6dd125d-dirty #654 Not tainted > ------------------------------------------------------- > trinity-c578/9725 is trying to acquire lock: > (&(&pool->lock)->rlock){-.-...}, at: __queue_work (kernel/workqueue.c:1346) > > but task is already holding lock: > (&ctx->lock){-.....}, at: perf_event_exit_task (kernel/events/core.c:7471 kernel/events/core.c:7533) > > which lock already depends on the new lock. > 1 lock held by trinity-c578/9725: > #0: (&ctx->lock){-.....}, at: perf_event_exit_task (kernel/events/core.c:7471 kernel/events/core.c:7533) > > Call Trace: > dump_stack (lib/dump_stack.c:52) > print_circular_bug (kernel/locking/lockdep.c:1216) > __lock_acquire (kernel/locking/lockdep.c:1840 kernel/locking/lockdep.c:1945 kernel/locking/lockdep.c:2131 kernel/locking/lockdep.c:3182) > lock_acquire (./arch/x86/include/asm/current.h:14 kernel/locking/lockdep.c:3602) > _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151) > __queue_work (kernel/workqueue.c:1346) > queue_work_on (kernel/workqueue.c:1424) > free_object (lib/debugobjects.c:209) > __debug_check_no_obj_freed (lib/debugobjects.c:715) > debug_check_no_obj_freed (lib/debugobjects.c:727) > kmem_cache_free (mm/slub.c:2683 mm/slub.c:2711) > free_task (kernel/fork.c:221) > __put_task_struct (kernel/fork.c:250) > put_ctx (include/linux/sched.h:1855 kernel/events/core.c:898) > perf_event_exit_task (kernel/events/core.c:907 kernel/events/core.c:7478 kernel/events/core.c:7533) > do_exit (kernel/exit.c:766) > do_group_exit (kernel/exit.c:884) > get_signal_to_deliver (kernel/signal.c:2347) > do_signal (arch/x86/kernel/signal.c:698) > do_notify_resume (arch/x86/kernel/signal.c:751) > int_signal (arch/x86/kernel/entry_64.S:600) Urgh.. so the only way I can make that happen is through: perf_event_exit_task_context() raw_spin_lock(&child_ctx->lock); unclone_ctx(child_ctx) put_ctx(ctx->parent_ctx); raw_spin_unlock_irqrestore(&child_ctx->lock); And we can avoid this by doing the change below. I can't immediately see how this changed recently, but given that you say it's easy to reproduce, lets fix this. Reported-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Cc: Dave Jones <davej@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140623141242.GB19860@laptop.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16perf/x86/intel/uncore: Fix SNB-EP/IVT Cbox filter mappingsStephane Eranian
This patch fixes the SNB-EP and IVT Cbox filter mapping table. The table controls which filters are supported by which events. There were several mistakes in those tables causing some filters to be ignored, such as NID on TOR_INSERTS. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: zheng.z.yan@intel.com Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140630144624.GA2604@quad Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16perf/x86/intel: Use proper dTLB-load-misses event on IvyBridgeVince Weaver
This was discussed back in February: https://lkml.org/lkml/2014/2/18/956 But I never saw a patch come out of it. On IvyBridge we share the SandyBridge cache event tables, but the dTLB-load-miss event is not compatible. Patch it up after the fact to the proper DTLB_LOAD_MISSES.DEMAND_LD_MISS_CAUSES_A_WALK Signed-off-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1407141528200.17214@vincent-weaver-1.umelst.maine.edu Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16perf: Revert ("perf: Always destroy groups on exit")Peter Zijlstra
Vince reported that commit 15a2d4de0eab5 ("perf: Always destroy groups on exit") causes a regression with grouped events. In particular his read_group_attached.c test fails. https://github.com/deater/perf_event_tests/blob/master/tests/bugs/read_group_attached.c Because of the context switch optimization in perf_event_context_sched_out() the 'original' event may end up in the child process and when that exits the change in the patch in question destroys the actual grouping. Therefore revert that change and only destroy inherited groups. Reported-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/n/tip-zedy3uktcp753q8fw8dagx7a@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16x86: Remove unused variable "polling"Paul Bolle
Compile tested. "polling" is unused since commit f80c5b39b80a ("sched/idle, x86: Switch from TS_POLLING to TIF_POLLING_NRFLAG"). Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Jiri Kosina <jkosina@suse.cz> Link: http://lkml.kernel.org/r/1404138749.2978.6.camel@x41 Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16s390: fix restore of invalid floating-point-controlMartin Schwidefsky
The fixup of the inline assembly to restore the floating-point-control register needs to check for instruction address *after* the lfcp instruction as the specification and data exceptions are suppresssing. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-07-16s390/zcrypt: improve device probing for zcrypt adapter cardsIngo Tuchscherer
Improve device probing process for zcrypt adapters to transmit service request during registration process. Signed-off-by: Ingo Tuchscherer <ingo.tuchscherer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-07-16s390/ptrace: fix PSW mask checkMartin Schwidefsky
The PSW mask check of the PTRACE_POKEUSR_AREA command is incorrect. The PSW_MASK_USER define contains the PSW_MASK_ASC bits, the ptrace interface accepts all combinations for the address-space-control bits. To protect the kernel space the PSW mask check in ptrace needs to reject the address-space-control bit combination for home space. Fixes CVE-2014-3534 Cc: stable@vger.kernel.org Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-07-16s390/MSI: Use standard mask and unmask funtionsYijing Wang
MSI irqchip in s390 has its own mask and unmask MSI irq functions, zpci_enable_irq() and zpci_disable_irq(). They mask and unmask MSI irq in standard ways, no arch special. MSI driver provides two global standard functions mask_msi_irq() and unmask_msi_irq(). Local zpci_enable_irq() and zpci_disable_irq() are almost the same as the standard two. the difference is local mask/unmask functions read the mask status before mask and unmask everytime. Then change the value and rewrite to hardware. In standard functions, save the mask status after mask and unmask msi irq, and use the cached status to change the mask status. When we mask or unmask a MSI irq, we always cache its mask status except we know need not to cache it, like in pci_msi_shutdown. So use the standard functions to replace the local is safe. Signed-off-by: Yijing Wang <wangyijing@huawei.com> [sebott: fixed inverted function pointers] Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-07-16s390/3270: correct size detection with the read-partition commandMartin Schwidefsky
The size detection for 3270 terminals with the read-partition command is broken. The raw3270_reset_device_cb function clears the init_data array, but if raw3270_writesf_readpart has been called the read-partition command is queued which needs the init_data array. In this case the size detection will fail and the invalid command does funny things to the terminal. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-07-16s390: require mvcos facility, not tod clock steering facilityDavid Hildenbrand
Inlined uaccess functions require the mvcos facility (bit 27), not the tod clock steering facility (bit 28) for z10 and newer machines. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-07-16UBI: fastmap: do not miss bit-flipsBrian Norris
The return value from 'ubi_io_read_ec_hdr()' was stored in 'err', not in 'ret'. This fix makes sure Fastmap-enabled UBI does not miss bit-flip while reading EC headers, events and scrubs the affected PEBs. This issue was reported by Coverity Scan. Artem: improved the commit message. Signed-off-by: Brian Norris <computersforpeace@gmail.com> Acked-by: Richard Weinberger <richard@nod.at> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-07-15Merge tag 'xtensa-for-next-20140715' of ↵Chris Zankel
git://github.com/jcmvbkbc/linux-xtensa into for_next Xtensa fixes for 3.16: - resolve FIXMEs in double exception handler for window overflow. This fix makes native building of linux on xtensa host possible; - fix sysmem region removal issue introduced in 3.15.
2014-07-15Merge branch 'for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull quota fix from Jan Kara: "Fix locking of dquot shrinker" * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: quota: missing lock in dqcache_shrink_scan()
2014-07-15Merge tag 'gpio-v3.16-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO fix from Linus Walleij: "Fix up some merge confusion from the merge window" * tag 'gpio-v3.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: gpio: mcp23s08: Eliminates redundant checking.