summaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)Author
2009-10-14perf_event: Adjust frequency and unthrottle for non-group-leader eventsPaul Mackerras
The loop in perf_ctx_adjust_freq checks the frequency of sampling event counters, and adjusts the event interval and unthrottles the event if required, and resets the interrupt count for the event. However, at present it only looks at group leaders. This means that a sampling event that is not a group leader will eventually get throttled, once its interrupt count reaches sysctl_perf_event_sample_rate/HZ --- and that is guaranteed to happen, if the event is active for long enough, since the interrupt count never gets reset. Once it is throttled it never gets unthrottled, so it basically just stops working at that point. This fixes it by making perf_ctx_adjust_freq use ctx->event_list rather than ctx->group_list. The existing spin_lock/spin_unlock around the loop makes it unnecessary to put rcu_read_lock/ rcu_read_unlock around the list_for_each_entry_rcu(). Reported-by: Mark W. Krentel <krentel@cs.rice.edu> Signed-off-by: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <19157.26731.855609.165622@cargo.ozlabs.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14tracing: Enable records during the module loadJiri Olsa
I was debuging some module using "function" and "function_graph" tracers and noticed, that if you load module after you enabled tracing, the module's hooks will convert only to NOP instructions. The attached patch enables modules' hooks if there's function trace allready on, thus allowing to trace module functions. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <20091013203425.896285120@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14tracing: Support multiple pids in set_pid_ftrace filejolsa@redhat.com
Adding the possibility to set more than 1 pid in the set_pid_ftrace file, thus allowing to trace more than 1 independent processes. Usage: sh-4.0# echo 284 > ./set_ftrace_pid sh-4.0# cat ./set_ftrace_pid 284 sh-4.0# echo 1 >> ./set_ftrace_pid sh-4.0# echo 0 >> ./set_ftrace_pid sh-4.0# cat ./set_ftrace_pid swapper tasks 1 284 sh-4.0# echo 4 > ./set_ftrace_pid sh-4.0# cat ./set_ftrace_pid 4 sh-4.0# echo > ./set_ftrace_pid sh-4.0# cat ./set_ftrace_pid no pid sh-4.0# Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20091013203425.565454612@goodmis.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14capabilities: simplify bound checks for copy_from_user()Arjan van de Ven
The capabilities syscall has a copy_from_user() call where gcc currently cannot prove to itself that the copy is always within bounds. This patch adds a very explicity bound check to prove to gcc that this copy_from_user cannot overflow its destination buffer. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: James Morris <jmorris@namei.org>
2009-10-13futex: Handle spurious wake upThomas Gleixner
The futex code does not handle spurious wake up in futex_wait and futex_wait_requeue_pi. The code assumes that any wake up which was not caused by futex_wake / requeue or by a timeout was caused by a signal wake up and returns one of the syscall restart error codes. In case of a spurious wake up the signal delivery code which deals with the restart error codes is not invoked and we return that error code to user space. That causes applications which actually check the return codes to fail. Blaise reported that on preempt-rt a python test program run into a exception trap. -rt exposed that due to a built in spurious wake up accelerator :) Solve this by checking signal_pending(current) in the wake up path and handle the spurious wake up case w/o returning to user space. Reported-by: Blaise Gassend <blaise@willowgarage.com> Debugged-by: Darren Hart <dvhltc@us.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@kernel.org LKML-Reference: <new-submission>
2009-10-13Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: cciss: Add cciss_allow_hpsa module parameter cciss: Fix multiple calls to pci_release_regions blk-settings: fix function parameter kernel-doc notation writeback: kill space in debugfs item name writeback: account IO throttling wait as iowait elv_iosched_store(): fix strstrip() misuse cfq-iosched: avoid probable slice overrun when idling cfq-iosched: apply bool value where we return 0/1 cfq-iosched: fix think time allowed for seekers cfq-iosched: fix the slice residual sign cfq-iosched: abstract out the 'may this cfqq dispatch' logic block: use proper BLK_RW_ASYNC in blk_queue_start_tag() block: Seperate read and write statistics of in_flight requests v2 block: get rid of kblock_schedule_delayed_work() cfq-iosched: fix possible problem with jiffies wraparound cfq-iosched: fix issue with rq-rq merging and fifo list ordering
2009-10-13this_cpu: Use this_cpu_xx in trace_functions_graph.cTejun Heo
ftrace_cpu_disabled usage in trace_functions_graph.c were left out during this_cpu_xx conversion in commit 9288f99a causing compile failure. Convert them. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Christoph Lameter <cl@linux-foundation.org>
2009-10-13Merge branch 'tracing/core' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into tracing/core
2009-10-13tracing: Remove unused ftrace_trace_addr helperFrederic Weisbecker
Remove the ftrace_trace_addr() function as only its off-case is implemented and there are no users of it currently. But we keep ftrace_graph_addr() off-case, in case someone come to use the function graph tracer to profit from top-level callers filtering. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Li Zefan <lizf@cn.fujitsu.com>
2009-10-13tracing: Rename set_ftrace to set_bootup_ftraceFrederic Weisbecker
Do this rename because set_ftrace is too much generic and not enough self-explainable as a name. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Li Zefan <lizf@cn.fujitsu.com>
2009-10-13Merge commit 'v2.6.32-rc4' into perf/coreIngo Molnar
Merge reason: we were on an -rc1 base, merge up to -rc4. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-13Merge branch 'tracing/urgent' into tracing/coreIngo Molnar
Merge reason: Pick up tracing/filters fix from the urgent queue, we will queue up dependent patches. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12net: Introduce recvmmsg socket syscallArnaldo Carvalho de Melo
Meaning receive multiple messages, reducing the number of syscalls and net stack entry/exit operations. Next patches will introduce mechanisms where protocols that want to optimize this operation will provide an unlocked_recvmsg operation. This takes into account comments made by: . Paul Moore: sock_recvmsg is called only for the first datagram, sock_recvmsg_nosec is used for the rest. . Caitlin Bestler: recvmmsg now has a struct timespec timeout, that works in the same fashion as the ppoll one. If the underlying protocol returns a datagram with MSG_OOB set, this will make recvmmsg return right away with as many datagrams (+ the OOB one) it has received so far. . Rémi Denis-Courmont & Steven Whitehouse: If we receive N < vlen datagrams and then recvmsg returns an error, recvmmsg will return the successfully received datagrams, store the error and return it in the next call. This paves the way for a subsequent optimization, sk_prot->unlocked_recvmsg, where we will be able to acquire the lock only at batch start and end, not at every underlying recvmsg call. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-13tracing/filters: Fix memory leak when setting a filterLi Zefan
Every time we set a filter, we leak memory allocated by postfix_append_operand() and postfix_append_op(). Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: <stable@kernel.org> # for v2.6.31.x LKML-Reference: <4AD3D7D9.4070400@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12tracing/kprobes: Robustify fixed field names against variable field names ↵Masami Hiramatsu
conflicts Rename probe-common fixed field names to harder conflictable names, because current 'ip', 'func', and other probe field names are easily in conflict with user-specified variable names. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Christoph Hellwig <hch@infradead.org> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Frank Ch. Eigler <fche@redhat.com> LKML-Reference: <20091007222814.1684.407.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-10-12tracing/kprobes: Avoid field name conflictionMasami Hiramatsu
Check whether the argument name is in conflict with other field names while creating a kprobe through the debugfs interface. Changes in v3: - Check strcmp() == 0 instead of !strcmp(). Changes in v2: - Add common_lock_depth to reserved name list. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Christoph Hellwig <hch@infradead.org> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Frank Ch. Eigler <fche@redhat.com> LKML-Reference: <20091007222807.1684.26880.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-10-12tracing/kprobes: Make special variable names more self-explainableMasami Hiramatsu
Rename special variables to more self-explainable names as below: - $rv to $retval - $sa to $stack - $aN to $argN - $sN to $stackN Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Christoph Hellwig <hch@infradead.org> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Frank Ch. Eigler <fche@redhat.com> LKML-Reference: <20091007222759.1684.3319.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-10-12ftrace: add kernel command line graph function filteringStefan Assmann
Add a command line parameter to allow limiting the function graphs that are traced on boot up from the given top-level callers , when ftrace=function_graph is specified. This patch adds the following command line option: ftrace_graph_filter=function-list Where function-list is a comma separated list of functions to filter. [fweisbec@gmail.com: picked the documentation changes from the v2 patch] Signed-off-by: Stefan Assmann <sassmann@redhat.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <4AD2DEB9.2@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-10-12tracing/kprobes: Remove '$ra' special variableMasami Hiramatsu
Remove '$ra' (return address) because it is already shown at the head of each entry. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Christoph Hellwig <hch@infradead.org> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Frank Ch. Eigler <fche@redhat.com> LKML-Reference: <20091007222748.1684.12711.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-10-12tracing/kprobes: Add $ prefix to special variablesMasami Hiramatsu
Add $ prefix to the special variables(e.g. sa, rv) of kprobe-tracer. This resolves consistency issues between kprobe_events and perf-kprobe. The main goal is to avoid conflicts between local variable names of probed functions, used by perf probe, and special variables used in the kprobe event creation interface (stack values, etc...) and also available from perf probe. ie: we don't want rv (return value) to conflict with a local variable named rv in a probed function. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Christoph Hellwig <hch@infradead.org> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Frank Ch. Eigler <fche@redhat.com> LKML-Reference: <20091007222740.1684.91170.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-10-12this_cpu: Use this_cpu_xx for ftraceChristoph Lameter
this_cpu_xx can reduce the instruction count here and also avoid address arithmetic. Signed-off-by: Christoph Lameter <cl@linux-foundation.org> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Tejun Heo <tj@kernel.org>
2009-10-12sched: Fix missing kernel-doc notationRandy Dunlap
The following htmldocs warnings: Warning(kernel/sched.c:685): No description found for parameter 'cpu' Warning(kernel/sched.c:3676): No description found for parameter 'sd' Trigger because new parameters were added to update_rq_clock() and update_group_power() without updating the kernel-doc notation. Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4AD29070.7070002@oracle.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-11headers: remove sched.h from interrupt.hAlexey Dobriyan
After m68k's task_thread_info() doesn't refer to current, it's possible to remove sched.h from interrupt.h and not break m68k! Many thanks to Heiko Carstens for allowing this. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2009-10-09sched: Update the clock of runqueue select_task_rq() selectedMike Galbraith
In try_to_wake_up(), we update the runqueue clock, but select_task_rq() may select a different runqueue than the one we updated, leaving the new runqueue's clock stale for a bit. This patch cures occasional huge latencies reported by latencytop when coming out of idle on a mostly idle NO_HZ box. Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1255070103.7639.30.camel@marge.simson.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-09lockdep: Use cpu_clock() for lockstatPeter Zijlstra
Some tracepoint magic (TRACE_EVENT(lock_acquired)) relies on the fact that lock hold times are positive and uses div64 on that. That triggered a build warning on MIPS, and probably causes bad output in certain circumstances as well. Make it truly positive. Reported-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1254818502.21044.112.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-09writeback: account IO throttling wait as iowaitWu Fengguang
It makes sense to do IOWAIT when someone is blocked due to IO throttle, as suggested by Kame and Peter. There is an old comment for not doing IOWAIT on throttle, however it has been mismatching the code for a long time. If we stop accounting IOWAIT for 2.6.32, it could be an undesirable behavior change. So restore the io_schedule. CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> CC: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-09tracing: fix trace_vprintk callSteven Rostedt
The addition of trace_array_{v}printk used the wrong function for trace_vprintk to call. This broke trace_marker and trace_vprintk itself. Although trace_printk may not have been affected by those that end up calling trace_vbprintk. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-10-08Merge branch 'core-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: futex: fix requeue_pi key imbalance futex: Fix typo in FUTEX_WAIT/WAKE_BITSET_PRIVATE definitions rcu: Place root rcu_node structure in separate lockdep class rcu: Make hot-unplugged CPU relinquish its own RCU callbacks rcu: Move rcu_barrier() to rcutree futex: Move exit_pi_state() call to release_mm() futex: Nullify robust lists after cleanup futex: Fix locking imbalance panic: Fix panic message visibility by calling bust_spinlocks(0) before dying rcu: Replace the rcu_barrier enum with pointer to call_rcu*() function rcu: Clean up code based on review feedback from Josh Triplett, part 4 rcu: Clean up code based on review feedback from Josh Triplett, part 3 rcu: Fix rcu_lock_map build failure on CONFIG_PROVE_LOCKING=y rcu: Clean up code to address Ingo's checkpatch feedback rcu: Clean up code based on review feedback from Josh Triplett, part 2 rcu: Clean up code based on review feedback from Josh Triplett
2009-10-08Merge branch 'sched-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: Set correct normal_prio and prio values in sched_fork()
2009-10-08Merge branch 'tracing-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: tracing: user local buffer variable for trace branch tracer tracing: fix warning on kernel/trace/trace_branch.c andtrace_hw_branches.c ftrace: check for failure for all conversions tracing: correct module boundaries for ftrace_release tracing: fix transposed numbers of lock_depth and preempt_count trace: Fix missing assignment in trace_ctxwake_* tracing: Use free_percpu instead of kfree tracing: Check total refcount before releasing bufs in profile_enable failure
2009-10-08Merge branch 'sparc-perf-events-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sparc-perf-events-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: mm, perf_event: Make vmalloc_user() align base kernel virtual address to SHMLBA perf_event: Provide vmalloc() based mmap() backing
2009-10-08Merge branch 'perf-fixes-for-linus-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf_events: Make ABI definitions available to userspace perf tools: elf_sym__is_function() should accept "zero" sized functions tracing/syscalls: Use long for syscall ret format and field definitions perf trace: Update eval_flag() flags array to match interrupt.h perf trace: Remove unused code in builtin-trace.c perf: Propagate term signal to child
2009-10-07tracing: user local buffer variable for trace branch tracerSteven Rostedt
Just using the tr->buffer for the API to trace_buffer_lock_reserve is not good enough. This is because the tr->buffer may change, and we do not want to commit with a different buffer that we reserved from. This patch uses a local variable to hold the buffer that was used to reserve and commit with. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-10-07tracing: fix warning on kernel/trace/trace_branch.c andtrace_hw_branches.cZhenwen Xu
fix warnings that caused the API change of trace_buffer_lock_reserve() change files: kernel/trace/trace_hw_branch.c kernel/trace/trace_branch.c Signed-off-by: Zhenwen Xu <helight.xu@gmail.com> LKML-Reference: <20091008012146.GA4170@helight> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-10-07ftrace: check for failure for all conversionsSteven Rostedt
Due to legacy code from back when the dynamic tracer used a daemon, only core kernel code was checking for failures. This is no longer the case. We must check for failures any time we perform text modifications. Cc: stable@kernel.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-10-07tracing: correct module boundaries for ftrace_releasejolsa@redhat.com
When the module is about the unload we release its call records. The ftrace_release function was given wrong values representing the module core boundaries, thus not releasing its call records. Plus making ftrace_release function module specific. Signed-off-by: Jiri Olsa <jolsa@redhat.com> LKML-Reference: <1254934835-363-3-git-send-email-jolsa@redhat.com> Cc: stable@kernel.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-10-07futex: fix requeue_pi key imbalanceDarren Hart
If futex_wait_requeue_pi() wakes prior to requeue, we drop the reference to the source futex_key twice, once in handle_early_requeue_pi_wakeup() and once on our way out. Remove the drop from the handle_early_requeue_pi_wakeup() and keep the get/drops together in futex_wait_requeue_pi(). Reported-by: Helge Bahmann <hcb@chaoticmind.net> Signed-off-by: Darren Hart <dvhltc@us.ibm.com> Cc: Helge Bahmann <hcb@chaoticmind.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Dinakar Guniguntala <dino@in.ibm.com> Cc: John Stultz <johnstul@us.ibm.com> Cc: stable-2.6.31 <stable@kernel.org> LKML-Reference: <4ACCE21E.5030805@us.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-10-07tracing: fix transposed numbers of lock_depth and preempt_countSteven Rostedt
The lock_depth and preempt_count numbers in the latency format is transposed. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-10-07NOHZ: update idle state also when NOHZ is inactiveEero Nurkkala
Commit f2e21c9610991e95621a81407cdbab881226419b had unfortunate side effects with cpufreq governors on some systems. If the system did not switch into NOHZ mode ts->inidle is not set when tick_nohz_stop_sched_tick() is called from the idle routine. Therefor all subsequent calls from irq_exit() to tick_nohz_stop_sched_tick() fail to call tick_nohz_start_idle(). This results in bogus idle accounting information which is passed to cpufreq governors. Set the inidle flag unconditionally of the NOHZ active state to keep the idle time accounting correct in any case. [ tglx: Added comment and tweaked the changelog ] Reported-by: Steven Noonan <steven@uplinklabs.net> Signed-off-by: Eero Nurkkala <ext-eero.nurkkala@nokia.com> Cc: Rik van Riel <riel@redhat.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Greg KH <greg@kroah.com> Cc: Steven Noonan <steven@uplinklabs.net> Cc: stable@kernel.org LKML-Reference: <1254907901.30157.93.camel@eenurkka-desktop> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-10-07rcu: Place root rcu_node structure in separate lockdep classPaul E. McKenney
Before this patch, all of the rcu_node structures were in the same lockdep class, so that lockdep would complain when rcu_preempt_offline_tasks() acquired the root rcu_node structure's lock while holding one of the leaf rcu_nodes' locks. This patch changes rcu_init_one() to use a separate spin_lock_init() for the root rcu_node structure's lock than is used for that of all of the rest of the rcu_node structures, which puts the root rcu_node structure's lock in its own lockdep class. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: akpm@linux-foundation.org Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <12548908983277-git-send-email-> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-07rcu: Make hot-unplugged CPU relinquish its own RCU callbacksPaul E. McKenney
The current interaction between RCU and CPU hotplug requires that RCU block in CPU notifiers waiting for callbacks to drain. This can be greatly simplified by having each CPU relinquish its own callbacks, and for both _rcu_barrier() and CPU_DEAD notifiers to adopt all callbacks that were previously relinquished. This change also eliminates the possibility of certain types of hangs due to the previous practice of waiting for callbacks to be invoked from within CPU notifiers. If you don't every wait, you cannot hang. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: akpm@linux-foundation.org Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <1254890898456-git-send-email-> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-07rcu: Move rcu_barrier() to rcutreePaul E. McKenney
Move the existing rcu_barrier() implementation to rcutree.c, consistent with the fact that the rcu_barrier() implementation is tied quite tightly to the RCU implementation. This opens the way to simplify and fix rcutree.c's rcu_barrier() implementation in a later patch. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: akpm@linux-foundation.org Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <12548908982563-git-send-email-> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-06futex: Move exit_pi_state() call to release_mm()Thomas Gleixner
exit_pi_state() is called from do_exit() but not from do_execve(). Move it to release_mm() so it gets called from do_execve() as well. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <new-submission> Cc: stable@kernel.org Cc: Anirban Sinha <ani@anirban.org> Cc: Peter Zijlstra <peterz@infradead.org>
2009-10-06futex: Nullify robust lists after cleanupPeter Zijlstra
The robust list pointers of user space held futexes are kept intact over an exec() call. When the exec'ed task exits exit_robust_list() is called with the stale pointer. The risk of corruption is minimal, but still it is incorrect to keep the pointers valid. Actually glibc should uninstall the robust list before calling exec() but we have to deal with it anyway. Nullify the pointers after [compat_]exit_robust_list() has been called. Reported-by: Anirban Sinha <ani@anirban.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <new-submission> Cc: stable@kernel.org
2009-10-06tracing/events: Add 'signed' field to format filesTom Zanussi
The sign info used for filters in the kernel is also useful to applications that process the trace stream. Add it to the format files and make it available to userspace. Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: rostedt@goodmis.org Cc: lizf@cn.fujitsu.com Cc: hch@infradead.org Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1254809398-8078-2-git-send-email-tzanussi@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-06trace: Fix missing assignment in trace_ctxwake_*Hiroshi Shimamoto
The state char variable S should be reassigned, if S == 0. We are missing the state of the task that is going to sleep for the context switch events (in the raw mode). Fortunately the problem arises with the sched_switch/wake_up tracers, not the sched trace events. The formers are legacy now. But still, that was buggy. Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Cc: Steven Rostedt <srostedt@redhat.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <4AC43118.6050409@ct.jp.nec.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-06perf_event: Provide vmalloc() based mmap() backingPeter Zijlstra
Some architectures such as Sparc, ARM and MIPS (basically everything with flush_dcache_page()) need to deal with dcache aliases by carefully placing pages in both kernel and user maps. These architectures typically have to use vmalloc_user() for this. However, on other architectures, vmalloc() is not needed and has the downsides of being more restricted and slower than regular allocations. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: David Miller <davem@davemloft.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1254830228.21044.272.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-06tracing/syscalls: Use long for syscall ret format and field definitionsTom Zanussi
The syscall event definitions use long for the syscall exit ret value, but unsigned long for the same thing in the format and field definitions. Change them all to long. Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: rostedt@goodmis.org Cc: lizf@cn.fujitsu.com Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1254808849-7829-4-git-send-email-tzanussi@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-05sched: Remove obsolete comment in sched_init()Jayson R. King
Remove the comment about calling alloc_bootmem() as it is not called here since commit 36b7b6d465489c4754c4fd66fcec6086eba87896. Signed-off-by: Jayson R. King <dev@jaysonking.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Jiri Kosina <trivial@kernel.org> LKML-Reference: <4AC9C8A6.6010209@jaysonking.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-05futex: Fix locking imbalanceThomas Gleixner
Rich reported a lock imbalance in the futex code: http://bugzilla.kernel.org/show_bug.cgi?id=14288 It's caused by the displacement of the retry_private label in futex_wake_op(). The code unlocks the hash bucket locks in the error handling path and retries without locking them again which makes the next unlock fail. Move retry_private so we lock the hash bucket locks when we retry. Reported-by: Rich Ercolany <rercola@acm.jhu.edu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Darren Hart <dvhltc@us.ibm.com> Cc: stable-2.6.31 <stable@kernel.org> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>