summaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)Author
2009-08-09Merge branch 'timers-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: posix_cpu_timers_exit_group(): Do not use thread_group_cputimer()
2009-08-09Merge branch 'tracing-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf_counter: Fix/complete ftrace event records sampling perf_counter, ftrace: Fix perf_counter integration tracing/filters: Always free pred on filter_add_subsystem_pred() failure tracing/filters: Don't use pred on alloc failure ring-buffer: Fix memleak in ring_buffer_free() tracing: Fix recordmcount.pl to handle sections with only weak functions ring-buffer: Fix advance of reader in rb_buffer_peek() tracing: do not use functions starting with .L in recordmcount.pl ring-buffer: do not disable ring buffer on oops_in_progress ring-buffer: fix check of try_to_discard result
2009-08-09Merge branch 'core-fixes-for-linus-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: lockdep: Fix typos in documentation lockdep: Fix file mode of lock_stat rtmutex: Avoid deadlock in rt_mutex_start_proxy_lock()
2009-08-09perf_counter: Fix a race on perf_counter_ctxPeter Zijlstra
While extending perfcounters with BTS hw-tracing, Markus Metzger managed to trigger this warning: [ 995.557128] WARNING: at kernel/perf_counter.c:1191 __perf_counter_task_sched_out+0x48/0x6b() triggers because commit 9f498cc5be7e013d8d6e4c616980ed0ffc8680d2 (perf_counter: Full task tracing) removed clearing of tsk->perf_counter_ctxp out from under ctx->lock which introduced a race (against perf_lock_task_context). Move it back and deal with the exit notification by explicitly passing along the former task context. Reported-by: Markus T Metzger <markus.t.metzger@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1249667341.17467.5.camel@twins> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09perf_counter: Fix tracepoint sampling to be part of generic samplingFrederic Weisbecker
Based on Peter's comments, make tracepoint sampling generic just like all the other sampling bits are. This is a rename with no code changes: - PERF_SAMPLE_TP_RECORD to PERF_SAMPLE_RAW - struct perf_tracepoint_record to perf_raw_record We want the system in place that transport tracepoints raw samples events into the perf ring buffer to be generalized and usable by any type of counter. Reported-by; Peter Zijlstra <peterz@infradead.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1249698400-5441-4-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09perf_counter: Work around gcc warning by initializing tracepoint record ↵Frederic Weisbecker
unconditionally Despite that the tracepoint record is always present when the PERF_SAMPLE_TP_RECORD flag is set, gcc raises a warning, thinking it might not be initialized: kernel/perf_counter.c: In function ‘perf_counter_output’: kernel/perf_counter.c:2650: warning: ‘tp’ may be used uninitialized in this function Then, initialize it to NULL and always check if it's not NULL before dereference it. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1249698400-5441-2-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09perf_counter: Fix software counters for fast moving event sourcesPeter Zijlstra
Reimplement the software counters to deal with fast moving event sources (such as tracepoints). This means being able to generate multiple overflows from a single 'event' as well as support throttling. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09perf_counter: Fix/complete ftrace event records samplingFrederic Weisbecker
This patch implements the kernel side support for ftrace event record sampling. A new counter sampling attribute is added: PERF_SAMPLE_TP_RECORD which requests ftrace events record sampling. In this case if a PERF_TYPE_TRACEPOINT counter is active and a tracepoint fires, we emit the tracepoint binary record to the perfcounter event buffer, as a sample. Result, after setting PERF_SAMPLE_TP_RECORD attribute from perf record: perf record -f -F 1 -a -e workqueue:workqueue_execution perf report -D 0x21e18 [0x48]: event: 9 . . ... raw event: size 72 bytes . 0000: 09 00 00 00 01 00 48 00 d0 c7 00 81 ff ff ff ff ......H........ . 0010: 0a 00 00 00 0a 00 00 00 21 00 00 00 00 00 00 00 ........!...... . 0020: 2b 00 01 02 0a 00 00 00 0a 00 00 00 65 76 65 6e +...........eve . 0030: 74 73 2f 31 00 00 00 00 00 00 00 00 0a 00 00 00 ts/1........... . 0040: e0 b1 31 81 ff ff ff ff ....... . 0x21e18 [0x48]: PERF_EVENT_SAMPLE (IP, 1): 10: 0xffffffff8100c7d0 period: 33 The raw ftrace binary record starts at offset 0020. Translation: struct trace_entry { type = 0x2b = 43; flags = 1; preempt_count = 2; pid = 0xa = 10; tgid = 0xa = 10; } thread_comm = "events/1" thread_pid = 0xa = 10; func = 0xffffffff8131b1e0 = flush_to_ldisc() What will come next? - Userspace support ('perf trace'), 'flight data recorder' mode for perf trace, etc. - The unconditional copy from the profiling callback brings some costs however if someone wants no such sampling to occur, and needs to be fixed in the future. For that we need to have an instant access to the perf counter attribute. This is a matter of a flag to add in the struct ftrace_event. - Take care of the events recursivity! Don't ever try to record a lock event for example, it seems some locking is used in the profiling fast path and lead to a tracing recursivity. That will be fixed using raw spinlock or recursivity protection. - [...] - Profit! :-) Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Gabriel Munteanu <eduard.munteanu@linux360.ro> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09perf_counter, ftrace: Fix perf_counter integrationPeter Zijlstra
Adds possible second part to the assign argument of TP_EVENT(). TP_perf_assign( __perf_count(foo); __perf_addr(bar); ) Which, when specified make the swcounter increment with @foo instead of the usual 1, and report @bar for PERF_SAMPLE_ADDR (data address associated with the event) when this triggers a counter overflow. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09Merge branch 'linus' into tracing/urgentIngo Molnar
Merge reason: Merge up to almost-rc6 to pick up latest perfcounters (on which we'll queue up a dependent fix) Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09irq: Remove superfluous NULL pointer check in check_irq_resend()Bartlomiej Zolnierkiewicz
This takes care of the following entry from Dan's list: kernel/irq/resend.c +73 check_irq_resend(17) warning: variable derefenced before check 'desc->chip' Reported-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Eugene Teo <eteo@redhat.com> Cc: Julia Lawall <julia@diku.dk> LKML-Reference: <200908062146.03638.bzolnier@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-08posix_cpu_timers_exit_group(): Do not use thread_group_cputimer()Stanislaw Gruszka
When the process exits we don't have to run new cputimer nor use running one (as it not accounts when tsk->exit_state != 0) to get process CPU times. As there is only one thread we can just use CPU times fields from task and signal structs. Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Roland McGrath <roland@redhat.com> Cc: Vitaly Mayatskikh <vmayatsk@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-08printk: Fix "printk: Enable the use of more than one CON_BOOT (early console)"Sonic Zhang
Don't return when we find the first bootconsole - it can leave other bootconsoles still installed, and they can be used and cause problems later (if they are in the init section, and eventually released), and cause problems. Make sure we remove all of them. Signed-off-by: Sonic Zhang <Sonic.Zhang@analog.com> Signed-off-by: Robin Getz <rgetz@analog.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-08tracing/filters: Don't use pred on alloc failureTom Zanussi
Dan Carpenter sent me a fix to prevent pred from being used if it couldn't be allocated. This updates his patch for the same problem in the tracing tree (which has changed this code quite substantially). Reported-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Li Zefan <lizf@cn.fujitsu.com> LKML-Reference: <1249746576.6453.30.camel@tropicana> Signed-off-by: Ingo Molnar <mingo@elte.hu> The original report: create_logical_pred() could sometimes return NULL. It's a static checker complaining rather than problems at runtime...
2009-08-08tracing/filters: Always free pred on filter_add_subsystem_pred() failureTom Zanussi
If filter_add_subsystem_pred() fails due to ENOSPC or ENOMEM, the pred doesn't get freed, while as a side effect it does for other errors. Make it so the caller always frees the pred for any error. Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Li Zefan <lizf@cn.fujitsu.com> LKML-Reference: <1249746593.6453.32.camel@tropicana> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-08tracing/filters: Don't use pred on alloc failureTom Zanussi
Dan Carpenter sent me a fix to prevent pred from being used if it couldn't be allocated. I noticed the same problem also existed for the create_pred() case and added a fix for that. Reported-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Li Zefan <lizf@cn.fujitsu.com> LKML-Reference: <1249746549.6453.29.camel@tropicana> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-08x86/irq: Fix move_irq_desc() for nodes without ramYinghai Lu
Don't move it if target node is -1. Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4A785B5D.4070702@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-07Merge branch 'perfcounters-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf_counter: Fix double list iteration in per task precise stats perf: Auto-detect libelf perf symbol: Fix symbol parsing in certain cases: use the build-id as a symlink perf_counter/powerpc: Check oprofile_cpu_type for NULL before using it ftrace: Fix perf-tracepoint OOPS perf report: Add missing command line options to man page perf: Auto-detect libbfd perf report: Make --sort comm,dso,symbol the default
2009-08-07execve: must clear current->clear_child_tidEric Dumazet
While looking at Jens Rosenboom bug report (http://lkml.org/lkml/2009/7/27/35) about strange sys_futex call done from a dying "ps" program, we found following problem. clone() syscall has special support for TID of created threads. This support includes two features. One (CLONE_CHILD_SETTID) is to set an integer into user memory with the TID value. One (CLONE_CHILD_CLEARTID) is to clear this same integer once the created thread dies. The integer location is a user provided pointer, provided at clone() time. kernel keeps this pointer value into current->clear_child_tid. At execve() time, we should make sure kernel doesnt keep this user provided pointer, as full user memory is replaced by a new one. As glibc fork() actually uses clone() syscall with CLONE_CHILD_SETTID and CLONE_CHILD_CLEARTID set, chances are high that we might corrupt user memory in forked processes. Following sequence could happen: 1) bash (or any program) starts a new process, by a fork() call that glibc maps to a clone( ... CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID ...) syscall 2) When new process starts, its current->clear_child_tid is set to a location that has a meaning only in bash (or initial program) context (&THREAD_SELF->tid) 3) This new process does the execve() syscall to start a new program. current->clear_child_tid is left unchanged (a non NULL value) 4) If this new program creates some threads, and initial thread exits, kernel will attempt to clear the integer pointed by current->clear_child_tid from mm_release() : if (tsk->clear_child_tid && !(tsk->flags & PF_SIGNALED) && atomic_read(&mm->mm_users) > 1) { u32 __user * tidptr = tsk->clear_child_tid; tsk->clear_child_tid = NULL; /* * We don't check the error code - if userspace has * not set up a proper pointer then tough luck. */ << here >> put_user(0, tidptr); sys_futex(tidptr, FUTEX_WAKE, 1, NULL, NULL, 0); } 5) OR : if new program is not multi-threaded, but spied by /proc/pid users (ps command for example), mm_users > 1, and the exiting program could corrupt 4 bytes in a persistent memory area (shm or memory mapped file) If current->clear_child_tid points to a writeable portion of memory of the new program, kernel happily and silently corrupts 4 bytes of memory, with unexpected effects. Fix is straightforward and should not break any sane program. Reported-by: Jens Rosenboom <jens@mcbone.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sonny Rao <sonnyrao@us.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ulrich Drepper <drepper@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-07generic-ipi: fix hotplug_cfd()Xiao Guangrong
Use CONFIG_HOTPLUG_CPU, not CONFIG_CPU_HOTPLUG When hot-unpluging a cpu, it will leak memory allocated at cpu hotplug, but only if CPUMASK_OFFSTACK=y, which is default to n. The bug was introduced by 8969a5ede0f9e17da4b943712429aef2c9bcd82b ("generic-ipi: remove kmalloc()"). Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-07ring-buffer: Fix memleak in ring_buffer_free()Eric Dumazet
I noticed oprofile memleaked in linux-2.6 current tree, and tracked this ring-buffer leak. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> LKML-Reference: <4A7C06B9.2090302@gmail.com> Cc: stable@kernel.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-08-07lockdep: Fix file mode of lock_statLi Zefan
/proc/lock_stat is writable. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <4A7BE7B6.10904@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-06perf_counter: Fix double list iteration in per task precise statsPeter Zijlstra
Brice Goglin reported this crash with per task precise stats: > I finally managed to test the threaded perfcounter statistics (thanks a > lot for implementing it). I am running 2.6.31-rc5 (with the AMD > magny-cours patches but I don't think they matter here). I am trying to > measure local/remote memory accesses per thread during the well-known > stream benchmark. It's compiled with OpenMP using 16 threads on a > quad-socket quad-core barcelona machine. > > Command line is: > /mnt/scratch/bgoglin/cpunode/linux-2.6.31/tools/perf/perf record -f -s > -e r1000001e0 -e r1000002e0 -e r1000004e0 -e r1000008e0 ./stream > > It seems to work fine with a single -e <counter> on the command line > while it crashes when there are at least 2 of them. > It seems to work fine without -s as well. A silly copy-paste resulted in a messed up iteration which would cause the OOPS. Reported-by: Brice Goglin <Brice.Goglin@inria.fr> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Brice Goglin <Brice.Goglin@inria.fr> LKML-Reference: <1249574786.32113.550.camel@twins> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-06ring-buffer: Fix advance of reader in rb_buffer_peek()Robert Richter
When calling rb_buffer_peek() from ring_buffer_consume() and a padding event is returned, the function rb_advance_reader() is called twice. This may lead to missing samples or under high workloads to the warning below. This patch fixes this. If a padding event is returned by rb_buffer_peek() it will be consumed by the calling function now. Also, I simplified some code in ring_buffer_consume(). ------------[ cut here ]------------ WARNING: at /dev/shm/.source/linux/kernel/trace/ring_buffer.c:2289 rb_advance_reader+0x2e/0xc5() Hardware name: Anaheim Modules linked in: Pid: 29, comm: events/2 Tainted: G W 2.6.31-rc3-oprofile-x86_64-standard-00059-g5050dc2 #1 Call Trace: [<ffffffff8106776f>] ? rb_advance_reader+0x2e/0xc5 [<ffffffff81039ffe>] warn_slowpath_common+0x77/0x8f [<ffffffff8103a025>] warn_slowpath_null+0xf/0x11 [<ffffffff8106776f>] rb_advance_reader+0x2e/0xc5 [<ffffffff81068bda>] ring_buffer_consume+0xa0/0xd2 [<ffffffff81326933>] op_cpu_buffer_read_entry+0x21/0x9e [<ffffffff810be3af>] ? __find_get_block+0x4b/0x165 [<ffffffff8132749b>] sync_buffer+0xa5/0x401 [<ffffffff810be3af>] ? __find_get_block+0x4b/0x165 [<ffffffff81326c1b>] ? wq_sync_buffer+0x0/0x78 [<ffffffff81326c76>] wq_sync_buffer+0x5b/0x78 [<ffffffff8104aa30>] worker_thread+0x113/0x1ac [<ffffffff8104dd95>] ? autoremove_wake_function+0x0/0x38 [<ffffffff8104a91d>] ? worker_thread+0x0/0x1ac [<ffffffff8104dc9a>] kthread+0x88/0x92 [<ffffffff8100bdba>] child_rip+0xa/0x20 [<ffffffff8104dc12>] ? kthread+0x0/0x92 [<ffffffff8100bdb0>] ? child_rip+0x0/0x20 ---[ end trace f561c0a58fcc89bd ]--- Cc: Steven Rostedt <rostedt@goodmis.org> Cc: <stable@kernel.org> Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-06tracing/events: Only define remove_subsystem_dir() if CONFIG_MODULESFrederic Weisbecker
If we disable modules, we get the following warning in ftrace events file: kernel/trace/trace_events.c:912: attention : ‘remove_subsystem_dir’ defined but not used remove_subystem_dir() is useless if !CONFIG_MODULES, then move it to the appropriate #ifdef section of trace_events.c Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org>
2009-08-06tracing/function-graph-tracer: Move graph event insertion helpers in the ↵Frederic Weisbecker
graph tracer file The function graph events helpers which insert the function entry and return events into the ring buffer currently reside in trace.c But this file is quite overloaded and the right place for these helpers is in the function graph tracer file. Then move them to trace_functions_graph.c Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org>
2009-08-06tracing: Move sched event insertion helpers in the sched switch tracer fileFrederic Weisbecker
The sched events helpers which insert the sched switch and wakeup events into the ring buffer currently reside in trace.c But this file is quite overloaded and the right place for these helpers is in the sched switch tracer file. Then move them to trace_functions.c Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org>
2009-08-06tracing/core: Make the stack entry helpers globalFrederic Weisbecker
Make the stacktrace event insertion helpers globals. This has two effects: - Prepare for moving the sched events insertion helpers to the sched switch tracer file. - Move some ifdef outside function definitions Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org>
2009-08-06tracing/core: Turn ftrace_cpu_disabled into a global varFrederic Weisbecker
In order to prepare the moving of the function graph tracer insertion helpers from trace.c to trace_functions_graph.c, we need to export the ftrace_cpu_disabled variable. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org>
2009-08-06tracing: Simplify print_graph_cpu()Lai Jiangshan
print_graph_cpu() is little over-designed. And "log10_all" may be wrong when there are holes in cpu_online_mask: the max online cpu id > cpumask_weight(cpu_online_mask) So change it by using a static column length for the cpu matching nr_cpu_ids number of decimal characters. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <4A6EEE5E.2000001@cn.fujitsu.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-08-06ftrace: Fix perf-tracepoint OOPSPeter Zijlstra
Not all tracepoints are created equal, in specific the ftrace tracepoints are created with TRACE_EVENT_FORMAT() which does not generate the needed bits to tie them into perf counters. For those events, don't create the 'id' file and fail ->profile_enable when their ID is specified through other means. Reported-by: Chris Mason <chris.mason@oracle.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <1249497664.5890.4.camel@laptop> [ v2: fix build error in the !CONFIG_EVENT_PROFILE case ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-06rtmutex: Avoid deadlock in rt_mutex_start_proxy_lock()Darren Hart
In the event of a lock steal or owner died, rt_mutex_start_proxy_lock() will give the rt_mutex to the waiting task, but it fails to release the wait_lock. This leads to subsequent deadlocks when other tasks try to acquire the rt_mutex. I also removed a few extra blank lines that really spaced this routine out. I must have been high on the \n when I wrote this originally... Signed-off-by: Darren Hart <dvhltc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Dinakar Guniguntala <dino@in.ibm.com> Cc: John Stultz <johnstul@linux.vnet.ibm.com> LKML-Reference: <4A79D7F1.4000405@us.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-05ring-buffer: do not disable ring buffer on oops_in_progressSteven Rostedt
The commit: commit e0fdace10e75dac67d906213b780ff1b1a4cc360 Author: David Miller <davem@davemloft.net> Date: Fri Aug 1 01:11:22 2008 -0700 debug_locks: set oops_in_progress if we will log messages. Otherwise lock debugging messages on runqueue locks can deadlock the system due to the wakeups performed by printk(). Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ingo Molnar <mingo@elte.hu> Will permanently set oops_in_progress on any lockdep failure. When this triggers it will cause any read from the ring buffer to permanently disable the ring buffer (not to mention no locking of printk). This patch removes the check. It keeps the print in NMI which makes sense. This is probably OK, since the ring buffer should not cause something to set oops_in_progress anyway. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-08-05ring-buffer: fix check of try_to_discard resultSteven Rostedt
The function ring_buffer_discard_commit inversed the code path of the result of try_to_discard. It should skip incrementing the entry counter if try_to_discard succeeded. But instead, it increments the entry conder if it succeeded to discard, and does not increment it if it fails. The result of this bug is that filtering will make the stat counters incorrect. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-08-06Security/SELinux: seperate lsm specific mmap_min_addrEric Paris
Currently SELinux enforcement of controls on the ability to map low memory is determined by the mmap_min_addr tunable. This patch causes SELinux to ignore the tunable and instead use a seperate Kconfig option specific to how much space the LSM should protect. The tunable will now only control the need for CAP_SYS_RAWIO and SELinux permissions will always protect the amount of low memory designated by CONFIG_LSM_MMAP_MIN_ADDR. This allows users who need to disable the mmap_min_addr controls (usual reason being they run WINE as a non-root user) to do so and still have SELinux controls preventing confined domains (like a web server) from being able to map some area of low memory. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2009-08-06Merge branch 'master' into nextJames Morris
2009-08-04Merge branch 'perfcounters-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf_counter: Set the CONFIG_PERF_COUNTERS default to y if CONFIG_PROFILING=y perf: Fix read buffer overflow perf top: Add mwait_idle_with_hints to skip_symbols[] perf tools: Fix faulty check perf report: Update for the new FORK/EXIT events perf_counter: Full task tracing perf_counter: Collapse inherit on read() tracing, perf_counter: Add help text to CONFIG_EVENT_PROFILE perf_counter tools: Fix link errors with older toolchains
2009-08-04Merge branch 'sched-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: Fix race in cpupri introduced by cpumask_var changes sched: Fix latencytop and sleep profiling vs group scheduling
2009-08-04Merge branch 'timers-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: posix-timers: Fix oops in clock_nanosleep() with CLOCK_MONOTONIC_RAW
2009-08-04Merge branch 'tracing-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: tracing: Fix missing function_graph events when we splice_read from trace_pipe tracing: Fix invalid function_graph entry trace: stop tracer in oops_enter() ftrace: Only update $offset when we update $ref_func ftrace: Fix the conditional that updates $ref_func tracing: only truncate ftrace files when O_TRUNC is set tracing: show proper address for trace-printk format
2009-08-04timers: Cache __next_timer_interrupt resultMartin Schwidefsky
Each time a cpu goes to sleep on a NOHZ=y system the timer wheel is searched for the next timer interrupt. It can take quite a few cycles to find the next pending timer. This patch adds a field to tvec_base that caches the result of __next_timer_interrupt. The hit ratio is around 80% on my thinkpad under normal use, on a server I've seen hit ratios from 5% to 95% dependent on the workload. -v2: jiffies wrap fixes Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Cc: Venki Pallipadi <venkatesh.pallipadi@intel.com> LKML-Reference: <20090721202505.7d56a079@skybase> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-04irq: Clean up by removing irqfixup MODULE_PARM_DESC()Jiri Slaby
It's wrong (the parm takes no arguments) and compile-time wiped away anyway (due to !MODULE). Signed-off-by: Jiri Slaby <jirislaby@gmail.com> LKML-Reference: <1248991326-26267-1-git-send-email-jirislaby@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-04futex: Correct futex_wait_requeue_pi() commentaryDarren Hart
The state machine described in the comments wasn't updated with a follow-on fix. Address that and cleanup the corresponding commentary in the function. Signed-off-by: Darren Hart <dvhltc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <4A737C2A.9090001@us.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-04workqueues: Improve schedule_work() documentationBart Van Assche
Two important aspects of the schedule_work() function are not yet documented: - that it is allowed to pass a struct work_struct * to this function that is already on the kernel-global workqueue; - the meaning of its return value. The patch below documents both aspects. Signed-off-by: Bart Van Assche <bart.vanassche@gmail.com> Cc: "Greg Kroah-Hartman" <gregkh@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <200907301900.54202.bart.vanassche@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-04Merge branch 'tracing/fixes' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into tracing/urgent
2009-08-04posix-timers: Fix oops in clock_nanosleep() with CLOCK_MONOTONIC_RAWHiroshi Shimamoto
Prevent calling do_nanosleep() with clockid CLOCK_MONOTONIC_RAW, it may cause oops, such as NULL pointer dereference. Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <johnstul@us.ibm.com> Cc: <stable@kernel.org> LKML-Reference: <4A764FF3.50607@ct.jp.nec.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-03cputime: Optimize jiffies_to_cputime(1)Stanislaw Gruszka
For powerpc with CONFIG_VIRT_CPU_ACCOUNTING jiffies_to_cputime(1) is not compile time constant and run time calculations are quite expensive. To optimize we use precomputed value. For all other architectures is is preprocessor definition. Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> LKML-Reference: <1248862529-6063-5-git-send-email-sgruszka@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-03itimers: Simplify arm_timer() code a bitStanislaw Gruszka
Don't update values in expiration cache when new ones are equal. Add expire_le() and expire_gt() helpers to simplify the code. Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> LKML-Reference: <1248862529-6063-4-git-send-email-sgruszka@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-03itimers: Fix periodic tics precisionStanislaw Gruszka
Measure ITIMER_PROF and ITIMER_VIRT timers interval error between real ticks and requested by user. Take it into account when scheduling next tick. This patch introduce possibility where time between two consecutive tics is smaller then requested interval, it preserve however dependency that n tick is generated not earlier than n*interval time - counting from the beginning of periodic signal generation. Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> LKML-Reference: <1248862529-6063-3-git-send-email-sgruszka@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-03itimers: Merge ITIMER_VIRT and ITIMER_PROFStanislaw Gruszka
Both cpu itimers have same data flow in the few places, this patch make unification of code related with VIRT and PROF itimers. Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> LKML-Reference: <1248862529-6063-2-git-send-email-sgruszka@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>