summaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)Author
2009-12-15perf_events: Fix perf_event_attr layoutPeter Zijlstra
The miss-alignment of bp_addr created a 32bit hole, causing different structure packings on 32 and 64 bit machines. Fix that by moving __reserve_2 into that hole. Further, remove the useless struct and redundant __bp_reserve muck. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1260902591.8023.781.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-15Merge branch 'core-locking-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (26 commits) clockevents: Convert to raw_spinlock clockevents: Make tick_device_lock static debugobjects: Convert to raw_spinlocks perf_event: Convert to raw_spinlock hrtimers: Convert to raw_spinlocks genirq: Convert irq_desc.lock to raw_spinlock smp: Convert smplocks to raw_spinlocks rtmutes: Convert rtmutex.lock to raw_spinlock sched: Convert pi_lock to raw_spinlock sched: Convert cpupri lock to raw_spinlock sched: Convert rt_runtime_lock to raw_spinlock sched: Convert rq->lock to raw_spinlock plist: Make plist debugging raw_spinlock aware bkl: Fixup core_lock fallout locking: Cleanup the name space completely locking: Further name space cleanups alpha: Fix fallout from locking changes locking: Implement new raw_spinlock locking: Convert raw_rwlock functions to arch_rwlock locking: Convert raw_rwlock to arch_rwlock ...
2009-12-15tree-wide: convert open calls to remove spaces to skip_spaces() lib functionAndré Goddard Rosa
Makes use of skip_spaces() defined in lib/string.c for removing leading spaces from strings all over the tree. It decreases lib.a code size by 47 bytes and reuses the function tree-wide: text data bss dec hex filename 64688 584 592 65864 10148 (TOTALS-BEFORE) 64641 584 592 65817 10119 (TOTALS-AFTER) Also, while at it, if we see (*str && isspace(*str)), we can be sure to remove the first condition (*str) as the second one (isspace(*str)) also evaluates to 0 whenever *str == 0, making it redundant. In other words, "a char equals zero is never a space". Julia Lawall tried the semantic patch (http://coccinelle.lip6.fr) below, and found occurrences of this pattern on 3 more files: drivers/leds/led-class.c drivers/leds/ledtrig-timer.c drivers/video/output.c @@ expression str; @@ ( // ignore skip_spaces cases while (*str && isspace(*str)) { \(str++;\|++str;\) } | - *str && isspace(*str) ) Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Cc: Julia Lawall <julia@diku.dk> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Neil Brown <neilb@suse.de> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Cc: David Howells <dhowells@redhat.com> Cc: <linux-ext4@vger.kernel.org> Cc: Samuel Ortiz <samuel@sortiz.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15vt: introduce and use vt_kmsg_redirect() functionBernhard Walle
The kernel offers with TIOCL_GETKMSGREDIRECT ioctl() the possibility to redirect the kernel messages to a specific console. However, since it's not possible to switch to the kernel message console after a panic(), it would be nice if the kernel would print the panic message on the current console. This patch series adds a new interface to access the global kmsg_redirect variable by a function to be able to use it in code where CONFIG_VT_CONSOLE is not set (kernel/panic.c). This patch: Instead of using and exporting a global value kmsg_redirect, introduce a function vt_kmsg_redirect() that both can set and return the console where messages are printed. Change all users of kmsg_redirect (the VT code itself and kernel/power.c) to the new interface. The main advantage is that vt_kmsg_redirect() can also be used when CONFIG_VT_CONSOLE is not set. Signed-off-by: Bernhard Walle <bernhard@bwalle.de> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15kernel/sys.c: fix "warning: do-while statement is not a compound statement" ↵H Hartley Sweeten
noise do_each_thread/while_each_thread wrap a block of code that is in this format: for (...) do ... while If curly braces do not surround the inner loop the following warning is generated by sparse: warning: do-while statement is not a compound statement Fix the warning by adding the braces. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15generic-ipi: cleanup for generic_smp_call_function_interrupt()Xiao Guangrong
Use smp_processor_id() instead of get_cpu() and put_cpu() in generic_smp_call_function_interrupt(), It's no need to disable preempt, because we must call generic_smp_call_function_interrupt() with interrupts disabled. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15'sysctl_max_map_count' should be non-negativeAmerigo Wang
Jan Engelhardt reported we have this problem: setting max_map_count to a value large enough results in programs dying at first try. This is on 2.6.31.6: 15:59 borg:/proc/sys/vm # echo $[1<<31-1] >max_map_count 15:59 borg:/proc/sys/vm # cat max_map_count 1073741824 15:59 borg:/proc/sys/vm # echo $[1<<31] >max_map_count 15:59 borg:/proc/sys/vm # cat max_map_count Killed This is because we have a chance to make 'max_map_count' negative. but it's meaningless. Make it only accept non-negative values. Reported-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: WANG Cong <amwang@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: James Morris <jmorris@namei.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15hugetlb: derive huge pages nodes allowed from task mempolicyLee Schermerhorn
This patch derives a "nodes_allowed" node mask from the numa mempolicy of the task modifying the number of persistent huge pages to control the allocation, freeing and adjusting of surplus huge pages when the pool page count is modified via the new sysctl or sysfs attribute "nr_hugepages_mempolicy". The nodes_allowed mask is derived as follows: * For "default" [NULL] task mempolicy, a NULL nodemask_t pointer is produced. This will cause the hugetlb subsystem to use node_online_map as the "nodes_allowed". This preserves the behavior before this patch. * For "preferred" mempolicy, including explicit local allocation, a nodemask with the single preferred node will be produced. "local" policy will NOT track any internode migrations of the task adjusting nr_hugepages. * For "bind" and "interleave" policy, the mempolicy's nodemask will be used. * Other than to inform the construction of the nodes_allowed node mask, the actual mempolicy mode is ignored. That is, all modes behave like interleave over the resulting nodes_allowed mask with no "fallback". See the updated documentation [next patch] for more information about the implications of this patch. Examples: Starting with: Node 0 HugePages_Total: 0 Node 1 HugePages_Total: 0 Node 2 HugePages_Total: 0 Node 3 HugePages_Total: 0 Default behavior [with or without this patch] balances persistent hugepage allocation across nodes [with sufficient contiguous memory]: sysctl vm.nr_hugepages[_mempolicy]=32 yields: Node 0 HugePages_Total: 8 Node 1 HugePages_Total: 8 Node 2 HugePages_Total: 8 Node 3 HugePages_Total: 8 Of course, we only have nr_hugepages_mempolicy with the patch, but with default mempolicy, nr_hugepages_mempolicy behaves the same as nr_hugepages. Applying mempolicy--e.g., with numactl [using '-m' a.k.a. '--membind' because it allows multiple nodes to be specified and it's easy to type]--we can allocate huge pages on individual nodes or sets of nodes. So, starting from the condition above, with 8 huge pages per node, add 8 more to node 2 using: numactl -m 2 sysctl vm.nr_hugepages_mempolicy=40 This yields: Node 0 HugePages_Total: 8 Node 1 HugePages_Total: 8 Node 2 HugePages_Total: 16 Node 3 HugePages_Total: 8 The incremental 8 huge pages were restricted to node 2 by the specified mempolicy. Similarly, we can use mempolicy to free persistent huge pages from specified nodes: numactl -m 0,1 sysctl vm.nr_hugepages_mempolicy=32 yields: Node 0 HugePages_Total: 4 Node 1 HugePages_Total: 4 Node 2 HugePages_Total: 16 Node 3 HugePages_Total: 8 The 8 huge pages freed were balanced over nodes 0 and 1. [rientjes@google.com: accomodate reworked NODEMASK_ALLOC] Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Acked-by: Mel Gorman <mel@csn.ul.ie> Reviewed-by: Andi Kleen <andi@firstfloor.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: Nishanth Aravamudan <nacc@us.ibm.com> Cc: Adam Litke <agl@us.ibm.com> Cc: Andy Whitcroft <apw@canonical.com> Cc: Eric Whitney <eric.whitney@hp.com> Cc: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15bsdacct: fix uid/gid misreportingAlexey Dobriyan
commit d8e180dcd5bbbab9cd3ff2e779efcf70692ef541 "bsdacct: switch credentials for writing to the accounting file" introduced credential switching during final acct data collecting. However, uid/gid pair continued to be collected from current which became credentials of who created acct file, not who exits. Addresses http://bugzilla.kernel.org/show_bug.cgi?id=14676 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reported-by: Juho K. Juopperi <jkj@kapsi.fi> Acked-by: Serge Hallyn <serue@us.ibm.com> Acked-by: David Howells <dhowells@redhat.com> Reviewed-by: Michal Schmidt <mschmidt@redhat.com> Cc: James Morris <jmorris@namei.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15perf_event: Fix incorrect range check on cpu numberPaul Mackerras
It is quite legitimate for CPUs to be numbered sparsely, meaning that it possible for an online CPU to have a number which is greater than the total count of possible CPUs. Currently find_get_context() has a sanity check on the cpu number where it checks it against num_possible_cpus(). This test can fail for a legitimate cpu number if the cpu_possible_mask is sparsely populated. This fixes the problem by checking the CPU number against nr_cpumask_bits instead, since that is the appropriate check to ensure that the cpu number is same to pass to cpu_isset() subsequently. Reported-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org> Tested-by: Michael Neuling <mikey@neuling.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: <stable@kernel.org> LKML-Reference: <20091215084032.GA18661@brick.ozlabs.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-15sched: Fix cpu_clock() in NMIs, on !CONFIG_HAVE_UNSTABLE_SCHED_CLOCKDavid Miller
Relax stable-sched-clock architectures to not save/disable/restore hardirqs in cpu_clock(). The background is that I was trying to resolve a sparc64 perf issue when I discovered this problem. On sparc64 I implement pseudo NMIs by simply running the kernel at IRQ level 14 when local_irq_disable() is called, this allows performance counter events to still come in at IRQ level 15. This doesn't work if any code in an NMI handler does local_irq_save() or local_irq_disable() since the "disable" will kick us back to cpu IRQ level 14 thus letting NMIs back in and we recurse. The only path which that does that in the perf event IRQ handling path is the code supporting frequency based events. It uses cpu_clock(). cpu_clock() simply invokes sched_clock() with IRQs disabled. And that's a fundamental bug all on it's own, particularly for the HAVE_UNSTABLE_SCHED_CLOCK case. NMIs can thus get into the sched_clock() code interrupting the local IRQ disable code sections of it. Furthermore, for the not-HAVE_UNSTABLE_SCHED_CLOCK case, the IRQ disabling done by cpu_clock() is just pure overhead and completely unnecessary. So the core problem is that sched_clock() is not NMI safe, but we are invoking it from NMI contexts in the perf events code (via cpu_clock()). A less important issue is the overhead of IRQ disabling when it isn't necessary in cpu_clock(). CONFIG_HAVE_UNSTABLE_SCHED_CLOCK architectures are not affected by this patch. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <20091213.182502.215092085.davem@davemloft.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-15tracing: Fix return of trace_dump_stack()Steven Rostedt
The trace_dump_stack() returned a value for a void function. Also, added the missing stub for trace_dump_stack() when tracing is not configured. Reported-by: Ingo Molnar <mingo@elte.hu> LKML-Reference: <20091214162713.GA31060@elte.hu> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-15module: handle ppc64 relocating kcrctabs when CONFIG_RELOCATABLE=yRusty Russell
powerpc applies relocations to the kcrctab. They're absolute symbols, but it's not completely unreasonable: other archs may too, but the relocation is often 0. http://lists.ozlabs.org/pipermail/linuxppc-dev/2009-November/077972.html Inspired-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Tested-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Paul Mackerras <paulus@samba.org>
2009-12-14clockevents: Convert to raw_spinlockThomas Gleixner
Convert locks which cannot be sleeping locks in preempt-rt to raw_spinlocks. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Ingo Molnar <mingo@elte.hu>
2009-12-14clockevents: Make tick_device_lock staticThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Ingo Molnar <mingo@elte.hu>
2009-12-14perf_event: Convert to raw_spinlockThomas Gleixner
Convert locks which cannot be sleeping locks in preempt-rt to raw_spinlocks. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Ingo Molnar <mingo@elte.hu>
2009-12-14hrtimers: Convert to raw_spinlocksThomas Gleixner
Convert locks which cannot be sleeping locks in preempt-rt to raw_spinlocks. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Ingo Molnar <mingo@elte.hu>
2009-12-14genirq: Convert irq_desc.lock to raw_spinlockThomas Gleixner
Convert locks which cannot be sleeping locks in preempt-rt to raw_spinlocks. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Ingo Molnar <mingo@elte.hu>
2009-12-14smp: Convert smplocks to raw_spinlocksThomas Gleixner
Convert locks which cannot be sleeping locks in preempt-rt to raw_spinlocks. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Ingo Molnar <mingo@elte.hu>
2009-12-14rtmutes: Convert rtmutex.lock to raw_spinlockThomas Gleixner
Convert locks which cannot be sleeping locks in preempt-rt to raw_spinlocks. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Ingo Molnar <mingo@elte.hu>
2009-12-14sched: Convert pi_lock to raw_spinlockThomas Gleixner
Convert locks which cannot be sleeping locks in preempt-rt to raw_spinlocks. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Ingo Molnar <mingo@elte.hu>
2009-12-14sched: Convert cpupri lock to raw_spinlockThomas Gleixner
Convert locks which cannot be sleeping locks in preempt-rt to raw_spinlocks. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Ingo Molnar <mingo@elte.hu>
2009-12-14sched: Convert rt_runtime_lock to raw_spinlockThomas Gleixner
Convert locks which cannot be sleeping locks in preempt-rt to raw_spinlocks. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Ingo Molnar <mingo@elte.hu>
2009-12-14sched: Convert rq->lock to raw_spinlockThomas Gleixner
Convert locks which cannot be sleeping locks in preempt-rt to raw_spinlocks. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Ingo Molnar <mingo@elte.hu>
2009-12-14plist: Make plist debugging raw_spinlock awareThomas Gleixner
plists are used with spinlocks and raw_spinlocks. Change the plist debugging to handle both types. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Ingo Molnar <mingo@elte.hu>
2009-12-14locking: Cleanup the name space completelyThomas Gleixner
Make the name space hierarchy of locking functions consistent: raw_spin* -> _raw_spin* -> __raw_spin* No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Ingo Molnar <mingo@elte.hu>
2009-12-14locking: Further name space cleanupsThomas Gleixner
The name space hierarchy for the internal lock functions is now a bit backwards. raw_spin* functions map to _spin* which use __spin*, while we would like to have _raw_spin* and __raw_spin*. _raw_spin* is already used by lock debugging, so rename those funtions to do_raw_spin* to free up the _raw_spin* name space. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Ingo Molnar <mingo@elte.hu>
2009-12-14locking: Implement new raw_spinlockThomas Gleixner
Now that the raw_spin name space is freed up, we can implement raw_spinlock and the related functions which are used to annotate the locks which are not converted to sleeping spinlocks in preempt-rt. A side effect is that only such locks can be used with the low level lock fsunctions which circumvent lockdep. For !rt spin_* functions are mapped to the raw_spin* implementations. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Ingo Molnar <mingo@elte.hu>
2009-12-14locking: Convert __raw_spin* functions to arch_spin*Thomas Gleixner
Name space cleanup. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: linux-arch@vger.kernel.org
2009-12-14locking: Rename __RAW_SPIN_LOCK_UNLOCKED to __ARCH_SPIN_LOCK_UNLOCKEDThomas Gleixner
Further name space cleanup. No functional change Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: linux-arch@vger.kernel.org
2009-12-14locking: Convert raw_spinlock to arch_spinlockThomas Gleixner
The raw_spin* namespace was taken by lockdep for the architecture specific implementations. raw_spin_* would be the ideal name space for the spinlocks which are not converted to sleeping locks in preempt-rt. Linus suggested to convert the raw_ to arch_ locks and cleanup the name space instead of using an artifical name like core_spin, atomic_spin or whatever No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: linux-arch@vger.kernel.org
2009-12-14locking: Reorder functions in spinlock.cThomas Gleixner
Separate spin_lock and rw_lock functions. Preempt-RT needs to exclude the rw_lock functions from being compiled. The reordering allows to do that with a single #ifdef. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Ingo Molnar <mingo@elte.hu>
2009-12-14Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (34 commits) m68k: rename global variable vmalloc_end to m68k_vmalloc_end percpu: add missing per_cpu_ptr_to_phys() definition for UP percpu: Fix kdump failure if booted with percpu_alloc=page percpu: make misc percpu symbols unique percpu: make percpu symbols in ia64 unique percpu: make percpu symbols in powerpc unique percpu: make percpu symbols in x86 unique percpu: make percpu symbols in xen unique percpu: make percpu symbols in cpufreq unique percpu: make percpu symbols in oprofile unique percpu: make percpu symbols in tracer unique percpu: make percpu symbols under kernel/ and mm/ unique percpu: remove some sparse warnings percpu: make alloc_percpu() handle array types vmalloc: fix use of non-existent percpu variable in put_cpu_var() this_cpu: Use this_cpu_xx in trace_functions_graph.c this_cpu: Use this_cpu_xx for ftrace this_cpu: Use this_cpu_xx in nmi handling this_cpu: Use this_cpu operations in RCU this_cpu: Use this_cpu ops for VM statistics ... Fix up trivial (famous last words) global per-cpu naming conflicts in arch/x86/kvm/svm.c mm/slab.c
2009-12-14Merge branch 'tip/tracing/core' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/urgent
2009-12-14sched: Use rcu in sched_get_rr_param()Thomas Gleixner
read_lock(&tasklist_lock) does not protect sys_sched_get_rr_param() against a concurrent update of the policy or scheduler parameters as do_sched_scheduler() does not take the tasklist_lock. The access to task->sched_class->get_rr_interval is protected by task_rq_lock(task). Use rcu_read_lock() to protect find_task_by_vpid() and prevent the task struct from going away. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20091209100706.862897167@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-14sched: Use rcu in sched_get/set_affinity()Thomas Gleixner
tasklist_lock is held read locked to protect the find_task_by_vpid() call and to prevent the task going away. sched_setaffinity acquires a task struct ref and drops tasklist lock right away. The access to the cpus_allowed mask is protected by rq->lock. rcu_read_lock() provides the same protection here. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20091209100706.789059966@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-14sched: Use rcu in sys_sched_getscheduler/sys_sched_getparam()Thomas Gleixner
read_lock(&tasklist_lock) does not protect sys_sched_getscheduler and sys_sched_getparam() against a concurrent update of the policy or scheduler parameters as do_sched_setscheduler() does not take the tasklist_lock. The accessed integers can be retrieved w/o locking and are snapshots anyway. Using rcu_read_lock() to protect find_task_by_vpid() and prevent the task struct from going away is not changing the above situation. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20091209100706.753790977@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-14Merge branch 'linus' into tracing/urgentIngo Molnar
Conflicts: kernel/trace/trace_kprobe.c Merge reason: resolve the conflict. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-13ksym_tracer: Fix bad castLi Zefan
Fix this warning: kernel/trace/trace_ksym.c: In function 'ksym_trace_filter_read': kernel/trace/trace_ksym.c:239: warning: cast to pointer from integer of different size Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: "K.Prasad" <prasad@linux.vnet.ibm.com> LKML-Reference: <4B1DC578.9020909@cn.fujitsu.com> [remove the strstrip fix as tglx already fixed that] Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-12-13tracing/power: Remove two exportsLi Zefan
trace_power_start and trace_power_end are used in arch/x86/kernel/power.c, and this file can't be compiled as a module, so these two tracepoints don't need to be exported. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <4B1DC55F.7060305@cn.fujitsu.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-12-13tracing: Change event->profile_count to be int typeLi Zefan
Like total_profile_count, struct ftrace_event_call::profile_count is protected by event_mutex, so it doesn't need to be atomic_t. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Jason Baron <jbaron@redhat.com> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <4B1DC549.5010705@cn.fujitsu.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-12-13tracing: Simplify trace_option_write()Li Zefan
- remove duplicate code inside trace_options_write() - extract duplicate code in trace_options_write() and set_tracer_option() Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <4B1DC532.9010802@cn.fujitsu.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-12-13tracing: Remove useless trace optionLi Zefan
Since commit 4d9493c90f8e6e1b164aede3814010a290161abb ("ftrace: remove add-hoc code"), option "sched-tree" has become useless. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <4B1DC50A.7040402@cn.fujitsu.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-12-13tracing: Use seq file for trace_clockLi Zefan
The buffer for the output is as small as 64 bytes, so it'll overflow if we add more clock type. Use seq file instead. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <4B1DC4FB.5030407@cn.fujitsu.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-12-13tracing: Use seq file for trace_optionsLi Zefan
Code simplification for reading trace_options. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> LKML-reference: <4B1DC4EF.3090106@cn.fujitsu.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-12-13function-graph: Allow writing the same val to set_graph_functionLi Zefan
# echo 'do_open' > set_graph_function # echo 'do_open' >> set_graph_function bash: echo: write error: Invalid argument Make it valid to write the same value to set_graph_function, which is consistent with set_ftrace_filter interface. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> LKML-reference: <4B1DC4E1.1060303@cn.fujitsu.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-12-13ftrace: Call trace_parser_clear() properlyLi Zefan
I found a weird behavior: # echo 'fuse:*' > set_ftrace_filter bash: echo: write error: Invalid argument # cat set_ftrace_filter fuse_dev_fasync fuse_dev_poll fuse_copy_do We should call trace_parser_clear() no matter ftrace_process_regex() returns 0 or -errno, otherwise we will actually take the unaccepted records from ftrace_regex_release(). Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <4B1DC4D2.3000406@cn.fujitsu.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-12-13ftrace: Return EINVAL when writing invalid val to set_ftrace_filterLi Zefan
Currently it doesn't warn user on invald value: # echo nonexist_symbol > set_ftrace_filter or: # echo 'nonexist_symbol:mod:fuse' > set_ftrace_filter Better make it return failure. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <4B1DC4BF.2070003@cn.fujitsu.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-12-13tracing: Move a printk out of ftrace_raw_reg_event_foo()Li Zefan
Move the printk from each ftrace_raw_reg_event_foo() to its caller ftrace_event_enable_disable(). This avoids each regfunc trace event callbacks to handle a same error report that can be carried from the caller. See how much space this saves: text data bss dec hex filename 5345151 1961864 7103260 14410275 dbe223 vmlinux.o.old 5331487 1961864 7103260 14396611 dbacc3 vmlinux.o Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Jason Baron <jbaron@redhat.com> LKML-Reference: <4B1DC4AC.802@cn.fujitsu.com> [start cmdline record before calling regfunc to avoid lost window of pid to comm resolution] Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-12-13tracing: Pull up calls to trace_define_common_fields()Li Zefan
Call trace_define_common_fields() in event_create_dir() only. This avoids trace events to handle it from their define_fields callbacks and shrinks the kernel code size: text data bss dec hex filename 5346802 1961864 7103260 14411926 dbe896 vmlinux.o.old 5345151 1961864 7103260 14410275 dbe223 vmlinux.o Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jason Baron <jbaron@redhat.com> Cc: Masami Hiramatsu <mhiramat@redhat.com> LKML-Reference: <4B1DC49C.8000107@cn.fujitsu.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>