Age | Commit message | Author
2010-03-11 | omap: Enable PM_RUNTIME in defconfigs to avoid USB compile errors | Tony Lindgren
While waiting for the related USB patch, fix compile by enabling it in the defconfigs. As discussed at: http://thread.gmane.org/gmane.linux.usb.general/27432/focus=4460

Otherwise we'll get errors like:

    drivers/usb/core/hcd.c:1892: error: 'pm_wq' undeclared (first use in this function)
    drivers/usb/core/hcd.c:1892: error: (Each undeclared identifier is reported only once
    drivers/usb/core/hcd.c:1892: error: for each function it appears in.)

Signed-off-by: Tony Lindgren <tony@atomide.com>
2010-03-11 | perf record: Don't try to find buildids in a zero sized file | Arnaldo Carvalho de Melo
Fixing this symptom:

    [acme@mica linux-2.6-tip]$ perf record -a -f
    Fatal: Permission error - are you root?
    Bus error
    [acme@mica linux-2.6-tip]$

I.e. if for some reason no data is collected, in this case a non root user trying to do systemwide profiling, no data will be collected, and then we end up trying to mmap a zero sized file and access the file header, b00m.

Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: <stable@kernel.org> LKML-Reference: <1268333592-30872-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
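The guard described in this commit can be sketched in userspace C as below. This is only an illustration of checking the file size before mapping and reading a header; `should_scan_buildids` is a hypothetical name, not the function the perf tool uses.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper: returns 1 when the file behind fp contains data and
 * is therefore safe to mmap and parse, 0 when it is empty or cannot be
 * stat'ed. A caller would skip the build-id pass when this returns 0.
 */
static int should_scan_buildids(FILE *fp)
{
	struct stat st;

	assert(fp != NULL);
	if (fstat(fileno(fp), &st) < 0)
		return 0;
	return st.st_size > 0;
}
```

Checking `st_size` first avoids touching a mapping of a zero sized file, which is the access that produced the bus error shown above.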
2010-03-11 | omap2: Update n8x0 defconfig to test multi-omap and DMA api changes | Tony Lindgren
Recent DMA API changes broke compile for tusb6010. While testing the fixes for tusb6010, I had to update the n8x0 defconfig quite a bit. Might as well merge it while at it to make it more usable as we're using this to test the multi-omap booting between V6 and V7 ARMs. Also, anybody using n8x0 with a current kernel will most likely want to mount root on the MMC instead of the onenand to keep the Maemo install intact. Enable I2C, REGULATOR, MMC, MFD, PM, and USB. Also change the root to /dev/mmcblk0p2 instead of the onenand. Signed-off-by: Tony Lindgren <tony@atomide.com>
2010-03-11 | omap2: add USB initialization for tusb6010 | Francisco Alecrim
Based on Kalle's and Tony's patches. Some variables re-organized and unused code removed. Signed-off-by: Kalle Valo <kalle.valo@iki.fi> Signed-off-by: Francisco Alecrim <francisco.alecrim@openbossa.org> [tony@atomide.com: this is needed to fix the related tusb6010 DMA API changes] Signed-off-by: Tony Lindgren <tony@atomide.com>
2010-03-11 | omap4: Fix build break by moving omap_smc1 into a separate .S | Santosh Shilimkar
This patch moves the omap_smc1 function to a separate omap44xx-smc.S file and sets the compile flags to -Wa,-march=armv7-a. This fix was suggested by Tony Lindgren <tony@atomide.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com> [tony@atomide.com: otherwise multi-omap build with V6 and V7 breaks] Signed-off-by: Tony Lindgren <tony@atomide.com>
2010-03-11 | omap2/3/4: ehci: avoid compiler error with touchbook | Felipe Balbi
The early_param() call in board-omap3touchbook.c expands to: static const char __setup_str_early_touchbook_revision[] __section(.init.rodata) _aligned(1) = tbr; [...] and we have a non-const variable being added to the same section: static struct ehci_hcd_omap_platform_data ehci_pdata __section(.init.rodata); Because of that, gcc generates a section type conflict, which can (and actually should) be avoided by marking const every variable marked with __initconst. This patch fixes that for ehci_hcd_omap_platform_data. Signed-off-by: Felipe Balbi <felipe.balbi@nokia.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
2010-03-11 | GFS2: Skip check for mandatory locks when unlocking | Sachin Prabhu
gfs2_lock() will skip locks on files which have their mode set to 02666. This is a problem in cases where the mode of the file is changed after a process has obtained a lock on the file. Such a lock will be skipped on unlock and will result in a BUG in locks_remove_flock(). gfs2_lock() should skip the check for mandatory locks when unlocking a file. Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
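A userspace sketch of the rule described here, under the assumption that "mandatory lock" means the conventional mode test (setgid set, group-execute clear, as in 02666). The names `mode_is_mandatory_lock` and `lock_request_allowed` are hypothetical and this is not the GFS2 code itself.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Conventional mandatory-lock marker: setgid bit set, group-execute clear. */
static int mode_is_mandatory_lock(mode_t mode)
{
	return (mode & (S_ISGID | S_IXGRP)) == S_ISGID;
}

/*
 * Hypothetical policy check: refuse new locks on mandatory-lock files, but
 * always let an unlock through so that a lock taken before the mode change
 * can still be released.
 */
static int lock_request_allowed(mode_t mode, short lock_type)
{
	assert(lock_type == F_RDLCK || lock_type == F_WRLCK ||
	       lock_type == F_UNLCK);

	if (mode_is_mandatory_lock(mode) && lock_type != F_UNLCK)
		return -ENOLCK;
	return 0;
}
```

With this shape, `lock_request_allowed(02666, F_WRLCK)` is rejected while `lock_request_allowed(02666, F_UNLCK)` succeeds, which is the asymmetry the commit message asks for.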
2010-03-11 | sched: Fix pick_next_highest_task_rt() for cgroups | Peter Zijlstra
Since pick_next_highest_task_rt() already iterates all the cgroups and is really only interested in tasks, skip over the !task entries. Reported-by: Dhaval Giani <dhaval.giani@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Dhaval Giani <dhaval.giani@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-11 | perf: export perf_trace_regs and perf_arch_fetch_caller_regs | Xiao Guangrong
Export perf_trace_regs and perf_arch_fetch_caller_regs since modules will use them. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> [ use EXPORT_PER_CPU_SYMBOL_GPL() ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4B989C1B.2090407@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-11 | perf, x86: Fix hw_perf_enable() event assignment | Peter Zijlstra
What happens is that we schedule badly like:

    <...>-1987 [019] 280.252808: x86_pmu_start: event-46/1300c0: idx: 0
    <...>-1987 [019] 280.252811: x86_pmu_start: event-47/1300c0: idx: 1
    <...>-1987 [019] 280.252812: x86_pmu_start: event-48/1300c0: idx: 2
    <...>-1987 [019] 280.252813: x86_pmu_start: event-49/1300c0: idx: 3
    <...>-1987 [019] 280.252814: x86_pmu_start: event-50/1300c0: idx: 32
    <...>-1987 [019] 280.252825: x86_pmu_stop: event-46/1300c0: idx: 0
    <...>-1987 [019] 280.252826: x86_pmu_stop: event-47/1300c0: idx: 1
    <...>-1987 [019] 280.252827: x86_pmu_stop: event-48/1300c0: idx: 2
    <...>-1987 [019] 280.252828: x86_pmu_stop: event-49/1300c0: idx: 3
    <...>-1987 [019] 280.252829: x86_pmu_stop: event-50/1300c0: idx: 32
    <...>-1987 [019] 280.252834: x86_pmu_start: event-47/1300c0: idx: 1
    <...>-1987 [019] 280.252834: x86_pmu_start: event-48/1300c0: idx: 2
    <...>-1987 [019] 280.252835: x86_pmu_start: event-49/1300c0: idx: 3
    <...>-1987 [019] 280.252836: x86_pmu_start: event-50/1300c0: idx: 32
    <...>-1987 [019] 280.252837: x86_pmu_start: event-51/1300c0: idx: 32 *FAIL*

This happens because we only iterate the n_running events in the first pass, and reset their index to -1 if they don't match to force a re-assignment. Now, in our RR example, n_running == 0 because we fully unscheduled, so event-50 will retain its idx==32, even though in scheduling it will have gotten idx=0, and we don't trigger the re-assign path. The easiest way to fix this is the below patch, which simply validates the full assignment in the second pass.

Reported-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1268311069.5037.31.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-11 | perf, ppc: Fix compile error due to new cpu notifiers | Peter Zijlstra
Fix:

    arch/powerpc/kernel/perf_event.c:1334: error: 'power_pmu_notifier' undeclared (first use in this function)
    arch/powerpc/kernel/perf_event.c:1334: error: (Each undeclared identifier is reported only once
    arch/powerpc/kernel/perf_event.c:1334: error: for each function it appears in.)
    arch/powerpc/kernel/perf_event.c:1334: error: implicit declaration of function 'power_pmu_notifier'
    arch/powerpc/kernel/perf_event.c:1334: error: implicit declaration of function 'register_cpu_notifier'

Due to commit 3f6da390 (perf: Rework and fix the arch CPU-hotplug hooks).

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-11 | NFS: Avoid a deadlock in nfs_release_page | Trond Myklebust
J.R. Okajima reports the following deadlock:

    INFO: task kswapd0:305 blocked for more than 120 seconds.
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    kswapd0 D 0000000000000001 0 305 2 0x00000000
    ffff88001f21d4f0 0000000000000046 ffff88001fdea680 ffff88001f21c000
    ffff88001f21dfd8 ffff88001f21c000 ffff88001f21dfd8 ffff88001f21dfd8
    ffff88001fdea040 0000000000014c00 0000000000000001 ffff88001fdea040
    Call Trace:
    [<ffffffff8146155d>] io_schedule+0x4d/0x70
    [<ffffffff810d2be5>] sync_page+0x65/0xa0
    [<ffffffff81461b12>] __wait_on_bit_lock+0x52/0xb0
    [<ffffffff810d2b80>] ? sync_page+0x0/0xa0
    [<ffffffff810d2b64>] __lock_page+0x64/0x70
    [<ffffffff81070ce0>] ? wake_bit_function+0x0/0x40
    [<ffffffff810df1d4>] truncate_inode_pages_range+0x344/0x4a0
    [<ffffffff810df340>] truncate_inode_pages+0x10/0x20
    [<ffffffff8112cbfe>] generic_delete_inode+0x15e/0x190
    [<ffffffff8112cc8d>] generic_drop_inode+0x5d/0x80
    [<ffffffff8112bb88>] iput+0x78/0x80
    [<ffffffff811bc908>] nfs_dentry_iput+0x38/0x50
    [<ffffffff811285f4>] dentry_iput+0x84/0x110
    [<ffffffff811286ae>] d_kill+0x2e/0x60
    [<ffffffff8112912a>] dput+0x7a/0x170
    [<ffffffff8111e925>] path_put+0x15/0x40
    [<ffffffff811c3a44>] __put_nfs_open_context+0xa4/0xb0
    [<ffffffff811cb5d0>] ? nfs_free_request+0x0/0x50
    [<ffffffff811c3b0b>] put_nfs_open_context+0xb/0x10
    [<ffffffff811cb5f9>] nfs_free_request+0x29/0x50
    [<ffffffff81234b7e>] kref_put+0x8e/0xe0
    [<ffffffff811cb594>] nfs_release_request+0x14/0x20
    [<ffffffff811cf769>] nfs_find_and_lock_request+0x89/0xa0
    [<ffffffff811d1180>] nfs_wb_page+0x80/0x110
    [<ffffffff811c0770>] nfs_release_page+0x70/0x90
    [<ffffffff810d18ee>] try_to_release_page+0x5e/0x80
    [<ffffffff810e1178>] shrink_page_list+0x638/0x860
    [<ffffffff810e19de>] shrink_zone+0x63e/0xc40

We can fix this by making the call to put_nfs_open_context() happen when we actually remove the write request from the inode (which is done by the nfsiod thread in this case).
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org
2010-03-11 | x86: Reduce per cpu warning boot up messages | Mike Travis
Reduce warning message output to one line only instead of per cpu. Signed-off-by: Mike Travis <travis@sgi.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: x86@kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-11 | x86: Reduce per cpu MCA boot up messages | Mike Travis
Don't write per cpu MCA boot up messages. Signed-off-by: Mike Travis <travis@sgi.com> Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: x86@kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-11 | microblaze: entry.S use delay slot for return handlers | Michal Simek
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-03-11 | microblaze: Save current task directly | Michal Simek
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-03-11 | microblaze: Simplify entry.S - save/restore r3/r4 - ret_from_trap | Michal Simek
It is possible to save r3/r4 at the beginning of the user part, before calling the handlers, and restore them at the end. Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-03-11 | microblaze: PCI early support for noMMU system | Michal Simek
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-03-11 | microblaze: Fix dma alloc and free coherent dma functions | Michal Simek
We have to use the consistent code to implement the coherent DMA functions. The consistent code uses cache-inhibited page mappings. Xilinx reported that there is a bug in Microblaze with WB and the d-cache_always use option. Microblaze 7.30.a should be the first version in which this bug is fixed. Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-03-11 | microblaze: Add consistent code | Michal Simek
Remove the ancient Kconfig option for consistent code. MMU kernels use cache-inhibited pages. noMMU kernels use the UNCACHE SHADOW feature, which needs an address range twice the RAM size. For example: physical RAM is 256MB and the caches are set up to cover the same size. If you set up the HW so that the size is 512MB while the cache covers 256MB, then you can use addresses from 256-512MB without caches, and they correspond to 0-256MB with cache. That's why I am using the dcache base/high addresses to find the uncached area. Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-03-11 | microblaze: pgtable.h: move consistent functions | Michal Simek
Consistent functions will be used for noMMU and MMU kernels. Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-03-11 | microblaze: Remove ancient Kconfig option for consistent mapping | Michal Simek
We don't use the CONSISTENT options from Kconfig, which is why I am removing them. Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-03-11 | microblaze: Remove VMALLOC_VMADDR | Michal Simek
Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-03-11 | microblaze: Add define for ASM_LOOP | Michal Simek
It is the default option, but both options must be measured. Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-03-11 | microblaze: Preliminary support for dma drivers | Michal Simek
I found several problems with the ll_temac driver and on systems with WB. This early fix should address them. I will clean up this patch before I add it to mainline. Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-03-11 | microblaze: remove trailing space in messages | Frans Pop
Signed-off-by: Frans Pop <elendil@planet.nl> Cc: microblaze-uclinux@itee.uq.edu.au Cc: Michal Simek <monstr@monstr.eu> Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-03-11 | microblaze: Use generic show_mem() | Michal Simek
Remove arch-specific show_mem() in favor of the generic version. It is based on powerpc patch bda2fa535564ace56a395d5b65c6dc81305401fa. Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-03-11 | microblaze: Change temp register for cmdline | Michal Simek
The r7 register was used for the copy when the CONFIG_CMDLINE_BOOL option is enabled. But r7 stores the pointer to the fdt, which is why machine_early_init did not detect the compiled-in DTB. I also moved the kernel PID setup so that the TLB init is in one block. Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-03-11 | microblaze: Preliminary support for dma drivers | Michal Simek
I found several problems with the ll_temac driver and on systems with WB. This early fix should address them. I will clean up this patch before I add it to mainline. Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-03-11 | perf: Make the install relative to DESTDIR if specified | John Kacur
Without this change, the install path is relative to prefix/DESTDIR where prefix is automatically set to $HOME. This can produce unexpected results. For example: make -C tools/perf DESTDIR=/home/jkacur/tmp install-man creates the directory: /home/jkacur/home/jkacur/tmp/share/... instead of the expected: /home/jkacur/tmp/share/... Signed-off-by: John Kacur <jkacur@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Kyle McMartin <kyle@redhat.com> Cc: <stable@kernel.org> LKML-Reference: <1268312220-12880-1-git-send-email-jkacur@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-11 | microblaze: Move cache function to cache.c | Michal Simek
It is better to have the init cache handling in one place. Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-03-11 | microblaze: Add support for PREEMPT | Michal Simek
This patch adds core PREEMPT support for Microblaze. I tried to trace it via tracers and I was able to see any output. I also added low-level debug functions to see if that code is called. Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-03-11 | kprobes: Calculate the index correctly when freeing the out-of-line execution slot | Masami Hiramatsu
From: Ananth N Mavinakayanahalli <ananth@in.ibm.com> When freeing the instruction slot, the arithmetic to calculate the index of the slot in the page needs to account for the total size of the instruction on the various architectures. Calculate the index correctly when freeing the out-of-line execution slot. Reported-by: Sachin Sant <sachinp@in.ibm.com> Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> LKML-Reference: <4B9667AB.9050507@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
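The index arithmetic this commit refers to can be sketched in plain C. This is an assumption-laden illustration, not the kprobes code: `opcode_t` stands in for the per-architecture opcode type, and `slot_index` and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-architecture opcode type. */
typedef unsigned int opcode_t;

/*
 * Return the index of `slot` within a page of instruction slots, or -1 if
 * the pointer does not name the start of a slot in this page. Each slot
 * spans `opcodes_per_slot` opcodes, so the offset (counted in opcodes) has
 * to be divided by the full slot size, not by one opcode.
 */
static ptrdiff_t slot_index(const opcode_t *page_insns, const opcode_t *slot,
			    size_t opcodes_per_slot, size_t slots_per_page)
{
	ptrdiff_t off;

	assert(page_insns != NULL && slot != NULL);
	if (opcodes_per_slot == 0)
		return -1;

	off = slot - page_insns;	/* offset in opcode units */
	if (off < 0 || (size_t)off >= opcodes_per_slot * slots_per_page)
		return -1;
	if ((size_t)off % opcodes_per_slot != 0)
		return -1;		/* not on a slot boundary */

	return off / (ptrdiff_t)opcodes_per_slot;
}
```

Dividing by the whole slot size is the point: with 4 opcodes per slot, the slot starting at opcode offset 8 has index 2, not 8.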
2010-03-11 | microblaze: Add support for Xilinx PCI host bridge | Michal Simek
This patch is based on powerpc patch 64f16502475ddf663169369fffff6da9b10ea9fb. We did some cleanups and removed the powerpc parts. There is one new early debug listing function too. The exclude function is only in the Debug options. We tested it on a custom board. Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-03-11 | microblaze: Enable PCI, missing files | Michal Simek
There are two parts to these changes. The first just enables PCI in the Makefiles and in Kconfig. The second adds the rest of the missing files. I didn't want to add them with the previous patch because that patch is too big. The current Microblaze toolchain has a problem with weak symbols, which is why these changes are necessary to be able to compile PCI support. Xilinx knows about this problem. Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-03-11 | microblaze: Add core PCI files | Michal Simek
Add pci-common.h and pci32.c. The files are based on the PPC version. The ppc-specific parts were removed and the code was completely cleaned up. Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-03-11 | microblaze: Add pci-bridge.h | Michal Simek
Add pci-bridge.h for Microblaze. It is based on the powerpc header file. My changes: I removed the PPC_ prefix from constants and removed the ppc64-specific parts. Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-03-11 | microblaze: Add pci.h | Michal Simek
Add pci.h for microblaze. It is based on powerpc pci.h Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-03-11 | microblaze: io.h include asm-generic/iomap.h | Michal Simek
I need to use asm-generic/iomap.h for PCI, which is why it is necessary to include it and fix the ioport_{map,unmap} functions. Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-03-11 | sched: Cleanup: remove unused variable in try_to_wake_up() | Dan Carpenter
We haven't used the "orig_rq" variable since 055a00865d "Fix/add missing update_rq_clock() calls" Signed-off-by: Dan Carpenter <error27@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Gautham R Shenoy <ego@in.ibm.com> Cc: efault@gmx.de LKML-Reference: <20100306111752.GL4958@bicker> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-11 | microblaze: Add irq_create_{of_,}mapping functions | Michal Simek
Support functions for PCI. We don't use any advanced mapping mechanism, which is why the implementation is simple. Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-03-11 | microblaze: Add {z,}alloc_maybe_bootmem functions | Michal Simek
I will need {z,}alloc_maybe_bootmem functions for pci patches Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-03-11 | microblaze: Implement __dma_sync_page | Michal Simek
It is necessary to do some cache handling for DMA operations. Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-03-11 | microblaze: Support DMA | Michal Simek
Add DMA support for Microblaze. This new feature has several parts:

1. Basic DMA support
2. Enable DMA debug option
3. Setup notifier

Ad 1. dma-mapping comes from the powerpc and x86 versions and is based on the generic dma-mapping-common.h.
Ad 2. DMA supports debug features which are used in the generic file. For more information please look at Documentation/DMA-API.txt.
Ad 3. The notifier is very important for setting up dma_ops. Without this part, for example, the ll_temac driver failed because no DMA operations were set up.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-03-11 | Merge branch 'tip/tracing/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/urgent | Ingo Molnar
2010-03-11 | x86/mce: Fix RCU lockdep splats | Paul E. McKenney
Create an rcu_dereference_check_mce() that checks for RCU-sched read side and mce_read_mutex being held on update side. Replace uses of rcu_dereference() in arch/x86/kernel/cpu/mcheck/mce.c with this new macro. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <1267830207-9474-3-git-send-email-paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-11 | rcu: Increase RCU CPU stall timeouts if PROVE_RCU | Paul E. McKenney
CONFIG_PROVE_RCU imposes additional overhead on the kernel, so increase the RCU CPU stall timeouts in an attempt to allow for this effect. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <1267830207-9474-2-git-send-email-paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-11 | ftrace: Replace read_barrier_depends() with rcu_dereference_raw() | Paul E. McKenney
Replace the calls to read_barrier_depends() in ftrace_list_func() with rcu_dereference_raw() to improve readability. The reason that we use rcu_dereference_raw() here is that removed entries are never freed, instead they are simply leaked. This is one of a very few cases where use of rcu_dereference_raw() is the long-term right answer. And I don't yet know of any others. ;-) Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <1267830207-9474-1-git-send-email-paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-11 | perf tools: Fix sparse CPU numbering related bugs | Paul Mackerras
At present, the perf subcommands that do system-wide monitoring (perf stat, perf record and perf top) don't work properly unless the online cpus are numbered 0, 1, ..., N-1. These tools ask for the number of online cpus with sysconf(_SC_NPROCESSORS_ONLN) and then try to create events for cpus 0, 1, ..., N-1. This creates problems for systems where the online cpus are numbered sparsely. For example, a POWER6 system in single-threaded mode (i.e. only running 1 hardware thread per core) will have only even-numbered cpus online. This fixes the problem by reading the /sys/devices/system/cpu/online file to find out which cpus are online. The code that does that is in tools/perf/util/cpumap.[ch], and consists of a read_cpu_map() function that sets up a cpumap[] array and returns the number of online cpus. If /sys/devices/system/cpu/online can't be read or can't be parsed successfully, it falls back to using sysconf to ask how many cpus are online and sets up an identity map in cpumap[]. The perf record, perf stat and perf top code then calls read_cpu_map() in the system-wide monitoring case (instead of sysconf) and uses cpumap[] to get the cpu numbers to pass to perf_event_open. Signed-off-by: Paul Mackerras <paulus@samba.org> Cc: Anton Blanchard <anton@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> LKML-Reference: <20100310093609.GA3959@brick.ozlabs.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
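The approach described above can be sketched as a small parser for the list format found in /sys/devices/system/cpu/online (for example `0,2,4-6`), plus the identity-map fallback. This is a simplified illustration; `parse_cpu_list` and `identity_cpu_map` are hypothetical names and do not reproduce read_cpu_map() from tools/perf/util/cpumap.c.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Parse a cpu list such as "0,2,4-6\n" into cpumap[].
 * Returns the number of cpus stored, or -1 on a parse error or overflow.
 */
static int parse_cpu_list(const char *s, int *cpumap, int max)
{
	int n = 0;

	assert(s != NULL && cpumap != NULL);
	while (*s != '\0' && *s != '\n') {
		char *end;
		long lo = strtol(s, &end, 10);
		long hi = lo;
		long cpu;

		if (end == s || lo < 0)
			return -1;
		if (*end == '-') {		/* range "lo-hi" */
			s = end + 1;
			hi = strtol(s, &end, 10);
			if (end == s || hi < lo)
				return -1;
		}
		for (cpu = lo; cpu <= hi; cpu++) {
			if (n >= max)
				return -1;
			cpumap[n++] = (int)cpu;
		}
		if (*end == ',')
			end++;
		else if (*end != '\0' && *end != '\n')
			return -1;
		s = end;
	}
	return n > 0 ? n : -1;
}

/* Fallback when the sysfs file is unusable: cpus 0..ncpus-1. */
static int identity_cpu_map(int ncpus, int *cpumap, int max)
{
	int i;

	assert(cpumap != NULL);
	if (ncpus <= 0 || ncpus > max)
		return -1;
	for (i = 0; i < ncpus; i++)
		cpumap[i] = i;
	return ncpus;
}
```

A caller would read the sysfs file into a buffer, try `parse_cpu_list()`, and on failure call `identity_cpu_map()` with the value from `sysconf(_SC_NPROCESSORS_ONLN)`, then pass each `cpumap[i]` as the cpu argument when opening events.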
2010-03-11 | perf_event: Fix oops triggered by cpu offline/online | Paul Mackerras
Anton Blanchard found that he could reliably make the kernel hit a BUG_ON in the slab allocator by taking a cpu offline and then online while a system-wide perf record session was running. The reason is that when the cpu comes up, we completely reinitialize the ctx field of the struct perf_cpu_context for the cpu. If there is a system-wide perf record session running, then there will be a struct perf_event that has a reference to the context, so its refcount will be 2. (The perf_event has been removed from the context's group_entry and event_entry lists by perf_event_exit_cpu(), but that doesn't remove the perf_event's reference to the context and doesn't decrement the context's refcount.) When the cpu comes up, perf_event_init_cpu() gets called, and it calls __perf_event_init_context() on the cpu's context. That resets the refcount to 1. Then when the perf record session finishes and the perf_event is closed, the refcount gets decremented to 0 and the context gets kfreed after an RCU grace period. Since the context wasn't kmalloced -- it's part of a per-cpu variable -- bad things happen. In fact we don't need to completely reinitialize the context when the cpu comes up. It's sufficient to initialize the context once at boot, but we need to do it for all possible cpus. This moves the context initialization to happen at boot time. With this, we don't trash the refcount and the context never gets kfreed, and we don't hit the BUG_ON. Reported-by: Anton Blanchard <anton@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org> Tested-by: Anton Blanchard <anton@samba.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
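The refcount sequence described above can be replayed with a toy model. This is a userspace sketch only: `struct demo_ctx` and the helpers are hypothetical, and the `freed` flag stands in for the point where the kernel would kfree the context.

```c
#include <assert.h>

/* Toy model of a statically allocated, reference-counted context. */
struct demo_ctx {
	int refcount;
	int freed;	/* set where the real code would kfree() */
};

static void ctx_init(struct demo_ctx *c)
{
	c->refcount = 1;
	c->freed = 0;
}

static void ctx_get(struct demo_ctx *c)
{
	c->refcount++;
}

static void ctx_put(struct demo_ctx *c)
{
	assert(c->refcount > 0);
	if (--c->refcount == 0)
		c->freed = 1;
}

/*
 * Replay the scenario: init at boot, an event takes a reference, the cpu
 * goes offline and online, then the event is closed. Returns 1 if the
 * context ends up "freed", which must never happen for per-cpu storage.
 */
static int simulate_hotplug(int reinit_on_online)
{
	struct demo_ctx ctx;

	ctx_init(&ctx);			/* boot: refcount 1 */
	ctx_get(&ctx);			/* event holds a reference: 2 */
	if (reinit_on_online)
		ctx_init(&ctx);		/* buggy path: reset to 1 */
	ctx_put(&ctx);			/* event closed */
	return ctx.freed;
}
```

`simulate_hotplug(1)` models the old behaviour and ends with the context freed, while `simulate_hotplug(0)` models initializing once at boot and leaves the refcount at 1.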