linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2011-09-05	dmaengine/ste_dma40: fix memory leak due to prepared descriptors	Per Forlin
	Prepared descriptors that are not submitted will not be freed. Add prepared descriptor to a list to be able to release them upon dmaengine_terminate_all(). Signed-off-by: Per Forlin <per.forlin@linaro.org> Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2011-09-05	dmaengine/ste_dma40: fix Oops due to double free of client descriptor	Per Forlin
	The client list may exist in two lists at the same time. This makes free fail since the same desc is freed multiple times. Remove desc from client list when adding it to the pending queue. Move free of client owned descriptors from free_dma() to terminate_all(). Unable to handle kernel paging request at virtual address 00100104 pgd = dea8c000 [00100104] pgd=1ea62831, pte=00000000, *ppte=00000000 Internal error: Oops: 817 [#1] PREEMPT SMP Modules linked in: CPU: 0 Not tainted (3.1.0-rc3+ #58) PC is at d40_free_chan_resources+0x64/0x330 Signed-off-by: Per Forlin <per.forlin@linaro.org> Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2011-09-05	dmaengine/ste_dma40: remove duplicate call to d40_pool_lli_free().	Per Forlin
	d40_desc_free() already calls d40_pool_lli_free(). Signed-off-by: Per Forlin <per.forlin@linaro.org> Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2011-09-05	dmaengine/ste_dma40: add missing kernel doc for pending_queue	Per Forlin
	Signed-off-by: Per Forlin <per.forlin@linaro.org> Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2011-09-05	NET: am79c961: fix race in link status code	Russell King
	The link status code operates from a timer, and writes the index register without first taking a lock. A well-placed interrupt between writing the index register and reading the data register could change the index register on us, which will return wrong data. Add the necessary lock. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-09-04	Merge branches 'non_hwmod_compliant_fix_3.1rc', 'omap3_clock_fixes_3.1rc', ↵	Paul Walmsley
	'omap4_clock_fixes_3.1rc', 'missing_2430_musb_adds_terminator_fix_3.1rc' and 'pwrdm_clkdm_fixes_3.1rc' into prcm-fixes-a-3.1rc
2011-09-04	OMAP2430: hwmod: musb: add missing terminator to omap2430_usbhsotg_addrs[]	Paul Walmsley
	Add a missing array terminator to omap2430_usbhsotg_addrs[]. Without this terminator, the omap_hwmod resource building code runs off the end of the array, resulting in at least this error -- if not worse behavior: [ 0.578002] musb-omap2430: failed to claim resource 4 [ 0.583465] omap_device: musb-omap2430: build failed (-16) [ 0.589294] Could not build omap_device for musb-omap2430 usb_otg_hs This should have been part of commit 78183f3fdf76f422431a81852468be01b36db325 ("omap_hwmod: use a null structure record to terminate omap_hwmod_addr_space arrays") but was evidently missed. Signed-off-by: Paul Walmsley <paul@pwsan.com>
2011-09-04	Linux 3.1-rc5v3.1-rc5	Linus Torvalds

2011-09-04	ARM: 7067/1: mm: keep significant bits in pfn_valid	Mark Rutland
	When ARCH_HAS_HOLES_MEMORYMODEL is selected, pfn_valid calls memblock_is_memory to test validity of a pfn: > memblock_is_memory(pfn << PAGE_SHIFT); On LPAE systems this cuts off the top bits, as the shift occurs before the value is promoted to a phys_addr_t. This patch replaces the shift with a call to __pfn_to_phys (which casts pfn to phys_addr_t before shifting), preventing the loss of significant bits. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-09-02	Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux	Linus Torvalds
	* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: drm/radeon/kms: make sure pci max read request size is valid on evergreen+ (v2) drm/radeon/kms: set a default max_pixel_clock
2011-09-02	Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs	Linus Torvalds
	* 'for-linus' of git://oss.sgi.com/xfs/xfs: xfs: fix ->write_inode return values xfs: fix xfs_mark_inode_dirty during umount xfs: deprecate the nodelaylog mount option
2011-09-02	iommu/amd: Don't take domain->lock recursivly	Joerg Roedel
	The domain_flush_devices() function takes the domain->lock. But this function is only called from update_domain() which itself is already called unter the domain->lock. This causes a deadlock situation when the dma-address-space of a domain grows larger than 1GB. Cc: stable@kernel.org Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-09-02	iommu/amd: Make sure iommu->need_sync contains correct value	Joerg Roedel
	The value is only set to true but never set back to false, which causes to many completion-wait commands to be sent to hardware. Fix it with this patch. Cc: stable@kernel.org Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-09-02	Merge branch 'fortglx/3.1/tip/timers/rtc' of ↵	Thomas Gleixner
	git://git.linaro.org/people/jstultz/linux into timers/urgent
2011-09-02	drm/radeon/kms: make sure pci max read request size is valid on evergreen+ (v2)	Alex Deucher
	If the bios or OS sets the pci max read request size to 0 or an invalid value (6,7), it can result in a hang or slowdown. Check and set it to something sane if it's invalid. Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=42162 v2: use pci reg defines from include/linux/pci_regs.h Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@kernel.org Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-09-01	xen/smp: Warn user why they keel over - nosmp or noapic and what to use instead.	Konrad Rzeszutek Wilk
	We have hit a couple of customer bugs where they would like to use those parameters to run an UP kernel - but both of those options turn of important sources of interrupt information so we end up not being able to boot. The correct way is to pass in 'dom0_max_vcpus=1' on the Xen hypervisor line and the kernel will patch itself to be a UP kernel. Fixes bug: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=637308 CC: stable@kernel.org Acked-by: Ian Campbell <Ian.Campbell@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-09-01	xen: x86_32: do not enable iterrupts when returning from exception in ↵	Igor Mammedov
	interrupt context If vmalloc page_fault happens inside of interrupt handler with interrupts disabled then on exit path from exception handler when there is no pending interrupts, the following code (arch/x86/xen/xen-asm_32.S:112): cmpw $0x0001, XEN_vcpu_info_pending(%eax) sete XEN_vcpu_info_mask(%eax) will enable interrupts even if they has been previously disabled according to eflags from the bounce frame (arch/x86/xen/xen-asm_32.S:99) testb $X86_EFLAGS_IF>>8, 8+1+ESP_OFFSET(%esp) setz XEN_vcpu_info_mask(%eax) Solution is in setting XEN_vcpu_info_mask only when it should be set according to cmpw $0x0001, XEN_vcpu_info_pending(%eax) but not clearing it if there isn't any pending events. Reproducer for bug is attached to RHBZ 707552 CC: stable@kernel.org Signed-off-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-09-01	xfs: fix ->write_inode return values	Christoph Hellwig
	Currently we always redirty an inode that was attempted to be written out synchronously but has been cleaned by an AIL pushed internall, which is rather bogus. Fix that by doing the i_update_core check early on and return 0 for it. Also include async calls for it, as doing any work for those is just as pointless. While we're at it also fix the sign for the EIO return in case of a filesystem shutdown, and fix the completely non-sensical locking around xfs_log_inode. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com> (cherry picked from commit 297db93bb74cf687510313eb235a7aec14d67e97) Signed-off-by: Alex Elder <aelder@sgi.com>
2011-09-01	xen: use maximum reservation to limit amount of usable RAM	David Vrabel
	Use the domain's maximum reservation to limit the amount of extra RAM for the memory balloon. This reduces the size of the pages tables and the amount of reserved low memory (which defaults to about 1/32 of the total RAM). On a system with 8 GiB of RAM with the domain limited to 1 GiB the kernel reports: Before: Memory: 627792k/4472000k available After: Memory: 549740k/11132224k available A increase of about 76 MiB (~1.5% of the unused 7 GiB). The reserved low memory is also reduced from 253 MiB to 32 MiB. The total additional usable RAM is 329 MiB. For dom0, this requires at patch to Xen ('x86: use 'dom0_mem' to limit the number of pages for dom0') (c/s 23790) CC: stable@kernel.org Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-08-31	xfs: fix xfs_mark_inode_dirty during umount	Christoph Hellwig
	During umount we do not add a dirty inode to the lru and wait for it to become clean first, but force writeback of data and metadata with I_WILL_FREE set. Currently there is no way for XFS to detect that the inode has been redirtied for metadata operations, as we skip the mark_inode_dirty call during teardown. Fix this by setting i_update_core nanually in that case, so that the inode gets flushed during inode reclaim. Alternatively we could enable calling mark_inode_dirty for inodes in I_WILL_FREE state, and let the VFS dirty tracking handle this. I decided against this as we will get better I/O patterns from reclaim compared to the synchronous writeout in write_inode_now, and always marking the inode dirty in some way from xfs_mark_inode_dirty is a better safetly net in either case. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com> (cherry picked from commit da6742a5a4cc844a9982fdd936ddb537c0747856) Signed-off-by: Alex Elder <aelder@sgi.com>
2011-08-31	libceph: fix leak of osd structs during shutdown	Sage Weil
	We want to remove all OSDs, not just those on the idle LRU. Signed-off-by: Sage Weil <sage@newdream.net>
2011-08-31	Merge tag 'for_linus-20110831' of git://github.com/tytso/ext4	Linus Torvalds
	* tag 'for_linus-20110831' of git://github.com/tytso/ext4: ext4: remove i_mutex lock in ext4_evict_inode to fix lockdep complaining
2011-08-31	mmc: sdhci-s3c: Fix mmc card I/O problem	Girish K S
	This patch fixes the problem in sdhci-s3c host driver for Samsung Soc's. During the card identification stage the mmc core driver enumerates for the best bus width in combination with the highest available data rate. It starts enumerating from the highest bus width (8) to lowest width (1). In case of few MMC cards the 4-bit bus enumeration fails and tries the 1-bit bus enumeration. When switched to 1-bit bus mode the host driver has to clear the previous bus width setting and apply the new setting. The current patch will clear the previous bus mode and apply the new mode setting. Signed-off-by: Girish K S <girish.shivananjappa@linaro.org> Acked-by: Jaehoon Chung <jh80.chung@samsung.com> Cc: <stable@kernel.org> Signed-off-by: Chris Ball <cjb@laptop.org>
2011-08-31	mmc: sd: UHS-I bus speed should be set last in UHS initialization	Subhash Jadavani
	mmc_sd_init_uhs_card function sets the driver type, current limit and bus speed mode on card as well as on host controller side. Currently bus speed mode is set by sending CMD6 to card and immediately setting the timing mode in host controller. But then before initiating tuning sequence, it also tries to set current limit by sending CMD6 to card which results in data timeout errors in controller if bus speed mode is SDR50/SDR104 mode. So basically bus speed mode should be set only after current limit is set in the card and immediately after setting the bus speed mode, tuning sequence should be initiated. Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org> Reviewed-by: Arindam Nath <arindam.nath@amd.com> Signed-off-by: Chris Ball <cjb@laptop.org>
2011-08-31	mmc: sdhi: initialise mmc_data->flags before use	Simon Horman
	This corrects a logic error that I introduced in "mmc: sdhi: Add write16_hook" Reported-by: Magnus Damm <magnus.damm@gmail.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Chris Ball <cjb@laptop.org>
2011-08-31	mmc: core: use non-reentrant workqueue for clock gating	Mika Westerberg
	The default multithread workqueue can cause the same work to be executed concurrently on a different CPUs. This isn't really suitable for clock gating as it might already gated the clock and gating it twice results both host->clk_old and host->ios.clock to be set to 0. To prevent this from happening we use system_nrt_wq instead. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Tested-by: Chris Ball <cjb@laptop.org> Cc: <stable@kernel.org> Signed-off-by: Chris Ball <cjb@laptop.org>
2011-08-31	mmc: core: prevent aggressive clock gating racing with ios updates	Mika Westerberg
	We have seen at least two different races when clock gating kicks in in a middle of ios structure update. First one happens when ios->clock is changed outside of aggressive clock gating framework, for example via mmc_set_clock(). The race might happen when we run following code: mmc_set_ios(): ... if (ios->clock > 0) mmc_set_ungated(host); Now if gating kicks in right after the condition check we end up setting host->clk_gated to false even though we have just gated the clock. Next time a request is started we try to ungate and restore the clock in mmc_host_clk_hold(). However since we have host->clk_gated set to false the original clock is not restored. This eventually will cause the host controller to hang since its clock is disabled while we are trying to issue a request. For example on Intel Medfield platform we see: [ 13.818610] mmc2: Timeout waiting for hardware interrupt. [ 13.818698] sdhci: =========== REGISTER DUMP (mmc2)=========== [ 13.818753] sdhci: Sys addr: 0x00000000 \| Version: 0x00008901 [ 13.818804] sdhci: Blk size: 0x00000000 \| Blk cnt: 0x00000000 [ 13.818853] sdhci: Argument: 0x00000000 \| Trn mode: 0x00000000 [ 13.818903] sdhci: Present: 0x1fff0000 \| Host ctl: 0x00000001 [ 13.818951] sdhci: Power: 0x0000000d \| Blk gap: 0x00000000 [ 13.819000] sdhci: Wake-up: 0x00000000 \| Clock: 0x00000000 [ 13.819049] sdhci: Timeout: 0x00000000 \| Int stat: 0x00000000 [ 13.819098] sdhci: Int enab: 0x00ff00c3 \| Sig enab: 0x00ff00c3 [ 13.819147] sdhci: AC12 err: 0x00000000 \| Slot int: 0x00000000 [ 13.819196] sdhci: Caps: 0x6bee32b2 \| Caps_1: 0x00000000 [ 13.819245] sdhci: Cmd: 0x00000000 \| Max curr: 0x00000000 [ 13.819292] sdhci: Host ctl2: 0x00000000 [ 13.819331] sdhci: ADMA Err: 0x00000000 \| ADMA Ptr: 0x00000000 [ 13.819377] sdhci: =========================================== [ 13.919605] mmc2: Reset 0x2 never completed. and it never recovers. Second race might happen while running mmc_power_off(): static void mmc_power_off(struct mmc_host host) { host->ios.clock = 0; host->ios.vdd = 0; [ clock gating kicks in here ] / * Reset ocr mask to be the highest possible voltage supported for * this mmc host. This value will be used at next power up. */ host->ocr = 1 << (fls(host->ocr_avail) - 1); if (!mmc_host_is_spi(host)) { host->ios.bus_mode = MMC_BUSMODE_OPENDRAIN; host->ios.chip_select = MMC_CS_DONTCARE; } host->ios.power_mode = MMC_POWER_OFF; host->ios.bus_width = MMC_BUS_WIDTH_1; host->ios.timing = MMC_TIMING_LEGACY; mmc_set_ios(host); } If the clock gating worker kicks in while we are only partially updated the ios structure the host controller gets incomplete ios and might not work as supposed. Again on Intel Medfield platform we get: [ 4.185349] kernel BUG at drivers/mmc/host/sdhci.c:1155! [ 4.185422] invalid opcode: 0000 [#1] PREEMPT SMP [ 4.185509] Modules linked in: [ 4.185565] [ 4.185608] Pid: 4, comm: kworker/0:0 Not tainted 3.0.0+ #240 Intel Corporation Medfield/iCDKA [ 4.185742] EIP: 0060:[<c136364e>] EFLAGS: 00010083 CPU: 0 [ 4.185827] EIP is at sdhci_set_power+0x3e/0xd0 [ 4.185891] EAX: f5ff98e0 EBX: f5ff98e0 ECX: 00000000 EDX: 00000001 [ 4.185970] ESI: f5ff977c EDI: f5ff9904 EBP: f644fe98 ESP: f644fe94 [ 4.186049] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 [ 4.186125] Process kworker/0:0 (pid: 4, ti=f644e000 task=f644c0e0 task.ti=f644e000) [ 4.186219] Stack: [ 4.186257] f5ff98e0 f644feb0 c1365173 00000282 f5ff9460 f5ff96e0 f5ff96e0 f644feec [ 4.186418] c1355bd8 f644c0e0 c1499c3d f5ff96e0 f644fed4 00000006 f5ff96e0 00000286 [ 4.186579] f644fedc c107922b f644feec 00000286 f5ff9460 f5ff9700 f644ff10 c135839e [ 4.186739] Call Trace: [ 4.186802] [<c1365173>] sdhci_set_ios+0x1c3/0x340 [ 4.186883] [<c1355bd8>] mmc_gate_clock+0x68/0x120 [ 4.186963] [<c1499c3d>] ? _raw_spin_unlock_irqrestore+0x4d/0x60 [ 4.187052] [<c107922b>] ? trace_hardirqs_on+0xb/0x10 [ 4.187134] [<c135839e>] mmc_host_clk_gate_delayed+0xbe/0x130 [ 4.187219] [<c105ec09>] ? process_one_work+0xf9/0x5b0 [ 4.187300] [<c135841d>] mmc_host_clk_gate_work+0xd/0x10 [ 4.187379] [<c105ec82>] process_one_work+0x172/0x5b0 [ 4.187457] [<c105ec09>] ? process_one_work+0xf9/0x5b0 [ 4.187538] [<c1358410>] ? mmc_host_clk_gate_delayed+0x130/0x130 [ 4.187625] [<c105f3c8>] worker_thread+0x118/0x330 [ 4.187700] [<c1496cee>] ? preempt_schedule+0x2e/0x50 [ 4.187779] [<c105f2b0>] ? rescuer_thread+0x1f0/0x1f0 [ 4.187857] [<c1062cf4>] kthread+0x74/0x80 [ 4.187931] [<c1062c80>] ? __init_kthread_worker+0x60/0x60 [ 4.188015] [<c149acfa>] kernel_thread_helper+0x6/0xd [ 4.188079] Code: 81 fa 00 00 04 00 0f 84 a7 00 00 00 7f 21 81 fa 80 00 00 00 0f 84 92 00 00 00 81 fa 00 00 0 [ 4.188780] EIP: [<c136364e>] sdhci_set_power+0x3e/0xd0 SS:ESP 0068:f644fe94 [ 4.188898] ---[ end trace a7b23eecc71777e4 ]--- This BUG() comes from the fact that ios.power_mode was still in previous value (MMC_POWER_ON) and ios.vdd was set to zero. We prevent these by inhibiting the clock gating while we update the ios structure. Both problems can be reproduced by simply running the device in a reboot loop. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Tested-by: Chris Ball <cjb@laptop.org> Cc: <stable@kernel.org> Signed-off-by: Chris Ball <cjb@laptop.org>
2011-08-31	mmc: rename mmc_host_clk_{ungate\|gate} to mmc_host_clk_{hold\|release}	Mika Westerberg
	As per suggestion by Linus Walleij: > If you think the names of the functions are confusing then > you may rename them, say like this: > > mmc_host_clk_ungate() -> mmc_host_clk_hold() > mmc_host_clk_gate() -> mmc_host_clk_release() > > Which would make the usecases more clear (This is CC'd to stable@ because the next two patches, which fix observable races, depend on it.) Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Cc: <stable@kernel.org> Signed-off-by: Chris Ball <cjb@laptop.org>
2011-08-31	Merge branch 'for-linus' of git://neil.brown.name/md	Linus Torvalds
	* 'for-linus' of git://neil.brown.name/md: md/raid5: fix a hang on device failure. md: fix clearing of 'blocked' flag in the presence of bad blocks. md/linear: avoid corrupting structure while waiting for rcu_free to complete. md: use REQ_NOIDLE flag in md_super_write() md: ensure changes to 'write-mostly' are reflected in metadata. md: report failure if a 'set faulty' request doesn't.
2011-08-31	Merge branch 'merge' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc/p1023rds: Fix the error of bank-width of nor flash powerpc/85xx: enable caam crypto driver by default powerpc/85xx: enable the audio drivers in the defconfigs
2011-08-31	ext4: remove i_mutex lock in ext4_evict_inode to fix lockdep complaining	Jiaying Zhang
	The i_mutex lock and flush_completed_IO() added by commit 2581fdc810 in ext4_evict_inode() causes lockdep complaining about potential deadlock in several places. In most/all of these LOCKDEP complaints it looks like it's a false positive, since many of the potential circular locking cases can't take place by the time the ext4_evict_inode() is called; but since at the very least it may mask real problems, we need to address this. This change removes the flush_completed_IO() and i_mutex lock in ext4_evict_inode(). Instead, we take a different approach to resolve the software lockup that commit 2581fdc810 intends to fix. Rather than having ext4-dio-unwritten thread wait for grabing the i_mutex lock of an inode, we use mutex_trylock() instead, and simply requeue the work item if we fail to grab the inode's i_mutex lock. This should speed up work queue processing in general and also prevents the following deadlock scenario: During page fault, shrink_icache_memory is called that in turn evicts another inode B. Inode B has some pending io_end work so it calls ext4_ioend_wait() that waits for inode B's i_ioend_count to become zero. However, inode B's ioend work was queued behind some of inode A's ioend work on the same cpu's ext4-dio-unwritten workqueue. As the ext4-dio-unwritten thread on that cpu is processing inode A's ioend work, it tries to grab inode A's i_mutex lock. Since the i_mutex lock of inode A is still hold before the page fault happened, we enter a deadlock. Signed-off-by: Jiaying Zhang <jiayingz@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-08-31	x86, perf: Check that current->mm is alive before getting user callchain	Andrey Vagin
	An event may occur when an mm is already released. I added an event in dequeue_entity() and caught a panic with the following backtrace: [ 434.421110] BUG: unable to handle kernel NULL pointer dereference at 0000000000000050 [ 434.421258] IP: [<ffffffff810464ac>] __get_user_pages_fast+0x9c/0x120 ... [ 434.421258] Call Trace: [ 434.421258] [<ffffffff8101ae81>] copy_from_user_nmi+0x51/0xf0 [ 434.421258] [<ffffffff8109a0d5>] ? sched_clock_local+0x25/0x90 [ 434.421258] [<ffffffff8101b048>] perf_callchain_user+0x128/0x170 [ 434.421258] [<ffffffff811154cd>] ? __perf_event_header__init_id+0xed/0x100 [ 434.421258] [<ffffffff81116690>] perf_prepare_sample+0x200/0x280 [ 434.421258] [<ffffffff81118da8>] __perf_event_overflow+0x1b8/0x290 [ 434.421258] [<ffffffff81065240>] ? tg_shares_up+0x0/0x670 [ 434.421258] [<ffffffff8104fe1a>] ? walk_tg_tree+0x6a/0xb0 [ 434.421258] [<ffffffff81118f44>] perf_swevent_overflow+0xc4/0xf0 [ 434.421258] [<ffffffff81119150>] do_perf_sw_event+0x1e0/0x250 [ 434.421258] [<ffffffff81119204>] perf_tp_event+0x44/0x70 [ 434.421258] [<ffffffff8105701f>] ftrace_profile_sched_block+0xdf/0x110 [ 434.421258] [<ffffffff8106121d>] dequeue_entity+0x2ad/0x2d0 [ 434.421258] [<ffffffff810614ec>] dequeue_task_fair+0x1c/0x60 [ 434.421258] [<ffffffff8105818a>] dequeue_task+0x9a/0xb0 [ 434.421258] [<ffffffff810581e2>] deactivate_task+0x42/0xe0 [ 434.421258] [<ffffffff814bc019>] thread_return+0x191/0x808 [ 434.421258] [<ffffffff81098a44>] ? switch_task_namespaces+0x24/0x60 [ 434.421258] [<ffffffff8106f4c4>] do_exit+0x464/0x910 [ 434.421258] [<ffffffff8106f9c8>] do_group_exit+0x58/0xd0 [ 434.421258] [<ffffffff8106fa57>] sys_exit_group+0x17/0x20 [ 434.421258] [<ffffffff8100b202>] system_call_fastpath+0x16/0x1b Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: stable@kernel.org Link: http://lkml.kernel.org/r/1314693156-24131-1-git-send-email-avagin@openvz.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-08-31	perf_event: Fix broken calc_timer_values()	Eric B Munson
	We detected a serious issue with PERF_SAMPLE_READ and timing information when events were being multiplexing. Samples would have time_running > time_enabled. That was easy to reproduce with a libpfm4 example (ran 3 times to cause multiplexing on Core 2): $ syst_smpl -e uops_retired:freq=1 & $ syst_smpl -e uops_retired:freq=1 & $ syst_smpl -e uops_retired:freq=1 & IIP:0x0000000040062d ... PERIOD:2355332948 ENA=40144625315 RUN=60014875184 syst_smpl: WARNING: time_running > time_enabled 63277537998 uops_retired:freq=1 , scaled The bug was not present in kernel up to (and including) 3.0. It turns out the bug was introduced by the following commit: commit c4794295917ebeda8013b6cb9c8d71ab4f74a1fa events: Move lockless timer calculation into helper function The parameters of the function got reversed yet the call sites were not updated to reflect the change. That lead to time_running and time_enabled being swapped. That had no effect when there was no multiplexing because in that case time_running = time_enabled but it would show up in any other scenario. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20110829124112.GA4828@quad Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-08-31	drm/radeon/kms: set a default max_pixel_clock	Dave Airlie
	On some Power rv100 cards, we have no ATY OF table, but we have no combios table either, and hence we refuse all modes on VGA-0 since we end up with a 0 max pixel clock. Signed-off-by: Dave Airlie <airlied@redhat.com> Cc: stable@kernel.org Reviewed-by: Alex Deucher <alexdeucher@gmail.com> Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2011-08-31	Merge branches 'hwbreak', 'perf/updates' and 'perf/system-pmus' into for-rmk	Will Deacon

2011-08-31	ARM: perf: Remove unnecessary armpmu->enable()s	Mark Rutland
	Currently, armpmu_enable iterates through the events for a given counter set, calling armpmu->enable on each before calling armpmu->start to start the PMU's counters. As armpmu->enable is called when each event is added, each event is already configured in hardware. Due to this, calling armpmu->enable in armpmu_enable is unnecessary and confusing. This patch removes the unnecessary calls to armpmu->enable. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2011-08-31	ARM: perf: move arm_pmu into <asm/pmu.h>	Mark Rutland
	Currently, struct arm_pmu and related functions are only visible to {,arch/arm/}/kernel/perf_event.c. This prevents new drivers from using the framework. This patch moves declarations to asm/pmu.h, allowing new PMU drivers to use the framework. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Jamie Iles <jamie@jamieiles.com> Reviewed-by: Ashwin Chaugule <ashwinc@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2011-08-31	ARM: perf: remove cpu-related misnomers	Mark Rutland
	Currently struct cpu_hw_events stores data on events running on a PMU associated with a CPU. As this data is general enough to be used for system PMUs, this name is a misnomer, and may cause confusion when it is used for system PMUs. Additionally, 'armpmu' is commonly used as a parameter name for an instance of struct arm_pmu. The name is also used for a global instance which represents the CPU's PMU. As cpu_hw_events is now not tied to CPU PMUs, it is renamed to pmu_hw_events, with instances of it renamed similarly. As the global 'armpmu' is CPU-specfic, it is renamed to cpu_pmu. This should make it clearer which code is generic, and which is coupled with the CPU. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Jamie Iles <jamie@jamieiles.com> Reviewed-by: Ashwin Chaugule <ashwinc@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2011-08-31	ARM: perf: remove event limit from pmu_hw_events	Mark Rutland
	Currently the event accounting data in pmu_hw_events is stored in fixed-sized arrays within the structure. This patch refactors the accounting data to allow any number of events to be managed. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Jamie Iles <jamie@jamieiles.com> Reviewed-by: Ashwin Chaugule <ashwinc@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2011-08-31	ARM: perf: add support for multiple PMUs	Mark Rutland
	Currently, a single static instance of struct pmu is used when registering an ARM PMU with the main perf subsystem. This limits the ARM perf code to supporting a single PMU. This patch replaces the static struct pmu instance with a member variable on struct arm_pmu. This provides bidirectional mapping between the two structs, and therefore allows for support of multiple PMUs. The function 'to_arm_pmu' is provided for convenience. PMU-generic functions are also updated to use the new mapping, and PMU-generic initialisation of the member variables is moved into a new function: armpmu_init. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Jamie Iles <jamie@jamieiles.com> Reviewed-by: Ashwin Chaugule <ashwinc@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2011-08-31	ARM: perf: refactor event mapping	Mark Rutland
	Currently mapping an event type to a hardware configuration value depends on the data being pointed to from struct arm_pmu. These fields (cache_map, event_map, raw_event_mask) are currently specific to CPU PMUs, and do not serve the general case well. This patch replaces the event map pointers on struct arm_pmu with a new 'map_event' function pointer. Small shim functions are used to reuse the existing common code. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Jamie Iles <jamie@jamieiles.com> Reviewed-by: Ashwin Chaugule <ashwinc@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2011-08-31	ARM: perf: add type field to struct arm_pmu	Mark Rutland
	Currently, the ARM perf code assumes all PMUs it will handle are CPU PMUs, having ARM_PMU_DEVICE_CPU hardcoded when reserving or releasing hardware. This means that currently, the ARM perf code can't support system PMUs. This patch adds a 'type' field to struct arm_pmu, which allows the code to reserve & release the hardware regardless of the PMU type. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Jamie Iles <jamie@jamieiles.com> Reviewed-by: Ashwin Chaugule <ashwinc@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2011-08-31	ARM: perf: lock PMU registers per-CPU	Mark Rutland
	Currently, a single lock serialises access to CPU PMU registers. This global locking is unnecessary as PMU registers are local to the CPU they monitor. This patch replaces the global lock with a per-CPU lock. As the lock is in struct cpu_hw_events, PMUs providing a single cpu_hw_events instance can be locked globally. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Jamie Iles <jamie@jamieiles.com> Reviewed-by: Ashwin Chaugule <ashwinc@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2011-08-31	ARM: perf: remove unnecessary armpmu->stop	Mark Rutland
	As armpmu_disable will call armpmu->stop when the last event has been removed, this is pointless and simply adds to the noise when debugging. Additionally, due to this call occurring in a preemptible context, this is problematic for per-cpu locking of PMU registers (where we will attempt to access per-cpu spinlock for use with raw_spin_lock_irqsave). This patch removes the call to armpmu->stop. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Jamie Iles <jamie@jamieiles.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2011-08-31	ARM: perf: indirect access to cpu_hw_events	Mark Rutland
	Currently, cpu_hw_events is a global per-CPU variable. To enable support for multiple PMUs, there needs to be a mapping from an instance of arm_pmu to its cpu_hw_events. Additionally, as system PMUs are not CPU-affine, they should not have this stored per-CPU. This patch moves access to the hardware events data behind an accessor function (arm_pmu::get_hw_events). This allows each instance to have its own hardware event data, which can be stored per-CPU or globally as required. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Jamie Iles <jamie@jamieiles.com> Reviewed-by: Ashwin Chaugule <ashwinc@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2011-08-31	ARM: perf: move platform device to struct arm_pmu	Mark Rutland
	Currently the ARM perf code supports having a single struct platform_device to supply IRQ numbers, limiting it to supporting a single PMU. This patch makes a platform_device instance variable on struct arm_pmu. This should allow for multiple PMUs to be supported in future. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Jamie Iles <jamie@jamieiles.com> Reviewed-by: Ashwin Chaugule <ashwinc@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2011-08-31	ARM: perf: move active_events into struct arm_pmu	Mark Rutland
	This patch moves the active_events counter into struct arm_pmu, in preparation for supporting multiple PMUs. This also moves pmu_reserve_mutex, as it is used to guard accesses to active_events. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Jamie Iles <jamie@jamieiles.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2011-08-31	ARM: perf: remove active_mask	Mark Rutland
	Currently, pmu_hw_events::active_mask is used to keep track of which events are active in hardware. As we can stop counters and their interrupts, this is unnecessary. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Jamie Iles <jamie@jamieiles.com> Reviewed-by: Ashwin Chaugule <ashwinc@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2011-08-31	ARM: perf: clean up event group validation	Mark Rutland
	Currently, event group validation compares each event's 'pmu' pointer against the static 'pmu' pointer. This limits the code to supporting only 1 PMU. This patch changes the behaviour to consider an event's group leader's 'pmu' pointer as canonical for validation. This should ease later generalisation of the code to support multiple PMUs at once. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Jamie Iles <jamie@jamieiles.com> Reviewed-by: Ashwin Chaugule <ashwinc@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2011-08-31	ARM: perf: only register a CPU PMU when present	Mark Rutland
	Currently, an "empty" struct pmu is registered as the CPU PMU, regardless of whether there is a physical PMU. This burdens the accessor functions with checks to see whether a PMU is actually present. This patch changes initialisation to register a PMU only if there is a supported PMU present, and removes the checks that this change makes redundant. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Jamie Iles <jamie@jamieiles.com> Reviewed-by: Ashwin Chaugule <ashwinc@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>