summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2012-07-03powerpc: Fix assmption of end_of_DRAM() returns end addressBharat Bhushan
memblock_end_of_DRAM() returns end_address + 1, not end address. While some code assumes that it returns end address. Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc: Optimise the 64bit optimised __clear_userAnton Blanchard
I blame Mikey for this. He elevated my slightly dubious testcase: to benchmark status. And naturally we need to be number 1 at creating zeros. So lets improve __clear_user some more. As Paul suggests we can use dcbz for large lengths. This patch gets the destination cacheline aligned then uses dcbz on whole cachelines. Before: 10485760000 bytes (10 GB) copied, 0.414744 s, 25.3 GB/s After: 10485760000 bytes (10 GB) copied, 0.268597 s, 39.0 GB/s 39 GB/s, a new record. Signed-off-by: Anton Blanchard <anton@samba.org> Tested-by: Olof Johansson <olof@lixom.net> Acked-by: Olof Johansson <olof@lixom.net> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc/iommu: Implement IOMMU pools to improve multiqueue adapter performanceAnton Blanchard
At the moment all queues in a multiqueue adapter will serialise against the IOMMU table lock. This is proving to be a big issue, especially with 10Gbit ethernet. This patch creates 4 pools and tries to spread the load across them. If the table is under 1GB in size we revert back to the original behaviour of 1 pool and 1 largealloc pool. We create a hash to map CPUs to pools. Since we prefer interrupts to be affinitised to primary CPUs, without some form of hashing we are very likely to end up using the same pool. As an example, POWER7 has 4 way SMT and with 4 pools all primary threads will map to the same pool. The largealloc pool is reduced from 1/2 to 1/4 of the space to partially offset the overhead of breaking the table up into pools. Some performance numbers were obtained with a Chelsio T3 adapter on two POWER7 boxes, running a 100 session TCP round robin test. Performance improved 69% with this patch applied. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc/iommu: Push spinlock into iommu_range_alloc and __iommu_freeAnton Blanchard
In preparation for IOMMU pools, push the spinlock into iommu_range_alloc and __iommu_free. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc/iommu: Reduce spinlock coverage in iommu_freeAnton Blanchard
This patch moves tce_free outside of the lock in iommu_free. Some performance numbers were obtained with a Chelsio T3 adapter on two POWER7 boxes, running a 100 session TCP round robin test. Performance improved 25% with this patch applied. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc/iommu: Reduce spinlock coverage in iommu_alloc and iommu_freeAnton Blanchard
We currently hold the IOMMU spinlock around tce_build and tce_flush. This causes our spinlock hold times to be much higher than required and can impact multiqueue adapters. This patch moves tce_build and tce_flush outside of the lock in iommu_alloc, and tce_flush outside of the lock in iommu_free. Some performance numbers were obtained with a Chelsio T3 adapter on two POWER7 boxes, running a 100 session TCP round robin test. Performance improved 32% with this patch applied. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc/pseries: Disable interrupts around IOMMU percpu data accessesAnton Blanchard
tce_buildmulti_pSeriesLP uses a per cpu page to communicate with the hypervisor. We currently rely on the IOMMU table spinlock but subsequent patches will be removing that so disable interrupts around all accesses of tce_page. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc: POWER7 optimised memcpy using VMX and enhanced prefetchAnton Blanchard
Implement a POWER7 optimised memcpy using VMX and enhanced prefetch instructions. This is a copy of the POWER7 optimised copy_to_user/copy_from_user loop. Detailed implementation and performance details can be found in commit a66086b8197d (powerpc: POWER7 optimised copy_to_user/copy_from_user using VMX). I noticed memcpy issues when profiling a RAID6 workload: .memcpy .async_memcpy .async_copy_data .__raid_run_ops .handle_stripe .raid5d .md_thread I created a simplified testcase by building a RAID6 array with 4 1GB ramdisks (booting with brd.rd_size=1048576): # mdadm -CR -e 1.2 /dev/md0 --level=6 -n4 /dev/ram[0-3] I then timed how long it took to write to the entire array: # dd if=/dev/zero of=/dev/md0 bs=1M Before: 892 MB/s After: 999 MB/s A 12% improvement. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc: Use enhanced touch instructions in POWER7 copy_to_user/copy_from_userAnton Blanchard
Version 2.06 of the POWER ISA introduced enhanced touch instructions, allowing us to specify a number of attributes including the length of a stream. This patch adds a software stream for both loads and stores in the POWER7 copy_tofrom_user loop. Since the setup is quite complicated and we have to use an eieio to ensure correct ordering of the "GO" command we only do this for copies above 4kB. To quantify any performance improvements we need a working set bigger than the caches so we operate on a 1GB file: # dd if=/dev/zero of=/tmp/foo bs=1M count=1024 And we compare how fast we can read the file: # dd if=/tmp/foo of=/dev/null bs=1M before: 7.7 GB/s after: 9.6 GB/s A 25% improvement. The worst case for this patch will be a completely L1 cache contained copy of just over 4kB. We can test this with the copy_to_user testcase we used to tune copy_tofrom_user originally: http://ozlabs.org/~anton/junkcode/copy_to_user.c # time ./copy_to_user2 -l 4224 -i 10000000 before: 6.807 s after: 6.946 s A 2% slowdown, which seems reasonable considering our data is unlikely to be completely L1 contained. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc/pci: cleanup on duplicate assignmentGavin Shan
While creating the PCI root bus through function pci_create_root_bus() of PCI core, it should have assigned the secondary bus number for the newly created PCI root bus. Thus we needn't do the explicit assignment for the secondary bus number again in pcibios_scan_phb(). Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc/numa: Fix OF node refcounting bugGavin Shan
The form affinity for NUMA is set to 1 if the firmware supports OPAL. Otherwise, we have to retrieve that from OF node "/chosen". For the latter case, OF node "/chosen" reference count was never decreased. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc: POWER7 optimised copy_page using VMX and enhanced prefetchAnton Blanchard
Implement a POWER7 optimised copy_page using VMX and enhanced prefetch instructions. We use enhanced prefetch hints to prefetch both the load and store side. We copy a cacheline at a time and fall back to regular loads and stores if we are unable to use VMX (eg we are in an interrupt). The following microbenchmark was used to assess the impact of the patch: http://ozlabs.org/~anton/junkcode/page_fault_file.c We test MAP_PRIVATE page faults across a 1GB file, 100 times: # time ./page_fault_file -p -l 1G -i 100 Before: 22.25s After: 18.89s 17% faster Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc: Rename copyuser_power7_vmx.c to vmx-helper.cAnton Blanchard
Subsequent patches will add more VMX library functions and it makes sense to keep all the c-code helper functions in the one file. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc: Clear RI and EE at the same time in system call exitAnton Blanchard
mtmsrd is an expensive instruction, we save a few cycles by doing it once instead of twice. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc: Use enhanced touch instructions in POWER7 copy_to_user/copy_from_userAnton Blanchard
Version 2.06 of the POWER ISA introduced enhanced touch instructions, allowing us to specify a number of attributes including the length of a stream. This patch adds a software stream for both loads and stores in the POWER7 copy_tofrom_user loop. Since the setup is quite complicated and we have to use an eieio to ensure correct ordering of the "GO" command we only do this for copies above 4kB. To quantify any performance improvements we need a working set bigger than the caches so we operate on a 1GB file: # dd if=/dev/zero of=/tmp/foo bs=1M count=1024 And we compare how fast we can read the file: # dd if=/tmp/foo of=/dev/null bs=1M before: 7.7 GB/s after: 9.6 GB/s A 25% improvement. The worst case for this patch will be a completely L1 cache contained copy of just over 4kB. We can test this with the copy_to_user testcase we used to tune copy_tofrom_user originally: http://ozlabs.org/~anton/junkcode/copy_to_user.c # time ./copy_to_user2 -l 4224 -i 10000000 before: 6.807 s after: 6.946 s A 2% slowdown, which seems reasonable considering our data is unlikely to be completely L1 contained. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc/smp: remove call to ipi_call_lock()/ipi_call_unlock()Yong Zhang
1) call_function.lock used in smp_call_function_many() is just to protect call_function.queue and &data->refs, cpu_online_mask is outside of the lock. And it's not necessary to protect cpu_online_mask, because data->cpumask is pre-calculate and even if a cpu is brougt up when calling arch_send_call_function_ipi_mask(), it's harmless because validation test in generic_smp_call_function_interrupt() will take care of it. 2) For cpu down issue, stop_machine() will guarantee that no concurrent smp_call_fuction() is processing. Signed-off-by: Yong Zhang <yong.zhang0@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc: 64bit optimised __clear_userAnton Blanchard
I noticed __clear_user high up in a profile of one of my RAID stress tests. The testcase was doing a dd from /dev/zero which ends up calling __clear_user. __clear_user is basically a loop with a single 4 byte store which is horribly slow. We can do much better by aligning the desination and doing 32 bytes of 8 byte stores in a loop. The following testcase was used to verify the patch: http://ozlabs.org/~anton/junkcode/stress_clear_user.c To show the improvement in performance I ran a dd from /dev/zero to /dev/null on a POWER7 box: Before: # dd if=/dev/zero of=/dev/null bs=1M count=10000 10485760000 bytes (10 GB) copied, 3.72379 s, 2.8 GB/s After: # time dd if=/dev/zero of=/dev/null bs=1M count=10000 10485760000 bytes (10 GB) copied, 0.728318 s, 14.4 GB/s Over 5x faster. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc: tracing: Avoid tracepoint duplication with DECLARE_EVENT_CLASSAnton Blanchard
irq_entry, irq_exit, timer_interrupt_entry and timer_interrupt_exit all do the same thing so use DECLARE_EVENT_CLASS to avoid duplicating everything 4 times. This saves quite a lot of space in both instruction text and data: text data bss dec hex filename 9265 19622 16 28903 70e7 arch/powerpc/kernel/irq.o 6817 19019 16 25852 64fc arch/powerpc/kernel/irq.o Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc: Enable jump label supportAnton Blanchard
When looking through some instruction traces I noticed our tracepoint checks were inline. It turns out we don't have CONFIG_JUMP_LABEL enabled. By enabling CONFIG_JUMP_LABEL we replace a load/compare/branch with a nop at every tracepoint call. For example in do_IRQ: CONFIG_JUMP_LABEL disabled: stdx 3,11,9 lwz 0,8(29) cmpwi 7,0,0 bne- 7,.L124 bl .irq_enter CONFIG_JUMP_LABEL enabled: stdx 3,11,9 nop bl .irq_enter Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc/pseries/cpuidle: Replace pseries_notify_cpuidle_add call with notifierDeepthi Dharwar
The following patch is to remove the pseries_notify_add_cpu() call and replace it by a hot plug notifier. This would prevent cpuidle resources being released and allocated each time cpu comes online on pseries. The earlier design was causing a lockdep problem in start_secondary as reported on this thread -https://lkml.org/lkml/2012/5/17/2 This applies on 3.4-rc7 Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc/pseries/iommu: remove default window before attempting DDW manipulationNishanth Aravamudan
An upcoming release of firmware will add DDW extensions, in particular an API to "reset" the DMA window to the original configuration (32-bit, 2GB in size). With that API available, we can safely remove the default window, increasing the resources available to firmware for creation of larger windows for the slot in question -- if we encounter an error, we can use the new API to reset the state of the slot. Further, this same release of firmware will make it a hard requirement for OSes to release the existing window before any other windows will be shown as available, to avoid conflicts in addressing between the two windows. In anticipation of these changes, always remove the default window before we do any DDW manipulations. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc/ftrace: Use patch_instruction instead of probe_kernel_write()Steven Rostedt
The patch_instruction() interface is made to modify kernel text. It is safer to use that then the probe_kernel_write() when modifying kernel code. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc: Have patch_instruction detect faultsSteven Rostedt
For ftrace to use the patch_instruction code, it needs to check for faults on write. Ftrace updates code all over the kernel, and we need to know if code is updated or not due to protections that are placed on some portions of the kernel. If ftrace does not detect a fault, it will error later on, and it will be much more difficult to find the problem. By changing patch_instruction() to detect faults, then ftrace will be able to make use of it too. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc/ftrace: Have PPC skip updating with stop_machine()Steven Rostedt
PowerPC does not have the synchronization issues that x86 has with modifying code on one CPU while another CPU is executing it. The other CPU will either see the old or new code without any issues, unlike x86 which may issue a GPF. Instead of calling the heavy stop_machine, just update the code. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03powerpc/boot: Only build board support files when required.Tony Breeds
Currently we build all board files regardless of the final zImage target. This is sub-optimal (in terms on compilation) and leads to problems in one platform needlessly causing failures for other platforms. Use the Kconfig variables to selectively construct this board files to build. Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-01Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-armLinus Torvalds
Pull two ARM fixes from Russell King: "It's been fairly quiet with the fixes. Just two this time. One fixes a long standing problem with KALLSYMS needing an additional pass, and the other sorts a problem with the vmalloc space interacting with static IO mappings." * 'fixes' of git://git.linaro.org/people/rmk/linux-arm: ARM: 7438/1: fill possible PMD empty section gaps ARM: 7428/1: Prevent KALLSYM size mismatch on ARM.
2012-07-01ARM: 7438/1: fill possible PMD empty section gapsNicolas Pitre
On ARM with the 2-level page table format, a PMD entry is represented by two consecutive section entries covering 2MB of virtual space. However, static mappings always were allowed to use separate 1MB section entries. This means in practice that a static mapping may create half populated PMDs via create_mapping(). Since commit 0536bdf33f (ARM: move iotable mappings within the vmalloc region) those static mappings are located in the vmalloc area. We must ensure no such half populated PMDs are accessible once vmalloc() or ioremap() start looking at the vmalloc area for nearby free virtual address ranges, or various things leading to a kernel crash will happen. Signed-off-by: Nicolas Pitre <nico@linaro.org> Reported-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Tested-by: "R, Sricharan" <r.sricharan@ti.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: stable@vger.kernel.org Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-06-30Linux 3.5-rc5Linus Torvalds
2012-06-30Merge tag 'fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: "Another week, another batch of fixes. All are small, contained, targeted fixes for explicit problems -- mostly build and boot failures across i.MX, OMAP, Renesas/Shmobile and Samsung." * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ARM: imx6q: fix suspend regression caused by common clk migration ARM: OMAP4470: Fix OMAP4470 boot failure ARM: EXYNOS: Fix EXYNOS_DEV_DMA Kconfig entry ARM: OMAP2+: nand: fix build error when CONFIG_MTD_ONENAND_OMAP2=n ARM: shmobile: r8a7779: Route all interrupts to ARM ARM: shmobile: kzm9d: use late init machine hook ARM: shmobile: kzm9g: use late init machine hook ARM: mach-shmobile: armadillo800eva: Use late init machine hook ARM: SAMSUNG: Fix for S3C2412 EBI memory mapping ARM: mach-shmobile: add missing GPIO IRQ configuration on mackerel ARM: mach-shmobile: Fix build when SMP is enabled and EMEV2 is not enabled ARM: shmobile: sh7372: bugfix: chclr_offset base ARM: shmobile: sh73a0: bugfix: SY-DMAC number ARM: SAMSUNG: Should check for IS_ERR(clk) instead of NULL
2012-06-30printk.c: fix kernel-doc warningsRandy Dunlap
Fix kernel-doc warnings in printk.c: use correct parameter name. Warning(kernel/printk.c:2429): No description found for parameter 'buf' Warning(kernel/printk.c:2429): Excess function parameter 'line' description in 'kmsg_dump_get_buffer' Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-30linux/irq.h: fix kernel-doc warningRandy Dunlap
Fix kernel-doc warning. This struct member was removed in commit 875682648b89 ("irq: Remove irq_chip->release()") so remove its associated kernel-doc entry also. Warning(include/linux/irq.h:338): Excess struct/union/enum/typedef member 'release' description in 'irq_chip' Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Cc: Richard Weinberger <richard@nod.at> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-30Merge branch 'v3.5-samsung-fixes-1' of ↵Olof Johansson
git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung into fixes * 'v3.5-samsung-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung: ARM: EXYNOS: Fix EXYNOS_DEV_DMA Kconfig entry ARM: SAMSUNG: Fix for S3C2412 EBI memory mapping ARM: SAMSUNG: Should check for IS_ERR(clk) instead of NULL
2012-06-30ARM: imx6q: fix suspend regression caused by common clk migrationShawn Guo
When moving to common clk framework, the imx6q clks rom and mmdc_ch1_axi get different on/off states than old clk driver, which breaks suspend function. There might be a better way to manage these clocks, but let's takes the old clk driver approach to fix the regression first. Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Olof Johansson <olof@lixom.net>
2012-06-30Merge tag 'omap-fixes-for-v3.5-rc4' of ↵Olof Johansson
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes From Tony Lindgren: "Here's one more regression fix that I missed earlier, and a trivial fix to get omap4470 booting." * tag 'omap-fixes-for-v3.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: OMAP4470: Fix OMAP4470 boot failure ARM: OMAP2+: nand: fix build error when CONFIG_MTD_ONENAND_OMAP2=n
2012-06-30Merge branch 'release' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux Pull ACPI & Power Management patches from Len Brown. * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: acpi_pad: fix power_saving thread deadlock ACPI video: Still use ACPI backlight control if _DOS doesn't exist ACPI, APEI, Avoid too much error reporting in runtime ACPI: Add a quirk for "AMILO PRO V2030" to ignore the timer overriding ACPI: Remove one board specific WARN when ignoring timer overriding ACPI: Make acpi_skip_timer_override cover all source_irq==0 cases ACPI, x86: fix Dell M6600 ACPI reboot regression via DMI ACPI sysfs.c strlen fix
2012-06-30Merge tag 'driver-core-3.5-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver Core fixes from Greg Kroah-Hartman: "Here is a number of printk() fixes, specifically a few reported by the crazy blog program that ships in SUSE releases (that's "boot log" and not "web log", it predates the general "blog" terminology by many years), and the restoration of the continuation line functionality reported by Stephen and others. Yes, the changes seem a bit big this late in the cycle, but I've been beating on them for a while now, and Stephen has even optimized it a bit, so all looks good to me. The other change in here is a Documentation update for the stable kernel rules describing how some distro patches should be backported, to hopefully drive a bit more response from the distros to the stable kernel releases. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>" * tag 'driver-core-3.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: printk: Optimize if statement logic where newline exists printk: flush continuation lines immediately to console syslog: fill buffer with more than a single message for SYSLOG_ACTION_READ Revert "printk: return -EINVAL if the message len is bigger than the buf size" printk: fix regression in SYSLOG_ACTION_CLEAR stable: Allow merging of backports for serious user-visible performance issues
2012-06-30Merge branches 'acpi_pad-bugzilla-42981', 'apei-bugzilla-43282', ↵Len Brown
'video-bugzilla-43168', 'bugzilla-40002' and 'bugfix-misc' into release bug fixes
2012-06-30acpi_pad: fix power_saving thread deadlockStuart Hayes
The acpi_pad driver can get stuck in destroy_power_saving_task() waiting for kthread_stop() to stop a power_saving thread. The problem is that the isolated_cpus_lock mutex is owned when destroy_power_saving_task() calls kthread_stop(), which waits for a power_saving thread to end, and the power_saving thread tries to acquire the isolated_cpus_lock when it calls round_robin_cpu(). This patch fixes the issue by making round_robin_cpu() use its own mutex. https://bugzilla.kernel.org/show_bug.cgi?id=42981 Cc: stable@vger.kernel.org Signed-off-by: Stuart Hayes <Stuart_Hayes@Dell.com> Signed-off-by: Len Brown <len.brown@intel.com>
2012-06-30ACPI video: Still use ACPI backlight control if _DOS doesn't existZhang Rui
This fixes a regression in 3.4-rc1 caused by commit ea9f8856bd6d4ed45885b06a338f7362cd6c60e5 (ACPI video: Harden video bus adding.) Some platforms don't have _DOS control method, but the ACPI backlight still works. We should not invoke _DOS for these platforms. https://bugzilla.kernel.org/show_bug.cgi?id=43168 Cc: Igor Murzov <intergalactic.anonymous@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2012-06-29Merge tag 'pm-for-3.5-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael J. Wysocki: * Fix for a bug in async suspend error code path causing parents to wait forever for their children in case of a suspend error from Mandeep Singh Baines (-stable metarial). * Fix for a suspend regression related to earlier changes in the ACPI cpuidle driver from Deepthi Dharwar. * tag 'pm-for-3.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM / ACPI: Fix suspend/resume regression caused by cpuidle cleanup. PM / Sleep: Prevent waiting forever on asynchronous suspend after abort
2012-06-29printk: Optimize if statement logic where newline existsSteven Rostedt
In reviewing Kay's fix up patch: "printk: Have printk() never buffer its data", I found two if statements that could be combined and optimized. Put together the two 'cont.len && cont.owner == current' if statements into a single one, and check if we need to call cont_add(). This also removes the unneeded double cont_flush() calls. Link: http://lkml.kernel.org/r/1340869133.876.10.camel@mop Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Cc: Kay Sievers <kay@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-06-29Merge branch 'merge' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc fixes from Benjamin Herrenschmidt: "Here are a few powerpc fixes. Arguably some of this should have come to you earlier but I'm only just catching up after my medical leave. Mostly these fixes regressions, a couple are long standing bugs." * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc/pseries: Fix software invalidate TCE powerpc: check_and_cede_processor() never cedes powerpc/ftrace: Do not trace restore_interrupts() powerpc: Fix Section mismatch warnings in prom_init.c ppc64: fix missing to check all bits of _TIF_USER_WORK_MASK in preempt powerpc: Fix uninitialised error in numa.c powerpc: Fix BPF_JIT code to link with multiple TOCs
2012-06-29Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar. * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, cpufeature: Remove stray %s, add -w to mkcapflags.pl x86, cpufeature: Catch duplicate CPU feature strings x86, cpufeature: Rename X86_FEATURE_DTS to X86_FEATURE_DTHERM x86: Fix kernel-doc warnings x86, compat: Use test_thread_flag(TIF_IA32) in compat signal delivery
2012-06-29Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull oprofile fixlet from Ingo Molnar. * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: oprofile: perf: use NR_CPUS instead or nr_cpumask_bits for static array
2012-06-29Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU fix from Ingo Molnar. Fixes a bug introduced in this merge window by commit b1420f1c ("Make rcu_barrier() less disruptive") * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: rcu: Stop rcu_do_batch() from multiplexing the "count" variable
2012-06-29printk: flush continuation lines immediately to consoleKay Sievers
Continuation lines are buffered internally, intended to merge the chunked printk()s into a single record, and to isolate potentially racy continuation users from usual terminated line users. This though, has the effect that partial lines are not printed to the console in the moment they are emitted. In case the kernel crashes in the meantime, the potentially interesting printed information would never reach the consoles. Here we share the continuation buffer with the console copy logic, and partial lines are always immediately flushed to the available consoles. They are still buffered internally to improve the readability and integrity of the messages and minimize the amount of needed record headers to store. Signed-off-by: Kay Sievers <kay@vrfy.org> Tested-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-06-29powerpc/pseries: Fix software invalidate TCEMichael Neuling
The following added support for powernv but broke pseries/BML: 1f1616e powerpc/powernv: Add TCE SW invalidation support TCE_PCI_SW_INVAL was split into FREE and CREATE flags but the tests in the pseries code were not updated to reflect this. Signed-off-by: Michael Neuling <mikey@neuling.org> cc: stable@kernel.org [v3.3+] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-06-29powerpc: check_and_cede_processor() never cedesAnton Blanchard
Commit f948501b36c6 ("Make hard_irq_disable() actually hard-disable interrupts") caused check_and_cede_processor to stop working. ->irq_happened will never be zero right after a hard_irq_disable so the compiler removes the call to cede_processor completely. The bug was introduced back in the lazy interrupt handling rework of 3.4 but was hidden until recently because hard_irq_disable did nothing. This issue will eventually appear in 3.4 stable since the hard_irq_disable fix is marked stable, so mark this one for stable too. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: stable@vger.kernel.org Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-06-29powerpc/ftrace: Do not trace restore_interrupts()Steven Rostedt
As I was adding code that affects all archs, I started testing function tracer against PPC64 and found that it currently locks up with 3.4 kernel. I figured it was due to tracing a function that shouldn't be, so I went through the following process to bisect to find the culprit: cat /debug/tracing/available_filter_functions > t num=`wc -l t` sed -ne "1,${num}p" t > t1 let num=num+1 sed -ne "${num},$p" t > t2 cat t1 > /debug/tracing/set_ftrace_filter echo function /debug/tracing/current_tracer <failed? bisect t1, if not bisect t2> It finally came down to this function: restore_interrupts() I'm not sure why this locks up the system. It just seems to prevent scheduling from occurring. Interrupts seem to still work, as I can ping the box. But all user processes freeze. When restore_interrupts() is not traced, function tracing works fine. Cc: stable@kernel.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-06-29powerpc: Fix Section mismatch warnings in prom_init.cLi Zhong
This patches tries to fix a couple of Section mismatch warnings like following one: WARNING: arch/powerpc/kernel/built-in.o(.text+0x2923c): Section mismatch in reference from the function .prom_query_opal() to the function .init.text:.call_prom() The function .prom_query_opal() references the function __init .call_prom(). This is often because .prom_query_opal lacks a __init annotation or the annotation of .call_prom is wrong. Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>