linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message	Author
2010-12-20	drm/radeon: use aperture size not vram size for overlap tests	Dave Airlie
	This fixes a problem where the wrong card conflicts with vesafb in my x2 system. Signed-off-by: Dave Airlie <airlied@redhat.com>
2010-12-19	sched: Remove debugging check	Ingo Molnar
	Linus reported that the new warning introduced by commit f26f9aff6aaf "Sched: fix skip_clock_update optimization" triggers. The need_resched flag can be set by other CPUs asynchronously so this debug check is bogus - remove it. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <AANLkTinJ8hAG1TpyC+CSYPR47p48+1=E7fiC45hMXT_1@mail.gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-12-19	ux500: platsmp: Fix section mismatch	Jonas Aaberg
	Signed-off-by: Jonas Aaberg <jonas.aberg@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
2010-12-19	mach-ux500: add STMPE1601 platform data	Sundar Iyer
	Signed-off-by: Sundar Iyer <sundar.iyer@stericsson.com> [Minor fixups to GPIO enumerators] Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
2010-12-19	mach-ux500: move keymaps to new file	Sundar Iyer
	Move keylayouts to a dedicated file and plug these keylayouts for input platform data. This will make addition of new and custom keylayouts localized. Signed-off-by: Sundar Iyer <sundar.iyer@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
2010-12-19	Merge branches 'x86-fixes-for-linus' and 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip	Linus Torvalds
	* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
	  x86-32: Make sure we can map all of lowmem if we need to
	  x86, vt-d: Handle previous faults after enabling fault handling
	  x86: Enable the intr-remap fault handling after local APIC setup
	  x86, vt-d: Fix the vt-d fault handling irq migration in the x2apic mode
	  x86, vt-d: Quirk for masking vtd spec errors to platform error handling logic
	  x86, xsave: Use alloc_bootmem_align() instead of alloc_bootmem()
	  bootmem: Add alloc_bootmem_align()
	  x86, gcc-4.6: Use gcc -m options when building vdso
	  x86: HPET: Chose a paranoid safe value for the ETIME check
	  x86: io_apic: Avoid unused variable warning when CONFIG_GENERIC_PENDING_IRQ=n
	* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
	  perf: Fix off by one in perf_swevent_init()
	  perf: Fix duplicate events with multiple-pmu vs software events
	  ftrace: Have recordmcount honor endianness in fn_ELF_R_INFO
	  scripts/tags.sh: Add magic for trace-events
	  tracing: Fix panic when lseek() called on "trace" opened for writing
2010-12-19	Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip	Linus Torvalds
	* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
	  sched: Fix the irqtime code for 32bit
	  sched: Fix the irqtime code to deal with u64 wraps
	  nohz: Fix get_next_timer_interrupt() vs cpu hotplug
	  Sched: fix skip_clock_update optimization
	  sched: Cure more NO_HZ load average woes
2010-12-19	mfd/tc3589x: add suspend/resume support	Sundar Iyer
	Acked-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Sundar Iyer <sundar.iyer@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
2010-12-19	mfd/tc3589x: undo gpio module reset during chip init	Sundar Iyer
	Skip putting the GPIO module into a reset during the chip init. This makes sure to preserve any existing GPIO configurations done by pre-kernel boot code. Acked-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Sundar Iyer <sundar.iyer@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
2010-12-19	mfd/tc3589x: fix random interrupt misses	Sundar Iyer
	On the TC35892, a random delayed interrupt clear (GPIO IC) write locks up the child interrupts. In such a case, the original interrupt is active and not yet acknowledged. Re-check the IRQST bit for any pending interrupts and handle those. Acked-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Sundar Iyer <sundar.iyer@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
2010-12-19	mfd/tc3589x: add block identifier for multiple child devices	Sundar Iyer
	Add a block identifier so that multiple mfd clients can be added to the mfd core. Acked-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Sundar Iyer <sundar.iyer@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
2010-12-19	mfd/tc3589x: rename tc35892 structs/registers to tc3589x	Sundar Iyer
	Most of the register layout and client IRQ numbers on the TC35892 are also shared by other variants. Make this generic as tc3589x. Acked-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Sundar Iyer <sundar.iyer@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
2010-12-19	mfd/tc35892: rename tc35892 core driver to tc3589x	Sundar Iyer
	Rename the tc35892 core/gpio drivers to tc3589x to include new variants in the same mfd core. Acked-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Sundar Iyer <sundar.iyer@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
2010-12-19	mfd/tc35892: rename tc35892 header to tc3589x	Sundar Iyer
	Rename the header file to include further variants within the same mfd core driver. Acked-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Sundar Iyer <sundar.iyer@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
2010-12-19	mach-ux500: deprecate spi support for ab8500	Sundar Iyer
	Acked-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Sundar Iyer <sundar.iyer@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
2010-12-19	ARM: fix cache-feroceon-l2 after stack based kmap_atomic()	Nicolas Pitre
	Since commit 3e4d3af501 "mm: stack based kmap_atomic()", it is actively wrong to rely on fixed kmap type indices (namely KM_L2_CACHE) as kmap_atomic() totally ignores them and a concurrent instance of it may happily reuse any slot for any purpose. Because kmap_atomic() is now able to deal with reentrancy, we can get rid of the ad hoc mapping here. While the code is made much simpler, there is a needless cache flush introduced by the usage of __kunmap_atomic(). It is not clear if the performance difference to remove that is worth the cost in code maintenance (I don't think there are that many highmem users on that platform anyway) but that should be reconsidered when/if someone cares enough to do some measurements. Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
2010-12-19	ARM: fix cache-xsc3l2 after stack based kmap_atomic()	Nicolas Pitre
	Since commit 3e4d3af501 "mm: stack based kmap_atomic()", it is actively wrong to rely on fixed kmap type indices (namely KM_L2_CACHE) as kmap_atomic() totally ignores them and a concurrent instance of it may happily reuse any slot for any purpose. Because kmap_atomic() is now able to deal with reentrancy, we can get rid of the ad hoc mapping here, and we even don't have to disable IRQs anymore (highmem case). While the code is made much simpler, there is a needless cache flush introduced by the usage of __kunmap_atomic(). It is not clear if the performance difference to remove that is worth the cost in code maintenance (I don't think there are that many highmem users on that platform if at all anyway). Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
2010-12-19	ARM: get rid of kmap_high_l1_vipt()	Nicolas Pitre
	Since commit 3e4d3af501 "mm: stack based kmap_atomic()", it is no longer necessary to carry an ad hoc version of kmap_atomic() added in commit 7e5a69e83b "ARM: 6007/1: fix highmem with VIPT cache and DMA" to cope with reentrancy. In fact, it is now actively wrong to rely on fixed kmap type indices (namely KM_L1_CACHE) as kmap_atomic() totally ignores them now and a concurrent instance of it may reuse any slot for any purpose. Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
2010-12-19	ARM: 6530/1: mmci: partially revert clock divisor code	Linus Walleij
	I misread the datasheet as if bypass mode was not available at all on the ux500's, I was wrong. It is there, the datasheet just states that you should not have to use it. Signed-off-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-19	ARM: 6526/1: mmci: corrected calculation of clock div for ux500	Linus Walleij
	The Ux500 variant of this block has a different divider. The value used right now is too big, which means a loss in performance. This fix corrects it. Also expand the math comments a bit so it's clear what's happening. Further, the Ux500 variant does not like it if we use the BYPASS bit; instead we are supposed to set the clock divider to zero. Signed-off-by: Ulf Hansson <ulf.hansson@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-19	ARM: AT91: update clock source registration	Russell King
	In d7e81c2 (clocksource: Add clocksource_register_hz/khz interface) new interfaces were added which simplify (and optimize) the selection of the divisor shift/mult constants. Switch over to using this new interface. Acked-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-19	ARM: clockevents: fix IOP clock events initialization	Russell King
	Ensure that no interrupt is pending before registering the clock event device, and properly initialize the periodic tick in the ->set_mode callback. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-19	sched: Fix interactivity bug by charging unaccounted run-time on entity re-weight	Paul Turner
	Mike Galbraith reported poor interactivity[*] when the new shares distribution code was combined with autogroups. The root cause turns out to be a mis-ordering of accounting accrued execution time and shares updates. Since update_curr() is issued hierarchically, updating the parent entity weights to reflect child enqueue/dequeue results in the parent's unaccounted execution time then being accrued (vs vruntime) at the new weight as opposed to the weight present at accumulation. While this doesn't have much effect on processes with timeslices that cross a tick, it is particularly problematic for an interactive process (e.g. Xorg) which incurs many (tiny) timeslices. In this scenario almost all updates are at dequeue which can result in significant fairness perturbation (especially if it is the only thread, resulting in potential {tg->shares, MIN_SHARES} transitions). Correct this by ensuring unaccounted time is accumulated prior to manipulating an entity's weight.
	[*] http://xkcd.com/619/ is perversely Nostradamian here.
	Signed-off-by: Paul Turner <pjt@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <20101216031038.159704378@google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-12-19	sched: Move periodic share updates to entity_tick()	Paul Turner
	Long running entities that do not block (dequeue) require periodic updates to maintain accurate share values. (Note: group entities with several threads are quite likely to be non-blocking in many circumstances). By virtue of being long-running however, we will see entity ticks (otherwise the required update occurs in dequeue/put and we are done). Thus we can move the detection (and associated work) for these updates into the periodic path. This restores the 'atomicity' of update_curr() with respect to accounting. Signed-off-by: Paul Turner <pjt@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20101216031038.067028969@google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-12-19	Merge commit 'v2.6.37-rc6' into sched/core	Ingo Molnar
	Merge reason: Update to the latest -rc. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-12-19	firewire: net: set carrier state at ifup	Stefan Richter
	At ifup, carrier status would be shown on even if it actually was off. Also add an include for ethtool_ops rather than to rely on the one from netdevice.h. Note, we can alas not use fwnet_device_mutex to serialize access to dev->peer_count (as I originally wanted). This would cause a lock inversion:
	  - fwnet_probe          | takes fwnet_device_mutex
	    + register_netdev    | takes rtnl_mutex
	  - devinet_ioctl        | takes rtnl_mutex
	    + fwnet_open         | ...must not take fwnet_device_mutex
	Hence use the dev->lock spinlock for serialization. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2010-12-19	firewire: net: add carrier detection	Maxim Levitsky
	To make userland (e.g. NetworkManager) work with firewire, we need to detect whether a cable is plugged in or not. A simple and correct way of doing that is to count the number of peers: no peers means no link, and vice versa. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2010-12-19	oprofile, x86: Add support for 6 counters (AMD family 15h)	Robert Richter
	This patch adds support for up to 6 hardware counters for AMD family 15h cpus. There is a new MSR range for hardware counters beginning at MSRC001_0200 Performance Event Select (PERF_CTL0). Signed-off-by: Robert Richter <robert.richter@amd.com>
2010-12-19	oprofile, x86: Add support for AMD family 15h	Robert Richter
	This patch adds support for AMD family 15h (Interlagos/Valencia/Zambezi) cpus. Signed-off-by: Robert Richter <robert.richter@amd.com>
2010-12-19	MAINTAINERS: Add tomoyo-dev-en ML.	Tetsuo Handa
	MAINTAINERS: Add tomoyo-dev-en ML. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: James Morris <jmorris@namei.org>
2010-12-18	ipv6: remove duplicate neigh_ifdown	stephen hemminger
	When device is being set to down, neigh_ifdown was being called twice. Once from addrconf notifier and once from ndisc notifier. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-18	ipv6: fib6_ifdown cleanup	stephen hemminger
	Remove (unnecessary) casts to make code cleaner. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-18	jbd2: simplify return path of journal_init_common	Theodore Ts'o
	Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-12-18	jbd2: move debug message into debug #ifdef	Theodore Ts'o
	This is a port to jbd2 of a patch which Namhyung Kim <namhyung@gmail.com> originally made to fs/jbd. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-12-18	jbd2: remove unnecessary goto statement	Theodore Ts'o
	This is a port to jbd2 of a patch which Namhyung Kim <namhyung@gmail.com> originally made to fs/jbd. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-12-18	Merge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
	  arch/tile: handle rt_sigreturn() more cleanly
	  arch/tile: handle CLONE_SETTLS in copy_thread(), not user space
2010-12-18	Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus	Linus Torvalds
	* 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus: MIPS: Fix build errors in sc-mips.c
2010-12-18	jbd2: use offset_in_page() instead of manual calculation	Theodore Ts'o
	This is a port to jbd2 of a patch which Namhyung Kim <namhyung@gmail.com> originally made to fs/jbd. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-12-18	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6	Linus Torvalds
	* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
	  x86: avoid high BIOS area when allocating address space
	  x86: avoid E820 regions when allocating address space
	  x86: avoid low BIOS area when allocating address space
	  resources: add arch hook for preventing allocation in reserved areas
	  Revert "resources: support allocating space within a region from the top down"
	  Revert "PCI: allocate bus resources from the top down"
	  Revert "x86/PCI: allocate space from the end of a region, not the beginning"
	  Revert "x86: allocate space within a region top-down"
	  Revert "PCI: fix pci_bus_alloc_resource() hang, prefer positive decode"
	  PCI: Update MCP55 quirk to not affect non HyperTransport variants
2010-12-18	jbd2: Fix a debug message in do_get_write_access()	Theodore Ts'o
	'buffer_head' should be 'journal_head'. This is a port of a patch which Namhyung Kim <namhyung@gmail.com> made to fs/jbd to jbd2. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-12-18	omap4: l2x0: Enable early BRESP bit	Santosh Shilimkar
	The AXI protocol specifies that the write response can only be sent back to an AXI master when the last write data has been accepted. This optimization enables the PL310 to send the write response of certain write transactions as soon as the store buffer accepts the write address. This behavior is not compatible with the AXI protocol and is disabled by default. You enable this optimization by setting the Early BRESP Enable bit in the Auxiliary Control Register (bit [30]). Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Signed-off-by: Mans Rullgard <mans@mansr.com> Tested-by: Nishanth Menon <nm@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
2010-12-18	omap4: l2x0: Set share override bit	Santosh Shilimkar
	Clearing bit 22 in the PL310 Auxiliary Control register (shared attribute override enable) has the side effect of transforming Normal Shared Non-cacheable reads into Cacheable no-allocate reads. Coherent DMA buffers in Linux always have a Cacheable alias via the kernel linear mapping and the processor can speculatively load cache lines into the PL310 controller. With bit 22 cleared, Non-cacheable reads would unexpectedly hit such cache lines, leading to buffer corruption. Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Tested-by: Nishanth Menon <nm@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
2010-12-18	omap4: l2x0: enable instruction and data prefetching	Mans Rullgard
	Enabling L2 prefetching improves performance as shown on Panda ES2.1 board with mem test, and it has measurable impact on performances. I think we should consider it, even though it damages "writes" a bit. (rebased to k.org) Usually the prefetch is used at both levels together L1 + L2, however, to enable the CP15 prefetch engines, these are under security, and on GP devices, we cannot enable it (e.g. on PandaBoard). However, just enabling PL310 prefetch seems to provide performance improvement, as shown in the data below (from Ubuntu) and would be a great thing to pull in. What prefetch does is enable automatic next line prefetching. With this enabled, whenever the PL310 receives a cacheable read request, it automatically prefetches the following cache line as well.
	Measurement Data:
	==
	STOCK 10.10 WITHOUT PATCH
	========================
	~# ./memspeed
	size 8388608 8192k 8M
	offset 8388608, 0
	buffers 0x2aaad000 0x2b2ad000
	copy libc                133 MB/s
	copy Android v5          273 MB/s
	copy Android NEON        235 MB/s
	copy INT32               116 MB/s
	copy ASM ARM             187 MB/s
	copy ASM VLDM 64         204 MB/s
	copy ASM VLDM 128        173 MB/s
	copy ASM VLD1            216 MB/s
	read ASM ARM             286 MB/s
	read ASM VLDM            242 MB/s
	read ASM VLD1            286 MB/s
	write libc              1947 MB/s
	write ASM ARM           1943 MB/s
	write ASM VSTM          1942 MB/s
	write ASM VST1          1935 MB/s
	10.10 + PATCH
	=============
	~# ./memspeed
	size 8388608 8192k 8M
	offset 8388608, 0
	buffers 0x2ab17000 0x2b317000
	copy libc                129 MB/s
	copy Android v5          256 MB/s
	copy Android NEON        356 MB/s
	copy INT32               127 MB/s
	copy ASM ARM             321 MB/s
	copy ASM VLDM 64         337 MB/s
	copy ASM VLDM 128        321 MB/s
	copy ASM VLD1            350 MB/s
	read ASM ARM             496 MB/s
	read ASM VLDM            470 MB/s
	read ASM VLD1            488 MB/s
	write libc              1701 MB/s
	write ASM ARM           1682 MB/s
	write ASM VSTM          1693 MB/s
	write ASM VST1          1681 MB/s
	Signed-off-by: Mans Rullgard <mans@mansr.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Tested-by: Nishanth Menon <nm@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
2010-12-18	omap4: l2x0: Construct the AUXCTRL value using defines	Santosh Shilimkar
	This patch removes the hardcoded auxctrl value and constructs it using bitfields. Bit 25 is reserved and is always set to 1; this patch retains that value. Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Tested-by: Nishanth Menon <nm@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
2010-12-18	ARM: l2x0: Add aux control register bitfields	Santosh Shilimkar
	This patch adds the PL310 Auxiliary Control Register bitfields so that SoCs can use these bitfields to construct the AUXCTRL value to be passed/programmed instead of hardcoding it. Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
2010-12-18	vmstat: Use per cpu atomics to avoid interrupt disable / enable	Christoph Lameter
	Currently the operations to increment vm counters must disable interrupts in order to not mess up their housekeeping of counters. So use this_cpu_cmpxchg() to avoid the overhead. Since we can no longer count on preemption being disabled we still have some minor issues. The fetching of the counter thresholds is racy. A threshold from another cpu may be applied if we happen to be rescheduled on another cpu. However, the following vmstat operation will then bring the counter again under the threshold limit. The operations for __xxx_zone_state are not changed since the caller has taken care of the synchronization needs (and therefore the cycle count is even less than the optimized version for the irq disable case provided here). The optimization using this_cpu_cmpxchg will only be used if the arch supports efficient this_cpu_ops (must have CONFIG_CMPXCHG_LOCAL set!). The use of this_cpu_cmpxchg reduces the cycle count for the counter operations by 80% (inc_zone_page_state goes from 170 cycles to 32). Signed-off-by: Christoph Lameter <cl@linux.com>
2010-12-18	irq_work: Use per cpu atomics instead of regular atomics	Christoph Lameter
	The irq work queue is a per cpu object and it is sufficient for synchronization if per cpu atomics are used. Doing so simplifies the code and reduces the overhead of the code.
	Before:
	christoph@linux-2.6$ size kernel/irq_work.o
	   text    data     bss     dec     hex filename
	    451       8       1     460     1cc kernel/irq_work.o
	After:
	christoph@linux-2.6$ size kernel/irq_work.o
	   text    data     bss     dec     hex filename
	    438       8       1     447     1bf kernel/irq_work.o
	Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Christoph Lameter <cl@linux.com>
2010-12-18	Merge branch 'this_cpu_ops' into for-2.6.38	Tejun Heo

2010-12-18	cpuops: Use cmpxchg for xchg to avoid lock semantics	Christoph Lameter
	Use cmpxchg instead of xchg to realize this_cpu_xchg. xchg will cause LOCK overhead since LOCK is always implied but cmpxchg will not.
	Baselines:
	  xchg() = 18 cycles (no segment prefix, LOCK semantics)
	  __this_cpu_xchg = 1 cycle (simulated using this_cpu_read/write, two prefixes. Looks like the cpu can use loop optimization to get rid of most of the overhead)
	Cycles before:
	  this_cpu_xchg = 37 cycles (segment prefix and LOCK (implied by xchg))
	After:
	  this_cpu_xchg = 11 cycles (using cmpxchg without lock semantics)
	Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2010-12-18	x86: this_cpu_cmpxchg and this_cpu_xchg operations	Christoph Lameter
	Provide support as far as the hardware capabilities of the x86 cpus allow. Define CONFIG_CMPXCHG_LOCAL in Kconfig.cpu to allow core code to test for fast cpuops implementations.
	V1->V2:
	  - Take out the definition for this_cpu_cmpxchg_8 and move it into a separate patch.
	tj:
	  - Reordered ops to better follow this_cpu_* organization.
	  - Renamed macro temp variables similar to their existing neighbours.
	Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>