linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2009-01-21	x86: merge mmu_context.h	Brian Gerst
	Impact: cleanup tj: * changed cpu to unsigned as was done on mmu_context_64.h as cpu id is officially unsigned int * added missing ';' to 32bit version of deactivate_mm() Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-21	x86: set %fs to __KERNEL_PERCPU unconditionally for x86_32	Brian Gerst
	Impact: cleanup %fs is currently set to __KERNEL_DS at boot, and conditionally switched to __KERNEL_PERCPU for secondary cpus. Instead, initialize GDT_ENTRY_PERCPU to the same attributes as GDT_ENTRY_KERNEL_DS and set %fs to __KERNEL_PERCPU unconditionally. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-21	x86: fix percpu_write with 64-bit constants	Brian Gerst
	Impact: slightly better code generation for percpu_to_op() The processor will sign-extend 32-bit immediate values in 64-bit operations. Use the 'e' constraint ("32-bit signed integer constant, or a symbolic reference known to fit that range") for 64-bit constants. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-21	x86: clean up gdt_page definition	Brian Gerst
	Impact: cleanup && more compact percpu area layout with future changes Move 64-bit GDT to page-aligned section and clean up comment formatting. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-21	x86: update canary handling during switch	Tejun Heo
	Impact: cleanup In switch_to(), instead of taking offset to irq_stack_union.stack, make it a proper percpu access using __percpu_arg() and per_cpu_var(). Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-20	x86: remove byte locks	Jiri Kosina
	Impact: cleanup Remove byte locks implementation, which was introduced by Jeremy in 8efcbab6 ("paravirt: introduce a "lock-byte" spinlock implementation"), but turned out to be dead code that is not used by any in-kernel virtualization guest (Xen uses its own variant of spinlocks implementation and KVM is not planning to move to byte locks). Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-20	x86: optimise x86's do_page_fault (C entry point for the page fault path)	Nick Piggin
	Impact: cleanup, restructure code to improve assembly gcc isn't _all_ that smart about spilling registers to stack or reusing stack slots, even with branch annotations. do_page_fault contained a lot of functionality, so split unlikely paths into their own functions, and mark them as noinline just to be sure. I consider this actually to be somewhat of a cleanup too: the main function now contains about half the number of lines so the normal path is easier to read, while the error cases are also nicely split away. Also, ensure the order of arguments to functions is always the same: regs, addr, error_code. This can reduce code size a tiny bit, and just looks neater too. And add a couple of branch annotations. Before: do_page_fault: subq $360, %rsp #, After: do_page_fault: subq $56, %rsp #, bloat-o-meter: add/remove: 8/0 grow/shrink: 0/1 up/down: 2222/-1680 (542) function old new delta __bad_area_nosemaphore - 506 +506 no_context - 474 +474 vmalloc_fault - 424 +424 spurious_fault - 358 +358 mm_fault_error - 272 +272 bad_area_access_error - 89 +89 bad_area - 89 +89 bad_area_nosemaphore - 10 +10 do_page_fault 2464 784 -1680 Yes, the total size increases by 542 bytes, due to the extra function calls. But these will very rarely be called (except for vmalloc_fault) in a normal workload. Importantly, do_page_fault is less than 1/3rd it's original size, and touches far less stack. Existing gotos and branch hints did move a lot of the infrequently used text out of the fastpath, but that's even further improved after this patch. Signed-off-by: Nick Piggin <npiggin@suse.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-20	Merge commit 'v2.6.29-rc2' into x86/mm	Ingo Molnar

2009-01-20	x86, cpumask: fix tlb flush race	Ingo Molnar
	Impact: fix bootup crash The cpumask is now passed in as a reference to mm->cpu_vm_mask, not on the stack - hence it is not constant anymore during the TLB flush. That way it could race and some static sanity checks would trigger: [ 238.154287] ------------[ cut here ]------------ [ 238.156039] kernel BUG at arch/x86/kernel/tlb_32.c:130! [ 238.156039] invalid opcode: 0000 [#1] SMP [ 238.156039] last sysfs file: /sys/class/net/eth2/address [ 238.156039] Modules linked in: [ 238.156039] [ 238.156039] Pid: 6493, comm: ifup-eth Not tainted (2.6.29-rc2-tip #1) P4DC6 [ 238.156039] EIP: 0060:[<c0118f87>] EFLAGS: 00010202 CPU: 2 [ 238.156039] EIP is at native_flush_tlb_others+0x35/0x158 [ 238.156039] EAX: c0ef972c EBX: f6143301 ECX: 00000000 EDX: 00000000 [ 238.156039] ESI: f61433a8 EDI: f6143200 EBP: f34f3e00 ESP: f34f3df0 [ 238.156039] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 [ 238.156039] Process ifup-eth (pid: 6493, ti=f34f2000 task=f399ab00 task.ti=f34f2000) [ 238.156039] Stack: [ 238.156039] ffffffff f61433a8 ffffffff f6143200 f34f3e18 c0118e9c 00000000 f6143200 [ 238.156039] f61433a8 f5bec738 f34f3e28 c0119435 c2b5b830 f6143200 f34f3e34 c01c2dc3 [ 238.156039] bffd9000 f34f3e60 c01c3051 00000000 ffffffff f34f3e4c 00000000 00000071 [ 238.156039] Call Trace: [ 238.156039] [<c0118e9c>] ? flush_tlb_others+0x52/0x5b [ 238.156039] [<c0119435>] ? flush_tlb_mm+0x7f/0x8b [ 238.156039] [<c01c2dc3>] ? tlb_finish_mmu+0x2d/0x55 [ 238.156039] [<c01c3051>] ? exit_mmap+0x124/0x170 [ 238.156039] [<c013e965>] ? mmput+0x40/0xf5 [ 238.156039] [<c01e4788>] ? flush_old_exec+0x640/0x94b [ 238.156039] [<c01ddb4e>] ? fsnotify_access+0x37/0x39 [ 238.156039] [<c01e3435>] ? kernel_read+0x39/0x4b [ 238.156039] [<c021bc8a>] ? load_elf_binary+0x4a1/0x11bb [ 238.156039] [<c01c0af9>] ? might_fault+0x51/0x9c [ 238.156039] [<c010a2cc>] ? paravirt_read_tsc+0x20/0x4f [ 238.156039] [<c010a406>] ? native_sched_clock+0x5d/0x60 [ 238.156039] [<c01e2fda>] ? search_binary_handler+0xab/0x2c4 [ 238.156039] [<c021b7e9>] ? load_elf_binary+0x0/0x11bb [ 238.156039] [<c04ae9a5>] ? _raw_read_unlock+0x21/0x46 [ 238.156039] [<c021b7e9>] ? load_elf_binary+0x0/0x11bb [ 238.156039] [<c01e2fe1>] ? search_binary_handler+0xb2/0x2c4 [ 238.156039] [<c01e4076>] ? do_execve+0x21c/0x2ee [ 238.156039] [<c01029b7>] ? sys_execve+0x51/0x8c [ 238.156039] [<c0103eaf>] ? sysenter_do_call+0x12/0x43 Fix it by not assuming that the cpumask is constant. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-20	Merge branch 'tj-percpu' of ↵	Ingo Molnar
	git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc into core/percpu
2009-01-20	x86: remove pda.h	Brian Gerst
	Impact: cleanup Signed-off-by: Brian Gerst <brgerst@gmail.com>
2009-01-20	x86: move stack_canary into irq_stack	Brian Gerst
	Impact: x86_64 percpu area layout change, irq_stack now at the beginning Now that the PDA is empty except for the stack canary, it can be removed. The irqstack is moved to the start of the per-cpu section. If the stack protector is enabled, the canary overlaps the bottom 48 bytes of the irqstack. tj: * updated subject * dropped asm relocation of irq_stack_ptr * updated comments a bit * rebased on top of stack canary changes Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-20	x86: rework __per_cpu_load adjustments	Brian Gerst
	Impact: cleanup Use cpu_number to determine if the adjustment is necessary. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-20	x86: remove pda_init()	Brian Gerst
	Impact: cleanup Copy the code to cpu_init() to satisfy the requirement that the cpu be reinitialized. Remove all other calls, since the segments are already initialized in head_64.S. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-20	x86: conditionalize stack canary handling in hot path	Tejun Heo
	Impact: no unnecessary stack canary swapping during context switch There's no point in moving stack_canary around during context switch if it's not enabled. Conditionalize it. Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-20	x86: cleanup stack protector	Tejun Heo
	Impact: cleanup Make the following cleanups. * remove duplicate comment from boot_init_stack_canary() which fits better in the other place - cpu_idle(). * move stack_canary offset check from __switch_to() to boot_init_stack_canary(). Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-20	x86: remove kernel_physical_mapping_init() from init section	Gary Hade
	Impact: fix crash with memory hotplug enabled kernel_physical_mapping_init() is called during memory hotplug so it does not belong in the init section. If the kernel is built with CONFIG_DEBUG_SECTION_MISMATCH=y on the make command line, arch/x86/mm/init_64.c is compiled with the -fno-inline-functions-called-once gcc option defeating inlining of kernel_physical_mapping_init() within init_memory_mapping(). When kernel_physical_mapping_init() is not inlined it is placed in the .init.text section according to the __init in it's current declaration. A later call to kernel_physical_mapping_init() during a memory hotplug operation encounters an int3 trap because the .init.text section memory has been freed. This patch eliminates the crash caused by the int3 trap by moving the non-inlined kernel_physical_mapping_init() from .init.text to .meminit.text. Signed-off-by: Gary Hade <garyhade@us.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-20	fix: crash: IP: __bitmap_intersects+0x48/0x73	Ingo Molnar
	-tip testing found this crash: > [ 35.258515] calling acpi_cpufreq_init+0x0/0x127 @ 1 > [ 35.264127] BUG: unable to handle kernel NULL pointer dereference at (null) > [ 35.267554] IP: [<ffffffff80478092>] __bitmap_intersects+0x48/0x73 > [ 35.267554] PGD 0 > [ 35.267554] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c is still broken: there's no allocation of the variable mask, so we pass in an uninitialized cmd.mask field to drv_read(), which then passes it to the scheduler which then crashes ... Switch it over to the much simpler constant-cpumask-pointers approach. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-19	cpufreq: use work_on_cpu in acpi-cpufreq.c for drv_read and drv_write	Mike Travis
	Impact: use new work_on_cpu function to reduce stack usage Replace the saving of current->cpus_allowed and set_cpus_allowed_ptr() with a work_on_cpu function for drv_read() and drv_write(). Basically converts do_drv_{read,write} into "work_on_cpu" functions that are now called by drv_read and drv_write. Note: This patch basically reverts 50c668d6 which reverted 7503bfba, now that the work_on_cpu() function is more stable. Signed-off-by: Mike Travis <travis@sgi.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Tested-by: Dieter Ries <clip2@gmx.de> Tested-by: Maciej Rutecki <maciej.rutecki@gmail.com> Cc: Dave Jones <davej@redhat.com> Cc: <cpufreq@vger.kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-19	x86: fully honor "nolapic", fix	Ingo Molnar
	Impact: build fix Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-18	x86: Remove never-called arch_setup_msi_irq()	Michael Ellerman
	Since commit 75c46fa, "x64, x2apic/intr-remap: MSI and MSI-X support for interrupt remapping infrastructure", x86 has had an implementation of arch_setup_msi_irqs(). That implementation does not call arch_setup_msi_irq(), instead it calls setup_irq(). No other x86 code calls arch_setup_msi_irq(). That leaves only arch_setup_msi_irqs() in drivers/pci/msi.c, but that routine is overridden by the x86 version of arch_setup_msi_irqs(). So arch_setup_msi_irq() is dead code, remove it. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-01-18	x86: fix section mismatch warnings in kernel/setup_percpu.c	Leonardo Potenza
	The function setup_cpu_local_masks() has been marked __init, in order to remove the following section mismatch messages: WARNING: vmlinux.o(.text+0x3c2c7): Section mismatch in reference from the function setup_cpu_local_masks() to the function .init.text:alloc_bootmem_cpumask_var() The function setup_cpu_local_masks() references the function __init alloc_bootmem_cpumask_var(). This is often because setup_cpu_local_masks lacks a __init annotation or the annotation of alloc_bootmem_cpumask_var is wrong. WARNING: vmlinux.o(.text+0x3c2d3): Section mismatch in reference from the function setup_cpu_local_masks() to the function .init.text:alloc_bootmem_cpumask_var() The function setup_cpu_local_masks() references the function __init alloc_bootmem_cpumask_var(). This is often because setup_cpu_local_masks lacks a __init annotation or the annotation of alloc_bootmem_cpumask_var is wrong. WARNING: vmlinux.o(.text+0x3c2df): Section mismatch in reference from the function setup_cpu_local_masks() to the function .init.text:alloc_bootmem_cpumask_var() The function setup_cpu_local_masks() references the function __init alloc_bootmem_cpumask_var(). This is often because setup_cpu_local_masks lacks a __init annotation or the annotation of alloc_bootmem_cpumask_var is wrong. WARNING: vmlinux.o(.text+0x3c2eb): Section mismatch in reference from the function setup_cpu_local_masks() to the function .init.text:alloc_bootmem_cpumask_var() The function setup_cpu_local_masks() references the function __init alloc_bootmem_cpumask_var(). This is often because setup_cpu_local_masks lacks a __init annotation or the annotation of alloc_bootmem_cpumask_var is wrong. Signed-off-by: Leonardo Potenza <lpotenza@inwind.it> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-18	x86: put trigger in to detect mismatched apic versions	Mike Travis
	Impact: add debug warning Fire off one message if two apic's discovered with different apic versions. (this code is only called during CPU init) The goal of this is to pave the way of the removal of the apic_version[] array. We dont expect any apic version incompatibilities in the x86 landscape of systems [if so we dont handle them very well and probably never will handle deep apic version assymetries well], but it's prudent to have a debug check for one kernel cycle nevertheless. Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-18	x86, rdc321x: remove/move leftover files	Ingo Molnar
	Impact: cleanup Move/remove leftover RDC321 files. Now that it's not a subarch anymore, arch/x86/mach-rdc321x and arch/x86/include/asm/mach-rdc321x/ are not needed. One include file was still in use: rdc321x_defs.h, move that to the generic x86 asm header directory. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-18	Merge branch 'core/percpu' into stackprotector	Ingo Molnar
	Conflicts: arch/x86/include/asm/pda.h arch/x86/include/asm/system.h Also, moved include/asm-x86/stackprotector.h to arch/x86/include/asm. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-19	x86-64: Use absolute displacements for per-cpu accesses.	Brian Gerst
	Accessing memory through %gs should not use rip-relative addressing. Adding a P prefix for the argument tells gcc to not add (%rip) to the memory references. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19	x86-64: Move isidle from PDA to per-cpu.	Brian Gerst
	tj: s/isidle/is_idle/ Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19	x86-64: Move nodenumber from PDA to per-cpu.	Brian Gerst
	tj: * s/nodenumber/node_number/ * removed now unused pda variable from pda_init() Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19	x86-64: Move irqcount from PDA to per-cpu.	Brian Gerst
	tj: s/irqcount/irq_count/ Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19	x86-64: Move oldrsp from PDA to per-cpu.	Brian Gerst
	tj: * in asm-offsets_64.c, pda.h inclusion shouldn't be removed as pda is still referenced in the file * s/oldrsp/old_rsp/ Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19	x86-64: Move kernelstack from PDA to per-cpu.	Brian Gerst
	Also clean up PER_CPU_VAR usage in xen-asm_64.S tj: * remove now unused stack_thread_info() * s/kernelstack/kernel_stack/ * added FIXME comment in xen-asm_64.S Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19	x86-64: Move current task from PDA to per-cpu and consolidate with 32-bit.	Brian Gerst
	Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19	x86-64: Move cpu number from PDA to per-cpu and consolidate with 32-bit.	Brian Gerst
	tj: moved cpu_number definition out of CONFIG_HAVE_SETUP_PER_CPU_AREA for voyager. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19	x86-64: Convert exception stacks to per-cpu	Brian Gerst
	Move the exception stacks to per-cpu, removing specific allocation code. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19	x86-64: Convert irqstacks to per-cpu	Brian Gerst
	Move the irqstackptr variable from the PDA to per-cpu. Make the stacks themselves per-cpu, removing some specific allocation code. Add a seperate flag (is_boot_cpu) to simplify the per-cpu boot adjustments. tj: * sprinkle some underbars around. * irq_stack_ptr is not used till traps_init(), no reason to initialize it early. On SMP, just leaving it NULL till proper initialization in setup_per_cpu_areas() works. Dropped is_boot_cpu and early irq_stack_ptr initialization. * do DECLARE/DEFINE_PER_CPU(char[IRQ_STACK_SIZE], irq_stack) instead of (char, irq_stack[IRQ_STACK_SIZE]). Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19	x86-64: Move TLB state from PDA to per-cpu and consolidate with 32-bit.	Brian Gerst
	Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19	x86-64: Move irq stats from PDA to per-cpu and consolidate with 32-bit.	Brian Gerst
	Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-16	x86: put trigger in to detect mismatched apic versions.	Mike Travis
	Fire off one message if two apic's discovered with different apic versions. Signed-off-by: Mike Travis <travis@sgi.com>
2009-01-16	cpufreq: use work_on_cpu in acpi-cpufreq.c for drv_read and drv_write	Mike Travis
	Impact: use new work_on_cpu function to reduce stack usage Replace the saving of current->cpus_allowed and set_cpus_allowed_ptr() with a work_on_cpu function for drv_read() and drv_write(). Basically converts do_drv_{read,write} into "work_on_cpu" functions that are now called by drv_read and drv_write. Note: This patch basically reverts 50c668d6 which reverted 7503bfba, now that the work_on_cpu() function is more stable. Signed-off-by: Mike Travis <travis@sgi.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Tested-by: Dieter Ries <clip2@gmx.de> Tested-by: Maciej Rutecki <maciej.rutecki@gmail.com> Cc: Dave Jones <davej@redhat.com> Cc: <cpufreq@vger.kernel.org>
2009-01-16	ACPI suspend: Fix compilation warnings in drivers/acpi/sleep.c	Rafael J. Wysocki
	Fix two compilation warnings in drivers/acpi/sleep.c, one triggered by unsetting CONFIG_SUSPEND and the other triggered by unsetting CONFIG_HIBERNATION, by moving some code under the appropriate #ifdefs . Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>
2009-01-16	Merge branch 'misc' into release	Len Brown

2009-01-16	kprobes: check CONFIG_FREEZER instead of CONFIG_PM	Masami Hiramatsu
	Check CONFIG_FREEZER instead of CONFIG_PM because kprobe booster depends on freeze_processes() and thaw_processes() when CONFIG_PREEMPT=y. This fixes a linkage error which occurs when CONFIG_PREEMPT=y, CONFIG_PM=y and CONFIG_FREEZER=n. Reported-by: Cheng Renquan <crquan@gmail.com> Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Len Brown <len.brown@intel.com>
2009-01-16	x86_64: initialize this_cpu_off to __per_cpu_load	Tejun Heo
	On x86_64, if get_per_cpu_var() is used before per cpu area is setup (if lockdep is turned on, it happens), it needs this_cpu_off to point to __per_cpu_load. Initialize accordingly. Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-16	x86: fix build bug introduced during merge	Tejun Heo
	EXPORT_PER_CPU_SYMBOL() got misplaced during merge leading to build failure. Fix it. Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-16	percpu: add optimized generic percpu accessors	Ingo Molnar
	It is an optimization and a cleanup, and adds the following new generic percpu methods: percpu_read() percpu_write() percpu_add() percpu_sub() percpu_and() percpu_or() percpu_xor() and implements support for them on x86. (other architectures will fall back to a default implementation) The advantage is that for example to read a local percpu variable, instead of this sequence: return __get_cpu_var(var); ffffffff8102ca2b: 48 8b 14 fd 80 09 74 mov -0x7e8bf680(,%rdi,8),%rdx ffffffff8102ca32: 81 ffffffff8102ca33: 48 c7 c0 d8 59 00 00 mov $0x59d8,%rax ffffffff8102ca3a: 48 8b 04 10 mov (%rax,%rdx,1),%rax We can get a single instruction by using the optimized variants: return percpu_read(var); ffffffff8102ca3f: 65 48 8b 05 91 8f fd mov %gs:0x7efd8f91(%rip),%rax I also cleaned up the x86-specific APIs and made the x86 code use these new generic percpu primitives. tj: * fixed generic percpu_sub() definition as Roel Kluin pointed out * added percpu_and() for completeness's sake * made generic percpu ops atomic against preemption Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-16	x86: misc clean up after the percpu update	Tejun Heo
	Do the following cleanups: * kill x86_64_init_pda() which now is equivalent to pda_init() * use per_cpu_offset() instead of cpu_pda() when initializing initial_gs Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16	x86: convert pda ops to wrappers around x86 percpu accessors	Tejun Heo
	pda is now a percpu variable and there's no reason it can't use plain x86 percpu accessors. Add x86_test_and_clear_bit_percpu() and replace pda op implementations with wrappers around x86 percpu accessors. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16	x86: make pda a percpu variable	Tejun Heo
	[ Based on original patch from Christoph Lameter and Mike Travis. ] As pda is now allocated in percpu area, it can easily be made a proper percpu variable. Make it so by defining per cpu symbol from linker script and declaring it in C code for SMP and simply defining it for UP. This change cleans up code and brings SMP and UP closer a bit. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16	x86: merge 64 and 32 SMP percpu handling	Tejun Heo
	Now that pda is allocated as part of percpu, percpu doesn't need to be accessed through pda. Unify x86_64 SMP percpu access with x86_32 SMP one. Other than the segment register, operand size and the base of percpu symbols, they behave identical now. This patch replaces now unnecessary pda->data_offset with a dummy field which is necessary to keep stack_canary at its place. This patch also moves per_cpu_offset initialization out of init_gdt() into setup_per_cpu_areas(). Note that this change also necessitates explicit per_cpu_offset initializations in voyager_smp.c. With this change, x86_OP_percpu()'s are as efficient on x86_64 as on x86_32 and also x86_64 can use assembly PER_CPU macros. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16	x86: fold pda into percpu area on SMP	Tejun Heo
	[ Based on original patch from Christoph Lameter and Mike Travis. ] Currently pdas and percpu areas are allocated separately. %gs points to local pda and percpu area can be reached using pda->data_offset. This patch folds pda into percpu area. Due to strange gcc requirement, pda needs to be at the beginning of the percpu area so that pda->stack_canary is at %gs:40. To achieve this, a new percpu output section macro - PERCPU_VADDR_PREALLOC() - is added and used to reserve pda sized chunk at the start of the percpu area. After this change, for boot cpu, %gs first points to pda in the data.init area and later during setup_per_cpu_areas() gets updated to point to the actual pda. This means that setup_per_cpu_areas() need to reload %gs for CPU0 while clearing pda area for other cpus as cpu0 already has modified it when control reaches setup_per_cpu_areas(). This patch also removes now unnecessary get_local_pda() and its call sites. A lot of this patch is taken from Mike Travis' "x86_64: Fold pda into per cpu area" patch. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>