linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2009-10-16	x86: Don't print number of MCE banks for every CPU	Roland Dreier
	The MCE initialization code explicitly says it doesn't handle asymmetric configurations where different CPUs support different numbers of MCE banks, and it prints a big warning in that case. Therefore, printing the "mce: CPU supports <x> MCE banks" message into the kernel log for every CPU is pure redundancy that clutters the log significantly for systems with lots of CPUs. Signed-off-by: Roland Dreier <rolandd@cisco.com> LKML-Reference: <adaeip473qt.fsf@cisco.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-16	x86, UV: Fix information in __uv_hub_info structure	Robin Holt
	A few parts of the uv_hub_info structure are initialized incorrectly. - n_val is being loaded with m_val. - gpa_mask is initialized with a bytes instead of an unsigned long. - Handle the case where none of the alias registers are used. Lastly I converted the bau over to using the uv_hub_info->m_val which is the correct value. Without this patch, booting a large configuration hits a problem where the upper bits of the gnode affect the pnode and the bau will not operate. Signed-off-by: Robin Holt <holt@sgi.com> Acked-by: Jack Steiner <steiner@sgi.com> Cc: Cliff Whickman <cpw@sgi.com> Cc: stable@kernel.org LKML-Reference: <20091015224946.396355000@alcatraz.americas.sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-16	x86: Document linker script ASSERT() quirk	Ingo Molnar
	Older binutils breaks if ASSERT() is used without a sink for the output. For example 2.14.90.0.6 is known to be broken, the link fails with: LD .tmp_vmlinux1 ld:arch/x86/kernel/vmlinux.lds:678: parse error Document this quirk in all three files that use it. See: http://marc.info/?l=linux-kbuild&m=124930110427870&w=2 See[2]: d2ba8b2 ("x86: Fix assert syntax in vmlinux.lds.S") Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Roland McGrath <roland@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Sam Ravnborg <sam@ravnborg.org> LKML-Reference: <4AD6523D.5030909@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15	x86: apic: Allow noop operations to be called almost at any time	Cyrill Gorcunov
	As only apic noop is used we allow to use almost any operation caller wants (and which of them noop driver supports of course). Initially it was reported by Ingo Molnar that apic noop issue a warning for pkg id (which is actually false positive and should be eliminated). So we save checking (and warning issue) for read/write operations while allow any other ops to be freely used. Also: - fix noop_cpu_to_logical_apicid, it should be 0. - rename noop_default_phys_pkg_id to noop_phys_pkg_id (we use default_ prefix for more general routines in apic subsystem). Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Maciej W. Rozycki <macro@linux-mips.org> LKML-Reference: <20091015150416.GC5331@lenovo> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15	Merge branch 'tracing/core' into perf/core	Ingo Molnar
	Merge reason: to add event filter support we need the following commits from the tracing tree: 3f6fe06: tracing/filters: Unify the regex parsing helpers 1889d20: tracing/filters: Provide basic regex support 737f453: tracing/filters: Cleanup useless headers Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15	Merge branch 'linus' into perf/core	Ingo Molnar
	Merge reason: pick up tools/perf/ changes from upstream. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15	Revert "x86: linker script syntax nits"	Ingo Molnar
	This reverts commit e9a63a4e559fbdc522072281d05e6b13c1022f4b. This breaks older binutils, where sink-less asserts are broken. See this commit for further details: d2ba8b2: x86: Fix assert syntax in vmlinux.lds.S Acked-by: "H. Peter Anvin" <hpa@zytor.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <4AD6523D.5030909@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15	Merge branch 'linus' into x86/urgent	Ingo Molnar
	Merge reason: pull in latest, to be able to revert a patch there. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14	Merge branch 'topic/x86-lds-nits' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland * 'topic/x86-lds-nits' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland: x86: linker script syntax nits
2009-10-14	Merge branch 'sched-fixes-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: Fix missing kernel-doc notation Revert "x86, timers: Check for pending timers after (device) interrupts" sched: Update the clock of runqueue select_task_rq() selected
2009-10-14	Merge branch 'x86-fixes-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86/paravirt: Use normal calling sequences for irq enable/disable x86: fix kernel panic on 32 bits when profiling x86: Fix Suspend to RAM freeze on Acer Aspire 1511Lmi laptop x86, vmi: Mark VMI deprecated and schedule it for removal
2009-10-14	x86: linker script syntax nits	Roland McGrath
	The linker scripts grew some use of weirdly wrong linker script syntax. It happens to work, but it's not what the syntax is documented to be. Clean it up to use the official syntax. Signed-off-by: Roland McGrath <roland@redhat.com> CC: Ian Lance Taylor <iant@google.com>
2009-10-14	x86: UV RTC: Rename generic_interrupt to x86_platform_ipi	Dimitri Sivanich
	Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> LKML-Reference: <20091014142257.GE11048@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14	x86: UV RTC: Clean up error handling	Dimitri Sivanich
	Cleanup error handling in uv_rtc_setup_clock. Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> LKML-Reference: <20091014142103.GD11048@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14	x86: UV RTC: Add clocksource only boot option	Dimitri Sivanich
	Add clocksource only boot option for UV RTC. Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> LKML-Reference: <20091014141848.GC11048@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14	x86: UV RTC: Fix early expiry handling	Dimitri Sivanich
	Tune/fix early timer expiry handling and return correct early timeout value for set_next_event. Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> LKML-Reference: <20091014141630.GB11048@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14	x86: Remove BKL from apm_32	Thomas Gleixner
	The lock/unlock kernel pair in do_open() got there with the BKL push down and protects nothing. Remove it. Replace the lock/unlock kernel in the ioctl code with a mutex to protect standbys_pending and suspends_pending. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <20091010153349.365236337@linutronix.de>
2009-10-14	x86: Remove BKL from microcode	Thomas Gleixner
	cycle_lock_kernel() in microcode_open() is a worthless exercise as there is nothing to wait for. Remove it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <20091010153349.196074920@linutronix.de>
2009-10-14	x86, perf_event: Rename 'performance counter interrupt'	Li Hong
	In 'cdd6c482c9ff9c55475ee7392ec8f672eddb7be6', we renamed Performance Counters -> Performance Events. The name showed up in /proc/interrupts also needs a change. I use PMI (Performance monitoring interrupt) here, since it is the official name used in Intel's documents. Signed-off-by: Li Hong <lihong.hi@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20091014105039.GA22670@uhli> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14	tracing: Move syscalls metadata handling from arch to core	Frederic Weisbecker
	Most of the syscalls metadata processing is done from arch. But these operations are mostly generic accross archs. Especially now that we have a common variable name that expresses the number of syscalls supported by an arch: NR_syscalls, the only remaining bits that need to reside in arch is the syscall nr to addr translation. v2: Compare syscalls symbols only after the "sys" prefix so that we avoid spurious mismatches with archs that have syscalls wrappers, in which case syscalls symbols have "SyS" prefixed aliases. (Reported by: Heiko Carstens) Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org>
2009-10-14	x86, apic: Move SGI UV functionality out of generic IO-APIC code	Dimitri Sivanich
	Move UV specific functionality out of the generic IO-APIC code. Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> LKML-Reference: <20091013203236.GD20543@sgi.com> [ Cleaned up the code some more in their new places. ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14	x86: SGI UV: Fix irq affinity for hub based interrupts	Dimitri Sivanich
	This patch fixes handling of uv hub irq affinity. IRQs with ALL or NODE affinity can be routed to cpus other than their originally assigned cpu. Those with CPU affinity cannot be rerouted. Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> LKML-Reference: <20090930160259.GA7822@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14	x86, apic: Limit apic dumping, introduce new show_lapic= setup option	Cyrill Gorcunov
	In case if a system has a large number of cpus printing apics contents may consume a long time period. We limit such an output by 1 apic by default. But to have an ability to see all apics or some part of them we introduce "show_lapic" setup option which allow us to limit/unlimit the number of APICs being dumped. Example: apic=debug show_lapic=5, or apic=debug show_lapic=all Also move apic_verbosity checking upper that way so helper routines do not need to inspect it at all. Suggested-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: yinghai@kernel.org Cc: macro@linux-mips.org LKML-Reference: <20091013201022.926793122@openvz.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14	x86, apic: Use apic noop driver	Cyrill Gorcunov
	In case if apic were disabled we may use the whole apic NOOP driver instead of sparse poking the some functions in apic driver. Also NOOP would catch any inappropriate apic operation calls (not just read/write). Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: yinghai@kernel.org Cc: macro@linux-mips.org LKML-Reference: <20091013201022.747817361@openvz.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14	x86, apic: Introduce the NOOP apic driver	Cyrill Gorcunov
	Introduce NOOP APIC driver. We should use it in case if apic was disabled due to hardware of software/firmware problems (including user requested to disable it case). The driver is attempting to catch any inappropriate apic operation call with warning issue. Also it is possible to use some apic operation like IPI calls, read/write without checking for apic presence which should make callers code easier. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: yinghai@kernel.org Cc: macro@linux-mips.org LKML-Reference: <20091013201022.534682104@openvz.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14	function-graph/x86: Replace unbalanced ret with jmp	Steven Rostedt
	The function graph tracer replaces the return address with a hook to trace the exit of the function call. This hook will finish by returning to the real location the function should return to. But the current implementation uses a ret to jump to the real return location. This causes a imbalance between calls and ret. That is the original function does a call, the ret goes to the handler and then the handler does a ret without a matching call. Although the function graph tracer itself still breaks the branch predictor by replacing the original ret, by using a second ret and causing an imbalance, it breaks the predictor even more. This patch replaces the ret with a jmp to keep the calls and ret balanced. I tested this on one box and it showed a 1.7% increase in performance. Another box only showed a small 0.3% increase. But no box that I tested this on showed a decrease in performance by making this change. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20091013203425.042034383@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-13	Merge git://git.infradead.org/~dwmw2/iommu-2.6.32	Linus Torvalds
	* git://git.infradead.org/~dwmw2/iommu-2.6.32: x86: Move pci_iommu_init to rootfs_initcall() Run pci_apply_final_quirks() sooner. Mark pci_apply_final_quirks() __init rather than __devinit Rename pci_init() to pci_apply_final_quirks(), move it to quirks.c intel-iommu: Yet another BIOS workaround: Isoch DMAR unit with no TLB space intel-iommu: Decode (and ignore) RHSA entries intel-iommu: Make "Unknown DMAR structure" message more informative
2009-10-13	perf_event, x86, mce: Use TRACE_EVENT() for MCE logging	Hidetoshi Seto
	This approach is the first baby step towards solving many of the structural problems the x86 MCE logging code is having today: - It has a private ring-buffer implementation that has a number of limitations and has been historically fragile and buggy. - It is using a quirky /dev/mcelog ioctl driven ABI that is MCE specific. /dev/mcelog is not part of any larger logging framework and hence has remained on the fringes for many years. - The MCE logging code is still very unclean partly due to its ABI limitations. Fields are being reused for multiple purposes, and the whole message structure is limited and x86 specific to begin with. All in one, the x86 tree would like to move away from this private implementation of an event logging facility to a broader framework. By using perf events we gain the following advantages: - Multiple user-space agents can access MCE events. We can have an mcelog daemon running but also a system-wide tracer capturing important events in flight-recorder mode. - Sampling support: the kernel and the user-space call-chain of MCE events can be stored and analyzed as well. This way actual patterns of bad behavior can be matched to precisely what kind of activity happened in the kernel (and/or in the app) around that moment in time. - Coupling with other hardware and software events: the PMU can track a number of other anomalies - monitoring software might chose to monitor those plus the MCE events as well - in one coherent stream of events. - Discovery of MCE sources - tracepoints are enumerated and tools can act upon the existence (or non-existence) of various channels of MCE information. - Filtering support: we just subscribe to and act upon the events we are interested in. Then even on a per event source basis there's in-kernel filter expressions available that can restrict the amount of data that hits the event channel. - Arbitrary deep per cpu buffering of events - we can buffer 32 entries or we can buffer as much as we want, as long as we have the RAM. - An NMI-safe ring-buffer implementation - mappable to user-space. - Built-in support for timestamping of events, PID markers, CPU markers, etc. - A rich ABI accessible over system call interface. Per cpu, per task and per workload monitoring of MCE events can be done this way. The ABI itself has a nice, meaningful structure. - Extensible ABI: new fields can be added without breaking tooling. New tracepoints can be added as the hardware side evolves. There's various parsers that can be used. - Lots of scheduling/buffering/batching modes of operandi for MCE events. poll() support. mmap() support. read() support. You name it. - Rich tooling support: even without any MCE specific extensions added the 'perf' tool today offers various views of MCE data: perf report, perf stat, perf trace can all be used to view logged MCE events and perhaps correlate them to certain user-space usage patterns. But it can be used directly as well, for user-space agents and policy action in mcelog, etc. With this we hope to achieve significant code cleanup and feature improvements in the MCE code, and we hope to be able to drop the /dev/mcelog facility in the end. This patch is just a plain dumb dump of mce_log() records to the tracepoints / perf events framework - a first proof of concept step. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Andi Kleen <ak@linux.intel.com> LKML-Reference: <4AD42A0D.7050104@jp.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-13	Merge commit 'v2.6.32-rc4' into perf/core	Ingo Molnar
	Merge reason: we were on an -rc1 base, merge up to -rc4. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12	net: Introduce recvmmsg socket syscall	Arnaldo Carvalho de Melo
	Meaning receive multiple messages, reducing the number of syscalls and net stack entry/exit operations. Next patches will introduce mechanisms where protocols that want to optimize this operation will provide an unlocked_recvmsg operation. This takes into account comments made by: . Paul Moore: sock_recvmsg is called only for the first datagram, sock_recvmsg_nosec is used for the rest. . Caitlin Bestler: recvmmsg now has a struct timespec timeout, that works in the same fashion as the ppoll one. If the underlying protocol returns a datagram with MSG_OOB set, this will make recvmmsg return right away with as many datagrams (+ the OOB one) it has received so far. . Rémi Denis-Courmont & Steven Whitehouse: If we receive N < vlen datagrams and then recvmsg returns an error, recvmmsg will return the successfully received datagrams, store the error and return it in the next call. This paves the way for a subsequent optimization, sk_prot->unlocked_recvmsg, where we will be able to acquire the lock only at batch start and end, not at every underlying recvmsg call. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-13	perf_events, x86: Fix event constraints code	Ingo Molnar
	There was namespace overlap due to a rename i did - this caused the following build warning, reported by Stephen Rothwell against linux-next x86_64 allmodconfig: arch/x86/kernel/cpu/perf_event.c: In function 'intel_get_event_idx': arch/x86/kernel/cpu/perf_event.c:1445: warning: 'event_constraint' is used uninitialized in this function This is a real bug not just a warning: fix it by renaming the global event-constraints table pointer to 'event_constraints'. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Stephane Eranian <eranian@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20091013144223.369d616d.sfr@canb.auug.org.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12	x86: use kernel_stack_pointer() in kprobes.c	H. Peter Anvin
	The way to obtain a kernel-mode stack pointer from a struct pt_regs in 32-bit mode is "subtle": the stack doesn't actually contain the stack pointer, but rather the location where it would have been marks the actual previous stack frame. For clarity, use kernel_stack_pointer() instead of coding this weirdness explicitly. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Masami Hiramatsu <mhiramat@redhat.com>
2009-10-12	x86: use kernel_stack_pointer() in kgdb.c	H. Peter Anvin
	The way to obtain a kernel-mode stack pointer from a struct pt_regs in 32-bit mode is "subtle": the stack doesn't actually contain the stack pointer, but rather the location where it would have been marks the actual previous stack frame. For clarity, use kernel_stack_pointer() instead of coding this weirdness explicitly. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Jason Wessel <jason.wessel@windriver.com>
2009-10-12	x86: use kernel_stack_pointer() in dumpstack.c	H. Peter Anvin
	The way to obtain a kernel-mode stack pointer from a struct pt_regs in 32-bit mode is "subtle": the stack doesn't actually contain the stack pointer, but rather the location where it would have been marks the actual previous stack frame. For clarity, use kernel_stack_pointer() instead of coding this weirdness explicitly. Furthermore, user_mode() is only valid when the process is known to not run in V86 mode. Use the safer user_mode_vm() instead. Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-10-12	x86: use kernel_stack_pointer() in process_32.c	H. Peter Anvin
	The way to obtain a kernel-mode stack pointer from a struct pt_regs in 32-bit mode is "subtle": the stack doesn't actually contain the stack pointer, but rather the location where it would have been marks the actual previous stack frame. For clarity, use kernel_stack_pointer() instead of coding this weirdness explicitly. Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-10-12	x86: Export srat physical topology	David Rientjes
	This is the counterpart to "x86: export k8 physical topology" for SRAT. It is not as invasive because the acpi code already seperates node setup into detection and registration steps, with the exception of registering e820 active regions in acpi_numa_memory_affinity_init(). This is now moved to acpi_scan_nodes() if NUMA emulation is disabled or deferred. acpi_numa_init() now returns a value which specifies whether an underlying SRAT was located. If so, that topology can be used by the emulation code to interleave emulated nodes over physical nodes or to register the nodes for ACPI. acpi_get_nodes() may now be used to export the srat physical topology of the machine for NUMA emulation. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Ankita Garg <ankita@in.ibm.com> Cc: Len Brown <len.brown@intel.com> LKML-Reference: <alpine.DEB.1.00.0909251518580.14754@chino.kir.corp.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12	x86: Export k8 physical topology	David Rientjes
	To eventually interleave emulated nodes over physical nodes, we need to know the physical topology of the machine without actually registering it. This does the k8 node setup in two parts: detection and registration. NUMA emulation can then used the physical topology detected to setup the address ranges of emulated nodes accordingly. If emulation isn't used, the k8 nodes are registered as normal. Two formals are added to the x86 NUMA setup functions: `acpi' and `k8'. These represent whether ACPI or K8 NUMA has been detected; both cannot be true at the same time. This specifies to the NUMA emulation code whether an underlying physical NUMA topology exists and which interface to use. This patch deals solely with separating the k8 setup path into Northbridge detection and registration steps and leaves the ACPI changes for a subsequent patch. The `acpi' formal is added here, however, to avoid touching all the header files again in the next patch. This approach also ensures emulated nodes will not span physical nodes so the true memory latency is not misrepresented. k8_get_nodes() may now be used to export the k8 physical topology of the machine for NUMA emulation. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Ankita Garg <ankita@in.ibm.com> Cc: Len Brown <len.brown@intel.com> LKML-Reference: <alpine.DEB.1.00.0909251518400.14754@chino.kir.corp.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12	x86: fix kernel panic on 32 bits when profiling	H. Peter Anvin
	Latest kernel has a kernel panic in booting on i386 machine when profile=2 setting in cmdline. It is due to 'sp' being incorrect in profile_pc(). BUG: unable to handle kernel NULL pointer dereference at 00000246 IP: [<c01288b6>] profile_pc+0x2a/0x48 *pde = 00000000 Oops: 0000 [#1] SMP This differs from the original version by Alex Shi in that we use the kernel_stack_pointer() inline already defined in <asm/ptrace.h> for this purpose, instead of #ifdef. Originally-by: Alex Shi <alex.shi@intel.com> Cc: "Chen, Tim C" <tim.c.chen@intel.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-10-12	x86, 64-bit: Move K8 B step iret fixup to fault entry asm	Brian Gerst
	Move the handling of truncated %rip from an iret fault to the fault entry path. This allows x86-64 to use the standard search_extable() function. Signed-off-by: Brian Gerst <brgerst@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jan Beulich <jbeulich@novell.com> LKML-Reference: <1255357103-5418-1-git-send-email-brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12	x86: Fix Suspend to RAM freeze on Acer Aspire 1511Lmi laptop	Jan Beulich
	Move the trampoline and accessors back out of .cpuinit.* for the case of 64-bits+ACPI_SLEEP. This solves s2ram hangs reported in: http://bugzilla.kernel.org/show_bug.cgi?id=14279 Reported-and-bisected-by: Christian Casteyde <casteyde.christian@free.fr> Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: <bugzilla-daemon@bugzilla.kernel.org> Cc: "Andrew Morton" <akpm@linux-foundation.org> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12	x86: Move pci_iommu_init to rootfs_initcall()	David Woodhouse
	We want this to happen after the PCI quirks, which are now running at the very end of the fs_initcalls. This works around the BIOS problems which were originally addressed by commit db8be50c4307dac2b37305fc59c8dc0f978d09ea ('USB: Work around BIOS bugs by quiescing USB controllers earlier'), which was reverted in commit d93a8f829fe1d2f3002f2c6ddb553d12db420412. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-10-12	mce, edac: Use an atomic notifier for MCEs decoding	Borislav Petkov
	Add an atomic notifier which ensures proper locking when conveying MCE info to EDAC for decoding. The actual notifier call overrides a default, negative priority notifier. Note: make sure we register the default decoder only once since mcheck_init() runs on each CPU. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <20091003065752.GA8935@liondog.tnic> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12	ftrace.c: Add #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt	Joe Perches
	- Remove prefixes from pr_<level>, use pr_fmt(fmt). No change in output. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <9b377eefae9e28c599dd4a17bdc81172965e9931.1254701151.git.joe@perches.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-11	pci: increase alignment to make more space for hidden code	Yinghai Lu
	As reported in http://bugzilla.kernel.org/show_bug.cgi?id=13940 on some system when acpi are enabled, acpi clears some BAR for some devices without reason, and kernel will need to allocate devices for them. It then apparently hits some undocumented resource conflict, resulting in non-working devices. Try to increase alignment to get more safe range for unassigned devices. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-11	headers: remove sched.h from interrupt.h	Alexey Dobriyan
	After m68k's task_thread_info() doesn't refer to current, it's possible to remove sched.h from interrupt.h and not break m68k! Many thanks to Heiko Carstens for allowing this. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2009-10-09	x86-64: merge the standard and compat start_thread() functions	H. Peter Anvin
	The only thing left that differs between the standard and compat start_thread functions is the actual segment numbers and the prototype, so have a single common function which contains the guts and two very small wrappers. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
2009-10-09	x86-64: make compat_start_thread() match start_thread()	H. Peter Anvin
	For no real good reason, compat_start_thread() was embedded inline in <asm/elf.h> whereas the native start_thread() lives in process_.c. Move compat_start_thread() to process_64.c, remove gratuitious differences, and fix a few items which mostly look like bit rot. In particular, compat_start_thread() didn't do free_thread_xstate(), which means it was hanging on to the xstate store area even when it was not needed. It was also not setting old_rsp, but it looks like that generally shouldn't matter for a 32-bit process. Note: compat_start_thread has* to be a macro, since it is tested with start_thread_ia32() as the out of line function name. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
2009-10-09	x86/amd-iommu: Workaround for erratum 63	Joerg Roedel
	There is an erratum for IOMMU hardware which documents undefined behavior when forwarding SMI requests from peripherals and the DTE of that peripheral has a sysmgt value of 01b. This problem caused weird IO_PAGE_FAULTS in my case. This patch implements the suggested workaround for that erratum into the AMD IOMMU driver. The erratum is documented with number 63. Cc: stable@kernel.org Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-10-09	Revert "x86, timers: Check for pending timers after (device) interrupts"	Ingo Molnar
	This reverts commit 9bcbdd9c58617f1301dd4f17c738bb9bc73aca70. The real bug producing LatencyTop latencies has been fixed in: f5dc375: sched: Update the clock of runqueue select_task_rq() selected And the commit being reverted here triggers local timer processing from every device IRQ. If device IRQs come in at a high frequency, this could cause a performance regression. The commit being reverted here purely 'fixed' the reported latency as a side effect, because CPUs were being moved out of idle more often. Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Frans Pop <elendil@planet.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <20091008064041.67219b13@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-09	perf, x86: Add simple group validation	Peter Zijlstra
	Refuse to add events when the group wouldn't fit onto the PMU anymore. Naive implementation. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@gmail.com> LKML-Reference: <1254911461.26976.239.camel@twins> Signed-off-by: Ingo Molnar <mingo@elte.hu>