summaryrefslogtreecommitdiff
path: root/arch/powerpc/include
AgeCommit message (Collapse)Author
2010-04-25KVM: PPC: Add helpers to modify ppc fieldsAlexander Graf
The PowerPC specification always lists bits from MSB to LSB. That is really confusing when you're trying to write C code, because it fits in pretty badly with the normal (1 << xx) schemes. So I came up with some nice wrappers that allow to get and set fields in a u64 with bit numbers exactly as given in the spec. That makes the code in KVM and the spec easier comparable. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-25KVM: PPC: Add helpers to call FPU instructionsAlexander Graf
To emulate paired single instructions, we need to be able to call FPU operations from within the kernel. Since we don't want gcc to spill arbitrary FPU code everywhere, we tell it to use a soft fpu. Since we know we can really call the FPU in safe areas, let's also add some calls that we can later use to actually execute real world FPU operations on the host's FPU. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-25KVM: PPC: Make ext giveup non-staticAlexander Graf
We need to call the ext giveup handlers from code outside of book3s.c. So let's make it non-static. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-25KVM: PPC: Make software load/store return eaddrAlexander Graf
The Book3S KVM implementation contains some helper functions to load and store data from and to virtual addresses. Unfortunately, this helper used to keep the physical address it so nicely found out for us to itself. So let's change that and make it return the physical address it resolved. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-25KVM: PPC: Add Gekko SPRsAlexander Graf
The Gekko has some SPR values that differ from other PPC core values and also some additional ones. Let's add support for them in our mfspr/mtspr emulator. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-25KVM: PPC: Add hidden flag for paired singlesAlexander Graf
The Gekko implements an extension called paired singles. When the guest wants to use that extension, we need to make sure we're not running the host FPU, because all FPU instructions need to get emulated to accomodate for additional operations that occur. This patch adds an hflag to track if we're in paired single mode or not. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-25KVM: PPC: Add AGAIN type for emulation returnAlexander Graf
Emulation of an instruction can have different outcomes. It can succeed, fail, require MMIO, do funky BookE stuff - or it can just realize something's odd and will be fixed the next time around. Exactly that is what EMULATE_AGAIN means. Using that flag we can now tell the caller that nothing happened, but we still want to go back to the guest and see what happens next time we come around. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-25KVM: PPC: Teach MMIO SignednessAlexander Graf
The guest I was trying to get to run uses the LHA and LHAU instructions. Those instructions basically do a load, but also sign extend the result. Since we need to fill our registers by hand when doing MMIO, we also need to sign extend manually. This patch implements sign extended MMIO and the LHA(U) instructions. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-25KVM: PPC: Enable MMIO to do 64 bits, fprs and qprsAlexander Graf
Right now MMIO access can only happen for GPRs and is at most 32 bit wide. That's actually enough for almost all types of hardware out there. Unfortunately, the guest I was using used FPU writes to MMIO regions, so it ended up writing 64 bit MMIOs using FPRs and QPRs. So let's add code to handle those odd cases too. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-25KVM: PPC: Make fpscr 64-bitAlexander Graf
Modern PowerPCs have a 64 bit wide FPSCR register. Let's accomodate for that and make it 64 bits in our vcpu struct too. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-25KVM: PPC: Add QPR registersAlexander Graf
The Gekko has GPRs, SPRs and FPRs like normal PowerPC codes, but it also has QPRs which are basically single precision only FPU registers that get used when in paired single mode. The following patches depend on them being around, so let's add the definitions early. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-13Revert "powerpc/mm: Bump SECTION_SIZE_BITS from 16MB to 256MB"Benjamin Herrenschmidt
This reverts commit 7545ba6f82924d4523f8f8a2baf2e517a750265d. It breaks eHEA among other issues
2010-04-07powerpc: Add kprobe-based event tracerMahesh Salgaonkar
This patch ports the kprobe-based event tracer to powerpc. This patch is based on x86 port. This brings powerpc on par with x86. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Acked-by: Masami Hiramatsu <mhiramat@redhat.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-04-07powerpc/mm: Bump SECTION_SIZE_BITS from 16MB to 256MBAnton Blanchard
The current setting for SECTION_SIZE_BITS is quite small compared to everyone else: arch/powerpc/include/asm/sparsemem.h:#define SECTION_SIZE_BITS 24 arch/sparc/include/asm/sparsemem.h:#define SECTION_SIZE_BITS 30 arch/ia64/include/asm/sparsemem.h:#define SECTION_SIZE_BITS (30) arch/s390/include/asm/sparsemem.h:#define SECTION_SIZE_BITS 28 arch/x86/include/asm/sparsemem.h:# define SECTION_SIZE_BITS 27 And it has proven to be an issue during boot on very large machines. If hotplug memory is enabled, drivers/base/memory.c does this: for (i = 0; i < NR_MEM_SECTIONS; i++) { if (!present_section_nr(i)) continue; err = add_memory_block(0, __nr_to_section(i), MEM_ONLINE, 0, BOOT); if (!ret) ret = err; } Which creates a sysfs directory for every 16MB of memory. As a result I'm seeing up to 30 minutes spent here during boot: c000000000248ee0 .__sysfs_add_one+0x28/0x128 c0000000002492a8 .sysfs_add_one+0x38/0x188 c000000000249c88 .create_dir+0x70/0x138 c000000000249d98 .sysfs_create_dir+0x48/0x78 c00000000032bad8 .kobject_add_internal+0x140/0x308 c00000000032beb4 .kobject_init_and_add+0x4c/0x68 c00000000046c2c0 .sysdev_register+0xa0/0x220 c00000000047b1dc .add_memory_block+0x124/0x1e8 c0000000008d1f28 .memory_dev_init+0xf4/0x168 c0000000008d1b64 .driver_init+0x50/0x64 c000000000890378 .do_basic_setup+0x40/0xd4 I assume there are some O(n^2) issues in sysfs as we add all the memory nodes. Bumping SECTION_SIZE_BITS to 256 MB drops the time to about 10 seconds and results in a much smaller /sys. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-04-07powerpc/numa: Set a smaller value for RECLAIM_DISTANCE to enable zone reclaimAnton Blanchard
I noticed /proc/sys/vm/zone_reclaim_mode was 0 on a ppc64 NUMA box. It gets enabled via this: /* * If another node is sufficiently far away then it is better * to reclaim pages in a zone before going off node. */ if (distance > RECLAIM_DISTANCE) zone_reclaim_mode = 1; Since we use the default value of 20 for REMOTE_DISTANCE and 20 for RECLAIM_DISTANCE it never kicks in. The local to remote bandwidth ratios can be quite large on System p machines so it makes sense for us to reclaim clean pagecache locally before going off node. The patch below sets a smaller value for RECLAIM_DISTANCE and thus enables zone reclaim. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-04-07powerpc/ppc32: Fixup pmd_page to work when ARCH_PFN_OFFSET is non-zeroJason Gunthorpe
Instead of referencing mem_map directly, use pfn_to_page. Otherwise the kernel crashes when trying to start userspace if ARCH_PFN_OFFSET is non-zero and CONFIG_BOOKE is not defined Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-04-07powerpc/pseries: Export data from new hcall H_EM_GET_PARMSVaidyanathan Srinivasan
Add support for H_EM_GET_PARMS hcall that will return data related to power modes from the platform. Export the data directly to user space for administrative tools to interpret and use. cat /proc/powerpc/lparcfg will export power mode data Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-03-26Merge branch 'perf-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: powerpc/perf_events: Fix call-graph recording, add perf_arch_fetch_caller_regs perf top: Add missing initialization to zero perf probe: Use original address instead of CU-based address perf probe: Fix offset to allow signed value perf top: Improve the autosizing of column lenghts perf probe: Fix need_dwarf flag if lazy matching is used perf probe: Fix probe_point buffer overrun
2010-03-19powerpc: Use correct ccr bit for syscall error statusNathan Lynch
The powerpc implementations of syscall_get_error and syscall_set_return_value should use CCR0:S0 (0x10000000) for testing and setting syscall error status. Fortunately these APIs don't seem to be used at the moment. Signed-off-by: Nathan Lynch <ntl@pobox.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-03-19Merge commit 'kumar/merge' into mergeBenjamin Herrenschmidt
2010-03-18powerpc/perf_events: Fix call-graph recording, add perf_arch_fetch_caller_regsPaul Mackerras
This implements a powerpc version of perf_arch_fetch_caller_regs to get correct call-graphs. It's implemented in assembly because that way we can be sure there isn't a stack frame for perf_arch_fetch_caller_regs. If it was in C, gcc might or might not create a stack frame for it, which would affect the number of levels we have to skip. With this, we see results from perf record -e lock:lock_acquire like this: # Samples: 24878 # # Overhead Command Shared Object Symbol # ........ .............. ................. ...... # 14.99% perf [kernel.kallsyms] [k] ._raw_spin_lock | --- ._raw_spin_lock | |--25.00%-- .alloc_fd | (nil) | | | |--50.00%-- .anon_inode_getfd | | .sys_perf_event_open | | syscall_exit | | syscall | | create_counter | | __cmd_record | | run_builtin | | main | | 0xfd2e704 | | 0xfd2e8c0 | | (nil) ... etc. Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: anton@samba.org Cc: linuxppc-dev@ozlabs.org Cc: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20100318050513.GA6575@drongo> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-16powerpc/85xx: Make sure lwarx hint isn't set on ppc32Kumar Gala
e500v1/v2 based chips will treat any reserved field being set in an opcode as illegal. Thus always setting the hint in the opcode is a bad idea. Anton should be kept away from the powerpc opcode map. Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-03-12Merge branch 'merge' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc/booke: Fix breakpoint/watchpoint one-shot behavior powerpc: Reduce printk from pseries_mach_cpu_die() powerpc: Move checks in pseries_mach_cpu_die() powerpc: Reset kernel stack on cpu online from cede state powerpc: Fix G5 thermal shutdown powerpc/pseries: Pass CPPR value to H_XIRR hcall powerpc/booke: Fix a couple typos in the advanced ptrace code powerpc: Fix SMP build with disabled CPU hotplugging. powerpc: Dynamically allocate pacas powerpc/perf: e500 support powerpc/perf: Build callchain code regardless of hardware event support. powerpc/cpm2: Checkpatch cleanup powerpc/86xx: Renaming following split of GE Fanuc joint venture powerpc/86xx: Convert gef_pic_lock to raw_spinlock powerpc/qe: Convert qe_ic_lock to raw_spinlock powerpc/82xx: Convert pci_pic_lock to raw_spinlock powerpc/85xx: Convert socrates_fpga_pic_lock to raw_spinlock
2010-03-12dma-mapping: powerpc: use generic pci_set_dma_mask and ↵FUJITA Tomonori
pci_set_consistent_dma_mask This converts powerpc to use the generic pci_set_dma_mask and pci_set_consistent_dma_mask (drivers/pci/pci.c). The generic pci_set_dma_mask does what powerpc's pci_set_dma_mask does. Unlike powerpc's pci_set_consistent_dma_mask, the gneric pci_set_consistent_dma_mask sets only coherent_dma_mask. It doesn't work for powerpc? pci_set_consistent_dma_mask API should set only coherent_dma_mask? Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Greg KH <greg@kroah.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12pci-dma: add linux/pci-dma.h to linux/pci.hFUJITA Tomonori
All the architectures properly set NEED_DMA_MAP_STATE now so we can safely add linux/pci-dma.h to linux/pci.h and remove the linux/pci-dma.h inclusion in arch's asm/pci.h Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by: Arnd Bergmann <arnd@arndb.de> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12pci-dma: powerpc: use include/linux/pci-dma.hFUJITA Tomonori
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12ptrace: move user_enable_single_step & co prototypes to linux/ptrace.hChristoph Hellwig
While in theory user_enable_single_step/user_disable_single_step/ user_enable_blockstep could also be provided as an inline or macro there's no good reason to do so, and having the prototype in one places keeps code size and confusion down. Roland said: The original thought there was that user_enable_single_step() et al might well be only an instruction or three on a sane machine (as if we have any of those!), and since there is only one call site inlining would be beneficial. But I agree that there is no strong reason to care about inlining it. As to the arch changes, there is only one thought I'd add to the record. It was always my thinking that for an arch where PTRACE_SINGLESTEP does text-modifying breakpoint insertion, user_enable_single_step() should not be provided. That is, arch_has_single_step()=>true means that there is an arch facility with "pure" semantics that does not have any unexpected side effects. Inserting a breakpoint might do very unexpected strange things in multi-threaded situations. Aside from that, it is a peculiar side effect that user_{enable,disable}_single_step() should cause COW de-sharing of text pages and so forth. For PTRACE_SINGLESTEP, all these peculiarities are the status quo ante for that arch, so having arch_ptrace() itself do those is one thing. But for building other things in the future, it is nicer to have a uniform "pure" semantics that arch-independent code can expect. OTOH, all such arch issues are really up to the arch maintainer. As of today, there is nothing but ptrace using user_enable_single_step() et al so it's a distinction without a practical difference. If/when there are other facilities that use user_enable_single_step() and might care, the affected arch's can revisit the question when someone cares about the quality of the arch support for said new facility. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Roland McGrath <roland@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12Add generic sys_olduname()Christoph Hellwig
Add generic implementations of the old and really old uname system calls. Note that sh only implements sys_olduname but not sys_oldolduname, but I'm not going to bother with another ifdef for that special case. m32r implemented an old uname but never wired it up, so kill it, too. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: James Morris <jmorris@namei.org> Cc: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12improve sys_newuname() for compat architecturesChristoph Hellwig
On an architecture that supports 32-bit compat we need to override the reported machine in uname with the 32-bit value. Instead of doing this separately in every architecture introduce a COMPAT_UTS_MACHINE define in <asm/compat.h> and apply it directly in sys_newuname(). Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: James Morris <jmorris@namei.org> Cc: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12Add generic sys_ipc wrapperChristoph Hellwig
Add a generic implementation of the ipc demultiplexer syscall. Except for s390 and sparc64 all implementations of the sys_ipc are nearly identical. There are slight differences in the types of the parameters, where mips and powerpc as the only 64-bit architectures with sys_ipc use unsigned long for the "third" argument as it gets casted to a pointer later, while it traditionally is an "int" like most other paramters. frv goes even further and uses unsigned long for all parameters execept for "ptr" which is a pointer type everywhere. The change from int to unsigned long for "third" and back to "int" for the others on frv should be fine due to the in-register calling conventions for syscalls (we already had a similar issue with the generic sys_ptrace), but I'd prefer to have the arch maintainers looks over this in details. Except for that h8300, m68k and m68knommu lack an impplementation of the semtimedop sub call which this patch adds, and various architectures have gets used - at least on i386 it seems superflous as the compat code on x86-64 and ia64 doesn't even bother to implement it. [akpm@linux-foundation.org: add sys_ipc to sys_ni.c] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Reviewed-by: H. Peter Anvin <hpa@zytor.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: James Morris <jmorris@namei.org> Cc: Andreas Schwab <schwab@linux-m68k.org> Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Acked-by: David Howells <dhowells@redhat.com> Acked-by: Kyle McMartin <kyle@mcmartin.ca> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-09powerpc/booke: Fix a couple typos in the advanced ptrace codeDave Kleikamp
powerpc/booke: Fix a couple typos in the advanced ptrace code Found and fixed a couple typos in the advanced ptrace patches. (These patches are currently in benh's next tree.) Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: linuxppc-dev list <Linuxppc-dev@ozlabs.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-03-09powerpc: Dynamically allocate pacasMichael Ellerman
On 64-bit kernels we currently have a 512 byte struct paca_struct for each cpu (usually just called "the paca"). Currently they are statically allocated, which means a kernel built for a large number of cpus will waste a lot of space if it's booted on a machine with few cpus. We can avoid that by only allocating the number of pacas we need at boot. However this is complicated by the fact that we need to access the paca before we know how many cpus there are in the system. The solution is to dynamically allocate enough space for NR_CPUS pacas, but then later in boot when we know how many cpus we have, we free any unused pacas. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-03-09Merge commit 'kumar/next' into mergeBenjamin Herrenschmidt
2010-03-05Merge branch 'kvm-updates/2.6.34' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
* 'kvm-updates/2.6.34' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (145 commits) KVM: x86: Add KVM_CAP_X86_ROBUST_SINGLESTEP KVM: VMX: Update instruction length on intercepted BP KVM: Fix emulate_sys[call, enter, exit]()'s fault handling KVM: Fix segment descriptor loading KVM: Fix load_guest_segment_descriptor() to inject page fault KVM: x86 emulator: Forbid modifying CS segment register by mov instruction KVM: Convert kvm->requests_lock to raw_spinlock_t KVM: Convert i8254/i8259 locks to raw_spinlocks KVM: x86 emulator: disallow opcode 82 in 64-bit mode KVM: x86 emulator: code style cleanup KVM: Plan obsolescence of kernel allocated slots, paravirt mmu KVM: x86 emulator: Add LOCK prefix validity checking KVM: x86 emulator: Check CPL level during privilege instruction emulation KVM: x86 emulator: Fix popf emulation KVM: x86 emulator: Check IOPL level during io instruction emulation KVM: x86 emulator: fix memory access during x86 emulation KVM: x86 emulator: Add Virtual-8086 mode of emulation KVM: x86 emulator: Add group9 instruction decoding KVM: x86 emulator: Add group8 instruction decoding KVM: do not store wqh in irqfd ... Trivial conflicts in Documentation/feature-removal-schedule.txt
2010-03-05powerpc/perf: e500 supportScott Wood
This implements perf_event support for the Freescale embedded performance monitor, based on the existing perf_event.c that supports server/classic chips. Some limitations: - Performance monitor interrupts are regular EE interrupts, and thus you can't profile places with interrupts disabled. We may want to implement soft IRQ-disabling, with perfmon interrupts exempted and treated as NMIs. - When trying to schedule multiple event groups at once, and using restricted events, situations could arise where scheduling fails even though it would be possible. Consider three groups, each with two events. One group has restricted events, the others don't. The two non-restricted groups are scheduled, then one is removed, which happens to occupy the two counters that can't do restricted events. The remaining non-restricted group will not be moved to the non-restricted-capable counters to make room if the restricted group tries to be scheduled. Signed-off-by: Scott Wood <scottwood@freescale.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-03-03Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu: add __percpu sparse annotations to what's left percpu: add __percpu sparse annotations to fs percpu: add __percpu sparse annotations to core kernel subsystems local_t: Remove leftover local.h this_cpu: Remove pageset_notifier this_cpu: Page allocator conversion percpu, x86: Generic inc / dec percpu instructions local_t: Move local.h include to ringbuffer.c and ring_buffer_benchmark.c module: Use this_cpu_xx to dynamically allocate counters local_t: Remove cpu_local_xx macros percpu: refactor the code in pcpu_[de]populate_chunk() percpu: remove compile warnings caused by __verify_pcpu_ptr() percpu: make accessors check for percpu pointer in sparse percpu: add __percpu for sparse. percpu: make access macros universal percpu: remove per_cpu__ prefix.
2010-03-01Merge branch 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-armLinus Torvalds
* 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm: (100 commits) ARM: Eliminate decompressor -Dstatic= PIC hack ARM: 5958/1: ARM: U300: fix inverted clk round rate ARM: 5956/1: misplaced parentheses ARM: 5955/1: ep93xx: move timer defines into core.c and document ARM: 5954/1: ep93xx: move gpio interrupt support to gpio.c ARM: 5953/1: ep93xx: fix broken build of clock.c ARM: 5952/1: ARM: MM: Add ARM_L1_CACHE_SHIFT_6 for handle inside each ARCH Kconfig ARM: 5949/1: NUC900 add gpio virtual memory map ARM: 5948/1: Enable timer0 to time4 clock support for nuc910 ARM: 5940/2: ARM: MMCI: remove custom DBG macro and printk ARM: make_coherent(): fix problems with highpte, part 2 MM: Pass a PTE pointer to update_mmu_cache() rather than the PTE itself ARM: 5945/1: ep93xx: include correct irq.h in core.c ARM: 5933/1: amba-pl011: support hardware flow control ARM: 5930/1: Add PKMAP area description to memory.txt. ARM: 5929/1: Add checks to detect overlap of memory regions. ARM: 5928/1: Change type of VMALLOC_END to unsigned long. ARM: 5927/1: Make delimiters of DMA area globally visibly. ARM: 5926/1: Add "Virtual kernel memory..." printout. ARM: 5920/1: OMAP4: Enable L2 Cache ... Fix up trivial conflict in arch/arm/mach-mx25/clock.c
2010-03-01KVM: ppc/booke: Set ESR and DEAR when inject interrupt to guestLiu Yu
Old method prematurely sets ESR and DEAR. Move this part after we decide to inject interrupt, which is more like hardware behave. Signed-off-by: Liu Yu <yu.liu@freescale.com> Acked-by: Hollis Blanchard <hollis@penguinppc.org> Acked-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: PPC E500: fix tlbcfg emulationLiu Yu
commit 55fb1027c1cf9797dbdeab48180da530e81b1c39 doesn't update tlbcfg correctly. Fix it. And since guest OS likes 'fixed' hardware, initialize tlbcfg everytime when guest access is useless. So move this part to init code. Signed-off-by: Liu Yu <yu.liu@freescale.com> Acked-by: Alexander Graf <agraf@suse.de> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01KVM: PPC E500: Add register l1csr0 emulationLiu Yu
Latest kernel start to access l1csr0 to contron L1. We just tell guest no operation is on going. Signed-off-by: Liu Yu <yu.liu@freescale.com> Acked-by: Alexander Graf <agraf@suse.de> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01KVM: PPC: Keep SRR1 flags around in shadow_msrAlexander Graf
SRR1 stores more information that just the MSR value. It also stores valuable information about the type of interrupt we received, for example whether the storage interrupt we just got was because of a missing htab entry or not. We use that information to speed up the exit path. Now if we get preempted before we can interpret the shadow_msr values, we get into vcpu_put which then calls the MSR handler, which then sets all the SRR1 information bits in shadow_msr to 0. Great. So let's preserve the SRR1 specific bits in shadow_msr whenever we set the MSR. They don't hurt. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: PPC: Fix initial GPR settingsAlexander Graf
Commit 7d01b4c3ed2bb33ceaf2d270cb4831a67a76b51b introduced PACA backed vcpu values. With this patch, when a userspace app was setting GPRs before it was actually first loaded, the set values get discarded. This is because vcpu_load loads them from the vcpu backing store that we use whenever we're not owning the PACA. That behavior is not really a major problem, because we don't need it for qemu. Other users (like kvmctl) do have problems with it though, so let's better do it right. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: PPC: Add support for FPU/Altivec/VSXAlexander Graf
When our guest starts using either the FPU, Altivec or VSX we need to make sure Linux knows about it and sneak into its process switching code accordingly. This patch makes accesses to the above parts of the system work inside the VM. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: PPC: Add helper functions to call real mode loadersAlexander Graf
Linux contains quite some bits of code to load FPU, Altivec and VSX lazily for a task. It calls those bits in real mode, coming from an interrupt handler. For KVM we better reuse those, so let's wrap a bit of trampoline magic around them and then we can call them from normal module code. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: PPC: Make large pages workAlexander Graf
An SLB entry contains two pieces of information related to size: 1) PTE size 2) SLB size The L bit defines the PTE be "large" (usually means 16MB), SLB_VSID_B_1T defines that the SLB should span 1 GB instead of the default 256MB. Apparently I messed things up and just put those two in one box, shaked it heavily and came up with the current code which handles large pages incorrectly, because it also treats large page SLB entries as "1TB" segment entries. This patch splits those two features apart, making Linux guests boot even when they have > 256MB. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: PPC: Emulate trap SRR1 flags properlyAlexander Graf
Book3S needs some flags in SRR1 to get to know details about an interrupt. One such example is the trap instruction. It tells the guest kernel that a program interrupt is due to a trap using a bit in SRR1. This patch implements above behavior, making WARN_ON behave like WARN_ON. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: PPC: Call SLB patching code in interrupt safe mannerAlexander Graf
Currently we're racy when doing the transition from IR=1 to IR=0, from the module memory entry code to the real mode SLB switching code. To work around that I took a look at the RTAS entry code which is faced with a similar problem and did the same thing: A small helper in linear mapped memory that does mtmsr with IR=0 and then RFIs info the actual handler. Thanks to that trick we can safely take page faults in the entry code and only need to be really wary of what to do as of the SLB switching part. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: PPC: Implement 'skip instruction' modeAlexander Graf
To fetch the last instruction we were interrupted on, we enable DR in early exit code, where we are still in a very transitional phase between guest and host state. Most of the time this seemed to work, but another CPU can easily flush our TLB and HTAB which makes us go in the Linux page fault handler which totally breaks because we still use the guest's SLB entries. To work around that, let's introduce a second KVM guest mode that defines that whenever we get a trap, we don't call the Linux handler or go into the KVM exit code, but just jump over the faulting instruction. That way a potentially bad lwz doesn't trigger any faults and we can later on interpret the invalid instruction we fetched as "fetch didn't work". Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: PPC: Use PACA backed shadow vcpuAlexander Graf
We're being horribly racy right now. All the entry and exit code hijacks random fields from the PACA that could easily be used by different code in case we get interrupted, for example by a #MC or even page fault. After discussing this with Ben, we figured it's best to reserve some more space in the PACA and just shove off some vcpu state to there. That way we can drastically improve the readability of the code, make it less racy and less complex. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: PPC: Add helpers for CR, XERAlexander Graf
We now have helpers for the GPRs, so let's also add some for CR and XER. Having them in the PACA simplifies code a lot, as we don't need to care about where to store CC or not to overflow any integers. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>