Age | Commit message (Collapse) | Author |
|
This ensures that the translations for unmapped IO mappings or
unmapped memory are properly removed from the MMU hash table
before such an unplug. Without this, the hypervisor refuses the
unplug operations due to those resources still being mapped by
the partition.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Firmware changed the way it represents memory and cpu affinity on POWER7.
Unfortunately the old method now caps the topology to work around issues
with legacy operating systems. For Linux to get the correct topology we
need to use the new form 1 affinity information.
We set the form 1 field in the client architecture, and if we see "1" in the
ibm,associativity-form property firmware supports form 1 affinity and
we should look at the first field in the ibm,associativity-reference-points
array. If not we use the second field as we always have.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
The following commit broke CONFIG_RELOCATABLE support on FSL Book-E
parts:
commit 549e8152de8039506f69c677a4546e5427aa6ae7
Author: Paul Mackerras <paulus@samba.org>
Date: Sat Aug 30 11:43:47 2008 +1000
powerpc: Make the 64-bit kernel as a position-independent executable
The change to __va and __pa to use PAGE_OFFSET & MEMORY_START causes
problems on the Book-E parts because we don't know MEMORY_START until
after we parse the device tree. We need __va to work properly to even
parse the device tree so we have a chicken an egg. So go back to using
he other definition of __va/__pa on CONFIG_BOOKE and use the
PAGE_OFFSET/MEMORY_START version on "Classic" PPC64.
Also updated casts to handle phys_addr_t being a different size from
unsigned long (ie 36-bit physical on PPC32).
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
|
|
discontiguous => discontinuous
coealesce => coalesce
Signed-off-by: Thomas Weber <weber@corscience.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
When we destory a vcpu, we should also make sure to kill all pending
timers that could still be up. When not doing this, hrtimers might
dereference null pointers trying to call our code.
This patch fixes spontanious kernel panics seen after closing VMs.
Signed-off-by: Alexander Graf <alex@csgraf.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
While converting the kzalloc we used to allocate our vcpu struct to
vmalloc, I forgot to memset the contents to zeros. That broke quite
a lot.
This patch memsets it to zero again.
Signed-off-by: Alexander Graf <alex@csgraf.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
We used to use get_free_pages to allocate our vcpu struct. Unfortunately
that call failed on me several times after my machine had a big enough
uptime, as memory became too fragmented by then.
Fortunately, we don't need it to be page aligned any more! We can just
vmalloc it and everything's great.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
We don't need as complex code. I had some thinkos while writing it, figuring
I needed to support PPC32 paths on PPC64 which would have required DR=0, but
everything just runs fine with DR=1.
So let's make the functions simple C call wrappers that reserve some space on
the stack for the respective functions to clobber.
Fixes out-of-RMA-access (and thus guest FPU loading) on the PS3.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
We had code to make use of the secondary htab buckets, but kept that
disabled because it was unstable when I put it in.
I checked again if that's still the case and apparently it was only
exposing some instability that was there anyways before. I haven't
seen any badness related to usage of secondary htab entries so far.
This should speed up guest memory allocations by quite a bit, because
we now have more space to put PTEs in.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
We need to tell userspace that we can emulate paired single instructions.
So let's add a capability export.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
The one big thing about the Gekko is paired singles.
Paired singles are an extension to the instruction set, that adds 32 single
precision floating point registers (qprs), some SPRs to modify the behavior
of paired singled operations and instructions to deal with qprs to the
instruction set.
Unfortunately, it also changes semantics of existing operations that affect
single values in FPRs. In most cases they get mirrored to the coresponding
QPR.
Thanks to that we need to emulate all FPU operations and all the new paired
single operations too.
In order to achieve that, we use the just introduced FPU call helpers to
call the real FPU whenever the guest wants to modify an FPR. Additionally
we also fix up the QPR values along the way.
That way we can execute paired single FPU operations without implementing a
soft fpu.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
When we get a program interrupt we usually don't expect it to perform an
MMIO operation. But why not? When we emulate paired singles, we can end
up loading or storing to an MMIO address - and the handling of those
happens in the program interrupt handler.
So let's teach the program interrupt handler how to deal with EMULATE_MMIO.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
The PowerPC specification always lists bits from MSB to LSB. That is
really confusing when you're trying to write C code, because it fits
in pretty badly with the normal (1 << xx) schemes.
So I came up with some nice wrappers that allow to get and set fields
in a u64 with bit numbers exactly as given in the spec. That makes the
code in KVM and the spec easier comparable.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
BATs didn't work. Well, they did, but only up to BAT3. As soon as we
came to BAT4 the offset calculation was screwed up and we ended up
overwriting BAT0-3.
Fortunately, Linux hasn't been using BAT4+. It's still a good
idea to write correct code though.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
To emulate paired single instructions, we need to be able to call FPU
operations from within the kernel. Since we don't want gcc to spill
arbitrary FPU code everywhere, we tell it to use a soft fpu.
Since we know we can really call the FPU in safe areas, let's also add
some calls that we can later use to actually execute real world FPU
operations on the host's FPU.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
We need to call the ext giveup handlers from code outside of book3s.c.
So let's make it non-static.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
The Book3S KVM implementation contains some helper functions to load and store
data from and to virtual addresses.
Unfortunately, this helper used to keep the physical address it so nicely
found out for us to itself. So let's change that and make it return the
physical address it resolved.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
The Book3S_32 specifications allows for two instructions to modify segment
registers: mtsrin and mtsr.
Most normal operating systems use mtsrin, because it allows to define which
segment it wants to change using a register. But since I was trying to run
an embedded guest, it turned out to be using mtsr with hardcoded values.
So let's also emulate mtsr. It's a valid instruction after all.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
There's a typo in the debug ifdef of the book3s_32 mmu emulation. While trying
to debug something I stumbled across that and wanted to save anyone after me
(or myself later) from having to debug that again.
So let's fix the ifdef.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
There are some situations when we're pretty sure the guest will use the
FPU soon. So we can save the churn of going into the guest, finding out
it does want to use the FPU and going out again.
This patch adds preloading of the FPU when it's reasonable.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
When we for example get an Altivec interrupt, but our guest doesn't support
altivec, we need to inject a program interrupt, not an altivec interrupt.
The same goes for paired singles. When an altivec interrupt arrives, we're
pretty sure we need to emulate the instruction because it's a paired single
operation.
So let's make all the ext handlers aware that they need to jump to the
program interrupt handler when an extension interrupt arrives that
was not supposed to arrive for the guest CPU.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
The Gekko has some SPR values that differ from other PPC core values and
also some additional ones.
Let's add support for them in our mfspr/mtspr emulator.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
The Gekko implements an extension called paired singles. When the guest wants
to use that extension, we need to make sure we're not running the host FPU,
because all FPU instructions need to get emulated to accomodate for additional
operations that occur.
This patch adds an hflag to track if we're in paired single mode or not.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
Emulation of an instruction can have different outcomes. It can succeed,
fail, require MMIO, do funky BookE stuff - or it can just realize something's
odd and will be fixed the next time around.
Exactly that is what EMULATE_AGAIN means. Using that flag we can now tell
the caller that nothing happened, but we still want to go back to the
guest and see what happens next time we come around.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
The guest I was trying to get to run uses the LHA and LHAU instructions.
Those instructions basically do a load, but also sign extend the result.
Since we need to fill our registers by hand when doing MMIO, we also need
to sign extend manually.
This patch implements sign extended MMIO and the LHA(U) instructions.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
Right now MMIO access can only happen for GPRs and is at most 32 bit wide.
That's actually enough for almost all types of hardware out there.
Unfortunately, the guest I was using used FPU writes to MMIO regions, so
it ended up writing 64 bit MMIOs using FPRs and QPRs.
So let's add code to handle those odd cases too.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
Modern PowerPCs have a 64 bit wide FPSCR register. Let's accomodate for that
and make it 64 bits in our vcpu struct too.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
The Gekko has GPRs, SPRs and FPRs like normal PowerPC codes, but
it also has QPRs which are basically single precision only FPU registers
that get used when in paired single mode.
The following patches depend on them being around, so let's add the
definitions early.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
Int is not long enough to store the size of a dirty bitmap.
This patch fixes this problem with the introduction of a wrapper
function to calculate the sizes of dirty bitmaps.
Note: in mark_page_dirty(), we have to consider the fact that
__set_bit() takes the offset as int, not long.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
|
|
a recent fc11 udev update on an 83xx board made root console login
disappear:
Updating : udev-141-8.fc11.ppc 32/83
udev: starting version 141
udev: deprecated sysfs layout; update the kernel or disable CONFIG_SYSFS_DEPRECATED;
some udev features will not work correctly
and sure enough, turning off SYSFS_DEPRECATED brings the login prompt back.
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
|
|
to enable the storage controller on earlier rev. mpc834x itx boards.
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
|
|
83xx users looking to run apache will experience this error:
/var/log/apache2/error.log:
[emerg] (38)Function not implemented: Couldn't create pollset in child; check system or user limits
enabling CONFIG_EPOLL in kernel config fixes this so apache can run.
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
|
|
83xx boards typically have the DS1307 or DS1374:
git grep RTC_DRV arch/powerpc/configs/*83* | grep "=y"
arch/powerpc/configs/83xx/asp8347_defconfig:CONFIG_RTC_DRV_DS1374=y
arch/powerpc/configs/83xx/mpc8313_rdb_defconfig:CONFIG_RTC_DRV_DS1307=y
arch/powerpc/configs/83xx/mpc8315_rdb_defconfig:CONFIG_RTC_DRV_DS1307=y
arch/powerpc/configs/83xx/mpc832x_mds_defconfig:CONFIG_RTC_DRV_DS1374=y
arch/powerpc/configs/83xx/mpc834x_itx_defconfig:CONFIG_RTC_DRV_DS1307=y
arch/powerpc/configs/83xx/mpc834x_itxgp_defconfig:CONFIG_RTC_DRV_DS1307=y
arch/powerpc/configs/83xx/mpc834x_mds_defconfig:CONFIG_RTC_DRV_DS1374=y
arch/powerpc/configs/83xx/mpc836x_mds_defconfig:CONFIG_RTC_DRV_DS1374=y
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
|
|
Some board setup functions call cpm1_clk_setup() or cmp2_clk_setup()
to configure the clock source.
If CPM_CLK_RTX has been used for the parameter mode,
the clock has been configured only for TX but not for RX.
With this patch CPM_CLK_RTX configures the clock for both directions
correctly.
Signed-off-by: Wolfgang Ocker <weo@reccoware.de>
Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
|
|
The code was looking for this in cpu_features, not mmu_features. Fix this.
Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
|
|
Currently some MPC85xx and MPC86xx boards fail to build without
CONFIG_PCI:
arch/powerpc/platforms/fsl_uli1575.c: In function 'quirk_final_uli5249':
arch/powerpc/platforms/fsl_uli1575.c:234: error: implicit declaration of function 'pci_bus_for_each_resource'
arch/powerpc/platforms/fsl_uli1575.c:234: error: expected ';' before '{' token
cc1: warnings being treated as errors
arch/powerpc/platforms/fsl_uli1575.c:223: warning: unused variable 'dummy'
make[1]: *** [arch/powerpc/platforms/fsl_uli1575.o] Error 1
This patch fixes the issue by appending 'if PCI' condition when
selecting FSL_ULI1575 Kconfig symbol.
Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
|
|
This reverts commit 7545ba6f82924d4523f8f8a2baf2e517a750265d.
It breaks eHEA among other issues
|
|
This patch ports the kprobe-based event tracer to powerpc. This patch
is based on x86 port. This brings powerpc on par with x86.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Adds support for suspend/resume for VIO devices. This is needed for
support for HMC initiated hibernation.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Replace open-coded rate limiting logic with __ratelimit().
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@ozlabs.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
The current setting for SECTION_SIZE_BITS is quite small compared to
everyone else:
arch/powerpc/include/asm/sparsemem.h:#define SECTION_SIZE_BITS 24
arch/sparc/include/asm/sparsemem.h:#define SECTION_SIZE_BITS 30
arch/ia64/include/asm/sparsemem.h:#define SECTION_SIZE_BITS (30)
arch/s390/include/asm/sparsemem.h:#define SECTION_SIZE_BITS 28
arch/x86/include/asm/sparsemem.h:# define SECTION_SIZE_BITS 27
And it has proven to be an issue during boot on very large machines.
If hotplug memory is enabled, drivers/base/memory.c does this:
for (i = 0; i < NR_MEM_SECTIONS; i++) {
if (!present_section_nr(i))
continue;
err = add_memory_block(0, __nr_to_section(i), MEM_ONLINE,
0, BOOT);
if (!ret)
ret = err;
}
Which creates a sysfs directory for every 16MB of memory. As a result
I'm seeing up to 30 minutes spent here during boot:
c000000000248ee0 .__sysfs_add_one+0x28/0x128
c0000000002492a8 .sysfs_add_one+0x38/0x188
c000000000249c88 .create_dir+0x70/0x138
c000000000249d98 .sysfs_create_dir+0x48/0x78
c00000000032bad8 .kobject_add_internal+0x140/0x308
c00000000032beb4 .kobject_init_and_add+0x4c/0x68
c00000000046c2c0 .sysdev_register+0xa0/0x220
c00000000047b1dc .add_memory_block+0x124/0x1e8
c0000000008d1f28 .memory_dev_init+0xf4/0x168
c0000000008d1b64 .driver_init+0x50/0x64
c000000000890378 .do_basic_setup+0x40/0xd4
I assume there are some O(n^2) issues in sysfs as we add all the memory
nodes. Bumping SECTION_SIZE_BITS to 256 MB drops the time to about 10
seconds and results in a much smaller /sys.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
We have had issues in the past with ibm,os-term initiating shutdown of a
partition. This is confusing to the user, especially if panic_timeout is
non zero.
The temporary fix was to avoid calling ibm,os-term if a panic_timeout was set
and since we set it on every boot we basically never call ibm,os-term.
An extended version of ibm,os-term has since been implemented which gives us
the behaviour we want:
"When the platform supports extended ibm,os-term behavior, the return to the
RTAS will always occur unless there is a kernel assisted dump active as
initiated by an ibm,configure-kernel-dump call."
This patch checks for the ibm,extended-os-term property and calls ibm,os-term
if it exists.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
I noticed /proc/sys/vm/zone_reclaim_mode was 0 on a ppc64 NUMA box. It gets
enabled via this:
/*
* If another node is sufficiently far away then it is better
* to reclaim pages in a zone before going off node.
*/
if (distance > RECLAIM_DISTANCE)
zone_reclaim_mode = 1;
Since we use the default value of 20 for REMOTE_DISTANCE and 20 for
RECLAIM_DISTANCE it never kicks in.
The local to remote bandwidth ratios can be quite large on System p
machines so it makes sense for us to reclaim clean pagecache locally before
going off node.
The patch below sets a smaller value for RECLAIM_DISTANCE and thus enables
zone reclaim.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Use set_cpus_allowed_ptr rather than set_cpus_allowed.
The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@
expression E1,E2;
@@
- set_cpus_allowed(E1, cpumask_of_cpu(E2))
+ set_cpus_allowed_ptr(E1, cpumask_of(E2))
@@
expression E;
identifier I;
@@
- set_cpus_allowed(E, I)
+ set_cpus_allowed_ptr(E, &I)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Add an unlock before exiting the function.
A simplified version of the semantic patch that finds this problem is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@r exists@
expression E1;
identifier f;
@@
f (...) { <+...
* spin_lock_irq (E1,...);
... when != E1
* return ...;
...+> }
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
kasprintf combines kmalloc and sprintf, and takes care of the size
calculation itself.
The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@
expression a,flag;
expression list args;
statement S;
@@
a =
- \(kmalloc\|kzalloc\)(...,flag)
+ kasprintf(flag,args)
<... when != a
if (a == NULL || ...) S
...>
- sprintf(a,args);
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Nathan Fontenot <nfont@austin.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
dlpar_free_cc_nodes frees its argument, so dlpar_online_cpu should not be
called on the same value. Skip over the call to dlpar_online_cpu by
jumping directly to out.
A simplified version of the semantic patch that finds this problem is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@@
expression E,E2;
@@
dlpar_free_cc_nodes(E)
...
(
E = E2
|
* E
)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Commit 0119536c, which added the assembly version of strncmp to
powerpc, mentions that it adds two instructions to the version from
boot/string.S to allow it to handle len=0. Unfortunately, it doesn't
always return 0 when that is the case. The length is passed in r5, but
the return value is passed back in r3. In certain cases, this will
happen to work. Otherwise it will pass back the address of the first
string as the return value.
This patch lifts the len <= 0 handling code from memcpy to handle that
case.
Reported by: Christian_Sellars@symantec.com
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
CC: <stable@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Fix minor nits found by cppcheck
[./arch/powerpc/platforms/powermac/low_i2c.c:594]: (style) The scope of the variable chans can be reduced
[./arch/powerpc/platforms/powermac/low_i2c.c:594]: (style) The scope of the variable i can be reduced
[./arch/powerpc/platforms/powermac/low_i2c.c:1260]: (style) Redundant condition. It is safe to deallocate a NULL pointer
Signed-off-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|