Age | Commit message (Collapse) | Author |
|
When the guest can use VMX instructions (when the "nested" module option is
on), it should also be able to read and write VMX MSRs, e.g., to query about
VMX capabilities. This patch adds this support.
Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
An implementation of VMX needs to define a VMCS structure. This structure
is kept in guest memory, but is opaque to the guest (who can only read or
write it with VMX instructions).
This patch starts to define the VMCS structure which our nested VMX
implementation will present to L1. We call it "vmcs12", as it is the VMCS
that L1 keeps for its L2 guest. We will add more content to this structure
in later patches.
This patch also adds the notion (as required by the VMX spec) of L1's "current
VMCS", and finally includes utility functions for mapping the guest-allocated
VMCSs in host memory.
Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
This patch allows the guest to enable the VMXE bit in CR4, which is a
prerequisite to running VMXON.
Whether to allow setting the VMXE bit now depends on the architecture (svm
or vmx), so its checking has moved to kvm_x86_ops->set_cr4(). This function
now returns an int: If kvm_x86_ops->set_cr4() returns 1, __kvm_set_cr4()
will also return 1, and this will cause kvm_set_cr4() will throw a #GP.
Turning on the VMXE bit is allowed only when the nested VMX feature is
enabled, and turning it off is forbidden after a vmxon.
Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
This patch allows a guest to use the VMXON and VMXOFF instructions, and
emulates them accordingly. Basically this amounts to checking some
prerequisites, and then remembering whether the guest has enabled or disabled
VMX operation.
Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
This patch adds to kvm_intel a module option "nested". This option controls
whether the guest can use VMX instructions, i.e., whether we allow nested
virtualization. A similar, but separate, option already exists for the
SVM module.
This option currently defaults to 0, meaning that nested VMX must be
explicitly enabled by giving nested=1. When nested VMX matures, the default
should probably be changed to enable nested VMX by default - just like
nested SVM is currently enabled by default.
Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
During tracing the emulator, we noticed that init_emulate_ctxt()
sometimes took a bit longer time than we expected.
This patch is for mitigating the problem by some degree.
By looking into the function, we soon notice that it clears the whole
decode_cache whose size is about 2.5K bytes now. Furthermore, most of
the bytes are taken for the two read_cache arrays, which are used only
by a few instructions.
Considering the fact that we are not assuming the cache arrays have
been cleared when we store actual data, we do not need to clear the
arrays: 2K bytes elimination. In addition, we can avoid clearing the
fetch_cache and regs arrays.
This patch changes the initialization not to clear the arrays.
On our 64-bit host, init_emulate_ctxt() becomes 0.3 to 0.5us faster with
this patch applied.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
Use a local pointer to the emulate_ctxt for simplicity. Then, arrange
the hard-to-read mode selection lines neatly.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
So far kvm_arch_vcpu_setup is responsible for freeing the vcpu struct if
it fails. Move this confusing resonsibility back into the hands of
kvm_vm_ioctl_create_vcpu. Only kvm_arch_vcpu_setup of x86 is affected,
all other archs cannot fail.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
In VMX, before we bring down a CPU we must VMCLEAR all VMCSs loaded on it
because (at least in theory) the processor might not have written all of its
content back to memory. Since a patch from June 26, 2008, this is done using
a per-cpu "vcpus_on_cpu" linked list of vcpus loaded on each CPU.
The problem is that with nested VMX, we no longer have the concept of a
vcpu being loaded on a cpu: A vcpu has multiple VMCSs (one for L1, a pool for
L2s), and each of those may be have been last loaded on a different cpu.
So instead of linking the vcpus, we link the VMCSs, using a new structure
loaded_vmcs. This structure contains the VMCS, and the information pertaining
to its loading on a specific cpu (namely, the cpu number, and whether it
was already launched on this cpu once). In nested we will also use the same
structure to hold L2 VMCSs, and vmx->loaded_vmcs is a pointer to the
currently active VMCS.
Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
Acked-by: Acked-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
Instead of blacklisting known-unsupported cpuid leaves, whitelist known-
supported leaves. This is more conservative and prevents us from reporting
features we don't support. Also whitelist a few more leaves while at it.
Signed-off-by: Avi Kivity <avi@redhat.com>
Acked-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
Introduce drop_parent_pte to remove the rmap of parent pte and
clear parent pte
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
Cleanup the same operation between kvm_mmu_page_unlink_children and
mmu_pte_write_zap_pte
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
Parent pte rmap and page rmap are very similar, so use the same arithmetic
for them
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
Abstract the operation of rmap to spte_list, then we can use it for the
reverse mapping of parent pte in the later patch
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
Fix:
warning: ‘cs_sel’ may be used uninitialized in this function
warning: ‘ss_sel’ may be used uninitialized in this function
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
Simply use __copy_to_user/__clear_user to write guest page since we have
already verified the user address when the memslot is set
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
Simply return from kvm_mmu_pte_write path if no shadow page is
write-protected, then we can avoid to walk all shadow pages and hold
mmu-lock
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
vmcs_readl() and friends are really short, but gcc thinks they are long because of
the out-of-line exception handlers. Mark them always_inline to clear the
misunderstanding.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
We clean up a failed VMREAD by clearing the output register. Do
it in the exception handler instead of unconditionally. This is
worthwhile since there are more than a hundred call sites.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
Dereference it in the actual users.
This not only cleans up the emulator but also makes it easy to convert
the old emulation functions to the new em_xxx() form later.
Note: Remove some inline keywords to let the compiler decide inlining.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
Dereference it in the actual users: only do_insn_fetch_byte().
This is consistent with the way __linearize() dereferences it.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
The two macros need special care to use:
Assume rc, ctxt, ops and done exist outside of them.
Can goto outside.
Considering the fact that these are used only in decode functions,
moving these right after do_insn_fetch() seems to be a right thing
to improve the readability.
We also rename do_fetch_insn_byte() to do_insn_fetch_byte() to be
consistent.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
Just compiling pseries in the kernel causes it to override
memory_block_size_bytes() regardless of what is the runtime
platform.
This cleans up the implementation of that function, fixing
a bug or two while at it, so that it's harmless (and potentially
useful) for other platforms. Without this, bugs in that code
would trigger a WARN_ON() in drivers/base/memory.c when
booting some different platforms.
If/when we have another platform supporting memory hotplug we
might want to either move that out to a generic place or
make it a ppc_md. callback.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
The macro MIN_MEMORY_BLOCK_SIZE is currently defined twice in two .c
files, and I need it in a third one to fix a powerpc bug, so let's
first move it into a header
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6 into omap/fixes
|
|
Not only does this file duplicate <linux/bitops.h> and implement a
well-known antipattern, it is also not used by anything.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ycmiao/pxa-linux-2.6 into fixes
|
|
Removing __init from check_platform_magic since it is called by
xen_unplug_emulated_devices in non-init contexts (It probably gets inlined
because of -finline-functions-called-once, removing __init is more to avoid
mismatch being reported).
Signed-off-by: Raghavendra D Prabhu <rprabhu@wnohang.net>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
In the past we would only use the function's value if the
returned value was not equal to 'acpi_sci_override_gsi'. Meaning
that the INT_SRV_OVR values for global and source irq were different.
But it is OK to use the function's value even when the global
and source irq are the same.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
In the past (2.6.38) the 'xen_allocate_pirq_gsi' would allocate
an entry in a Linux IRQ -> {XEN_IRQ, type, event, ..} array. All
of that has been removed in 2.6.39 and the Xen IRQ subsystem uses
an linked list that is populated when the call to
'xen_allocate_irq_gsi' (universally done from any of the xen_bind_*
calls) is done. The 'xen_allocate_pirq_gsi' is a NOP and there is
no need for it anymore so lets remove it.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
As the code paths that are guarded by CONFIG_XEN_DOM0 already depend
on CONFIG_ACPI so the extra #ifdef is not required. The earlier
patch that added them in had done its job.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
.. which means we can preset of NR_IRQS_LEGACY interrupts using
the 'acpi_get_override_irq' API before this loop.
This means that we can get the IRQ's polarity (and trigger) from either
the ACPI (or MP); or use the default values. This fixes a bug if we did
not have an IOAPIC we would not been able to preset the IRQ's polarity
if the MP table existed.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
Since they are only called once and the rest of the pci_xen_*
functions follow the same pattern of setup.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
.. to cut down on the code duplicity.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
Not before .. also that code segment starts looking like the HVM one.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
In the past we would guard those code segments to be dependent
on CONFIG_XEN_DOM0 (which depends on CONFIG_ACPI) so this patch is
not stricly necessary. But the next patch will merge common
HVM and initial domain code and we want to make sure the CONFIG_ACPI
dependency is preserved - as HVM code does not depend on CONFIG_XEN_DOM0.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
Update the out-dated comment at the beginning of the file.
Also provide the copyrights of folks who have been contributing
to this code lately.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
The file is hard to read. Move the code around so that
the contents of it follows a uniform format:
- setup GSIs - PV, HVM, and initial domain case
- then MSI/MSI-x setup - PV, HVM and then initial domain case.
- then MSI/MSI-x teardown - same order.
- lastly, the __init functions in PV, HVM, and initial domain order.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
The code in setup_ioapic_irq() determines the Destination Field,
so why not also include it in the debug printk output that gets
displayed when the boot parameter "apic=debug" is used.
Before the change, "dmesg" will show:
IOAPIC[0]: Set routing entry (8-1 -> 0x31 -> IRQ 1 Mode:0 Active:0)
IOAPIC[0]: Set routing entry (8-2 -> 0x30 -> IRQ 0 Mode:0 Active:0)
IOAPIC[0]: Set routing entry (8-3 -> 0x33 -> IRQ 3 Mode:0 Active:0) ...
After the change, you will see:
IOAPIC[0]: Set routing entry (8-1 -> 0x31 -> IRQ 1 Mode:0 Active:0 Dest:0)
IOAPIC[0]: Set routing entry (8-2 -> 0x30 -> IRQ 0 Mode:0 Active:0 Dest:0)
IOAPIC[0]: Set routing entry (8-3 -> 0x33 -> IRQ 3 Mode:0 Active:0 Dest:0) ...
Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com>
Link: http://lkml.kernel.org/r/20110708184603.2734.91071.sendpatchset@nchumbalkar.americas.cpqcorp.net
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
When IOAPIC data is displayed in "dmesg" with the help of the
boot parameter "apic=debug" certain values are not formatted
correctly wrt their size.
In the "dmesg" snippet below, note that the output for "max
redirection entries", and "IO APIC version" which are each
defined to be just 8-bits long are displayed as 2 bytes in
length. Similarly, "Dst" under the "IRQ redirection table"
should only be 8-bits long.
IO APIC #0......
...
...
.... register #01: 00170020
....... : max redirection entries: 0017
....... : PRQ implemented: 0
....... : IO APIC version: 0020
...
...
.... IRQ redirection table:
NR Dst Mask Trig IRR Pol Stat Dmod Deli Vect:
00 000 1 0 0 0 0 0 0 00
01 000 0 0 0 0 0 0 0 31
02 000 0 0 0 0 0 0 0 30
03 000 1 0 0 0 0 0 0 33
...
...
Do some formatting clean up, so you will see output like below:
IO APIC #0......
...
...
.... register #01: 00170020
....... : max redirection entries: 17
....... : PRQ implemented: 0
....... : IO APIC version: 20
...
...
.... IRQ redirection table:
NR Dst Mask Trig IRR Pol Stat Dmod Deli Vect:
00 00 1 0 0 0 0 0 0 00
01 00 0 0 0 0 0 0 0 31
02 00 0 0 0 0 0 0 0 30
03 00 1 0 0 0 0 0 0 33
...
...
Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com>
Link: http://lkml.kernel.org/r/20110708184557.2734.61830.sendpatchset@nchumbalkar.americas.cpqcorp.net
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Commit 2706a0bf7b ("x86, NUMA: Enable CONFIG_AMD_NUMA on 32bit
too") enabled AMD NUMA for 32bit too. Unfortunately, SPARSEMEM
on 32bit had rather coarse (512MiB) addr->node mapping
granularity due to lack of space in page->flags. This led to
boot failure on certain AMD NUMA machines which had 128MiB
alignment on nodes.
Patches to properly detect this condition and reject NUMA
configuration are posted[1] but deemed too pervasive for merge
at this point (-rc6). Disable AMD NUMA for 32bit for now and
re-enable once the detection logic is merged.
[1] http://thread.gmane.org/gmane.linux.kernel/1161279/focus=1162583
Reported-by: Hans Rosenfeld <hans.rosenfeld@amd.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Conny Seidel <conny.seidel@amd.com>
Link: http://lkml.kernel.org/r/20110711083432.GC943@htj.dyndns.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Signed-off-by: Michael Witten <mfwitten@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
Sync with Linus' tree to be able to apply pending patches that
are based on newer code already present upstream.
|
|
Make sure that the 'static' keywork is at the beginning of declaration
for arch/xtensa/variants/s6000/include/variant/dmac.h
This gets rid of warnings like
warning: ‘static’ is not at beginning of declaration
when building with -Wold-style-declaration (and/or -Wextra which also
enables it).
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
Make sure that the 'static' keywork is at the beginning of declaration
for arch/sh/*
This gets rid of warnings like
warning: ‘static’ is not at beginning of declaration
when building with -Wold-style-declaration (and/or -Wextra which also
enables it).
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
Make sure that the 'static' keywork is at the beginning of declaration
for arch/mips/include/asm/mach-jz4740/gpio.h
This gets rid of warnings like
warning: ‘static’ is not at beginning of declaration
when building with -Wold-style-declaration (and/or -Wextra which also
enables it).
Also a few tiny whitespace changes.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
Make sure that the 'static' keywork is at the beginning of declaration
for arch/arm/mach-shmobile/board-ap4evb.c
This gets rid of warnings like
warning: ‘static’ is not at beginning of declaration
when building with -Wold-style-declaration (and/or -Wextra which also
enables it).
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
|
|
After commits d13586574d373ef40acd4725c9a269daa355e412 ("OMAP: McBSP:
implement functional clock switching via clock framework") and
cf4c87abe238ec17cd0255b4e21abd949d7f811e ("OMAP: McBSP: implement
McBSP CLKR and FSR signal muxing via mach-omap2/mcbsp.c"), any OMAP1
board (such as the AMS Delta) that uses the ASoC McBSP driver will no
longer build:
sound/built-in.o: In function `omap_mcbsp_dai_set_dai_sysclk':
last.c:(.text+0x24ff8): undefined reference to `omap2_mcbsp1_mux_clkr_src'
last.c:(.text+0x2500c): undefined reference to `omap2_mcbsp1_mux_fsr_src'
make: *** [vmlinux] Error 1
Fix by defining three OMAP1-only dummy functions for
omap2_mcbsp1_mux_clkr_src(), omap2_mcbsp1_mux_fsr_src(), and
omap2_mcbsp_set_clks_src().
Normally, code that is OMAP SoC-revision-specific like this should go
under the arch/arm/*omap* directories, and get abstracted away from
drivers via struct platform_data function pointers. This doesn't work
in this case since there doesn't appear to be any convenient way to access
struct platform_data (or something like it) in the current design of
the sound/soc/omap/omap-mcbsp.c driver.
Reported by Janusz Krzysztofik <jkrzyszt@tis.icnet.pl> and Tony Lindgren
<tony@atomide.com>. Janusz also posted a patch to fix this at:
http://www.spinics.net/lists/linux-omap/msg39560.html
(among other places), but the following approach seems less dependent
on compiler behavior.
This patch passes build tests for ams_delta_defconfig and omap2plus_defconfig,
but since I don't have an AMS Delta here, I can't boot test it on that
platform.
Signed-off-by: Paul Walmsley <paul@pwsan.com>
Cc: Janusz Krzysztofik <jkrzyszt@tis.icnet.pl>
Cc: Tony Lindgren <tony@atomide.com>
Cc: Jarkko Nikula <jhnikula@gmail.com>
Cc: Peter Ujfalusi <peter.ujfalusi@nokia.com>
Cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: Liam Girdwood <lrg@slimlogic.co.uk>
Signed-off-by: Tony Lindgren <tony@atomide.com>
|
|
A pcmcia_init callback isn't used on any of the platforms. Drop it.
Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>
|