Age | Commit message (Collapse) | Author |
|
Intel SDM 27.5.2 Loading Host Segment and Descriptor-Table Registers:
"The GDTR and IDTR limits are each set to FFFFH."
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Similar to NMI, there may be ISA specific reasons why an SMI cannot be
injected into the guest. This commit adds a new smi_allowed callback to
be implemented in following commits.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Entering and exiting SMM may require ISA specific handling under certain
circumstances. This commit adds two new callbacks with empty implementations.
Actual functionality will be added in following commits.
* pre_enter_smm() is to be called when injecting an SMM, before any
SMM related vcpu state has been changed
* pre_leave_smm() is to be called when emulating the RSM instruction,
when the vcpu is in real mode and before any SMM related vcpu state
has been restored
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
It has always annoyed me a bit how SVM_EXIT_NPF is handled by
pf_interception. This is also the only reason behind the
under-documented need_unprotect argument to kvm_handle_page_fault.
Let NPF go straight to kvm_mmu_page_fault, just like VMX
does in handle_ept_violation and handle_ept_misconfig.
Reviewed-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Checking the mode is unnecessary, and is done without a memory barrier
separating the LAPIC write from the vcpu->mode read; in addition,
kvm_vcpu_wake_up is already doing a check for waiters on the wait queue
that has the same effect.
In practice it's safe because spin_lock has full-barrier semantics on x86,
but don't be too clever.
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Remove redundant null checks before calling kmem_cache_destroy.
Found with make coccicheck M=arch/x86/kvm on linux-next tag
next-20170929.
Signed-off-by: Tim Hansen <devtimhansen@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
SDM mentioned:
"If either the “unrestricted guest†VM-execution control or the “mode-based
execute control for EPT†VM- execution control is 1, the “enable EPTâ€
VM-execution control must also be 1."
However, we can still observe unrestricted_guest is Y after inserting the kvm-intel.ko
w/ ept=N. It depends on later starts a guest in order that the function
vmx_compute_secondary_exec_control() can be executed, then both the module parameter
and exec control fields will be amended.
This patch fixes it by amending module parameter immediately during vmcs data setup.
Reviewed-by: Jim Mattson <jmattson@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Jim Mattson <jmattson@google.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
- XCR0 is reset to 1 by RESET but not INIT
- XSS is zeroed by both RESET and INIT
- BNDCFGU, BND0-BND3, BNDCFGS, BNDSTATUS are zeroed by both RESET and INIT
This patch does this according to SDM.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Jim Mattson <jmattson@google.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Our routines look at tscdeadline and period when deciding state of a
timer. The timer is disarmed when switching between TSC deadline and
other modes, so we should set everything to disarmed state.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
preemption timer only looks at tscdeadline and could inject already
disarmed timer.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
0 should disable the timer, but start_hv_timer will recognize it as an
expired timer instead.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The kvm slabs can consume a significant amount of system memory
and indeed in our production environment we have observed that
a lot of machines are spending significant amount of memory that
can not be left as system memory overhead. Also the allocations
from these slabs can be triggered directly by user space applications
which has access to kvm and thus a buggy application can leak
such memory. So, these caches should be accounted to kmemcg.
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Let's just name these according to the SDM. This should make it clearer
that the are used to enable exiting and not the feature itself.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Changing it afterwards doesn't make too much sense and will only result
in inconsistencies.
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
No need for another enable_ept check. kvm->arch.ept_identity_map_addr
only has to be inititalized once. Having alloc_identity_pagetable() is
overkill and dropping BUG_ONs is always nice.
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
They are inititally 0, so no need to reset them to 0.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
vcpu->cpu is not cleared when doing a vmx_vcpu_put/load, so this can be
dropped.
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Without this, we won't be able to do any flushes, so let's just require
it. Should be absent in very strange configurations.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
ept_* function should only be called with enable_ept being set.
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
This function is only called with enable_ept.
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
vmx and svm use zalloc, so this is not necessary.
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Make it a void and drop error handling code.
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
And also get rid of that superfluous local variable "kvm".
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Let's just drop the return.
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
The description in the Intel SDM of how the divide configuration
register is used: "The APIC timer frequency will be the processor's bus
clock or core crystal clock frequency divided by the value specified in
the divide configuration register."
Observation of baremetal shown that when the TDCR is change, the TMCCT
does not change or make a big jump in value, but the rate at which it
count down change.
The patch update the emulation to APIC timer to so that a change to the
divide configuration would be reflected in the value of the counter and
when the next interrupt is triggered.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
[Fixed some whitespace and added a check for negative delta and running
timer. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
If we take TSC-deadline mode timer out of the picture, the Intel SDM
does not say that the timer is disable when the timer mode is change,
either from one-shot to periodic or vice versa.
After this patch, the timer is no longer disarmed on change of mode, so
the counter (TMCCT) keeps counting down.
So what does a write to LVTT changes ? On baremetal, the change of mode
is probably taken into account only when the counter reach 0. When this
happen, LVTT is use to figure out if the counter should restard counting
down from TMICT (so periodic mode) or stop counting (if one-shot mode).
This patch is based on observation of the behavior of the APIC timer on
baremetal as well as check that they does not go against the description
written in the Intel SDM.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
[Fixed rate limiting of periodic timer.]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Extract the logic of limit lapic periodic timer frequency to a new function,
this function will be used by later patches.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
SDM 10.5.4.1 TSC-Deadline Mode mentioned that "Transitioning between TSC-Deadline
mode and other timer modes also disarms the timer". So the APIC Timer Initial Count
Register for one-shot/periodic mode should be reset. This patch do it.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
[Removed unnecessary definition of APIC_LVT_TIMER_MASK.]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
KVM doesn't expose the PLE capability to the L1 hypervisor, however,
ple_window still shows the default value on L1 hypervisor. This patch
fixes it by clearing all the PLE related module parameter if there is
no PLE capability.
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
When KVM emulates an exit from L2 to L1, it loads L1 CR4 into the
guest CR4. Before this CR4 loading, the guest CR4 refers to L2
CR4. Because these two CR4's are in different levels of guest, we
should vmx_set_cr4() rather than kvm_set_cr4() here. The latter, which
is used to handle guest writes to its CR4, checks the guest change to
CR4 and may fail if the change is invalid.
The failure may cause trouble. Consider we start
a L1 guest with non-zero L1 PCID in use,
(i.e. L1 CR4.PCIDE == 1 && L1 CR3.PCID != 0)
and
a L2 guest with L2 PCID disabled,
(i.e. L2 CR4.PCIDE == 0)
and following events may happen:
1. If kvm_set_cr4() is used in load_vmcs12_host_state() to load L1 CR4
into guest CR4 (in VMCS01) for L2 to L1 exit, it will fail because
of PCID check. As a result, the guest CR4 recorded in L0 KVM (i.e.
vcpu->arch.cr4) is left to the value of L2 CR4.
2. Later, if L1 attempts to change its CR4, e.g., clearing VMXE bit,
kvm_set_cr4() in L0 KVM will think L1 also wants to enable PCID,
because the wrong L2 CR4 is used by L0 KVM as L1 CR4. As L1
CR3.PCID != 0, L0 KVM will inject GP to L1 guest.
Fixes: 4704d0befb072 ("KVM: nVMX: Exiting from L2 to L1")
Cc: qemu-stable@nongnu.org
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Enable ostm0 and ostm1 timers to be used as clock source and clockevent
source. The timers provides greater accuracy than the already enabled
mtu2 one.
With these enabled:
clocksource: ostm: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 57352151442 ns
sched_clock: 32 bits at 33MHz, resolution 30ns, wraps every 64440619504ns
ostm: used for clocksource
ostm: used for clock events
Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Suggested-by: Chris Brandt <chris.brandt@renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
|
|
Add pin configuration subnode for ETHER pin group.
Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
|
|
This patch adds DMA properties to the HSUSB node.
Signed-off-by: Biju Das <biju.das@bp.renesas.com>
Signed-off-by: Chris Paterson <chris.paterson2@renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
|
|
Signed-off-by: Biju Das <biju.das@bp.renesas.com>
Signed-off-by: Chris Paterson <chris.paterson2@renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
|
|
Enable HS-USB device for the iWave G20D-Q7 carrier board based on
RZ/G1M.
Also disable the host mode support on usb otg port by default to avoid
pin conflicts.
Signed-off-by: Biju Das <biju.das@bp.renesas.com>
Signed-off-by: Chris Paterson <chris.paterson2@renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
|
|
Define the R8A7743 generic part of the HS-USB device node. It is up to the
board file to enable the device.
Signed-off-by: Biju Das <biju.das@bp.renesas.com>
Signed-off-by: Chris Paterson <chris.paterson2@renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
|
|
The core interrupt code can call the affinity setter for inactive
interrupts under certain circumstances.
For inactive intererupts which use managed or reservation mode this is a
pointless exercise as the activation will assign a vector which fits the
destination mask.
Check for this and return w/o going through the vector assignment.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
With printch() the console messages are sent out one character at a time
which is agonizingly slow especially with semihosting as the whole trap
intercept, remote byte access, and system resume danse is performed for
every single character across a relatively slow remote debug connection.
Let's use printascii() to send a whole string at once. This is also going
to be more efficient, albeit to a quite lesser extent, with serial ports
as well.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
|
The svc instruction doesn't exist on v7m processors. Semihosting ops are
invoked with the bkpt instruction instead.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
|
This was located in .text which is meant to be read-only. And in the XIP
case this shortcut simply doesn't work and may trigger a Flash controller
mode switch and crash the kernel. Move it to the .bss area.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
|
By default sparse uses the characteristics of the build
machine to infer things like the wordsize.
This is fine when doing native builds but for ARM it's,
I suspect, very rarely the case and if the build are done
on a 64bit machine we get a bunch of warnings like:
'cast truncates bits from constant value (... becomes ...)'
Fix this by adding the -m32 flags for sparse.
Reported-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
|
Some nommu systems have RAM at address 0. When vectors are not located
there, the very beginning of memory remains available for dynamic
allocations. The memblock allocator explicitly skips the first page
but the standard page allocator does not, and while it correctly returns
a non-null struct page pointer for that page, page_address() gives 0
which gets confused with NULL (out of memory) by callers despite having
plenty of free memory left.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
|
Signed-off-by: Biju Das <biju.das@bp.renesas.com>
Signed-off-by: Chris Paterson <chris.paterson2@renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
|
|
Enable internal AHB-PCI bridges for the USB EHCI/OHCI controllers
attached to them.
Signed-off-by: Biju Das <biju.das@bp.renesas.com>
Signed-off-by: Chris Paterson <chris.paterson2@renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
|
|
Describe the PCI USB devices that are behind the PCI bridges, adding
necessary links to the USB PHY device.
Signed-off-by: Biju Das <biju.das@bp.renesas.com>
Signed-off-by: Chris Paterson <chris.paterson2@renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
|
|
Define the r8a7745 generic part of the USB PHY device node.
Signed-off-by: Biju Das <biju.das@bp.renesas.com>
Signed-off-by: Chris Paterson <chris.paterson2@renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
|
|
Add device nodes for the r8a7745 internal PCI bridge devices.
Signed-off-by: Biju Das <biju.das@bp.renesas.com>
Signed-off-by: Chris Paterson <chris.paterson2@renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
|
|
The following 'capacity-dmips-mhz' dt property values are used:
Cortex-A15: 1024, Cortex-A7: 539
They have been derived form the cpu_efficiency values:
Cortex-A15: 3891, Cortex-A7: 2048
by scaling them so that the Cortex-A15s (big cores) use 1024.
The cpu_efficiency values were originally derived from the "Big.LITTLE
Processing with ARM Cortex™-A15 & Cortex-A7" white paper
(http://www.cl.cam.ac.uk/~rdm34/big.LITTLE.pdf). Table 1 lists 1.9x
(3891/2048) as the Cortex-A15 vs Cortex-A7 performance ratio for the
Dhrystone benchmark.
The following platform is affected once cpu-invariant accounting
support is re-connected to the task scheduler:
r8a7790-lager
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
|