Age | Commit message (Collapse) | Author |
|
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
Update the binding docs for Renesas R-Car M3-N. No driver changes are
needed.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
The LPE audio is a child device of i915, it is powered up and down
alongside the igfx and presents no independent runtime interface. This
aptly fulfils the description of a "No-Callback" Device, so mark it
thus.
Fixes: 183c00350ccd ("drm/i915: Fix runtime PM for LPE audio")
Testcase: igt/pm_rpm/basic-pci-d3-state
Testcase: igt/pm_rpm/basic-rte
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180802140416.6062-1-chris@chris-wilson.co.uk
(cherry picked from commit 46e831abe864a6b59fa3de253a681c0f2ee1bf2f)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
The register for 0xe420 is unable to hold any value, including
this bit. The documentation is also mixed between having a
register bit for toggle and having a state command setup
for it. Apparently the register toggle is deprecated.
Remove the register toggle as evidence shows it's futile.
The thing remaining is an apology and humble request for
Mesa folks to resurrect their state setup for this as they
were on right track from start.
This reverts commit 0bf059f3532bb39c52d917142206a8554fc2f1c5.
Fixes: 0bf059f3532b ("drm/i915/icl: WaEnableFloatBlendOptimization")
References: HSDES#1406393558
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Anuj Phogat <anuj.phogat@gmail.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180730120636.26958-1-mika.kuoppala@linux.intel.com
(cherry picked from commit c358514ba8da9e235876db1628cedd19a35803c6)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
RPS provides a feedback loop where we use the load during the previous
evaluation interval to decide whether to up or down clock the GPU
frequency. Our responsiveness is split into 3 regimes, a high and low
plateau with the intent to keep the gpu clocked high to cover occasional
stalls under high load, and low despite occasional glitches under steady
low load, and inbetween. However, we run into situations like kodi where
we want to stay at low power (video decoding is done efficiently
inside the fixed function HW and doesn't need high clocks even for high
bitrate streams), but just occasionally the pipeline is more complex
than a video decode and we need a smidgen of extra GPU power to present
on time. In the high power regime, we sample at sub frame intervals with
a bias to upclocking, and conversely at low power we sample over a few
frames worth to provide what we consider to be the right levels of
responsiveness respectively. At low power, we more or less expect to be
kicked out to high power at the start of a busy sequence by waitboosting.
Prior to commit e9af4ea2b9e7 ("drm/i915: Avoid waitboosting on the active
request") whenever we missed the frame or stalled, we would immediate go
full throttle and upclock the GPU to max. But in commit e9af4ea2b9e7, we
relaxed the waitboosting to only apply if the pipeline was deep to avoid
over-committing resources for a near miss. Sadly though, a near miss is
still a miss, and perceptible as jitter in the frame delivery.
To try and prevent the near miss before having to resort to boosting
after the fact, we use the pageflip queue as an indication that we are
in an "interactive" regime and so should sample the load more frequently
to provide power before the frame misses it vblank. This will make us
more favorable to providing a small power increase (one or two bins) as
required rather than going all the way to maximum and then having to
work back down again. (We still keep the waitboosting mechanism around
just in case a dramatic change in system load requires urgent uplocking,
faster than we can provide in a few evaluation intervals.)
v2: Reduce rps_set_interactive to a boolean parameter to avoid the
confusion of what if they wanted a new power mode after pinning to a
different mode (which to choose?)
v3: Only reprogram RPS while the GT is awake, it will be set when we
wake the GT, and while off warns about being used outside of rpm.
v4: Fix deferred application of interactive mode
v5: s/state/interactive/
v6: Group the mutex with its principle in a substruct
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107111
Fixes: e9af4ea2b9e7 ("drm/i915: Avoid waitboosting on the active request")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Radoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180731132629.3381-1-chris@chris-wilson.co.uk
(cherry picked from commit 60548c554be2830d29d2533dad0ac8133347ee51)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
First of all don't try to read dpcd if PSR is not even supported.
But also, if read failed return -EIO instead of reporting via a
backchannel.
v2: fix dev_priv: At this level m->private is the connector. (CI/DK)
don't convert dpcd read errors to EIO. (DK)
Fixes: 5b7b30864d1d ("drm/i915/psr: Split sink status into a separate debugfs node")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180720003155.16290-1-rodrigo.vivi@intel.com
(cherry picked from commit 7a72c78bdd0a1ea1d879610542679cc680398220)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
This is a fix-up patch for below build issue with multi_v7_defconfig.
drivers/ptp/ptp_qoriq.o: In function `qoriq_ptp_probe':
ptp_qoriq.c:(.text+0xd0c): undefined reference to `__aeabi_uldivmod'
Fixes: 91305f281262 ("ptp_qoriq: support automatic configuration for ptp timer")
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The sock_flag() check is alreay inside sock_enable_timestamp(), so it is
unnecessary checking it in the caller.
void sock_enable_timestamp(struct sock *sk, int flag)
{
if (!sock_flag(sk, flag)) {
...
}
}
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We use to have message like:
struct vhost_msg {
int type;
union {
struct vhost_iotlb_msg iotlb;
__u8 padding[64];
};
};
Unfortunately, there will be a hole of 32bit in 64bit machine because
of the alignment. This leads a different formats between 32bit API and
64bit API. What's more it will break 32bit program running on 64bit
machine.
So fixing this by introducing a new message type with an explicit
32bit reserved field after type like:
struct vhost_msg_v2 {
__u32 type;
__u32 reserved;
union {
struct vhost_iotlb_msg iotlb;
__u8 padding[64];
};
};
We will have a consistent ABI after switching to use this. To enable
this capability, introduce a new ioctl (VHOST_SET_BAKCEND_FEATURE) for
userspace to enable this feature (VHOST_BACKEND_F_IOTLB_V2).
Fixes: 6b1e6cc7855b ("vhost: new device IOTLB API")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The err is not modified after initalization, So remove it and make
it to be void function.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
__inet6_lookup_established() expect th->dport passed in host-endian,
not net-endian. The reason is microoptimization in __inet6_lookup(),
but if you use the lower-level helpers, you have to play by their
rules...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
According to documentation, setting of PHY registers is unnecessary with
R-Car Gen3. The registers are not even described. So, don't initialize
them.
Signed-off-by: Masaharu Hayakawa <masaharu.hayakawa.ry@renesas.com>
[wsa: updated commit message]
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Since R-Car Gen2, a new bit has been introduced to the interrupt mask
register. Update the code to handle it properly as well.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
This reverts commit aece27a2f01be4bb7683790f69cd1bed3a0929a2.
Causes boot failure on some devices.
http://lore.kernel.org/r/CA+G9fYuKW_jCFZPqG4tz=QY9ROfHO38KiCp9XTA+KaDOFVtcqQ@mail.gmail.com
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Signed-off-by: Jeff Layton <jlayton@kernel.org>
|
|
Signed-off-by: Jeff Layton <jlayton@kernel.org>
|
|
As there is precious little left in any DTS files referring to the
node "/chosen@0" as opposed to "/chosen", remove the two checks for
the former node name.
[paul.burton@mips.com:
The modified yamon-dt code only operates on
arch/mips/boot/dts/mti/sead3.dts right now, and that uses chosen
rather than chosen@0 anyway, so this should have no behavioural
effect.]
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Patchwork: https://patchwork.linux-mips.org/patch/20131/
Cc: linux-mips@linux-mips.org
|
|
Remove open-coded uses of set instructions to use CC_SET()/CC_OUT() in
arch/x86/kvm/vmx.c.
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
[Mark error paths as unlikely while touching this. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
ACP->SYSMEM DMA happens at every I2S->SYSMEM period
completion. Thus, there is delay of x frames till
I2S->SYSMEM reaches a period length. This delay is
communicated to user space.
Signed-off-by: Akshu Agrawal <akshu.agrawal@amd.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Give position on ACP->SYSMEM DMA channel for
the number of bytes that have been transferred on
the basis of current descriptor under service.
Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com>
Signed-off-by: Akshu Agrawal <akshu.agrawal@amd.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
In capture case we don't want ACP to SYSMEM dma
to be circular. This is because if an in place DSP
filter is applied to captured output then circular DMA
can overwrite the filter value with stale data.
Signed-off-by: Akshu Agrawal <akshu.agrawal@amd.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Implement paravirtual apic hooks to enable PV IPIs for KVM if the "send IPI"
hypercall is available. The hypercall lets a guest send IPIs, with
at most 128 destinations per hypercall in 64-bit mode and 64 vCPUs per
hypercall in 32-bit mode.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add kvm hypervisor init time platform setup callback which
will be used to replace native apic hooks by pararvirtual
hooks.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Using hypercall to send IPIs by one vmexit instead of one by one for
xAPIC/x2APIC physical mode and one vmexit per-cluster for x2APIC cluster
mode. Intel guest can enter x2apic cluster mode when interrupt remmaping
is enabled in qemu, however, latest AMD EPYC still just supports xapic
mode which can get great improvement by Exit-less IPIs. This patchset
lets a guest send multicast IPIs, with at most 128 destinations per
hypercall in 64-bit mode and 64 vCPUs per hypercall in 32-bit mode.
Hardware: Xeon Skylake 2.5GHz, 2 sockets, 40 cores, 80 threads, the VM
is 80 vCPUs, IPI microbenchmark(https://lkml.org/lkml/2017/12/19/141):
x2apic cluster mode, vanilla
Dry-run: 0, 2392199 ns
Self-IPI: 6907514, 15027589 ns
Normal IPI: 223910476, 251301666 ns
Broadcast IPI: 0, 9282161150 ns
Broadcast lock: 0, 8812934104 ns
x2apic cluster mode, pv-ipi
Dry-run: 0, 2449341 ns
Self-IPI: 6720360, 15028732 ns
Normal IPI: 228643307, 255708477 ns
Broadcast IPI: 0, 7572293590 ns => 22% performance boost
Broadcast lock: 0, 8316124651 ns
x2apic physical mode, vanilla
Dry-run: 0, 3135933 ns
Self-IPI: 8572670, 17901757 ns
Normal IPI: 226444334, 255421709 ns
Broadcast IPI: 0, 19845070887 ns
Broadcast lock: 0, 19827383656 ns
x2apic physical mode, pv-ipi
Dry-run: 0, 2446381 ns
Self-IPI: 6788217, 15021056 ns
Normal IPI: 219454441, 249583458 ns
Broadcast IPI: 0, 7806540019 ns => 154% performance boost
Broadcast lock: 0, 9143618799 ns
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
X86_CR4_OSXSAVE check belongs to sregs check and so move into
kvm_valid_sregs().
Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Considering the fact that the pae_root shadow is not needed when
tdp is in use, skip the pae_root shadow page allocation to allow
mmu creation even not being able to obtain memory from DMA32
zone when particular cgroup cpuset.mems or mempolicy control is
applied.
Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
mmu_set_spte() flushes remote tlbs for drop_parent_pte/drop_spte()
and set_spte() separately. This may introduce redundant flush. This
patch is to combine these flushes and check flush request after
calling set_spte().
Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com>
Reviewed-by: Junaid Shahid <junaids@google.com>
Reviewed-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The host's FS.base and GS.base rarely change, e.g. ~0.1% of host/guest
swaps on my system. Cache the last value written to the VMCS and skip
the VMWRITE to the associated VMCS fields when loading host state if
the value hasn't changed since the last VMWRITE.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
On a 64-bit host, FS.sel and GS.sel are all but guaranteed to be 0,
which in turn means they'll rarely change. Skip the VMWRITE for the
associated VMCS fields when loading host state if the selector hasn't
changed since the last VMWRITE.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The HOST_{FS,GS}_BASE fields are guaranteed to be written prior to
VMENTER, by way of vmx_prepare_switch_to_guest(). Initialize the
fields to zero for 64-bit kernels instead of pulling the base values
from their respective MSRs. In addition to eliminating two RDMSRs,
vmx_prepare_switch_to_guest() can safely assume the initial value of
the fields is zero in all cases.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Make host_state a property of a loaded_vmcs so that it can be
used as a cache of the VMCS fields, e.g. to lazily VMWRITE the
corresponding VMCS field. Treating host_state as a cache does
not work if it's not VMCS specific as the cache would become
incoherent when switching between vmcs01 and vmcs02.
Move vmcs_host_cr3 and vmcs_host_cr4 into host_state.
Explicitly zero out host_state when allocating a new VMCS for a
loaded_vmcs. Unlike the pre-existing vmcs_host_cr{3,4} usage,
the segment information is not guaranteed to be (re)initialized
when running a new nested VMCS, e.g. HOST_FS_BASE is not written
in vmx_set_constant_host_state().
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Remove fs_reload_needed and gs_ldt_reload_needed from host_state
and instead compute whether we need to reload various state at
the time we actually do the reload. The state that is tracked
by the *_reload_needed variables is not any more volatile than
the trackers themselves.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
prepare_vmcs02() has an odd comment that says certain fields are
"not in vmcs02". AFAICT the intent of the comment is to document
that various VMCS fields are not handled by prepare_vmcs02(),
e.g. HOST_{FS,GS}_{BASE,SELECTOR}. While technically true, the
comment is misleading, e.g. it can lead the reader to think that
KVM never writes those fields to vmcs02.
Remove the comment altogether as the handling of FS and GS is
not specific to nested VMX, and GUEST_PML_INDEX has been written
by prepare_vmcs02() since commit "4e59516a12a6 (kvm: vmx: ensure
VMCS is current while enabling PML)"
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Now that the vmx_load_host_state() wrapper is gone, i.e. the only
time we call the core functions is when we're actually about to
switch between guest/host, rename the functions that handle lazy
state switching to vmx_prepare_switch_to_{guest,host}_state() to
better document the full extent of their functionality.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
When lazy save/restore of MSR_KERNEL_GS_BASE was introduced[1], the
MSR was intercepted in all modes and was only restored for the host
when the guest is in 64-bit mode. So at the time, going through the
full host restore prior to accessing MSR_KERNEL_GS_BASE was necessary
to load host state and was not a significant waste of cycles.
Later, MSR_KERNEL_GS_BASE interception was disabled for a 64-bit
guest[2], and then unconditionally saved/restored for the host[3].
As a result, loading full host state is overkill for accesses to
MSR_KERNEL_GS_BASE, and completely unnecessary when the guest is
not in 64-bit mode.
Add a dedicated utility to read/write the guest's MSR_KERNEL_GS_BASE
(outside of the save/restore flow) to minimize the overhead incurred
when accessing the MSR. When setting EFER, only decache the MSR if
the new EFER will disable long mode.
Removing out-of-band usage of vmx_load_host_state() also eliminates,
or at least reduces, potential corner cases in its usage, which in
turn will (hopefuly) make it easier to reason about future changes
to the save/restore flow, e.g. optimization of saving host state.
[1] commit 44ea2b1758d8 ("KVM: VMX: Move MSR_KERNEL_GS_BASE out of the vmx
autoload msr area")
[2] commit 5897297bc228 ("KVM: VMX: Don't intercept MSR_KERNEL_GS_BASE")
[3] commit c8770e7ba63b ("KVM: VMX: Fix host userspace gsbase corruption")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Using 'struct loaded_vmcs*' to track whether the CPU registers
contain host or guest state kills two birds with one stone.
1. The (effective) boolean host_state.loaded is poorly named.
It does not track whether or not host state is loaded into
the CPU registers (which most readers would expect), but
rather tracks if host state has been saved AND guest state
is loaded.
2. Using a loaded_vmcs pointer provides a more robust framework
for the optimized guest/host state switching, especially when
consideration per-VMCS enhancements. To that end, WARN_ONCE
if we try to switch to host state with a different VMCS than
was last used to save host state.
Resolve an occurrence of the new WARN by setting loaded_vmcs after
the call to vmx_vcpu_put() in vmx_switch_vmcs().
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Use local variables in vmx_save_host_state() to temporarily track
the selector and base values for FS and GS, and reorganize the
code so that the 64-bit vs 32-bit portions are contained within
a single #ifdef. This refactoring paves the way for future patches
to modify the updating of VMCS state with minimal changes to the
code, and (hopefully) simplifies resolving a likely conflict with
another in-flight patch[1] by being the whipping boy for future
patches.
[1] https://www.spinics.net/lists/kvm/msg171647.html
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
When checking emulated VMX instructions for faults, the #UD for "IF
(not in VMX operation)" should take precedence over the #GP for "ELSIF
CPL > 0."
Suggested-by: Eric Northup <digitaleric@google.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The fault that should be raised for a privilege level violation is #GP
rather than #UD.
Fixes: 727ba748e110b4 ("kvm: nVMX: Enforce cpl=0 for VMX instructions")
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
We are currently cutting hva_to_pfn_fast short if we do not want an
immediate exit, which is represented by !async && !atomic. However,
this is unnecessary, and __get_user_pages_fast is *much* faster
because the regular get_user_pages takes pmd_lock/pte_lock.
In fact, when many CPUs take a nested vmexit at the same time
the contention on those locks is visible, and this patch removes
about 25% (compared to 4.18) from vmexit.flat on a 16 vCPU
nested guest.
Suggested-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Register tlb_remote_flush callback for vmx when hyperv capability of
nested guest mapping flush is detected. The interface can help to
reduce overhead when flush ept table among vcpus for nested VM. The
tradition way is to send IPIs to all affected vcpus and executes
INVEPT on each vcpus. It will trigger several vmexits for IPI
and INVEPT emulation. Hyper-V provides such hypercall to do
flush for all vcpus and call the hypercall when all ept table
pointers of single VM are same.
Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
This patch is to provide a way for platforms to register hv tlb remote
flush callback and this helps to optimize operation of tlb flush
among vcpus for nested virtualization case.
Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
This patch is to add hyperv_nested_flush_guest_mapping support to trace
hvFlushGuestPhysicalAddressSpace hypercall.
Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Hyper-V supports a pv hypercall HvFlushGuestPhysicalAddressSpace to
flush nested VM address space mapping in l1 hypervisor and it's to
reduce overhead of flushing ept tlb among vcpus. This patch is to
implement it.
Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
On a VM with only 1 vCPU, the locking fast path will always be
successful. In this case, there is no need to use the the PV qspinlock
code which has higher overhead on the unlock side than the native
qspinlock code.
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Merge check of "sp->role.cr4_pae != !!is_pae(vcpu))" and "vcpu->
arch.mmu.sync_page(vcpu, sp) == 0". kvm_mmu_prepare_zap_page()
is called under both these conditions.
Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
It is a duplicate of X86_CR3_PCID_NOFLUSH. So just use that instead.
Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Adds support for storing multiple previous CR3/root_hpa pairs maintained
as an LRU cache, so that the lockless CR3 switch path can be used when
switching back to any of them.
Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
This needs a minor bug fix. The updated patch is as follows.
Thanks,
Junaid
------------------------------------------------------------------------------
kvm_mmu_invlpg() and kvm_mmu_invpcid_gva() only need to flush the TLB
entries for the specific guest virtual address, instead of flushing all
TLB entries associated with the VM.
Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
When the guest indicates that the TLB doesn't need to be flushed in a
CR3 switch, we can also skip resyncing the shadow page tables since an
out-of-sync shadow page table is equivalent to an out-of-sync TLB.
Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|