Age | Commit message (Collapse) | Author |
|
Pull block fixes from Jens Axboe:
"A few fixes that should make it into this release. This contains:
- io_uring:
- The timeout command assumes sequence == 0 means that we want
one completion, but this kind of overloading is unfortunate as
it prevents users from doing a pure time based wait. Since
this operation was introduced in this cycle, let's correct it
now, while we can. (me)
- One-liner to fix an issue with dependent links and fixed
buffer reads. The actual IO completed fine, but the link got
severed since we stored the wrong expected value. (me)
- Add TIMEOUT to list of opcodes that don't need a file. (Pavel)
- rsxx missing workqueue destry calls. Old bug. (Chuhong)
- Fix blk-iocost active list check (Jiufei)
- Fix impossible-to-hit overflow merge condition, that still hit some
folks very rarely (Junichi)
- Fix bfq hang issue from 5.3. This didn't get marked for stable, but
will go into stable post this merge (Paolo)"
* tag 'for-linus-20191115' of git://git.kernel.dk/linux-block:
rsxx: add missed destroy_workqueue calls in remove
iocost: check active_list of all the ancestors in iocg_activate()
block, bfq: deschedule empty bfq_queues not referred by any process
io_uring: ensure registered buffer import returns the IO length
io_uring: Fix getting file for timeout
block: check bi_size overflow before merge
io_uring: make timeout sequence == 0 mean no sequence
|
|
We can't use "adap->dev" after it has been freed.
Fixes: 5bf4fa7daea6 ("i2c: break out OF support into separate file")
Signed-off-by: Wen Yang <wenyang@linux.alibaba.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
Many cheap devices use Silead touchscreen controllers. Testing has shown
repeatedly that these touchscreen controllers work fine at 400KHz, but for
unknown reasons do not work properly at 100KHz. This has been seen on
both ARM and x86 devices using totally different i2c controllers.
On some devices the ACPI tables list another device at the same I2C-bus
as only being capable of 100KHz, testing has shown that these other
devices work fine at 400KHz (as can be expected of any recent I2C hw).
This commit makes i2c_acpi_find_bus_speed() always return 400KHz if a
Silead touchscreen controller is present, fixing the touchscreen not
working on devices which ACPI tables' wrongly list another device on the
same bus as only being capable of 100KHz.
Specifically this fixes the touchscreen on the Jumper EZpad 6 m4 not
working.
Reported-by: youling 257 <youling257@gmail.com>
Tested-by: youling 257 <youling257@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
[wsa: rewording warning a little]
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
|
|
Richard Cochran says:
====================
ptp: Validate the ancillary ioctl flags more carefully.
The flags passed to the ioctls for periodic output signals and
time stamping of external signals were never checked, and thus formed
a useless ABI inadvertently. More recently, a version 2 of the ioctls
was introduced in order make the flags meaningful. This series
tightens up the checks on the new ioctl flags.
- Patch 1 ensures at least one edge flag is set for the new ioctl.
- Patches 2-7 are Jacob's recent checks, picking up the tags.
- Patch 8 introduces a "strict" flag for passing to the drivers when the
new ioctl is used.
- Patches 9-12 implement the "strict" checking in the drivers.
- Patch 13 extends the test program to exercise combinations of flags.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Because each driver and hardware has different capabilities, the test
cannot provide a simple pass/fail result, but it can at least show what
combinations of flags are supported.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This driver enables rising edge or falling edge, but not both, and so
this patch validates that the request contains only one of the two
edges.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This hardware always time stamps rising and falling edges, and so this
patch validates that the request does contains both edges.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This driver enables rising edge or falling edge, but not both, and so
this patch validates that the request contains only one of the two
edges.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This driver enables rising edge or falling edge, but not both, and so
this patch validates that the request contains only one of the two
edges.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
User space may request time stamps on rising edges, falling edges, or
both. However, the particular mode may or may not be supported in the
hardware or in the driver. This patch adds a "strict" flag that tells
drivers to ensure that the requested mode will be honored.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix the renesas PTP support to explicitly reject any future flags that
get added to the external timestamp request ioctl.
In order to maintain currently functioning code, this patch accepts all
three current flags. This is because the PTP_RISING_EDGE and
PTP_FALLING_EDGE flags have unclear semantics and each driver seems to
have interpreted them slightly differently.
Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix the mlx5 core PTP support to explicitly reject any future flags that
get added to the external timestamp request ioctl.
In order to maintain currently functioning code, this patch accepts all
three current flags. This is because the PTP_RISING_EDGE and
PTP_FALLING_EDGE flags have unclear semantics and each driver seems to
have interpreted them slightly differently.
[ RC: I'm not 100% sure what this driver does, but if I'm not wrong it
follows the dp83640:
flags Meaning
---------------------------------------------------- --------------------------
PTP_ENABLE_FEATURE Time stamp rising edge
PTP_ENABLE_FEATURE|PTP_RISING_EDGE Time stamp rising edge
PTP_ENABLE_FEATURE|PTP_FALLING_EDGE Time stamp falling edge
PTP_ENABLE_FEATURE|PTP_RISING_EDGE|PTP_FALLING_EDGE Time stamp falling edge
]
Cc: Feras Daoud <ferasda@mellanox.com>
Cc: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix the igb PTP support to explicitly reject any future flags that
get added to the external timestamp request ioctl.
In order to maintain currently functioning code, this patch accepts all
three current flags. This is because the PTP_RISING_EDGE and
PTP_FALLING_EDGE flags have unclear semantics and each driver seems to
have interpreted them slightly differently.
This HW always time stamps both edges:
flags Meaning
---------------------------------------------------- --------------------------
PTP_ENABLE_FEATURE Time stamp both edges
PTP_ENABLE_FEATURE|PTP_RISING_EDGE Time stamp both edges
PTP_ENABLE_FEATURE|PTP_FALLING_EDGE Time stamp both edges
PTP_ENABLE_FEATURE|PTP_RISING_EDGE|PTP_FALLING_EDGE Time stamp both edges
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix the dp83640 PTP support to explicitly reject any future flags that
get added to the external timestamp request ioctl.
In order to maintain currently functioning code, this patch accepts all
three current flags. This is because the PTP_RISING_EDGE and
PTP_FALLING_EDGE flags have unclear semantics and each driver seems to
have interpreted them slightly differently.
For the record, the semantics of this driver are:
flags Meaning
---------------------------------------------------- --------------------------
PTP_ENABLE_FEATURE Time stamp rising edge
PTP_ENABLE_FEATURE|PTP_RISING_EDGE Time stamp rising edge
PTP_ENABLE_FEATURE|PTP_FALLING_EDGE Time stamp falling edge
PTP_ENABLE_FEATURE|PTP_RISING_EDGE|PTP_FALLING_EDGE Time stamp falling edge
Cc: Stefan Sørensen <stefan.sorensen@spectralink.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix the mv88e6xxx PTP support to explicitly reject any future flags that
get added to the external timestamp request ioctl.
In order to maintain currently functioning code, this patch accepts all
three current flags. This is because the PTP_RISING_EDGE and
PTP_FALLING_EDGE flags have unclear semantics and each driver seems to
have interpreted them slightly differently.
For the record, the semantics of this driver are:
flags Meaning
---------------------------------------------------- --------------------------
PTP_ENABLE_FEATURE Time stamp falling edge
PTP_ENABLE_FEATURE|PTP_RISING_EDGE Time stamp rising edge
PTP_ENABLE_FEATURE|PTP_FALLING_EDGE Time stamp falling edge
PTP_ENABLE_FEATURE|PTP_RISING_EDGE|PTP_FALLING_EDGE Time stamp rising edge
Cc: Brandon Streiff <brandon.streiff@ni.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 823eb2a3c4c7 ("PTP: add support for one-shot output") introduced
a new flag for the PTP periodic output request ioctl. This flag is not
currently supported by any driver.
Fix all drivers which implement the periodic output request ioctl to
explicitly reject any request with flags they do not understand. This
ensures that the driver does not accidentally misinterpret the
PTP_PEROUT_ONE_SHOT flag, or any new flag introduced in the future.
This is important for forward compatibility: if a new flag is
introduced, the driver should reject requests to enable the flag until
the driver has actually been modified to support the flag in question.
Cc: Felipe Balbi <felipe.balbi@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Christopher Hall <christopher.s.hall@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 415606588c61 ("PTP: introduce new versions of IOCTLs")
introduced a new external time stamp ioctl that validates the flags.
This patch extends the validation to ensure that at least one rising
or falling edge flag is set when enabling external time stamps.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The driver calls release_resource in remove to match request_mem_region
in probe, which is incorrect.
Fix it by using the right one, release_mem_region.
Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If a malicious device gives a short MAC it can elicit up to
5 bytes of leaked memory out of the driver. We need to check for
ETH_ALEN instead.
Reported-by: syzbot+a8d4acdad35e6bbca308@syzkaller.appspotmail.com
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
mlxsw does not support VXLAN devices with a physical device attached and
vetoes such configurations upon enslavement to an offloaded bridge.
Commit 0ce1822c2a08 ("vxlan: add adjacent link to limit depth level")
changed the VXLAN device to be an upper of the physical device which
causes mlxsw to veto the creation of the VXLAN device with "Unknown
upper device type".
This is OK as this configuration is not supported, but it prevents us
from testing bad flows involving the enslavement of VXLAN devices with a
physical device to a bridge, regardless if the physical device is an
mlxsw netdev or not.
Adjust the test to use a dummy device as a physical device instead of a
mlxsw netdev.
Fixes: 0ce1822c2a08 ("vxlan: add adjacent link to limit depth level")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Metadata runs are supposed to be aligned on 4k boundary (so that they work
efficiently with disks with 4k sectors). However, there was a programming
bug that makes them aligned on 128k boundary instead. The unused space is
wasted.
Fix this bug by providing a proper 4k alignment. In order to keep
existing volumes working, we introduce a new flag SB_FLAG_FIXED_PADDING
- when the flag is clear, we calculate the padding the old way. In order
to make sure that the old version cannot mount the volume created by the
new version, we increase superblock version to 4.
Also in order to not break with old integritysetup, we fix alignment
only if the parameter "fix_padding" is present when formatting the
device.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
The driver forgets to destroy workqueue in remove() similarly to what is
done when probe() fails. Add a call to destroy_workqueue() to fix it.
Since unregistration will wait for the work to finish, we do not need to
cancel/flush the work instance in remove().
Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20191114023405.31477-1-hslester96@gmail.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
No timer must be left running when the device goes away.
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Reported-and-tested-by: syzbot+b6c55daa701fc389e286@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/1573726121.17351.3.camel@suse.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
Pull ceph fixes from Ilya Dryomov:
"Two fixes for the buffered reads and O_DIRECT writes serialization
patch that went into -rc1 and a fixup for a bogus warning on older gcc
versions"
* tag 'ceph-for-5.4-rc8' of git://github.com/ceph/ceph-client:
rbd: silence bogus uninitialized warning in rbd_object_map_update_finish()
ceph: increment/decrement dio counter on async requests
ceph: take the inode lock before acquiring cap refs
|
|
When a lookup is done, the afs filesystem will perform a bulk status-fetch
operation on the requested vnode (file) plus the next 49 other vnodes from
the directory list (in AFS, directory contents are downloaded as blobs and
parsed locally). When the results are received, it will speculatively
populate the inode cache from the extra data.
However, if the lookup races with another lookup on the same directory, but
for a different file - one that's in the 49 extra fetches, then if the bulk
status-fetch operation finishes first, it will try and update the inode
from the other lookup.
If this other inode is still in the throes of being created, however, this
will cause an assertion failure in afs_apply_status():
BUG_ON(test_bit(AFS_VNODE_UNSET, &vnode->flags));
on or about fs/afs/inode.c:175 because it expects data to be there already
that it can compare to.
Fix this by skipping the update if the inode is being created as the
creator will presumably set up the inode with the same information.
Fixes: 39db9815da48 ("afs: Fix application of the results of a inline bulk status fetch")
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fix from Will Deacon:
"One trivial fix for -rc8/final that ensures that the script used to
detect RELR relocation support in the toolchain works correctly when
$CC contains quotes. Although it fails safely (by failing to detect
the support when it exists), it would be nice to have this fixed in
5.4 given that it was only introduced in the last merge window.
Summary:
- Handle CC variables containing quotes in tools-support-relr.sh
script"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
scripts/tools-support-relr.sh: un-quote variables
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS fixes from Paul Burton:
"A fix and simplification for SGI IP27 exception handlers, and a small
MAINTAINERS update for Broadcom MIPS systems"
* tag 'mips_fixes_5.4_4' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MAINTAINERS: Remove Kevin as maintainer of BMIPS generic platforms
MIPS: SGI-IP27: fix exception handler replication
|
|
Pull more KVM fixes from Paolo Bonzini:
- fixes for CONFIG_KVM_COMPAT=n
- two updates to the IFU erratum
- selftests build fix
- brown paper bag fix
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: Add a comment describing the /dev/kvm no_compat handling
KVM: x86/mmu: Take slots_lock when using kvm_mmu_zap_all_fast()
KVM: Forbid /dev/kvm being opened by a compat task when CONFIG_KVM_COMPAT=n
KVM: X86: Reset the three MSR list number variables to 0 in kvm_init_msr_list()
selftests: kvm: fix build with glibc >= 2.30
kvm: x86: disable shattered huge page recovery for PREEMPT_RT.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fix from Ulf Hansson:
"Don't overwrite quirk flags in sdhci-of-at91 host driver"
* tag 'mmc-v5.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: sdhci-of-at91: fix quirk2 overwrite
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A few small last-minute fixes for USB-audio and HD-audio as well as
for PCM core:
- A race fix for PCM core between stopping and closing a stream
- USB-audio regressions in the recent descriptor validation code and
relevant changes
- A read of uninitialized value in USB-audio spotted by fuzzer
- A fix for USB-audio race at stopping a stream
- Intel HD-audio platform fixes"
* tag 'sound-5.4-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: usb-audio: Fix incorrect size check for processing/extension units
ALSA: usb-audio: Fix incorrect NULL check in create_yamaha_midi_quirk()
ALSA: pcm: Fix stream lock usage in snd_pcm_period_elapsed()
ALSA: usb-audio: not submit urb for stopped endpoint
ALSA: hda: hdmi - fix pin setup on Tigerlake
ALSA: hda: Add Cometlake-S PCI ID
ALSA: usb-audio: Fix missing error check at mixer resolution test
|
|
Pull drm fixes from Dave Airlie:
"Here is this weeks non-intel hw vuln fixes pull. Three drivers, all
small fixes.
i915:
- MOCS table fixes for EHL and TGL
- Update Display's rawclock on resume
- GVT's dmabuf reference drop fix
amdgpu:
- Fix a potential crash in firmware parsing
sun4i:
- One fix to the dotclock dividers range for sun4i"
* tag 'drm-fixes-2019-11-15' of git://anongit.freedesktop.org/drm/drm:
drm/amdgpu: fix null pointer deref in firmware header printing
drm/i915/tgl: MOCS table update
Revert "drm/i915/ehl: Update MOCS table for EHL"
drm/sun4i: tcon: Set min division of TCON0_DCLK to 1.
drm/i915: update rawclk also on resume
drm/i915/gvt: fix dropping obj reference twice
|
|
Pull misc vfs fixes from Al Viro:
"Assorted fixes all over the place; some of that is -stable fodder,
some regressions from the last window"
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
ecryptfs_lookup_interpose(): lower_dentry->d_parent is not stable either
ecryptfs_lookup_interpose(): lower_dentry->d_inode is not stable
ecryptfs: fix unlink and rmdir in face of underlying fs modifications
audit_get_nd(): don't unlock parent too early
exportfs_decode_fh(): negative pinned may become positive without the parent locked
cgroup: don't put ERR_PTR() into fc->root
autofs: fix a leak in autofs_expire_indirect()
aio: Fix io_pgetevents() struct __compat_aio_sigset layout
fs/namespace.c: fix use-after-free of mount in mnt_warn_timestamp_expiry()
|
|
In IOAPIC fixed delivery mode instead of flushing the scan
requests to all vCPUs, we should only send the requests to
vCPUs specified within the destination field.
This patch introduces kvm_get_dest_vcpus_mask() API which
retrieves an array of target vCPUs by using
kvm_apic_map_get_dest_lapic() and then based on the
vcpus_idx, it sets the bit in a bitmap. However, if the above
fails kvm_get_dest_vcpus_mask() finds the target vCPUs by
traversing all available vCPUs. Followed by setting the
bits in the bitmap.
If we had different vCPUs in the previous request for the
same redirection table entry then bits corresponding to
these vCPUs are also set. This to done to keep
ioapic_handled_vectors synchronized.
This bitmap is then eventually passed on to
kvm_make_vcpus_request_mask() to generate a masked request
only for the target vCPUs.
This would enable us to reduce the latency overhead on isolated
vCPUs caused by the IPI to process due to KVM_REQ_IOAPIC_SCAN.
Suggested-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Fetching an index for any vcpu in kvm->vcpus array by traversing
the entire array everytime is costly.
This patch remembers the position of each vcpu in kvm->vcpus array
by storing it in vcpus_idx under kvm_vcpu structure.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The L1 hypervisor may include the IA32_TIME_STAMP_COUNTER MSR in the
vmcs12 MSR VM-exit MSR-store area as a way of determining the highest
TSC value that might have been observed by L2 prior to VM-exit. The
current implementation does not capture a very tight bound on this
value. To tighten the bound, add the IA32_TIME_STAMP_COUNTER MSR to the
vmcs02 VM-exit MSR-store area whenever it appears in the vmcs12 VM-exit
MSR-store area. When L0 processes the vmcs12 VM-exit MSR-store area
during the emulation of an L2->L1 VM-exit, special-case the
IA32_TIME_STAMP_COUNTER MSR, using the value stored in the vmcs02
VM-exit MSR-store area to derive the value to be stored in the vmcs12
VM-exit MSR-store area.
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Rename function find_msr() to vmx_find_msr_index() in preparation for an
upcoming patch where we export it and use it in nested.c.
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Rename NR_AUTOLOAD_MSRS to NR_LOADSTORE_MSRS. This needs to be done
due to the addition of the MSR-autostore area that will be added in a
future patch. After that the name AUTOLOAD will no longer make sense.
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add the function read_and_check_msr_entry() which just pulls some code
out of nested_vmx_store_msr(). This will be useful as reusable code in
upcoming patches.
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Correct a small inaccuracy in the shattering of vmx.c, which becomes
visible now that pmu_intel.c includes nested.h.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The "load IA32_PERF_GLOBAL_CTRL" bit for VM-entry and VM-exit should
only be exposed to the guest if IA32_PERF_GLOBAL_CTRL is a valid MSR.
Create a new helper to allow pmu_refresh() to update the VM-Entry and
VM-Exit controls to ensure PMU values are initialized when performing
the is_valid_msr() check.
Suggested-by: Jim Mattson <jmattson@google.com>
Co-developed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add condition to prepare_vmcs02 which loads IA32_PERF_GLOBAL_CTRL on
VM-entry if the "load IA32_PERF_GLOBAL_CTRL" bit on the VM-entry control
is set. Use SET_MSR_OR_WARN() rather than directly writing to the field
to avoid overwrite by atomic_switch_perf_msrs().
Suggested-by: Jim Mattson <jmattson@google.com>
Co-developed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The existing implementation for loading the IA32_PERF_GLOBAL_CTRL MSR
on VM-exit was incorrect, as the next call to atomic_switch_perf_msrs()
could cause this value to be overwritten. Instead, call kvm_set_msr()
which will allow atomic_switch_perf_msrs() to correctly set the values.
Define a macro, SET_MSR_OR_WARN(), to set the MSR with kvm_set_msr()
and WARN on failure.
Suggested-by: Jim Mattson <jmattson@google.com>
Co-developed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add a consistency check on nested vm-entry for host's
IA32_PERF_GLOBAL_CTRL from vmcs12. Per Intel's SDM Vol 3 26.2.2:
If the "load IA32_PERF_GLOBAL_CTRL"
VM-exit control is 1, bits reserved in the IA32_PERF_GLOBAL_CTRL
MSR must be 0 in the field for that register"
Suggested-by: Jim Mattson <jmattson@google.com>
Co-developed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add condition to nested_vmx_check_guest_state() to check the validity of
GUEST_IA32_PERF_GLOBAL_CTRL. Per Intel's SDM Vol 3 26.3.1.1:
If the "load IA32_PERF_GLOBAL_CTRL" VM-entry control is 1, bits
reserved in the IA32_PERF_GLOBAL_CTRL MSR must be 0 in the field for that
register.
Suggested-by: Jim Mattson <jmattson@google.com>
Co-developed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Create a helper function to check the validity of a proposed value for
IA32_PERF_GLOBAL_CTRL from the existing check in intel_pmu_set_msr().
Per Intel's SDM, the reserved bits in IA32_PERF_GLOBAL_CTRL must be
cleared for the corresponding host/guest state fields.
Suggested-by: Jim Mattson <jmattson@google.com>
Co-developed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
On kvm_create_max_vcpus test remove unneeded local
variable in the loop that add vcpus to the VM.
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
it directly
When KVM emulates a nested VMEntry (L1->L2 VMEntry), it switches mmu root
page. If nEPT is used, this will happen from
kvm_init_shadow_ept_mmu()->__kvm_mmu_new_cr3() and otherwise it will
happpen from nested_vmx_load_cr3()->kvm_mmu_new_cr3(). Either case,
__kvm_mmu_new_cr3() will use fast_cr3_switch() in attempt to switch to a
previously cached root page.
In case fast_cr3_switch() finds a matching cached root page, it will
set it in mmu->root_hpa and request KVM_REQ_LOAD_CR3 such that on
next entry to guest, KVM will set root HPA in appropriate hardware
fields (e.g. vmcs->eptp). In addition, fast_cr3_switch() calls
kvm_x86_ops->tlb_flush() in order to flush TLB as MMU root page
was replaced.
This works as mmu->root_hpa, which vmx_flush_tlb() use, was
already replaced in cached_root_available(). However, this may
result in unnecessary INVEPT execution because a KVM_REQ_TLB_FLUSH
may have already been requested. For example, by prepare_vmcs02()
in case L1 don't use VPID.
Therefore, change fast_cr3_switch() to just request TLB flush on
next entry to guest.
Reviewed-by: Bhavesh Davda <bhavesh.davda@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Currently, a host perf_event is created for a vPMC functionality emulation.
It’s unpredictable to determine if a disabled perf_event will be reused.
If they are disabled and are not reused for a considerable period of time,
those obsolete perf_events would increase host context switch overhead that
could have been avoided.
If the guest doesn't WRMSR any of the vPMC's MSRs during an entire vcpu
sched time slice, and its independent enable bit of the vPMC isn't set,
we can predict that the guest has finished the use of this vPMC, and then
do request KVM_REQ_PMU in kvm_arch_sched_in and release those perf_events
in the first call of kvm_pmu_handle_event() after the vcpu is scheduled in.
This lazy mechanism delays the event release time to the beginning of the
next scheduled time slice if vPMC's MSRs aren't changed during this time
slice. If guest comes back to use this vPMC in next time slice, a new perf
event would be re-created via perf_event_create_kernel_counter() as usual.
Suggested-by: Wei Wang <wei.w.wang@intel.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The perf_event_create_kernel_counter() in the pmc_reprogram_counter() is
a heavyweight and high-frequency operation, especially when host disables
the watchdog (maximum 21000000 ns) which leads to an unacceptable latency
of the guest NMI handler. It limits the use of vPMUs in the guest.
When a vPMC is fully enabled, the legacy reprogram_*_counter() would stop
and release its existing perf_event (if any) every time EVEN in most cases
almost the same requested perf_event will be created and configured again.
For each vPMC, if the reuqested config ('u64 eventsel' for gp and 'u8 ctrl'
for fixed) is the same as its current config AND a new sample period based
on pmc->counter is accepted by host perf interface, the current event could
be reused safely as a new created one does. Otherwise, do release the
undesirable perf_event and reprogram a new one as usual.
It's light-weight to call pmc_pause_counter (disable, read and reset event)
and pmc_resume_counter (recalibrate period and re-enable event) as guest
expects instead of release-and-create again on any condition. Compared to
use the filterable event->attr or hw.config, a new 'u64 current_config'
field is added to save the last original programed config for each vPMC.
Based on this implementation, the number of calls to pmc_reprogram_counter
is reduced by ~82.5% for a gp sampling event and ~99.9% for a fixed event.
In the usage of multiplexing perf sampling mode, the average latency of the
guest NMI handler is reduced from 104923 ns to 48393 ns (~2.16x speed up).
If host disables watchdog, the minimum latecy of guest NMI handler could be
speed up at ~3413x (from 20407603 to 5979 ns) and at ~786x in the average.
Suggested-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Introduce a new callback msr_idx_to_pmc that returns a struct kvm_pmc*,
and change kvm_pmu_is_valid_msr to return ".msr_idx_to_pmc(vcpu, msr) ||
.is_valid_msr(vcpu, msr)" and AMD just returns false from .is_valid_msr.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|