Age | Commit message (Collapse) | Author |
|
Remove the additional space.
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Add TEST opcode to Group3-2 reg=001b as same as Group3-1 does.
Commit
12a78d43de76 ("x86/decoder: Add new TEST instruction pattern")
added a TEST opcode assignment to f6 XX/001/XXX (Group 3-1), but did
not add f7 XX/001/XXX (Group 3-2).
Actually, this TEST opcode variant (ModRM.reg /1) is not described in
the Intel SDM Vol2 but in AMD64 Architecture Programmer's Manual Vol.3,
Appendix A.2 Table A-6. ModRM.reg Extensions for the Primary Opcode Map.
Without this fix, Randy found a warning by insn_decoder_test related
to this issue as below.
HOSTCC arch/x86/tools/insn_decoder_test
HOSTCC arch/x86/tools/insn_sanity
TEST posttest
arch/x86/tools/insn_decoder_test: warning: Found an x86 instruction decoder bug, please report this.
arch/x86/tools/insn_decoder_test: warning: ffffffff81000bf1: f7 0b 00 01 08 00 testl $0x80100,(%rbx)
arch/x86/tools/insn_decoder_test: warning: objdump says 6 bytes, but insn_get_length() says 2
arch/x86/tools/insn_decoder_test: warning: Decoded and checked 11913894 instructions with 1 failures
TEST posttest
arch/x86/tools/insn_sanity: Success: decoded and checked 1000000 random instructions with 0 errors (seed:0x871ce29c)
To fix this error, add the TEST opcode according to AMD64 APM Vol.3.
[ bp: Massage commit message. ]
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lkml.kernel.org/r/157966631413.9580.10311036595431878351.stgit@devnote2
|
|
When seeding KALSR on a system where we have architecture level random
number generation make use of that entropy, mixing it in with the seed
passed by the bootloader. Since this is run very early in init before
feature detection is complete we open code rather than use archrandom.h.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Expose the ID_AA64ISAR0.RNDR field to userspace, as the RNG system
registers are always available at EL0.
Implement arch_get_random_seed_long using RNDR. Given that the
TRNG is likely to be a shared resource between cores, and VMs,
do not explicitly force re-seeding with RNDRRS. In order to avoid
code complexity and potential issues with hetrogenous systems only
provide values after cpufeature has finalized the system capabilities.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
[Modified to only function after cpufeature has finalized the system
capabilities and move all the code into the header -- broonie]
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
[will: Advertise HWCAP via /proc/cpuinfo]
Signed-off-by: Will Deacon <will@kernel.org>
|
|
A load on an ESB page returning all 1's means that the underlying
device has invalidated the access to the PQ state of the interrupt
through mmio. It may happen, for example when querying a PHB interrupt
while the PHB is in an error state.
In that case, we should consider the interrupt to be invalid when
checking its state in the irq_get_irqchip_state() handler.
Fixes: da15c03b047d ("powerpc/xive: Implement get_irqchip_state method for XIVE to fix shutdown race")
Cc: stable@vger.kernel.org # v5.4+
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
[clg: wrote a commit log, introduced XIVE_ESB_INVALID ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200113130118.27969-1-clg@kaod.org
|
|
Let PPC_UV depend only on DEVICE_PRIVATE which in turn
will satisfy all the other required dependencies
Fixes: 013a53f2d25a ("powerpc: Ultravisor: Add PPC_UV config option")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200109092047.24043-1-bharata@linux.ibm.com
|
|
When the ARM accelerated ChaCha driver is built as part of a configuration
that has kernel mode NEON disabled, we expect the compiler to propagate
the build time constant expression IS_ENABLED(CONFIG_KERNEL_MODE_NEON) in
a way that eliminates all the cross-object references to the actual NEON
routines, which allows the chacha-neon-core.o object to be omitted from
the build entirely.
Unfortunately, this fails to work as expected in some cases, and we may
end up with a build error such as
chacha-glue.c:(.text+0xc0): undefined reference to `chacha_4block_xor_neon'
caused by the fact that chacha_doneon() has not been eliminated from the
object code, even though it will never be called in practice.
Let's fix this by adding some IS_ENABLED(CONFIG_KERNEL_MODE_NEON) tests
that are not strictly needed from a logical point of view, but should
help the compiler infer that the NEON code paths are unreachable in
those cases.
Fixes: b36d8c09e710c71f ("crypto: arm/chacha - remove dependency on generic ...")
Reported-by: Russell King <linux@armlinux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
The emit code does optional base conversion itself in assembly, so we
don't need to do that here. Also, neither one of these functions uses
simd instructions, so checking for that doesn't make sense either.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Admist the kbuild robot induced changes, the .gitignore file for the
generated file wasn't updated with the non-clashing filename. This
commit adjusts that.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
In order to avoid CFI function prototype mismatches, this removes the
casts on assembly implementations of sha1/256/512 accelerators. The
safety checks from BUILD_BUG_ON() remain.
Additionally, this renames various arguments for clarity, as suggested
by Eric Biggers.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Before:
1f299fad1e31: ("efi/x86: Limit EFI old memory map to SGI UV machines")
enabling the old EFI memory map on mixed mode systems
disabled EFI runtime services altogether.
Given that efi=old_map is a debug feature designed to work around
firmware problems related to EFI runtime services, and disabling
them can be achieved more straightforwardly using 'noefi' or
'efi=noruntime', it makes more sense to ignore efi=old_map on
mixed mode systems.
Currently, we do neither, and try to use the old memory map in
combination with mixed mode routines, which results in crashes,
so let's fix this by making efi=old_map functional on native
systems only.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
arm/dt
ARM: dts: zynq: DT changes for v5.6 v2
- Enable coresight topology for Zynq
* tag 'zynq-dt-for-v5.6-v2' of https://github.com/Xilinx/linux-xlnx:
ARM: dts: zynq: enablement of coresight topology
Link: https://lore.kernel.org/r/5db334df-89b5-1d07-3884-93f77b0f4e60@monstr.eu
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
arm/soc
ARM: Xilinx Zynq SoC patches for v5.6
- Fix cpuid handling logic in platform SMP startup code
* tag 'zynq-soc-for-v5.6' of https://github.com/Xilinx/linux-xlnx:
ARM: zynq: use physical cpuid in zynq_slcr_cpu_stop/start
Link: https://lore.kernel.org/r/50dec3cf-5f80-69be-c3d1-cc14b9bce5ff@monstr.eu
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
arm/dt
arm64: dts: zynqmp: DT changes for v5.6
- Switch from fixed to firmware based clock driver
- Wire power domain driver
- Wire all ina226 chips through IIO and IIO hwmon drivers
- Add missing dr_mode property to usb nodes
- Use gpio-line-names property instead of comments
- Use clock-output-names for si570 differentiation
- Minor DT fixes
* tag 'zynqmp-dt-for-v5.6' of https://github.com/Xilinx/linux-xlnx: (21 commits)
arm64: zynqmp: Add label property to all ina226 on zcu106
arm64: zynqmp: Enable iio-hwmon for ina226 on zcu106
arm64: zynqmp: Add label property to all ina226 on zcu102
arm64: zynqmp: Enable iio-hwmon for ina226 on zcu102
arm64: zynqmp: Add label property to all ina226 on zcu111
arm64: zynqmp: Enable iio-hwmon for ina226 on zcu111
arm64: zynqmp: Enable iio-hwmon for ina226 on zcu100
arm64: zynqmp: Setup default number of chipselects for zcu100
arm64: zynqmp: Remove broken-cd from zcu100-revC
arm64: zynqmp: Fix the si570 clock frequency on zcu111
arm64: zynqmp: Setup clock-output-names for si570 chips
arm64: zynqmp: Turn comment to gpio-line-names
arm64: zynqmp: Fix address for tca6416_u97 chip on zcu104
arm64: zynqmp: Remove addition number in node name
arm64: zynqmp: Use ethernet-phy as node name for ethernet phys
arm64: dts: xilinx: Add the power nodes for zynqmp
arm64: dts: xilinx: Remove dtsi for fixed clock
arm64: dts: xilinx: Add the clock nodes for zynqmp
arm64: zynqmp: Add dr_mode property to usb node
arm64: dts: zynqmp: Use decimal values for drm-clock properties
...
Link: https://lore.kernel.org/r/c70d2efa-9ee2-a764-5248-0e5bfbf29f8a@monstr.eu
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
Support for Samsung S3C64XX systems depends on ARCH_MULTI_V6, and thus
on ARCH_MULTIPLATFORM.
As the latter selects TIMER_OF, there is no need for MACH_S3C64XX_DT to
select TIMER_OF.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
|
|
Support for Samsung Exynos SoCs depends on ARCH_MULTI_V7, which selects
ARCH_MULTI_V6_V7.
As the latter selects MIGHT_HAVE_CACHE_L2X0, there is no need for
ARCH_EXYNOS4 to select MIGHT_HAVE_CACHE_L2X0.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
|
|
Stop using the legacy PWM API which only still exists because there are
some users left.
Note this change make use of the fact that the value of struct
pwm_state::duty_cycle doesn't matter for a disabled PWM and so its value
can stay constant simplifying the code a bit.
A side effect of the conversion is that the pwm isn't stopped in
rx1950_backlight_init() by the call to pwm_apply_args() just before
reenabling it when rx1950_lcd_power(1) is called.
Signed-off-by: Uwe Kleine-König <uwe@kleine-koenig.org>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
|
|
We need the USB fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Since v4.3-rc1 commit 0723c05fb75e44 ("arm64: enable more compressed
Image formats"), it is possible to build Image.{bz2,lz4,lzma,lzo}
AArch64 images. However, the commit missed adding support for removing
those images on 'make ARCH=arm64 (dist)clean'.
Fix this by adding them to the target list.
Make sure to match the order of the recipes in the makefile.
Cc: stable@vger.kernel.org # v4.3+
Fixes: 0723c05fb75e44 ("arm64: enable more compressed Image formats")
Signed-off-by: Dirk Behme <dirk.behme@de.bosch.com>
Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com>
Reviewed-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Remove the bogus 64-bit only condition from the check that disables MMIO
spte optimization when the system supports the max PA, i.e. doesn't have
any reserved PA bits. 32-bit KVM always uses PAE paging for the shadow
MMU, and per Intel's SDM:
PAE paging translates 32-bit linear addresses to 52-bit physical
addresses.
The kernel's restrictions on max physical addresses are limits on how
much memory the kernel can reasonably use, not what physical addresses
are supported by hardware.
Fixes: ce88decffd17 ("KVM: MMU: mmio page fault support")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
In case writing to vmread destination operand result in a #PF, vmread
should not call nested_vmx_succeed() to set rflags to specify success.
Similar to as done in VMPTRST (See handle_vmptrst()).
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: stable@vger.kernel.org
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Rework the handling of nEPT's bad memtype/XWR checks to micro-optimize
the checks as much as possible. Move the check to a separate helper,
__is_bad_mt_xwr(), which allows the guest_rsvd_check usage in
paging_tmpl.h to omit the check entirely for paging32/64 (bad_mt_xwr is
always zero for non-nEPT) while retaining the bitwise-OR of the current
code for the shadow_zero_check in walk_shadow_page_get_mmio_spte().
Add a comment for the bitwise-OR usage in the mmio spte walk to avoid
future attempts to "fix" the code, which is what prompted this
optimization in the first place[*].
Opportunistically remove the superfluous '!= 0' and parantheses, and
use BIT_ULL() instead of open coding its equivalent.
The net effect is that code generation is largely unchanged for
walk_shadow_page_get_mmio_spte(), marginally better for
ept_prefetch_invalid_gpte(), and significantly improved for
paging32/64_prefetch_invalid_gpte().
Note, walk_shadow_page_get_mmio_spte() can't use a templated version of
the memtype/XRW as it works on the host's shadow PTEs, e.g. checks that
KVM hasn't borked its EPT tables. Even if it could be templated, the
benefits of having a single implementation far outweight the few uops
that would be saved for NPT or non-TDP paging, e.g. most compilers
inline it all the way to up kvm_mmu_page_fault().
[*] https://lkml.kernel.org/r/20200108001859.25254-1-sean.j.christopherson@intel.com
Cc: Jim Mattson <jmattson@google.com>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Move the !PRESENT and !ACCESSED checks in FNAME(prefetch_invalid_gpte)
above the call to is_rsvd_bits_set(). For a well behaved guest, the
!PRESENT and !ACCESSED are far more likely to evaluate true than the
reserved bit checks, and they do not require additional memory accesses.
Before:
Dump of assembler code for function paging32_prefetch_invalid_gpte:
0x0000000000044240 <+0>: callq 0x44245 <paging32_prefetch_invalid_gpte+5>
0x0000000000044245 <+5>: mov %rcx,%rax
0x0000000000044248 <+8>: shr $0x7,%rax
0x000000000004424c <+12>: and $0x1,%eax
0x000000000004424f <+15>: lea 0x0(,%rax,4),%r8
0x0000000000044257 <+23>: add %r8,%rax
0x000000000004425a <+26>: mov %rcx,%r8
0x000000000004425d <+29>: and 0x120(%rsi,%rax,8),%r8
0x0000000000044265 <+37>: mov 0x170(%rsi),%rax
0x000000000004426c <+44>: shr %cl,%rax
0x000000000004426f <+47>: and $0x1,%eax
0x0000000000044272 <+50>: or %rax,%r8
0x0000000000044275 <+53>: jne 0x4427c <paging32_prefetch_invalid_gpte+60>
0x0000000000044277 <+55>: test $0x1,%cl
0x000000000004427a <+58>: jne 0x4428a <paging32_prefetch_invalid_gpte+74>
0x000000000004427c <+60>: mov %rdx,%rsi
0x000000000004427f <+63>: callq 0x44080 <drop_spte>
0x0000000000044284 <+68>: mov $0x1,%eax
0x0000000000044289 <+73>: retq
0x000000000004428a <+74>: xor %eax,%eax
0x000000000004428c <+76>: and $0x20,%ecx
0x000000000004428f <+79>: jne 0x44289 <paging32_prefetch_invalid_gpte+73>
0x0000000000044291 <+81>: mov %rdx,%rsi
0x0000000000044294 <+84>: callq 0x44080 <drop_spte>
0x0000000000044299 <+89>: mov $0x1,%eax
0x000000000004429e <+94>: jmp 0x44289 <paging32_prefetch_invalid_gpte+73>
End of assembler dump.
After:
Dump of assembler code for function paging32_prefetch_invalid_gpte:
0x0000000000044240 <+0>: callq 0x44245 <paging32_prefetch_invalid_gpte+5>
0x0000000000044245 <+5>: test $0x1,%cl
0x0000000000044248 <+8>: je 0x4424f <paging32_prefetch_invalid_gpte+15>
0x000000000004424a <+10>: test $0x20,%cl
0x000000000004424d <+13>: jne 0x4425d <paging32_prefetch_invalid_gpte+29>
0x000000000004424f <+15>: mov %rdx,%rsi
0x0000000000044252 <+18>: callq 0x44080 <drop_spte>
0x0000000000044257 <+23>: mov $0x1,%eax
0x000000000004425c <+28>: retq
0x000000000004425d <+29>: mov %rcx,%rax
0x0000000000044260 <+32>: mov (%rsi),%rsi
0x0000000000044263 <+35>: shr $0x7,%rax
0x0000000000044267 <+39>: and $0x1,%eax
0x000000000004426a <+42>: lea 0x0(,%rax,4),%r8
0x0000000000044272 <+50>: add %r8,%rax
0x0000000000044275 <+53>: mov %rcx,%r8
0x0000000000044278 <+56>: and 0x120(%rsi,%rax,8),%r8
0x0000000000044280 <+64>: mov 0x170(%rsi),%rax
0x0000000000044287 <+71>: shr %cl,%rax
0x000000000004428a <+74>: and $0x1,%eax
0x000000000004428d <+77>: mov %rax,%rcx
0x0000000000044290 <+80>: xor %eax,%eax
0x0000000000044292 <+82>: or %rcx,%r8
0x0000000000044295 <+85>: je 0x4425c <paging32_prefetch_invalid_gpte+28>
0x0000000000044297 <+87>: mov %rdx,%rsi
0x000000000004429a <+90>: callq 0x44080 <drop_spte>
0x000000000004429f <+95>: mov $0x1,%eax
0x00000000000442a4 <+100>: jmp 0x4425c <paging32_prefetch_invalid_gpte+28>
End of assembler dump.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The KVM MMIO support uses bit 51 as the reserved bit to cause nested page
faults when a guest performs MMIO. The AMD memory encryption support uses
a CPUID function to define the encryption bit position. Given this, it is
possible that these bits can conflict.
Use svm_hardware_setup() to override the MMIO mask if memory encryption
support is enabled. Various checks are performed to ensure that the mask
is properly defined and rsvd_bits() is used to generate the new mask (as
was done prior to the change that necessitated this patch).
Fixes: 28a1f3ac1d0c ("kvm: x86: Set highest physical address bits in non-present/reserved SPTEs")
Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The function nested_vmx_prepare_msr_bitmap() declaration is below its
implementation. So this is meaningless and should be removed.
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Rename bit() to __feature_bit() to give it a more descriptive name, and
add a macro, feature_bit(), to stuff the X68_FEATURE_ prefix to keep
line lengths manageable for code that hardcodes the bit to be retrieved.
No functional change intended.
Cc: Jim Mattson <jmattson@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add build-time checks to ensure KVM isn't trying to do a reverse CPUID
lookup on Linux-defined feature bits, along with comments to explain
the gory details of X86_FEATUREs and bit().
No functional change intended.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add an entry for CPUID_7_1_EAX in the reserve_cpuid array in preparation
for incorporating the array in bit() build-time assertions, specifically
to avoid an assertion on F(AVX512_BF16) in do_cpuid_7_mask().
No functional change intended.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Move bit() to cpuid.h in preparation for incorporating the reverse_cpuid
array in bit() build-time assertions. Opportunistically use the BIT()
macro instead of open-coding the shift.
No functional change intended.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add feature-specific helpers for querying guest CPUID support from the
emulator instead of having the emulator do a full CPUID and perform its
own bit tests. The primary motivation is to eliminate the emulator's
usage of bit() so that future patches can add more extensive build-time
assertions on the usage of bit() without having to expose yet more code
to the emulator.
Note, providing a generic guest_cpuid_has() to the emulator doesn't work
due to the existing built-time assertions in guest_cpuid_has(), which
require the feature being checked to be a compile-time constant.
No functional change intended.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add a helper macro to generate the set of reserved cr4 bits for both
host and guest to ensure that adding a check on guest capabilities is
also added for host capabilities, and vice versa.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Now that KVM prevents setting host-reserved CR4 bits, drop the dedicated
XSAVE check in guest_cpuid_has() in favor of open coding similar checks
in the SVM/VMX XSAVES enabling flows.
Note, checking boot_cpu_has(X86_FEATURE_XSAVE) in the XSAVES flows is
technically redundant with respect to the CR4 reserved bit checks, e.g.
XSAVES #UDs if CR4.OSXSAVE=0 and arch.xsaves_enabled is consumed if and
only if CR4.OXSAVE=1 in guest. Keep (add?) the explicit boot_cpu_has()
checks to help document KVM's usage of arch.xsaves_enabled.
No functional change intended.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Check the current CPU's reserved cr4 bits against the mask calculated
for the boot CPU to ensure consistent behavior across all CPUs.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Calculate the host-reserved cr4 bits at runtime based on the system's
capabilities (using logic similar to __do_cpuid_func()), and use the
dynamically generated mask for the reserved bit check in kvm_set_cr4()
instead using of the static CR4_RESERVED_BITS define. This prevents
userspace from "enabling" features in cr4 that are not supported by the
system, e.g. by ignoring KVM_GET_SUPPORTED_CPUID and specifying a bogus
CPUID for the vCPU.
Allowing userspace to set unsupported bits in cr4 can lead to a variety
of undesirable behavior, e.g. failed VM-Enter, and in general increases
KVM's attack surface. A crafty userspace can even abuse CR4.LA57 to
induce an unchecked #GP on a WRMSR.
On a platform without LA57 support:
KVM_SET_CPUID2 // CPUID_7_0_ECX.LA57 = 1
KVM_SET_SREGS // CR4.LA57 = 1
KVM_SET_MSRS // KERNEL_GS_BASE = 0x0004000000000000
KVM_RUN
leads to a #GP when writing KERNEL_GS_BASE into hardware:
unchecked MSR access error: WRMSR to 0xc0000102 (tried to write 0x0004000000000000)
at rIP: 0xffffffffa00f239a (vmx_prepare_switch_to_guest+0x10a/0x1d0 [kvm_intel])
Call Trace:
kvm_arch_vcpu_ioctl_run+0x671/0x1c70 [kvm]
kvm_vcpu_ioctl+0x36b/0x5d0 [kvm]
do_vfs_ioctl+0xa1/0x620
ksys_ioctl+0x66/0x70
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x4c/0x170
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fc08133bf47
Note, the above sequence fails VM-Enter due to invalid guest state.
Userspace can allow VM-Enter to succeed (after the WRMSR #GP) by adding
a KVM_SET_SREGS w/ CR4.LA57=0 after KVM_SET_MSRS, in which case KVM will
technically leak the host's KERNEL_GS_BASE into the guest. But, as
KERNEL_GS_BASE is a userspace-defined value/address, the leak is largely
benign as a malicious userspace would simply be exposing its own data to
the guest, and attacking a benevolent userspace would require multiple
bugs in the userspace VMM.
Cc: stable@vger.kernel.org
Cc: Jun Nakajima <jun.nakajima@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add a helper to consolidate the common checks for writing PT MSRs,
and opportunistically clean up the formatting of the affected code.
No functional change intended.
Cc: Chao Peng <chao.p.peng@linux.intel.com>
Cc: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Reject writes to RTIT address MSRs if the data being written is a
non-canonical address as the MSRs are subject to canonical checks, e.g.
KVM will trigger an unchecked #GP when loading the values to hardware
during pt_guest_enter().
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Fix some writing mistakes in the comments.
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Fix some typos in vcpu unimpl info. It should be unhandled rather than
uhandled.
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Fix some grammar mistakes in the comments.
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Fix some typos and add missing parentheses in the comments.
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Since commit b1346ab2afbe ("KVM: nVMX: Rename prepare_vmcs02_*_full to
prepare_vmcs02_*_rare"), prepare_vmcs02_full has been renamed to
prepare_vmcs02_rare.
nested_vmx_merge_msr_bitmap is renamed to nested_vmx_prepare_msr_bitmap
since commit c992384bde84 ("KVM: vmx: speed up MSR bitmap merge").
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Fix some wrong function names in comment. mmu_check_roots is a typo for
mmu_check_root, vmcs_read_any should be vmcs12_read_any and so on.
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
check kvm_pit outside kvm_vm_ioctl_reinject() to keep codestyle consistent
with other kvm_pit func and prepare for futher cleanups.
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
This patch optimizes redundancy logic before fixed mode ipi is delivered
in the fast path, broadcast handling needs to go slow path, so the delivery
mode repair can be delayed to before slow path.
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
ICR and TSCDEADLINE MSRs write cause the main MSRs write vmexits in our
product observation, multicast IPIs are not as common as unicast IPI like
RESCHEDULE_VECTOR and CALL_FUNCTION_SINGLE_VECTOR etc.
This patch introduce a mechanism to handle certain performance-critical
WRMSRs in a very early stage of KVM VMExit handler.
This mechanism is specifically used for accelerating writes to x2APIC ICR
that attempt to send a virtual IPI with physical destination-mode, fixed
delivery-mode and single target. Which was found as one of the main causes
of VMExits for Linux workloads.
The reason this mechanism significantly reduce the latency of such virtual
IPIs is by sending the physical IPI to the target vCPU in a very early stage
of KVM VMExit handler, before host interrupts are enabled and before expensive
operations such as reacquiring KVM’s SRCU lock.
Latency is reduced even more when KVM is able to use APICv posted-interrupt
mechanism (which allows to deliver the virtual IPI directly to target vCPU
without the need to kick it to host).
Testing on Xeon Skylake server:
The virtual IPI latency from sender send to receiver receive reduces
more than 200+ cpu cycles.
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
As discussed in the strace issue tracker, it appears that the sparc32
sysvipc support has been broken for the past 11 years. It was however
working in compat mode, which is how it must have escaped most of the
regular testing.
The problem is that a cleanup patch inadvertently changed the uid/gid
fields in struct ipc64_perm from 32-bit types to 16-bit types in uapi
headers.
Both glibc and uclibc-ng still use the original types, so they should
work fine with compat mode, but not natively. Change the definitions
to use __kernel_uid32_t and __kernel_gid32_t again.
Fixes: 83c86984bff2 ("sparc: unify ipcbuf.h")
Link: https://github.com/strace/strace/issues/116
Cc: <stable@vger.kernel.org> # v2.6.29
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: "Dmitry V . Levin" <ldv@altlinux.org>
Cc: Rich Felker <dalias@libc.org>
Cc: libc-alpha@sourceware.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
These extra fields before the @ are not handled in of_node_name_eq,
making commit b3e46d1a0590500335f0b95e669ad6d84b12b03a break node name
comparisons for ambapp path components, thereby making LEON systems
unable to boot.
As there is no need for the tacked on vendor and device ID fields in the
path component, resolve this situation by removing them.
Signed-off-by: Andreas Larsson <andreas@gaisler.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
kernel_ventry will create alternative entries to potentially replace
0 instructions with 0 instructions for EL1 vectors. While this does not
cause an issue, it pointlessly takes up some bytes in the alternatives
section.
Do not generate such entries.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Julien Thierry <jthierry@redhat.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
arm64 provides always working implementation of futex_atomic_cmpxchg_inatomic(),
so there is no need to check it runtime.
Reported-by: Piyush swami <Piyush.swami@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
It also lacks a whitespace (copy'n'paste error?) and also messes up the
output:
SYSHDR arch/mips/include/generated/uapi/asm/unistd_n32.h
SYSHDR arch/mips/include/generated/uapi/asm/unistd_n64.h
SYSHDR arch/mips/include/generated/uapi/asm/unistd_o32.h
SYSNR arch/mips/include/generated/uapi/asm/unistd_nr_n32.h
SYSNR arch/mips/include/generated/uapi/asm/unistd_nr_n64.h
SYSNR arch/mips/include/generated/uapi/asm/unistd_nr_o32.h
WRAP arch/mips/include/generated/uapi/asm/bpf_perf_event.h
WRAP arch/mips/include/generated/uapi/asm/ipcbuf.h
After:
SYSHDR arch/mips/include/generated/uapi/asm/unistd_n32.h
SYSHDR arch/mips/include/generated/uapi/asm/unistd_n64.h
SYSHDR arch/mips/include/generated/uapi/asm/unistd_o32.h
SYSNR arch/mips/include/generated/uapi/asm/unistd_nr_n32.h
SYSNR arch/mips/include/generated/uapi/asm/unistd_nr_n64.h
SYSNR arch/mips/include/generated/uapi/asm/unistd_nr_o32.h
WRAP arch/mips/include/generated/uapi/asm/bpf_perf_event.h
WRAP arch/mips/include/generated/uapi/asm/ipcbuf.h
Present since day 0 of syscall table generation introduction for MIPS.
Fixes: 9bcbf97c6293 ("mips: add system call table generation support")
Cc: <stable@vger.kernel.org> # v5.0+
Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
Signed-off-by: Paul Burton <paulburton@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Rob Herring <robh@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
|