summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2020-01-22arm64: kconfig: Fix alignment of E0PD help textWill Deacon
Remove the additional space. Signed-off-by: Will Deacon <will@kernel.org>
2020-01-22x86/decoder: Add TEST opcode to Group3-2Masami Hiramatsu
Add TEST opcode to Group3-2 reg=001b as same as Group3-1 does. Commit 12a78d43de76 ("x86/decoder: Add new TEST instruction pattern") added a TEST opcode assignment to f6 XX/001/XXX (Group 3-1), but did not add f7 XX/001/XXX (Group 3-2). Actually, this TEST opcode variant (ModRM.reg /1) is not described in the Intel SDM Vol2 but in AMD64 Architecture Programmer's Manual Vol.3, Appendix A.2 Table A-6. ModRM.reg Extensions for the Primary Opcode Map. Without this fix, Randy found a warning by insn_decoder_test related to this issue as below. HOSTCC arch/x86/tools/insn_decoder_test HOSTCC arch/x86/tools/insn_sanity TEST posttest arch/x86/tools/insn_decoder_test: warning: Found an x86 instruction decoder bug, please report this. arch/x86/tools/insn_decoder_test: warning: ffffffff81000bf1: f7 0b 00 01 08 00 testl $0x80100,(%rbx) arch/x86/tools/insn_decoder_test: warning: objdump says 6 bytes, but insn_get_length() says 2 arch/x86/tools/insn_decoder_test: warning: Decoded and checked 11913894 instructions with 1 failures TEST posttest arch/x86/tools/insn_sanity: Success: decoded and checked 1000000 random instructions with 0 errors (seed:0x871ce29c) To fix this error, add the TEST opcode according to AMD64 APM Vol.3. [ bp: Massage commit message. ] Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lkml.kernel.org/r/157966631413.9580.10311036595431878351.stgit@devnote2
2020-01-22arm64: Use v8.5-RNG entropy for KASLR seedMark Brown
When seeding KALSR on a system where we have architecture level random number generation make use of that entropy, mixing it in with the seed passed by the bootloader. Since this is run very early in init before feature detection is complete we open code rather than use archrandom.h. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-01-22arm64: Implement archrandom.h for ARMv8.5-RNGRichard Henderson
Expose the ID_AA64ISAR0.RNDR field to userspace, as the RNG system registers are always available at EL0. Implement arch_get_random_seed_long using RNDR. Given that the TRNG is likely to be a shared resource between cores, and VMs, do not explicitly force re-seeding with RNDRRS. In order to avoid code complexity and potential issues with hetrogenous systems only provide values after cpufeature has finalized the system capabilities. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> [Modified to only function after cpufeature has finalized the system capabilities and move all the code into the header -- broonie] Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> [will: Advertise HWCAP via /proc/cpuinfo] Signed-off-by: Will Deacon <will@kernel.org>
2020-01-22powerpc/xive: Discard ESB load value when interrupt is invalidFrederic Barrat
A load on an ESB page returning all 1's means that the underlying device has invalidated the access to the PQ state of the interrupt through mmio. It may happen, for example when querying a PHB interrupt while the PHB is in an error state. In that case, we should consider the interrupt to be invalid when checking its state in the irq_get_irqchip_state() handler. Fixes: da15c03b047d ("powerpc/xive: Implement get_irqchip_state method for XIVE to fix shutdown race") Cc: stable@vger.kernel.org # v5.4+ Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> [clg: wrote a commit log, introduced XIVE_ESB_INVALID ] Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200113130118.27969-1-clg@kaod.org
2020-01-22powerpc: Ultravisor: Fix the dependencies for CONFIG_PPC_UVBharata B Rao
Let PPC_UV depend only on DEVICE_PRIVATE which in turn will satisfy all the other required dependencies Fixes: 013a53f2d25a ("powerpc: Ultravisor: Add PPC_UV config option") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Bharata B Rao <bharata@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200109092047.24043-1-bharata@linux.ibm.com
2020-01-22crypto: arm/chacha - fix build failured when kernel mode NEON is disabledArd Biesheuvel
When the ARM accelerated ChaCha driver is built as part of a configuration that has kernel mode NEON disabled, we expect the compiler to propagate the build time constant expression IS_ENABLED(CONFIG_KERNEL_MODE_NEON) in a way that eliminates all the cross-object references to the actual NEON routines, which allows the chacha-neon-core.o object to be omitted from the build entirely. Unfortunately, this fails to work as expected in some cases, and we may end up with a build error such as chacha-glue.c:(.text+0xc0): undefined reference to `chacha_4block_xor_neon' caused by the fact that chacha_doneon() has not been eliminated from the object code, even though it will never be called in practice. Let's fix this by adding some IS_ENABLED(CONFIG_KERNEL_MODE_NEON) tests that are not strictly needed from a logical point of view, but should help the compiler infer that the NEON code paths are unreachable in those cases. Fixes: b36d8c09e710c71f ("crypto: arm/chacha - remove dependency on generic ...") Reported-by: Russell King <linux@armlinux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-22crypto: x86/poly1305 - emit does base conversion itselfJason A. Donenfeld
The emit code does optional base conversion itself in assembly, so we don't need to do that here. Also, neither one of these functions uses simd instructions, so checking for that doesn't make sense either. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-22crypto: x86/poly1305 - fix .gitignore typoJason A. Donenfeld
Admist the kbuild robot induced changes, the .gitignore file for the generated file wasn't updated with the non-clashing filename. This commit adjusts that. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-22crypto: x86/sha - Eliminate casts on asm implementationsKees Cook
In order to avoid CFI function prototype mismatches, this removes the casts on assembly implementations of sha1/256/512 accelerators. The safety checks from BUILD_BUG_ON() remain. Additionally, this renames various arguments for clarity, as suggested by Eric Biggers. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-22efi/x86: Disallow efi=old_map in mixed modeArd Biesheuvel
Before: 1f299fad1e31: ("efi/x86: Limit EFI old memory map to SGI UV machines") enabling the old EFI memory map on mixed mode systems disabled EFI runtime services altogether. Given that efi=old_map is a debug feature designed to work around firmware problems related to EFI runtime services, and disabling them can be achieved more straightforwardly using 'noefi' or 'efi=noruntime', it makes more sense to ignore efi=old_map on mixed mode systems. Currently, we do neither, and try to use the old memory map in combination with mixed mode routines, which results in crashes, so let's fix this by making efi=old_map functional on native systems only. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-01-21Merge tag 'zynq-dt-for-v5.6-v2' of https://github.com/Xilinx/linux-xlnx into ↵Olof Johansson
arm/dt ARM: dts: zynq: DT changes for v5.6 v2 - Enable coresight topology for Zynq * tag 'zynq-dt-for-v5.6-v2' of https://github.com/Xilinx/linux-xlnx: ARM: dts: zynq: enablement of coresight topology Link: https://lore.kernel.org/r/5db334df-89b5-1d07-3884-93f77b0f4e60@monstr.eu Signed-off-by: Olof Johansson <olof@lixom.net>
2020-01-21Merge tag 'zynq-soc-for-v5.6' of https://github.com/Xilinx/linux-xlnx into ↵Olof Johansson
arm/soc ARM: Xilinx Zynq SoC patches for v5.6 - Fix cpuid handling logic in platform SMP startup code * tag 'zynq-soc-for-v5.6' of https://github.com/Xilinx/linux-xlnx: ARM: zynq: use physical cpuid in zynq_slcr_cpu_stop/start Link: https://lore.kernel.org/r/50dec3cf-5f80-69be-c3d1-cc14b9bce5ff@monstr.eu Signed-off-by: Olof Johansson <olof@lixom.net>
2020-01-21Merge tag 'zynqmp-dt-for-v5.6' of https://github.com/Xilinx/linux-xlnx into ↵Olof Johansson
arm/dt arm64: dts: zynqmp: DT changes for v5.6 - Switch from fixed to firmware based clock driver - Wire power domain driver - Wire all ina226 chips through IIO and IIO hwmon drivers - Add missing dr_mode property to usb nodes - Use gpio-line-names property instead of comments - Use clock-output-names for si570 differentiation - Minor DT fixes * tag 'zynqmp-dt-for-v5.6' of https://github.com/Xilinx/linux-xlnx: (21 commits) arm64: zynqmp: Add label property to all ina226 on zcu106 arm64: zynqmp: Enable iio-hwmon for ina226 on zcu106 arm64: zynqmp: Add label property to all ina226 on zcu102 arm64: zynqmp: Enable iio-hwmon for ina226 on zcu102 arm64: zynqmp: Add label property to all ina226 on zcu111 arm64: zynqmp: Enable iio-hwmon for ina226 on zcu111 arm64: zynqmp: Enable iio-hwmon for ina226 on zcu100 arm64: zynqmp: Setup default number of chipselects for zcu100 arm64: zynqmp: Remove broken-cd from zcu100-revC arm64: zynqmp: Fix the si570 clock frequency on zcu111 arm64: zynqmp: Setup clock-output-names for si570 chips arm64: zynqmp: Turn comment to gpio-line-names arm64: zynqmp: Fix address for tca6416_u97 chip on zcu104 arm64: zynqmp: Remove addition number in node name arm64: zynqmp: Use ethernet-phy as node name for ethernet phys arm64: dts: xilinx: Add the power nodes for zynqmp arm64: dts: xilinx: Remove dtsi for fixed clock arm64: dts: xilinx: Add the clock nodes for zynqmp arm64: zynqmp: Add dr_mode property to usb node arm64: dts: zynqmp: Use decimal values for drm-clock properties ... Link: https://lore.kernel.org/r/c70d2efa-9ee2-a764-5248-0e5bfbf29f8a@monstr.eu Signed-off-by: Olof Johansson <olof@lixom.net>
2020-01-21ARM: s3c64xx: Drop unneeded select of TIMER_OFGeert Uytterhoeven
Support for Samsung S3C64XX systems depends on ARCH_MULTI_V6, and thus on ARCH_MULTIPLATFORM. As the latter selects TIMER_OF, there is no need for MACH_S3C64XX_DT to select TIMER_OF. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
2020-01-21ARM: exynos: Drop unneeded select of MIGHT_HAVE_CACHE_L2X0Geert Uytterhoeven
Support for Samsung Exynos SoCs depends on ARCH_MULTI_V7, which selects ARCH_MULTI_V6_V7. As the latter selects MIGHT_HAVE_CACHE_L2X0, there is no need for ARCH_EXYNOS4 to select MIGHT_HAVE_CACHE_L2X0. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
2020-01-21ARM: s3c24xx: Switch to atomic pwm API in rx1950Uwe Kleine-König
Stop using the legacy PWM API which only still exists because there are some users left. Note this change make use of the fact that the value of struct pwm_state::duty_cycle doesn't matter for a disabled PWM and so its value can stay constant simplifying the code a bit. A side effect of the conversion is that the pwm isn't stopped in rx1950_backlight_init() by the call to pwm_apply_args() just before reenabling it when rx1950_lcd_power(1) is called. Signed-off-by: Uwe Kleine-König <uwe@kleine-koenig.org> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
2020-01-21Merge 5.5-rc7 into usb-nextGreg Kroah-Hartman
We need the USB fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-21arm64: kbuild: remove compressed images on 'make ARCH=arm64 (dist)clean'Dirk Behme
Since v4.3-rc1 commit 0723c05fb75e44 ("arm64: enable more compressed Image formats"), it is possible to build Image.{bz2,lz4,lzma,lzo} AArch64 images. However, the commit missed adding support for removing those images on 'make ARCH=arm64 (dist)clean'. Fix this by adding them to the target list. Make sure to match the order of the recipes in the makefile. Cc: stable@vger.kernel.org # v4.3+ Fixes: 0723c05fb75e44 ("arm64: enable more compressed Image formats") Signed-off-by: Dirk Behme <dirk.behme@de.bosch.com> Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com> Reviewed-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Will Deacon <will@kernel.org>
2020-01-21KVM: x86/mmu: Apply max PA check for MMIO sptes to 32-bit KVMSean Christopherson
Remove the bogus 64-bit only condition from the check that disables MMIO spte optimization when the system supports the max PA, i.e. doesn't have any reserved PA bits. 32-bit KVM always uses PAE paging for the shadow MMU, and per Intel's SDM: PAE paging translates 32-bit linear addresses to 52-bit physical addresses. The kernel's restrictions on max physical addresses are limits on how much memory the kernel can reasonably use, not what physical addresses are supported by hardware. Fixes: ce88decffd17 ("KVM: MMU: mmio page fault support") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: nVMX: vmread should not set rflags to specify success in case of #PFMiaohe Lin
In case writing to vmread destination operand result in a #PF, vmread should not call nested_vmx_succeed() to set rflags to specify success. Similar to as done in VMPTRST (See handle_vmptrst()). Reviewed-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Cc: stable@vger.kernel.org Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: x86/mmu: Micro-optimize nEPT's bad memptype/XWR checksSean Christopherson
Rework the handling of nEPT's bad memtype/XWR checks to micro-optimize the checks as much as possible. Move the check to a separate helper, __is_bad_mt_xwr(), which allows the guest_rsvd_check usage in paging_tmpl.h to omit the check entirely for paging32/64 (bad_mt_xwr is always zero for non-nEPT) while retaining the bitwise-OR of the current code for the shadow_zero_check in walk_shadow_page_get_mmio_spte(). Add a comment for the bitwise-OR usage in the mmio spte walk to avoid future attempts to "fix" the code, which is what prompted this optimization in the first place[*]. Opportunistically remove the superfluous '!= 0' and parantheses, and use BIT_ULL() instead of open coding its equivalent. The net effect is that code generation is largely unchanged for walk_shadow_page_get_mmio_spte(), marginally better for ept_prefetch_invalid_gpte(), and significantly improved for paging32/64_prefetch_invalid_gpte(). Note, walk_shadow_page_get_mmio_spte() can't use a templated version of the memtype/XRW as it works on the host's shadow PTEs, e.g. checks that KVM hasn't borked its EPT tables. Even if it could be templated, the benefits of having a single implementation far outweight the few uops that would be saved for NPT or non-TDP paging, e.g. most compilers inline it all the way to up kvm_mmu_page_fault(). [*] https://lkml.kernel.org/r/20200108001859.25254-1-sean.j.christopherson@intel.com Cc: Jim Mattson <jmattson@google.com> Cc: David Laight <David.Laight@ACULAB.COM> Cc: Arvind Sankar <nivedita@alum.mit.edu> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: x86/mmu: Reorder the reserved bit check in prefetch_invalid_gpte()Sean Christopherson
Move the !PRESENT and !ACCESSED checks in FNAME(prefetch_invalid_gpte) above the call to is_rsvd_bits_set(). For a well behaved guest, the !PRESENT and !ACCESSED are far more likely to evaluate true than the reserved bit checks, and they do not require additional memory accesses. Before: Dump of assembler code for function paging32_prefetch_invalid_gpte: 0x0000000000044240 <+0>: callq 0x44245 <paging32_prefetch_invalid_gpte+5> 0x0000000000044245 <+5>: mov %rcx,%rax 0x0000000000044248 <+8>: shr $0x7,%rax 0x000000000004424c <+12>: and $0x1,%eax 0x000000000004424f <+15>: lea 0x0(,%rax,4),%r8 0x0000000000044257 <+23>: add %r8,%rax 0x000000000004425a <+26>: mov %rcx,%r8 0x000000000004425d <+29>: and 0x120(%rsi,%rax,8),%r8 0x0000000000044265 <+37>: mov 0x170(%rsi),%rax 0x000000000004426c <+44>: shr %cl,%rax 0x000000000004426f <+47>: and $0x1,%eax 0x0000000000044272 <+50>: or %rax,%r8 0x0000000000044275 <+53>: jne 0x4427c <paging32_prefetch_invalid_gpte+60> 0x0000000000044277 <+55>: test $0x1,%cl 0x000000000004427a <+58>: jne 0x4428a <paging32_prefetch_invalid_gpte+74> 0x000000000004427c <+60>: mov %rdx,%rsi 0x000000000004427f <+63>: callq 0x44080 <drop_spte> 0x0000000000044284 <+68>: mov $0x1,%eax 0x0000000000044289 <+73>: retq 0x000000000004428a <+74>: xor %eax,%eax 0x000000000004428c <+76>: and $0x20,%ecx 0x000000000004428f <+79>: jne 0x44289 <paging32_prefetch_invalid_gpte+73> 0x0000000000044291 <+81>: mov %rdx,%rsi 0x0000000000044294 <+84>: callq 0x44080 <drop_spte> 0x0000000000044299 <+89>: mov $0x1,%eax 0x000000000004429e <+94>: jmp 0x44289 <paging32_prefetch_invalid_gpte+73> End of assembler dump. After: Dump of assembler code for function paging32_prefetch_invalid_gpte: 0x0000000000044240 <+0>: callq 0x44245 <paging32_prefetch_invalid_gpte+5> 0x0000000000044245 <+5>: test $0x1,%cl 0x0000000000044248 <+8>: je 0x4424f <paging32_prefetch_invalid_gpte+15> 0x000000000004424a <+10>: test $0x20,%cl 0x000000000004424d <+13>: jne 0x4425d <paging32_prefetch_invalid_gpte+29> 0x000000000004424f <+15>: mov %rdx,%rsi 0x0000000000044252 <+18>: callq 0x44080 <drop_spte> 0x0000000000044257 <+23>: mov $0x1,%eax 0x000000000004425c <+28>: retq 0x000000000004425d <+29>: mov %rcx,%rax 0x0000000000044260 <+32>: mov (%rsi),%rsi 0x0000000000044263 <+35>: shr $0x7,%rax 0x0000000000044267 <+39>: and $0x1,%eax 0x000000000004426a <+42>: lea 0x0(,%rax,4),%r8 0x0000000000044272 <+50>: add %r8,%rax 0x0000000000044275 <+53>: mov %rcx,%r8 0x0000000000044278 <+56>: and 0x120(%rsi,%rax,8),%r8 0x0000000000044280 <+64>: mov 0x170(%rsi),%rax 0x0000000000044287 <+71>: shr %cl,%rax 0x000000000004428a <+74>: and $0x1,%eax 0x000000000004428d <+77>: mov %rax,%rcx 0x0000000000044290 <+80>: xor %eax,%eax 0x0000000000044292 <+82>: or %rcx,%r8 0x0000000000044295 <+85>: je 0x4425c <paging32_prefetch_invalid_gpte+28> 0x0000000000044297 <+87>: mov %rdx,%rsi 0x000000000004429a <+90>: callq 0x44080 <drop_spte> 0x000000000004429f <+95>: mov $0x1,%eax 0x00000000000442a4 <+100>: jmp 0x4425c <paging32_prefetch_invalid_gpte+28> End of assembler dump. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: SVM: Override default MMIO mask if memory encryption is enabledTom Lendacky
The KVM MMIO support uses bit 51 as the reserved bit to cause nested page faults when a guest performs MMIO. The AMD memory encryption support uses a CPUID function to define the encryption bit position. Given this, it is possible that these bits can conflict. Use svm_hardware_setup() to override the MMIO mask if memory encryption support is enabled. Various checks are performed to ensure that the mask is properly defined and rsvd_bits() is used to generate the new mask (as was done prior to the change that necessitated this patch). Fixes: 28a1f3ac1d0c ("kvm: x86: Set highest physical address bits in non-present/reserved SPTEs") Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: vmx: delete meaningless nested_vmx_prepare_msr_bitmap() declarationMiaohe Lin
The function nested_vmx_prepare_msr_bitmap() declaration is below its implementation. So this is meaningless and should be removed. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: x86: Refactor and rename bit() to feature_bit() macroSean Christopherson
Rename bit() to __feature_bit() to give it a more descriptive name, and add a macro, feature_bit(), to stuff the X68_FEATURE_ prefix to keep line lengths manageable for code that hardcodes the bit to be retrieved. No functional change intended. Cc: Jim Mattson <jmattson@google.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: x86: Expand build-time assertion on reverse CPUID usageSean Christopherson
Add build-time checks to ensure KVM isn't trying to do a reverse CPUID lookup on Linux-defined feature bits, along with comments to explain the gory details of X86_FEATUREs and bit(). No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: x86: Add CPUID_7_1_EAX to the reverse CPUID tableSean Christopherson
Add an entry for CPUID_7_1_EAX in the reserve_cpuid array in preparation for incorporating the array in bit() build-time assertions, specifically to avoid an assertion on F(AVX512_BF16) in do_cpuid_7_mask(). No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: x86: Move bit() helper to cpuid.hSean Christopherson
Move bit() to cpuid.h in preparation for incorporating the reverse_cpuid array in bit() build-time assertions. Opportunistically use the BIT() macro instead of open-coding the shift. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: x86: Add dedicated emulator helpers for querying CPUID featuresSean Christopherson
Add feature-specific helpers for querying guest CPUID support from the emulator instead of having the emulator do a full CPUID and perform its own bit tests. The primary motivation is to eliminate the emulator's usage of bit() so that future patches can add more extensive build-time assertions on the usage of bit() without having to expose yet more code to the emulator. Note, providing a generic guest_cpuid_has() to the emulator doesn't work due to the existing built-time assertions in guest_cpuid_has(), which require the feature being checked to be a compile-time constant. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: x86: Add macro to ensure reserved cr4 bits checks stay in syncSean Christopherson
Add a helper macro to generate the set of reserved cr4 bits for both host and guest to ensure that adding a check on guest capabilities is also added for host capabilities, and vice versa. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: x86: Drop special XSAVE handling from guest_cpuid_has()Sean Christopherson
Now that KVM prevents setting host-reserved CR4 bits, drop the dedicated XSAVE check in guest_cpuid_has() in favor of open coding similar checks in the SVM/VMX XSAVES enabling flows. Note, checking boot_cpu_has(X86_FEATURE_XSAVE) in the XSAVES flows is technically redundant with respect to the CR4 reserved bit checks, e.g. XSAVES #UDs if CR4.OSXSAVE=0 and arch.xsaves_enabled is consumed if and only if CR4.OXSAVE=1 in guest. Keep (add?) the explicit boot_cpu_has() checks to help document KVM's usage of arch.xsaves_enabled. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: x86: Ensure all logical CPUs have consistent reserved cr4 bitsSean Christopherson
Check the current CPU's reserved cr4 bits against the mask calculated for the boot CPU to ensure consistent behavior across all CPUs. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: x86: Don't let userspace set host-reserved cr4 bitsSean Christopherson
Calculate the host-reserved cr4 bits at runtime based on the system's capabilities (using logic similar to __do_cpuid_func()), and use the dynamically generated mask for the reserved bit check in kvm_set_cr4() instead using of the static CR4_RESERVED_BITS define. This prevents userspace from "enabling" features in cr4 that are not supported by the system, e.g. by ignoring KVM_GET_SUPPORTED_CPUID and specifying a bogus CPUID for the vCPU. Allowing userspace to set unsupported bits in cr4 can lead to a variety of undesirable behavior, e.g. failed VM-Enter, and in general increases KVM's attack surface. A crafty userspace can even abuse CR4.LA57 to induce an unchecked #GP on a WRMSR. On a platform without LA57 support: KVM_SET_CPUID2 // CPUID_7_0_ECX.LA57 = 1 KVM_SET_SREGS // CR4.LA57 = 1 KVM_SET_MSRS // KERNEL_GS_BASE = 0x0004000000000000 KVM_RUN leads to a #GP when writing KERNEL_GS_BASE into hardware: unchecked MSR access error: WRMSR to 0xc0000102 (tried to write 0x0004000000000000) at rIP: 0xffffffffa00f239a (vmx_prepare_switch_to_guest+0x10a/0x1d0 [kvm_intel]) Call Trace: kvm_arch_vcpu_ioctl_run+0x671/0x1c70 [kvm] kvm_vcpu_ioctl+0x36b/0x5d0 [kvm] do_vfs_ioctl+0xa1/0x620 ksys_ioctl+0x66/0x70 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x4c/0x170 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7fc08133bf47 Note, the above sequence fails VM-Enter due to invalid guest state. Userspace can allow VM-Enter to succeed (after the WRMSR #GP) by adding a KVM_SET_SREGS w/ CR4.LA57=0 after KVM_SET_MSRS, in which case KVM will technically leak the host's KERNEL_GS_BASE into the guest. But, as KERNEL_GS_BASE is a userspace-defined value/address, the leak is largely benign as a malicious userspace would simply be exposing its own data to the guest, and attacking a benevolent userspace would require multiple bugs in the userspace VMM. Cc: stable@vger.kernel.org Cc: Jun Nakajima <jun.nakajima@intel.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: VMX: Add helper to consolidate up PT/RTIT WRMSR fault logicSean Christopherson
Add a helper to consolidate the common checks for writing PT MSRs, and opportunistically clean up the formatting of the affected code. No functional change intended. Cc: Chao Peng <chao.p.peng@linux.intel.com> Cc: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: VMX: Add non-canonical check on writes to RTIT address MSRsSean Christopherson
Reject writes to RTIT address MSRs if the data being written is a non-canonical address as the MSRs are subject to canonical checks, e.g. KVM will trigger an unchecked #GP when loading the values to hardware during pt_guest_enter(). Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: Fix some writing mistakesMiaohe Lin
Fix some writing mistakes in the comments. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: hyperv: Fix some typos in vcpu unimpl infoMiaohe Lin
Fix some typos in vcpu unimpl info. It should be unhandled rather than uhandled. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: Fix some grammar mistakesMiaohe Lin
Fix some grammar mistakes in the comments. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: Fix some comment typos and missing parenthesesMiaohe Lin
Fix some typos and add missing parentheses in the comments. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: Fix some out-dated function names in commentMiaohe Lin
Since commit b1346ab2afbe ("KVM: nVMX: Rename prepare_vmcs02_*_full to prepare_vmcs02_*_rare"), prepare_vmcs02_full has been renamed to prepare_vmcs02_rare. nested_vmx_merge_msr_bitmap is renamed to nested_vmx_prepare_msr_bitmap since commit c992384bde84 ("KVM: vmx: speed up MSR bitmap merge"). Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: Fix some wrong function names in commentMiaohe Lin
Fix some wrong function names in comment. mmu_check_roots is a typo for mmu_check_root, vmcs_read_any should be vmcs12_read_any and so on. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: x86: check kvm_pit outside kvm_vm_ioctl_reinject()Miaohe Lin
check kvm_pit outside kvm_vm_ioctl_reinject() to keep codestyle consistent with other kvm_pit func and prepare for futher cleanups. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: LAPIC: micro-optimize fixed mode ipi deliveryWanpeng Li
This patch optimizes redundancy logic before fixed mode ipi is delivered in the fast path, broadcast handling needs to go slow path, so the delivery mode repair can be delayed to before slow path. Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21KVM: VMX: FIXED+PHYSICAL mode single target IPI fastpathWanpeng Li
ICR and TSCDEADLINE MSRs write cause the main MSRs write vmexits in our product observation, multicast IPIs are not as common as unicast IPI like RESCHEDULE_VECTOR and CALL_FUNCTION_SINGLE_VECTOR etc. This patch introduce a mechanism to handle certain performance-critical WRMSRs in a very early stage of KVM VMExit handler. This mechanism is specifically used for accelerating writes to x2APIC ICR that attempt to send a virtual IPI with physical destination-mode, fixed delivery-mode and single target. Which was found as one of the main causes of VMExits for Linux workloads. The reason this mechanism significantly reduce the latency of such virtual IPIs is by sending the physical IPI to the target vCPU in a very early stage of KVM VMExit handler, before host interrupts are enabled and before expensive operations such as reacquiring KVM’s SRCU lock. Latency is reduced even more when KVM is able to use APICv posted-interrupt mechanism (which allows to deliver the virtual IPI directly to target vCPU without the need to kick it to host). Testing on Xeon Skylake server: The virtual IPI latency from sender send to receiver receive reduces more than 200+ cpu cycles. Reviewed-by: Liran Alon <liran.alon@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Liran Alon <liran.alon@oracle.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21sparc32: fix struct ipc64_perm type definitionArnd Bergmann
As discussed in the strace issue tracker, it appears that the sparc32 sysvipc support has been broken for the past 11 years. It was however working in compat mode, which is how it must have escaped most of the regular testing. The problem is that a cleanup patch inadvertently changed the uid/gid fields in struct ipc64_perm from 32-bit types to 16-bit types in uapi headers. Both glibc and uclibc-ng still use the original types, so they should work fine with compat mode, but not natively. Change the definitions to use __kernel_uid32_t and __kernel_gid32_t again. Fixes: 83c86984bff2 ("sparc: unify ipcbuf.h") Link: https://github.com/strace/strace/issues/116 Cc: <stable@vger.kernel.org> # v2.6.29 Cc: Sam Ravnborg <sam@ravnborg.org> Cc: "Dmitry V . Levin" <ldv@altlinux.org> Cc: Rich Felker <dalias@libc.org> Cc: libc-alpha@sourceware.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-21sparc32, leon: Stop adding vendor and device id to prom ambapp path componentsAndreas Larsson
These extra fields before the @ are not handled in of_node_name_eq, making commit b3e46d1a0590500335f0b95e669ad6d84b12b03a break node name comparisons for ambapp path components, thereby making LEON systems unable to boot. As there is no need for the tacked on vendor and device ID fields in the path component, resolve this situation by removing them. Signed-off-by: Andreas Larsson <andreas@gaisler.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-21arm64: entry: Avoid empty alternatives entriesJulien Thierry
kernel_ventry will create alternative entries to potentially replace 0 instructions with 0 instructions for EL1 vectors. While this does not cause an issue, it pointlessly takes up some bytes in the alternatives section. Do not generate such entries. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Julien Thierry <jthierry@redhat.com> Signed-off-by: Will Deacon <will@kernel.org>
2020-01-21arm64: Kconfig: select HAVE_FUTEX_CMPXCHGVladimir Murzin
arm64 provides always working implementation of futex_atomic_cmpxchg_inatomic(), so there is no need to check it runtime. Reported-by: Piyush swami <Piyush.swami@arm.com> Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2020-01-20MIPS: syscalls: fix indentation of the 'SYSNR' messageAlexander Lobakin
It also lacks a whitespace (copy'n'paste error?) and also messes up the output: SYSHDR arch/mips/include/generated/uapi/asm/unistd_n32.h SYSHDR arch/mips/include/generated/uapi/asm/unistd_n64.h SYSHDR arch/mips/include/generated/uapi/asm/unistd_o32.h SYSNR arch/mips/include/generated/uapi/asm/unistd_nr_n32.h SYSNR arch/mips/include/generated/uapi/asm/unistd_nr_n64.h SYSNR arch/mips/include/generated/uapi/asm/unistd_nr_o32.h WRAP arch/mips/include/generated/uapi/asm/bpf_perf_event.h WRAP arch/mips/include/generated/uapi/asm/ipcbuf.h After: SYSHDR arch/mips/include/generated/uapi/asm/unistd_n32.h SYSHDR arch/mips/include/generated/uapi/asm/unistd_n64.h SYSHDR arch/mips/include/generated/uapi/asm/unistd_o32.h SYSNR arch/mips/include/generated/uapi/asm/unistd_nr_n32.h SYSNR arch/mips/include/generated/uapi/asm/unistd_nr_n64.h SYSNR arch/mips/include/generated/uapi/asm/unistd_nr_o32.h WRAP arch/mips/include/generated/uapi/asm/bpf_perf_event.h WRAP arch/mips/include/generated/uapi/asm/ipcbuf.h Present since day 0 of syscall table generation introduction for MIPS. Fixes: 9bcbf97c6293 ("mips: add system call table generation support") Cc: <stable@vger.kernel.org> # v5.0+ Signed-off-by: Alexander Lobakin <alobakin@dlink.ru> Signed-off-by: Paul Burton <paulburton@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <jhogan@kernel.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Rob Herring <robh@kernel.org> Cc: linux-mips@vger.kernel.org Cc: linux-kernel@vger.kernel.org