path: root/arch/x86
2018-12-26Merge branch 'x86-microcode-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 microcode loading updates from Borislav Petkov: "This update contains work started by Maciej to make the microcode container verification more robust against all kinds of corruption and also unify verification paths between early and late loading. The result is a set of verification routines which validate the microcode blobs before loading it on the CPU. In addition, the code is a lot more streamlined and unified. In the process, some of the aspects of patch handling and loading were simplified. All provided by Maciej S. Szmigiero and Borislav Petkov" * 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/microcode/AMD: Update copyright x86/microcode/AMD: Check the equivalence table size when scanning it x86/microcode/AMD: Convert CPU equivalence table variable into a struct x86/microcode/AMD: Check microcode container data in the late loader x86/microcode/AMD: Fix container size's type x86/microcode/AMD: Convert early parser to the new verification routines x86/microcode/AMD: Change verify_patch()'s return value x86/microcode/AMD: Move chipset-specific check into verify_patch() x86/microcode/AMD: Move patch family check to verify_patch() x86/microcode/AMD: Simplify patch family detection x86/microcode/AMD: Concentrate patch verification x86/microcode/AMD: Cleanup verify_patch_size() more x86/microcode/AMD: Clean up per-family patch size checks x86/microcode/AMD: Move verify_patch_size() up in the file x86/microcode/AMD: Add microcode container verification x86/microcode/AMD: Subtract SECTION_HDR_SIZE from file leftover length
2018-12-26Merge branch 'x86-cache-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cache control updates from Borislav Petkov: - The generalization of the RDT code to accommodate the addition of AMD's very similar implementation of the cache monitoring feature. This entails a subsystem move into a separate and generic arch/x86/kernel/cpu/resctrl/ directory along with adding vendor-specific initialization and feature detection helpers. On top of that is the unification of user-visible strings, both in the resctrl filesystem error handling and Kconfig. Provided by Babu Moger and Sherry Hurwitz. - Code simplifications and error handling improvements by Reinette Chatre. * 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/resctrl: Fix rdt_find_domain() return value and checks x86/resctrl: Remove unnecessary check for cbm_validate() x86/resctrl: Use rdt_last_cmd_puts() where possible MAINTAINERS: Update resctrl filename patterns Documentation: Rename and update intel_rdt_ui.txt to resctrl_ui.txt x86/resctrl: Introduce AMD QOS feature x86/resctrl: Fixup the user-visible strings x86/resctrl: Add AMD's X86_FEATURE_MBA to the scattered CPUID features x86/resctrl: Rename the config option INTEL_RDT to RESCTRL x86/resctrl: Add vendor check for the MBA software controller x86/resctrl: Bring cbm_validate() into the resource structure x86/resctrl: Initialize the vendor-specific resource functions x86/resctrl: Move all the macros to resctrl/internal.h x86/resctrl: Re-arrange the RDT init code x86/resctrl: Rename the RDT functions and definitions x86/resctrl: Rename and move rdt files to a separate directory
2018-12-26Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "ARM: - selftests improvements - large PUD support for HugeTLB - single-stepping fixes - improved tracing - various timer and vGIC fixes x86: - Processor Tracing virtualization - STIBP support - some correctness fixes - refactorings and splitting of vmx.c - use the Hyper-V range TLB flush hypercall - reduce order of vcpu struct - WBNOINVD support - do not use -ftrace for __noclone functions - nested guest support for PAUSE filtering on AMD - more Hyper-V enlightenments (direct mode for synthetic timers) PPC: - nested VFIO s390: - bugfixes only this time" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (171 commits) KVM: x86: Add CPUID support for new instruction WBNOINVD kvm: selftests: ucall: fix exit mmio address guessing Revert "compiler-gcc: disable -ftracer for __noclone functions" KVM: VMX: Move VM-Enter + VM-Exit handling to non-inline sub-routines KVM: VMX: Explicitly reference RCX as the vmx_vcpu pointer in asm blobs KVM: x86: Use jmp to invoke kvm_spurious_fault() from .fixup MAINTAINERS: Add arch/x86/kvm sub-directories to existing KVM/x86 entry KVM/x86: Use SVM assembly instruction mnemonics instead of .byte streams KVM/MMU: Flush tlb directly in the kvm_zap_gfn_range() KVM/MMU: Flush tlb directly in kvm_set_pte_rmapp() KVM/MMU: Move tlb flush in kvm_set_pte_rmapp() to kvm_mmu_notifier_change_pte() KVM: Make kvm_set_spte_hva() return int KVM: Replace old tlb flush function with new one to flush a specified range. KVM/MMU: Add tlb flush with range helper function KVM/VMX: Add hv tlb range flush support x86/hyper-v: Add HvFlushGuestAddressList hypercall support KVM: Add tlb_remote_flush_with_range callback in kvm_x86_ops KVM: x86: Disable Intel PT when VMXON in L1 guest KVM: x86: Set intercept for Intel PT MSRs read/write KVM: x86: Implement Intel PT MSRs read/write emulation ...
2018-12-26Merge tag 'for-linus-4.21-rc1-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from Juergen Gross: "Xen features and fixes: - a series to enable KVM guests to be booted by qemu via the Xen PVH boot entry for speeding up KVM guest tests - a series for a common driver to be used by Xen PV frontends (right now drm and sound) - two other fixes in Xen related code" * tag 'for-linus-4.21-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: ALSA: xen-front: Use Xen common shared buffer implementation drm/xen-front: Use Xen common shared buffer implementation xen: Introduce shared buffer helpers for page directory... xen/pciback: Check dev_data before using it kprobes/x86/xen: blacklist non-attachable xen interrupt functions KVM: x86: Allow Qemu/KVM to use PVH entry point xen/pvh: Add memory map pointer to hvm_start_info struct xen/pvh: Move Xen code for getting mem map via hcall out of common file xen/pvh: Move Xen specific PVH VM initialization out of common file xen/pvh: Create a new file for Xen specific PVH code xen/pvh: Move PVH entry code out of Xen specific tree xen/pvh: Split CONFIG_XEN_PVH into CONFIG_PVH and CONFIG_XEN_PVH
2018-12-25Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 festive updates from Will Deacon: "In the end, we ended up with quite a lot more than I expected: - Support for ARMv8.3 Pointer Authentication in userspace (CRIU and kernel-side support to come later) - Support for per-thread stack canaries, pending an update to GCC that is currently undergoing review - Support for kexec_file_load(), which permits secure boot of a kexec payload but also happens to improve the performance of kexec dramatically because we can avoid the sucky purgatory code from userspace. Kdump will come later (requires updates to libfdt). - Optimisation of our dynamic CPU feature framework, so that all detected features are enabled via a single stop_machine() invocation - KPTI whitelisting of Cortex-A CPUs unaffected by Meltdown, so that they can benefit from global TLB entries when KASLR is not in use - 52-bit virtual addressing for userspace (kernel remains 48-bit) - Patch in LSE atomics for per-cpu atomic operations - Custom preempt.h implementation to avoid unconditional calls to preempt_schedule() from preempt_enable() - Support for the new 'SB' Speculation Barrier instruction - Vectorised implementation of XOR checksumming and CRC32 optimisations - Workaround for Cortex-A76 erratum #1165522 - Improved compatibility with Clang/LLD - Support for TX2 system PMUS for profiling the L3 cache and DMC - Reflect read-only permissions in the linear map by default - Ensure MMIO reads are ordered with subsequent calls to Xdelay() - Initial support for memory hotplug - Tweak the threshold when we invalidate the TLB by-ASID, so that mremap() performance is improved for ranges spanning multiple PMDs. - Minor refactoring and cleanups" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (125 commits) arm64: kaslr: print PHYS_OFFSET in dump_kernel_offset() arm64: sysreg: Use _BITUL() when defining register bits arm64: cpufeature: Rework ptr auth hwcaps using multi_entry_cap_matches arm64: cpufeature: Reduce number of pointer auth CPU caps from 6 to 4 arm64: docs: document pointer authentication arm64: ptr auth: Move per-thread keys from thread_info to thread_struct arm64: enable pointer authentication arm64: add prctl control for resetting ptrauth keys arm64: perf: strip PAC when unwinding userspace arm64: expose user PAC bit positions via ptrace arm64: add basic pointer authentication support arm64/cpufeature: detect pointer authentication arm64: Don't trap host pointer auth use to EL2 arm64/kvm: hide ptrauth from guests arm64/kvm: consistently handle host HCR_EL2 flags arm64: add pointer authentication register bits arm64: add comments about EC exception levels arm64: perf: Treat EXCLUDE_EL* bit definitions as unsigned arm64: kpti: Whitelist Cortex-A CPUs that don't implement the CSV3 field arm64: enable per-task stack canaries ...
2018-12-25Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 pti updates from Thomas Gleixner: "No point in speculating what's in this parcel: - Drop the swap storage limit when L1TF is disabled so the full space is available - Add support for the new AMD STIBP always on mitigation mode - Fix a bunch of STIPB typos" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/speculation: Add support for STIBP always-on preferred mode x86/speculation/l1tf: Drop the swap storage limit restriction when l1tf=off x86/speculation: Change misspelled STIPB to STIBP
2018-12-25Merge tag 'acpi-4.21-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI updates from Rafael Wysocki: "These update the ACPICA code in the kernel to the 20181213 upstream revision, make it possible to build the ACPI subsystem without PCI support, add a new OEM _OSI string, add new device support to the ACPI driver for AMD SoCs, fix PM handling in the ACPI driver for Intel SoCs, fix the SPCR table handling and do some assorted fixes and cleanups. Specifics: - Update the ACPICA code in the kernel to the 20181213 upstream revision including: * New Windows _OSI strings (Bob Moore, Jung-uk Kim). * Buffers-to-string conversions update (Bob Moore). * Removal of support for expressions in package elements (Bob Moore). * New option to display method/object evaluation in debug output (Bob Moore). * Compiler improvements (Bob Moore, Erik Schmauss). * Minor debugger fix (Erik Schmauss). * Disassembler improvement (Erik Schmauss). * Assorted cleanups (Bob Moore, Colin Ian King, Erik Schmauss). - Add support for a new OEM _OSI string to indicate special handling of secondary graphics adapters on some systems (Alex Hung). - Make it possible to build the ACPI subsystem without PCI support (Sinan Kaya). - Make the SPCR table handling regard baud rate 0 in accordance with its specification and make the DSDT override code support DSDT code names generated by recent ACPICA (Andy Shevchenko, Wang Dongsheng, Nathan Chancellor). - Add clock frequency for Hisilicon Hip08 SPI controller to the ACPI driver for AMD SoCs (APD) (Jay Fang). - Fix the PM handling during device init in the ACPI driver for Intel SoCs (LPSS) (Hans de Goede). - Avoid double panic()s by clearing the APEI GHES block_status before panic() (Lenny Szubowicz). - Clean up a function invocation in the ACPI core and get rid of some code duplication by using the DEFINE_SHOW_ATTRIBUTE macro in the APEI support code (Alexey Dobriyan, Yangtao Li)" * tag 'acpi-4.21-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (31 commits) ACPI / tables: Add an ifdef around amlcode and dsdt_amlcode ACPI/APEI: Clear GHES block_status before panic() ACPI: Make PCI slot detection driver depend on PCI ACPI/IORT: Stub out ACS functions when CONFIG_PCI is not set arm64: select ACPI PCI code only when both features are enabled PCI/ACPI: Allow ACPI to be built without CONFIG_PCI set ACPICA: Remove PCI bits from ACPICA when CONFIG_PCI is unset ACPI: Allow CONFIG_PCI to be unset for reboot ACPI: Move PCI reset to a separate function ACPI / OSI: Add OEM _OSI string to enable dGPU direct output ACPI / tables: add DSDT AmlCode new declaration name support ACPICA: Update version to 20181213 ACPICA: change coding style to match ACPICA, no functional change ACPICA: Debug output: Add option to display method/object evaluation ACPICA: disassembler: disassemble OEMx tables as AML ACPICA: Add "Windows 2018.2" string in the _OSI support ACPICA: Expressions in package elements are not supported ACPICA: Update buffer-to-string conversions ACPICA: add comments, no functional change ACPICA: Remove defines that use deprecated flag ...
2018-12-23crypto: x86/chacha - avoid sleeping under kernel_fpu_begin()Eric Biggers
Passing atomic=true to skcipher_walk_virt() only makes the later skcipher_walk_done() calls use atomic memory allocations, not skcipher_walk_virt() itself. Thus, we have to move it outside of the preemption-disabled region (kernel_fpu_begin()/kernel_fpu_end()). (skcipher_walk_virt() only allocates memory for certain layouts of the input scatterlist, hence why I didn't notice this earlier...) Reported-by: syzbot+9bf843c33f782d73ae7d@syzkaller.appspotmail.com Fixes: 4af78261870a ("crypto: x86/chacha20 - add XChaCha20 support") Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
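A minimal sketch of the resulting call ordering (illustrative only, not the actual chacha glue code; the SIMD processing inside the loop is elided):

    #include <crypto/internal/skcipher.h>
    #include <asm/fpu/api.h>

    static int chacha_simd_sketch(struct skcipher_request *req)
    {
        struct skcipher_walk walk;
        int err;

        /*
         * May allocate memory for some scatterlist layouts, so it must run
         * before kernel_fpu_begin() disables preemption. The atomic=true
         * argument only affects the skcipher_walk_done() calls below.
         */
        err = skcipher_walk_virt(&walk, req, true);

        kernel_fpu_begin();
        while (walk.nbytes > 0) {
            /* ... SIMD ChaCha rounds on walk.src/walk.dst ... */
            err = skcipher_walk_done(&walk, 0);
        }
        kernel_fpu_end();

        return err;
    }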
2018-12-23crypto: aesni - Add scatter/gather avx stubs, and use them in CDave Watson
Add the appropriate scatter/gather stubs to the avx asm. In the C code, we can now always use crypt_by_sg, since both sse and asm code now support scatter/gather. Introduce a new struct, aesni_gcm_tfm, that is initialized on startup to point to either the SSE, AVX, or AVX2 versions of the four necessary encryption/decryption routines. GENX_OPTSIZE is still checked at the start of crypt_by_sg. The total size of the data is checked, since the additional overhead is in the init function, calculating additional HashKeys. Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
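The dispatch described above follows a simple function-pointer-table pattern; a sketch (the member signatures are simplified and the _sse/_avx_gen2/_avx_gen4 instance names are assumptions, not necessarily the exact symbols in aesni-intel_glue.c):

    #include <linux/types.h>
    #include <asm/cpufeature.h>

    struct aesni_gcm_tfm_s {
        void (*init)(void *ctx, void *gdata, const u8 *iv, const u8 *hash_subkey,
                     const u8 *aad, unsigned long aad_len);
        void (*enc_update)(void *ctx, void *gdata, u8 *out, const u8 *in,
                           unsigned long plaintext_len);
        void (*dec_update)(void *ctx, void *gdata, u8 *out, const u8 *in,
                           unsigned long ciphertext_len);
        void (*finalize)(void *ctx, void *gdata, u8 *auth_tag,
                         unsigned long auth_tag_len);
    };

    /* Placeholder instances; the real ones point at the asm entry points. */
    static const struct aesni_gcm_tfm_s aesni_gcm_tfm_sse = { };
    static const struct aesni_gcm_tfm_s aesni_gcm_tfm_avx_gen2 = { };
    static const struct aesni_gcm_tfm_s aesni_gcm_tfm_avx_gen4 = { };

    static const struct aesni_gcm_tfm_s *aesni_gcm_tfm;

    /* Chosen once at module init; crypt_by_sg() then calls through it. */
    static void aesni_pick_gcm_impl(void)
    {
        if (boot_cpu_has(X86_FEATURE_AVX2))
            aesni_gcm_tfm = &aesni_gcm_tfm_avx_gen4;
        else if (boot_cpu_has(X86_FEATURE_AVX))
            aesni_gcm_tfm = &aesni_gcm_tfm_avx_gen2;
        else
            aesni_gcm_tfm = &aesni_gcm_tfm_sse;
    }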
2018-12-23crypto: aesni - Introduce partial block macroDave Watson
Before this diff, multiple calls to GCM_ENC_DEC will succeed, but only if all calls are a multiple of 16 bytes. Handle partial blocks at the start of GCM_ENC_DEC, and update aadhash as appropriate. The data offset %r11 is also updated after the partial block. Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-12-23crypto: aesni - Introduce READ_PARTIAL_BLOCK macroDave Watson
Introduce READ_PARTIAL_BLOCK macro, and use it in the two existing partial block cases: AAD and the end of ENC_DEC. In particular, the ENC_DEC case should be faster, since we read by 8/4 bytes if possible. This macro will also be used to read partial blocks between enc_update and dec_update calls. Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-12-23crypto: aesni - Move ghash_mul to GCM_COMPLETEDave Watson
Prepare to handle partial blocks between scatter/gather calls. For the last partial block, we only want to calculate the aadhash in GCM_COMPLETE, and a new partial block macro will handle both aadhash update and encrypting partial blocks between calls. Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-12-23crypto: aesni - Fill in new context data structuresDave Watson
Fill in aadhash, aadlen, pblocklen, curcount with appropriate values. pblocklen, aadhash, and pblockenckey are also updated at the end of each scatter/gather operation, to be carried over to the next operation. Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-12-23crypto: aesni - Merge avx precompute functionsDave Watson
The precompute functions differ only by the sub-macros they call; merge them into a single macro. Later diffs add more code to fill in the gcm_context_data structure; this allows changes in a single place. Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-12-23crypto: aesni - Split AAD hash calculation to separate macroDave Watson
AAD hash only needs to be calculated once for each scatter/gather operation. Move it to its own macro, and call it from GCM_INIT instead of INITIAL_BLOCKS. Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-12-23crypto: aesni - Add GCM_COMPLETE macroDave Watson
Merge encode and decode tag calculations in GCM_COMPLETE macro. Scatter/gather routines will call this once at the end of encryption or decryption. Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-12-23crypto: aesni - support 256-bit keys in avx asmDave Watson
Add support for 192/256-bit keys using the avx gcm/aes routines. The sse routines were previously updated in e31ac32d3b (Add support for 192 & 256 bit keys to AESNI RFC4106). Instead of adding an additional loop in the hotpath as in e31ac32d3b, this diff instead generates separate versions of the code using macros, and the entry routines choose which version once. This results in a 5% performance improvement vs. adding a loop to the hot path. This is the same strategy chosen by the intel isa-l_crypto library. The key size checks are removed from the c code where appropriate. Note that this diff depends on using gcm_context_data - 256 bit keys require 16 HashKeys + 15 expanded keys, which is larger than struct crypto_aes_ctx, so they are stored in struct gcm_context_data. Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-12-23crypto: aesni - Macro-ify func save/restoreDave Watson
Macro-ify function save and restore. These will be used in new functions added for scatter/gather update operations. Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-12-23crypto: aesni - Introduce gcm_context_dataDave Watson
Add the gcm_context_data structure to the avx asm routines. This will be necessary to support both 256 bit keys and scatter/gather. The pre-computed HashKeys are now stored in the gcm_context_data struct, which is expanded to hold the greater number of hashkeys necessary for avx. Loads and stores to the new struct are always done unaligned to avoid compiler issues, see e5b954e8 "Use unaligned loads from gcm_context_data". Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
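For reference, the rough shape of that context (field names and sizes here are paraphrased from the description and may not match the header verbatim):

    #include <linux/types.h>

    struct gcm_context_data {
        /* init, update and finalize context */
        u8  aad_hash[16];
        u64 aad_length;
        u64 in_length;
        u8  partial_block_enc_key[16];
        u8  orig_IV[16];
        u8  current_counter[16];
        u64 partial_block_len;
        u64 unused;
        /* 16 precomputed HashKeys, as needed by the AVX code paths */
        u8  hash_keys[16 * 16];
    };

The glue code accesses this struct with unaligned loads/stores, as noted above.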
2018-12-23crypto: aesni - Merge GCM_ENC_DECDave Watson
The GCM_ENC_DEC routines for AVX and AVX2 are identical, except they call separate sub-macros. Pass the macros as arguments, and merge them. This facilitates additional refactoring, by requiring changes in only one place. The GCM_ENC_DEC macro was moved above the CONFIG_AS_AVX* ifdefs, since it will be used by both AVX and AVX2. Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-12-23treewide: add intermediate .s files to targetsMasahiro Yamada
Avoid unneeded recreation of these in the incremental build. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-12-22x86/efi: Don't unmap EFI boot services code/data regions for EFI_OLD_MEMMAP ↵Sai Praneeth Prakhya
and EFI_MIXED_MODE The following commit: d5052a7130a6 ("x86/efi: Unmap EFI boot services code/data regions from efi_pgd") forgets to take two EFI modes into consideration, namely EFI_OLD_MEMMAP and EFI_MIXED_MODE: - EFI_OLD_MEMMAP is a legacy way of mapping EFI regions into swapper_pg_dir using ioremap() and init_memory_mapping(). This feature can be enabled by passing "efi=old_map" as a kernel command line argument. But, efi_unmap_pages() unmaps EFI boot services code/data regions *only* from efi_pgd and hence cannot be used for unmapping EFI boot services code/data regions from swapper_pg_dir. Introduce a temporary fix to not unmap EFI boot services code/data regions when EFI_OLD_MEMMAP is enabled while working on a real fix. - EFI_MIXED_MODE is another feature where a 64-bit kernel runs on a 64-bit platform crippled by a 32-bit firmware. To support EFI_MIXED_MODE, all RAM (i.e. EFI regions like EFI_CONVENTIONAL_MEMORY, EFI_LOADER_<CODE/DATA>, EFI_BOOT_SERVICES_<CODE/DATA> and EFI_RUNTIME_CODE/DATA regions) is mapped into efi_pgd all the time so that EFI runtime calls can access their arguments in 1:1 mode. Hence, don't unmap EFI boot services code/data regions when booted in mixed mode. Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Bhupesh Sharma <bhsharma@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20181222022234.7573-1-sai.praneeth.prakhya@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
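A sketch of the resulting guards (simplified; the actual unmapping calls and helper details may differ):

    #include <linux/init.h>
    #include <linux/efi.h>
    #include <asm/efi.h>

    static void __init efi_unmap_pages(efi_memory_desc_t *md)
    {
        u64 pa = md->phys_addr;
        u64 va = md->virt_addr;

        /*
         * "efi=old_map" maps EFI regions into swapper_pg_dir, not efi_pgd,
         * so unmapping from efi_pgd would be wrong. Skip until a real fix
         * handles the direct-mapping case.
         */
        if (efi_enabled(EFI_OLD_MEMMAP))
            return;

        /*
         * Mixed mode keeps all RAM mapped in efi_pgd so that runtime calls
         * can access their arguments in 1:1 mode; don't unmap anything.
         */
        if (!efi_is_native())
            return;

        /* ... unmap pa and va from efi_pgd as before ... */
    }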
2018-12-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2018-12-21Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fix from Paolo Bonzini: "A simple patch for a pretty bad bug: Unbreak AMD nested virtualization." * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: nSVM: fix switch to guest mmu
2018-12-21Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "The biggest part is a series of reverts for the macro based GCC inlining workarounds. It caused regressions in distro build and other kernel tooling environments, and the GCC project was very receptive to fixing the underlying inliner weaknesses - so as time ran out we decided to do a reasonably straightforward revert of the patches. The plan is to rely on the 'asm inline' GCC 9 feature, which might be backported to GCC 8 and could thus become reasonably widely available on modern distros. Other than those reverts, there's misc fixes from all around the place. I wish our final x86 pull request for v4.20 was smaller..." * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Revert "kbuild/Makefile: Prepare for using macros in inline assembly code to work around asm() related GCC inlining bugs" Revert "x86/objtool: Use asm macros to work around GCC inlining bugs" Revert "x86/refcount: Work around GCC inlining bug" Revert "x86/alternatives: Macrofy lock prefixes to work around GCC inlining bugs" Revert "x86/bug: Macrofy the BUG table section handling, to work around GCC inlining bugs" Revert "x86/paravirt: Work around GCC inlining bugs when compiling paravirt ops" Revert "x86/extable: Macrofy inline assembly code to work around GCC inlining bugs" Revert "x86/cpufeature: Macrofy inline assembly code to work around GCC inlining bugs" Revert "x86/jump-labels: Macrofy inline assembly code to work around GCC inlining bugs" x86/mtrr: Don't copy uninitialized gentry fields back to userspace x86/fsgsbase/64: Fix the base write helper functions x86/mm/cpa: Fix cpa_flush_array() TLB invalidation x86/vdso: Pass --eh-frame-hdr to the linker x86/mm: Fix decoy address handling vs 32-bit builds x86/intel_rdt: Ensure a CPU remains online for the region's pseudo-locking sequence x86/dump_pagetables: Fix LDT remap address marker x86/mm: Fix guard hole handling
2018-12-22treewide: surround Kconfig file paths with double quotesMasahiro Yamada
The Kconfig lexer supports special characters such as '.' and '/' in the parameter context. In my understanding, the reason is just to support bare file paths in the source statement. I do not see a good reason to complicate Kconfig for the room of ambiguity. The majority of code already surrounds file paths with double quotes, and it makes sense since file paths are constant string literals. Make it treewide consistent now. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Wolfram Sang <wsa@the-dreams.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Ingo Molnar <mingo@kernel.org>
2018-12-21KVM: x86: Add CPUID support for new instruction WBNOINVDRobert Hoo
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-12-21KVM: VMX: Move VM-Enter + VM-Exit handling to non-inline sub-routinesSean Christopherson
Transitioning to/from a VMX guest requires KVM to manually save/load the bulk of CPU state that the guest is allowed to directly access, e.g. XSAVE state, CR2, GPRs, etc... For obvious reasons, loading the guest's GPR snapshot prior to VM-Enter and saving the snapshot after VM-Exit is done via handcoded assembly. The assembly blob is written as inline asm so that it can easily access KVM-defined structs that are used to hold guest state, e.g. moving the blob to a standalone assembly file would require generating defines for struct offsets. The other relevant aspect of VMX transitions in KVM is the handling of VM-Exits. KVM doesn't employ a separate VM-Exit handler per se, but rather treats the VMX transition as a mega instruction (with many side effects), i.e. sets the VMCS.HOST_RIP to a label immediately following VMLAUNCH/VMRESUME. The label is then exposed to C code via a global variable definition in the inline assembly. Because of the global variable, KVM takes steps to (attempt to) ensure only a single instance of the owning C function, e.g. vmx_vcpu_run, is generated by the compiler. The earliest approach placed the inline assembly in a separate noinline function[1]. Later, the assembly was folded back into vmx_vcpu_run() and tagged with __noclone[2][3], which is still used today. After moving to __noclone, an edge case was encountered where GCC's -ftracer optimization resulted in the inline assembly blob being duplicated. This was "fixed" by explicitly disabling -ftracer in the __noclone definition[4]. Recently, it was found that disabling -ftracer causes build warnings for unsuspecting users of __noclone[5], and more importantly for KVM, prevents the compiler from properly optimizing vmx_vcpu_run()[6]. And perhaps most importantly of all, it was pointed out that there is no way to prevent duplication of a function with 100% reliability[7], i.e. more edge cases may be encountered in the future. So to summarize, the only way to prevent the compiler from duplicating the global variable definition is to move the variable out of inline assembly, which has been suggested several times over[1][7][8]. Resolve the aforementioned issues by moving the VMLAUNCH+VMRESUME and VM-Exit "handler" to standalone assembly sub-routines. Moving only the core VMX transition code allows the struct indexing to remain as inline assembly and also allows the sub-routines to be used by nested_vmx_check_vmentry_hw(). Reusing the sub-routines has a happy side-effect of eliminating two VMWRITEs in the nested_early_check path as there is no longer a need to dynamically change VMCS.HOST_RIP. Note that callers to vmx_vmenter() must account for the CALL modifying RSP, e.g. must subtract op-size from RSP when synchronizing RSP with VMCS.HOST_RSP and "restore" RSP prior to the CALL. There are no great alternatives to fudging RSP. Saving RSP in vmx_vmenter() is difficult because doing so requires a second register (VMWRITE does not provide an immediate encoding for the VMCS field and KVM supports Hyper-V's memory-based eVMCS ABI). The other more drastic alternative would be to eschew VMCS.HOST_RSP and manually save/load RSP using a per-cpu variable (which can be encoded as e.g. gs:[imm]). But because a valid stack is needed at the time of VM-Exit (NMIs aren't blocked and a user could theoretically insert INT3/INT1ICEBRK at the VM-Exit handler), a dedicated per-cpu VM-Exit stack would be required. 
A dedicated stack isn't difficult to implement, but it would require at least one page per CPU and knowledge of the stack in the dumpstack routines. And in most cases there is essentially zero overhead in dynamically updating VMCS.HOST_RSP, e.g. the VMWRITE can be avoided for all but the first VMLAUNCH unless nested_early_check=1, which is not a fast path. In other words, avoiding the VMCS.HOST_RSP by using a dedicated stack would only make the code marginally less ugly while requiring at least one page per CPU and forcing the kernel to be aware (and approve) of the VM-Exit stack shenanigans. [1] cea15c24ca39 ("KVM: Move KVM context switch into own function") [2] a3b5ba49a8c5 ("KVM: VMX: add the __noclone attribute to vmx_vcpu_run") [3] 104f226bfd0a ("KVM: VMX: Fold __vmx_vcpu_run() into vmx_vcpu_run()") [4] 95272c29378e ("compiler-gcc: disable -ftracer for __noclone functions") [5] https://lkml.kernel.org/r/20181218140105.ajuiglkpvstt3qxs@treble [6] https://patchwork.kernel.org/patch/8707981/#21817015 [7] https://lkml.kernel.org/r/ri6y38lo23g.fsf@suse.cz [8] https://lkml.kernel.org/r/20181218212042.GE25620@tassilo.jf.intel.com Suggested-by: Andi Kleen <ak@linux.intel.com> Suggested-by: Martin Jambor <mjambor@suse.cz> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Nadav Amit <namit@vmware.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Martin Jambor <mjambor@suse.cz> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-12-21KVM: VMX: Explicitly reference RCX as the vmx_vcpu pointer in asm blobsSean Christopherson
Use '%% " _ASM_CX"' instead of '%0' to dereference RCX, i.e. the 'struct vcpu_vmx' pointer, in the VM-Enter asm blobs of vmx_vcpu_run() and nested_vmx_check_vmentry_hw(). Using the symbolic name means that adding/removing an output parameter(s) requires "rewriting" almost all of the asm blob, which makes it nearly impossible to understand what's being changed in even the most minor patches. Opportunistically improve the code comments. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
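A compressed illustration of the difference, using a hypothetical stand-in struct rather than the real vcpu_vmx (only the addressing style matters here):

    #include <linux/stddef.h>
    #include <asm/asm.h>        /* _ASM_CX */

    struct vcpu_demo {
        unsigned long launched;  /* hypothetical field */
    };

    static void demo_check_launched(struct vcpu_demo *vmx)
    {
        /*
         * Old style: "cmpl $0, %c[launched](%0)" -- positional %0 renumbers
         * whenever an operand is added or removed. The version below names
         * RCX explicitly via _ASM_CX, so the blob survives operand churn.
         */
        asm volatile("cmpl $0, %c[launched](%%" _ASM_CX ")"
                     :
                     : "c"(vmx),
                       [launched] "i"(offsetof(struct vcpu_demo, launched))
                     : "cc", "memory");
    }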
2018-12-21KVM: x86: Use jmp to invoke kvm_spurious_fault() from .fixupSean Christopherson
____kvm_handle_fault_on_reboot() provides a generic exception fixup handler that is used to cleanly handle faults on VMX/SVM instructions during reboot (or at least try to). If there isn't a reboot in progress, ____kvm_handle_fault_on_reboot() treats any exception as fatal to KVM and invokes kvm_spurious_fault(), which in turn generates a BUG() to get a stack trace and die. When it was originally added by commit 4ecac3fd6dc2 ("KVM: Handle virtualization instruction #UD faults during reboot"), the "call" to kvm_spurious_fault() was handcoded as PUSH+JMP, where the PUSH'd value is the RIP of the faulting instructing. The PUSH+JMP trickery is necessary because the exception fixup handler code lies outside of its associated function, e.g. right after the function. An actual CALL from the .fixup code would show a slightly bogus stack trace, e.g. an extra "random" function would be inserted into the trace, as the return RIP on the stack would point to no known function (and the unwinder will likely try to guess who owns the RIP). Unfortunately, the JMP was replaced with a CALL when the macro was reworked to not spin indefinitely during reboot (commit b7c4145ba2eb "KVM: Don't spin on virt instruction faults during reboot"). This causes the aforementioned behavior where a bogus function is inserted into the stack trace, e.g. my builds like to blame free_kvm_area(). Revert the CALL back to a JMP. The changelog for commit b7c4145ba2eb ("KVM: Don't spin on virt instruction faults during reboot") contains nothing that indicates the switch to CALL was deliberate. This is backed up by the fact that the PUSH <insn RIP> was left intact. Note that an alternative to the PUSH+JMP magic would be to JMP back to the "real" code and CALL from there, but that would require adding a JMP in the non-faulting path to avoid calling kvm_spurious_fault() and would add no value, i.e. the stack trace would be the same. Using CALL: ------------[ cut here ]------------ kernel BUG at /home/sean/go/src/kernel.org/linux/arch/x86/kvm/x86.c:356! invalid opcode: 0000 [#1] SMP CPU: 4 PID: 1057 Comm: qemu-system-x86 Not tainted 4.20.0-rc6+ #75 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:kvm_spurious_fault+0x5/0x10 [kvm] Code: <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 55 49 89 fd 41 RSP: 0018:ffffc900004bbcc8 EFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffffffffff RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff888273fd8000 R08: 00000000000003e8 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000784 R12: ffffc90000371fb0 R13: 0000000000000000 R14: 000000026d763cf4 R15: ffff888273fd8000 FS: 00007f3d69691700(0000) GS:ffff888277800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055f89bc56fe0 CR3: 0000000271a5a001 CR4: 0000000000362ee0 Call Trace: free_kvm_area+0x1044/0x43ea [kvm_intel] ? vmx_vcpu_run+0x156/0x630 [kvm_intel] ? kvm_arch_vcpu_ioctl_run+0x447/0x1a40 [kvm] ? kvm_vcpu_ioctl+0x368/0x5c0 [kvm] ? kvm_vcpu_ioctl+0x368/0x5c0 [kvm] ? __set_task_blocked+0x38/0x90 ? __set_current_blocked+0x50/0x60 ? __fpu__restore_sig+0x97/0x490 ? do_vfs_ioctl+0xa1/0x620 ? __x64_sys_futex+0x89/0x180 ? ksys_ioctl+0x66/0x70 ? __x64_sys_ioctl+0x16/0x20 ? do_syscall_64+0x4f/0x100 ? 
entry_SYSCALL_64_after_hwframe+0x44/0xa9 Modules linked in: vhost_net vhost tap kvm_intel kvm irqbypass bridge stp llc ---[ end trace 9775b14b123b1713 ]--- Using JMP: ------------[ cut here ]------------ kernel BUG at /home/sean/go/src/kernel.org/linux/arch/x86/kvm/x86.c:356! invalid opcode: 0000 [#1] SMP CPU: 6 PID: 1067 Comm: qemu-system-x86 Not tainted 4.20.0-rc6+ #75 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:kvm_spurious_fault+0x5/0x10 [kvm] Code: <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 55 49 89 fd 41 RSP: 0018:ffffc90000497cd0 EFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffffffffff RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff88827058bd40 R08: 00000000000003e8 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000784 R12: ffffc90000369fb0 R13: 0000000000000000 R14: 00000003c8fc6642 R15: ffff88827058bd40 FS: 00007f3d7219e700(0000) GS:ffff888277900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f3d64001000 CR3: 0000000271c6b004 CR4: 0000000000362ee0 Call Trace: vmx_vcpu_run+0x156/0x630 [kvm_intel] ? kvm_arch_vcpu_ioctl_run+0x447/0x1a40 [kvm] ? kvm_vcpu_ioctl+0x368/0x5c0 [kvm] ? kvm_vcpu_ioctl+0x368/0x5c0 [kvm] ? __set_task_blocked+0x38/0x90 ? __set_current_blocked+0x50/0x60 ? __fpu__restore_sig+0x97/0x490 ? do_vfs_ioctl+0xa1/0x620 ? __x64_sys_futex+0x89/0x180 ? ksys_ioctl+0x66/0x70 ? __x64_sys_ioctl+0x16/0x20 ? do_syscall_64+0x4f/0x100 ? entry_SYSCALL_64_after_hwframe+0x44/0xa9 Modules linked in: vhost_net vhost tap kvm_intel kvm irqbypass bridge stp llc ---[ end trace f9daedb85ab3ddba ]--- Fixes: b7c4145ba2eb ("KVM: Don't spin on virt instruction faults during reboot") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
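For context, the fixup pattern described above looks roughly like this after the change (simplified from memory, not a verbatim copy of the macro). On a fault the fixup runs the cleanup instruction, bails out if a reboot is in progress, and otherwise pushes the RIP of the faulting instruction and jumps (rather than calls) into kvm_spurious_fault(), so the unwinder blames the right function:

    /* Simplified sketch of ____kvm_handle_fault_on_reboot() with JMP restored. */
    #define ____kvm_handle_fault_on_reboot(insn, cleanup_insn)   \
        "666: " insn "\n\t"                                       \
        "668: \n\t"                                               \
        ".pushsection .fixup, \"ax\" \n\t"                        \
        "667: \n\t"                                               \
        cleanup_insn "\n\t"                                       \
        "cmpb $0, kvm_rebooting \n\t"                             \
        "jne 668b \n\t"                                           \
        __ASM_SIZE(push) " $666b \n\t"                            \
        "jmp kvm_spurious_fault \n\t"                             \
        ".popsection \n\t"                                        \
        _ASM_EXTABLE(666b, 667b)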
2018-12-21KVM/x86: Use SVM assembly instruction mnemonics instead of .byte streamsUros Bizjak
Recently the minimum required version of binutils was changed to 2.20, which supports all SVM instruction mnemonics. The patch removes all .byte #defines and uses real instruction mnemonics instead. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
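The flavor of the change, using VMLOAD as an example (the opcode bytes and the asm statement are reproduced from memory, so treat them as illustrative):

    #include <asm/asm.h>

    /* Before: opcodes hand-encoded because old binutils lacked the mnemonics. */
    #define SVM_VMLOAD ".byte 0x0f, 0x01, 0xda"

    static inline void demo_vmload(unsigned long vmcb_pa)
    {
        /* After: binutils >= 2.20 understands the mnemonic directly. */
        asm volatile("vmload %%" _ASM_AX : : "a"(vmcb_pa) : "memory");
    }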
2018-12-21KVM/MMU: Flush tlb directly in the kvm_zap_gfn_range()Lan Tianyu
Originally, the TLB flush is done by slot_handle_level_range(). This patch moves the flush directly to kvm_zap_gfn_range() when range flush is available, so that only the requested range needs to be flushed. Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-12-21KVM/MMU: Flush tlb directly in kvm_set_pte_rmapp()Lan Tianyu
Flush the TLB directly in the kvm_set_pte_rmapp() function when Hyper-V remote TLB flush is available, returning 0 so that kvm_mmu_notifier_change_pte() does not flush again. Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-12-21KVM/MMU: Move tlb flush in kvm_set_pte_rmapp() to kvm_mmu_notifier_change_pte()Lan Tianyu
Move the TLB flush in kvm_set_pte_rmapp() to kvm_mmu_notifier_change_pte() in order to avoid a redundant TLB flush. Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-12-21KVM: Make kvm_set_spte_hva() return intLan Tianyu
Make kvm_set_spte_hva() return int so that the caller can check the return value to determine whether to flush the TLB. Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-12-21KVM: Replace old tlb flush function with new one to flush a specified range.Lan Tianyu
Replace kvm_flush_remote_tlbs() with kvm_flush_remote_tlbs_with_address() in some functions, with no change in logic. Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-12-21KVM/MMU: Add tlb flush with range helper functionLan Tianyu
Add wrapper functions for the tlb_remote_flush_with_range callback and flush the TLB directly in kvm_mmu_zap_collapsible_spte(). kvm_mmu_zap_collapsible_spte() returns a flush request to slot_handle_leaf(), and the latter flushes on demand. When range flush is available, make kvm_mmu_zap_collapsible_spte() flush the TLB with the range directly, to avoid returning the range back to slot_handle_leaf(). Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
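A sketch of the helper pair (close to the description above, but not necessarily verbatim mmu.c code; struct kvm_tlb_range is the two-field range introduced by the kvm_x86_ops callback patch listed further down):

    static void kvm_flush_remote_tlbs_with_range(struct kvm *kvm,
                                                 struct kvm_tlb_range *range)
    {
        int ret = -ENOTSUPP;

        if (range && kvm_x86_ops->tlb_remote_flush_with_range)
            ret = kvm_x86_ops->tlb_remote_flush_with_range(kvm, range);

        /* Fall back to a full remote flush if range flush is unavailable. */
        if (ret)
            kvm_flush_remote_tlbs(kvm);
    }

    static void kvm_flush_remote_tlbs_with_address(struct kvm *kvm,
                                                   u64 start_gfn, u64 pages)
    {
        struct kvm_tlb_range range;

        range.start_gfn = start_gfn;
        range.pages = pages;

        kvm_flush_remote_tlbs_with_range(kvm, &range);
    }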
2018-12-21KVM/VMX: Add hv tlb range flush supportLan Tianyu
Register the tlb_remote_flush_with_range callback, implemented on top of the Hyper-V TLB range flush interface. Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-12-21x86/hyper-v: Add HvFlushGuestAddressList hypercall supportLan Tianyu
Hyper-V provides the HvFlushGuestAddressList() hypercall to flush the EPT TLB for specified ranges. Add support for this hypercall. Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-12-21KVM: Add tlb_remote_flush_with_range callback in kvm_x86_opsLan Tianyu
Add a range-flush callback to kvm_x86_ops so that a platform can use it to register its associated function. The "kvm_tlb_range" parameter accepts either a single range or a flush list which contains a list of ranges. Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
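Concretely, the new hook and its parameter look roughly like this (field names are taken from the description; the kvm_x86_ops excerpt is abbreviated):

    struct kvm_tlb_range {
        u64 start_gfn;   /* first gfn in the range */
        u64 pages;       /* number of pages to flush */
    };

    /* Excerpt: the new kvm_x86_ops member a platform (e.g. VMX on Hyper-V) registers. */
    int (*tlb_remote_flush_with_range)(struct kvm *kvm,
                                       struct kvm_tlb_range *range);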
2018-12-21KVM: x86: Disable Intel PT when VMXON in L1 guestLuwei Kang
Currently, Intel Processor Trace does not support tracing in L1 guest VMX operation (IA32_VMX_MISC[bit 14] is 0). As mentioned in the SDM, on these types of processors, execution of the VMXON instruction clears IA32_RTIT_CTL.TraceEn, and any attempt to write IA32_RTIT_CTL causes a general-protection exception (#GP). Signed-off-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-12-21KVM: x86: Set intercept for Intel PT MSRs read/writeChao Peng
To save performance overhead, disable interception of Intel PT MSR reads/writes when Intel PT is enabled in the guest. MSR_IA32_RTIT_CTL is an exception and will always be intercepted. Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com> Signed-off-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-12-21KVM: x86: Implement Intel PT MSRs read/write emulationChao Peng
Implement Intel Processor Trace MSR read/write emulation. Intel PT MSR reads/writes need to be emulated when the Intel PT MSRs are intercepted in the guest and during live migration. Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com> Signed-off-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-12-21KVM: x86: Introduce a function to initialize the PT configurationLuwei Kang
Initialize the Intel PT configuration on CPUID update. This includes the CPUID information, the rtit_ctl bit mask and the number of address ranges. Signed-off-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-12-21KVM: x86: Add Intel PT context switch for each vcpuChao Peng
Load/store the Intel Processor Trace registers on context switch. MSR IA32_RTIT_CTL is loaded/stored automatically from the VMCS. In Host-Guest mode, we need to load/restore the PT MSRs only when PT is enabled in the guest. Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com> Signed-off-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-12-21KVM: x86: Add Intel Processor Trace cpuid emulationChao Peng
Expose Intel Processor Trace to the guest only when PT works in Host-Guest mode. Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com> Signed-off-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-12-21KVM: x86: Add Intel PT virtualization work modeChao Peng
Intel Processor Trace virtualization can work in one of 2 possible modes: a. System-Wide mode (default): When the host configures Intel PT to collect trace packets of the entire system, it can leave the relevant VMX controls clear to allow VMX-specific packets to provide information across VMX transitions. The KVM guest will not be aware of this feature in this mode, and both host and KVM guest traces will be output to the host buffer. b. Host-Guest mode: The host can configure trace-packet generation while in VMX non-root operation for guests and in VMX root operation for normal native execution. Intel PT will be exposed to the KVM guest in this mode, and the trace is output to the respective buffers of host and guest. In this mode, the status of PT will be saved and disabled before VM-entry and restored after VM-exit when tracing a virtual machine. Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com> Signed-off-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
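The two modes map naturally onto a module-level selector; a sketch (the constant names are assumptions, and the real code picks the mode e.g. from a module parameter and hardware support):

    /* Sketch of the two operating modes described above. */
    enum pt_mode {
        PT_MODE_SYSTEM = 0,     /* host owns PT; guests never see it */
        PT_MODE_HOST_GUEST,     /* PT state switched across VM-entry/VM-exit */
    };

    static int pt_mode = PT_MODE_SYSTEM;   /* chosen once at kvm_intel load */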
2018-12-21perf/x86/intel/pt: add new capability for Intel PTLuwei Kang
This adds support for the "output to Trace Transport subsystem" capability of Intel PT. It means that PT can output its trace to an MMIO address range rather than a system memory buffer. Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Luwei Kang <luwei.kang@intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-12-21perf/x86/intel/pt: Add new bit definitions for PT MSRsLuwei Kang
Add bit definitions for the Intel PT MSRs that support trace output directed to the memory subsystem and that hold a count of packet bytes that have been sent out. These are required by the upcoming PT support in KVM guests for MSR read/write emulation. Signed-off-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-12-21perf/x86/intel/pt: Introduce intel_pt_validate_cap()Luwei Kang
intel_pt_validate_hw_cap() validates whether a given PT capability is supported by the hardware. It checks the PT capability array which reflects the capabilities of the hardware on which the code is executed. For setting up PT for KVM guests this is not correct as the capability array for the guest can be different from the host array. Provide a new function to check against a given capability array. Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
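A sketch of the resulting split (identifiers follow the existing pt.c naming, but the lookup details are simplified):

    /* Check a capability against an arbitrary capability array... */
    u32 intel_pt_validate_cap(u32 *caps, enum pt_capabilities capability)
    {
        struct pt_cap_desc *cd = &pt_caps[capability];
        u32 c = caps[cd->leaf * PT_CPUID_REGS_NUM + cd->reg];
        unsigned int shift = __ffs(cd->mask);

        return (c & cd->mask) >> shift;
    }

    /* ...while the existing hardware check becomes a thin wrapper over the host array. */
    u32 intel_pt_validate_hw_cap(enum pt_capabilities cap)
    {
        return intel_pt_validate_cap(pt_pmu.caps, cap);
    }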