summaryrefslogtreecommitdiff
path: root/arch/x86
AgeCommit message (Collapse)Author
2018-12-07preempt: Move PREEMPT_NEED_RESCHED definition into arch codeWill Deacon
PREEMPT_NEED_RESCHED is never used directly, so move it into the arch code where it can potentially be implemented using either a different bit in the preempt count or as an entirely separate entity. Cc: Robert Love <rml@tech9.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06kprobes/x86: Blacklist non-attachable interrupt functionsAndrea Righi
These interrupt functions are already non-attachable by kprobes. Blacklist them explicitly so that they can show up in /sys/kernel/debug/kprobes/blacklist and tools like BCC can use this additional information. Signed-off-by: Andrea Righi <righi.andrea@gmail.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: David S. Miller <davem@davemloft.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yonghong Song <yhs@fb.com> Link: http://lkml.kernel.org/r/20181206095648.GA8249@Dell Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-12-06arch: switch the default on ARCH_HAS_SG_CHAINChristoph Hellwig
These days architectures are mostly out of the business of dealing with struct scatterlist at all, unless they have architecture specific iommu drivers. Replace the ARCH_HAS_SG_CHAIN symbol with a ARCH_NO_SG_CHAIN one only enabled for architectures with horrible legacy iommu drivers like alpha and parisc, and conditionally for arm which wants to keep it disable for legacy platforms. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Palmer Dabbelt <palmer@sifive.com>
2018-12-06x86/calgary: remove the mapping_error dma_map_ops methodChristoph Hellwig
Return DMA_MAPPING_ERROR instead of the magic bad_dma_addr on a dma mapping failure and let the core dma-mapping code handle the rest. Remove the magic EMERGENCY_PAGES that the bad_dma_addr gets redirected to. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-06x86/amd_gart: remove the mapping_error dma_map_ops methodChristoph Hellwig
Return DMA_MAPPING_ERROR instead of the magic bad_dma_addr on a dma mapping failure and let the core dma-mapping code handle the rest. Remove the magic EMERGENCY_PAGES that the bad_dma_addr gets redirected to. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-06x86/mce: Unify pr_* prefixBorislav Petkov
Move the pr_fmt prefix to internal.h and include it everywhere. This way, all pr_* printed strings will be prepended with "mce: ". No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20181205200913.GR29510@zn.tnic
2018-12-06x86/speculation: Change misspelled STIPB to STIBPWaiman Long
STIBP stands for Single Thread Indirect Branch Predictors. The acronym, however, can be easily mis-spelled as STIPB. It is perhaps due to the presence of another related term - IBPB (Indirect Branch Predictor Barrier). Fix the mis-spelling in the code. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: KarimAllah Ahmed <karahmed@amazon.de> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/1544039368-9009-1-git-send-email-longman@redhat.com
2018-12-05x86/mce: Streamline MCE subsystem's namingBorislav Petkov
Rename the containing folder to "mce" which is the most widespread name. Drop the "mce[-_]" filename prefix of some compilation units (while others don't have it). This unifies the file naming in the MCE subsystem: mce/ |-- amd.c |-- apei.c |-- core.c |-- dev-mcelog.c |-- genpool.c |-- inject.c |-- intel.c |-- internal.h |-- Makefile |-- p5.c |-- severity.c |-- therm_throt.c |-- threshold.c `-- winchip.c No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Acked-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20181205141323.14995-1-bp@alien8.de
2018-12-05x86/mm: Drop usage of __flush_tlb_all() in kernel_physical_mapping_init()Dan Williams
Commit: f77084d96355 "x86/mm/pat: Disable preemption around __flush_tlb_all()" addressed a case where __flush_tlb_all() is called without preemption being disabled. It also left a warning to catch other cases where preemption is not disabled. That warning triggers for the memory hotplug path which is also used for persistent memory enabling: WARNING: CPU: 35 PID: 911 at ./arch/x86/include/asm/tlbflush.h:460 RIP: 0010:__flush_tlb_all+0x1b/0x3a [..] Call Trace: phys_pud_init+0x29c/0x2bb kernel_physical_mapping_init+0xfc/0x219 init_memory_mapping+0x1a5/0x3b0 arch_add_memory+0x2c/0x50 devm_memremap_pages+0x3aa/0x610 pmem_attach_disk+0x585/0x700 [nd_pmem] Andy wondered why a path that can sleep was using __flush_tlb_all() [1] and Dave confirmed the expectation for TLB flush is for modifying / invalidating existing PTE entries, but not initial population [2]. Drop the usage of __flush_tlb_all() in phys_{p4d,pud,pmd}_init() on the expectation that this path is only ever populating empty entries for the linear map. Note, at linear map teardown time there is a call to the all-cpu flush_tlb_all() to invalidate the removed mappings. [1]: https://lkml.kernel.org/r/9DFD717D-857D-493D-A606-B635D72BAC21@amacapital.net [2]: https://lkml.kernel.org/r/749919a4-cdb1-48a3-adb4-adb81a5fa0b5@intel.com [ mingo: Minor readability edits. ] Suggested-by: Dave Hansen <dave.hansen@linux.intel.com> Reported-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: <stable@vger.kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dave.hansen@intel.com Fixes: f77084d96355 ("x86/mm/pat: Disable preemption around __flush_tlb_all()") Link: http://lkml.kernel.org/r/154395944713.32119.15611079023837132638.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-12-05x86/mm: Validate kernel_physical_mapping_init() PTE populationDan Williams
The usage of __flush_tlb_all() in the kernel_physical_mapping_init() path is not necessary. In general flushing the TLB is not required when updating an entry from the !present state. However, to give confidence in the future removal of TLB flushing in this path, use the new set_pte_safe() family of helpers to assert that the !present assumption is true in this path. [ mingo: Minor readability edits. ] Suggested-by: Peter Zijlstra <peterz@infradead.org> Suggested-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Rik van Riel <riel@surriel.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/154395944177.32119.8524957429632012270.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-12-05x86/vdso: Remove a stale/misleading comment from the linker scriptSean Christopherson
Once upon a time, vdso2c aggressively stripped data from the vDSO image when generating the final userspace image. This included stripping the .altinstructions and .altinstr_replacement sections. Eventually, the stripping process reverted to "objdump -S" and no longer removed the aforementioned sections, but the comment remained. Keeping the .alt* sections at the end of the PT_LOAD segment is no longer necessary, but there's no harm in doing so and it's a helpful reminder that they don't need to be included in the final vDSO image, i.e. someone may want to take another stab at zapping/stripping the unneeded sections. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: da861e18eccc ("x86, vdso: Get rid of the fake section mechanism") Link: http://lkml.kernel.org/r/20181204212600.28090-3-sean.j.christopherson@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-12-05x86/vdso: Remove obsolete "fake section table" reservationSean Christopherson
At one point the vDSO image was manually stripped down by vdso2c in an attempt to minimize the size of the image mapped into userspace. Part of that stripping process involved building a fake section table so as not to break userspace processes that parse the section table. Memory for the fake section table was reserved in the .rodata section so that vdso2c could simply copy the entire PT_LOAD segment into the userspace image after building the fake table. Eventually, the entire fake section table approach was dropped in favor of stripping the vdso "the old fashioned way", i.e. via objdump -S. But, the reservation in .rodata for the fake table was left behind. Remove the reserveration along with a few other related defines and section entries. Removing the fake section table placeholder zaps a whopping 0x340 bytes from the 64-bit vDSO image, which drops the current image's size to under 4k, i.e. reduces the effective size of the userspace vDSO mapping by a full page. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: da861e18eccc ("x86, vdso: Get rid of the fake section mechanism") Link: http://lkml.kernel.org/r/20181204212600.28090-2-sean.j.christopherson@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-12-05x86/umip: Make the UMIP activated message genericLendacky, Thomas
The User Mode Instruction Prevention (UMIP) feature is part of the x86_64 instruction set architecture and not specific to Intel. Make the message generic. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-12-05x86/build: Fix compiler support check for CONFIG_RETPOLINEMasahiro Yamada
It is troublesome to add a diagnostic like this to the Makefile parse stage because the top-level Makefile could be parsed with a stale include/config/auto.conf. Once you are hit by the error about non-retpoline compiler, the compilation still breaks even after disabling CONFIG_RETPOLINE. The easiest fix is to move this check to the "archprepare" like this commit did: 829fe4aa9ac1 ("x86: Allow generating user-space headers without a compiler") Reported-by: Meelis Roos <mroos@linux.ee> Tested-by: Meelis Roos <mroos@linux.ee> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Zhenzhong Duan <zhenzhong.duan@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Zhenzhong Duan <zhenzhong.duan@oracle.com> Fixes: 4cd24de3a098 ("x86/retpoline: Make CONFIG_RETPOLINE depend on compiler support") Link: http://lkml.kernel.org/r/1543991239-18476-1-git-send-email-yamada.masahiro@socionext.com Link: https://lkml.org/lkml/2018/12/4/206 Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-12-04x86/fpu: Don't export __kernel_fpu_{begin,end}()Sebastian Andrzej Siewior
There is one user of __kernel_fpu_begin() and before invoking it, it invokes preempt_disable(). So it could invoke kernel_fpu_begin() right away. The 32bit version of arch_efi_call_virt_setup() and arch_efi_call_virt_teardown() does this already. The comment above *kernel_fpu*() claims that before invoking __kernel_fpu_begin() preemption should be disabled and that KVM is a good example of doing it. Well, KVM doesn't do that since commit f775b13eedee2 ("x86,kvm: move qemu/guest FPU switching out to vcpu_run") so it is not an example anymore. With EFI gone as the last user of __kernel_fpu_{begin|end}(), both can be made static and not exported anymore. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Rik van Riel <riel@surriel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Nicolai Stange <nstange@suse.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm ML <kvm@vger.kernel.org> Cc: linux-efi <linux-efi@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20181129150210.2k4mawt37ow6c2vq@linutronix.de
2018-12-04x86/hpet: Remove unused FSEC_PER_NSEC defineRoland Dreier
The FSEC_PER_NSEC macro has had zero users since commit ab0e08f15d23 ("x86: hpet: Cleanup the clockevents init and register code"). Remove it. Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20181130211450.5200-1-roland@purestorage.com
2018-12-04kprobes/x86: Fix instruction patching corruption when copying more than one ↵Masami Hiramatsu
RIP-relative instruction After copy_optimized_instructions() copies several instructions to the working buffer it tries to fix up the real RIP address, but it adjusts the RIP-relative instruction with an incorrect RIP address for the 2nd and subsequent instructions due to a bug in the logic. This will break the kernel pretty badly (with likely outcomes such as a kernel freeze, a crash, or worse) because probed instructions can refer to the wrong data. For example putting kprobes on cpumask_next() typically hits this bug. cpumask_next() is normally like below if CONFIG_CPUMASK_OFFSTACK=y (in this case nr_cpumask_bits is an alias of nr_cpu_ids): <cpumask_next>: 48 89 f0 mov %rsi,%rax 8b 35 7b fb e2 00 mov 0xe2fb7b(%rip),%esi # ffffffff82db9e64 <nr_cpu_ids> 55 push %rbp ... If we put a kprobe on it and it gets jump-optimized, it gets patched by the kprobes code like this: <cpumask_next>: e9 95 7d 07 1e jmpq 0xffffffffa000207a 7b fb jnp 0xffffffff81f8a2e2 <cpumask_next+2> e2 00 loop 0xffffffff81f8a2e9 <cpumask_next+9> 55 push %rbp This shows that the first two MOV instructions were copied to a trampoline buffer at 0xffffffffa000207a. Here is the disassembled result of the trampoline, skipping the optprobe template instructions: # Dump of assembly code from 0xffffffffa000207a to 0xffffffffa00020ea: 54 push %rsp ... 48 83 c4 08 add $0x8,%rsp 9d popfq 48 89 f0 mov %rsi,%rax 8b 35 82 7d db e2 mov -0x1d24827e(%rip),%esi # 0xffffffff82db9e67 <nr_cpu_ids+3> This dump shows that the second MOV accesses *(nr_cpu_ids+3) instead of the original *nr_cpu_ids. This leads to a kernel freeze because cpumask_next() always returns 0 and for_each_cpu() never ends. Fix this by adding 'len' correctly to the real RIP address while copying. [ mingo: Improved the changelog. ] Reported-by: Michael Rodin <michael@rodin.online> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org # v4.15+ Fixes: 63fef14fc98a ("kprobes/x86: Make insn buffer always ROX and use text_poke()") Link: http://lkml.kernel.org/r/153504457253.22602.1314289671019919596.stgit@devbox Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-12-04Merge branch 'for-mingo' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU changes from Paul E. McKenney: - Convert RCU's BUG_ON() and similar calls to WARN_ON() and similar. - Replace calls of RCU-bh and RCU-sched update-side functions to their vanilla RCU counterparts. This series is a step towards complete removal of the RCU-bh and RCU-sched update-side functions. ( Note that some of these conversions are going upstream via their respective maintainers. ) - Documentation updates, including a number of flavor-consolidation updates from Joel Fernandes. - Miscellaneous fixes. - Automate generation of the initrd filesystem used for rcutorture testing. - Convert spin_is_locked() assertions to instead use lockdep. ( Note that some of these conversions are going upstream via their respective maintainers. ) - SRCU updates, especially including a fix from Dennis Krein for a bag-on-head-class bug. - RCU torture-test updates. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-12-03x86/fpu: Update comment for __raw_xsave_addr()Sebastian Andrzej Siewior
The comment above __raw_xsave_addr() claims that the function does not work for compacted buffers and was introduced in: b8b9b6ba9dec3 ("x86/fpu: Allow setting of XSAVE state") In this commit, the function was factored out of get_xsave_addr() and this function claims that it works with "standard format or compacted format of xsave area". It accesses the "xstate_comp_offsets" variable for the actual offset and it was introduced in commit 7496d6458fe32 ("Define kernel API to get address of each state in xsave area") Based on the code (back then and now): - xstate_offsets holds the standard offset. - if compacted mode is not supported then xstate_comp_offsets gets the xstate_offsets copied. - if compacted mode is supported then xstate_comp_offsets will hold the offset for the compacted buffer. Based on that the function works for compacted buffers as long as the CPU supports it and this what we care about. Remove the "Note:" which is not accurate. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Rik van Riel <riel@surriel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm ML <kvm@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20181128222035.2996-7-bigeasy@linutronix.de
2018-12-03x86/fpu: Add might_fault() to user_insn()Sebastian Andrzej Siewior
Every user of user_insn() passes an user memory pointer to this macro. Add might_fault() to user_insn() so we can spot users which are using this macro in sections where page faulting is not allowed. [ bp: Space it out to make it more visible. ] Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Rik van Riel <riel@surriel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm ML <kvm@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20181128222035.2996-6-bigeasy@linutronix.de
2018-12-03x86/pkeys: Make init_pkru_value staticSebastian Andrzej Siewior
The variable init_pkru_value isn't used outside of this file. Make it static. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Rik van Riel <riel@surriel.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm ML <kvm@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20181128222035.2996-5-bigeasy@linutronix.de
2018-12-03x86/thread_info: Remove _TIF_ALLWORK_MASKSebastian Andrzej Siewior
There is no user of _TIF_ALLWORK_MASK since commit 21d375b6b34ff ("x86/entry/64: Remove the SYSCALL64 fast path"). Remove the unused define _TIF_ALLWORK_MASK. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Rik van Riel <riel@surriel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Howells <dhowells@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: kvm ML <kvm@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20181128222035.2996-4-bigeasy@linutronix.de
2018-12-03x86/process/32: Remove asm/math_emu.h includeSebastian Andrzej Siewior
The math_emu.h header files contains the definition of struct math_emu_info which is not used in this file. Remove the asm/math_emu.h include. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Rik van Riel <riel@surriel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm ML <kvm@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20181128222035.2996-3-bigeasy@linutronix.de
2018-12-03x86/fpu: Use unsigned long long shift in xfeature_uncompacted_offset()Sebastian Andrzej Siewior
The xfeature mask is 64-bit so a shift from a number to its mask should have ULL suffix or else bits above position 31 will be lost. This is not a problem now but should XFEATURE_MASK_SUPERVISOR gain a bit >31 then this check won't catch it. Use BIT_ULL() to compute a mask from a number. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Rik van Riel <riel@surriel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm ML <kvm@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20181128222035.2996-2-bigeasy@linutronix.de
2018-12-03x86/umip: Print UMIP line only onceBorislav Petkov
... instead of issuing it per CPU and flooding dmesg unnecessarily. Streamline the formulation, while at it. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20181127205936.30331-1-bp@alien8.de
2018-12-03x86/boot: Clear RSDP address in boot_params for broken loadersJuergen Gross
Gunnar Krueger reported a systemd-boot failure and bisected it down to: e6e094e053af75 ("x86/acpi, x86/boot: Take RSDP address from boot params if available") In case a broken boot loader doesn't clear its 'struct boot_params', clear rsdp_addr in sanitize_boot_params(). Reported-by: Gunnar Krueger <taijian@posteo.de> Tested-by: Gunnar Krueger <taijian@posteo.de> Signed-off-by: Juergen Gross <jgross@suse.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: bp@alien8.de Cc: sstabellini@kernel.org Fixes: e6e094e053af75 ("x86/acpi, x86/boot: Take RSDP address from boot params if available") Link: http://lkml.kernel.org/r/20181203103811.17056-1-jgross@suse.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-12-03x86: Fix various typos in commentsIngo Molnar
Go over arch/x86/ and fix common typos in comments, and a typo in an actual function argument name. No change in functionality intended. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-12-03Merge tag 'v4.20-rc5' into x86/cleanups, to sync up the treeIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-12-02Merge tag 'for-linus-4.20a-rc5-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: - A revert of a previous commit as it is no longer necessary and has shown to cause problems in some memory hotplug cases. - Some small fixes and a minor cleanup. - A patch for adding better diagnostic data in a very rare failure case. * tag 'for-linus-4.20a-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: pvcalls-front: fixes incorrect error handling Revert "xen/balloon: Mark unallocated host memory as UNUSABLE" xen: xlate_mmu: add missing header to fix 'W=1' warning xen/x86: add diagnostic printout to xen_mc_flush() in case of error x86/xen: cleanup includes in arch/x86/xen/spinlock.c
2018-12-01Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull STIBP fallout fixes from Thomas Gleixner: "The performance destruction department finally got it's act together and came up with a cure for the STIPB regression: - Provide a command line option to control the spectre v2 user space mitigations. Default is either seccomp or prctl (if seccomp is disabled in Kconfig). prctl allows mitigation opt-in, seccomp enables the migitation for sandboxed processes. - Rework the code to handle the conditional STIBP/IBPB control and remove the now unused ptrace_may_access_sched() optimization attempt - Disable STIBP automatically when SMT is disabled - Optimize the switch_to() logic to avoid MSR writes and invocations of __switch_to_xtra(). - Make the asynchronous speculation TIF updates synchronous to prevent stale mitigation state. As a general cleanup this also makes retpoline directly depend on compiler support and removes the 'minimal retpoline' option which just pretended to provide some form of security while providing none" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits) x86/speculation: Provide IBPB always command line options x86/speculation: Add seccomp Spectre v2 user space protection mode x86/speculation: Enable prctl mode for spectre_v2_user x86/speculation: Add prctl() control for indirect branch speculation x86/speculation: Prepare arch_smt_update() for PRCTL mode x86/speculation: Prevent stale SPEC_CTRL msr content x86/speculation: Split out TIF update ptrace: Remove unused ptrace_may_access_sched() and MODE_IBRS x86/speculation: Prepare for conditional IBPB in switch_mm() x86/speculation: Avoid __switch_to_xtra() calls x86/process: Consolidate and simplify switch_to_xtra() code x86/speculation: Prepare for per task indirect branch speculation control x86/speculation: Add command line control for indirect branch speculation x86/speculation: Unify conditional spectre v2 print functions x86/speculataion: Mark command line parser data __initdata x86/speculation: Mark string arrays const correctly x86/speculation: Reorder the spec_v2 code x86/l1tf: Show actual SMT state x86/speculation: Rework SMT state change sched/smt: Expose sched_smt_present static key ...
2018-12-01kbuild: fix UML build error with CONFIG_GCC_PLUGINSMasahiro Yamada
UML fails to build with CONFIG_GCC_PLUGINS=y. $ make -s ARCH=um mrproper $ make -s ARCH=um allmodconfig $ make ARCH=um UPD include/generated/uapi/linux/version.h WRAP arch/x86/include/generated/uapi/asm/bpf_perf_event.h WRAP arch/x86/include/generated/uapi/asm/poll.h WRAP arch/x86/include/generated/asm/dma-contiguous.h WRAP arch/x86/include/generated/asm/early_ioremap.h WRAP arch/x86/include/generated/asm/export.h WRAP arch/x86/include/generated/asm/mcs_spinlock.h WRAP arch/x86/include/generated/asm/mm-arch-hooks.h SYSTBL arch/x86/include/generated/asm/syscalls_32.h SYSHDR arch/x86/include/generated/asm/unistd_32_ia32.h SYSHDR arch/x86/include/generated/asm/unistd_64_x32.h SYSTBL arch/x86/include/generated/asm/syscalls_64.h SYSHDR arch/x86/include/generated/uapi/asm/unistd_32.h SYSHDR arch/x86/include/generated/uapi/asm/unistd_64.h SYSHDR arch/x86/include/generated/uapi/asm/unistd_x32.h HOSTCC scripts/unifdef CC arch/x86/um/user-offsets.s cc1: error: cannot load plugin ./scripts/gcc-plugins/cyc_complexity_plugin.so ./scripts/gcc-plugins/cyc_complexity_plugin.so: cannot open shared object file: No such file or directory cc1: error: cannot load plugin ./scripts/gcc-plugins/structleak_plugin.so ./scripts/gcc-plugins/structleak_plugin.so: cannot open shared object file: No such file or directory cc1: error: cannot load plugin ./scripts/gcc-plugins/latent_entropy_plugin.so ./scripts/gcc-plugins/latent_entropy_plugin.so: cannot open shared object file: No such file or directory cc1: error: cannot load plugin ./scripts/gcc-plugins/randomize_layout_plugin.so ./scripts/gcc-plugins/randomize_layout_plugin.so: cannot open shared object file: No such file or directory make[1]: *** [scripts/Makefile.build;119: arch/x86/um/user-offsets.s] Error 1 make: *** [arch/um/Makefile;152: arch/x86/um/user-offsets.s] Error 2 Reorder the preparation stage (with cleanups) to make sure gcc-plugins is built before descending to arch/x86/um/. Fixes: 6b90bd4ba40b ("GCC plugin infrastructure") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-11-30Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc fixes: - MCE related boot crash fix on certain AMD systems - FPU exception handling fix - FPU handling race fix - revert+rewrite of the RSDP boot protocol extension, use boot_params instead - documentation fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/MCE/AMD: Fix the thresholding machinery initialization order x86/fpu: Use the correct exception table macro in the XSTATE_OP wrapper x86/fpu: Disable bottom halves while loading FPU registers x86/acpi, x86/boot: Take RSDP address from boot params if available x86/boot: Mostly revert commit ae7e1238e68f2a ("Add ACPI RSDP address to setup_header") x86/ptrace: Fix documentation for tracehook_report_syscall_entry()
2018-11-30Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Misc fixes: - counter freezing related regression fix - uprobes race fix - Intel PMU unusual event combination fix - .. and diverse tooling fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: uprobes: Fix handle_swbp() vs. unregister() + register() race once more perf/x86/intel: Disallow precise_ip on BTS events perf/x86/intel: Add generic branch tracing check to intel_pmu_has_bts() perf/x86/intel: Move branch tracing setup to the Intel-specific source file perf/x86/intel: Fix regression by default disabling perfmon v4 interrupt handling perf tools beauty ioctl: Support new ISO7816 commands tools uapi asm-generic: Synchronize ioctls.h tools arch x86: Update tools's copy of cpufeatures.h tools headers uapi: Synchronize i915_drm.h perf tools: Restore proper cwd on return from mnt namespace tools build feature: Check if get_current_dir_name() is available perf tools: Fix crash on synthesizing the unit
2018-11-30Merge tag 'trace-v4.20-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "While rewriting the function graph tracer, I discovered a design flaw that was introduced by a patch that tried to fix one bug, but by doing so created another bug. As both bugs corrupt the output (but they do not crash the kernel), I decided to fix the design such that it could have both bugs fixed. The original fix, fixed time reporting of the function graph tracer when doing a max_depth of one. This was code that can test how much the kernel interferes with userspace. But in doing so, it could corrupt the time keeping of the function profiler. The issue is that the curr_ret_stack variable was being used for two different meanings. One was to keep track of the stack pointer on the ret_stack (shadow stack used by the function graph tracer), and the other use case was the graph call depth. Although, the two may be closely related, where they got updated was the issue that lead to the two different bugs that required the two use cases to be updated differently. The big issue with this fix is that it requires changing each architecture. The good news is, I was able to remove a lot of code that was duplicated within the architectures and place it into a single location. Then I could make the fix in one place. I pushed this code into linux-next to let it settle over a week, and before doing so, I cross compiled all the affected architectures to make sure that they built fine. In the mean time, I also pulled in a patch that fixes the sched_switch previous tasks state output, that was not actually correct" * tag 'trace-v4.20-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: sched, trace: Fix prev_state output in sched_switch tracepoint function_graph: Have profiler use curr_ret_stack and not depth function_graph: Reverse the order of pushing the ret_stack and the callback function_graph: Move return callback before update of curr_ret_stack function_graph: Use new curr_ret_depth to manage depth instead of curr_ret_stack function_graph: Make ftrace_push_return_trace() static sparc/function_graph: Simplify with function_graph_enter() sh/function_graph: Simplify with function_graph_enter() s390/function_graph: Simplify with function_graph_enter() riscv/function_graph: Simplify with function_graph_enter() powerpc/function_graph: Simplify with function_graph_enter() parisc: function_graph: Simplify with function_graph_enter() nds32: function_graph: Simplify with function_graph_enter() MIPS: function_graph: Simplify with function_graph_enter() microblaze: function_graph: Simplify with function_graph_enter() arm64: function_graph: Simplify with function_graph_enter() ARM: function_graph: Simplify with function_graph_enter() x86/function_graph: Simplify with function_graph_enter() function_graph: Create function_graph_enter() to consolidate architecture code
2018-11-30x86/efi: Move efi_<reserve/free>_boot_services() to arch/x86Sai Praneeth Prakhya
efi_<reserve/free>_boot_services() are x86 specific quirks and as such should be in asm/efi.h, so move them from linux/efi.h. Also, call efi_free_boot_services() from __efi_enter_virtual_mode() as it is x86 specific call and ideally shouldn't be part of init/main.c Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arend van Spriel <arend.vanspriel@broadcom.com> Cc: Bhupesh Sharma <bhsharma@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Eric Snowberg <eric.snowberg@oracle.com> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Joe Perches <joe@perches.com> Cc: Jon Hunter <jonathanh@nvidia.com> Cc: Julien Thierry <julien.thierry@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: YiFei Zhu <zhuyifei1999@gmail.com> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20181129171230.18699-7-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-11-30x86/efi: Unmap EFI boot services code/data regions from efi_pgdSai Praneeth Prakhya
efi_free_boot_services(), as the name suggests, frees EFI boot services code/data regions but forgets to unmap these regions from efi_pgd. This means that any code that's running in efi_pgd address space (e.g: any EFI runtime service) would still be able to access these regions but the contents of these regions would have long been over written by someone else. So, it's important to unmap these regions. Hence, introduce efi_unmap_pages() to unmap these regions from efi_pgd. After unmapping EFI boot services code/data regions, any illegal access by buggy firmware to these regions would result in page fault which will be handled by EFI specific fault handler. Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arend van Spriel <arend.vanspriel@broadcom.com> Cc: Bhupesh Sharma <bhsharma@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Eric Snowberg <eric.snowberg@oracle.com> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Joe Perches <joe@perches.com> Cc: Jon Hunter <jonathanh@nvidia.com> Cc: Julien Thierry <julien.thierry@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: YiFei Zhu <zhuyifei1999@gmail.com> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20181129171230.18699-6-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-11-30x86/mm/pageattr: Introduce helper function to unmap EFI boot servicesSai Praneeth Prakhya
Ideally, after kernel assumes control of the platform, firmware shouldn't access EFI boot services code/data regions. But, it's noticed that this is not so true in many x86 platforms. Hence, during boot, kernel reserves EFI boot services code/data regions [1] and maps [2] them to efi_pgd so that call to set_virtual_address_map() doesn't fail. After returning from set_virtual_address_map(), kernel frees the reserved regions [3] but they still remain mapped. Hence, introduce kernel_unmap_pages_in_pgd() which will later be used to unmap EFI boot services code/data regions. While at it modify kernel_map_pages_in_pgd() by: 1. Adding __init modifier because it's always used *only* during boot. 2. Add a warning if it's used after SMP is initialized because it uses __flush_tlb_all() which flushes mappings only on current CPU. Unmapping EFI boot services code/data regions will result in clearing PAGE_PRESENT bit and it shouldn't bother L1TF cases because it's already handled by protnone_mask() at arch/x86/include/asm/pgtable-invert.h. [1] efi_reserve_boot_services() [2] efi_map_region() -> __map_region() -> kernel_map_pages_in_pgd() [3] efi_free_boot_services() Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arend van Spriel <arend.vanspriel@broadcom.com> Cc: Bhupesh Sharma <bhsharma@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Eric Snowberg <eric.snowberg@oracle.com> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Joe Perches <joe@perches.com> Cc: Jon Hunter <jonathanh@nvidia.com> Cc: Julien Thierry <julien.thierry@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: YiFei Zhu <zhuyifei1999@gmail.com> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20181129171230.18699-5-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-11-30x86/earlyprintk/efi: Fix infinite loop on some screen widthsYiFei Zhu
An affected screen resolution is 1366 x 768, which width is not divisible by 8, the default font width. On such screens, when longer lines are earlyprintk'ed, overflow-to-next-line can never trigger, due to the left-most x-coordinate of the next character always less than the screen width. Earlyprintk will infinite loop in trying to print the rest of the string but unable to, due to the line being full. This patch makes the trigger consider the right-most x-coordinate, instead of left-most, as the value to compare against the screen width threshold. Signed-off-by: YiFei Zhu <zhuyifei1999@gmail.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arend van Spriel <arend.vanspriel@broadcom.com> Cc: Bhupesh Sharma <bhsharma@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Eric Snowberg <eric.snowberg@oracle.com> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Joe Perches <joe@perches.com> Cc: Jon Hunter <jonathanh@nvidia.com> Cc: Julien Thierry <julien.thierry@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20181129171230.18699-12-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-11-30x86/efi: Allocate e820 buffer before calling efi_exit_boot_serviceEric Snowberg
The following commit: d64934019f6c ("x86/efi: Use efi_exit_boot_services()") introduced a regression on systems with large memory maps causing them to hang on boot. The first "goto get_map" that was removed from exit_boot() ensured there was enough room for the memory map when efi_call_early(exit_boot_services) was called. This happens when (nr_desc > ARRAY_SIZE(params->e820_table). Chain of events: exit_boot() efi_exit_boot_services() efi_get_memory_map <- at this point the mm can't grow over 8 desc priv_func() exit_boot_func() allocate_e820ext() <- new mm grows over 8 desc from e820 alloc efi_call_early(exit_boot_services) <- mm key doesn't match so retry efi_call_early(get_memory_map) <- not enough room for new mm system hangs This patch allocates the e820 buffer before calling efi_exit_boot_services() and fixes the regression. [ mingo: minor cleanliness edits. ] Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: <stable@vger.kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arend van Spriel <arend.vanspriel@broadcom.com> Cc: Bhupesh Sharma <bhsharma@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Joe Perches <joe@perches.com> Cc: Jon Hunter <jonathanh@nvidia.com> Cc: Julien Thierry <julien.thierry@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: YiFei Zhu <zhuyifei1999@gmail.com> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20181129171230.18699-2-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-11-29Revert "xen/balloon: Mark unallocated host memory as UNUSABLE"Igor Druzhinin
This reverts commit b3cf8528bb21febb650a7ecbf080d0647be40b9f. That commit unintentionally broke Xen balloon memory hotplug with "hotplug_unpopulated" set to 1. As long as "System RAM" resource got assigned under a new "Unusable memory" resource in IO/Mem tree any attempt to online this memory would fail due to general kernel restrictions on having "System RAM" resources as 1st level only. The original issue that commit has tried to workaround fa564ad96366 ("x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 00-1f, 30-3f, 60-7f)") also got amended by the following 03a551734 ("x86/PCI: Move and shrink AMD 64-bit window to avoid conflict") which made the original fix to Xen ballooning unnecessary. Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2018-11-29xen/x86: add diagnostic printout to xen_mc_flush() in case of errorJuergen Gross
Failure of an element of a Xen multicall is signalled via a WARN() only if the kernel is compiled with MC_DEBUG. It is impossible to know which element failed and why it did so. Change that by printing the related information even without MC_DEBUG, even if maybe in some limited form (e.g. without information which caller produced the failing element). Move the printing out of the switch statement in order to have the same information for a single call. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2018-11-29crypto: x86/chacha20 - Add a 4-block AVX-512VL variantMartin Willi
This version uses the same principle as the AVX2 version by scheduling the operations for two block pairs in parallel. It benefits from the AVX-512VL rotate instructions and the more efficient partial block handling using "vmovdqu8", resulting in a speedup of the raw block function of ~20%. Signed-off-by: Martin Willi <martin@strongswan.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-11-29crypto: x86/chacha20 - Add a 2-block AVX-512VL variantMartin Willi
This version uses the same principle as the AVX2 version. It benefits from the AVX-512VL rotate instructions and the more efficient partial block handling using "vmovdqu8", resulting in a speedup of ~20%. Unlike the AVX2 version, it is faster than the single block SSSE3 version to process a single block. Hence we engage that function for (partial) single block lengths as well. Signed-off-by: Martin Willi <martin@strongswan.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-11-29crypto: x86/chacha20 - Add a 8-block AVX-512VL variantMartin Willi
This variant is similar to the AVX2 version, but benefits from the AVX-512 rotate instructions and the additional registers, so it can operate without any data on the stack. It uses ymm registers only to avoid the massive core throttling on Skylake-X platforms. Nontheless does it bring a ~30% speed improvement compared to the AVX2 variant for random encryption lengths. The AVX2 version uses "rep movsb" for partial block XORing via the stack. With AVX-512, the new "vmovdqu8" can do this much more efficiently. The associated "kmov" instructions to work with dynamic masks is not part of the AVX-512VL instruction set, hence we depend on AVX-512BW as well. Given that the major AVX-512VL architectures provide AVX-512BW and this extension does not affect core clocking, this seems to be no problem at least for now. Signed-off-by: Martin Willi <martin@strongswan.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-11-29x86/resctrl: Remove unnecessary check for cbm_validate()Babu Moger
The Smatch static checker reports the following error after commit: a36c5ff560fb ("x86/resctrl: Bring cbm_validate() into the resource structure"): arch/x86/kernel/cpu/resctrl/ctrlmondata.c:227 parse_cbm() error: uninitialized symbol 'cbm_val'. arch/x86/kernel/cpu/resctrl/ctrlmondata.c:236 parse_cbm() error: uninitialized symbol 'cbm_val'. This could happen if ->cbm_validate() is NULL which could leave cbm_val uninitialized. However, there is no case where ->cbm_validate() can be NULL as it is initialized based on a vendor check. So it is either an Intel or an AMD version it points to. And in both the cases it is initialized properly. Thus, remove the first check. Verified the fix running Smatch. [ bp: massage commit message. ] Fixes: a36c5ff560fb ("x86/resctrl: Bring cbm_validate() into the resource structure") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Reinette Chatre <reinette.chatre@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20181128224234.22998-1-babu.moger@amd.com
2018-11-28x86/boot: Add missing va_end() to die()Mattias Jacobsson
Each call to va_start() must have a corresponding call to va_end() before the end of the function. Add the missing va_end(). Found with Coccinelle. Signed-off-by: Mattias Jacobsson <2pi@mok.nu> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20181128161607.10973-1-2pi@mok.nu
2018-11-28x86/speculation: Provide IBPB always command line optionsThomas Gleixner
Provide the possibility to enable IBPB always in combination with 'prctl' and 'seccomp'. Add the extra command line options and rework the IBPB selection to evaluate the command instead of the mode selected by the STIPB switch case. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Casey Schaufler <casey.schaufler@intel.com> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Jon Masters <jcm@redhat.com> Cc: Waiman Long <longman9394@gmail.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Dave Stewart <david.c.stewart@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20181125185006.144047038@linutronix.de
2018-11-28x86/speculation: Add seccomp Spectre v2 user space protection modeThomas Gleixner
If 'prctl' mode of user space protection from spectre v2 is selected on the kernel command-line, STIBP and IBPB are applied on tasks which restrict their indirect branch speculation via prctl. SECCOMP enables the SSBD mitigation for sandboxed tasks already, so it makes sense to prevent spectre v2 user space to user space attacks as well. The Intel mitigation guide documents how STIPB works: Setting bit 1 (STIBP) of the IA32_SPEC_CTRL MSR on a logical processor prevents the predicted targets of indirect branches on any logical processor of that core from being controlled by software that executes (or executed previously) on another logical processor of the same core. Ergo setting STIBP protects the task itself from being attacked from a task running on a different hyper-thread and protects the tasks running on different hyper-threads from being attacked. While the document suggests that the branch predictors are shielded between the logical processors, the observed performance regressions suggest that STIBP simply disables the branch predictor more or less completely. Of course the document wording is vague, but the fact that there is also no requirement for issuing IBPB when STIBP is used points clearly in that direction. The kernel still issues IBPB even when STIBP is used until Intel clarifies the whole mechanism. IBPB is issued when the task switches out, so malicious sandbox code cannot mistrain the branch predictor for the next user space task on the same logical processor. Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Casey Schaufler <casey.schaufler@intel.com> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Jon Masters <jcm@redhat.com> Cc: Waiman Long <longman9394@gmail.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Dave Stewart <david.c.stewart@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20181125185006.051663132@linutronix.de
2018-11-28x86/speculation: Enable prctl mode for spectre_v2_userThomas Gleixner
Now that all prerequisites are in place: - Add the prctl command line option - Default the 'auto' mode to 'prctl' - When SMT state changes, update the static key which controls the conditional STIBP evaluation on context switch. - At init update the static key which controls the conditional IBPB evaluation on context switch. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Casey Schaufler <casey.schaufler@intel.com> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Jon Masters <jcm@redhat.com> Cc: Waiman Long <longman9394@gmail.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Dave Stewart <david.c.stewart@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20181125185005.958421388@linutronix.de
2018-11-28x86/speculation: Add prctl() control for indirect branch speculationThomas Gleixner
Add the PR_SPEC_INDIRECT_BRANCH option for the PR_GET_SPECULATION_CTRL and PR_SET_SPECULATION_CTRL prctls to allow fine grained per task control of indirect branch speculation via STIBP and IBPB. Invocations: Check indirect branch speculation status with - prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, 0, 0, 0); Enable indirect branch speculation with - prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_ENABLE, 0, 0); Disable indirect branch speculation with - prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_DISABLE, 0, 0); Force disable indirect branch speculation with - prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_FORCE_DISABLE, 0, 0); See Documentation/userspace-api/spec_ctrl.rst. Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Casey Schaufler <casey.schaufler@intel.com> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Jon Masters <jcm@redhat.com> Cc: Waiman Long <longman9394@gmail.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Dave Stewart <david.c.stewart@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20181125185005.866780996@linutronix.de