summaryrefslogtreecommitdiff
path: root/arch/x86
AgeCommit message (Collapse)Author
2019-02-13x86/a.out: Clear the dump structure initiallyBorislav Petkov
dump_thread32() in aout_core_dump() does not clear the user32 structure allocated on the stack as the first thing on function entry. As a result, the dump.u_comm, dump.u_ar0 and dump.signal which get assigned before the clearing, get overwritten. Rename that function to fill_dump() to make it clear what it does and call it first thing. This was caught while staring at a patch by Derek Robson <robsonde@gmail.com>. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Derek Robson <robsonde@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Matz <matz@suse.de> Cc: x86@kernel.org Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20190202005512.3144-1-robsonde@gmail.com
2019-02-13x86/kprobes: Prohibit probing on IRQ handlers directlyMasami Hiramatsu
Prohibit probing on IRQ handlers in irqentry_text because if it interrupts user mode, at that point we haven't changed to kernel space yet and which eventually leads a double fault. E.g. # echo p apic_timer_interrupt > kprobe_events # echo 1 > events/kprobes/enable PANIC: double fault, error_code: 0x0 CPU: 1 PID: 814 Comm: less Not tainted 4.20.0-rc3+ #30 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) RIP: 0010:error_entry+0x12/0xf0 [snip] Call Trace: <ENTRY_TRAMPOLINE> ? native_iret+0x7/0x7 ? async_page_fault+0x8/0x30 ? trace_hardirqs_on_thunk+0x1c/0x1c ? error_entry+0x7c/0xf0 ? async_page_fault+0x8/0x30 ? native_iret+0x7/0x7 ? int3+0xa/0x20 ? trace_hardirqs_on_thunk+0x1c/0x1c ? error_entry+0x7c/0xf0 ? int3+0xa/0x20 ? apic_timer_interrupt+0x1/0x20 </ENTRY_TRAMPOLINE> Kernel panic - not syncing: Machine halted. Kernel Offset: disabled ---[ end Kernel panic - not syncing: Machine halted. ]--- Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrea Righi <righi.andrea@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/154998796400.31052.8406236614820687840.stgit@devbox Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-13x86/kprobes: Prohibit probing on functions before kprobe_int3_handler()Masami Hiramatsu
Prohibit probing on the functions called before kprobe_int3_handler() in do_int3(). More specifically, ftrace_int3_handler(), poke_int3_handler(), and ist_enter(). And since rcu_nmi_enter() is called by ist_enter(), it also should be marked as NOKPROBE_SYMBOL. Since those are handled before kprobe_int3_handler(), probing those functions can cause a breakpoint recursion and crash the kernel. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrea Righi <righi.andrea@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/154998793571.31052.11301258949601150994.stgit@devbox Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-13x86/kprobes: Move trampoline code into RODATAMasami Hiramatsu
Move optprobe trampoline code into RODATA since it is not executed, but copied and modified to be used on a trampoline buffer. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrea Righi <righi.andrea@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/154998790744.31052.3016106262944915510.stgit@devbox Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-13x86/kprobes: Prohibit probing on optprobe template codeMasami Hiramatsu
Prohibit probing on optprobe template code, since it is not a code but a template instruction sequence. If we modify this template, copied template must be broken. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrea Righi <righi.andrea@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Fixes: 9326638cbee2 ("kprobes, x86: Use NOKPROBE_SYMBOL() instead of __kprobes annotation") Link: http://lkml.kernel.org/r/154998787911.31052.15274376330136234452.stgit@devbox Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-12x86/kvm/nVMX: read from MSR_IA32_VMX_PROCBASED_CTLS2 only when it is availableVitaly Kuznetsov
SDM says MSR_IA32_VMX_PROCBASED_CTLS2 is only available "If (CPUID.01H:ECX.[5] && IA32_VMX_PROCBASED_CTLS[63])". It was found that some old cpus (namely "Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz (family: 0x6, model: 0xf, stepping: 0x6") don't have it. Add the missing check. Reported-by: Zdenek Kaspar <zkaspar82@gmail.com> Tested-by: Zdenek Kaspar <zkaspar82@gmail.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Jim Mattson <jmattson@google.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-12KVM: VMX: Use vcpu->arch.regs directly when saving/loading guest stateSean Christopherson
...now that all other references to struct vcpu_vmx have been removed. Note that 'vmx' still needs to be passed into the asm blob in _ASM_ARG1 as it is consumed by vmx_update_host_rsp(). And similar to that code, use _ASM_ARG2 in the assembly code to prepare for moving to proper asm, while explicitly referencing the exact registers in the clobber list for clarity in the short term and to avoid additional precompiler games. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-12KVM: VMX: Don't save guest registers after VM-FailSean Christopherson
A failed VM-Enter (obviously) didn't succeed, meaning the CPU never executed an instrunction in guest mode and so can't have changed the general purpose registers. In addition to saving some instructions in the VM-Fail case, this also provides a separate path entirely and thus an opportunity to propagate the fail condition to vmx->fail via register without introducing undue pain. Using a register, as opposed to directly referencing vmx->fail, eliminates the need to pass the offset of 'fail', which will simplify moving the code to proper assembly in future patches. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-12KVM: VMX: Invert the ordering of saving guest/host scratch reg at VM-EnterSean Christopherson
Switching the ordering allows for an out-of-line path for VM-Fail that elides saving guest state but still shares the register clearing with the VM-Exit path. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-12KVM: VMX: Pass "launched" directly to the vCPU-run asm blobSean Christopherson
...and remove struct vcpu_vmx's temporary __launched variable. Eliminating __launched is a bonus, the real motivation is to get to the point where the only reference to struct vcpu_vmx in the asm code is to vcpu.arch.regs, which will simplify moving the blob to a proper asm file. Note that also means this approach is deliberately different than what is used in nested_vmx_check_vmentry_hw(). Use BL as it is a callee-save register in both 32-bit and 64-bit ABIs, i.e. it can't be modified by vmx_update_host_rsp(), to avoid having to temporarily save/restore the launched flag. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-12KVM: VMX: Update VMCS.HOST_RSP via helper C functionSean Christopherson
Providing a helper function to update HOST_RSP is visibly easier to read, and more importantly (for the future) eliminates two arguments to the VM-Enter assembly blob. Reducing the number of arguments to the asm blob is for all intents and purposes a prerequisite to moving the code to a proper assembly routine. It's not truly mandatory, but it greatly simplifies the future code, and the cost of the extra CALL+RET is negligible in the grand scheme. Note that although _ASM_ARG[1-3] can be used in the inline asm itself, the intput/output constraints need to be manually defined. gcc will actually compile with _ASM_ARG[1-3] specified as constraints, but what it actually ends up doing with the bogus constraint is unknown. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-12KVM: VMX: Load/save guest CR2 via C code in __vmx_vcpu_run()Sean Christopherson
...to eliminate its parameter and struct vcpu_vmx offset definition from the assembly blob. Accessing CR2 from C versus assembly doesn't change the likelihood of taking a page fault (and modifying CR2) while it's loaded with the guest's value, so long as we don't do anything silly between accessing CR2 and VM-Enter/VM-Exit. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-12KVM: nVMX: Cache host_rsp on a per-VMCS basisSean Christopherson
Currently, host_rsp is cached on a per-vCPU basis, i.e. it's stored in struct vcpu_vmx. In non-nested usage the caching is for all intents and purposes 100% effective, e.g. only the first VMLAUNCH needs to synchronize VMCS.HOST_RSP since the call stack to vmx_vcpu_run() is identical each and every time. But when running a nested guest, KVM must invalidate the cache when switching the current VMCS as it can't guarantee the new VMCS has the same HOST_RSP as the previous VMCS. In other words, the cache loses almost all of its efficacy when running a nested VM. Move host_rsp to struct vmcs_host_state, which is per-VMCS, so that it is cached on a per-VMCS basis and restores its 100% hit rate when nested VMs are in play. Note that the host_rsp cache for vmcs02 essentially "breaks" when nested early checks are enabled as nested_vmx_check_vmentry_hw() will see a different RSP at the time of its VM-Enter. While it's possible to avoid even that VMCS.HOST_RSP synchronization, e.g. by employing a dedicated VM-Exit stack, there is little motivation for doing so as the overhead of two VMWRITEs (~55 cycles) is dwarfed by the overhead of the extra VMX transition (600+ cycles) and is a proverbial drop in the ocean relative to the total cost of a nested transtion (10s of thousands of cycles). Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-12KVM: nVMX: Let the compiler select the reg for holding HOST_RSPSean Christopherson
...and provide an explicit name for the constraint. Naming the input constraint makes the code self-documenting and also avoids the fragility of numerically referring to constraints, e.g. %4 breaks badly whenever the constraints are modified. Explicitly using RDX was inherited from vCPU-run, i.e. completely arbitrary. Even vCPU-run doesn't truly need to explicitly use RDX, but doing so is more robust as vCPU-run needs tight control over its register usage. Note that while the naming "conflict" between host_rsp and HOST_RSP is slightly confusing, the former will be renamed slightly in a future patch, at which point HOST_RSP is absolutely what is desired. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-12KVM: nVMX: Reference vmx->loaded_vmcs->launched directlySean Christopherson
Temporarily propagating vmx->loaded_vmcs->launched to vmx->__launched is not functionally necessary, but rather was done historically to avoid passing both 'vmx' and 'loaded_vmcs' to the vCPU-run asm blob. Nested early checks inherited this behavior by virtue of copy+paste. A future patch will move HOST_RSP caching to be per-VMCS, i.e. store 'host_rsp' in loaded VMCS. Now that the reference to 'vmx->fail' is also gone from nested early checks, referencing 'loaded_vmcs' directly means we can drop the 'vmx' reference when introducing per-VMCS RSP caching. And it means __launched can be dropped from struct vcpu_vmx if/when vCPU-run receives similar treatment. Note the use of a named register constraint for 'loaded_vmcs'. Using RCX to hold 'vmx' was inherited from vCPU-run. In the vCPU-run case, the scratch register needs to be explicitly defined as it is crushed when loading guest state, i.e. deferring to the compiler would corrupt the pointer. Since nested early checks never loads guests state, it's a-ok to let the compiler pick any register. Naming the constraint avoids the fragility of referencing constraints via %1, %2, etc.., which breaks horribly when modifying constraints, and generally makes the asm blob more readable. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-12KVM: nVMX: Capture VM-Fail via CC_{SET,OUT} in nested early checksSean Christopherson
...to take advantage of __GCC_ASM_FLAG_OUTPUTS__ when possible. Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-12KVM: nVMX: Capture VM-Fail to a local var in nested_vmx_check_vmentry_hw()Sean Christopherson
Unlike the primary vCPU-run flow, the nested early checks code doesn't actually want to propagate VM-Fail back to 'vmx'. Yay copy+paste. In additional to eliminating the need to clear vmx->fail before returning, using a local boolean also drops a reference to 'vmx' in the asm blob. Dropping the reference to 'vmx' will save a register in the long run as future patches will shift all pointer references from 'vmx' to 'vmx->loaded_vmcs'. Fixes: 52017608da33 ("KVM: nVMX: add option to perform early consistency checks via H/W") Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-12KVM: nVMX: Explicitly reference the scratch reg in nested early checksSean Christopherson
Using %1 to reference RCX, i.e. the 'vmx' pointer', is obtuse and fragile, e.g. it results in cryptic and infurating compile errors if the output constraints are touched by anything more than a gentle breeze. Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-12KVM: nVMX: Drop STACK_FRAME_NON_STANDARD from nested_vmx_check_vmentry_hw()Sean Christopherson
...as it doesn't technically actually do anything non-standard with the stack even though it modifies RSP in a weird way. E.g. RSP is loaded with VMCS.HOST_RSP if the VM-Enter gets far enough to trigger VM-Exit, but it's simply reloaded with the current value. Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-12KVM: nVMX: Remove a rogue "rax" clobber from nested_vmx_check_vmentry_hw()Sean Christopherson
RAX is not touched by nested_vmx_check_vmentry_hw(), directly or indirectly (e.g. vmx_vmenter()). Remove it from the clobber list. Fixes: 52017608da33 ("KVM: nVMX: add option to perform early consistency checks via H/W") Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-12KVM: VMX: Let the compiler save/load RDX during vCPU-runSean Christopherson
Per commit c20363006af6 ("KVM: VMX: Let gcc to choose which registers to save (x86_64)"), the only reason RDX is saved/loaded to/from the stack is because it was specified as an input, i.e. couldn't be marked as clobbered (ignoring the fact that "saving" it to a dummy output would indirectly mark it as clobbered). Now that RDX is no longer an input, clobber it. Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-12KVM: VMX: Manually load RDX in vCPU-run asm blobSean Christopherson
Load RDX with the VMCS.HOST_RSP field encoding on-demand instead of delegating to the compiler via an input constraint. In addition to saving one whole MOV instruction, this allows RDX to be properly clobbered (in a future patch) instead of being saved/loaded to/from the stack. Despite nested_vmx_check_vmentry_hw() having similar code, leave it alone, for now. In that case, RDX is unconditionally used and isn't clobbered, i.e. sending in HOST_RSP as an input is simpler. Note that because HOST_RSP is an enum and not a define, it must be redefined as an immediate instead of using __stringify(HOST_RSP). The naming "conflict" between host_rsp and HOST_RSP is slightly confusing, but the former will be removed in a future patch, at which point HOST_RSP is absolutely what is desired. Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-12KVM: VMX: Save RSI to an unused output in the vCPU-run asm blobSean Christopherson
RSI is clobbered by the vCPU-run asm blob, but it's not marked as such, probably because GCC doesn't let you mark inputs as clobbered. "Save" RSI to a dummy output so that GCC recognizes it as being clobbered. Fixes: 773e8a0425c9 ("x86/kvm: use Enlightened VMCS when running on Hyper-V") Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-12KVM: VMX: Modify only RSP when creating a placeholder for guest's RCXSean Christopherson
In the vCPU-run asm blob, the guest's RCX is temporarily saved onto the stack after VM-Exit as the exit flow must first load a register with a pointer to the vCPU's save area in order to save the guest's registers. RCX is arbitrarily designated as the scratch register. Since the stack usage is to (1)save host, (2)save guest, (3)load host and (4)load guest, the code can't conform to the stack's natural FIFO semantics, i.e. it can't simply do PUSH/POP. Regardless of whether it is done for the host's value or guest's value, at some point the code needs to access the stack using a non-traditional method, e.g. MOV instead of POP. vCPU-run opts to create a placeholder on the stack for guest's RCX (by adjusting RSP) and saves RCX to its place immediately after VM-Exit (via MOV). In other words, the purpose of the first 'PUSH RCX' at the start of the vCPU-run asm blob is to adjust RSP down, i.e. there's no need to actually access memory. Use 'SUB $wordsize, RSP' instead of 'PUSH RCX' to make it more obvious that the intent is simply to create a gap on the stack for the guest's RCX. Reviewed-by: Jim Mattson <jmattson@google.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-12KVM: VMX: Zero out *all* general purpose registers after VM-ExitSean Christopherson
...except RSP, which is restored by hardware as part of VM-Exit. Paolo theorized that restoring registers from the stack after a VM-Exit in lieu of zeroing them could lead to speculative execution with the guest's values, e.g. if the stack accesses miss the L1 cache[1]. Zeroing XORs are dirt cheap, so just be ultra-paranoid. Note that the scratch register (currently RCX) used to save/restore the guest state is also zeroed as its host-defined value is loaded via the stack, just with a MOV instead of a POP. [1] https://patchwork.kernel.org/patch/10771539/#22441255 Fixes: 0cb5b30698fd ("kvm: vmx: Scrub hardware GPRs at VM-exit") Cc: <stable@vger.kernel.org> Cc: Jim Mattson <jmattson@google.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-12KVM: nVMX: Check a single byte for VMCS "launched" in nested early checksSean Christopherson
Nested early checks does a manual comparison of a VMCS' launched status in its asm blob to execute the correct VM-Enter instruction, i.e. VMLAUNCH vs. VMRESUME. The launched flag is a bool, which is a typedef of _Bool. C99 does not define an exact size for _Bool, stating only that is must be large enough to hold '0' and '1'. Most, if not all, compilers use a single byte for _Bool, including gcc[1]. The use of 'cmpl' instead of 'cmpb' was not deliberate, but rather the result of a copy-paste as the asm blob was directly derived from the asm blob for vCPU-run. This has not caused any known problems, likely due to compilers aligning variables to 4-byte or 8-byte boundaries and KVM zeroing out struct vcpu_vmx during allocation. I.e. vCPU-run accesses "junk" data, it just happens to always be zero and so doesn't affect the result. [1] https://gcc.gnu.org/ml/gcc-patches/2000-10/msg01127.html Fixes: 52017608da33 ("KVM: nVMX: add option to perform early consistency checks via H/W") Cc: <stable@vger.kernel.org> Reviewed-by: Jim Mattson <jmattson@google.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-12KVM: VMX: Compare only a single byte for VMCS' "launched" in vCPU-runSean Christopherson
The vCPU-run asm blob does a manual comparison of a VMCS' launched status to execute the correct VM-Enter instruction, i.e. VMLAUNCH vs. VMRESUME. The launched flag is a bool, which is a typedef of _Bool. C99 does not define an exact size for _Bool, stating only that is must be large enough to hold '0' and '1'. Most, if not all, compilers use a single byte for _Bool, including gcc[1]. Originally, 'launched' was of type 'int' and so the asm blob used 'cmpl' to check the launch status. When 'launched' was moved to be stored on a per-VMCS basis, struct vcpu_vmx's "temporary" __launched flag was added in order to avoid having to pass the current VMCS into the asm blob. The new '__launched' was defined as a 'bool' and not an 'int', but the 'cmp' instruction was not updated. This has not caused any known problems, likely due to compilers aligning variables to 4-byte or 8-byte boundaries and KVM zeroing out struct vcpu_vmx during allocation. I.e. vCPU-run accesses "junk" data, it just happens to always be zero and so doesn't affect the result. [1] https://gcc.gnu.org/ml/gcc-patches/2000-10/msg01127.html Fixes: d462b8192368 ("KVM: VMX: Keep list of loaded VMCSs, instead of vcpus") Cc: <stable@vger.kernel.org> Reviewed-by: Jim Mattson <jmattson@google.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-11x86/fpu: Track AVX-512 usage of tasksAubrey Li
User space tools which do automated task placement need information about AVX-512 usage of tasks, because AVX-512 usage could cause core turbo frequency drop and impact the running task on the sibling CPU. The XSAVE hardware structure has bits that indicate when valid state is present in registers unique to AVX-512 use. Use these bits to indicate when AVX-512 has been in use and add per-task AVX-512 state timestamp tracking to context switch. Well-written AVX-512 applications are expected to clear the AVX-512 state when not actively using AVX-512 registers, so the tracking mechanism is imprecise and can theoretically miss AVX-512 usage during context switch. But it has been measured to be precise enough to be useful under real-world workloads like tensorflow and linpack. If higher precision is required, suggest user space tools to use the PMU-based mechanisms in combination. Signed-off-by: Aubrey Li <aubrey.li@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: aubrey.li@intel.com Link: http://lkml.kernel.org/r/20190117183822.31333-1-aubrey.li@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-11Merge tag 'v5.0-rc6' into x86/fpu, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-11x86/cpufeature: Fix various quality problems in the <asm/cpu_device_hd.h> headerIngo Molnar
Thomas noticed that the new arch/x86/include/asm/cpu_device_id.h header is a train-wreck that didn't incorporate review feedback like not using __u8 in kernel-only headers. While at it also fix all the *other* problems this header has: - Use canonical names for the header guards. It's inexplicable why a non-standard guard was used. - Don't define the header guard to 1. Plus annotate the closing #endif as done absolutely every other header. Again, an inexplicable source of noise. - Move the kernel API calls provided by this header next to each other, there's absolutely no reason to have them spread apart in the header. - Align the INTEL_CPU_DESC() macro initializations vertically, this is easier to read and it's also the canonical style. - Actually name the macro arguments properly: instead of 'mod, step, rev', spell out 'model, stepping, revision' - it's not like we have a lack of characters in this header. - Actually make arguments macro-safe - again it's inexplicable why it wasn't done properly to begin with. Quite amazing how many problems a 41 lines header can contain. This kind of code quality is unacceptable, and it slipped through the review net of 2 developers and 2 maintainers, including myself, until Thomas noticed it. :-/ Reported-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-11perf/x86: Add check_period PMU callbackJiri Olsa
Vince (and later on Ravi) reported crashes in the BTS code during fuzzing with the following backtrace: general protection fault: 0000 [#1] SMP PTI ... RIP: 0010:perf_prepare_sample+0x8f/0x510 ... Call Trace: <IRQ> ? intel_pmu_drain_bts_buffer+0x194/0x230 intel_pmu_drain_bts_buffer+0x160/0x230 ? tick_nohz_irq_exit+0x31/0x40 ? smp_call_function_single_interrupt+0x48/0xe0 ? call_function_single_interrupt+0xf/0x20 ? call_function_single_interrupt+0xa/0x20 ? x86_schedule_events+0x1a0/0x2f0 ? x86_pmu_commit_txn+0xb4/0x100 ? find_busiest_group+0x47/0x5d0 ? perf_event_set_state.part.42+0x12/0x50 ? perf_mux_hrtimer_restart+0x40/0xb0 intel_pmu_disable_event+0xae/0x100 ? intel_pmu_disable_event+0xae/0x100 x86_pmu_stop+0x7a/0xb0 x86_pmu_del+0x57/0x120 event_sched_out.isra.101+0x83/0x180 group_sched_out.part.103+0x57/0xe0 ctx_sched_out+0x188/0x240 ctx_resched+0xa8/0xd0 __perf_event_enable+0x193/0x1e0 event_function+0x8e/0xc0 remote_function+0x41/0x50 flush_smp_call_function_queue+0x68/0x100 generic_smp_call_function_single_interrupt+0x13/0x30 smp_call_function_single_interrupt+0x3e/0xe0 call_function_single_interrupt+0xf/0x20 </IRQ> The reason is that while event init code does several checks for BTS events and prevents several unwanted config bits for BTS event (like precise_ip), the PERF_EVENT_IOC_PERIOD allows to create BTS event without those checks being done. Following sequence will cause the crash: If we create an 'almost' BTS event with precise_ip and callchains, and it into a BTS event it will crash the perf_prepare_sample() function because precise_ip events are expected to come in with callchain data initialized, but that's not the case for intel_pmu_drain_bts_buffer() caller. Adding a check_period callback to be called before the period is changed via PERF_EVENT_IOC_PERIOD. It will deny the change if the event would become BTS. Plus adding also the limit_period check as well. Reported-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: <stable@vger.kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20190204123532.GA4794@krava Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-11x86/defconfig: Enable EFI stub, mixed mode and BGRTArd Biesheuvel
Enable various options in the defconfigs of both i386 and x86_64 that could do with some test coverage in automated testing. Note that these options are typically enabled by distros as well. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-11mm/gup: Remove the 'write' parameter from gup_fast_permitted()Ira Weiny
The 'write' parameter is unused in gup_fast_permitted() so remove it. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20190210223424.13934-1-ira.weiny@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-11perf/x86/intel: Add counter freezing quirk for GoldmontKan Liang
A microcode patch is also needed for Goldmont while counter freezing feature is enabled. Otherwise, there will be some issues, e.g. PMI lost. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: bp@alien8.de Link: https://lkml.kernel.org/r/1549319013-4522-5-git-send-email-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-11perf/x86/intel: Clean up counter freezing quirkKan Liang
Clean up counter freezing quirk to use the new facility to check for min microcode revisions. Rename the counter freezing quirk related functions. Because other platforms, e.g. Goldmont, also needs to call the quirk. Only check the boot CPU, assuming models and features are consistent over all CPUs. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: bp@alien8.de Link: https://lkml.kernel.org/r/1549319013-4522-4-git-send-email-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-11perf/x86/intel: Clean up SNB PEBS quirkKan Liang
Clean up SNB PEBS quirk to use the new facility to check for min microcode revisions. Only check the boot CPU, assuming models and features are consistent over all CPUs. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: bp@alien8.de Link: https://lkml.kernel.org/r/1549319013-4522-3-git-send-email-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-11perf/x86/kvm: Avoid unnecessary work in guest filteringAndi Kleen
KVM added a workaround for PEBS events leaking into guests with commit: 26a4f3c08de4 ("perf/x86: disable PEBS on a guest entry.") This uses the VT entry/exit list to add an extra disable of the PEBS_ENABLE MSR. Intel also added a fix for this issue to microcode updates on Haswell/Broadwell/Skylake. It turns out using the MSR entry/exit list makes VM exits significantly slower. The list is only needed for disabling PEBS, because the GLOBAL_CTRL change gets optimized by KVM into changing the VMCS. Check for the microcode updates that have the microcode fix for leaking PEBS, and disable the extra entry/exit list entry for PEBS_ENABLE. In addition we always clear the GLOBAL_CTRL for the PEBS counter while running in the guest, which is enough to make them never fire at the wrong side of the host/guest transition. The overhead for VM exits with the filtering active with the patch is reduced from 8% to 4%. The microcode patch has already been merged into future platforms. This patch is one-off thing. The quirks is used here. For other old platforms which doesn't have microcode patch and quirks, extra disable of the PEBS_ENABLE MSR is still required. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: bp@alien8.de Link: https://lkml.kernel.org/r/1549319013-4522-2-git-send-email-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-11Merge branch 'x86/cpu' into perf/core, to pick up dependent commitIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-11x86/cpufeature: Add facility to check for min microcode revisionsKan Liang
For bug workarounds or checks, it is useful to check for specific microcode revisions. Add a new generic function to match the CPU with stepping. Add the other function to check the min microcode revisions for the matched CPU. A new table format is introduced to facilitate the quirk to fill the related information. This does not change the existing x86_cpu_id because it's an ABI shared with modules, and also has quite different requirements, as in no wildcards, but everything has to be matched exactly. Originally-by: Andi Kleen <ak@linux.intel.com> Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: eranian@google.com Link: https://lkml.kernel.org/r/1549319013-4522-1-git-send-email-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-10Merge tag 'y2038-new-syscalls' of ↵Thomas Gleixner
git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground into timers/2038 Pull y2038 - time64 system calls from Arnd Bergmann: This series finally gets us to the point of having system calls with 64-bit time_t on all architectures, after a long time of incremental preparation patches. There was actually one conversion that I missed during the summer, i.e. Deepa's timex series, which I now updated based the 5.0-rc1 changes and review comments. The following system calls are now added on all 32-bit architectures using the same system call numbers: 403 clock_gettime64 404 clock_settime64 405 clock_adjtime64 406 clock_getres_time64 407 clock_nanosleep_time64 408 timer_gettime64 409 timer_settime64 410 timerfd_gettime64 411 timerfd_settime64 412 utimensat_time64 413 pselect6_time64 414 ppoll_time64 416 io_pgetevents_time64 417 recvmmsg_time64 418 mq_timedsend_time64 419 mq_timedreceiv_time64 420 semtimedop_time64 421 rt_sigtimedwait_time64 422 futex_time64 423 sched_rr_get_interval_time64 Each one of these corresponds directly to an existing system call that includes a 'struct timespec' argument, or a structure containing a timespec or (in case of clock_adjtime) timeval. Not included here are new versions of getitimer/setitimer and getrusage/waitid, which are planned for the future but only needed to make a consistent API rather than for correct operation beyond y2038. These four system calls are based on 'timeval', and it has not been finally decided what the replacement kernel interface will use instead. So far, I have done a lot of build testing across most architectures, which has found a number of bugs. Runtime testing so far included testing LTP on 32-bit ARM with the existing system calls, to ensure we do not regress for existing binaries, and a test with a 32-bit x86 build of LTP against a modified version of the musl C library that has been adapted to the new system call interface [3]. This library can be used for testing on all architectures supported by musl-1.1.21, but it is not how the support is getting integrated into the official musl release. Official musl support is planned but will require more invasive changes to the library. Link: https://lore.kernel.org/lkml/20190110162435.309262-1-arnd@arndb.de/T/ Link: https://lore.kernel.org/lkml/20190118161835.2259170-1-arnd@arndb.de/ Link: https://git.linaro.org/people/arnd/musl-y2038.git/ [2]
2019-02-10Merge tag 'y2038-syscall-cleanup' of ↵Thomas Gleixner
git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground into timers/2038 Pull preparatory work for y2038 changes from Arnd Bergmann: System call unification and cleanup The system call tables have diverged a bit over the years, and a number of the recent additions never made it into all architectures, for one reason or another. This is an attempt to clean it up as far as we can without breaking compatibility, doing a number of steps: - Add system calls that have not yet been integrated into all architectures but that we definitely want there. This includes {,f}statfs64() and get{eg,eu,g,p,u,pp}id() on alpha, which have been missing traditionally. - The s390 compat syscall handling is cleaned up to be more like what we do on other architectures, while keeping the 31-bit pointer extension. This was merged as a shared branch by the s390 maintainers and is included here in order to base the other patches on top. - Add the separate ipc syscalls on all architectures that traditionally only had sys_ipc(). This version is done without support for IPC_OLD that is we have in sys_ipc. The new semtimedop_time64 syscall will only be added here, not in sys_ipc - Add syscall numbers for a couple of syscalls that we probably don't need everywhere, in particular pkey_* and rseq, for the purpose of symmetry: if it's in asm-generic/unistd.h, it makes sense to have it everywhere. I expect that any future system calls will get assigned on all platforms together, even when they appear to be specific to a single architecture. - Prepare for having the same system call numbers for any future calls. In combination with the generated tables, this hopefully makes it easier to add new calls across all architectures together. All of the above are technically separate from the y2038 work, but are done as preparation before we add the new 64-bit time_t system calls everywhere, providing a common baseline set of system calls. I expect that glibc and other libraries that want to use 64-bit time_t will require linux-5.1 kernel headers for building in the future, and at a much later point may also require linux-5.1 or a later version as the minimum kernel at runtime. Having a common baseline then allows the removal of many architecture or kernel version specific workarounds.
2019-02-10Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "A handful of fixes: - Fix an MCE corner case bug/crash found via MCE injection testing - Fix 5-level paging boot crash - Fix MCE recovery cache invalidation bug - Fix regression on Xen guests caused by a recent PMD level mremap speedup optimization" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Make set_pmd_at() paravirt aware x86/mm/cpa: Fix set_mce_nospec() x86/boot/compressed/64: Do not corrupt EDX on EFER.LME=1 setting x86/MCE: Initialize mce.bank in the case of a fatal error in mce_no_way_out()
2019-02-10Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "A couple of kernel side fixes: - Fix the Intel uncore driver on certain hardware configurations - Fix a CPU hotplug related memory allocation bug - Remove a spurious WARN() ... plus also a handful of perf tooling fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf script python: Add Python3 support to tests/attr.py perf trace: Support multiple "vfs_getname" probes perf symbols: Filter out hidden symbols from labels perf symbols: Add fallback definitions for GELF_ST_VISIBILITY() tools headers uapi: Sync linux/in.h copy from the kernel sources perf clang: Do not use 'return std::move(something)' perf mem/c2c: Fix perf_mem_events to support powerpc perf tests evsel-tp-sched: Fix bitwise operator perf/core: Don't WARN() for impossible ring-buffer sizes perf/x86/intel: Delay memory deallocation until x86_pmu_dead_cpu() perf/x86/intel/uncore: Add Node ID mask
2019-02-10x86/mm: Make set_pmd_at() paravirt awareJuergen Gross
set_pmd_at() calls native_set_pmd() unconditionally on x86. This was fine as long as only huge page entries were written via set_pmd_at(), as Xen pv guests don't support those. Commit 2c91bd4a4e2e53 ("mm: speed up mremap by 20x on large regions") introduced a usage of set_pmd_at() possible on pv guests, leading to failures like: BUG: unable to handle kernel paging request at ffff888023e26778 #PF error: [PROT] [WRITE] RIP: e030:move_page_tables+0x7c1/0xae0 move_vma.isra.3+0xd1/0x2d0 __se_sys_mremap+0x3c6/0x5b0 do_syscall_64+0x49/0x100 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Make set_pmd_at() paravirt aware by just letting it use set_pmd(). Fixes: 2c91bd4a4e2e53 ("mm: speed up mremap by 20x on large regions") Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: xen-devel@lists.xenproject.org Cc: boris.ostrovsky@oracle.com Cc: sstabellini@kernel.org Cc: hpa@zytor.com Cc: bp@alien8.de Cc: torvalds@linux-foundation.org Link: https://lkml.kernel.org/r/20190210074056.11842-1-jgross@suse.com
2019-02-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
An ipvlan bug fix in 'net' conflicted with the abstraction away of the IPV6 specific support in 'net-next'. Similarly, a bug fix for mlx5 in 'net' conflicted with the flow action conversion in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08x86/fpu: Move init_xstate_size() to __init sectionSergey Senozhatsky
WARNING: vmlinux.o(.text.unlikely+0x1c05): Section mismatch in reference from the function init_xstate_size() to the function .init.text:get_xsave_size() WARNING: vmlinux.o(.text.unlikely+0x1c19): Section mismatch in reference from the function init_xstate_size() to the function .init.text:get_xsaves_size() Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Link: https://lkml.kernel.org/r/20190108130225.5066-2-sergey.senozhatsky@gmail.com
2019-02-08x86/cpu_entry_area: Move percpu_setup_debug_store() to __init sectionSergey Senozhatsky
WARNING: vmlinux.o(.text.unlikely+0x498d): Section mismatch in reference from the function percpu_setup_debug_store() to the function .init.text:cea_map_percpu_pages() Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Link: https://lkml.kernel.org/r/20190108130225.5066-1-sergey.senozhatsky@gmail.com
2019-02-08x86/mtrr: Remove unused variableBo Yu
Compiling the kernel with W=1 results in the following warning: arch/x86/kernel/cpu/mtrr/cleanup.c:299:16: warning: variable ‘second_basek’ set but not used [-Wunused-but-set-variable] unsigned long second_basek, second_sizek; Remove the unused variable. [ tglx: Massaged changelog ] Signed-off-by: Bo Yu <tsu.yubo@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: bp@alien8.de Cc: hpa@zytor.com Cc: puwen@hygon.cn Link: https://lkml.kernel.org/r/20190208125343.11451-1-tsu.yubo@gmail.com
2019-02-08x86/mm/cpa: Fix set_mce_nospec()Peter Zijlstra
The recent commit fe0937b24ff5 ("x86/mm/cpa: Fold cpa_flush_range() and cpa_flush_array() into a single cpa_flush() function") accidentally made the call to make_addr_canonical_again() go away, which breaks set_mce_nospec(). Re-instate the call to convert the address back into canonical form right before invoking either CLFLUSH or INVLPG. Rename the function while at it to be shorter (and less MAGA). Fixes: fe0937b24ff5 ("x86/mm/cpa: Fold cpa_flush_range() and cpa_flush_array() into a single cpa_flush() function") Reported-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Tony Luck <tony.luck@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Rik van Riel <riel@surriel.com> Link: https://lkml.kernel.org/r/20190208120859.GH32511@hirez.programming.kicks-ass.net
2019-02-08crypto: x86/aesni-gcm - fix crash on empty plaintextEric Biggers
gcmaes_crypt_by_sg() dereferences the NULL pointer returned by scatterwalk_ffwd() when encrypting an empty plaintext and the source scatterlist ends immediately after the associated data. Fix it by only fast-forwarding to the src/dst data scatterlists if the data length is nonzero. This bug is reproduced by the "rfc4543(gcm(aes))" test vectors when run with the new AEAD test manager. Fixes: e845520707f8 ("crypto: aesni - Update aesni-intel_glue to use scatter/gather") Cc: <stable@vger.kernel.org> # v4.17+ Cc: Dave Watson <davejwatson@fb.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>