Age | Commit message (Collapse) | Author |
|
- Remove VMX_EPT_EXTENT_INDIVIDUAL_ADDR, since there is no such type of
EPT invalidation
- Add missing VPID types names
Signed-off-by: Jan Dakinevich <jan.dakinevich@gmail.com>
Tested-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
From the Intel SDM, volume 3, section 10.4.3, "Enabling or Disabling the
Local APIC,"
When IA32_APIC_BASE[11] is 0, the processor is functionally equivalent
to an IA-32 processor without an on-chip APIC. The CPUID feature flag
for the APIC (see Section 10.4.2, "Presence of the Local APIC") is
also set to 0.
Signed-off-by: Jim Mattson <jmattson@google.com>
[Changed subject tag from nVMX to x86.]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Avoid the pointless function call to pv_lock_ops.vcpu_is_preempted()
when a paravirt spinlock enabled kernel is ran on native hardware.
Do this by patching out the CALL instruction with "XOR %RAX,%RAX"
which has the same effect (0 return value).
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: David.Laight@ACULAB.COM
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: benh@kernel.crashing.org
Cc: boqun.feng@gmail.com
Cc: borntraeger@de.ibm.com
Cc: bsingharora@gmail.com
Cc: dave@stgolabs.net
Cc: jgross@suse.com
Cc: kernellwp@gmail.com
Cc: konrad.wilk@oracle.com
Cc: mpe@ellerman.id.au
Cc: paulmck@linux.vnet.ibm.com
Cc: paulus@samba.org
Cc: pbonzini@redhat.com
Cc: rkrcmar@redhat.com
Cc: will.deacon@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Support the vcpu_is_preempted() functionality under Xen. This will
enhance lock performance on overcommitted hosts (more runnable vCPUs
than physical CPUs in the system) as doing busy waits for preempted
vCPUs will hurt system performance far worse than early yielding.
A quick test (4 vCPUs on 1 physical CPU doing a parallel build job
with "make -j 8") reduced system time by about 5% with this patch.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: David.Laight@ACULAB.COM
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: benh@kernel.crashing.org
Cc: boqun.feng@gmail.com
Cc: borntraeger@de.ibm.com
Cc: bsingharora@gmail.com
Cc: dave@stgolabs.net
Cc: kernellwp@gmail.com
Cc: konrad.wilk@oracle.com
Cc: linuxppc-dev@lists.ozlabs.org
Cc: mpe@ellerman.id.au
Cc: paulmck@linux.vnet.ibm.com
Cc: paulus@samba.org
Cc: pbonzini@redhat.com
Cc: rkrcmar@redhat.com
Cc: virtualization@lists.linux-foundation.org
Cc: will.deacon@arm.com
Cc: xen-devel-request@lists.xenproject.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1478077718-37424-11-git-send-email-xinhui.pan@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Support the vcpu_is_preempted() functionality under KVM. This will
enhance lock performance on overcommitted hosts (more runnable vCPUs
than physical CPUs in the system) as doing busy waits for preempted
vCPUs will hurt system performance far worse than early yielding.
struct kvm_steal_time::preempted indicates that if one vCPU is running or
not after commit "x86, kvm/x86.c: support vCPU preempted check".
unix benchmark result:
host: kernel 4.8.1, i5-4570, 4 cpus
guest: kernel 4.8.1, 8 vcpus
test-case after-patch before-patch
Execl Throughput | 18307.9 lps | 11701.6 lps
File Copy 1024 bufsize 2000 maxblocks | 1352407.3 KBps | 790418.9 KBps
File Copy 256 bufsize 500 maxblocks | 367555.6 KBps | 222867.7 KBps
File Copy 4096 bufsize 8000 maxblocks | 3675649.7 KBps | 1780614.4 KBps
Pipe Throughput | 11872208.7 lps | 11855628.9 lps
Pipe-based Context Switching | 1495126.5 lps | 1490533.9 lps
Process Creation | 29881.2 lps | 28572.8 lps
Shell Scripts (1 concurrent) | 23224.3 lpm | 22607.4 lpm
Shell Scripts (8 concurrent) | 3531.4 lpm | 3211.9 lpm
System Call Overhead | 10385653.0 lps | 10419979.0 lps
Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: David.Laight@ACULAB.COM
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: benh@kernel.crashing.org
Cc: boqun.feng@gmail.com
Cc: borntraeger@de.ibm.com
Cc: bsingharora@gmail.com
Cc: dave@stgolabs.net
Cc: jgross@suse.com
Cc: kernellwp@gmail.com
Cc: konrad.wilk@oracle.com
Cc: linuxppc-dev@lists.ozlabs.org
Cc: mpe@ellerman.id.au
Cc: paulmck@linux.vnet.ibm.com
Cc: paulus@samba.org
Cc: rkrcmar@redhat.com
Cc: virtualization@lists.linux-foundation.org
Cc: will.deacon@arm.com
Cc: xen-devel-request@lists.xenproject.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1478077718-37424-10-git-send-email-xinhui.pan@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Support the vcpu_is_preempted() functionality under KVM. This will
enhance lock performance on overcommitted hosts (more runnable vCPUs
than physical CPUs in the system) as doing busy waits for preempted
vCPUs will hurt system performance far worse than early yielding.
Use struct kvm_steal_time::preempted to indicate that if a vCPU
is running or not.
Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: David.Laight@ACULAB.COM
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: benh@kernel.crashing.org
Cc: boqun.feng@gmail.com
Cc: borntraeger@de.ibm.com
Cc: bsingharora@gmail.com
Cc: dave@stgolabs.net
Cc: jgross@suse.com
Cc: kernellwp@gmail.com
Cc: konrad.wilk@oracle.com
Cc: linuxppc-dev@lists.ozlabs.org
Cc: mpe@ellerman.id.au
Cc: paulmck@linux.vnet.ibm.com
Cc: paulus@samba.org
Cc: rkrcmar@redhat.com
Cc: virtualization@lists.linux-foundation.org
Cc: will.deacon@arm.com
Cc: xen-devel-request@lists.xenproject.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1478077718-37424-9-git-send-email-xinhui.pan@linux.vnet.ibm.com
[ Typo fixes. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
guests
Optimize spinlock and mutex busy-loops by providing a vcpu_is_preempted(cpu)
function on KVM and Xen platforms.
Extend the pv_lock_ops interface accordingly and implement the callbacks
on KVM and Xen.
Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
[ Translated to English. ]
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: David.Laight@ACULAB.COM
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: benh@kernel.crashing.org
Cc: boqun.feng@gmail.com
Cc: borntraeger@de.ibm.com
Cc: bsingharora@gmail.com
Cc: dave@stgolabs.net
Cc: jgross@suse.com
Cc: kernellwp@gmail.com
Cc: konrad.wilk@oracle.com
Cc: linuxppc-dev@lists.ozlabs.org
Cc: mpe@ellerman.id.au
Cc: paulmck@linux.vnet.ibm.com
Cc: paulus@samba.org
Cc: rkrcmar@redhat.com
Cc: virtualization@lists.linux-foundation.org
Cc: will.deacon@arm.com
Cc: xen-devel-request@lists.xenproject.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1478077718-37424-7-git-send-email-xinhui.pan@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Group validation expects all events to be of the same PMU; however
is_uncore_pmu() is too wide, it matches _all_ uncore events, even
across PMUs.
This triggers failure when we group different events from different
uncore PMUs, like:
perf stat -vv -e '{uncore_cbox_0/config=0x0334/,uncore_qpi_0/event=1/}' -a sleep 1
Fix is_uncore_pmu() by only matching events to the box at hand.
Note that generic code; ran after this step; will disallow this
mixture of PMU events.
Reported-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vince@deater.net>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/20161118125354.GQ3117@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Vince Weaver reported that perf_fuzzer + KASAN detects that PEBS event
unwinds sometimes do 'weird' things. In particular, we seemed to be
ending up unwinding from random places on the NMI stack.
While it was somewhat expected that the event record BP,SP would not
match the interrupt BP,SP in that the interrupt is strictly later than
the record event, it was overlooked that it could be on an already
overwritten stack.
Therefore, don't copy the recorded BP,SP over the interrupted BP,SP
when we need stack unwinds.
Note that its still possible the unwind doesn't full match the actual
event, as its entirely possible to have done an (I)RET between record
and interrupt, but on average it should still point in the general
direction of where the event came from. Also, it's the best we can do,
considering.
The particular scenario that triggered the bogus NMI stack unwind was
a PEBS event with very short period, upon enabling the event at the
tail of the PMI handler (FREEZE_ON_PMI is not used), it instantly
triggers a record (while still on the NMI stack) which in turn
triggers the next PMI. This then causes back-to-back NMIs and we'll
try and unwind the stack-frame from the last NMI, which obviously is
now overwritten by our own.
Analyzed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: davej@codemonkey.org.uk <davej@codemonkey.org.uk>
Cc: dvyukov@google.com <dvyukov@google.com>
Cc: stable@vger.kernel.org
Fixes: ca037701a025 ("perf, x86: Add PEBS infrastructure")
Link: http://lkml.kernel.org/r/20161117171731.GV3157@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The following commit:
75925e1ad7f5 ("perf/x86: Optimize stack walk user accesses")
... switched from copy_from_user_nmi() to __copy_from_user_nmi() with a manual
access_ok() check.
Unfortunately, copy_from_user_nmi() does an explicit check against TASK_SIZE,
whereas the access_ok() uses whatever the current address limit of the task is.
We are getting NMIs when __probe_kernel_read() has switched to KERNEL_DS, and
then see vmalloc faults when we access what looks like pointers into vmalloc
space:
[] WARNING: CPU: 3 PID: 3685731 at arch/x86/mm/fault.c:435 vmalloc_fault+0x289/0x290
[] CPU: 3 PID: 3685731 Comm: sh Tainted: G W 4.6.0-5_fbk1_223_gdbf0f40 #1
[] Call Trace:
[] <NMI> [<ffffffff814717d1>] dump_stack+0x4d/0x6c
[] [<ffffffff81076e43>] __warn+0xd3/0xf0
[] [<ffffffff81076f2d>] warn_slowpath_null+0x1d/0x20
[] [<ffffffff8104a899>] vmalloc_fault+0x289/0x290
[] [<ffffffff8104b5a0>] __do_page_fault+0x330/0x490
[] [<ffffffff8104b70c>] do_page_fault+0xc/0x10
[] [<ffffffff81794e82>] page_fault+0x22/0x30
[] [<ffffffff81006280>] ? perf_callchain_user+0x100/0x2a0
[] [<ffffffff8115124f>] get_perf_callchain+0x17f/0x190
[] [<ffffffff811512c7>] perf_callchain+0x67/0x80
[] [<ffffffff8114e750>] perf_prepare_sample+0x2a0/0x370
[] [<ffffffff8114e840>] perf_event_output+0x20/0x60
[] [<ffffffff8114aee7>] ? perf_event_update_userpage+0xc7/0x130
[] [<ffffffff8114ea01>] __perf_event_overflow+0x181/0x1d0
[] [<ffffffff8114f484>] perf_event_overflow+0x14/0x20
[] [<ffffffff8100a6e3>] intel_pmu_handle_irq+0x1d3/0x490
[] [<ffffffff8147daf7>] ? copy_user_enhanced_fast_string+0x7/0x10
[] [<ffffffff81197191>] ? vunmap_page_range+0x1a1/0x2f0
[] [<ffffffff811972f1>] ? unmap_kernel_range_noflush+0x11/0x20
[] [<ffffffff814f2056>] ? ghes_copy_tofrom_phys+0x116/0x1f0
[] [<ffffffff81040d1d>] ? x2apic_send_IPI_self+0x1d/0x20
[] [<ffffffff8100411d>] perf_event_nmi_handler+0x2d/0x50
[] [<ffffffff8101ea31>] nmi_handle+0x61/0x110
[] [<ffffffff8101ef94>] default_do_nmi+0x44/0x110
[] [<ffffffff8101f13b>] do_nmi+0xdb/0x150
[] [<ffffffff81795187>] end_repeat_nmi+0x1a/0x1e
[] [<ffffffff8147daf7>] ? copy_user_enhanced_fast_string+0x7/0x10
[] [<ffffffff8147daf7>] ? copy_user_enhanced_fast_string+0x7/0x10
[] [<ffffffff8147daf7>] ? copy_user_enhanced_fast_string+0x7/0x10
[] <<EOE>> <IRQ> [<ffffffff8115d05e>] ? __probe_kernel_read+0x3e/0xa0
Fix this by moving the valid_user_frame() check to before the uaccess
that loads the return address and the pointer to the next frame.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Fixes: 75925e1ad7f5 ("perf/x86: Optimize stack walk user accesses")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The Unified Memory Controllers (UMCs) on Fam17h log a normalized address
in their MCA_ADDR registers. We need to convert that normalized address
to a system physical address in order to support a few facilities:
1) To offline poisoned pages in DRAM proactively in the deferred error
handler.
2) To print sysaddr and page info for DRAM ECC errors in EDAC.
[ Boris: fixes/cleanups ontop:
* hi_addr_offset = 0 - no need for that branch. Stick it all under the
HiAddrOffsetEn case. It confines hi_addr_offset's declaration too.
* Move variables to the innermost scope they're used at so that we save
on stack and not blow it up immediately on function entry.
* Do not modify *sys_addr prematurely - we want to not exit early and
have modified *sys_addr some, which callers get to see. We either
convert to a sys_addr or we don't do anything. And we signal that with
the retval of the function.
* Rename label out -> out_err - because it is the error path.
* No need to pr_err of the conversion failed case: imagine a
sparsely-populated machine with UMCs which don't have DIMMs. Callers
should look at the retval instead and issue a printk only when really
necessary. No need for useless info in dmesg.
* s/temp_reg/tmp/ and other variable names shortening => shorter code.
* Use BIT() everywhere.
* Make error messages more informative.
* Small build fix for the !CONFIG_X86_MCE_AMD case.
* ... and more minor cleanups.
]
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20161122111133.mjzpvzhf7o7yl2oa@pd.tnic
[ Typo fixes. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
NMI stack dumps are bracketed by the following tags:
<NMI>
...
<EOE>
The ending tag is kind of confusing if you don't already know what "EOE"
means (end of exception). The same ending tag is also used to mark the
end of all other exceptions' stacks. For example:
<#DF>
...
<EOE>
And similarly, "EOI" is used as the ending tag for interrupts:
<IRQ>
...
<EOI>
Change the tags to be more comprehensible by making them symmetrical and
more XML-esque:
<NMI>
...
</NMI>
<#DF>
...
</#DF>
<IRQ>
...
</IRQ>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/180196e3754572540b595bc56b947d43658979a7.1479491159.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Rename the watchdog platform library file to explicitly show that is used only
on Intel Merrifield platforms.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20161118172723.179761-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Since the bootloader may load the compressed x86 kernel at any address,
it should always be built as PIE, not just when CONFIG_RELOCATABLE=y.
Otherwise, linker in binutils 2.27 will optimize GOT load into the
absolute address when building the compressed x86 kernel as a non-PIE
executable.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
[ Small wording changes. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
So adding thresholding_en et al was a good thing for removing the
per-CPU thresholding callback, i.e., threshold_cpu_callback.
But, in order for it to work and especially that test in
mce_threshold_create_device() so that all thresholding banks get
properly created and not the whole thing to fail with a NULL ptr
dereference at mce_cpu_pre_down() when we offline the CPUs, we need to
set the thresholding_en flag *before* we start creating the devices.
Yap, it failed because thresholding_en wasn't set at the time
we were creating the banks so we didn't create any and then at
mce_cpu_pre_down() -> mce_threshold_remove_device() time, we would blow
up.
And the fix is actually easy: we have thresholding on the system when we
have managed to set the thresholding vector to amd_threshold_interrupt()
earlier in mce_amd_feature_init() while we were picking apart the
thresholding banks and what is set and what not.
So let's do that.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Fixes: 4d7b02d58c40 ("x86/mcheck: Split threshold_cpu_callback into two callbacks")
Link: http://lkml.kernel.org/r/20161119103402.5227-1-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Watchdog device in Intel Tangier relies on SCU to be present. It uses the SCU
IPC channel to send commands and receive responses. If watchdog driver is
initialized quite before SCU and a command has been sent the result is always
an error like the following:
intel_mid_wdt: Error stopping watchdog: 0xffffffed
Register watchdog device whne SCU is ready to avoid described issue.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20161118165224.175514-1-andriy.shevchenko@linux.intel.com
[ Small cleanups. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Robert O'Callahan reported that after an execve PTRACE_GETREGSET
NT_X86_XSTATE continues to return the pre-exec register values
until the exec'ed task modifies FPU state.
The test code is at:
https://bugzilla.redhat.com/attachment.cgi?id=1164286.
What is happening is fpu__clear() does not properly clear fpstate.
Fix it by doing just that.
Reported-by: Robert O'Callahan <robert@ocallahan.org>
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Cc: <stable@vger.kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1479402695-6553-1-git-send-email-yu-cheng.yu@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Linux will have all kinds of sporadic problems on systems that don't
have the CPUID instruction unless CONFIG_M486=y. In particular,
sync_core() will explode.
I believe that these kernels had a better chance of working before
commit 05fb3c199bb0 ("x86/boot: Initialize FPU and X86_FEATURE_ALWAYS
even if we don't have CPUID"). That commit inadvertently fixed a
serious bug: we used to fail to detect the FPU if CPUID wasn't
present. Because we also used to forget to set X86_FEATURE_ALWAYS, we
end up with no cpu feature bits set at all. This meant that
alternative patching didn't do anything and, if paravirt was disabled,
we could plausibly finish the entire boot process without calling
sync_core().
Rather than trying to work around these issues, just have the kernel
fail loudly if it's running on a CPUID-less 486, doesn't have CPUID,
and doesn't have CONFIG_M486 set.
Reported-by: Matthew Whitehead <tedheadster@gmail.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/70eac6639f23df8be5fe03fa1984aedd5d40077a.1479598603.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
On the 80486 DX, it seems that some exceptions may leave garbage in
the high bits of CS. This causes sporadic failures in which
early_fixup_exception() refuses to fix up an exception.
As far as I can tell, this has been buggy for a long time, but the
problem seems to have been exacerbated by commits:
1e02ce4cccdc ("x86: Store a per-cpu shadow copy of CR4")
e1bfc11c5a6f ("x86/init: Fix cr4_init_shadow() on CR4-less machines")
This appears to have broken for as long as we've had early
exception handling.
[ Note to stable maintainers: This patch is needed all the way back to 3.4,
but it will only apply to 4.6 and up, as it depends on commit:
0e861fbb5bda ("x86/head: Move early exception panic code into early_fixup_exception()")
If you want to backport to kernels before 4.6, please don't backport the
prerequisites (there was a big chain of them that rewrote a lot of the
early exception machinery); instead, ask me and I can send you a one-liner
that will apply. ]
Reported-by: Matthew Whitehead <tedheadster@gmail.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Fixes: 4c5023a3fa2e ("x86-32: Handle exception table entries during early boot")
Link: http://lkml.kernel.org/r/cb32c69920e58a1a58e7b5cad975038a69c0ce7d.1479609510.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Pull KVM fixes from Radim Krčmář:
"ARM:
- Fix handling of the 32bit cycle counter
- Fix cycle counter filtering
x86:
- Fix a race leading to double unregistering of user notifiers
- Amend oversight in kvm_arch_set_irq that turned Hyper-V code dead
- Use SRCU around kvm_lapic_set_vapic_addr
- Avoid recursive flushing of asynchronous page faults
- Do not rely on deferred update in KVM_GET_CLOCK, which fixes #GP
- Let userspace know that KVM_GET_CLOCK is useful with master clock;
4.9 changed the return value to better match the guest clock, but
didn't provide means to let guests take advantage of it"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: x86: merge kvm_arch_set_irq and kvm_arch_set_irq_inatomic
KVM: x86: fix missed SRCU usage in kvm_lapic_set_vapic_addr
KVM: async_pf: avoid recursive flushing of work items
kvm: kvmclock: let KVM_GET_CLOCK return whether the master clock is in use
KVM: Disable irq while unregistering user notifier
KVM: x86: do not go through vcpu in __get_kvmclock_ns
KVM: arm64: Fix the issues when guest PMCCFILTR is configured
arm64: KVM: pmu: Fix AArch32 cycle counter access
|
|
kvm_arch_set_irq is unused since commit b97e6de9c96. Merge
its functionality with kvm_arch_set_irq_inatomic.
Reported-by: Jiang Biao <jiang.biao2@zte.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Reported by syzkaller:
[ INFO: suspicious RCU usage. ]
4.9.0-rc4+ #47 Not tainted
-------------------------------
./include/linux/kvm_host.h:536 suspicious rcu_dereference_check() usage!
stack backtrace:
CPU: 1 PID: 6679 Comm: syz-executor Not tainted 4.9.0-rc4+ #47
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
ffff880039e2f6d0 ffffffff81c2e46b ffff88003e3a5b40 0000000000000000
0000000000000001 ffffffff83215600 ffff880039e2f700 ffffffff81334ea9
ffffc9000730b000 0000000000000004 ffff88003c4f8420 ffff88003d3f8000
Call Trace:
[< inline >] __dump_stack lib/dump_stack.c:15
[<ffffffff81c2e46b>] dump_stack+0xb3/0x118 lib/dump_stack.c:51
[<ffffffff81334ea9>] lockdep_rcu_suspicious+0x139/0x180 kernel/locking/lockdep.c:4445
[< inline >] __kvm_memslots include/linux/kvm_host.h:534
[< inline >] kvm_memslots include/linux/kvm_host.h:541
[<ffffffff8105d6ae>] kvm_gfn_to_hva_cache_init+0xa1e/0xce0 virt/kvm/kvm_main.c:1941
[<ffffffff8112685d>] kvm_lapic_set_vapic_addr+0xed/0x140 arch/x86/kvm/lapic.c:2217
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Fixes: fda4e2e85589191b123d31cdc21fd33ee70f50fd
Cc: Andrew Honig <ahonig@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Userspace can read the exact value of kvmclock by reading the TSC
and fetching the timekeeping parameters out of guest memory. This
however is brittle and not necessary anymore with KVM 4.11. Provide
a mechanism that lets userspace know if the new KVM_GET_CLOCK
semantics are in effect, and---since we are at it---if the clock
is stable across all VCPUs.
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Function user_notifier_unregister should be called only once for each
registered user notifier.
Function kvm_arch_hardware_disable can be executed from an IPI context
which could cause a race condition with a VCPU returning to user mode
and attempting to unregister the notifier.
Signed-off-by: Ignacio Alvarado <ikalvarado@google.com>
Cc: stable@vger.kernel.org
Fixes: 18863bdd60f8 ("KVM: x86 shared msr infrastructure")
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Going through the first VCPU is wrong if you follow a KVM_SET_CLOCK with
a KVM_GET_CLOCK immediately after, without letting the VCPU run and
call kvm_guest_time_update.
To fix this, compute the kvmclock value ourselves, using the master
clock (tsc, nsec) pair as the base and the host CPU frequency as
the scale.
Reported-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Pull kbuild fixes from Michal Marek:
"Here are some regression fixes for kbuild:
- modversion support for exported asm symbols (Nick Piggin). The
affected architectures need separate patches adding
asm-prototypes.h.
- fix rebuilds of lib-ksyms.o (Nick Piggin)
- -fno-PIE builds (Sebastian Siewior and Borislav Petkov). This is
not a kernel regression, but one of the Debian gcc package.
Nevertheless, it's quite annoying, so I think it should go into
mainline and stable now"
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
kbuild: Steal gcc's pie from the very beginning
kbuild: be more careful about matching preprocessed asm ___EXPORT_SYMBOL
x86/kexec: add -fno-PIE
scripts/has-stack-protector: add -fno-PIE
kbuild: add -fno-PIE
kbuild: modversions for EXPORT_SYMBOL() for asm
kbuild: prevent lib-ksyms.o rebuilds
|
|
Bring in -rc4 patches so I can successfully merge the sound doc changes.
|
|
Upon removal of the is_idle flag, these routines became NOPs.
Signed-off-by: Len Brown <len.brown@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/822f2c22cc5890f7b8ea0eeec60277eb44505b4e.1479449716.git.len.brown@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Upon removal of the "is_idle" flag, x86_test_and_clear_bit_percpu() is no
longer used.
Signed-off-by: Len Brown <len.brown@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/b334ae6819507e3dfc0a4b33ed974714d067eb4a.1479449716.git.len.brown@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Upon removal of the idle_notifier, all accesses to the "is_idle" flag serve
no purpose.
Signed-off-by: Len Brown <len.brown@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/e4a24197cf9c227fcd1ca2df09999eaec9052f49.1479449716.git.len.brown@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Upon removal of the i7300_idle driver, the idle_notifer is unused.
Signed-off-by: Len Brown <len.brown@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/f15385a82ec4bf51f4f06777193d83f03b28cfdd.1479449716.git.len.brown@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
All places which used the TSC_RELIABLE to skip the delayed calibration
have been converted to use the TSC_KNOWN_FREQ flag.
Make the immeditate clocksource registration, which skips the long term
calibration, solely depend on TSC_KNOWN_FREQ.
The TSC_RELIABLE now merily removes the requirement for a watchdog
clocksource.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Bin Gao <bin.gao@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
|
|
TSC on Intel Atom SoCs capable of determining TSC frequency by MSR is
reliable and the frequency is known (provided by HW).
On these platforms PIT/HPET is generally not available so calibration won't
work at all and there is no other clocksource to act as a watchdog for the
TSC, so we have no other choice than to trust it.
Set both X86_FEATURE_TSC_KNOWN_FREQ and X86_FEATURE_TSC_RELIABLE flags to
make sure the calibration is skipped and no watchdog is required.
Signed-off-by: Bin Gao <bin.gao@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1479241644-234277-5-git-send-email-bin.gao@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
On Intel GOLDMONT Atom SoC TSC is the only available clocksource, so there
is no way to do software calibration or have a watchdog clocksource for it.
Software calibration is already disabled via the TSC_KNOWN_FREQ flag, but
the watchdog requirement still persists, so such systems cannot switch to
high resolution/nohz mode.
Mark it reliable, so it becomes usable. Hardware teams confirmed that this
is safe on that SoC.
Signed-off-by: Bin Gao <bin.gao@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1479241644-234277-4-git-send-email-bin.gao@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
CPUs/SoCs with CPUID leaf 0x15 come with a known frequency and will report
the frequency to software via CPUID instruction. This hardware provided
frequency is the "real" frequency of TSC.
Set the X86_FEATURE_TSC_KNOWN_FREQ flag for such systems to skip the
software calibration process.
A 24 hours test on one of the CPUID 0x15 capable platforms was
conducted. PIT calibrated frequency resulted in more than 3 seconds drift
whereas the CPUID determined frequency showed less than 0.5 second
drift.
Signed-off-by: Bin Gao <bin.gao@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1479241644-234277-3-git-send-email-bin.gao@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The X86_FEATURE_TSC_RELIABLE flag in Linux kernel implies both reliable
(at runtime) and trustable (at calibration). But reliable running and
trustable calibration independent of each other.
Add a new flag X86_FEATURE_TSC_KNOWN_FREQ, which denotes that the frequency
is known (via MSR/CPUID). This flag is only meant to skip the long term
calibration on systems which have a known frequency.
Add X86_FEATURE_TSC_KNOWN_FREQ to the skip the delayed calibration and
leave X86_FEATURE_TSC_RELIABLE in place.
After converting the existing users of X86_FEATURE_TSC_RELIABLE to use
either both flags or just X86_FEATURE_TSC_KNOWN_FREQ we can seperate the
functionality.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Bin Gao <bin.gao@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1479241644-234277-2-git-send-email-bin.gao@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
This patch enables perf core PMU support for the new AMD family-17h processors.
In family-17h, there is no PMC-event constraint. All events, irrespective of
the type, can be measured using any of the six generic performance counters.
Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1479399306-13375-1-git-send-email-Janakarajan.Natarajan@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The oops stack dump code scans the entire stack, which can cause KASAN
"stack-out-of-bounds" false positive warnings. Tell KASAN to ignore it.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: davej@codemonkey.org.uk
Cc: dvyukov@google.com
Link: http://lkml.kernel.org/r/5f6e80c4b0c7f7f0b6211900847a247cdaad753c.1479398226.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The guess unwinder scans the entire stack, which can cause KASAN
"stack-out-of-bounds" false positive warnings. Tell KASAN to ignore it.
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Stephane Eranian <eranian@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: davej@codemonkey.org.uk
Cc: dvyukov@google.com
Link: http://lkml.kernel.org/r/61939c0b2b2d63ce97ba59cba3b00fd47c2962cf.1479398226.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Current multi-buffer hash implementations have a restriction on the total
length of a hash job to 512MB. Hashing larger buffers will result in an
incorrect hash. This extends the limit to 2^62 - 1.
Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Tvrtko needs
commit b3c11ac267d461d3d597967164ff7278a919a39f
Author: Eric Engestrom <eric@engestrom.ch>
Date: Sat Nov 12 01:12:56 2016 +0000
drm: move allocation out of drm_get_format_name()
to be able to apply his patches without conflicts.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
|
|
RDPID is a new instruction that reads MSR_TSC_AUX quickly. This
should be considerably faster than reading the GDT. Add a
cpufeature for it and use it from __vdso_getcpu() when available.
Tested-by: Megha Dey <megha.dey@intel.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/4f6c3a22012d10f1c65b9ca15800e01b42c7d39d.1479320367.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
No need to duplicate the same define everywhere. Since
the only user is stop-machine and the only provider is
s390, we can use a default implementation of cpu_relax_yield()
in sched.h.
Suggested-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Noam Camus <noamc@ezchip.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: kvm@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: linux-s390 <linux-s390@vger.kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: sparclinux@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1479298985-191589-1-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
When show_trace_log_lvl() is called from show_regs(), it completely
fails to dump the stack. This bug was introduced when
show_stack_log_lvl() was removed with the following commit:
0ee1dd9f5e7e ("x86/dumpstack: Remove raw stack dump")
Previous callers of that function now call show_trace_log_lvl()
directly. That resulted in a subtle change, in that the 'stack'
argument can now be NULL in certain cases.
A NULL 'stack' pointer means that the stack dump should start from the
topmost stack frame unless 'regs' is valid, in which case it should
start from 'regs->sp'.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 0ee1dd9f5e7e ("x86/dumpstack: Remove raw stack dump")
Link: http://lkml.kernel.org/r/c551842302a9c222d96a14e42e4003f059509f69.1479362652.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The latest binutils are warning about a .fill directive with an explicit
value in a .bss section:
arch/x86/kernel/head_32.S: Assembler messages:
arch/x86/kernel/head_32.S:677: Warning: ignoring fill value in section `.bss..page_aligned'
arch/x86/kernel/head_32.S:679: Warning: ignoring fill value in section `.bss..page_aligned'
This comes from the 'ENTRY()' macro padding the space between the symbols
with 'nop' via:
.align 4,0x90
Open-coding the .globl directive without the padding avoids that warning,
as all the symbols are already page aligned.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20161116141726.2013389-1-arnd@arndb.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Add a few new AVX512 instruction groups/features for enumeration in
/proc/cpuinfo: AVX512IFMA and AVX512VBMI.
Clear the flags in fpu_xstate_clear_all_cpu_caps().
CPUID.(EAX=7,ECX=0):EBX[bit 21] AVX512IFMA
CPUID.(EAX=7,ECX=0):ECX[bit 1] AVX512VBMI
Detailed information of cpuid bits for the features can be found at
https://bugzilla.kernel.org/show_bug.cgi?id=187891
Signed-off-by: Gayatri Kammela <gayatri.kammela@intel.com>
Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: mingo@elte.hu
Link: http://lkml.kernel.org/r/1479327060-18668-1-git-send-email-gayatri.kammela@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Add two new AVX512 subfeatures support for KVM guest.
AVX512_4VNNIW:
Vector instructions for deep learning enhanced word variable precision.
AVX512_4FMAPS:
Vector instructions for deep learning floating-point single precision.
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: He Chen <he.chen@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
[Changed subject tags.]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Internal errors were reported on 16 bit fxsave and fxrstor with ipxe.
Old Intels don't have unrestricted_guest, so we have to emulate them.
The patch takes advantage of the hardware implementation.
AMD and Intel differ in saving and restoring other fields in first 32
bytes. A test wrote 0xff to the fxsave area, 0 to upper bits of MCSXR
in the fxsave area, executed fxrstor, rewrote the fxsave area to 0xee,
and executed fxsave:
Intel (Nehalem):
7f 1f 7f 7f ff 00 ff 07 ff ff ff ff ff ff 00 00
ff ff ff ff ff ff 00 00 ff ff 00 00 ff ff 00 00
Intel (Haswell -- deprecated FPU CS and FPU DS):
7f 1f 7f 7f ff 00 ff 07 ff ff ff ff 00 00 00 00
ff ff ff ff 00 00 00 00 ff ff 00 00 ff ff 00 00
AMD (Opteron 2300-series):
7f 1f 7f 7f ff 00 ee ee ee ee ee ee ee ee ee ee
ee ee ee ee ee ee ee ee ff ff 00 00 ff ff 02 00
fxsave/fxrstor will only be emulated on early Intels, so KVM can't do
much to improve the situation.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|