Age | Commit message (Collapse) | Author |
|
The msr tracing for writes is incorrectly conditional on the read trace.
Fixes: 7f47d8cc039f "x86, tracing, perf: Add trace point for MSR accesses"
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: stable@vger.kernel.org
Cc: ak@linux.intel.com
Link: http://lkml.kernel.org/r/1464976859-21850-1-git-send-email-dgilbert@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Commit 3195ef59cb42 ("x86: Do full rtc synchronization with ntp") had
the side-effect of unconditionally enabling the RTC_LIB symbol on x86,
which in turn disables the selection of the CONFIG_RTC and
CONFIG_GEN_RTC drivers that contain a two older implementations of
the CONFIG_RTC_DRV_CMOS driver.
This removes x86 from the list for genrtc, and changes all references
to the asm/rtc.h header to instead point to the interfaces
from linux/mc146818rtc.h.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
|
|
Drivers should not really include stuff from asm-generic directly,
and the PC-style cmos rtc driver does this in order to reuse the
mc146818 implementation of get_rtc_time/set_rtc_time rather than
the architecture specific one for the architecture it gets built for.
To make it more obvious what is going on, this moves and renames the
two functions into include/linux/mc146818rtc.h, which holds the
other mc146818 specific code. Ideally it would be in a .c file,
but that would require extra infrastructure as the functions are
called by multiple drivers with conflicting dependencies.
With this change, the asm-generic/rtc.h header also becomes much
more generic, so it can be reused more easily across any architecture
that still relies on the genrtc driver.
The only caller of the internal __get_rtc_time/__set_rtc_time
functions is in arch/alpha/kernel/rtc.c, and we just change those
over to the new naming.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
|
|
arch/x86/kvm/iommu.c includes <linux/intel-iommu.h> and <linux/dmar.h>, which
both are unnecessary, in fact incorrect to be here as they are intel specific.
Building kvm on x86 passed after removing above inclusion.
Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
The syzkaller folks reported a NULL pointer dereference that seems
to be cause by a race between KVM_CREATE_IRQCHIP and KVM_CREATE_PIT2.
The former takes kvm->lock (except when registering the devices,
which needs kvm->slots_lock); the latter takes kvm->slots_lock only.
Change KVM_CREATE_PIT2 to follow the same model as KVM_CREATE_IRQCHIP.
Testcase:
#include <pthread.h>
#include <linux/kvm.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <stdint.h>
#include <string.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>
long r[23];
void* thr1(void* arg)
{
struct kvm_pit_config pitcfg = { .flags = 4 };
switch ((long)arg) {
case 0: r[2] = open("/dev/kvm", O_RDONLY|O_ASYNC); break;
case 1: r[3] = ioctl(r[2], KVM_CREATE_VM, 0); break;
case 2: r[4] = ioctl(r[3], KVM_CREATE_IRQCHIP, 0); break;
case 3: r[22] = ioctl(r[3], KVM_CREATE_PIT2, &pitcfg); break;
}
return 0;
}
int main(int argc, char **argv)
{
long i;
pthread_t th[4];
memset(r, -1, sizeof(r));
for (i = 0; i < 4; i++) {
pthread_create(&th[i], 0, thr, (void*)i);
if (argc > 1 && rand()%2) usleep(rand()%1000);
}
usleep(20000);
return 0;
}
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Make the function names more similar between KVM_REQ_NMI and KVM_REQ_SMI.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
If the processor exits to KVM while delivering an interrupt,
the hypervisor then requeues the interrupt for the next vmentry.
Trying to enter SMM in this same window causes to enter non-root
mode in emulated SMM (i.e. with IF=0) and with a request to
inject an IRQ (i.e. with a valid VM-entry interrupt info field).
This is invalid guest state (SDM 26.3.1.4 "Check on Guest RIP
and RFLAGS") and the processor fails vmentry.
The fix is to defer the injection from KVM_REQ_SMI to KVM_REQ_EVENT,
like we already do for e.g. NMIs. This patch doesn't change the
name of the process_smi function so that it can be applied to
stable releases. The next patch will modify the names so that
process_nmi and process_smi handle respectively KVM_REQ_NMI and
KVM_REQ_SMI.
This is especially common with Windows, probably due to the
self-IPI trick that it uses to deliver deferred procedure
calls (DPCs).
Reported-by: Laszlo Ersek <lersek@redhat.com>
Reported-by: Michał Zegan <webczat_200@poczta.onet.pl>
Fixes: 64d6067057d9658acb8675afcfba549abdb7fc16
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Now that we have topology_max_smt_threads() use it
to detect the HT workarounds for older CPUs.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/1463703002-19686-6-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Add topdown event declarations to Silvermont / Airmont.
These cores do not support the full Top Down metrics, but an useful
subset (FrontendBound, Retiring, Backend Bound/Bad Speculation).
The perf stat tool automatically handles the missing events
and combines the available metrics.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/1463703002-19686-5-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Add declarations for the events needed for topdown to the
Intel big core CPUs starting with Sandy Bridge. We need
to report different values if HyperThreading is on or off.
The only thing this patch does is to export some events
in sysfs.
topdown level 1 uses a set of abstracted metrics which
are generic to out of order CPU cores (although some
CPUs may not implement all of them):
topdown-total-slots Available slots in the pipeline
topdown-slots-issued Slots issued into the pipeline
topdown-slots-retired Slots successfully retired
topdown-fetch-bubbles Pipeline gaps in the frontend
topdown-recovery-bubbles Pipeline gaps during recovery
from misspeculation
A slot is a single operation in the CPU pipe line.
These metrics then allow to compute four useful metrics:
FrontendBound, BackendBound, Retiring, BadSpeculation.
The formulas to compute the metrics are generic, they
only change based on the availability on the abstracted
input values.
The kernel declares the events supported by the current
CPU and their scaling factors (such as the pipeline width)
and perf stat then computes the formulas based on the
available metrics. This is similar how existing
perf metrics, such as TSC metrics or IPC, are implemented.
This abstracts all CPU pipe line specific knowledge in the
kernel driver, but still avoids the need for larger scale perf
interface changes.
For HyperThreading the any bit is needed to get accurate
values when both threads are executing. This implies that
the events can only be collected as root or with
perf_event_paranoid=-1 for now.
The basic scheme is based on the following paper:
Yasin, A Top Down Method for Performance analysis and Counter architecture ISPASS14 (pdf available via google)
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/1463703002-19686-4-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Add a way to show different sysfs events attributes depending on
HyperThreading is on or off. This is difficult to determine
early at boot, so we just do it dynamically when the sysfs
attribute is read.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/1463703002-19686-3-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
For SMT specific workarounds it is useful to know if SMT is active
on any online CPU in the system. This currently requires a loop
over all online CPUs.
Add a global variable that is updated with the maximum number
of smt threads on any CPU on online/offline, and use it for
topology_max_smt_threads()
The single call is easier to use than a loop.
Not exported to user space because user space already can use
the existing sibling interfaces to find this out.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/1463703002-19686-2-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Some platforms, e.g. Knights Landing, use a common PCI device ID for
multiple instances of an uncore PMU device type. So it is impossible to
locate the specific instances only by PCI device ID.
The current code specially handles Knights Landing by arbitrarily pointing
an instance to an unused uncore box. However, we still have no idea
which uncore device is mapped to which box.
Furthermore, there could be more platforms which use a common PCI device ID
for uncore devices. We have to specially handle them one by one.
This patch records full device information (slot, func, and device ID)
in id_table[]. So the probe function can point the instance to a specific
uncore box by checking the full device information.
Tested-by: Lukasz Odzioba <lukasz.odzioba@intel.com>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: tglx@linutronix.de
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: bp@suse.de
Cc: harish.chegondi@intel.com
Cc: hubert.chrzaniuk@intel.com
Cc: lawrence.f.meadows@intel.com
Link: http://lkml.kernel.org/r/1463379504-39003-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Due to change in register definition we need to update OCR mask:
MSR_OFFCORE_RESP0 reserved bits: 3,4,18,29,30,33,34, 8,11,14
MSR_OFFCORE_RESP1 reserved bits: 3,4,18,29,30,33,34, 38
Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Lukasz Odzioba <lukasz.odzioba@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: akpm@linux-foundation.org
Cc: hpa@zytor.com
Cc: kan.liang@intel.com
Cc: lukasz.anaczkowski@intel.com
Cc: zheng.z.yan@intel.com
Link: http://lkml.kernel.org/r/1463433419-16893-1-git-send-email-lukasz.odzioba@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Add the 'static' keyword to intel_bdw_event_constraints[], snb_events_attrs[],
nhm_events_attrs[] and intel_skl_event_constraints arrays[], because
they are only used locally.
Signed-off-by: Lukasz Odzioba <lukasz.odzioba@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: akpm@linux-foundation.org
Cc: hpa@zytor.com
Cc: kan.liang@intel.com
Cc: lukasz.anaczkowski@intel.com
Cc: zheng.z.yan@intel.com
Link: http://lkml.kernel.org/r/1463433378-16816-1-git-send-email-lukasz.odzioba@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
There was a report that on certain Broadwell-EP systems writing any bit of
the SBOX PMU initialization MSR would #GP at boot. This did not happen
on all systems. My test systems booted fine.
Considering both DE and EP may have such issues, this patch removes SBOX
support for all Broadwell platforms for now.
Reported-and-tested-by: Mark van Dijk <mark@voidzero.net>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1464347540-5763-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
MOV to DR6 or DR7 causes a #GP if an attempt is made to write a 1 to
any of bits 63:32. However, this is not detected at KVM_SET_DEBUGREGS
time, and the next KVM_RUN oopses:
general protection fault: 0000 [#1] SMP
CPU: 2 PID: 14987 Comm: a.out Not tainted 4.4.9-300.fc23.x86_64 #1
Hardware name: LENOVO 2325F51/2325F51, BIOS G2ET32WW (1.12 ) 05/30/2012
[...]
Call Trace:
[<ffffffffa072c93d>] kvm_arch_vcpu_ioctl_run+0x141d/0x14e0 [kvm]
[<ffffffffa071405d>] kvm_vcpu_ioctl+0x33d/0x620 [kvm]
[<ffffffff81241648>] do_vfs_ioctl+0x298/0x480
[<ffffffff812418a9>] SyS_ioctl+0x79/0x90
[<ffffffff817a0f2e>] entry_SYSCALL_64_fastpath+0x12/0x71
Code: 55 83 ff 07 48 89 e5 77 27 89 ff ff 24 fd 90 87 80 81 0f 23 fe 5d c3 0f 23 c6 5d c3 0f 23 ce 5d c3 0f 23 d6 5d c3 0f 23 de 5d c3 <0f> 23 f6 5d c3 0f 0b 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00
RIP [<ffffffff810639eb>] native_set_debugreg+0x2b/0x40
RSP <ffff88005836bd50>
Testcase (beautified/reduced from syzkaller output):
#include <unistd.h>
#include <sys/syscall.h>
#include <string.h>
#include <stdint.h>
#include <linux/kvm.h>
#include <fcntl.h>
#include <sys/ioctl.h>
long r[8];
int main()
{
struct kvm_debugregs dr = { 0 };
r[2] = open("/dev/kvm", O_RDONLY);
r[3] = ioctl(r[2], KVM_CREATE_VM, 0);
r[4] = ioctl(r[3], KVM_CREATE_VCPU, 7);
memcpy(&dr,
"\x5d\x6a\x6b\xe8\x57\x3b\x4b\x7e\xcf\x0d\xa1\x72"
"\xa3\x4a\x29\x0c\xfc\x6d\x44\x00\xa7\x52\xc7\xd8"
"\x00\xdb\x89\x9d\x78\xb5\x54\x6b\x6b\x13\x1c\xe9"
"\x5e\xd3\x0e\x40\x6f\xb4\x66\xf7\x5b\xe3\x36\xcb",
48);
r[7] = ioctl(r[4], KVM_SET_DEBUGREGS, &dr);
r[6] = ioctl(r[4], KVM_RUN, 0);
}
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
This cannot be returned by KVM_GET_VCPU_EVENTS, so it is okay to return
EINVAL. It causes a WARN from exception_type:
WARNING: CPU: 3 PID: 16732 at arch/x86/kvm/x86.c:345 exception_type+0x49/0x50 [kvm]()
CPU: 3 PID: 16732 Comm: a.out Tainted: G W 4.4.6-300.fc23.x86_64 #1
Hardware name: LENOVO 2325F51/2325F51, BIOS G2ET32WW (1.12 ) 05/30/2012
0000000000000286 000000006308a48b ffff8800bec7fcf8 ffffffff813b542e
0000000000000000 ffffffffa0966496 ffff8800bec7fd30 ffffffff810a40f2
ffff8800552a8000 0000000000000000 00000000002c267c 0000000000000001
Call Trace:
[<ffffffff813b542e>] dump_stack+0x63/0x85
[<ffffffff810a40f2>] warn_slowpath_common+0x82/0xc0
[<ffffffff810a423a>] warn_slowpath_null+0x1a/0x20
[<ffffffffa0924809>] exception_type+0x49/0x50 [kvm]
[<ffffffffa0934622>] kvm_arch_vcpu_ioctl_run+0x10a2/0x14e0 [kvm]
[<ffffffffa091c04d>] kvm_vcpu_ioctl+0x33d/0x620 [kvm]
[<ffffffff81241248>] do_vfs_ioctl+0x298/0x480
[<ffffffff812414a9>] SyS_ioctl+0x79/0x90
[<ffffffff817a04ee>] entry_SYSCALL_64_fastpath+0x12/0x71
---[ end trace b1a0391266848f50 ]---
Testcase (beautified/reduced from syzkaller output):
#include <unistd.h>
#include <sys/syscall.h>
#include <string.h>
#include <stdint.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>
long r[31];
int main()
{
memset(r, -1, sizeof(r));
r[2] = open("/dev/kvm", O_RDONLY);
r[3] = ioctl(r[2], KVM_CREATE_VM, 0);
r[7] = ioctl(r[3], KVM_CREATE_VCPU, 0);
struct kvm_vcpu_events ve = {
.exception.injected = 1,
.exception.nr = 0xd4
};
r[27] = ioctl(r[7], KVM_SET_VCPU_EVENTS, &ve);
r[30] = ioctl(r[7], KVM_RUN, 0);
return 0;
}
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
This causes an ugly dmesg splat. Beautified syzkaller testcase:
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/ioctl.h>
#include <fcntl.h>
#include <linux/kvm.h>
long r[8];
int main()
{
struct kvm_cpuid2 c = { 0 };
r[2] = open("/dev/kvm", O_RDWR);
r[3] = ioctl(r[2], KVM_CREATE_VM, 0);
r[4] = ioctl(r[3], KVM_CREATE_VCPU, 0x8);
r[7] = ioctl(r[4], KVM_SET_CPUID, &c);
return 0;
}
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Found by syzkaller:
WARNING: CPU: 3 PID: 15175 at arch/x86/kvm/x86.c:7705 __x86_set_memory_region+0x1dc/0x1f0 [kvm]()
CPU: 3 PID: 15175 Comm: a.out Tainted: G W 4.4.6-300.fc23.x86_64 #1
Hardware name: LENOVO 2325F51/2325F51, BIOS G2ET32WW (1.12 ) 05/30/2012
0000000000000286 00000000950899a7 ffff88011ab3fbf0 ffffffff813b542e
0000000000000000 ffffffffa0966496 ffff88011ab3fc28 ffffffff810a40f2
00000000000001fd 0000000000003000 ffff88014fc50000 0000000000000000
Call Trace:
[<ffffffff813b542e>] dump_stack+0x63/0x85
[<ffffffff810a40f2>] warn_slowpath_common+0x82/0xc0
[<ffffffff810a423a>] warn_slowpath_null+0x1a/0x20
[<ffffffffa09251cc>] __x86_set_memory_region+0x1dc/0x1f0 [kvm]
[<ffffffffa092521b>] x86_set_memory_region+0x3b/0x60 [kvm]
[<ffffffffa09bb61c>] vmx_set_tss_addr+0x3c/0x150 [kvm_intel]
[<ffffffffa092f4d4>] kvm_arch_vm_ioctl+0x654/0xbc0 [kvm]
[<ffffffffa091d31a>] kvm_vm_ioctl+0x9a/0x6f0 [kvm]
[<ffffffff81241248>] do_vfs_ioctl+0x298/0x480
[<ffffffff812414a9>] SyS_ioctl+0x79/0x90
[<ffffffff817a04ee>] entry_SYSCALL_64_fastpath+0x12/0x71
Testcase:
#include <unistd.h>
#include <sys/ioctl.h>
#include <fcntl.h>
#include <string.h>
#include <linux/kvm.h>
long r[8];
int main()
{
memset(r, -1, sizeof(r));
r[2] = open("/dev/kvm", O_RDONLY|O_TRUNC);
r[3] = ioctl(r[2], KVM_CREATE_VM, 0x0ul);
r[5] = ioctl(r[3], KVM_SET_TSS_ADDR, 0x20000000ul);
r[7] = ioctl(r[3], KVM_SET_TSS_ADDR, 0x20000000ul);
return 0;
}
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Intel CPUs having Turbo Boost feature implement an MSR to provide a
control interface via rdmsr/wrmsr instructions. One could detect the
presence of this feature by issuing one of these instructions and
handling the #GP exception which is generated in case the referenced MSR
is not implemented by the CPU.
KVM's vCPU model behaves exactly as a real CPU in this case by injecting
a fault when MSR_IA32_PERF_CTL is called (which KVM does not support).
However, some operating systems use this register during an early boot
stage in which their kernel is not capable of handling #GP correctly,
causing #DP and finally a triple fault effectively resetting the vCPU.
This patch implements a dummy handler for MSR_IA32_PERF_CTL to avoid the
crashes.
Signed-off-by: Dmitry Bilunov <kmeaw@yandex-team.ru>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
In theory, nothing prevents the compiler from write-tearing PTEs, or
split PTE writes. These partially-modified PTEs can be fetched by other
cores and cause mayhem. I have not really encountered such case in
real-life, but it does seem possible.
For example, the compiler may try to do something creative for
kvm_set_pte_rmapp() and perform multiple writes to the PTE.
Signed-off-by: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Currently there are several checkpatch warnings in the sha1_mb.c file:
'WARNING: line over 80 characters' in the sha1_mb.c file. Also, the
syntax of some multi-line comments are not correct. This patch fixes
these issues.
Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
into drm-next
drm-intel-next-2016-05-22:
- cmd-parser support for direct reg->reg loads (Ken Graunke)
- better handle DP++ smart dongles (Ville)
- bxt guc fw loading support (Nick Hoathe)
- remove a bunch of struct typedefs from dpll code (Ander)
- tons of small work all over to avoid casting between drm_device and the i915
dev struct (Tvrtko&Chris)
- untangle request retiring from other operations, also fixes reset stat corner
cases (Chris)
- skl atomic watermark support from Matt Roper, yay!
- various wm handling bugfixes from Ville
- big pile of cdclck rework for bxt/skl (Ville)
- CABC (Content Adaptive Brigthness Control) for dsi panels (Jani&Deepak M)
- nonblocking atomic commits for plane-only updates (Maarten Lankhorst)
- bunch of PSR fixes&improvements
- untangle our map/pin/sg_iter code a bit (Dave Gordon)
drm-intel-next-2016-05-08:
- refactor stolen quirks to share code between early quirks and i915 (Joonas)
- refactor gem BO/vma funcstion (Tvrtko&Dave)
- backlight over DPCD support (Yetunde Abedisi)
- more dsi panel sequence support (Jani)
- lots of refactoring around handling iomaps, vma, ring access and related
topics culmulating in removing the duplicated request tracking in the execlist
code (Chris & Tvrtko) includes a small patch for core iomapping code
- hw state readout for bxt dsi (Ramalingam C)
- cdclk cleanups (Ville)
- dedupe chv pll code a bit (Ander)
- enable semaphores on gen8+ for legacy submission, to be able to have a direct
comparison against execlist on the same platform (Chris) Not meant to be used
for anything else but performance tuning
- lvds border bit hw state checker fix (Jani)
- rpm vs. shrinker/oom-notifier fixes (Praveen Paneri)
- l3 tuning (Imre)
- revert mst dp audio, it's totally non-functional and crash-y (Lyude)
- first official dmc for kbl (Rodrigo)
- and tons of small things all over as usual
* 'drm-intel-next' of git://anongit.freedesktop.org/drm-intel: (194 commits)
drm/i915: Revert async unpin and nonblocking atomic commit
drm/i915: Update DRIVER_DATE to 20160522
drm/i915: Inline sg_next() for the optimised SGL iterator
drm/i915: Introduce & use new lightweight SGL iterators
drm/i915: optimise i915_gem_object_map() for small objects
drm/i915: refactor i915_gem_object_pin_map()
drm/i915/psr: Implement PSR2 w/a for gen9
drm/i915/psr: Use ->get_aux_send_ctl functions
drm/i915/psr: Order DP aux transactions correctly
drm/i915/psr: Make idle_frames sensible again
drm/i915/psr: Try to program link training times correctly
drm/i915/userptr: Convert to drm_i915_private
drm/i915: Allow nonblocking update of pageflips.
drm/i915: Check for unpin correctness.
Reapply "drm/i915: Avoid stalling on pending flips for legacy cursor updates"
drm/i915: Make unpin async.
drm/i915: Prepare connectors for nonblocking checks.
drm/i915: Pass atomic states to fbc update functions.
drm/i915: Remove reset_counter from intel_crtc.
drm/i915: Remove queue_flip pointer.
...
|
|
Add the MODULE_ALIAS for the cra_driver_name of the different ciphers to
allow an automated loading if a driver name is used.
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
acpi_numa_memory_affinity_init() will be reused by arm64. Move it to
drivers/acpi/numa.c to facilitate reuse.
No code change.
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
acpi_numa is default to 0, it's set to -1 when disable acpi numa or
when a bad SRAT is parsed, and it's only consumed in srat_disabled()
(compare it with 0) to continue parse the SRAT or not, so we don't
need to set acpi_numa to 1 when we get a valid SRAT entry.
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
bad_srat() and srat_disabled() are shared by x86 and follow-on arm64
patches. Move them to drivers/acpi/numa.c in preparation for arm64
support.
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Robert Richter <rrichter@cavium.com>
[david.daney@cavium.com moved definitions to drivers/acpi/numa.c]
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Cleanup acpi_numa_processor_affinity_init() in preparation for its
move to drivers/acpi/numa.c. It will be reused by arm64, this has no
functional change.
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Identical implementations of acpi_numa_slit_init() are used by both
x86 and follow-on arm64 support. Move it to drivers/acpi/numa.c, and
guard with CONFIG_X86 || CONFIG_ARM64 because ia64 has its own
architecture specific implementation.
No code change.
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Since acpi_numa_arch_fixup() is only used in arch ia64, move it there
to make a generic interface easier. This avoids empty function stubs
or some complex kconfig options for x86 and arm64.
Signed-off-by: Robert Richter <rrichter@cavium.com>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml
Pull UML updates from Richard Weinberger:
"This contains a nice FPU fixup from Eli Cooper for UML"
* 'for-linus-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
um: add extended processor state save/restore support
um: extend fpstate to _xstate to support YMM registers
um: fix FPU state preservation around signal handlers
|
|
The do_brk() and vm_brk() return value was "unsigned long" and returned
the starting address on success, and an error value on failure. The
reasons are entirely historical, and go back to it basically behaving
like the mmap() interface does.
However, nobody actually wanted that interface, and it causes totally
pointless IS_ERR_VALUE() confusion.
What every single caller actually wants is just the simpler integer
return of zero for success and negative error number on failure.
So just convert to that much clearer and more common calling convention,
and get rid of all the IS_ERR_VALUE() uses wrt vm_brk().
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.infradead.org/users/dvhart/linux-platform-drivers-x86
Pull x86 platform driver updates from Darren Hart:
"Mostly minor updates and cleanups. One new power management
controller driver for Intel Core SoCs.
platform/x86:
- Add PMC Driver for Intel Core SoC
dell-rbtn:
- Ignore ACPI notifications if device is suspended
thinkpad_acpi:
- save kbdlight state on suspend and restore it on resume
intel_menlow:
- reduce code duplication
asus-wmi:
- provide access to ALS control
ideapad-laptop:
- add a new WMI string for ESC key
surfacepro3_button:
- Add a warning when switching to tablet mode
sony-laptop:
- Avoid oops on module unload for older laptops
intel_telemetry:
- Constify telemetry_core_ops structures
fujitsu-laptop:
- Use IS_ENABLED() instead of checking for built-in or module
asus-laptop:
- correct error handling in sysfs_acpi_set
- remove redundant initializers
- correct error handling in asus_read_brightness()
fujitsu-laptop:
- Support radio LED"
* tag 'platform-drivers-x86-v4.7-1' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86:
platform/x86: Add PMC Driver for Intel Core SoC
dell-rbtn: Ignore ACPI notifications if device is suspended
thinkpad_acpi: save kbdlight state on suspend and restore it on resume
intel_menlow: reduce code duplication
asus-wmi: provide access to ALS control
ideapad-laptop: add a new WMI string for ESC key
surfacepro3_button: Add a warning when switching to tablet mode
sony-laptop: Avoid oops on module unload for older laptops
intel_telemetry: Constify telemetry_core_ops structures
fujitsu-laptop: Use IS_ENABLED() instead of checking for built-in or module
asus-laptop: correct error handling in sysfs_acpi_set
asus-laptop: remove redundant initializers
asus-laptop: correct error handling in asus_read_brightness()
fujitsu-laptop: Support radio LED
|
|
Pull second batch of KVM updates from Radim Krčmář:
"General:
- move kvm_stat tool from QEMU repo into tools/kvm/kvm_stat (kvm_stat
had nothing to do with QEMU in the first place -- the tool only
interprets debugfs)
- expose per-vm statistics in debugfs and support them in kvm_stat
(KVM always collected per-vm statistics, but they were summarised
into global statistics)
x86:
- fix dynamic APICv (VMX was improperly configured and a guest could
access host's APIC MSRs, CVE-2016-4440)
- minor fixes
ARM changes from Christoffer Dall:
- new vgic reimplementation of our horribly broken legacy vgic
implementation. The two implementations will live side-by-side
(with the new being the configured default) for one kernel release
and then we'll remove the legacy one.
- fix for a non-critical issue with virtual abort injection to guests"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (70 commits)
tools: kvm_stat: Add comments
tools: kvm_stat: Introduce pid monitoring
KVM: Create debugfs dir and stat files for each VM
MAINTAINERS: Add kvm tools
tools: kvm_stat: Powerpc related fixes
tools: Add kvm_stat man page
tools: Add kvm_stat vm monitor script
kvm:vmx: more complete state update on APICv on/off
KVM: SVM: Add more SVM_EXIT_REASONS
KVM: Unify traced vector format
svm: bitwise vs logical op typo
KVM: arm/arm64: vgic-new: Synchronize changes to active state
KVM: arm/arm64: vgic-new: enable build
KVM: arm/arm64: vgic-new: implement mapped IRQ handling
KVM: arm/arm64: vgic-new: Wire up irqfd injection
KVM: arm/arm64: vgic-new: Add vgic_v2/v3_enable
KVM: arm/arm64: vgic-new: vgic_init: implement map_resources
KVM: arm/arm64: vgic-new: vgic_init: implement vgic_init
KVM: arm/arm64: vgic-new: vgic_init: implement vgic_create
KVM: arm/arm64: vgic-new: vgic_init: implement kvm_vgic_hyp_init
...
|
|
This patch adds the Power Management Controller driver as a PCI driver
for Intel Core SoC architecture.
This driver can utilize debugging capabilities and supported features
as exposed by the Power Management Controller.
Please refer to the below specification for more details on PMC features.
http://www.intel.in/content/www/in/en/chipsets/100-series-chipset-datasheet-vol-2.html
The current version of this driver exposes SLP_S0_RESIDENCY counter.
This counter can be used for detecting fragile SLP_S0 signal related
failures and take corrective actions when PCH SLP_S0 signal is not
asserted after kernel freeze as part of suspend to idle flow
(echo freeze > /sys/power/state).
Intel Platform Controller Hub (PCH) asserts SLP_S0 signal when it
detects favorable conditions to enter its low power mode. As a
pre-requisite the SoC should be in deepest possible Package C-State
and devices should be in low power mode. For example, on Skylake SoC
the deepest Package C-State is Package C10 or PC10. Suspend to idle
flow generally leads to PC10 state but PC10 state may not be sufficient
for realizing the platform wide power potential which SLP_S0 signal
assertion can provide.
SLP_S0 signal is often connected to the Embedded Controller (EC) and the
Power Management IC (PMIC) for other platform power management related
optimizations.
In general, SLP_S0 assertion == PC10 + PCH low power mode + ModPhy Lanes
power gated + PLL Idle.
As part of this driver, a mechanism to read the SLP_S0_RESIDENCY is exposed
as an API and also debugfs features are added to indicate SLP_S0 signal
assertion residency in microseconds.
echo freeze > /sys/power/state
wake the system
cat /sys/kernel/debug/pmc_core/slp_s0_residency_usec
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@intel.com>
Signed-off-by: Vishwanath Somayaji <vishwanath.somayaji@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Darren Hart <dvhart@linux.intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Pull kbuild updates from Michal Marek:
- new option CONFIG_TRIM_UNUSED_KSYMS which does a two-pass build and
unexports symbols which are not used in the current config [Nicolas
Pitre]
- several kbuild rule cleanups [Masahiro Yamada]
- warning option adjustments for gcov etc [Arnd Bergmann]
- a few more small fixes
* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: (31 commits)
kbuild: move -Wunused-const-variable to W=1 warning level
kbuild: fix if_change and friends to consider argument order
kbuild: fix adjust_autoksyms.sh for modules that need only one symbol
kbuild: fix ksym_dep_filter when multiple EXPORT_SYMBOL() on the same line
gcov: disable -Wmaybe-uninitialized warning
gcov: disable tree-loop-im to reduce stack usage
gcov: disable for COMPILE_TEST
Kbuild: disable 'maybe-uninitialized' warning for CONFIG_PROFILE_ALL_BRANCHES
Kbuild: change CC_OPTIMIZE_FOR_SIZE definition
kbuild: forbid kernel directory to contain spaces and colons
kbuild: adjust ksym_dep_filter for some cmd_* renames
kbuild: Fix dependencies for final vmlinux link
kbuild: better abstract vmlinux sequential prerequisites
kbuild: fix call to adjust_autoksyms.sh when output directory specified
kbuild: Get rid of KBUILD_STR
kbuild: rename cmd_as_s_S to cmd_cpp_s_S
kbuild: rename cmd_cc_i_c to cmd_cpp_i_c
kbuild: drop redundant "PHONY += FORCE"
kbuild: delete unnecessary "@:"
kbuild: mark help target as PHONY
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Misc fixes: EFI, entry code, pkeys and MPX fixes, TASK_SIZE cleanups
and a tsc frequency table fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Switch from TASK_SIZE to TASK_SIZE_MAX in the page fault code
x86/fsgsbase/64: Use TASK_SIZE_MAX for FSBASE/GSBASE upper limits
x86/mm/mpx: Work around MPX erratum SKD046
x86/entry/64: Fix stack return address retrieval in thunk
x86/efi: Fix 7-parameter efi_call()s
x86/cpufeature, x86/mm/pkeys: Fix broken compile-time disabling of pkeys
x86/tsc: Add missing Cherrytrail frequency to the table
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
"Mostly tooling and PMU driver fixes, but also a number of late updates
such as the reworking of the call-chain size limiting logic to make
call-graph recording more robust, plus tooling side changes for the
new 'backwards ring-buffer' extension to the perf ring-buffer"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits)
perf record: Read from backward ring buffer
perf record: Rename variable to make code clear
perf record: Prevent reading invalid data in record__mmap_read
perf evlist: Add API to pause/resume
perf trace: Use the ptr->name beautifier as default for "filename" args
perf trace: Use the fd->name beautifier as default for "fd" args
perf report: Add srcline_from/to branch sort keys
perf evsel: Record fd into perf_mmap
perf evsel: Add overwrite attribute and check write_backward
perf tools: Set buildid dir under symfs when --symfs is provided
perf trace: Only auto set call-graph to "dwarf" when syscalls are being traced
perf annotate: Sort list of recognised instructions
perf annotate: Fix identification of ARM blt and bls instructions
perf tools: Fix usage of max_stack sysctl
perf callchain: Stop validating callchains by the max_stack sysctl
perf trace: Fix exit_group() formatting
perf top: Use machine->kptr_restrict_warned
perf trace: Warn when trying to resolve kernel addresses with kptr_restrict=1
perf machine: Do not bail out if not managing to read ref reloc symbol
perf/x86/intel/p4: Trival indentation fix, remove space
...
|
|
The function to update APICv on/off state (in particular, to deactivate
it when enabling Hyper-V SynIC) is incomplete: it doesn't adjust
APICv-related fields among secondary processor-based VM-execution
controls. As a result, Windows 2012 guests get stuck when SynIC-based
auto-EOI interrupt intersected with e.g. an IPI in the guest.
In addition, the MSR intercept bitmap isn't updated every time "virtualize
x2APIC mode" is toggled. This path can only be triggered by a malicious
guest, because Windows didn't use x2APIC but rather their own synthetic
APIC access MSRs; however a guest running in a SynIC-enabled VM could
switch to x2APIC and thus obtain direct access to host APIC MSRs
(CVE-2016-4440).
The patch fixes those omissions.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Reported-by: Steve Rutherford <srutherford@google.com>
Reported-by: Yang Zhang <yang.zhang.wz@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
On rapl cleanup path, kfree() is given by mistake the address of the
pointer of the structure to free (rapl_pmus->pmus + i). Pass the pointer
instead (rapl_pmus->pmus[i]).
Fixes: 9de8d686955b "perf/x86/intel/rapl: Convert it to a per package facility"
Signed-off-by: Vincent Stehlé <vincent.stehle@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1464101629-14905-1-git-send-email-vincent.stehle@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen bug fixes from David Vrabel.
* tag 'for-linus-4.7-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen: use same main loop for counting and remapping pages
xen/events: Don't move disabled irqs
xen/x86: actually allocate legacy interrupts on PV guests
Xen: don't warn about 2-byte wchar_t in efi
xen/gntdev: reduce copy batch size to 16
xen/x86: don't lose event interrupts
|
|
Instead of having two functions for cycling through the E820 map in
order to count to be remapped pages and remap them later, just use one
function with a caller supplied sub-function called for each region to
be processed. This eliminates the possibility of a mismatch between
both loops which showed up in certain configurations.
Suggested-by: Ed Swierk <eswierk@skyportsystems.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
b4ff8389ed14 is incomplete: relies on nr_legacy_irqs() to get the number
of legacy interrupts when actually nr_legacy_irqs() returns 0 after
probe_8259A(). Use NR_IRQS_LEGACY instead.
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
CC: stable@vger.kernel.org
|
|
On slow platforms with unreliable TSC, such as QEMU emulated machines,
it is possible for the kernel to request the next event in the past. In
that case, in the current implementation of xen_vcpuop_clockevent, we
simply return -ETIME. To be precise the Xen returns -ETIME and we pass
it on. However the result of this is a missed event, which simply causes
the kernel to hang.
Instead it is better to always ask the hypervisor for a timer event,
even if the timeout is in the past. That way there are no lost
interrupts and the kernel survives. To do that, remove the
VCPU_SSHOTTMR_future flag.
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Juergen Gross <jgross@suse.com>
|
|
Useful when tracing nested setups where the guest may trigger more than
the host usually does. But even some typical host exits were missing.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
These were supposed to be a bitwise operation but there is a typo.
The result is mostly harmless, but sparse correctly complains.
Fixes: 44a95dae1d22 ('KVM: x86: Detect and Initialize AVIC support')
Fixes: 18f40c53e10f ('svm: Add VMEXIT handlers for AVIC')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
most architectures are relying on mmap_sem for write in their
arch_setup_additional_pages. If the waiting task gets killed by the oom
killer it would block oom_reaper from asynchronous address space reclaim
and reduce the chances of timely OOM resolving. Wait for the lock in
the killable mode and return with EINTR if the task got killed while
waiting.
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Andy Lutomirski <luto@amacapital.net> [x86 vdso]
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
vm_brk is allowed to fail but load_aout_binary simply ignores the error
and happily continues. I haven't noticed any problem from that in real
life but later patches will make the failure more likely because vm_brk
will become killable (resp. mmap_sem for write waiting will become
killable) so we should be more careful now.
The error handling should be quite straightforward because there are
calls to vm_mmap which check the error properly already. The only
notable exception is set_brk which is called after beyond_if label. But
nothing indicates that we cannot move it above set_binfmt as the two do
not depend on each other and fail before we do set_binfmt and alter
reference counting.
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This option was replaced by PAGE_COUNTER which is selected by MEMCG.
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|