summaryrefslogtreecommitdiff
path: root/arch/x86/kvm
AgeCommit message (Collapse)Author
2016-03-03KVM: MMU: introduce kvm_mmu_slot_gfn_write_protectXiao Guangrong
Split rmap_write_protect() and introduce the function to abstract the write protection based on the slot This function will be used in the later patch Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-03KVM: MMU: introduce kvm_mmu_gfn_{allow,disallow}_lpageXiao Guangrong
Abstract the common operations from account_shadowed() and unaccount_shadowed(), then introduce kvm_mmu_gfn_disallow_lpage() and kvm_mmu_gfn_allow_lpage() These two functions will be used by page tracking in the later patch Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-03KVM: MMU: rename has_wrprotected_page to mmu_gfn_lpage_is_disallowedXiao Guangrong
kvm_lpage_info->write_count is used to detect if the large page mapping for the gfn on the specified level is allowed, rename it to disallow_lpage to reflect its purpose, also we rename has_wrprotected_page() to mmu_gfn_lpage_is_disallowed() to make the code more clearer Later we will extend this mechanism for page tracking: if the gfn is tracked then large mapping for that gfn on any level is not allowed. The new name is more straightforward Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-03kvm: x86: Check dest_map->vector to match eoi signals for rtcJoerg Roedel
Using the vector stored at interrupt delivery makes the eoi matching safe agains irq migration in the ioapic. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-03kvm: x86: Track irq vectors in ioapic->rtc_status.dest_mapJoerg Roedel
This allows backtracking later in case the rtc irq has been moved to another vcpu/vector. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-03kvm: x86: Convert ioapic->rtc_status.dest_map to a structJoerg Roedel
Currently this is a bitmap which tracks which CPUs we expect an EOI from. Move this bitmap to a struct so that we can track additional information there. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-02kvm: x86: Update tsc multiplier on change.Owen Hofmann
vmx.c writes the TSC_MULTIPLIER field in vmx_vcpu_load, but only when a vcpu has migrated physical cpus. Record the last value written and update in vmx_vcpu_load on any change, otherwise a cpu migration must occur for TSC frequency scaling to take effect. Cc: stable@vger.kernel.org Fixes: ff2c3a1803775cc72dc6f624b59554956396b0ee Signed-off-by: Owen Hofmann <osh@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-29Merge branch 'sched/urgent' into sched/core, to pick up fixes before ↵Ingo Molnar
applying new changes Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-26KVM: x86: fix root cause for missed hardware breakpointsPaolo Bonzini
Commit 172b2386ed16 ("KVM: x86: fix missed hardware breakpoints", 2016-02-10) worked around a case where the debug registers are not loaded correctly on preemption and on the first entry to KVM_RUN. However, Xiao Guangrong pointed out that the root cause must be that KVM_DEBUGREG_BP_ENABLED is not being set correctly. This can indeed happen due to the lazy debug exit mechanism, which does not call kvm_update_dr7. Fix it by replacing the existing loop (more or less equivalent to kvm_update_dr0123) with calls to all the kvm_update_dr* functions. Cc: stable@vger.kernel.org # 4.1+ Fixes: 172b2386ed16a9143d9a456aae5ec87275c61489 Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-25KVM: Use simple waitqueue for vcpu->wqMarcelo Tosatti
The problem: On -rt, an emulated LAPIC timer instances has the following path: 1) hard interrupt 2) ksoftirqd is scheduled 3) ksoftirqd wakes up vcpu thread 4) vcpu thread is scheduled This extra context switch introduces unnecessary latency in the LAPIC path for a KVM guest. The solution: Allow waking up vcpu thread from hardirq context, thus avoiding the need for ksoftirqd to be scheduled. Normal waitqueues make use of spinlocks, which on -RT are sleepable locks. Therefore, waking up a waitqueue waiter involves locking a sleeping lock, which is not allowed from hard interrupt context. cyclictest command line: This patch reduces the average latency in my tests from 14us to 11us. Daniel writes: Paolo asked for numbers from kvm-unit-tests/tscdeadline_latency benchmark on mainline. The test was run 1000 times on tip/sched/core 4.4.0-rc8-01134-g0905f04: ./x86-run x86/tscdeadline_latency.flat -cpu host with idle=poll. The test seems not to deliver really stable numbers though most of them are smaller. Paolo write: "Anything above ~10000 cycles means that the host went to C1 or lower---the number means more or less nothing in that case. The mean shows an improvement indeed." Before: min max mean std count 1000.000000 1000.000000 1000.000000 1000.000000 mean 5162.596000 2019270.084000 5824.491541 20681.645558 std 75.431231 622607.723969 89.575700 6492.272062 min 4466.000000 23928.000000 5537.926500 585.864966 25% 5163.000000 1613252.750000 5790.132275 16683.745433 50% 5175.000000 2281919.000000 5834.654000 23151.990026 75% 5190.000000 2382865.750000 5861.412950 24148.206168 max 5228.000000 4175158.000000 6254.827300 46481.048691 After min max mean std count 1000.000000 1000.00000 1000.000000 1000.000000 mean 5143.511000 2076886.10300 5813.312474 21207.357565 std 77.668322 610413.09583 86.541500 6331.915127 min 4427.000000 25103.00000 5529.756600 559.187707 25% 5148.000000 1691272.75000 5784.889825 17473.518244 50% 5160.000000 2308328.50000 5832.025000 23464.837068 75% 5172.000000 2393037.75000 5853.177675 24223.969976 max 5222.000000 3922458.00000 6186.720500 42520.379830 [Patch was originaly based on the swait implementation found in the -rt tree. Daniel ported it to mainline's version and gathered the benchmark numbers for tscdeadline_latency test.] Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: linux-rt-users@vger.kernel.org Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1455871601-27484-4-git-send-email-wagi@monom.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25KVM: x86: MMU: fix ubsan index-out-of-range warningMike Krinkin
Ubsan reports the following warning due to a typo in update_accessed_dirty_bits template, the patch fixes the typo: [ 168.791851] ================================================================================ [ 168.791862] UBSAN: Undefined behaviour in arch/x86/kvm/paging_tmpl.h:252:15 [ 168.791866] index 4 is out of range for type 'u64 [4]' [ 168.791871] CPU: 0 PID: 2950 Comm: qemu-system-x86 Tainted: G O L 4.5.0-rc5-next-20160222 #7 [ 168.791873] Hardware name: LENOVO 23205NG/23205NG, BIOS G2ET95WW (2.55 ) 07/09/2013 [ 168.791876] 0000000000000000 ffff8801cfcaf208 ffffffff81c9f780 0000000041b58ab3 [ 168.791882] ffffffff82eb2cc1 ffffffff81c9f6b4 ffff8801cfcaf230 ffff8801cfcaf1e0 [ 168.791886] 0000000000000004 0000000000000001 0000000000000000 ffffffffa1981600 [ 168.791891] Call Trace: [ 168.791899] [<ffffffff81c9f780>] dump_stack+0xcc/0x12c [ 168.791904] [<ffffffff81c9f6b4>] ? _atomic_dec_and_lock+0xc4/0xc4 [ 168.791910] [<ffffffff81da9e81>] ubsan_epilogue+0xd/0x8a [ 168.791914] [<ffffffff81daafa2>] __ubsan_handle_out_of_bounds+0x15c/0x1a3 [ 168.791918] [<ffffffff81daae46>] ? __ubsan_handle_shift_out_of_bounds+0x2bd/0x2bd [ 168.791922] [<ffffffff811287ef>] ? get_user_pages_fast+0x2bf/0x360 [ 168.791954] [<ffffffffa1794050>] ? kvm_largepages_enabled+0x30/0x30 [kvm] [ 168.791958] [<ffffffff81128530>] ? __get_user_pages_fast+0x360/0x360 [ 168.791987] [<ffffffffa181b818>] paging64_walk_addr_generic+0x1b28/0x2600 [kvm] [ 168.792014] [<ffffffffa1819cf0>] ? init_kvm_mmu+0x1100/0x1100 [kvm] [ 168.792019] [<ffffffff8129e350>] ? debug_check_no_locks_freed+0x350/0x350 [ 168.792044] [<ffffffffa1819cf0>] ? init_kvm_mmu+0x1100/0x1100 [kvm] [ 168.792076] [<ffffffffa181c36d>] paging64_gva_to_gpa+0x7d/0x110 [kvm] [ 168.792121] [<ffffffffa181c2f0>] ? paging64_walk_addr_generic+0x2600/0x2600 [kvm] [ 168.792130] [<ffffffff812e848b>] ? debug_lockdep_rcu_enabled+0x7b/0x90 [ 168.792178] [<ffffffffa17d9a4a>] emulator_read_write_onepage+0x27a/0x1150 [kvm] [ 168.792208] [<ffffffffa1794d44>] ? __kvm_read_guest_page+0x54/0x70 [kvm] [ 168.792234] [<ffffffffa17d97d0>] ? kvm_task_switch+0x160/0x160 [kvm] [ 168.792238] [<ffffffff812e848b>] ? debug_lockdep_rcu_enabled+0x7b/0x90 [ 168.792263] [<ffffffffa17daa07>] emulator_read_write+0xe7/0x6d0 [kvm] [ 168.792290] [<ffffffffa183b620>] ? em_cr_write+0x230/0x230 [kvm] [ 168.792314] [<ffffffffa17db005>] emulator_write_emulated+0x15/0x20 [kvm] [ 168.792340] [<ffffffffa18465f8>] segmented_write+0xf8/0x130 [kvm] [ 168.792367] [<ffffffffa1846500>] ? em_lgdt+0x20/0x20 [kvm] [ 168.792374] [<ffffffffa14db512>] ? vmx_read_guest_seg_ar+0x42/0x1e0 [kvm_intel] [ 168.792400] [<ffffffffa1846d82>] writeback+0x3f2/0x700 [kvm] [ 168.792424] [<ffffffffa1846990>] ? em_sidt+0xa0/0xa0 [kvm] [ 168.792449] [<ffffffffa185554d>] ? x86_decode_insn+0x1b3d/0x4f70 [kvm] [ 168.792474] [<ffffffffa1859032>] x86_emulate_insn+0x572/0x3010 [kvm] [ 168.792499] [<ffffffffa17e71dd>] x86_emulate_instruction+0x3bd/0x2110 [kvm] [ 168.792524] [<ffffffffa17e6e20>] ? reexecute_instruction.part.110+0x2e0/0x2e0 [kvm] [ 168.792532] [<ffffffffa14e9a81>] handle_ept_misconfig+0x61/0x460 [kvm_intel] [ 168.792539] [<ffffffffa14e9a20>] ? handle_pause+0x450/0x450 [kvm_intel] [ 168.792546] [<ffffffffa15130ea>] vmx_handle_exit+0xd6a/0x1ad0 [kvm_intel] [ 168.792572] [<ffffffffa17f6a6c>] ? kvm_arch_vcpu_ioctl_run+0xbdc/0x6090 [kvm] [ 168.792597] [<ffffffffa17f6bcd>] kvm_arch_vcpu_ioctl_run+0xd3d/0x6090 [kvm] [ 168.792621] [<ffffffffa17f6a6c>] ? kvm_arch_vcpu_ioctl_run+0xbdc/0x6090 [kvm] [ 168.792627] [<ffffffff8293b530>] ? __ww_mutex_lock_interruptible+0x1630/0x1630 [ 168.792651] [<ffffffffa17f5e90>] ? kvm_arch_vcpu_runnable+0x4f0/0x4f0 [kvm] [ 168.792656] [<ffffffff811eeb30>] ? preempt_notifier_unregister+0x190/0x190 [ 168.792681] [<ffffffffa17e0447>] ? kvm_arch_vcpu_load+0x127/0x650 [kvm] [ 168.792704] [<ffffffffa178e9a3>] kvm_vcpu_ioctl+0x553/0xda0 [kvm] [ 168.792727] [<ffffffffa178e450>] ? vcpu_put+0x40/0x40 [kvm] [ 168.792732] [<ffffffff8129e350>] ? debug_check_no_locks_freed+0x350/0x350 [ 168.792735] [<ffffffff82946087>] ? _raw_spin_unlock+0x27/0x40 [ 168.792740] [<ffffffff8163a943>] ? handle_mm_fault+0x1673/0x2e40 [ 168.792744] [<ffffffff8129daa8>] ? trace_hardirqs_on_caller+0x478/0x6c0 [ 168.792747] [<ffffffff8129dcfd>] ? trace_hardirqs_on+0xd/0x10 [ 168.792751] [<ffffffff812e848b>] ? debug_lockdep_rcu_enabled+0x7b/0x90 [ 168.792756] [<ffffffff81725a80>] do_vfs_ioctl+0x1b0/0x12b0 [ 168.792759] [<ffffffff817258d0>] ? ioctl_preallocate+0x210/0x210 [ 168.792763] [<ffffffff8174aef3>] ? __fget+0x273/0x4a0 [ 168.792766] [<ffffffff8174acd0>] ? __fget+0x50/0x4a0 [ 168.792770] [<ffffffff8174b1f6>] ? __fget_light+0x96/0x2b0 [ 168.792773] [<ffffffff81726bf9>] SyS_ioctl+0x79/0x90 [ 168.792777] [<ffffffff82946880>] entry_SYSCALL_64_fastpath+0x23/0xc1 [ 168.792780] ================================================================================ Signed-off-by: Mike Krinkin <krinkin.m.u@gmail.com> Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-24KVM: x86: fix conversion of addresses to linear in 32-bit protected modePaolo Bonzini
Commit e8dd2d2d641c ("Silence compiler warning in arch/x86/kvm/emulate.c", 2015-09-06) broke boot of the Hurd. The bug is that the "default:" case actually could modify "la", but after the patch this change is not reflected in *linear. The bug is visible whenever a non-zero segment base causes the linear address to wrap around the 4GB mark. Fixes: e8dd2d2d641cb2724ee10e76c0ad02e04289c017 Cc: stable@vger.kernel.org Reported-by: Aurelien Jarno <aurelien@aurel32.net> Tested-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-24KVM: x86: fix missed hardware breakpointsPaolo Bonzini
Sometimes when setting a breakpoint a process doesn't stop on it. This is because the debug registers are not loaded correctly on VCPU load. The following simple reproducer from Oleg Nesterov tries using debug registers in two threads. To see the bug, run a 2-VCPU guest with "taskset -c 0" and run "./bp 0 1" inside the guest. #include <unistd.h> #include <signal.h> #include <stdlib.h> #include <stdio.h> #include <sys/wait.h> #include <sys/ptrace.h> #include <sys/user.h> #include <asm/debugreg.h> #include <assert.h> #define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER) unsigned long encode_dr7(int drnum, int enable, unsigned int type, unsigned int len) { unsigned long dr7; dr7 = ((len | type) & 0xf) << (DR_CONTROL_SHIFT + drnum * DR_CONTROL_SIZE); if (enable) dr7 |= (DR_GLOBAL_ENABLE << (drnum * DR_ENABLE_SIZE)); return dr7; } int write_dr(int pid, int dr, unsigned long val) { return ptrace(PTRACE_POKEUSER, pid, offsetof (struct user, u_debugreg[dr]), val); } void set_bp(pid_t pid, void *addr) { unsigned long dr7; assert(write_dr(pid, 0, (long)addr) == 0); dr7 = encode_dr7(0, 1, DR_RW_EXECUTE, DR_LEN_1); assert(write_dr(pid, 7, dr7) == 0); } void *get_rip(int pid) { return (void*)ptrace(PTRACE_PEEKUSER, pid, offsetof(struct user, regs.rip), 0); } void test(int nr) { void *bp_addr = &&label + nr, *bp_hit; int pid; printf("test bp %d\n", nr); assert(nr < 16); // see 16 asm nops below pid = fork(); if (!pid) { assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0); kill(getpid(), SIGSTOP); for (;;) { label: asm ( "nop; nop; nop; nop;" "nop; nop; nop; nop;" "nop; nop; nop; nop;" "nop; nop; nop; nop;" ); } } assert(pid == wait(NULL)); set_bp(pid, bp_addr); for (;;) { assert(ptrace(PTRACE_CONT, pid, 0, 0) == 0); assert(pid == wait(NULL)); bp_hit = get_rip(pid); if (bp_hit != bp_addr) fprintf(stderr, "ERR!! hit wrong bp %ld != %d\n", bp_hit - &&label, nr); } } int main(int argc, const char *argv[]) { while (--argc) { int nr = atoi(*++argv); if (!fork()) test(nr); } while (wait(NULL) > 0) ; return 0; } Cc: stable@vger.kernel.org Suggested-by: Nadav Amit <namit@cs.technion.ac.il> Reported-by: Andrey Wagin <avagin@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-24x86: Fix misspellings in commentsAdam Buchbinder
Signed-off-by: Adam Buchbinder <adam.buchbinder@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: trivial@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-24x86/kvm: Add output operand in vmx_handle_external_intr inline asmChris J Arges
Stacktool generates the following warning: stacktool: arch/x86/kvm/vmx.o: vmx_handle_external_intr()+0x67: call without frame pointer save/setup By adding the stackpointer as an output operand, this patch ensures that a stack frame is created when CONFIG_FRAME_POINTER is enabled for the inline assmebly statement. Signed-off-by: Chris J Arges <chris.j.arges@canonical.com> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: gleb@kernel.org Cc: kvm@vger.kernel.org Cc: live-patching@vger.kernel.org Cc: pbonzini@redhat.com Link: http://lkml.kernel.org/r/1453499078-9330-3-git-send-email-chris.j.arges@canonical.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-24x86/kvm: Make test_cc() always inlineJosh Poimboeuf
With some configs (including allyesconfig), gcc doesn't inline test_cc(). When that happens, test_cc() doesn't create a stack frame before inserting the inline asm call instruction. This breaks frame pointer convention if CONFIG_FRAME_POINTER is enabled and can result in a bad stack trace. Force it to always be inlined so that its containing function's stack frame can be used. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Chris J Arges <chris.j.arges@canonical.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Marek <mmarek@suse.cz> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Pedro Alves <palves@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm@vger.kernel.org Cc: live-patching@vger.kernel.org Link: http://lkml.kernel.org/r/20160122161612.GE20502@treble.redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-24x86/kvm: Set ELF function type for fastop functionsJosh Poimboeuf
The callable functions created with the FOP* and FASTOP* macros are missing ELF function annotations, which confuses tools like stacktool. Properly annotate them. This adds some additional labels to the assembly, but the generated binary code is unchanged (with the exception of instructions which have embedded references to __LINE__). Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Chris J Arges <chris.j.arges@canonical.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Marek <mmarek@suse.cz> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Pedro Alves <palves@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm@vger.kernel.org Cc: live-patching@vger.kernel.org Link: http://lkml.kernel.org/r/e399651c89ace54906c203c0557f66ed6ea3ce8d.1453405861.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-23KVM: x86: use list_last_entryGeliang Tang
To make the intention clearer, use list_last_entry instead of list_entry. Signed-off-by: Geliang Tang <geliangtang@163.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-23KVM: x86: use list_for_each_entry*Geliang Tang
Use list_for_each_entry*() instead of list_for_each*() to simplify the code. Signed-off-by: Geliang Tang <geliangtang@163.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-23KVM: x86: MMU: Move handle_mmio_page_fault() call to kvm_mmu_page_fault()Takuya Yoshikawa
Rather than placing a handle_mmio_page_fault() call in each vcpu->arch.mmu.page_fault() handler, moving it up to kvm_mmu_page_fault() makes the code better: - avoids code duplication - for kvm_arch_async_page_ready(), which is the other caller of vcpu->arch.mmu.page_fault(), removes an extra error_code check - avoids returning both RET_MMIO_PF_* values and raw integer values from vcpu->arch.mmu.page_fault() Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-23KVM: x86: MMU: Consolidate quickly_check_mmio_pf() and is_mmio_page_fault()Takuya Yoshikawa
These two have only slight differences: - whether 'addr' is of type u64 or of type gva_t - whether they have 'direct' parameter or not Concerning the former, quickly_check_mmio_pf()'s u64 is better because 'addr' needs to be able to have both a guest physical address and a guest virtual address. The latter is just a stylistic issue as we can always calculate the mode from the 'vcpu' as is_mmio_page_fault() does. This patch keeps the parameter to make the following patch cleaner. In addition, the patch renames the function to mmio_info_in_cache() to make it clear what it actually checks for. Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16KVM: x86: pass kvm_get_time_scale arguments in hertzPaolo Bonzini
Prepare for improving the precision in the next patch. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16kvm/x86: Hyper-V VMBus hypercall userspace exitAndrey Smetanin
The patch implements KVM_EXIT_HYPERV userspace exit functionality for Hyper-V VMBus hypercalls: HV_X64_HCALL_POST_MESSAGE, HV_X64_HCALL_SIGNAL_EVENT. Changes v3: * use vcpu->arch.complete_userspace_io to setup hypercall result Changes v2: * use KVM_EXIT_HYPERV for hypercalls Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> CC: Gleb Natapov <gleb@kernel.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Joerg Roedel <joro@8bytes.org> CC: "K. Y. Srinivasan" <kys@microsoft.com> CC: Haiyang Zhang <haiyangz@microsoft.com> CC: Roman Kagan <rkagan@virtuozzo.com> CC: Denis V. Lunev <den@openvz.org> CC: qemu-devel@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16kvm/x86: Reject Hyper-V hypercall continuationAndrey Smetanin
Currently we do not support Hyper-V hypercall continuation so reject it. Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> CC: Gleb Natapov <gleb@kernel.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Joerg Roedel <joro@8bytes.org> CC: "K. Y. Srinivasan" <kys@microsoft.com> CC: Haiyang Zhang <haiyangz@microsoft.com> CC: Roman Kagan <rkagan@virtuozzo.com> CC: Denis V. Lunev <den@openvz.org> CC: qemu-devel@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16kvm/x86: Pass return code of kvm_emulate_hypercallAndrey Smetanin
Pass the return code from kvm_emulate_hypercall on to the caller, in order to allow it to indicate to the userspace that the hypercall has to be handled there. Also adjust all the existing code paths to return 1 to make sure the hypercall isn't passed to the userspace without setting kvm_run appropriately. Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> CC: Gleb Natapov <gleb@kernel.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Joerg Roedel <joro@8bytes.org> CC: "K. Y. Srinivasan" <kys@microsoft.com> CC: Haiyang Zhang <haiyangz@microsoft.com> CC: Roman Kagan <rkagan@virtuozzo.com> CC: Denis V. Lunev <den@openvz.org> CC: qemu-devel@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16kvm/x86: Rename Hyper-V long spin wait hypercallAndrey Smetanin
Rename HV_X64_HV_NOTIFY_LONG_SPIN_WAIT by HVCALL_NOTIFY_LONG_SPIN_WAIT, so the name is more consistent with the other hypercalls. Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> CC: Gleb Natapov <gleb@kernel.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Joerg Roedel <joro@8bytes.org> CC: "K. Y. Srinivasan" <kys@microsoft.com> CC: Haiyang Zhang <haiyangz@microsoft.com> CC: Roman Kagan <rkagan@virtuozzo.com> CC: Denis V. Lunev <den@openvz.org> CC: qemu-devel@nongnu.org [Change name, Andrey used HV_X64_HCALL_NOTIFY_LONG_SPIN_WAIT. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16KVM: x86: fix missed hardware breakpointsPaolo Bonzini
Sometimes when setting a breakpoint a process doesn't stop on it. This is because the debug registers are not loaded correctly on VCPU load. The following simple reproducer from Oleg Nesterov tries using debug registers in both the host and the guest, for example by running "./bp 0 1" on the host and "./bp 14 15" under QEMU. #include <unistd.h> #include <signal.h> #include <stdlib.h> #include <stdio.h> #include <sys/wait.h> #include <sys/ptrace.h> #include <sys/user.h> #include <asm/debugreg.h> #include <assert.h> #define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER) unsigned long encode_dr7(int drnum, int enable, unsigned int type, unsigned int len) { unsigned long dr7; dr7 = ((len | type) & 0xf) << (DR_CONTROL_SHIFT + drnum * DR_CONTROL_SIZE); if (enable) dr7 |= (DR_GLOBAL_ENABLE << (drnum * DR_ENABLE_SIZE)); return dr7; } int write_dr(int pid, int dr, unsigned long val) { return ptrace(PTRACE_POKEUSER, pid, offsetof (struct user, u_debugreg[dr]), val); } void set_bp(pid_t pid, void *addr) { unsigned long dr7; assert(write_dr(pid, 0, (long)addr) == 0); dr7 = encode_dr7(0, 1, DR_RW_EXECUTE, DR_LEN_1); assert(write_dr(pid, 7, dr7) == 0); } void *get_rip(int pid) { return (void*)ptrace(PTRACE_PEEKUSER, pid, offsetof(struct user, regs.rip), 0); } void test(int nr) { void *bp_addr = &&label + nr, *bp_hit; int pid; printf("test bp %d\n", nr); assert(nr < 16); // see 16 asm nops below pid = fork(); if (!pid) { assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0); kill(getpid(), SIGSTOP); for (;;) { label: asm ( "nop; nop; nop; nop;" "nop; nop; nop; nop;" "nop; nop; nop; nop;" "nop; nop; nop; nop;" ); } } assert(pid == wait(NULL)); set_bp(pid, bp_addr); for (;;) { assert(ptrace(PTRACE_CONT, pid, 0, 0) == 0); assert(pid == wait(NULL)); bp_hit = get_rip(pid); if (bp_hit != bp_addr) fprintf(stderr, "ERR!! hit wrong bp %ld != %d\n", bp_hit - &&label, nr); } } int main(int argc, const char *argv[]) { while (--argc) { int nr = atoi(*++argv); if (!fork()) test(nr); } while (wait(NULL) > 0) ; return 0; } Cc: stable@vger.kernel.org Suggested-by: Nadadv Amit <namit@cs.technion.ac.il> Reported-by: Andrey Wagin <avagin@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16KVM: x86: fix *NULL on invalid low-prio irqRadim Krčmář
Smatch noticed a NULL dereference in kvm_intr_is_single_vcpu_fast that happens if VM already warned about invalid lowest-priority interrupt. Create a function for common code while fixing it. Fixes: 6228a0da8057 ("KVM: x86: Add lowest-priority support for vt-d posted-interrupts") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16KVM: x86: rewrite handling of scaled TSC for kvmclockPaolo Bonzini
This is the same as before: kvm_scale_tsc(tgt_tsc_khz) = tgt_tsc_khz * ratio = tgt_tsc_khz * user_tsc_khz / tsc_khz (see set_tsc_khz) = user_tsc_khz (see kvm_guest_time_update) = vcpu->arch.virtual_tsc_khz (see kvm_set_tsc_khz) However, computing it through kvm_scale_tsc will make it possible to include the NTP correction in tgt_tsc_khz. Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16KVM: x86: rename argument to kvm_set_tsc_khzPaolo Bonzini
This refers to the desired (scaled) frequency, which is called user_tsc_khz in the rest of the file. Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16KVM: VMX: Fix guest debugging while in L2Jan Kiszka
When we take a #DB or #BP vmexit while in guest mode, we first of all need to check if there is ongoing guest debugging that might be interested in the event. Currently, we unconditionally leave L2 and inject the event into L1 if it is intercepting the exceptions. That breaks things marvelously. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16KVM: VMX: Factor out is_exception_n helperJan Kiszka
There is quite some common code in all these is_<exception>() helpers. Factor it out before adding even more of them. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09KVM: x86: consolidate different ways to test for in-kernel LAPICPaolo Bonzini
Different pieces of code checked for vcpu->arch.apic being (non-)NULL, or used kvm_vcpu_has_lapic (more optimized) or lapic_in_kernel. Replace everything with lapic_in_kernel's name and kvm_vcpu_has_lapic's implementation. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09KVM: x86: consolidate "has lapic" checks into irq.cPaolo Bonzini
Do for kvm_cpu_has_pending_timer and kvm_inject_pending_timer_irqs what the other irq.c routines have been doing. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09KVM: APIC: remove unnecessary double checks on APIC existencePaolo Bonzini
Usually the in-kernel APIC's existence is checked in the caller. Do not bother checking it again in lapic.c. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09KVM/VMX: Add host irq information in trace event when updating IRTE for ↵Feng Wu
posted interrupts Add host irq information in trace event, so we can better understand which irq is in posted mode. Signed-off-by: Feng Wu <feng.wu@intel.com> Reviewed-by: Radim Krcmar <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09KVM: x86: Add lowest-priority support for vt-d posted-interruptsFeng Wu
Use vector-hashing to deliver lowest-priority interrupts for VT-d posted-interrupts. This patch extends kvm_intr_is_single_vcpu() to support lowest-priority handling. Signed-off-by: Feng Wu <feng.wu@intel.com> Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09KVM: x86: Use vector-hashing to deliver lowest-priority interruptsFeng Wu
Use vector-hashing to deliver lowest-priority interrupts, As an example, modern Intel CPUs in server platform use this method to handle lowest-priority interrupts. Signed-off-by: Feng Wu <feng.wu@intel.com> Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09KVM: Recover IRTE to remapped mode if the interrupt is not single-destinationFeng Wu
When the interrupt is not single destination any more, we need to change back IRTE to remapped mode explicitly. Signed-off-by: Feng Wu <feng.wu@intel.com> Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09KVM: x86: introduce do_shl32_div32Paolo Bonzini
This is similar to the existing div_frac function, but it returns the remainder too. Unlike div_frac, it can be used to implement long division, e.g. (a << 64) / b for 32-bit a and b. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-01-15kvm: rename pfn_t to kvm_pfn_tDan Williams
To date, we have implemented two I/O usage models for persistent memory, PMEM (a persistent "ram disk") and DAX (mmap persistent memory into userspace). This series adds a third, DAX-GUP, that allows DAX mappings to be the target of direct-i/o. It allows userspace to coordinate DMA/RDMA from/to persistent memory. The implementation leverages the ZONE_DEVICE mm-zone that went into 4.3-rc1 (also discussed at kernel summit) to flag pages that are owned and dynamically mapped by a device driver. The pmem driver, after mapping a persistent memory range into the system memmap via devm_memremap_pages(), arranges for DAX to distinguish pfn-only versus page-backed pmem-pfns via flags in the new pfn_t type. The DAX code, upon seeing a PFN_DEV+PFN_MAP flagged pfn, flags the resulting pte(s) inserted into the process page tables with a new _PAGE_DEVMAP flag. Later, when get_user_pages() is walking ptes it keys off _PAGE_DEVMAP to pin the device hosting the page range active. Finally, get_page() and put_page() are modified to take references against the device driver established page mapping. Finally, this need for "struct page" for persistent memory requires memory capacity to store the memmap array. Given the memmap array for a large pool of persistent may exhaust available DRAM introduce a mechanism to allocate the memmap from persistent memory. The new "struct vmem_altmap *" parameter to devm_memremap_pages() enables arch_add_memory() to use reserved pmem capacity rather than the page allocator. This patch (of 18): The core has developed a need for a "pfn_t" type [1]. Move the existing pfn_t in KVM to kvm_pfn_t [2]. [1]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002199.html [2]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002218.html Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-12Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "PPC changes will come next week. - s390: Support for runtime instrumentation within guests, support of 248 VCPUs. - ARM: rewrite of the arm64 world switch in C, support for 16-bit VM identifiers. Performance counter virtualization missed the boat. - x86: Support for more Hyper-V features (synthetic interrupt controller), MMU cleanups" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (115 commits) kvm: x86: Fix vmwrite to SECONDARY_VM_EXEC_CONTROL kvm/x86: Hyper-V SynIC timers tracepoints kvm/x86: Hyper-V SynIC tracepoints kvm/x86: Update SynIC timers on guest entry only kvm/x86: Skip SynIC vector check for QEMU side kvm/x86: Hyper-V fix SynIC timer disabling condition kvm/x86: Reorg stimer_expiration() to better control timer restart kvm/x86: Hyper-V unify stimer_start() and stimer_restart() kvm/x86: Drop stimer_stop() function kvm/x86: Hyper-V timers fix incorrect logical operation KVM: move architecture-dependent requests to arch/ KVM: renumber vcpu->request bits KVM: document which architecture uses each request bit KVM: Remove unused KVM_REQ_KICK to save a bit in vcpu->requests kvm: x86: Check kvm_write_guest return value in kvm_write_wall_clock KVM: s390: implement the RI support of guest kvm/s390: drop unpaired smp_mb kvm: x86: fix comment about {mmu,nested_mmu}.gva_to_gpa KVM: x86: MMU: Use clear_page() instead of init_shadow_page_table() arm/arm64: KVM: Detect vGIC presence at runtime ...
2016-01-12kvm: x86: Fix vmwrite to SECONDARY_VM_EXEC_CONTROLHuaitong Han
vmx_cpuid_tries to update SECONDARY_VM_EXEC_CONTROL in the VMCS, but it will cause a vmwrite error on older CPUs because the code does not check for the presence of CPU_BASED_ACTIVATE_SECONDARY_CONTROLS. This will get rid of the following trace on e.g. Core2 6600: vmwrite error: reg 401e value 10 (err 12) Call Trace: [<ffffffff8116e2b9>] dump_stack+0x40/0x57 [<ffffffffa020b88d>] vmx_cpuid_update+0x5d/0x150 [kvm_intel] [<ffffffffa01d8fdc>] kvm_vcpu_ioctl_set_cpuid2+0x4c/0x70 [kvm] [<ffffffffa01b8363>] kvm_arch_vcpu_ioctl+0x903/0xfa0 [kvm] Fixes: feda805fe7c4ed9cf78158e73b1218752e3b4314 Cc: stable@vger.kernel.org Reported-by: Zdenek Kaspar <zkaspar82@gmail.com> Signed-off-by: Huaitong Han <huaitong.han@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-01-11Merge branch 'x86-cpu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cpu updates from Ingo Molnar: "The main changes in this cycle were: - Improved CPU ID handling code and related enhancements (Borislav Petkov) - RDRAND fix (Len Brown)" * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Replace RDRAND forced-reseed with simple sanity check x86/MSR: Chop off lower 32-bit value x86/cpu: Fix MSR value truncation issue x86/cpu/amd, kvm: Satisfy guest kernel reads of IC_CFG MSR kvm: Add accessors for guest CPU's family, model, stepping x86/cpu: Unify CPU family, model, stepping calculation
2016-01-08kvm/x86: Hyper-V SynIC timers tracepointsAndrey Smetanin
Trace the following Hyper SynIC timers events: * periodic timer start * one-shot timer start * timer callback * timer expiration and message delivery result * timer config setup * timer count setup * timer cleanup Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> CC: Gleb Natapov <gleb@kernel.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Roman Kagan <rkagan@virtuozzo.com> CC: Denis V. Lunev <den@openvz.org> CC: qemu-devel@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-01-08kvm/x86: Hyper-V SynIC tracepointsAndrey Smetanin
Trace the following Hyper SynIC events: * set msr * set sint irq * ack sint * sint irq eoi Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> CC: Gleb Natapov <gleb@kernel.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Roman Kagan <rkagan@virtuozzo.com> CC: Denis V. Lunev <den@openvz.org> CC: qemu-devel@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-01-08kvm/x86: Update SynIC timers on guest entry onlyAndrey Smetanin
Consolidate updating the Hyper-V SynIC timers in a single place: on guest entry in processing KVM_REQ_HV_STIMER request. This simplifies the overall logic, and makes sure the most current state of msrs and guest clock is used for arming the timers (to achieve that, KVM_REQ_HV_STIMER has to be processed after KVM_REQ_CLOCK_UPDATE). Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> CC: Gleb Natapov <gleb@kernel.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Roman Kagan <rkagan@virtuozzo.com> CC: Denis V. Lunev <den@openvz.org> CC: qemu-devel@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-01-08kvm/x86: Skip SynIC vector check for QEMU sideAndrey Smetanin
QEMU zero-inits Hyper-V SynIC vectors. We should allow that, and don't reject zero values if set by the host. Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> CC: Gleb Natapov <gleb@kernel.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Roman Kagan <rkagan@virtuozzo.com> CC: Denis V. Lunev <den@openvz.org> CC: qemu-devel@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-01-08kvm/x86: Hyper-V fix SynIC timer disabling conditionAndrey Smetanin
Hypervisor Function Specification(HFS) doesn't require to disable SynIC timer at timer config write if timer->count = 0. So drop this check, this allow to load timers MSR's during migration restore, because config are set before count in QEMU side. Also fix condition according to HFS doc(15.3.1): "It is not permitted to set the SINTx field to zero for an enabled timer. If attempted, the timer will be marked disabled (that is, bit 0 cleared) immediately." Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> CC: Gleb Natapov <gleb@kernel.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Roman Kagan <rkagan@virtuozzo.com> CC: Denis V. Lunev <den@openvz.org> CC: qemu-devel@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-01-08kvm/x86: Reorg stimer_expiration() to better control timer restartAndrey Smetanin
Split stimer_expiration() into two parts - timer expiration message sending and timer restart/cleanup based on timer state(config). This also fixes a bug where a one-shot timer message whose delivery failed once would get lost for good. Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> CC: Gleb Natapov <gleb@kernel.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Roman Kagan <rkagan@virtuozzo.com> CC: Denis V. Lunev <den@openvz.org> CC: qemu-devel@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>