summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2011-01-12ACPI / PM: Drop acpi_power_nocheckRafael J. Wysocki
Since acpi_bus_set_power() should not use __acpi_bus_get_power() to update the device's device->power.state field before changing its power state (this may cause device->power.state to be inconsistent with the device power resources' reference counters), remove this call from it. In consequence, the acpi_power_nocheck variable is not necessary any more, so it can be dropped along with the DMI table used for setting that variable for HP Pavilion 05. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-12ACPI / PM: Drop acpi_bus_get_power()Rafael J. Wysocki
There are no more users of acpi_bus_get_power(), so it can be dropped. Moreover, it should be dropped, because it modifies the device->power.state field of an ACPI device without updating the reference counters of the device's power resources, which is wrong. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-12Platform / x86: Make fujitsu_laptop use acpi_bus_update_power()Rafael J. Wysocki
Use the new function acpi_bus_update_power(), which is safer than acpi_bus_get_power(), for getting device power state in acpi_fujitsu_add() and acpi_fujitsu_hotkey_add(). Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Reported-and-Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-12ACPI / Fan: Rework the handling of power resourcesRafael J. Wysocki
Use the new function acpi_bus_update_power() for manipulating power resources used by ACPI fan devices, which allows them to be put into the right state during initialization and resume. Consequently, remove the flags.force_power_state field from struct acpi_device, which is not necessary any more. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-12ACPI / PM: Register power resource devices as soon as they are neededRafael J. Wysocki
Depending on the organization of the ACPI namespace, power resource device objects may generally be scanned after the "regular" device objects that they are referred from through _PRn. This, in turn, may cause acpi_bus_get_power_flags() to attempt to access them through acpi_bus_init_power() before they are registered (and initialized by acpi_power_driver). [This is not a theoretical issue, it actually happens for one PnP device on my testbed HP nx6325.] To fix this problem, make acpi_bus_get_power_flags() attempt to register power resource devices as soon as they have been found in the _PRn output for any other devices. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-12ACPI / PM: Register acpi_power_driver earlyRafael J. Wysocki
The ACPI device driver used for handling power resources, acpi_power_driver, creates a struct acpi_power_resource object for each ACPI device representing a power resource. These objects are then used when setting and reading the power states of devices using the corresponding power resources. Unfortunately, acpi_power_driver is registered after acpi_scan_init() that may add devices using the power resources before acpi_power_driver has a chance to create struct acpi_power_resource objects for them (specifically, the power resources may be referred to during the scanning process through acpi_bus_get_power() before they have been initialized). As the first step towards fixing this issue, move the registration of acpi_power_driver into acpi_scan_init() so that power resource devices can be initialized by it as soon as they have been found in the namespace. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-12ACPI / PM: Add function for updating device power state consistentlyRafael J. Wysocki
Add function acpi_bus_update_power() for reading the actual power state of an ACPI device and updating its device->power.state field in such a way that its power resources' reference counters will remain consistent with that field. For this purpose introduce __acpi_bus_set_power() setting the power state of an ACPI device without updating its device->power.state field and make acpi_bus_set_power() and acpi_bus_update_power() use it (acpi_bus_set_power() retains the current behavior for now). Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-12ACPI / PM: Add function for device power state initializationRafael J. Wysocki
Add function acpi_bus_init_power() for getting the initial power state of an ACPI device and reference counting its power resources as appropriate. Make acpi_bus_get_power_flags() use the new function instead of acpi_bus_get_power() that updates device->power.state without reference counting the device's power resources. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-12ACPI / PM: Introduce __acpi_bus_get_power()Rafael J. Wysocki
It sometimes is necessary to get the power state of an ACPI device without updating its device->power.state field, for example to avoid inconsistencies between device->power.state and the reference counters of the device's power resources. For this purpose introduce __acpi_bus_get_power() that will return the given device's power state via a pointer (instead of modifying device->power.state) and make acpi_bus_get_power() use it. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-12ACPI / PM: Introduce function for refcounting device power resourcesRafael J. Wysocki
Introduce function acpi_power_on_resources() that reference counts and possibly turns on ACPI power resources for a given device and a given power state of it. This function will be used for reference counting device power resources during initialization. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-12ACPI / PM: Add functions for manipulating lists of power resourcesRafael J. Wysocki
ACPI device power resources should be reference counted during device initialization, so that their reference counters are always up to date. It is convenient to do that with the help of a function that will reference count and possibly turn on power resources in a given list, so introduce that function, acpi_power_on_list(). For symmetry, introduce acpi_power_off_list() for performing the reverse operation and use the both of them to simplify acpi_power_transition(). Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-12ACPI / PM: Prevent acpi_power_get_inferred_state() from making changesRafael J. Wysocki
acpi_power_get_inferred_state() should not update device->power.state behind the back of its caller, so make it return the state via a pointer instead. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-12KVM: VMX: when entering real mode align segment base to 16 bytesGleb Natapov
VMX checks that base is equal segment shifted 4 bits left. Otherwise guest entry fails. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12KVM: MMU: handle 'map_writable' in set_spte() functionXiao Guangrong
Move the operation of 'writable' to set_spte() to clean up code [avi: remove unneeded booleanification] Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12KVM: MMU: audit: allow audit more guests at the same timeXiao Guangrong
It only allows to audit one guest in the system since: - 'audit_point' is a glob variable - mmu_audit_disable() is called in kvm_mmu_destroy(), so audit is disabled after a guest exited this patch fix those issues then allow to audit more guests at the same time Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12KVM: Fetch guest cr3 from hardware on demandAvi Kivity
Instead of syncing the guest cr3 every exit, which is expensince on vmx with ept enabled, sync it only on demand. [sheng: fix incorrect cr3 seen by Windows XP] Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12KVM: Replace reads of vcpu->arch.cr3 by an accessorAvi Kivity
This allows us to keep cr3 in the VMCS, later on. Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12KVM: MMU: only write protect mappings at pagetable levelMarcelo Tosatti
If a pagetable contains a writeable large spte, all of its sptes will be write protected, including non-leaf ones, leading to endless pagefaults. Do not write protect pages above PT_PAGE_TABLE_LEVEL, as the spte fault paths assume non-leaf sptes are writable. Tested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12KVM: VMX: Correct asm constraint in vmcs_load()/vmcs_clear()Avi Kivity
'error' is byte sized, so use a byte register constraint. Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12KVM: MMU: Initialize base_role for tdp mmusAvi Kivity
Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12KVM: VMX: Optimize atomic EFER loadAvi Kivity
When NX is enabled on the host but not on the guest, we use the entry/exit msr load facility, which is slow. Optimize it to use entry/exit efer load, which is ~1200 cycles faster. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12KVM: VMX: Add definitions for more vm entry/exit control bitsAvi Kivity
Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12KVM: SVM: copy instruction bytes from VMCBAndre Przywara
In case of a nested page fault or an intercepted #PF newer SVM implementations provide a copy of the faulting instruction bytes in the VMCB. Use these bytes to feed the instruction emulator and avoid the costly guest instruction fetch in this case. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12KVM: SVM: implement enhanced INVLPG interceptAndre Przywara
When the DecodeAssist feature is available, the linear address is provided in the VMCB on INVLPG intercepts. Use it directly to avoid any decoding and emulation. This is only useful for shadow paging, though. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12KVM: SVM: enhance mov DR intercept handlerAndre Przywara
Newer SVM implementations provide the GPR number in the VMCB, so that the emulation path is no longer necesarry to handle debug register access intercepts. Implement the handling in svm.c and use it when the info is provided. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12KVM: SVM: enhance MOV CR intercept handlerAndre Przywara
Newer SVM implementations provide the GPR number in the VMCB, so that the emulation path is no longer necesarry to handle CR register access intercepts. Implement the handling in svm.c and use it when the info is provided. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12KVM: SVM: add new SVM feature bit namesAndre Przywara
the recent APM Vol.2 and the recent AMD CPUID specification describe new CPUID features bits for SVM. Name them here for later usage. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12KVM: cleanup emulate_instructionAndre Przywara
emulate_instruction had many callers, but only one used all parameters. One parameter was unused, another one is now hidden by a wrapper function (required for a future addition anyway), so most callers use now a shorter parameter list. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12KVM: move complete_insn_gp() into x86.cAndre Przywara
move the complete_insn_gp() helper function out of the VMX part into the generic x86 part to make it usable by SVM. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12KVM: x86: fix CR8 handlingAndre Przywara
The handling of CR8 writes in KVM is currently somewhat cumbersome. This patch makes it look like the other CR register handlers and fixes a possible issue in VMX, where the RIP would be incremented despite an injected #GP. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12KVM guest: Fix kvm clock initialization when it's configured outAvi Kivity
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12KVM: Take missing slots_lock for kvm_io_bus_unregister_dev()Takuya Yoshikawa
In KVM_CREATE_IRQCHIP, kvm_io_bus_unregister_dev() is called without taking slots_lock in the error handling path. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12KVM: return true when user space query KVM_CAP_USER_NMI extensionLai Jiangshan
userspace may check this extension in runtime. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12KVM: Correct kvm_pio tracepoint count fieldAvi Kivity
Currently, we record '1' for count regardless of the real count. Fix. Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12KVM: MMU: Fix incorrect direct page write protection due to ro host pageAvi Kivity
If KVM sees a read-only host page, it will map it as read-only to prevent breaking a COW. However, if the page was part of a large guest page, KVM incorrectly extends the write protection to the entire large page frame instead of limiting it to the normal host page. This results in the instantiation of a new shadow page with read-only access. If this happens for a MOVS instruction that moves memory between two normal pages, within a single large page frame, and mapped within the guest as a large page, and if, in addition, the source operand is not writeable in the host (perhaps due to KSM), then KVM will instantiate a read-only direct shadow page, instantiate an spte for the source operand, then instantiate a new read/write direct shadow page and instantiate an spte for the destination operand. Since these two sptes are in different shadow pages, MOVS will never see them at the same time and the guest will not make progress. Fix by mapping the direct shadow page read/write, and only marking the host page read-only. Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12KVM: Fix build error on s390 due to missing tlbs_dirtyAvi Kivity
Make it available for all archs. Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12KVM: SVM: Add xsetbv interceptJoerg Roedel
This patch implements the xsetbv intercept to the AMD part of KVM. This makes AVX usable in a save way for the guest on AVX capable AMD hardware. The patch is tested by using AVX in the guest and host in parallel and checking for data corruption. I also used the KVM xsave unit-tests and they all pass. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12KVM: MMU: Make the way of accessing lpage_info more genericTakuya Yoshikawa
Large page information has two elements but one of them, write_count, alone is accessed by a helper function. This patch replaces this helper function with more generic one which returns newly named kvm_lpage_info structure and use it to access the other element rmap_pde. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12KVM: VMX: add module parameter to avoid trapping HLT instructions (v5)Anthony Liguori
In certain use-cases, we want to allocate guests fixed time slices where idle guest cycles leave the machine idling. There are many approaches to achieve this but the most direct is to simply avoid trapping the HLT instruction which lets the guest directly execute the instruction putting the processor to sleep. Introduce this as a module-level option for kvm-vmx.ko since if you do this for one guest, you probably want to do it for all. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12KVM: SVM: Implement Flush-By-Asid featureJoerg Roedel
This patch adds the new flush-by-asid of upcoming AMD processors to the KVM-AMD module. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12KVM: SVM: Use svm_flush_tlb instead of force_new_asidJoerg Roedel
This patch replaces all calls to force_new_asid which are intended to flush the guest-tlb by the more appropriate function svm_flush_tlb. As a side-effect the force_new_asid function is removed. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12KVM: SVM: Remove flush_guest_tlb functionJoerg Roedel
This function is unused and there is svm_flush_tlb which does the same. So this function can be removed. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12KVM: MMU: retry #PF for softmmuXiao Guangrong
Retry #PF for softmmu only when the current vcpu has the same cr3 as the time when #PF occurs Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12KVM: MMU: fix accessed bit set on prefault pathXiao Guangrong
Retry #PF is the speculative path, so don't set the accessed bit Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12KVM: MMU: rename 'no_apf' to 'prefault'Xiao Guangrong
It's the speculative path if 'no_apf = 1' and we will specially handle this speculative path in the later patch, so 'prefault' is better to fit the sense. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12KVM: SVM: Add clean-bit for LBR stateJoerg Roedel
This patch implements the clean-bit for all LBR related state. This includes the debugctl, br_from, br_to, last_excp_from, and last_excp_to msrs. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12KVM: SVM: Add clean-bit for CR2 registerJoerg Roedel
This patch implements the clean-bit for the cr2 register in the vmcb. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12KVM: SVM: Add clean-bit for Segements and CPLJoerg Roedel
This patch implements the clean-bit defined for the cs, ds, ss, an es segemnts and the current cpl saved in the vmcb. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12KVM: SVM: Add clean-bit for GDT and IDTJoerg Roedel
This patch implements the clean-bit for the base and limit of the gdt and idt in the vmcb. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12KVM: SVM: Add clean-bit for DR6 and DR7Joerg Roedel
This patch implements the clean-bit for the dr6 and dr7 debug registers in the vmcb. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>