Age Commit message Author
2014-05-30blk-mq: blk_mq_tag_to_rq should handle flush requestShaohua Li
A flush request is special: it borrows the tag from its parent request. Hence blk_mq_tag_to_rq needs special handling to return the flush request for that tag. Signed-off-by: Shaohua Li <shli@fusionio.com> Signed-off-by: Jens Axboe <axboe@fb.com>
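The lookup described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `toy_request`, `toy_queue`, `toy_tag_to_rq` and the `flush_in_flight` flag are hypothetical names for illustration and are not the kernel's blk-mq structures.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_TAGS 4

/* Hypothetical, simplified request: only the tag matters here. */
struct toy_request {
    unsigned int tag;
};

/* Hypothetical queue: normal requests indexed by tag, plus one flush request. */
struct toy_queue {
    struct toy_request *rqs[TOY_NR_TAGS];
    struct toy_request *flush_rq;   /* borrows the tag of its parent request */
    bool flush_in_flight;           /* assumption: set while the flush owns the tag */
};

/* Map a tag back to a request, preferring the flush request when it
 * currently holds the borrowed tag. */
struct toy_request *toy_tag_to_rq(struct toy_queue *q, unsigned int tag)
{
    if (tag >= TOY_NR_TAGS)
        return NULL;
    if (q->flush_in_flight && q->flush_rq && q->flush_rq->tag == tag)
        return q->flush_rq;
    return q->rqs[tag];
}
```

Without the special case, a completion for the borrowed tag would be attributed to the parent request rather than to the flush request.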
2014-05-30ACPI / scan: use platform bus type by default for _HID enumerationZhang Rui
Because of the growing demand for enumerating ACPI devices to platform bus, change the code to enumerate ACPI device objects to platform bus by default. Namely, create platform devices for the ACPI device objects that 1. Have pnp.type.platform_id set (device objects with _HID currently). 2. Do not have a scan handler attached. 3. Are not SPI/I2C slave devices (that should be enumerated to the appropriate buses by their parent). Signed-off-by: Zhang Rui <rui.zhang@intel.com> [rjw: Subject and changelog, rebase and code cleanup] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
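The three conditions listed above amount to a simple predicate. The following is a minimal sketch, assuming a made-up `toy_acpi_device` structure whose fields stand in for the real state; it is not the code from drivers/acpi/scan.c.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant bits of an ACPI device object. */
struct toy_acpi_device {
    bool platform_id;        /* models pnp.type.platform_id */
    bool has_scan_handler;   /* a scan handler is attached */
    bool is_spi_i2c_slave;   /* enumerated by its parent bus instead */
};

/* Default enumeration rule: create a platform device only when all
 * three conditions from the changelog hold. */
bool toy_should_create_platform_device(const struct toy_acpi_device *adev)
{
    return adev->platform_id &&
           !adev->has_scan_handler &&
           !adev->is_spi_i2c_slave;
}
```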
2014-05-30ACPI / scan: always register ACPI LPSS scan handlerRafael J. Wysocki
Prevent platform devices from being created for ACPI LPSS devices if CONFIG_X86_INTEL_LPSS is unset by compiling out the LPSS scan handler's callbacks only in that case and still compiling its device ID list in and registering the scan handler in either case. This change is based on a prototype from Zhang Rui. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2014-05-30ACPI / scan: always register memory hotplug scan handlerRafael J. Wysocki
Prevent platform devices from being created for ACPI memory device objects if CONFIG_ACPI_HOTPLUG_MEMORY is unset by compiling out the memory hotplug scan handler's callbacks only in that case and still compiling its device ID list in and registering the scan handler in either case. Also unset the memory hotplug scan handler's .attach() callback if acpi_no_memhotplug is set, but still register the scan handler to avoid creating platform devices for ACPI memory devices in that case too. This change is based on a prototype from Zhang Rui. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2014-05-30ACPI / scan: always register container scan handlerRafael J. Wysocki
Prevent platform devices from being created for ACPI containers if CONFIG_ACPI_CONTAINER is unset by compiling out the container scan handler's callbacks only in that case and still compiling its device ID list in and registering the scan handler in either case. This change is based on a prototype from Zhang Rui. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2014-05-30ACPI / scan: Change the meaning of missing .attach() in scan handlersRafael J. Wysocki
Currently, some scan handlers can be compiled out entirely, which leaves the device objects they normally attach to without a scan handler. This isn't a problem as long as we don't have any default enumeration mechanism that applies to all devices without a scan handler. However, if such a default enumeration is added, it still should not be applied to devices that are normally attached to by scan handlers, because that may result in creating "physical" device objects of a wrong type for them. Since we are going to create platform device objects for all ACPI device objects with pnp.type.platform_id set by default, clear pnp.type.platform_id where there is a matching scan handler without an .attach() callback and otherwise simply treat that scan handler as though the .attach() callback was present but always returned 0. This will allow us to compile out scan handler callbacks and leave the device ID lists used by them so as to prevent creating platform device objects for the matching ACPI devices. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
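The new meaning of a missing .attach() can be sketched as follows. All names (`toy_device`, `toy_handler`, `toy_scan_attach`, `toy_attach_ok`) are hypothetical, and the convention that a positive return value means "attached" is an assumption of this sketch rather than something stated in the changelog.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_device;

/* Hypothetical scan handler: the callback may be compiled out (NULL). */
struct toy_handler {
    int (*attach)(struct toy_device *dev);
};

struct toy_device {
    bool platform_id;                   /* models pnp.type.platform_id */
    const struct toy_handler *handler;  /* set once a handler has attached */
};

/* Example callback that always claims the device (assumed convention: > 0). */
int toy_attach_ok(struct toy_device *dev)
{
    (void)dev;
    return 1;
}

/* Called for a handler whose ID list already matched the device. */
int toy_scan_attach(struct toy_device *dev, const struct toy_handler *h)
{
    if (!h->attach) {
        /* Callback compiled out: block default platform enumeration and
         * behave as if .attach() had returned 0. */
        dev->platform_id = false;
        return 0;
    }
    int ret = h->attach(dev);
    if (ret > 0)
        dev->handler = h;
    return ret;
}
```

The point of the design is that the ID list alone is enough to keep a device away from the default platform enumeration, even when the handler's callbacks are not built.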
2014-05-30ACPI / scan: introduce platform_id device PNP type flagRafael J. Wysocki
Only certain types of ACPI device objects can be enumerated as platform devices, so in order to distinguish them from the others introduce a new ACPI device PNP type flag, platform_id, and set it for devices with a valid _HID to start with. This change is based on Zhang Rui's prototype. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2014-05-30ACPI / scan: drop unsupported serial IDs from PNP ACPI scan handler ID listZhang Rui
The "serial" PNP driver supports some "unknown" PNP modems (PNPCXXX/PNPDXXX) by matching magic strings in the PNP device name or the PNP device card name. ACPI enumerated PNP devices neither are PNP cards, nor have those magic strings in device names, so this mechanism never actually works for ACPI enumerated PNPCXXX/PNPDXXX devices. Consequently, it is safe to remove those two IDs from the PNP ACPI scan handler's device ID list. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [rjw: Subject and changelog] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2014-05-30ACPI / scan: drop IDs that do not comply with the ACPI PNP ID ruleZhang Rui
The PNP ACPI scan handler device ID list includes all the IDs from all of the struct pnp_device_id instances in the tree, but some of them do not follow the ACPI PNP ID rule (3 letters + 4 hex digits). For those IDs, the corresponding devices will never be enumerated via ACPI, so it is safe to remove them from the PNP ACPI ID list. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [rjw: Subject and changelog] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
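The "3 letters + 4 hex digits" rule is easy to express as a check. This is a sketch only: `toy_is_acpi_pnp_id` is a hypothetical helper, it assumes the letters must be capitals (as the PNP enumeration changelog in this series puts it), and it accepts lower-case hex digits, which the kernel's own check may or may not do.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Return true if id looks like an ACPI PNP ID: three capital letters
 * followed by four hexadecimal digits (e.g. "PNP0C09"). */
bool toy_is_acpi_pnp_id(const char *id)
{
    if (strlen(id) != 7)
        return false;
    for (int i = 0; i < 3; i++) {
        if (id[i] < 'A' || id[i] > 'Z')
            return false;
    }
    for (int i = 3; i < 7; i++) {
        if (!isxdigit((unsigned char)id[i]))
            return false;
    }
    return true;
}
```

An ID that fails such a check can never match a device enumerated through ACPI, which is why dropping it from the list is harmless.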
2014-05-30ACPI / PNP: use device ID list for PNPACPI device enumerationZhang Rui
ACPI can be used to enumerate PNP devices, but the code does not handle this in the right way currently. Namely, if an ACPI device object 1. Has a _CRS method, 2. Has an identification of "three capital characters followed by four hex digits", 3. Is not in the excluded IDs list, it will be enumerated to PNP bus (that is, a PNP device object will be created for it). This means that, actually, the PNP bus type is used as the default bus type for enumerating _HID devices in ACPI. However, more and more _HID devices need to be enumerated to the platform bus instead (that is, platform device objects need to be created for them). As a result, the device ID list in acpi_platform.c is used to enforce creating platform device objects rather than PNP device objects for matching devices. That list has been continuously growing recently, unfortunately, and it is pretty much guaranteed to grow even more in the future. To address that problem it is better to enumerate _HID devices as platform devices by default. To this end, change the way of enumerating PNP devices by adding a PNP ACPI scan handler that will use a device ID list to create PNP devices for the ACPI device objects whose device IDs are present in that list. The initial device ID list in the PNP ACPI scan handler contains all of the pnp_device_id strings from all the existing PNP drivers, so this change should be transparent to the PNP core and all of the PNP drivers. Still, in the future it should be possible to reduce its size by converting PNP drivers that need not be PNP for any technical reasons into platform drivers. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [rjw: Rewrote the changelog, modified the PNP ACPI scan handler code] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2014-05-30ACPI / scan: .match() callback for ACPI scan handlersRafael J. Wysocki
Introduce a .match() callback for ACPI scan handlers to allow them to use more elaborate matching algorithms if necessary. That is needed for the upcoming PNP scan handler in particular. This change is based on Zhang Rui's prototype. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2014-05-30MIPS: uasm: Add mflo uasm instructionMarkos Chandras
It will be used later on by bpf-jit [ralf@linux-mips.org: Resolved conflict.] Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
2014-05-30Merge branch 'acpi-lpss' into acpi-enumerationRafael J. Wysocki
2014-05-30MIPS: uasm: Add mul uasm instructionMarkos Chandras
It will be used later on by bpf-jit [ralf@linux-mips.org: Resolved conflict.] Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: http://patchwork.linux-mips.org/patch/6736/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-05-30MIPS: uasm: Add lh uasm instructionMarkos Chandras
It will be used later on by bpf-jit [ralf@linux-mips.org: Resolved conflict.] Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: http://patchwork.linux-mips.org/patch/6733/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-05-30MIPS: uasm: Add wsbh uasm instructionMarkos Chandras
It will be used later on by bpf-jit [ralf@linux-mips.org: Resolved conflict.] Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: http://patchwork.linux-mips.org/patch/6732/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-05-30MIPS: uasm: Add sltu uasm instructionMarkos Chandras
It will be used later on by bpf-jit [ralf@linux-mips.org: Resolved conflict.] Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: http://patchwork.linux-mips.org/patch/6731/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-05-30MIPS: uasm: Add sltiu uasm instructionMarkos Chandras
It will be used later on by bpf-jit [ralf@linux-mips.org: Resolved conflict.] Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: http://patchwork.linux-mips.org/patch/6730/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-05-30MIPS: uasm: Add jalr uasm instructionMarkos Chandras
It will be used later on by bpf-jit [ralf@linux-mips.org: Resolved conflict.] Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: http://patchwork.linux-mips.org/patch/6729/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-05-30tracing: Try again for saved cmdline if failed due to lockingSteven Rostedt (Red Hat)
In order to prevent the saved cmdline cache from being filled when tracing is not active, the comms are only recorded after a trace event is recorded. The problem is, a comm can fail to be recorded if the trace_cmdline_lock is held. That lock is taken via a trylock to allow it to happen from any context (including NMI). If the lock fails to be taken, the comm is skipped. No big deal, as we will try again later. But! Because of the code that was added to only record after an event, we may not try again later as the recording is made as a oneshot per event per CPU. Only disable the recording of the comm if the comm is actually recorded. Fixes: 7ffbd48d5cab "tracing: Cache comms only after an event occurred" Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
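The fix can be pictured with a small model of the one-shot flag. This is a sketch under assumptions: `toy_cpu_state`, `toy_save_cmdline` and `toy_record_cmdline` are hypothetical names, and the `lock_available` parameter simply simulates whether the trylock on trace_cmdline_lock would succeed.

```c
#include <stdbool.h>

/* Hypothetical per-CPU state: a one-shot "record the comm" request. */
struct toy_cpu_state {
    bool record_pending;  /* set when a trace event was recorded */
    int recorded;         /* number of comms actually saved */
};

/* Simulated save: fails when the lock cannot be taken (trylock semantics). */
static bool toy_save_cmdline(struct toy_cpu_state *s, bool lock_available)
{
    if (!lock_available)
        return false;
    s->recorded++;
    return true;
}

/* Clear the one-shot flag only after the comm was really recorded,
 * so that a failed trylock is retried on a later call. */
void toy_record_cmdline(struct toy_cpu_state *s, bool lock_available)
{
    if (!s->record_pending)
        return;
    if (!toy_save_cmdline(s, lock_available))
        return;               /* keep record_pending set: try again later */
    s->record_pending = false;
}
```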
2014-05-30ALSA: firewire: Fix dependency on PCM and rawmidiTakashi Iwai
Now snd-firewire-lib supports rawmidi in addition to PCM, thus we need to give a proper dependency. For fixing and simplification, move the selections of SND_PCM and SND_RAWMIDI into SND_FIREWIRE_LIB section. Then each driver doesn't have to select them but only SND_FIREWIRE_LIB. Reported-by: Jim Davis <jim.epost@gmail.com> Reviewed-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Tested-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2014-05-30MIPS: uasm: Add mfhi uasm instructionMarkos Chandras
It will be used later on by bpf-jit [ralf@linux-mips.org: Resolved conflict.] Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Cc: linux-mips@linux-mips.org Cc: Markos Chandras <markos.chandras@imgtec.com> Patchwork: http://patchwork.linux-mips.org/patch/6728/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-05-30MIPS: uasm: Add divu uasm instructionMarkos Chandras
It will be used later on by bpf-jit [ralf@linux-mips.org: Resolved conflict.] Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: http://patchwork.linux-mips.org/patch/6727/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-05-30MIPS: uasm: Add srlv uasm instructionMarkos Chandras
It will be used later on by bpf-jit [ralf@linux-mips.org: Fixed conflict due to other preceding conflicts.] Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: http://patchwork.linux-mips.org/patch/6726/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-05-30MIPS: uasm: Add sllv uasm instructionMarkos Chandras
It will be used later on by bpf-jit [ralf@linux-mips.org: Fixed conflict with 49e9529b9d43773307b8c73bd251b71784830c3d (MIPS: uasm: add jalr instruction).] Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: http://patchwork.linux-mips.org/patch/6725/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-05-30Merge tag 'signed-kvm-ppc-next' of git://github.com/agraf/linux-2.6 into kvm-nextPaolo Bonzini
Patch queue for ppc - 2014-05-30 In this round we have a few nice gems. PR KVM gains initial POWER8 support as well as LE host awareness, the e500 targets can now properly run u-boot, LE guests now work with PR KVM including KVM hypercalls and HV KVM guests can now use huge pages. On top of this there are some bug fixes. Conflicts: include/uapi/linux/kvm.h
2014-05-30KVM: PPC: Book3S PR: Rework SLB switching codeAlexander Graf
On LPAR guest systems Linux enables the shadow SLB to indicate to the hypervisor a number of SLB entries that always have to be available. Today we go through this shadow SLB and disable all ESID's valid bits. However, pHyp doesn't like this approach very much and honors us with fancy machine checks. Fortunately the shadow SLB descriptor also has an entry that indicates the number of valid entries following. During the lifetime of a guest we can just swap that value to 0 and don't have to worry about the SLB restoration magic. While we're touching the code, let's also make it more readable (get rid of rldicl), allow it to deal with a dynamic number of bolted SLB entries and only do shadow SLB swizzling on LPAR systems. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30KVM: PPC: Book3S PR: Use SLB entry 0Alexander Graf
We didn't make use of SLB entry 0 because ... of no good reason. SLB entry 0 will always be used by the Linux linear SLB entry, so the fact that slbia does not invalidate it doesn't matter as we overwrite SLB 0 on exit anyway. Just enable use of SLB entry 0 for our shadow SLB code. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30KVM: PPC: Book3S HV: Fix machine check delivery to guestPaul Mackerras
The code that delivered a machine check to the guest after handling it in real mode failed to load up r11 before calling kvmppc_msr_interrupt, which needs the old MSR value in r11 so it can see the transactional state there. This adds the missing load. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30KVM: PPC: Book3S HV: Work around POWER8 performance monitor bugsPaul Mackerras
This adds workarounds for two hardware bugs in the POWER8 performance monitor unit (PMU), both related to interrupt generation. The effect of these bugs is that PMU interrupts can get lost, leading to tools such as perf reporting fewer counts and samples than they should. The first bug relates to the PMAO (perf. mon. alert occurred) bit in MMCR0; setting it should cause an interrupt, but doesn't. The other bug relates to the PMAE (perf. mon. alert enable) bit in MMCR0. Setting PMAE when a counter is negative and counter negative conditions are enabled to cause alerts should cause an alert, but doesn't. The workaround for the first bug is to create conditions where a counter will overflow, whenever we are about to restore a MMCR0 value that has PMAO set (and PMAO_SYNC clear). The workaround for the second bug is to freeze all counters using MMCR2 before reading MMCR0. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30KVM: PPC: Book3S HV: Make sure we don't miss dirty pagesPaul Mackerras
Currently, when testing whether a page is dirty (when constructing the bitmap for the KVM_GET_DIRTY_LOG ioctl), we test the C (changed) bit in the HPT entries mapping the page, and if it is 0, we consider the page to be clean. However, the Power ISA doesn't require processors to set the C bit to 1 immediately when writing to a page, and in fact allows them to delay the writeback of the C bit until they receive a TLB invalidation for the page. Thus it is possible that the page could be dirty and we miss it. Now, if there are vcpus running, this is not serious since the collection of the dirty log is racy already - some vcpu could dirty the page just after we check it. But if there are no vcpus running we should return definitive results, in case we are in the final phase of migrating the guest. Also, if the permission bits in the HPTE don't allow writing, then we know that no CPU can set C. If the HPTE was previously writable and the page was modified, any C bit writeback would have been flushed out by the tlbie that we did when changing the HPTE to read-only. Otherwise we need to do a TLB invalidation even if the C bit is 0, and then check the C bit. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30KVM: PPC: Book3S HV: Fix dirty map for hugepagesAlexey Kardashevskiy
The dirty map that we construct for the KVM_GET_DIRTY_LOG ioctl has one bit per system page (4K/64K). Currently, we only set one bit in the map for each HPT entry with the Change bit set, even if the HPT is for a large page (e.g., 16MB). Userspace then considers only the first system page dirty, though in fact the guest may have modified anywhere in the large page. To fix this, we make kvm_test_clear_dirty() return the actual number of pages that are dirty (and rename it to kvm_test_clear_dirty_npages() to emphasize that that's what it returns). In kvmppc_hv_get_dirty_log() we then set that many bits in the dirty map. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
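Setting one bit per system page for a dirty huge page can be sketched like this. The helper names `toy_set_dirty_bits` and `toy_test_bit` are hypothetical, and the bitmap is a plain array rather than the kernel's dirty log.

```c
#include <limits.h>
#include <stddef.h>

#define TOY_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Mark npages consecutive system pages dirty, starting at page index first.
 * For a 16MB huge page with 64K system pages, npages would be 256. */
void toy_set_dirty_bits(unsigned long *map, size_t first, size_t npages)
{
    for (size_t i = first; i < first + npages; i++)
        map[i / TOY_BITS_PER_LONG] |= 1UL << (i % TOY_BITS_PER_LONG);
}

/* Return 1 if page index i is marked dirty, 0 otherwise. */
int toy_test_bit(const unsigned long *map, size_t i)
{
    return (int)((map[i / TOY_BITS_PER_LONG] >> (i % TOY_BITS_PER_LONG)) & 1UL);
}
```

With only the first bit set, userspace would transfer a single system page during migration and silently lose writes to the rest of the huge page.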
2014-05-30KVM: PPC: Book3S HV: Put huge-page HPTEs in rmap chain for base addressPaul Mackerras
Currently, when a huge page is faulted in for a guest, we select the rmap chain to insert the HPTE into based on the guest physical address that the guest tried to access. Since there is an rmap chain for each system page, there are many rmap chains for the area covered by a huge page (e.g. 256 for 16MB pages when PAGE_SIZE = 64kB), and the huge-page HPTE could end up in any one of them. For consistency, and to make the huge-page HPTEs easier to find, we now put huge-page HPTEs in the rmap chain corresponding to the base address of the huge page. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
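The arithmetic behind "256 rmap chains per 16MB page" and the choice of the base chain can be sketched as below. Both helpers are hypothetical and assume power-of-two sizes and a frame number that is counted from a huge-page-aligned origin; the kernel works with memslot-relative indices, which this sketch ignores.

```c
/* Number of system pages (and therefore rmap chains) covered by one huge page. */
unsigned long toy_pages_per_huge_page(unsigned long long huge_size,
                                      unsigned long long page_size)
{
    return (unsigned long)(huge_size / page_size);
}

/* Round a guest frame number down to the first frame of its huge page.
 * Assumes npages is a power of two. */
unsigned long toy_rmap_base_gfn(unsigned long gfn, unsigned long npages)
{
    return gfn & ~(npages - 1);
}
```

For example, with 16MB huge pages and 64kB system pages, any faulting frame inside the huge page maps to the same base frame, so the HPTE is always found in one predictable chain.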
2014-05-30KVM: PPC: Book3S HV: Fix check for running inside guest in global_invalidates()Paul Mackerras
The global_invalidates() function contains a check that is intended to tell whether we are currently executing in the context of a hypercall issued by the guest. The reason is that the optimization of using a local TLB invalidate instruction is only valid in that context. The check was testing local_paca->kvm_hstate.kvm_vcore, which gets set when entering the guest but no longer gets cleared when exiting the guest. To fix this, we use the kvm_vcpu field instead, which does get cleared when exiting the guest, by the kvmppc_release_hwthread() calls inside kvmppc_run_core(). The effect of having the check wrong was that when kvmppc_do_h_remove() got called from htab_write() on the destination machine during a migration, it cleared the current cpu's bit in kvm->arch.need_tlb_flush. This meant that when the guest started running in the destination VM, it may miss out on doing a complete TLB flush, and therefore may end up using stale TLB entries from a previous guest that used the same LPID value. This should make migration more reliable. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30KVM: PPC: Book3S: Move KVM_REG_PPC_WORT to an unused register numberPaul Mackerras
Commit b005255e12a3 ("KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs") added a definition of KVM_REG_PPC_WORT with the same register number as the existing KVM_REG_PPC_VRSAVE (though in fact the definitions are not identical because of the different register sizes.) For clarity, this moves KVM_REG_PPC_WORT to the next unused number, and also adds it to api.txt. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30KVM: PPC: Book3S: Add ONE_REG register names that were missedPaul Mackerras
Commit 3b7834743f9 ("KVM: PPC: Book3S HV: Reserve POWER8 space in get/set_one_reg") added definitions for several KVM_REG_PPC_* symbols but missed adding some to api.txt. This adds them. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30KVM: PPC: Add CAP to indicate hcall fixesAlexander Graf
We worked around some nasty KVM magic page hcall breakages: 1) NX bit not honored, so ignore NX when we detect it 2) LE guests swizzle hypercall instruction Without these fixes in place, there's no way it would make sense to expose kvm hypercalls to a guest. Chances are immensely high it would trip over and break. So add a new CAP that gives user space a hint that we have workarounds for the bugs above in place. It can use those as a hint to disable PV hypercalls when the guest CPU is anything POWER7 or higher and the host does not have fixes in place. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30KVM: PPC: MPIC: Reset IRQ source private membersAlexander Graf
When we reset the in-kernel MPIC controller, we forget to reset some hidden state such as destmask and output. This state is usually set when the guest writes to the IDR register for a specific IRQ line. To make sure we stay in sync and don't forget hidden state, treat reset of the IDR register as a simple write of the IDR register. That automatically updates all the hidden state as well. Reported-by: Paul Janzen <pcj@pauljanzen.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30KVM: PPC: Graciously fail broken LE hypercallsAlexander Graf
There are LE Linux guests out there that don't handle hypercalls correctly. Instead of interpreting the instruction stream from device tree as big endian they assume it's a little endian instruction stream and fail. When we see an illegal instruction from such a byte reversed instruction stream, bail out graciously and just declare every hcall as error. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30PPC: ePAPR: Fix hypercall on LE guestAlexander Graf
We get an array of instructions from the hypervisor via device tree that we write into a buffer that gets executed whenever we want to make an ePAPR compliant hypercall. However, the hypervisor passes us these instructions in BE order which we have to manually convert to LE when we want to run them in LE mode. With this fixup in place, I can successfully run LE kernels with KVM PV enabled on PR KVM. Signed-off-by: Alexander Graf <agraf@suse.de>
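The conversion amounts to reading each instruction word as big-endian regardless of the mode the guest runs in. The sketch below uses hypothetical helpers (`toy_be32_to_cpu`, `toy_copy_hcall_insts`) and a byte-wise decode that gives the same result on any host; it does not reproduce the kernel's device tree accessors.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one big-endian 32-bit word into the native representation. */
uint32_t toy_be32_to_cpu(const uint8_t b[4])
{
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/* Copy n big-endian instruction words from a raw property buffer into
 * a native-order instruction buffer that can then be executed. */
void toy_copy_hcall_insts(uint32_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = toy_be32_to_cpu(src + 4 * i);
}
```

On a BE kernel this is a plain copy; on an LE kernel it performs the byte swap that the original code was missing.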
2014-05-30KVM: PPC: BOOK3S: Remove open coded make_dsisr in alignment handlerAneesh Kumar K.V
Use make_dsisr instead of open coding it. This also has the added benefit of handling alignment interrupts on additional instructions. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30KVM: PPC: BOOK3S: Always use the saved DAR valueAneesh Kumar K.V
Although it's optional, IBM POWER CPUs have always set the DAR value on alignment interrupts. So don't try to compute these values. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30PPC: KVM: Make NX bit available with magic pageAlexander Graf
Because old kernels enable the magic page and then choke on NXed trampoline code we have to disable NX by default in KVM when we use the magic page. However, since commit b18db0b8 we have successfully fixed that and can now leave NX enabled, so tell the hypervisor about this. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30KVM: PPC: Disable NX for old magic page using guestsAlexander Graf
Old guests try to use the magic page, but map their trampoline code inside of an NX region. Since we can't fix those old kernels, try to detect whether the guest is sane or not. If not, just disable NX functionality in KVM so that old guests at least work at all. For newer guests, add a bit that we can set to keep NX functionality available. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30KVM: PPC: BOOK3S: HV: Add mixed page-size support for guestAneesh Kumar K.V
On recent IBM Power CPUs, while the hashed page table is looked up using the page size from the segmentation hardware (i.e. the SLB), it is possible to have the HPT entry indicate a larger page size. Thus for example it is possible to put a 16MB page in a 64kB segment, but since the hash lookup is done using a 64kB page size, it may be necessary to put multiple entries in the HPT for a single 16MB page. This capability is called mixed page-size segment (MPSS). With MPSS, there are two relevant page sizes: the base page size, which is the size used in searching the HPT, and the actual page size, which is the size indicated in the HPT entry. [ Note that the actual page size is always >= base page size ]. We use "ibm,segment-page-sizes" device tree node to advertise the MPSS support to PAPR guest. The penc encoding indicates whether we support a specific combination of base page size and actual page size in the same segment. We also use the penc value in the LP encoding of HPTE entry. This patch exposes MPSS support to KVM guest by advertising the feature via "ibm,segment-page-sizes". It also adds the necessary changes to decode the base page size and the actual page size correctly from the HPTE entry. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30KVM: PPC: BOOK3S: HV: Prefer CMA region for hash page table allocationAneesh Kumar K.V
Today when KVM tries to reserve memory for the hash page table it allocates from the normal page allocator first. If that fails it falls back to CMA's reserved region. One of the side effects of this is that we could end up exhausting the page allocator and get linux into OOM conditions while we still have plenty of space available in CMA. This patch addresses this issue by first trying hash page table allocation from CMA's reserved region before falling back to the normal page allocator. So if we run out of memory, we really are out of memory. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>
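The changed allocation order can be modeled with two counters standing in for the two memory sources. Everything here (`toy_pool`, `toy_source`, `toy_alloc_hpt`) is a hypothetical illustration of the ordering only, not the kernel's allocator interfaces.

```c
#include <stddef.h>

/* Hypothetical pool: just a count of free pages. */
struct toy_pool {
    size_t free_pages;
};

enum toy_source { TOY_NONE, TOY_CMA, TOY_PAGE_ALLOCATOR };

/* Take n pages from a pool if it has enough; return 1 on success. */
static int toy_pool_take(struct toy_pool *p, size_t n)
{
    if (p->free_pages < n)
        return 0;
    p->free_pages -= n;
    return 1;
}

/* New order: try the CMA reserved region first, then fall back to the
 * normal page allocator. Returns which source satisfied the request. */
enum toy_source toy_alloc_hpt(struct toy_pool *cma, struct toy_pool *normal, size_t n)
{
    if (toy_pool_take(cma, n))
        return TOY_CMA;
    if (toy_pool_take(normal, n))
        return TOY_PAGE_ALLOCATOR;
    return TOY_NONE;
}
```

With this order, a failure means both sources are exhausted, which matches the changelog's "if we run out of memory, we really are out of memory".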
2014-05-30KVM: PPC: Book3S PR: Expose TM registersAlexander Graf
POWER8 introduces transactional memory which brings along a number of new registers and MSR bits. Implementing all of those is a pretty big headache, so for now let's at least emulate enough to make Linux's context switching code happy. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30KVM: PPC: Book3S PR: Expose EBB registersAlexander Graf
POWER8 introduces a new facility called the "Event Based Branch" facility. It consists of a few registers that indicate where a guest should branch to when a defined event occurs and it's in PR mode. We don't want to really enable EBB as it will create a big mess with !PR guest mode while hardware is in PR and we don't really emulate the PMU anyway. So instead, let's just leave it at emulation of all its registers. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30KVM: PPC: Book3S PR: Expose TAR facility to guestAlexander Graf
POWER8 implements a new register called TAR. This register has to be enabled in FSCR and then from KVM's point of view is mere storage. This patch enables the guest to use TAR. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30KVM: PPC: Book3S PR: Handle Facility interrupt and FSCRAlexander Graf
POWER8 introduced a new interrupt type called "Facility unavailable interrupt" which contains its status message in a new register called FSCR. Handle these exits and try to emulate instructions for unhandled facilities. Follow-on patches enable KVM to expose specific facilities into the guest. Signed-off-by: Alexander Graf <agraf@suse.de>