Commit log (path: root/arch)

2013-08-09  xen/p2m: avoid unnecessary TLB flush in m2p_remove_override()  (David Vrabel)

In m2p_remove_override(), when removing the grant map from the kernel mapping and replacing it with a mapping to the original page, the grant unmap will already have flushed the TLB, so it is not necessary to flush again after updating the mapping.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

2013-08-09  xen: Support 64-bit PV guest receiving NMIs  (Konrad Rzeszutek Wilk)

This is based on a patch that Zhenzhong Duan had sent, which was missing some of the remaining pieces. The kernel has the logic to handle Xen-type exceptions using the paravirt interface in the assembler code (see PARAVIRT_ADJUST_EXCEPTION_FRAME - pv_irq_ops.adjust_exception_frame and INTERRUPT_RETURN - pv_cpu_ops.iret). That means the NMI handler (and other exception handlers) use the hypervisor iret.

The other changes necessary for this are to translate the NMI_VECTOR to one of the entries on the ipi_vector and make xen_send_IPI_mask_allbutself use different events. Fortunately for us, commit 1db01b4903639fcfaec213701a494fe3fb2c490b (xen: Clean up apic ipi interface) implemented this, and we piggyback on that cleanup such that the apic IPI interface will pass the right vector value for NMI.

With this patch we can trigger NMIs within a PV guest (only tested on x86_64). For this to work with normal PV guests (not the initial domain) we need the domain to be able to use the APIC ops - they are already implemented to use the Xen event channels. For that to be turned on in a PV domU we need to remove the masking of X86_FEATURE_APIC. Incidentally, that means kgdb will also now work within a PV guest without using the 'nokgdbroundup' workaround.

Note that the 32-bit version is different and this patch does not enable that.

CC: Lisa Nguyen <lisa@xenapiadmin.com>
CC: Ben Guthro <benjamin.guthro@citrix.com>
CC: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[v1: Fixed up per David Vrabel comments]
Reviewed-by: Ben Guthro <benjamin.guthro@citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>

2013-08-09  kvm guest: Add configuration support to enable debug information for KVM Guests  (Srivatsa Vaddagiri)

Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1376058122-8248-14-git-send-email-raghavendra.kt@linux.vnet.ibm.com
Signed-off-by: Suzuki Poulose <suzuki@in.ibm.com>
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Acked-by: Gleb Natapov <gleb@redhat.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

2013-08-09  kvm uapi: Add KICK_CPU and PV_UNHALT definition to uapi  (Raghavendra K T)

These are needed by both guest and host.

Originally-from: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1376058122-8248-13-git-send-email-raghavendra.kt@linux.vnet.ibm.com
Acked-by: Gleb Natapov <gleb@redhat.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

2013-08-09  xen, pvticketlock: Allow interrupts to be enabled while blocking  (Jeremy Fitzhardinge)

If interrupts were enabled when taking the spinlock, we can leave them enabled while blocking to get the lock.

If we enable interrupts while waiting for the lock to become available, and we take an interrupt before entering the poll, and the handler takes a spinlock which ends up going into the slow state (invalidating the per-cpu "lock" and "want" values), then when the interrupt handler returns the event channel will remain pending, so the poll will return immediately, causing it to return out to the main spinlock loop.

Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Link: http://lkml.kernel.org/r/1376058122-8248-12-git-send-email-raghavendra.kt@linux.vnet.ibm.com
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

2013-08-09  x86, ticketlock: Add slowpath logic  (Jeremy Fitzhardinge)

Maintain a flag in the LSB of the ticket lock tail which indicates whether anyone is in the lock slowpath and may need kicking when the current holder unlocks. The flags are set when the first locker enters the slowpath, and cleared when unlocking to an empty queue (i.e., no contention).

In the specific implementation of lock_spinning(), make sure to set the slowpath flags on the lock just before blocking. We must do this before the last-chance pickup test to prevent a deadlock with the unlocker:

    Unlocker                  Locker
                              test for lock pickup
                                  -> fail
    unlock
    test slowpath
        -> false
                              set slowpath flags
                              block

Whereas this works in any ordering:

    Unlocker                  Locker
    set slowpath flags
                              test for lock pickup
                                  -> fail
                              block
    unlock
    test slowpath
        -> true, kick

If the unlocker finds that the lock has the slowpath flag set but it is actually uncontended (i.e., head == tail, so nobody is waiting), then it clears the slowpath flag.

The unlock code uses a locked add to update the head counter. This also acts as a full memory barrier, so it is safe to subsequently read back the slowpath flag, knowing that the updated lock is visible to the other CPUs. If it were an unlocked add, then the flag read might just be forwarded from the store buffer before it was visible to the other CPUs, which could result in a deadlock.

Unfortunately this means we need to do a locked instruction when unlocking with PV ticketlocks. However, if PV ticketlocks are not enabled, then the old non-locked "add" is the only unlocking code.

Note: this code relies on gcc making sure that unlikely() code is out of line of the fastpath, which only happens when OPTIMIZE_SIZE=n. If it doesn't, the generated code isn't too bad, but it is definitely suboptimal.

Thanks to Srivatsa Vaddagiri for providing a bugfix to the original version of this change, which has been folded in. Thanks to Stephan Diestelhorst for commenting on some code which relied on an inaccurate reading of the x86 memory ordering rules.

Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Link: http://lkml.kernel.org/r/1376058122-8248-11-git-send-email-raghavendra.kt@linux.vnet.ibm.com
Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stephan Diestelhorst <stephan.diestelhorst@amd.com>
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

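A condensed sketch of the unlock path this describes, simplified from the x86 ticket-lock code as it ends up after this series (names approximate):

    static __always_inline void arch_spin_unlock(arch_spinlock_t *lock)
    {
        if (TICKET_SLOWPATH_FLAG &&
            static_key_false(&paravirt_ticketlocks_enabled)) {
            arch_spinlock_t prev = *lock;

            /* Locked add: acts as a full barrier, so the slowpath-flag
             * read below happens after the unlock is globally visible. */
            add_smp(&lock->tickets.head, TICKET_LOCK_INC);

            if (unlikely(lock->tickets.tail & TICKET_SLOWPATH_FLAG))
                __ticket_unlock_slowpath(lock, prev);
        } else {
            /* No PV ticketlocks: the old non-locked add is enough. */
            __add(&lock->tickets.head, TICKET_LOCK_INC, UNLOCK_LOCK_PREFIX);
        }
    }
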
2013-08-09  x86, pvticketlock: When paravirtualizing ticket locks, increment by 2  (Jeremy Fitzhardinge)

Increment ticket head/tails by 2 rather than 1 to leave the LSB free to store an "is in slowpath state" bit. This halves the number of possible CPUs for a given ticket size, but this shouldn't matter in practice - kernels built for 32k+ CPU systems are probably specially built for the hardware rather than a generic distro kernel.

Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Link: http://lkml.kernel.org/r/1376058122-8248-9-git-send-email-raghavendra.kt@linux.vnet.ibm.com
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Attilio Rao <attilio.rao@citrix.com>
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

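In sketch form, the constants this implies (as they look once the slowpath-flag patch is applied on top; without PV spinlocks nothing changes):

    #ifdef CONFIG_PARAVIRT_SPINLOCKS
    /* Tickets step by 2; bit 0 of the tail is the slowpath flag. */
    #define __TICKET_LOCK_INC       2
    #define TICKET_SLOWPATH_FLAG    ((__ticket_t)1)
    #else
    #define __TICKET_LOCK_INC       1
    #define TICKET_SLOWPATH_FLAG    ((__ticket_t)0)
    #endif
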
2013-08-09  x86, pvticketlock: Use callee-save for lock_spinning  (Jeremy Fitzhardinge)

Although the lock_spinning calls in the spinlock code are on the uncommon path, their presence can cause the compiler to generate many more register save/restores in the function pre/postamble, which is in the fast path. To avoid this, convert it to using the pvops callee-save calling convention, which defers all the save/restores until the actual function is called, keeping the fastpath clean.

Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Link: http://lkml.kernel.org/r/1376058122-8248-8-git-send-email-raghavendra.kt@linux.vnet.ibm.com
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Attilio Rao <attilio.rao@citrix.com>
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

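Roughly, the registration pattern this converts to (a sketch; the exact call site may differ):

    /* Generate an asm thunk that saves/restores all registers around
     * xen_lock_spinning(), so the inlined fastpath doesn't have to. */
    PV_CALLEE_SAVE_REGS_THUNK(xen_lock_spinning);

    void __init xen_init_spinlocks(void)
    {
        pv_lock_ops.lock_spinning = PV_CALLEE_SAVE(xen_lock_spinning);
        pv_lock_ops.unlock_kick = xen_unlock_kick;
    }
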
2013-08-09  xen, pvticketlocks: Add xen_nopvspin parameter to disable xen pv ticketlocks  (Jeremy Fitzhardinge)

Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Link: http://lkml.kernel.org/r/1376058122-8248-7-git-send-email-raghavendra.kt@linux.vnet.ibm.com
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

2013-08-09  xen, pvticketlock: Xen implementation for PV ticket locks  (Jeremy Fitzhardinge)

Replace the old Xen implementation of PV spinlocks with an implementation of xen_lock_spinning and xen_unlock_kick.

xen_lock_spinning simply registers the cpu in its entry in lock_waiting, adds itself to the waiting_cpus set, and blocks on an event channel until the channel becomes pending.

xen_unlock_kick searches the cpus in waiting_cpus looking for the one waiting on this lock with the next ticket, if any. If found, it kicks it by making its event channel pending, which wakes it up.

We need to make sure interrupts are disabled while we're relying on the contents of the per-cpu lock_waiting values, otherwise an interrupt handler could come in, try to take some other lock, block, and overwrite our values.

Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Link: http://lkml.kernel.org/r/1376058122-8248-6-git-send-email-raghavendra.kt@linux.vnet.ibm.com
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[Raghavendra: use function + enum instead of macro, cmpxchg for zero status reset.
 Reintroduce break since we know the exact vCPU to send the IPI to, as suggested by Konrad.]
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

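A condensed sketch of xen_lock_spinning as described above (slowpath-flag handling and error paths omitted; details may differ from the actual arch/x86/xen/spinlock.c):

    struct xen_lock_waiting {
        struct arch_spinlock *lock;
        __ticket_t want;
    };

    static DEFINE_PER_CPU(struct xen_lock_waiting, lock_waiting);
    static cpumask_t waiting_cpus;

    static void xen_lock_spinning(struct arch_spinlock *lock, __ticket_t want)
    {
        int irq = __this_cpu_read(lock_kicker_irq);
        struct xen_lock_waiting *w = &__get_cpu_var(lock_waiting);
        unsigned long flags;

        /* Interrupts must be off while the per-cpu slot is live, or a
         * nested lock taken from an interrupt handler would clobber it. */
        local_irq_save(flags);

        w->want = want;
        smp_wmb();
        w->lock = lock;
        cpumask_set_cpu(smp_processor_id(), &waiting_cpus);

        /* Clear any stale event, then block until the unlocker makes
         * our event channel pending (xen_unlock_kick). */
        xen_clear_irq_pending(irq);
        xen_poll_irq(irq);

        cpumask_clear_cpu(smp_processor_id(), &waiting_cpus);
        w->lock = NULL;
        local_irq_restore(flags);
    }
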
2013-08-09  xen: Defer spinlock setup until boot CPU setup  (Jeremy Fitzhardinge)

There's no need to do it at very early init, and doing it there makes it impossible to use the jump_label machinery.

Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Link: http://lkml.kernel.org/r/1376058122-8248-5-git-send-email-raghavendra.kt@linux.vnet.ibm.com
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

2013-08-09  x86, ticketlock: Collapse a layer of functions  (Jeremy Fitzhardinge)

Now that the paravirtualization layer doesn't exist at the spinlock level any more, we can collapse the __ticket_ functions into the arch_ functions.

Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Link: http://lkml.kernel.org/r/1376058122-8248-4-git-send-email-raghavendra.kt@linux.vnet.ibm.com
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Attilio Rao <attilio.rao@citrix.com>
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

2013-08-09  x86, ticketlock: Don't inline _spin_unlock when using paravirt spinlocks  (Raghavendra K T)

The code size expands somewhat, and it's better to just call a function rather than inline it. Thanks to Jeremy for the original version of the ARCH_NOINLINE_SPIN_UNLOCK config patch, which has been simplified here.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1376058122-8248-3-git-send-email-raghavendra.kt@linux.vnet.ibm.com
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

2013-08-09  x86, spinlock: Replace pv spinlocks with pv ticketlocks  (Jeremy Fitzhardinge)

Rather than outright replacing the entire spinlock implementation in order to paravirtualize it, keep the ticket lock implementation but add a couple of pvops hooks on the slow path (long spin on lock, unlocking a contended lock).

Ticket locks have a number of nice properties, but they also have some surprising behaviours in virtual environments. They enforce a strict FIFO ordering on cpus trying to take a lock; however, if the hypervisor scheduler does not schedule the cpus in the correct order, the system can waste a huge amount of time spinning until the next cpu can take the lock. (See Thomas Friebel's talk "Prevent Guests from Spinning Around", http://www.xen.org/files/xensummitboston08/LHP.pdf, for more details.)

To address this, we add two hooks:

 - __ticket_spin_lock, which is called after the cpu has been spinning on the lock for a significant number of iterations but has failed to take the lock (presumably because the cpu holding the lock has been descheduled). The lock_spinning pvop is expected to block the cpu until it has been kicked by the current lock holder.

 - __ticket_spin_unlock, which on releasing a contended lock (there are more cpus with tail tickets) looks to see if the next cpu is blocked and wakes it if so.

When compiled with CONFIG_PARAVIRT_SPINLOCKS disabled, a set of stub functions causes all the extra code to go away.

Results:
========
setup: 32 core machine with 32 vcpu KVM guest (HT off) with 8GB RAM
base = 3.11-rc
patched = base + pvspinlock V12

dbench (Throughput in MB/sec; higher is better)
+-----------------+-----------------+--------+
|  base (stdev %) | patched (stdev%)| %gain  |
+-----------------+-----------------+--------+
| 15035.3 (0.3)   | 15150.0 (0.6)   |   0.8  |
|  1470.0 (2.2)   |  1713.7 (1.9)   |  16.6  |
|   848.6 (4.3)   |   967.8 (4.3)   |  14.0  |
|   652.9 (3.5)   |   685.3 (3.7)   |   5.0  |
+-----------------+-----------------+--------+

pvspinlock shows benefits for overcommit ratio > 1 for PLE enabled cases, and undercommit results are flat.

Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Link: http://lkml.kernel.org/r/1376058122-8248-2-git-send-email-raghavendra.kt@linux.vnet.ibm.com
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Attilio Rao <attilio.rao@citrix.com>
[Raghavendra: Changed SPIN_THRESHOLD, fixed redefinition of arch_spinlock_t]
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

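The shape of the two hooks, roughly (a sketch of the paravirt plumbing as it ends up after the whole series; the callee-save form of lock_spinning comes from a later patch in the series):

    struct pv_lock_ops {
        struct paravirt_callee_save lock_spinning;
        void (*unlock_kick)(struct arch_spinlock *lock, __ticket_t ticket);
    };

    /* Invoked from the spin loop after SPIN_THRESHOLD failed iterations. */
    static __always_inline void __ticket_lock_spinning(arch_spinlock_t *lock,
                                                       __ticket_t ticket)
    {
        PVOP_VCALLEE2(pv_lock_ops.lock_spinning, lock, ticket);
    }

    /* Invoked on unlock of a contended lock to wake the next waiter. */
    static inline void __ticket_unlock_kick(arch_spinlock_t *lock,
                                            __ticket_t ticket)
    {
        PVOP_VCALL2(pv_lock_ops.unlock_kick, lock, ticket);
    }
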
2013-08-09  usb: musb dma: add cppi41 dma driver  (Sebastian Andrzej Siewior)

This driver is currently used by musb's cppi41 counterpart. I may merge both DMA engine users of musb at some point, but not just yet. The driver seems to work in RX/TX mode in host mode, tested on mass storage. I increased the size of the TX/RX transfers and waited for the core code to cancel a transfer, and it seems to recover.

v2..v3:
 - use small transfers on the RX side and check the data toggle.
 - use RNDIS mode on the TX side so we have one interrupt for 4096 transfers.
 - remove the custom "transferred" hack and use dmaengine_tx_status() to compute the total amount of data that has been transferred.
 - cancel transfers and reclaim descriptors

v1..v2:
 - RX path added
 - dma mode 0 & 1 is working
 - device tree nodes re-created.

Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Dan Williams <djbw@fb.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Felipe Balbi <balbi@ti.com>

2013-08-09  usb: musb: dsps: use proper child nodes  (Sebastian Andrzej Siewior)

This moves the two instances from the big node into two child nodes. The glue layer on top does almost nothing. There is one device containing the control module for USB, the two PHYs, the two USB instances and, later, the DMA engine. The USB device is the "glue device" which contains the musb device as a child, as we have done ever since. The new file musb_am335x is just here to probe the new bus and populate child devices.

There are a lot of changes to the dsps file as a result of the changes:

 - musb_core_offset: this is gone. The device tree provides memory resource information for the device, so there is no need to "fix" things.

 - instances: this is gone as well. If we have two instances then we have two enabled child nodes in the device tree. For instance, the SoC in the BeagleBone has two USB instances but only one has been wired up, so there is no need to load and init the second instance since it won't be used.

 - dsps_glue is now per glue device: in the past there was one of these structs, with an array of two, and each instance accessed its variable depending on the platform device id.

 - no unneeded copies of structs: I do not know why struct dsps_musb_wrapper is copied, but it is not necessary. The same goes for musb_hdrc_platform_data, which is allocated on demand and then again by platform_device_add_data(). One copy is enough.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Felipe Balbi <balbi@ti.com>

2013-08-09  Merge branch 'nop-phy-rename' into next  (Felipe Balbi)

Signed-off-by: Felipe Balbi <balbi@ti.com>

Conflicts:
	drivers/usb/phy/phy-generic.c

2013-08-09  usb: phy: rename nop_usb_xceiv => usb_phy_gen_xceiv  (Sebastian Andrzej Siewior)

The "nop" driver isn't a do-nothing stub: it supports a couple of functions like clock on/off and is able to use a voltage regulator. This patch simply renames the driver to "generic", since it can easily be extended with a simple function instead of writing a complete driver.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Felipe Balbi <balbi@ti.com>

2013-08-09  arm64: KVM: use 'int' instead of 'u32' for variable 'target' in kvm_host.h.  (Chen Gang)

'target' is set to '-1' in kvm_arch_vcpu_init(), and kvm_vcpu_initialized() needs to check whether 'target' is less than zero. So 'target' needs to be defined as 'int' instead of 'u32', just as ARM has done.

The related warning:

    arch/arm64/kvm/../../../arch/arm/kvm/arm.c:497:2: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits]

Signed-off-by: Chen Gang <gang.chen@asianux.com>
[Marc: reformatted the Subject line to fit the series]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

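A minimal illustration of the warning (a hypothetical, condensed form of the check):

    struct vcpu_arch {
        int target;             /* was: u32 target */
    };

    /* 'target' is -1 until userspace initializes the vcpu. */
    static bool kvm_vcpu_initialized(const struct vcpu_arch *arch)
    {
        return arch->target >= 0;   /* with u32, this is always true */
    }
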
2013-08-09  arm64: KVM: add missing dsb before invalidating Stage-2 TLBs  (Marc Zyngier)

When performing a Stage-2 TLB invalidation, it is necessary to make sure the write to the page tables is observable by all CPUs. For this purpose, add dsb instructions to __kvm_tlb_flush_vmid_ipa and __kvm_flush_vm_context before doing the TLB invalidation itself.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

2013-08-09  arm64: KVM: perform save/restore of PAR_EL1  (Marc Zyngier)

Not saving PAR_EL1 is an unfortunate oversight. If the guest performs an AT* operation and gets scheduled out before reading the result of the translation from PAR_EL1, it could become corrupted by another guest or the host.

Saving this register is made slightly more complicated as KVM also uses it on the permission fault handling path, leading to an ugly "stash and restore" sequence. Fortunately, this is already a slow path so we don't really care. Also, Linux doesn't do any AT* operation, so Linux guests are not impacted by this bug.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

2013-08-09  powerpc/tm: Fix context switching TAR, PPR and DSCR SPRs  (Michael Neuling)

If a transaction is rolled back, the Target Address Register (TAR), Processor Priority Register (PPR) and Data Stream Control Register (DSCR) should be restored to the checkpointed values from before the transaction began. Any changes to these SPRs inside the transaction should not be visible in the abort handler.

Currently Linux doesn't save or restore the checkpointed TAR, PPR or DSCR. If we preempt a process inside a transaction which has modified any of these, then on process restore that same transaction may be aborted, but we won't see the checkpointed versions of these SPRs.

This adds checkpointed versions of these SPRs to the thread_struct and adds the save/restore of these three SPRs to the treclaim/trechkpt code.

Without this, if any of these SPRs are modified during a transaction, users may incorrectly see a speculated SPR value even if the transaction is aborted.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: <stable@vger.kernel.org> [v3.10]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

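The thread_struct additions look roughly like this (a sketch; the tm_-prefixed field names are an assumption based on the description above):

    struct thread_struct {
        /* ... */
        unsigned long   tar;    /* live Target Address Register */
        unsigned long   ppr;    /* live Processor Priority Register */
        unsigned long   dscr;   /* live Data Stream Control Register */

        /* Checkpointed images: saved by treclaim, written back by
         * trechkpt, so an aborted transaction sees pre-tbegin values. */
        unsigned long   tm_tar;
        unsigned long   tm_ppr;
        unsigned long   tm_dscr;
    };
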
2013-08-09  powerpc: Save the TAR register earlier  (Michael Neuling)

This moves the save of the Target Address Register (TAR) earlier in __switch_to, and introduces a new function save_tar() to do it.

We need to save the TAR earlier as we will overwrite it in the transactional memory reclaim/recheckpoint path. We are going to do this in a subsequent patch which will fix saving the TAR register when it's modified inside a transaction.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: <stable@vger.kernel.org> [v3.10]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

2013-08-09  powerpc: Fix context switch DSCR on POWER8  (Michael Neuling)

POWER8 allows the DSCR to be accessed directly from userspace via a new SPR number 0x3 (rather than 0x11; DSCR SPR number 0x11 is still used on POWER8 but, like POWER7, is only accessible in HV and OS modes). Currently, we allow this by setting the H/FSCR DSCR bit on boot.

Unfortunately this doesn't work, as the kernel needs to see the DSCR change so that it knows to no longer restore the system-wide version of DSCR on context switch (i.e., to set thread.dscr_inherit).

This patch initially clears the H/FSCR DSCR bit. If a process then accesses the DSCR (via SPR 0x3), it'll trap into the kernel, where we set thread.dscr_inherit in facility_unavailable_exception().

We also change _switch() so that we set or clear the H/FSCR DSCR bit based on thread.dscr_inherit.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: <stable@vger.kernel.org> [v3.10]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

2013-08-09  powerpc: Rework setting up H/FSCR bit definitions  (Michael Neuling)

This reworks the Facility Status and Control Register (FSCR) config bit definitions so that we can access the bit numbers. This is needed for a subsequent patch to fix the userspace DSCR handling. HFSCR and FSCR bit definitions are the same, so reuse them.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: <stable@vger.kernel.org> [v3.10]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

2013-08-09  powerpc: Fix hypervisor facility unavailable vector number  (Michael Neuling)

Currently if we take a hypervisor facility unavailable exception (from 0xf80/0x4f80) we mark it as an OS facility unavailable exception (0xf60), as the two share the same code path. This becomes a problem in facility_unavailable_exception(), as we aren't able to see the hypervisor facility unavailable exceptions.

This fixes the problem by duplicating the required macros.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: <stable@vger.kernel.org> [v3.10]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

2013-08-09  powerpc/kvm/book3s_pr: Return appropriate error when allocation fails  (Thadeu Lima de Souza Cascardo)

err was overwritten by a previous function call, and checked to be 0. If the following page allocation fails, 0 is going to be returned instead of -ENOMEM.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

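The bug pattern and the fix, in sketch form (function and helper names hypothetical):

    static long setup_vcpu(struct kvm_vcpu *vcpu)
    {
        long err;
        struct page *page;

        err = earlier_setup(vcpu);      /* leaves err == 0 on success */
        if (err)
            goto out;

        page = alloc_page(GFP_KERNEL);
        if (!page) {
            err = -ENOMEM;  /* the fix: don't fall through with err == 0 */
            goto out;
        }
        /* ... */
    out:
        return err;
    }
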
2013-08-09  powerpc/kvm: Add signed type cast for comparison  (Chen Gang)

'rmls' is 'unsigned long'; lpcr_rmls() returns a negative number when a failure occurs, so a type cast is needed for the comparison.

'lpid' is 'unsigned long'; kvmppc_alloc_lpid() returns a negative number when a failure occurs, so a type cast is needed for the comparison.

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

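An illustration of the pattern (lpcr_rmls() and kvmppc_alloc_lpid() are the functions named above; the surrounding code is hypothetical):

    unsigned long rmls, lpid;

    rmls = lpcr_rmls(rma_size);
    if ((long)rmls < 0)     /* unsigned '< 0' is never true without the cast */
        return -EINVAL;

    lpid = kvmppc_alloc_lpid();
    if ((long)lpid < 0)
        return -ENOMEM;
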
2013-08-09  powerpc/eeh: Add missing procfs entry for PowerNV  (Mike Qiu)

The procfs entry for global statistics is missing on the PowerNV platform; this patch adds it.

Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
Acked-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

2013-08-09  powerpc/pseries: Add backward compatibility to read old kernel oops-log  (Aruna Balakrishnaiah)

Older kernels have just length information in their header. Handle this when reading an old kernel oops log from pstore.

Applies on top of "powerpc/pseries: Fix buffer overflow when reading from pstore".

Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

2013-08-09  powerpc/pseries: Fix buffer overflow when reading from pstore  (Aruna Balakrishnaiah)

When reading from pstore there is a buffer overflow during decompression, due to the header added in unzip_oops. Remove unzip_oops and call pstore_decompress directly in nvram_pstore_read. Allocate a buffer of the report_length from the oops header, as the header will not be deallocated in pstore.

Since we have the 'openssl' command line tool to decompress the compressed data, dump the compressed data in case decompression fails, instead of not dumping anything.

Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

2013-08-09  powerpc: On POWERNV enable PPC_DENORMALISATION by default  (Anton Blanchard)

We want PPC_DENORMALISATION enabled when POWERNV is enabled, so update the Kconfig.

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: <stable@vger.kernel.org>

2013-08-08  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32  (Linus Torvalds)

Pull AVR32 build fix from Hans-Christian Egtvedt.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32:
  avr32: boards/atngw100/mrmt.c: fix building error

2013-08-08  Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc  (Linus Torvalds)

Pull ARM SoC fixes from Kevin Hilman:
 - MSM: GPIO fixes (includes old code removal)
 - OMAP: earlyprintk regression, AM33xx cpgmac PM regression
 - OMAP5: urgent fix for potentially harmful voltage regulator values
 - Renesas: gpio-keys fix, fix SD card detection, fix shdma calculation error
 - STi: critical SMP boot fix
 - tegra: DTS fix for usb-phy
 - a couple MAINTAINERS updates (Arnd is on paternity leave, Kevin is stepping up to help arm-soc maintenance)

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  MAINTAINERS: add TI Keystone ARM platform
  MAINTAINERS: delete Srinidhi from ux500
  ARM: tegra: enable ULPI phy on Colibri T20
  ARM: STi: remove sti_secondary_start from INIT section.
  ARM: STi: Fix cpu nodes with correct device_type.
  ARM: shmobile: lager: do not annotate gpio_buttons as __initdata
  ARM: shmobile: BOCK-W: fix SDHI0 PFC settings
  shdma: fixup sh_dmae_get_partial() calculation error
  ARM: OMAP2+: hwmod: AM335x: fix cpgmac address space
  ARM: OMAP2+: hwmod: rt address space index for DT
  ARM: OMAP2+: Sync hwmod state with the pm_runtime and omap_device state
  ARM: OMAP2+: Avoid idling memory controllers with no drivers
  ARM: OMAP2+: hwmod: Fix a crash in _setup_reset() with DEBUG_LL
  ARM: dts: omap5-uevm: update optional/unused regulator configurations
  ARM: dts: omap5-uevm: fix regulator configurations mandatory for SoC
  ARM: dts: omap5-uevm: document regulator signals used on the actual board
  ARM: msm: Consolidate gpiomux for older architectures
  ARM: shmobile: armadillo800eva: Don't request GPIO 166 in board code
  ARM: msm: dts: Fix the gpio register address for msm8960

2013-08-08  avr32: boards/atngw100/mrmt.c: fix building error  (Cong Ding)

There is an extra "{", which causes a build error.

Signed-off-by: Cong Ding <dinggnu@gmail.com>
Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no>

2013-08-08  ARM: Fix FIQ code on VIVT CPUs  (Russell King)

Aaro Koskinen reports the following oops:

    Installing fiq handler from c001b110, length 0x164
    Unable to handle kernel paging request at virtual address ffff1224
    pgd = c0004000
    [ffff1224] *pgd=00000000, *pte=11fff0cb, *ppte=11fff00a
    ...
    [<c0013154>] (set_fiq_handler+0x0/0x6c) from [<c0365d38>] (ams_delta_init_fiq+0xa8/0x160)
     r6:00000164 r5:c001b110 r4:00000000 r3:fefecb4c
    [<c0365c90>] (ams_delta_init_fiq+0x0/0x160) from [<c0365b14>] (ams_delta_init+0xd4/0x114)
     r6:00000000 r5:fffece10 r4:c037a9e0
    [<c0365a40>] (ams_delta_init+0x0/0x114) from [<c03613b4>] (customize_machine+0x24/0x30)

This is because the vectors page is now write-protected, and to change code in there we must write to its original alias. Make that change, and adjust the cache flushing such that the code will become visible to the instruction stream on VIVT CPUs.

Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

2013-08-07  x86, relocs: Move ELF relocation handling to C  (Kees Cook)

This moves the relocation handling into C, after decompression. This requires that the decompressed size is passed to the decompression routine as well, so that relocations can be found. Only kernels that need relocation support will use the code (currently just x86_32), but this is laying the groundwork for 64-bit using it in support of KASLR.

Based on work by Neill Clift and Michael Davidson.

Signed-off-by: Kees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/20130708161517.GA4832@www.outflux.net
Acked-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

2013-08-07  arm64: KVM: fix 2-level page tables unmapping  (Marc Zyngier)

When using 64kB pages, we only have two levels of page tables, meaning that PGD, PUD and PMD are fused. In this case, trying to refcount PUDs and PMDs independently is a complete disaster, as they are the same.

We manage to get it right for the allocation (stage2_set_pte uses {pmd,pud}_none), but the unmapping path clears both pud and pmd refcounts, which fails spectacularly with 2-level page tables.

The fix is to avoid calling clear_pud_entry when both the pmd and pud pages are empty. For this, instead of introducing another pud_empty function, consolidate both pte_empty and pmd_empty into page_empty (the code is actually identical) and use that to also test the validity of the pud.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>

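The consolidated helper, roughly as described (one refcount test works for pte, pmd and pud table pages alike):

    static bool page_empty(void *ptr)
    {
        struct page *ptr_page = virt_to_page(ptr);

        /* Only the allocation's own reference remains: table is empty. */
        return page_count(ptr_page) == 1;
    }
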
2013-08-07  ARM: KVM: Fix unaligned unmap_range leak  (Christoffer Dall)

The unmap_range function did not properly cover the case when the start address was not aligned to PMD_SIZE or PUD_SIZE and an entire pte table or pmd table was cleared, causing us to leak memory when incrementing the addr.

The fix is to always move on to the next page table entry boundary instead of adding the full size of the VA range covered by the corresponding table level entry.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>

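A sketch of the stepping fix (an illustrative loop, not the exact unmap_range body):

    while (addr < end) {
        /* ... clear/unmap the entries covering 'addr' ... */

        /* Wrong: addr += PMD_SIZE; overshoots when 'addr' was not
         * PMD-aligned, skipping (and leaking) part of the range.   */
        addr = pmd_addr_end(addr, end); /* right: clamp to the boundary */
    }
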
2013-08-07  Merge branch 'pm-cpufreq-ondemand' into pm-cpufreq  (Rafael J. Wysocki)

* pm-cpufreq:
  cpufreq: Remove unused function __cpufreq_driver_getavg()
  cpufreq: Remove unused APERF/MPERF support
  cpufreq: ondemand: Change the calculation of target frequency

2013-08-07  KVM: nVMX: Advertise IA32_PAT in VM exit control  (Arthur Chunqi Li)

Advertise VM_EXIT_SAVE_IA32_PAT and VM_EXIT_LOAD_IA32_PAT.

Signed-off-by: Arthur Chunqi Li <yzt356@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

2013-08-07  KVM: nVMX: Fix up VM_ENTRY_IA32E_MODE control feature reporting  (Jan Kiszka)

Do not report that we can enter the guest in 64-bit mode if the host is 32-bit only. This is not supported by KVM.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

2013-08-07  KVM: nEPT: Advertise WB type EPTP  (Jan Kiszka)

At least WB must be possible.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

2013-08-07  nVMX: Keep arch.pat in sync on L1-L2 switches  (Jan Kiszka)

When asking vmx to load the PAT MSR for us while switching from L1 to L2 or vice versa, we have to update arch.pat as well, as it may later be used again to load or read out the MSR content.

Reviewed-by: Gleb Natapov <gleb@redhat.com>
Tested-by: Arthur Chunqi Li <yzt356@gmail.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

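The L1-to-L2 half of the fix looks roughly like this (a sketch of the prepare_vmcs02-style path; the mirror-image update happens on the switch back to L1):

    if (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_PAT) {
        vmcs_write64(GUEST_IA32_PAT, vmcs12->guest_ia32_pat);
        /* keep the cached copy in sync for later loads/reads */
        vcpu->arch.pat = vmcs12->guest_ia32_pat;
    }
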
2013-08-07  nEPT: Miscellaneous cleanups  (Nadav Har'El)

Some trivial code cleanups not really related to nested EPT.

Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
Signed-off-by: Jun Nakajima <jun.nakajima@intel.com>
Signed-off-by: Xinhao Xu <xinhao.xu@intel.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

2013-08-07  nEPT: Some additional comments  (Nadav Har'El)

Some additional comments to preexisting code: explain who (L0 or L1) handles EPT violation and misconfiguration exits. Don't mention "shadow on either EPT or shadow" as the only two options.

Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
Signed-off-by: Jun Nakajima <jun.nakajima@intel.com>
Signed-off-by: Xinhao Xu <xinhao.xu@intel.com>
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

2013-08-07  Advertise the support of EPT to the L1 guest, through the appropriate MSR.  (Nadav Har'El)

This is the last patch of the basic nested EPT feature, so as to allow bisection through this patch series: the guest will not see EPT support until this last patch, and will not attempt to use the half-applied feature.

Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
Signed-off-by: Jun Nakajima <jun.nakajima@intel.com>
Signed-off-by: Xinhao Xu <xinhao.xu@intel.com>
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

2013-08-07  nEPT: Nested INVEPT  (Nadav Har'El)

If we let L1 use EPT, we should probably also support the INVEPT instruction.

In our current nested EPT implementation, when L1 changes its EPT table for L2 (i.e., EPT12), L0 modifies the shadow EPT table (EPT02), and in the course of this modification already calls INVEPT. But if the last level of the shadow page is unsync, not all of L1's changes to EPT12 are intercepted, which means roots need to be synced when L1 calls INVEPT. Global INVEPT should not be different, since roots are synced by kvm_mmu_load() each time EPTP02 changes.

Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
Signed-off-by: Jun Nakajima <jun.nakajima@intel.com>
Signed-off-by: Xinhao Xu <xinhao.xu@intel.com>
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

2013-08-07  nEPT: MMU context for nested EPT  (Nadav Har'El)

KVM's existing shadow MMU code already supports nested TDP. To use it, we need to set up a new "MMU context" for nested EPT, and create a few callbacks for it (nested_ept_*()). This context should also use the EPT versions of the page table access functions (defined in the previous patch). Then, we need to switch back and forth between this nested context and the regular MMU context when switching between L1 and L2 (when L1 runs this L2 with EPT).

Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
Signed-off-by: Jun Nakajima <jun.nakajima@intel.com>
Signed-off-by: Xinhao Xu <xinhao.xu@intel.com>
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

2013-08-07  nEPT: Add nEPT violation/misconfiguration support  (Yang Zhang)

Inject nEPT faults to the L1 guest. This patch is originally from Xinhao.

Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Jun Nakajima <jun.nakajima@intel.com>
Signed-off-by: Xinhao Xu <xinhao.xu@intel.com>
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
