summaryrefslogtreecommitdiff
path: root/arch/arm64/include
AgeCommit message (Collapse)Author
2014-01-22Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "First round of KVM updates for 3.14; PPC parts will come next week. Nothing major here, just bugfixes all over the place. The most interesting part is the ARM guys' virtualized interrupt controller overhaul, which lets userspace get/set the state and thus enables migration of ARM VMs" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (67 commits) kvm: make KVM_MMU_AUDIT help text more readable KVM: s390: Fix memory access error detection KVM: nVMX: Update guest activity state field on L2 exits KVM: nVMX: Fix nested_run_pending on activity state HLT KVM: nVMX: Clean up handling of VMX-related MSRs KVM: nVMX: Add tracepoints for nested_vmexit and nested_vmexit_inject KVM: nVMX: Pass vmexit parameters to nested_vmx_vmexit KVM: nVMX: Leave VMX mode on clearing of feature control MSR KVM: VMX: Fix DR6 update on #DB exception KVM: SVM: Fix reading of DR6 KVM: x86: Sync DR7 on KVM_SET_DEBUGREGS add support for Hyper-V reference time counter KVM: remove useless write to vcpu->hv_clock.tsc_timestamp KVM: x86: fix tsc catchup issue with tsc scaling KVM: x86: limit PIT timer frequency KVM: x86: handle invalid root_hpa everywhere kvm: Provide kvm_vcpu_eligible_for_directed_yield() stub kvm: vfio: silence GCC warning KVM: ARM: Remove duplicate include arm/arm64: KVM: relax the requirements of VMA alignment for THP ...
2014-01-20Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull ARM64 updates from Catalin Marinas: - CPU suspend support on top of PSCI (firmware Power State Coordination Interface) - jump label support - CMA can now be enabled on arm64 - HWCAP bits for crypto and CRC32 extensions - optimised percpu using tpidr_el1 register - code cleanup * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (42 commits) arm64: fix typo in entry.S arm64: kernel: restore HW breakpoint registers in cpu_suspend jump_label: use defined macros instead of hard-coding for better readability arm64, jump label: optimize jump label implementation arm64, jump label: detect %c support for ARM64 arm64: introduce aarch64_insn_gen_{nop|branch_imm}() helper functions arm64: move encode_insn_immediate() from module.c to insn.c arm64: introduce interfaces to hotpatch kernel and module code arm64: introduce basic aarch64 instruction decoding helpers arm64: dts: Reduce size of virtio block device for foundation model arm64: Remove unused __data_loc variable arm64: Enable CMA arm64: Warn on NULL device structure for dma APIs arm64: Add hwcaps for crypto and CRC32 extensions. arm64: drop redundant macros from read_cpuid() arm64: Remove outdated comment arm64: cmpxchg: update macros to prevent warnings arm64: support single-step and breakpoint handler hooks ARM64: fix framepointer check in unwind_frame ARM64: check stack pointer in get_wchan ...
2014-01-20Merge branch 'core-locking-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core locking changes from Ingo Molnar: - futex performance increases: larger hashes, smarter wakeups - mutex debugging improvements - lots of SMP ordering documentation updates - introduce the smp_load_acquire(), smp_store_release() primitives. (There are WIP patches that make use of them - not yet merged) - lockdep micro-optimizations - lockdep improvement: better cover IRQ contexts - liblockdep at last. We'll continue to monitor how useful this is * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits) futexes: Fix futex_hashsize initialization arch: Re-sort some Kbuild files to hopefully help avoid some conflicts futexes: Avoid taking the hb->lock if there's nothing to wake up futexes: Document multiprocessor ordering guarantees futexes: Increase hash table size for better performance futexes: Clean up various details arch: Introduce smp_load_acquire(), smp_store_release() arch: Clean up asm/barrier.h implementations using asm-generic/barrier.h arch: Move smp_mb__{before,after}_atomic_{inc,dec}.h into asm/atomic.h locking/doc: Rename LOCK/UNLOCK to ACQUIRE/RELEASE mutexes: Give more informative mutex warning in the !lock->owner case powerpc: Full barrier for smp_mb__after_unlock_lock() rcu: Apply smp_mb__after_unlock_lock() to preserve grace periods Documentation/memory-barriers.txt: Downgrade UNLOCK+BLOCK locking: Add an smp_mb__after_unlock_lock() for UNLOCK+BLOCK barrier Documentation/memory-barriers.txt: Document ACCESS_ONCE() Documentation/memory-barriers.txt: Prohibit speculative writes Documentation/memory-barriers.txt: Add long atomic examples to memory-barriers.txt Documentation/memory-barriers.txt: Add needed ACCESS_ONCE() calls to memory-barriers.txt Revert "smp/cpumask: Make CONFIG_CPUMASK_OFFSTACK=y usable without debug dependency" ...
2014-01-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c net/ipv4/tcp_metrics.c Overlapping changes between the "don't create two tcp metrics objects with the same key" race fix in net and the addition of the destination address in the lookup key in net-next. Minor overlapping changes in bnx2x driver. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-17Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fix from Catalin Marinas: "Revert "arm64: Fix memory shareability attribute for ioremap_wc/cache" We noticed that it breaks ioremap (and earlyprintk) with 64K page configuration" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: Revert "arm64: Fix memory shareability attribute for ioremap_wc/cache"
2014-01-16Revert "arm64: Fix memory shareability attribute for ioremap_wc/cache"Catalin Marinas
This reverts commit 2f7dc6027522499582a520807cb9ffda589de47e. The above commit breaks the mapping type for Device memory because pgprot_default already contains a Normal memory type. pgprot_default is also not initialised early enough for earlyprintk resulting in an inconsistent memory mapping with 64K PAGE_SIZE configuration. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Reported-by: Will Deacon <will.deacon@arm.com> Acked-by: Will Deacon <will.deacon@arm.com>
2014-01-15Merge tag 'kvm-arm-for-3.14' of ↵Paolo Bonzini
git://git.linaro.org/people/christoffer.dall/linux-kvm-arm into kvm-queue
2014-01-13Merge tag 'v3.13-rc8' into core/lockingIngo Molnar
Refresh the tree with the latest fixes, before applying new changes. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-12arch: Introduce smp_load_acquire(), smp_store_release()Peter Zijlstra
A number of situations currently require the heavyweight smp_mb(), even though there is no need to order prior stores against later loads. Many architectures have much cheaper ways to handle these situations, but the Linux kernel currently has no portable way to make use of them. This commit therefore supplies smp_load_acquire() and smp_store_release() to remedy this situation. The new smp_load_acquire() primitive orders the specified load against any subsequent reads or writes, while the new smp_store_release() primitive orders the specifed store against any prior reads or writes. These primitives allow array-based circular FIFOs to be implemented without an smp_mb(), and also allow a theoretical hole in rcu_assign_pointer() to be closed at no additional expense on most architectures. In addition, the RCU experience transitioning from explicit smp_read_barrier_depends() and smp_wmb() to rcu_dereference() and rcu_assign_pointer(), respectively resulted in substantial improvements in readability. It therefore seems likely that replacing other explicit barriers with smp_load_acquire() and smp_store_release() will provide similar benefits. It appears that roughly half of the explicit barriers in core kernel code might be so replaced. [Changelog by PaulMck] Reviewed-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Michael Neuling <mikey@neuling.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Victor Kaplansky <VICTORK@il.ibm.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Oleg Nesterov <oleg@redhat.com> Link: http://lkml.kernel.org/r/20131213150640.908486364@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-08arm64, jump label: optimize jump label implementationJiang Liu
Optimize jump label implementation for ARM64 by dynamically patching kernel text. Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Jiang Liu <liuj97@gmail.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-01-08arm64: introduce aarch64_insn_gen_{nop|branch_imm}() helper functionsJiang Liu
Introduce aarch64_insn_gen_{nop|branch_imm}() helper functions, which will be used to implement jump label on ARM64. Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Jiang Liu <liuj97@gmail.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-01-08arm64: move encode_insn_immediate() from module.c to insn.cJiang Liu
Function encode_insn_immediate() will be used by other instruction manipulate related functions, so move it into insn.c and rename it as aarch64_insn_encode_immediate(). Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Jiang Liu <liuj97@gmail.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-01-08arm64: introduce interfaces to hotpatch kernel and module codeJiang Liu
Introduce three interfaces to patch kernel and module code: aarch64_insn_patch_text_nosync(): patch code without synchronization, it's caller's responsibility to synchronize all CPUs if needed. aarch64_insn_patch_text_sync(): patch code and always synchronize with stop_machine() aarch64_insn_patch_text(): patch code and synchronize with stop_machine() if needed Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Jiang Liu <liuj97@gmail.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-01-08arm64: introduce basic aarch64 instruction decoding helpersJiang Liu
Introduce basic aarch64 instruction decoding helper aarch64_get_insn_class() and aarch64_insn_hotpatch_safe(). Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Jiang Liu <liuj97@gmail.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-01-06Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c net/ipv6/ip6_tunnel.c net/ipv6/ip6_vti.c ipv6 tunnel statistic bug fixes conflicting with consolidation into generic sw per-cpu net stats. qlogic conflict between queue counting bug fix and the addition of multiple MAC address support. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-28Merge branch 'kvm-arm64/for-3.14' into kvm-arm64/nextMarc Zyngier
2013-12-28arm64: KVM: Support X-Gene guest VCPU on APM X-Gene hostAnup Patel
This patch allows us to have X-Gene guest VCPU when using KVM arm64 on APM X-Gene host. We add KVM_ARM_TARGET_XGENE_POTENZA for X-Gene Potenza compatible guest VCPU and we return KVM_ARM_TARGET_XGENE_POTENZA in kvm_target_cpu() when running on X-Gene host with Potenza core. [maz: sanitized the commit log] Signed-off-by: Anup Patel <anup.patel@linaro.org> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-12-28arm64: KVM: Add Kconfig option for max VCPUs per-GuestAnup Patel
Current max VCPUs per-Guest is set to 4 which is preventing us from creating a Guest (or VM) with 8 VCPUs on Host (e.g. X-Gene Storm SOC) with 8 Host CPUs. The correct value of max VCPUs per-Guest should be same as the max CPUs supported by GICv2 which is 8 but, increasing value of max VCPUs per-Guest can make things slower hence we add Kconfig option to let KVM users select appropriate max VCPUs per-Guest. Signed-off-by: Anup Patel <anup.patel@linaro.org> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-12-21ARM/KVM: save and restore generic timer registersAndre Przywara
For migration to work we need to save (and later restore) the state of each core's virtual generic timer. Since this is per VCPU, we can use the [gs]et_one_reg ioctl and export the three needed registers (control, counter, compare value). Though they live in cp15 space, we don't use the existing list, since they need special accessor functions and the arch timer is optional. Acked-by: Marc Zynger <marc.zyngier@arm.com> Signed-off-by: Andre Przywara <andre.przywara@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-12-20Merge tag 'stable/for-linus-3.13-rc4-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull Xen bugfixes from Konrad Rzeszutek Wilk: - Fix balloon driver for auto-translate guests (PVHVM, ARM) to not use scratch pages. - Fix block API header for ARM32 and ARM64 to have proper layout - On ARM when mapping guests, stick on PTE_SPECIAL - When using SWIOTLB under ARM, don't call swiotlb functions twice - When unmapping guests memory and if we fail, don't return pages which failed to be unmapped. - Grant driver was using the wrong address on ARM. * tag 'stable/for-linus-3.13-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/balloon: Seperate the auto-translate logic properly (v2) xen/block: Correctly define structures in public headers on ARM32 and ARM64 arm: xen: foreign mapping PTEs are special. xen/arm64: do not call the swiotlb functions twice xen: privcmd: do not return pages which we have failed to unmap XEN: Grant table address, xen_hvm_resume_frames, is a phys_addr not a pfn
2013-12-19Merge tag 'arm64-suspend' of git://linux-arm.org/linux-2.6-lp into upstreamCatalin Marinas
* tag 'arm64-suspend' of git://linux-arm.org/linux-2.6-lp: arm64: add CPU power management menu/entries arm64: kernel: add PM build infrastructure arm64: kernel: add CPU idle call arm64: enable generic clockevent broadcast arm64: kernel: implement HW breakpoints CPU PM notifier arm64: kernel: refactor code to install/uninstall breakpoints arm: kvm: implement CPU PM notifier arm64: kernel: implement fpsimd CPU PM notifier arm64: kernel: cpu_{suspend/resume} implementation arm64: kernel: suspend/resume registers save/restore arm64: kernel: build MPIDR_EL1 hash function data structure arm64: kernel: add MPIDR_EL1 accessors macros Conflicts: arch/arm64/Kconfig
2013-12-19arm64: Enable CMALaura Abbott
arm64 bit targets need the features CMA provides. Add the appropriate hooks, header files, and Kconfig to allow this to happen. Cc: Will Deacon <will.deacon@arm.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Laura Abbott <lauraa@codeaurora.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-12-19arm64: Add hwcaps for crypto and CRC32 extensions.Steve Capper
Advertise the optional cryptographic and CRC32 instructions to user space where present. Several hwcap bits [3-7] are allocated. Signed-off-by: Steve Capper <steve.capper@linaro.org> [bit 2 is taken now so use bits 3-7 instead] Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-12-19arm64: drop redundant macros from read_cpuid()Ard Biesheuvel
asm/cputype.h contains a bunch of #defines for CPU id registers that essentially map to themselves. Remove the #defines and pass the tokens directly to the inline asm() that reads the registers. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-12-19arm64: cmpxchg: update macros to prevent warningsMark Hambleton
Make sure the value we are going to return is referenced in order to avoid warnings from newer GCCs such as: arch/arm64/include/asm/cmpxchg.h:162:3: warning: value computed is not used [-Wunused-value] ((__typeof__(*(ptr)))__cmpxchg_mb((ptr), \ ^ net/netfilter/nf_conntrack_core.c:674:2: note: in expansion of macro ‘cmpxchg’ cmpxchg(&nf_conntrack_hash_rnd, 0, rand); [Modified to use the current underlying implementation as current mainline for both cmpxchg() and cmpxchg_local() does -- broonie] Signed-off-by: Mark Hambleton <mahamble@broadcom.com> Signed-off-by: Mark Brown <broonie@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-12-19arm64: support single-step and breakpoint handler hooksSandeepa Prabhu
AArch64 Single Steping and Breakpoint debug exceptions will be used by multiple debug framworks like kprobes & kgdb. This patch implements the hooks for those frameworks to register their own handlers for handling breakpoint and single step events. Reworked the debug exception handler in entry.S: do_dbg to route software breakpoint (BRK64) exception to do_debug_exception() Signed-off-by: Sandeepa Prabhu <sandeepa.prabhu@linaro.org> Signed-off-by: Deepak Saxena <dsaxena@linaro.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-12-19arm64: dcache: select DCACHE_WORD_ACCESS for little-endian CPUsWill Deacon
DCACHE_WORD_ACCESS uses the word-at-a-time API for optimised string comparisons in the vfs layer. This patch implements support for load_unaligned_zeropad in much the same way as has been done for ARM, although big-endian systems are also supported. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-12-19arm64: futex: ensure .fixup entries are sufficiently alignedWill Deacon
AArch64 instructions must be 4-byte aligned, so make sure this is true for the futex .fixup section. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-12-19arm64: use generic strnlen_user and strncpy_from_user functionsWill Deacon
This patch implements the word-at-a-time interface for arm64 using the same algorithm as ARM. We use the fls64 macro, which expands to a clz instruction via a compiler builtin. Big-endian configurations make use of the implementation from asm-generic. With this implemented, we can replace our byte-at-a-time strnlen_user and strncpy_from_user functions with the optimised generic versions. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-12-19arm64: percpu: implement optimised pcpu access using tpidr_el1Will Deacon
This patch implements optimised percpu variable accesses using the el1 r/w thread register (tpidr_el1) along the same lines as arch/arm/. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-12-19arm64: Correct virt_addr_validLaura Abbott
The definition of virt_addr_valid is that virt_addr_valid should return true if and only if virt_to_page returns a valid pointer. The current definition of virt_addr_valid only checks against the virtual address range. There's no guarantee that just because a virtual address falls bewteen PAGE_OFFSET and high_memory the associated physical memory has a valid backing struct page. Follow the example of other architectures and convert to pfn_valid to verify that the virtual address is actually valid. Cc: Will Deacon <will.deacon@arm.com> Cc: Nicolas Pitre <nico@linaro.org> Signed-off-by: Laura Abbott <lauraa@codeaurora.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-12-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/net/ethernet/intel/i40e/i40e_main.c drivers/net/macvtap.c Both minor merge hassles, simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-17lib: Add missing arch generic-y entries for asm-generic/hash.hDavid S. Miller
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-16arm64: enable generic clockevent broadcastLorenzo Pieralisi
On platforms with power management capabilities, timers that are shut down when a CPU enters deep C-states must be emulated using an always-on timer and a timer IPI to relay the timer IRQ to target CPUs on an SMP system. This patch enables the generic clockevents broadcast infrastructure for arm64, by providing the required Kconfig entries and adding the timer IPI infrastructure. Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2013-12-16arm64: kernel: cpu_{suspend/resume} implementationLorenzo Pieralisi
Kernel subsystems like CPU idle and suspend to RAM require a generic mechanism to suspend a processor, save its context and put it into a quiescent state. The cpu_{suspend}/{resume} implementation provides such a framework through a kernel interface allowing to save/restore registers, flush the context to DRAM and suspend/resume to/from low-power states where processor context may be lost. The CPU suspend implementation relies on the suspend protocol registered in CPU operations to carry out a suspend request after context is saved and flushed to DRAM. The cpu_suspend interface: int cpu_suspend(unsigned long arg); allows to pass an opaque parameter that is handed over to the suspend CPU operations back-end so that it can take action according to the semantics attached to it. The arg parameter allows suspend to RAM and CPU idle drivers to communicate to suspend protocol back-ends; it requires standardization so that the interface can be reused seamlessly across systems, paving the way for generic drivers. Context memory is allocated on the stack, whose address is stashed in a per-cpu variable to keep track of it and passed to core functions that save/restore the registers required by the architecture. Even though, upon successful execution, the cpu_suspend function shuts down the suspending processor, the warm boot resume mechanism, based on the cpu_resume function, makes the resume path operate as a cpu_suspend function return, so that cpu_suspend can be treated as a C function by the caller, which simplifies coding the PM drivers that rely on the cpu_suspend API. Upon context save, the minimal amount of memory is flushed to DRAM so that it can be retrieved when the MMU is off and caches are not searched. The suspend CPU operation, depending on the required operations (eg CPU vs Cluster shutdown) is in charge of flushing the cache hierarchy either implicitly (by calling firmware implementations like PSCI) or explicitly by executing the required cache maintainance functions. Debug exceptions are disabled during cpu_{suspend}/{resume} operations so that debug registers can be saved and restored properly preventing preemption from debug agents enabled in the kernel. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2013-12-16arm64: kernel: suspend/resume registers save/restoreLorenzo Pieralisi
Power management software requires the kernel to save and restore CPU registers while going through suspend and resume operations triggered by kernel subsystems like CPU idle and suspend to RAM. This patch implements code that provides save and restore mechanism for the arm v8 implementation. Memory for the context is passed as parameter to both cpu_do_suspend and cpu_do_resume functions, and allows the callers to implement context allocation as they deem fit. The registers that are saved and restored correspond to the registers set actually required by the kernel to be up and running which represents a subset of v8 ISA. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2013-12-16arm64: kernel: build MPIDR_EL1 hash function data structureLorenzo Pieralisi
On ARM64 SMP systems, cores are identified by their MPIDR_EL1 register. The MPIDR_EL1 guidelines in the ARM ARM do not provide strict enforcement of MPIDR_EL1 layout, only recommendations that, if followed, split the MPIDR_EL1 on ARM 64 bit platforms in four affinity levels. In multi-cluster systems like big.LITTLE, if the affinity guidelines are followed, the MPIDR_EL1 can not be considered a linear index. This means that the association between logical CPU in the kernel and the HW CPU identifier becomes somewhat more complicated requiring methods like hashing to associate a given MPIDR_EL1 to a CPU logical index, in order for the look-up to be carried out in an efficient and scalable way. This patch provides a function in the kernel that starting from the cpu_logical_map, implement collision-free hashing of MPIDR_EL1 values by checking all significative bits of MPIDR_EL1 affinity level bitfields. The hashing can then be carried out through bits shifting and ORing; the resulting hash algorithm is a collision-free though not minimal hash that can be executed with few assembly instructions. The mpidr_el1 is filtered through a mpidr mask that is built by checking all bits that toggle in the set of MPIDR_EL1s corresponding to possible CPUs. Bits that do not toggle do not carry information so they do not contribute to the resulting hash. Pseudo code: /* check all bits that toggle, so they are required */ for (i = 1, mpidr_el1_mask = 0; i < num_possible_cpus(); i++) mpidr_el1_mask |= (cpu_logical_map(i) ^ cpu_logical_map(0)); /* * Build shifts to be applied to aff0, aff1, aff2, aff3 values to hash the * mpidr_el1 * fls() returns the last bit set in a word, 0 if none * ffs() returns the first bit set in a word, 0 if none */ fs0 = mpidr_el1_mask[7:0] ? ffs(mpidr_el1_mask[7:0]) - 1 : 0; fs1 = mpidr_el1_mask[15:8] ? ffs(mpidr_el1_mask[15:8]) - 1 : 0; fs2 = mpidr_el1_mask[23:16] ? ffs(mpidr_el1_mask[23:16]) - 1 : 0; fs3 = mpidr_el1_mask[39:32] ? ffs(mpidr_el1_mask[39:32]) - 1 : 0; ls0 = fls(mpidr_el1_mask[7:0]); ls1 = fls(mpidr_el1_mask[15:8]); ls2 = fls(mpidr_el1_mask[23:16]); ls3 = fls(mpidr_el1_mask[39:32]); bits0 = ls0 - fs0; bits1 = ls1 - fs1; bits2 = ls2 - fs2; bits3 = ls3 - fs3; aff0_shift = fs0; aff1_shift = 8 + fs1 - bits0; aff2_shift = 16 + fs2 - (bits0 + bits1); aff3_shift = 32 + fs3 - (bits0 + bits1 + bits2); u32 hash(u64 mpidr_el1) { u32 l[4]; u64 mpidr_el1_masked = mpidr_el1 & mpidr_el1_mask; l[0] = mpidr_el1_masked & 0xff; l[1] = mpidr_el1_masked & 0xff00; l[2] = mpidr_el1_masked & 0xff0000; l[3] = mpidr_el1_masked & 0xff00000000; return (l[0] >> aff0_shift | l[1] >> aff1_shift | l[2] >> aff2_shift | l[3] >> aff3_shift); } The hashing algorithm relies on the inherent properties set in the ARM ARM recommendations for the MPIDR_EL1. Exotic configurations, where for instance the MPIDR_EL1 values at a given affinity level have large holes, can end up requiring big hash tables since the compression of values that can be achieved through shifting is somewhat crippled when holes are present. Kernel warns if the number of buckets of the resulting hash table exceeds the number of possible CPUs by a factor of 4, which is a symptom of a very sparse HW MPIDR_EL1 configuration. The hash algorithm is quite simple and can easily be implemented in assembly code, to be used in code paths where the kernel virtual address space is not set-up (ie cpu_resume) and instruction and data fetches are strongly ordered so code must be compact and must carry out few data accesses. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2013-12-16arm64: kernel: add MPIDR_EL1 accessors macrosLorenzo Pieralisi
In order to simplify access to different affinity levels within the MPIDR_EL1 register values, this patch implements some preprocessor macros that allow to retrieve the MPIDR_EL1 affinity level value according to the level passed as input parameter. Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2013-12-11arm/arm64: kvm: Use virt_to_idmap instead of virt_to_phys for idmap mappingsSantosh Shilimkar
KVM initialisation fails on architectures implementing virt_to_idmap() because virt_to_phys() on such architectures won't fetch you the correct idmap page. So update the KVM ARM code to use the virt_to_idmap() to fix the issue. Since the KVM code is shared between arm and arm64, we create kvm_virt_to_phys() and handle the redirection in respective headers. Cc: Christoffer Dall <christoffer.dall@linaro.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-12-11xen/arm64: do not call the swiotlb functions twiceStefano Stabellini
On arm64 the dma_map_ops implementation is based on the swiotlb. swiotlb-xen, used by default in dom0 on Xen, is also based on the swiotlb. Avoid calling into the default arm64 dma_map_ops functions from xen_dma_map_page, xen_dma_unmap_page, xen_dma_sync_single_for_cpu, and xen_dma_sync_single_for_device otherwise we end up calling into the swiotlb twice. When arm64 gets a non-swiotlb based implementation of dma_map_ops, we'll probably have to reintroduce dma_map_ops calls in page-coherent.h. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> CC: catalin.marinas@arm.com CC: Will.Deacon@arm.com CC: Ian.Campbell@citrix.com
2013-12-06arm64: mm: Fix PMD_SECT_PROT_NONE definitionSteve Capper
Modify the value of PMD_SECT_PROT_NONE to match that of PTE_NONE. This should have been in commit 3676f9ef5481 (Move PTE_PROT_NONE higher up). Signed-off-by: Steve Capper <steve.capper@linaro.org> Cc: <stable@vger.kernel.org> # 3.11+: 3676f9ef5481: arm64: Move PTE_PROT_NONE higher up Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-12-06arm64: Fix memory shareability attribute for ioremap_wc/cacheCatalin Marinas
Write-combine and cacheable mappings use Normal memory on arm64. On SMP systems, the pte needs the shareability bit which is set in pgprot_default. Use this for defining PROT_DEFAULT used by ioremap_wc and ioremap_cache (Device memory is shareable by default, does not need additional attributes). Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-11-29arm64: Move PTE_PROT_NONE higher upCatalin Marinas
PTE_PROT_NONE means that a pte is present but does not have any read/write attributes. However, setting the memory type like pgprot_writecombine() is allowed and such bits overlap with PTE_PROT_NONE. This causes mmap/munmap issues in drivers that change the vma->vm_pg_prot on PROT_NONE mappings. This patch reverts the PTE_FILE/PTE_PROT_NONE shift in commit 59911ca4325d (ARM64: mm: Move PTE_PROT_NONE bit) and moves PTE_PROT_NONE together with the other software bits. Signed-off-by: Steve Capper <steve.capper@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Steve Capper <steve.capper@linaro.org> Cc: <stable@vger.kernel.org> # 3.11+
2013-11-29arm64: Use Normal NonCacheable memory for writecombineCatalin Marinas
This provides better performance compared to Device GRE and also allows unaligned accesses. Such memory is intended to be used with standard RAM (e.g. framebuffers) and not I/O. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-11-25arm64: Unmask asynchronous aborts when in kernel modeCatalin Marinas
The asynchronous aborts are generally fatal for the kernel but they can be masked via the pstate A bit. If a system error happens while in kernel mode, it won't be visible until returning to user space. This patch enables this kind of abort early to help identifying the cause. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-11-19Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq cleanups from Ingo Molnar: "This is a multi-arch cleanup series from Thomas Gleixner, which we kept to near the end of the merge window, to not interfere with architecture updates. This series (motivated by the -rt kernel) unifies more aspects of IRQ handling and generalizes PREEMPT_ACTIVE" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: preempt: Make PREEMPT_ACTIVE generic sparc: Use preempt_schedule_irq ia64: Use preempt_schedule_irq m32r: Use preempt_schedule_irq hardirq: Make hardirq bits generic m68k: Simplify low level interrupt handling code genirq: Prevent spurious detection for unconditionally polled interrupts
2013-11-15Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM changes from Paolo Bonzini: "Here are the 3.13 KVM changes. There was a lot of work on the PPC side: the HV and emulation flavors can now coexist in a single kernel is probably the most interesting change from a user point of view. On the x86 side there are nested virtualization improvements and a few bugfixes. ARM got transparent huge page support, improved overcommit, and support for big endian guests. Finally, there is a new interface to connect KVM with VFIO. This helps with devices that use NoSnoop PCI transactions, letting the driver in the guest execute WBINVD instructions. This includes some nVidia cards on Windows, that fail to start without these patches and the corresponding userspace changes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (146 commits) kvm, vmx: Fix lazy FPU on nested guest arm/arm64: KVM: PSCI: propagate caller endianness to the incoming vcpu arm/arm64: KVM: MMIO support for BE guest kvm, cpuid: Fix sparse warning kvm: Delete prototype for non-existent function kvm_check_iopl kvm: Delete prototype for non-existent function complete_pio hung_task: add method to reset detector pvclock: detect watchdog reset at pvclock read kvm: optimize out smp_mb after srcu_read_unlock srcu: API for barrier after srcu read unlock KVM: remove vm mmap method KVM: IOMMU: hva align mapping page size KVM: x86: trace cpuid emulation when called from emulator KVM: emulator: cleanup decode_register_operand() a bit KVM: emulator: check rex prefix inside decode_register() KVM: x86: fix emulation of "movzbl %bpl, %eax" kvm_host: typo fix KVM: x86: emulate SAHF instruction MAINTAINERS: add tree for kvm.git Documentation/kvm: add a 00-INDEX file ...
2013-11-15Merge tag 'stable/for-linus-3.13-rc0-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull Xen updates from Konrad Rzeszutek Wilk: "This has tons of fixes and two major features which are concentrated around the Xen SWIOTLB library. The short <blurb> is that the tracing facility (just one function) has been added to SWIOTLB to make it easier to track I/O progress. Additionally under Xen and ARM (32 & 64) the Xen-SWIOTLB driver "is used to translate physical to machine and machine to physical addresses of foreign[guest] pages for DMA operations" (Stefano) when booting under hardware without proper IOMMU. There are also bug-fixes, cleanups, compile warning fixes, etc. The commit times for some of the commits is a bit fresh - that is b/c we wanted to make sure we have the Ack's from the ARM folks - which with the string of back-to-back conferences took a bit of time. Rest assured - the code has been stewing in #linux-next for some time. Features: - SWIOTLB has tracing added when doing bounce buffer. - Xen ARM/ARM64 can use Xen-SWIOTLB. This work allows Linux to safely program real devices for DMA operations when running as a guest on Xen on ARM, without IOMMU support. [*1] - xen_raw_printk works with PVHVM guests if needed. Bug-fixes: - Make memory ballooning work under HVM with large MMIO region. - Inform hypervisor of MCFG regions found in ACPI DSDT. - Remove deprecated IRQF_DISABLED. - Remove deprecated __cpuinit. [*1]: "On arm and arm64 all Xen guests, including dom0, run with second stage translation enabled. As a consequence when dom0 programs a device for a DMA operation is going to use (pseudo) physical addresses instead machine addresses. This work introduces two trees to track physical to machine and machine to physical mappings of foreign pages. Local pages are assumed mapped 1:1 (physical address == machine address). It enables the SWIOTLB-Xen driver on ARM and ARM64, so that Linux can translate physical addresses to machine addresses for dma operations when necessary. " (Stefano)" * tag 'stable/for-linus-3.13-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (32 commits) xen/arm: pfn_to_mfn and mfn_to_pfn return the argument if nothing is in the p2m arm,arm64/include/asm/io.h: define struct bio_vec swiotlb-xen: missing include dma-direction.h pci-swiotlb-xen: call pci_request_acs only ifdef CONFIG_PCI arm: make SWIOTLB available xen: delete new instances of added __cpuinit xen/balloon: Set balloon's initial state to number of existing RAM pages xen/mcfg: Call PHYSDEVOP_pci_mmcfg_reserved for MCFG areas. xen: remove deprecated IRQF_DISABLED x86/xen: remove deprecated IRQF_DISABLED swiotlb-xen: fix error code returned by xen_swiotlb_map_sg_attrs swiotlb-xen: static inline xen_phys_to_bus, xen_bus_to_phys, xen_virt_to_bus and range_straddles_page_boundary grant-table: call set_phys_to_machine after mapping grant refs arm,arm64: do not always merge biovec if we are running on Xen swiotlb: print a warning when the swiotlb is full swiotlb-xen: use xen_dma_map/unmap_page, xen_dma_sync_single_for_cpu/device xen: introduce xen_dma_map/unmap_page and xen_dma_sync_single_for_cpu/device tracing/events: Fix swiotlb tracepoint creation swiotlb-xen: use xen_alloc/free_coherent_pages xen: introduce xen_alloc/free_coherent_pages ...
2013-11-15arm64: handle pgtable_page_ctor() failKirill A. Shutemov
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-14Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-armLinus Torvalds
Pull ARM updates from Russell King: "Included in this series are: 1. BE8 (modern big endian) changes for ARM from Ben Dooks 2. big.Little support from Nicolas Pitre and Dave Martin 3. support for LPAE systems with all system memory above 4GB 4. Perf updates from Will Deacon 5. Additional prefetching and other performance improvements from Will. 6. Neon-optimised AES implementation fro Ard. 7. A number of smaller fixes scattered around the place. There is a rather horrid merge conflict in tools/perf - I was never notified of the conflict because it originally occurred between Will's tree and other stuff. Consequently I have a resolution which Will forwarded me, which I'll forward on immediately after sending this mail. The other notable thing is I'm expecting some build breakage in the crypto stuff on ARM only with Ard's AES patches. These were merged into a stable git branch which others had already pulled, so there's little I can do about this. The problem is caused because these patches have a dependency on some code in the crypto git tree - I tried requesting a branch I can pull to resolve these, and all I got each time from the crypto people was "we'll revert our patches then" which would only make things worse since I still don't have the dependent patches. I've no idea what's going on there or how to resolve that, and since I can't split these patches from the rest of this pull request, I'm rather stuck with pushing this as-is or reverting Ard's patches. Since it should "come out in the wash" I've left them in - the only build problems they seem to cause at the moment are with randconfigs, and since it's a new feature anyway. However, if by -rc1 the dependencies aren't in, I think it'd be best to revert Ard's patches" I resolved the perf conflict roughly as per the patch sent by Russell, but there may be some differences. Any errors are likely mine. Let's see how the crypto issues work out.. * 'for-linus' of git://git.linaro.org/people/rmk/linux-arm: (110 commits) ARM: 7868/1: arm/arm64: remove atomic_clear_mask() in "include/asm/atomic.h" ARM: 7867/1: include: asm: use 'int' instead of 'unsigned long' for 'oldval' in atomic_cmpxchg(). ARM: 7866/1: include: asm: use 'long long' instead of 'u64' within atomic.h ARM: 7871/1: amba: Extend number of IRQS ARM: 7887/1: Don't smp_cross_call() on UP devices in arch_irq_work_raise() ARM: 7872/1: Support arch_irq_work_raise() via self IPIs ARM: 7880/1: Clear the IT state independent of the Thumb-2 mode ARM: 7878/1: nommu: Implement dummy early_paging_init() ARM: 7876/1: clear Thumb-2 IT state on exception handling ARM: 7874/2: bL_switcher: Remove cpu_hotplug_driver_{lock,unlock}() ARM: footbridge: fix build warnings for netwinder ARM: 7873/1: vfp: clear vfp_current_hw_state for dying cpu ARM: fix misplaced arch_virt_to_idmap() ARM: 7848/1: mcpm: Implement cpu_kill() to synchronise on powerdown ARM: 7847/1: mcpm: Factor out logical-to-physical CPU translation ARM: 7869/1: remove unused XSCALE_PMU Kconfig param ARM: 7864/1: Handle 64-bit memory in case of 32-bit phys_addr_t ARM: 7863/1: Let arm_add_memory() always use 64-bit arguments ARM: 7862/1: pcpu: replace __get_cpu_var_uses ARM: 7861/1: cacheflush: consolidate single-CPU ARMv7 cache disabling code ...