summaryrefslogtreecommitdiff
path: root/arch/arm/include
AgeCommit message (Collapse)Author
2013-08-20ARM: cacheflush: don't round address range up to nearest pageWill Deacon
The flush_cache_user_range macro takes a pair of addresses describing the start and end of the virtual address range to flush. Due to an accidental oversight when flush_cache_range_user was introduced, the address range was rounded up so that the start and end addresses were page-aligned. For historical reference, the interesting commits in history.git are: 10eacf1775e1 ("[ARM] Clean up ARM cache handling interfaces (part 1)") 71432e79b76b ("[ARM] Add flush_cache_user_page() for sys_cacheflush()") This patch removes the alignment code, reducing the amount of flushing required for ranges that are not an exact multiple of PAGE_SIZE. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reported-by: Jonathan Austin <jonathan.austin@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2013-08-20ARM: cacheflush: split user cache-flushing into interruptible chunksWill Deacon
Flushing a large, non-faulting VMA from userspace can potentially result in a long time spent flushing the cache line-by-line without preemption occurring (in the case of CONFIG_PREEMPT=n). Whilst this doesn't affect the stability of the system, it can certainly affect the responsiveness and CPU availability for other tasks. This patch splits up the user cacheflush code so that it flushes in chunks of a page. After each chunk has been flushed, we may reschedule if appropriate and, before processing the next chunk, we allow any pending signals to be handled before resuming from where we left off. Signed-off-by: Will Deacon <will.deacon@arm.com>
2013-08-16Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-armLinus Torvalds
Pull ARM fixes from Russell King: "The usual collection of random fixes. Also some further fixes to the last set of security fixes, and some more from Will (which you may already have in a slightly different form)" * 'fixes' of git://git.linaro.org/people/rmk/linux-arm: ARM: 7807/1: kexec: validate CPU hotplug support ARM: 7812/1: rwlocks: retry trylock operation if strex fails on free lock ARM: 7811/1: locks: use early clobber in arch_spin_trylock ARM: 7810/1: perf: Fix array out of bounds access in armpmu_map_hw_event() ARM: 7809/1: perf: fix event validation for software group leaders ARM: Fix FIQ code on VIVT CPUs ARM: Fix !kuser helpers case ARM: Fix the world famous typo with is_gate_vma()
2013-08-16Fix TLB gather virtual address range invalidation corner casesLinus Torvalds
Ben Tebulin reported: "Since v3.7.2 on two independent machines a very specific Git repository fails in 9/10 cases on git-fsck due to an SHA1/memory failures. This only occurs on a very specific repository and can be reproduced stably on two independent laptops. Git mailing list ran out of ideas and for me this looks like some very exotic kernel issue" and bisected the failure to the backport of commit 53a59fc67f97 ("mm: limit mmu_gather batching to fix soft lockups on !CONFIG_PREEMPT"). That commit itself is not actually buggy, but what it does is to make it much more likely to hit the partial TLB invalidation case, since it introduces a new case in tlb_next_batch() that previously only ever happened when running out of memory. The real bug is that the TLB gather virtual memory range setup is subtly buggered. It was introduced in commit 597e1c3580b7 ("mm/mmu_gather: enable tlb flush range in generic mmu_gather"), and the range handling was already fixed at least once in commit e6c495a96ce0 ("mm: fix the TLB range flushed when __tlb_remove_page() runs out of slots"), but that fix was not complete. The problem with the TLB gather virtual address range is that it isn't set up by the initial tlb_gather_mmu() initialization (which didn't get the TLB range information), but it is set up ad-hoc later by the functions that actually flush the TLB. And so any such case that forgot to update the TLB range entries would potentially miss TLB invalidates. Rather than try to figure out exactly which particular ad-hoc range setup was missing (I personally suspect it's the hugetlb case in zap_huge_pmd(), which didn't have the same logic as zap_pte_range() did), this patch just gets rid of the problem at the source: make the TLB range information available to tlb_gather_mmu(), and initialize it when initializing all the other tlb gather fields. This makes the patch larger, but conceptually much simpler. And the end result is much more understandable; even if you want to play games with partial ranges when invalidating the TLB contents in chunks, now the range information is always there, and anybody who doesn't want to bother with it won't introduce subtle bugs. Ben verified that this fixes his problem. Reported-bisected-and-tested-by: Ben Tebulin <tebulin@googlemail.com> Build-testing-by: Stephen Rothwell <sfr@canb.auug.org.au> Build-testing-by: Richard Weinberger <richard.weinberger@gmail.com> Reviewed-by: Michal Hocko <mhocko@suse.cz> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-08-13Merge tag 'v3.12-pwm-cleanup-for-olof' of git://github.com/tom3q/linux into ↵Olof Johansson
next/cleanup From Tomasz Figa: Here is the Samsung PWM cleanup series. Particular patches of the series involve following modifications: - fixing up few things in samsung_pwm_timer clocksource driver, - moving remaining Samsung platforms to the new clocksource driver, - removing old clocksource driver, - adding new multiplatform- and DT-aware PWM driver, - moving all Samsung platforms to use the new PWM driver, - removing old PWM driver, - removing all PWM-related code that is not used anymore. * tag 'v3.12-pwm-cleanup-for-olof' of git://github.com/tom3q/linux: (684 commits) ARM: SAMSUNG: Remove plat/regs-timer.h header ARM: SAMSUNG: Remove remaining uses of plat/regs-timer.h header ARM: SAMSUNG: Remove pwm-clock infrastructure ARM: SAMSUNG: Remove old PWM timer platform devices pwm: Remove superseded pwm-samsung-legacy driver ARM: SAMSUNG: Modify board files to use new PWM platform device ARM: SAMSUNG: Rework private data handling in dev-backlight pwm: Add new pwm-samsung driver pwm: samsung: Rename to pwm-samsung-legacy ARM: SAMSUNG: Remove unused PWM timer IRQ chip code ARM: SAMSUNG: Remove old samsung-time driver ARM: SAMSUNG: Move all platforms to new clocksource driver ARM: SAMSUNG: Set PWM platform data ARM: SAMSUNG: Add new PWM platform device ARM: SAMSUNG: Unify base address definitions of timer block clocksource: samsung_pwm_timer: Handle suspend/resume correctly clocksource: samsung_pwm_timer: Do not use clocksource_mmio clocksource: samsung_pwm_timer: Cache clocksource register address clocksource: samsung_pwm_timer: Correct definition of AUTORELOAD bit clocksource: samsung_pwm_timer: Do not request PWM mem region + v3.11-rc4 Conflicts: arch/arm/Kconfig.debug Signed-off-by: Olof Johansson <olof@lixom.net>
2013-08-13Merge branch 'msm/cleanup' into next/cleanupKevin Hilman
From David Brown <davidb@codeaurora.org>: * msm/cleanup: ARM: msm: Only compile io.c on platforms that use it iommu/msm: Move mach includes to iommu directory ARM: msm: Remove devices-iommu.c ARM: msm: Move mach/board.h contents to common.h ARM: msm: Migrate msm_timer to CLOCKSOURCE_OF_DECLARE ARM: msm: Remove TMR and TMR0 static mappings ARM: msm: Move debug-macro.S to include/debug ARM: msm: Don't compile __msm_ioremap_caller() unless used ARM: msm: Remove unused and unmapped MSM_TLMM_BASE for 8x60 Signed-off-by: Kevin Hilman <khilman@linaro.org>
2013-08-13ARM: use phys_addr_t for DMA zone sizesRob Herring
In order to specify a DMA zone size of 4GB on LPAE systems, the sizes need to be 64-bit. So make machine_desc.dma_zone_size and arm_dma_zone_size be phys_addr_t instead of unsigned long. Signed-off-by: Rob Herring <rob.herring@calxeda.com>
2013-08-13ARM: 7808/1: KVM: mm: Get rid of L_PTE_USER ref from PAGE_S2_DEVICEChristoffer Dall
THe L_PTE_USER actually has nothing to do with stage 2 mappings and the L_PTE_S2_RDWR value sets the readable bit, which was what L_PTE_USER was used for before proper handling of stage 2 memory defines. Changelog: [v3]: Drop call to kvm_set_s2pte_writable in mmu.c [v2]: Change default mappings to be r/w instead of r/o, as per Marc Zyngier's suggestion. Cc: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-08-13ARM: 7807/1: kexec: validate CPU hotplug supportStephen Warren
Architectures should fully validate whether kexec is possible as part of machine_kexec_prepare(), so that user-space's kexec_load() operation can report any problems. Performing validation in machine_kexec() itself is too late, since it is not allowed to return. Prior to this patch, ARM's machine_kexec() was testing after-the-fact whether machine_kexec_prepare() was able to disable all but one CPU. Instead, modify machine_kexec_prepare() to validate all conditions necessary for machine_kexec_prepare()'s to succeed. BUG if the validation succeeded, yet disabling the CPUs didn't actually work. Signed-off-by: Stephen Warren <swarren@nvidia.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-08-13ARM: 7812/1: rwlocks: retry trylock operation if strex fails on free lockWill Deacon
Commit 15e7e5c1ebf5 ("ARM: 7749/1: spinlock: retry trylock operation if strex fails on free lock") modifying our arch_spin_trylock to retry the acquisition if the lock appeared uncontended, but the strex failed. This patch does the same for rwlocks, which were missed by the original patch. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-08-13ARM: 7811/1: locks: use early clobber in arch_spin_trylockWill Deacon
The res variable is written before we've finished with the input operands (namely the lock address), so ensure that we mark it as `early clobber' to avoid unintended register sharing. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-08-12ARM: pci: add ->add_bus() and ->remove_bus() hooks to hw_pciThomas Petazzoni
Some PCI drivers may need to adjust the pci_bus structure after it has been allocated by the Linux PCI core. The PCI core allows architectures to implement the pcibios_add_bus() and pcibios_remove_bus() for this purpose. This commit therefore extends the hw_pci and pci_sys_data structures of the ARM PCI core to allow PCI drivers to register ->add_bus() and ->remove_bus() in hw_pci, which will get called when a bus is added or removed from the system. This will be used for example by the Marvell PCIe driver to connect a particular PCI bus with its corresponding MSI chip to handle Message Signaled Interrupts. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Reviewed-by: Thierry Reding <thierry.reding@gmail.com> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Tested-by: Daniel Price <daniel.price@gmail.com> Tested-by: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Jason Cooper <jason@lakedaemon.net>
2013-08-12ARM: cacheflush: use -ishst dsb variant for ensuring flush completionWill Deacon
flush_cache_vmap contains a dsb to ensure that any cacheflushing operations to flush out newly written ptes have completed. This patch adds the -ishst option to the dsb, since that is all that is required for completing cacheflushing in the inner-shareable domain. Signed-off-by: Will Deacon <will.deacon@arm.com>
2013-08-12ARM: spinlock: use inner-shareable dsb variant prior to sev instructionWill Deacon
When unlocking a spinlock, we use the sev instruction to signal other CPUs waiting on the lock. Since sev is not a memory access instruction, we require a dsb in order to ensure that the sev is not issued ahead of the store placing the lock in an unlocked state. However, as sev is only concerned with other processors in a multiprocessor system, we can restrict the scope of the preceding dsb to the inner-shareable domain. Furthermore, we can restrict the scope to consider only stores, since there are no independent loads on the unlock path. A side-effect of this change is that a spin_unlock operation no longer forces completion of pending TLB invalidation, something which we rely on when unlocking runqueues to ensure that CPU migration during TLB maintenance routines doesn't cause us to continue before the operation has completed. This patch adds the -ishst suffix to the ARMv7 definition of dsb_sev() and adds an inner-shareable dsb to the context-switch path when running a preemptible, SMP, v7 kernel. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2013-08-12ARM: tlb: reduce scope of barrier domains for TLB invalidationWill Deacon
Our TLB invalidation routines may require a barrier before the maintenance (in order to ensure pending page table writes are visible to the hardware walker) and barriers afterwards (in order to ensure completion of the maintenance and visibility in the instruction stream). Whilst this is expensive, the cost can be reduced somewhat by reducing the scope of the barrier instructions: - The barrier before only needs to apply to stores (pte writes) - Local ops are required only to affect the non-shareable domain - Global ops are required only to affect the inner-shareable domain This patch makes these changes for the TLB flushing code. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2013-08-12ARM: barrier: allow options to be passed to memory barrier instructionsWill Deacon
On ARMv7, the memory barrier instructions take an optional `option' field which can be used to constrain the effects of a memory barrier based on shareability and access type. This patch allows the caller to pass these options if required, and updates the smp_*() barriers to request inner-shareable barriers, affecting only stores for the _wmb variant. wmb() is also changed to use the -st version of dsb. Reported-by: Albin Tonnerre <albin.tonnerre@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2013-08-12ARM: tlb: don't perform inner-shareable invalidation for local BP opsWill Deacon
Now that the ASID allocator doesn't require inner-shareable maintenance, we can convert the local_bp_flush_all function to perform only non-shareable flushing, in a similar manner to the TLB invalidation routines. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2013-08-12ARM: tlb: don't bother with barriers for branch predictor maintenanceWill Deacon
Branch predictor maintenance is only required when we are either changing the kernel's view of memory (switching tables completely) or dealing with ASID rollover. Both of these use-cases require subsequent TLB invalidation, which has the relevant barrier instructions to ensure completion and visibility of the maintenance, so this patch removes the instruction barrier from [local_]flush_bp_all. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2013-08-12ARM: tlb: don't perform inner-shareable invalidation for local TLB opsWill Deacon
Inner-shareable TLB invalidation is typically more expensive than local (non-shareable) invalidation, so performing the broadcasting for local_flush_tlb_* operations is a waste of cycles and needlessly clobbers entries in the TLBs of other CPUs. This patch introduces __flush_tlb_* versions for many of the TLB invalidation functions, which only respect inner-shareable variants of the invalidation instructions when presented with the TLB_V7_UIS_FULL flag. The local version is also inlined to prevent SMP_ON_UP kernels from missing flushes, where the __flush variant would be called with the UP flags. This gains us around 0.5% in hackbench scores for a dual-core A15, but I would expect this to improve as more cores (and clusters) are added to the equation. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reported-by: Albin Tonnerre <Albin.Tonnerre@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2013-08-06ARM: msm: Move debug-macro.S to include/debugStephen Boyd
One more step to allowing MSM to participate in the multi-platform defconfig. Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> [davidb: Comment cleanup requested by sboyd] Signed-off-by: David Brown <davidb@codeaurora.org>
2013-08-03Merge branch 'security-fixes' into fixesRussell King
2013-08-03ARM: fix nommu builds with 48be69a02 (ARM: move signal handlers into a ↵Russell King
vdso-like page) Olof reports that noMMU builds error out with: arch/arm/kernel/signal.c: In function 'setup_return': arch/arm/kernel/signal.c:413:25: error: 'mm_context_t' has no member named 'sigpage' This shows one of the evilnesses of IS_ENABLED(). Get rid of it here and replace it with #ifdef's - and as no noMMU platform can make use of sigpage, depend on CONIFG_MMU not CONFIG_ARM_MPU. Reported-by: Olof Johansson <olof@lixom.net> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-08-01Merge branch 'security-fixes' into fixesRussell King
2013-08-01ARM: make vectors page inaccessible from userspaceRussell King
If kuser helpers are not provided by the kernel, disable user access to the vectors page. With the kuser helpers gone, there is no reason for this page to be visible to userspace. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-08-01ARM: move signal handlers into a vdso-like pageRussell King
Move the signal handlers into a VDSO page rather than keeping them in the vectors page. This allows us to place them randomly within this page, and also map the page at a random location within userspace further protecting these code fragments from ROP attacks. The new VDSO page is also poisoned in the same way as the vector page. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-08-01clocksource: arch_timer: Push the read/write wrappers deeperStephen Boyd
We're going to introduce support to read and write the memory mapped timer registers in the next patch, so push the cp15 read/write functions one level deeper. This simplifies the next patch and makes it clearer what's going on. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Marc Zyngier <Marc.Zyngier@arm.com> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Mark Rutland <mark.rutland@arm.com>
2013-08-01clocksource: arch_timer: Make register accessors less error-proneStephen Boyd
Using an enum for the register we wish to access allows newer compilers to determine if we've forgotten a case in our switch statement. This allows us to remove the BUILD_BUG() instances in the arm64 port, avoiding problems where optimizations may not happen. To try and force better code generation we're currently marking the accessor functions as inline, but newer compilers can ignore the inline keyword unless it's marked __always_inline. Luckily on arm and arm64 inline is __always_inline, but let's make everything __always_inline to be explicit. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Marc Zyngier <Marc.Zyngier@arm.com> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Mark Rutland <mark.rutland@arm.com>
2013-07-31ARM: 7801/1: v6: prevent gcc 4.5 from reordering extended CP15 reads above ↵Paul Walmsley
is_smp() test Commit 621a0147d5c921f4cc33636ccd0602ad5d7cbfbc ("ARM: 7757/1: mm: don't flush icache in switch_mm with hardware broadcasting") breaks the boot on OMAP2430SDP with omap2plus_defconfig. Tracked to an undefined instruction abort from the CP15 read in cache_ops_need_broadcast(). It turns out that gcc 4.5 reorders the extended CP15 read above the is_smp() test. This breaks ARM1136 r0 cores, since they don't support several CP15 registers that later ARM cores do. ARM1136JF-S TRM section 3.2.1 "Register allocation" has the details. So mark the extended CP15 read as clobbering memory, which prevents the compiler from reordering it before the is_smp() test. Russell states that the code generated from this approach is preferable to marking the inline asm as volatile. Remove the existing condition code clobber as it's obsolete, per Nico's post: http://www.spinics.net/lists/arm-kernel/msg261208.html This patch is a collaboration with Will Deacon and Russell King. Comments from Paul Walmsley: Russell, if you accept this one, might you also add Will's ack from the lists: Comments from Paul Walmsley: I'd also be obliged if you could add a Cc: line for Jonathan Austin, since he helped test: Signed-off-by: Paul Walmsley <paul@pwsan.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Nicolas Pitre <nicolas.pitre@linaro.org> Cc: Tony Lindgren <tony@atomide.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Jonathan Austin <jonathan.austin@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-30ARM: bL_switcher: move to dedicated threads rather than workqueuesNicolas Pitre
The workqueues are problematic as they may be contended. They can't be scheduled with top priority either. Also the optimization in bL_switch_request() to skip the workqueue entirely when the target CPU and the calling CPU were the same didn't allow for bL_switch_request() to be called from atomic context, as might be the case for some cpufreq drivers. Let's move to dedicated kthreads instead. Signed-off-by: Nicolas Pitre <nico@linaro.org>
2013-07-30ARM: b.L: core switcher codeNicolas Pitre
This is the core code implementing big.LITTLE switcher functionality. Rationale for this code is available here: http://lwn.net/Articles/481055/ The main entry point for a switch request is: void bL_switch_request(unsigned int cpu, unsigned int new_cluster_id) If the calling CPU is not the wanted one, this wrapper takes care of sending the request to the appropriate CPU with schedule_work_on(). At the moment the core switch operation is handled by bL_switch_to() which must be called on the CPU for which a switch is requested. What this code does: * Return early if the current cluster is the wanted one. * Close the gate in the kernel entry vector for both the inbound and outbound CPUs. * Wake up the inbound CPU so it can perform its reset sequence in parallel up to the kernel entry vector gate. * Migrate all interrupts in the GIC targeting the outbound CPU interface to the inbound CPU interface, including SGIs. This is performed by gic_migrate_target() in drivers/irqchip/irq-gic.c. * Call cpu_pm_enter() which takes care of flushing the VFP state to RAM and save the CPU interface config from the GIC to RAM. * Modify the cpu_logical_map to refer to the inbound physical CPU. * Call cpu_suspend() which saves the CPU state (general purpose registers, page table address) onto the stack and store the resulting stack pointer in an array indexed by the updated cpu_logical_map, then call the provided shutdown function. This happens in arch/arm/kernel/sleep.S. At this point, the provided shutdown function executed by the outbound CPU ungates the inbound CPU. Therefore the inbound CPU: * Picks up the saved stack pointer in the array indexed by its MPIDR in arch/arm/kernel/sleep.S. * The MMU and caches are re-enabled using the saved state on the provided stack, just like if this was a resume operation from a suspended state. * Then cpu_suspend() returns, although this is on the inbound CPU rather than the outbound CPU which called it initially. * The function cpu_pm_exit() is called which effect is to restore the CPU interface state in the GIC using the state previously saved by the outbound CPU. * Exit of bL_switch_to() to resume normal kernel execution on the new CPU. However, the outbound CPU is potentially still running in parallel while the inbound CPU is resuming normal kernel execution, hence we need per CPU stack isolation to execute bL_do_switch(). After the outbound CPU has ungated the inbound CPU, it calls mcpm_cpu_power_down() to: * Clean its L1 cache. * If it is the last CPU still alive in its cluster (last man standing), it also cleans its L2 cache and disables cache snooping from the other cluster. * Power down the CPU (or whole cluster). Code called from bL_do_switch() might end up referencing 'current' for some reasons. However, 'current' is derived from the stack pointer. With any arbitrary stack, the returned value for 'current' and any dereferenced values through it are just random garbage which may lead to segmentation faults. The active page table during the execution of bL_do_switch() is also a problem. There is no guarantee that the inbound CPU won't destroy the corresponding task which would free the attached page table while the outbound CPU is still running and relying on it. To solve both issues, we borrow some of the task space belonging to the init/idle task which, by its nature, is lightly used and therefore is unlikely to clash with our usage. The init task is also never going away. Right now the logical CPU number is assumed to be equivalent to the physical CPU number within each cluster. The kernel should also be booted with only one cluster active. These limitations will be lifted eventually. Signed-off-by: Nicolas Pitre <nico@linaro.org>
2013-07-26ARM: constify machine_desc structure usesRussell King
struct machine_desc records are defined everywhere as a 'const' structure, but unfortuantely it loses its const-ness through the use of linker magic - the symbols which surround the section are not declared const so it becomes possible not to use 'const' for pointers to these const structures. Let's fix this oversight - all pointers to these structures should be marked const too. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-26ARM: 7791/1: a.out: remove partial a.out supportWill Deacon
a.out support on ARM requires that argc, argv and envp are passed in r0-r2 respectively, which requires hacking load_aout_binary to prevent argc being clobbered by the return code. Whilst mainline kernels do set the registers up in start_thread, the aout loader has never carried the hack in mainline. Initialising the registers in this way actually goes against the libc expectations for ELF binaries, where argc, argv and envp are passed on the stack, with r0 being used to hold a pointer to an exit function for cleaning up after the dynamic linker if required. If the pointer is NULL, then it is ignored. When execing an ELF binary, Linux currently zeroes r0, then sets it to argc and then finally clobbers it with the return value of the execve syscall, so we actually end up with: r0 = 0 stack[0] = argc r1 = stack[1] = argv r2 = stack[2] = envp libc treats r1 and r2 as undefined. The clobbering of r0 by sys_execve works for user-spawned threads, but when executing an ELF binary from a kernel thread (via call_usermodehelper), the execve is performed on the ret_from_fork path, which restores r0 from the saved pt_regs, resulting in argc being presented to the C library. This has horrible consequences when the application exits, since we have an exit function registered using argc, resulting in a jump to hyperspace. This patch solves the problem by removing the partial a.out support from arch/arm/ altogether. Cc: <stable@vger.kernel.org> Cc: Ashish Sangwan <ashishsangwan2@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-26ARM: 7790/1: Fix deferred mm switch on VIVT processorsCatalin Marinas
As of commit b9d4d42ad9 (ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on pre-ARMv6 CPUs), the mm switching on VIVT processors is done in the finish_arch_post_lock_switch() function to avoid whole cache flushing with interrupts disabled. The need for deferred mm switch is stored as a thread flag (TIF_SWITCH_MM). However, with preemption enabled, we can have another thread switch before finish_arch_post_lock_switch(). If the new thread has the same mm as the previous 'next' thread, the scheduler will not call switch_mm() and the TIF_SWITCH_MM flag won't be set for the new thread. This patch moves the switch pending flag to the mm_context_t structure since this is specific to the mm rather than thread. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Reported-by: Marc Kleine-Budde <mkl@pengutronix.de> Tested-by: Marc Kleine-Budde <mkl@pengutronix.de> Cc: <stable@vger.kernel.org> # 3.5+ Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-26ARM: 7789/1: Do not run dummy_flush_tlb_a15_erratum() on non-Cortex-A15Fabio Estevam
Commit 93dc688 (ARM: 7684/1: errata: Workaround for Cortex-A15 erratum 798181 (TLBI/DSB operations)) causes the following undefined instruction error on a mx53 (Cortex-A8): Internal error: Oops - undefined instruction: 0 [#1] SMP ARM CPU: 0 PID: 275 Comm: modprobe Not tainted 3.11.0-rc2-next-20130722-00009-g9b0f371 #881 task: df46cc00 ti: df48e000 task.ti: df48e000 PC is at check_and_switch_context+0x17c/0x4d0 LR is at check_and_switch_context+0xdc/0x4d0 This problem happens because check_and_switch_context() calls dummy_flush_tlb_a15_erratum() without checking if we are really running on a Cortex-A15 or not. To avoid this issue, only call dummy_flush_tlb_a15_erratum() inside check_and_switch_context() if erratum_a15_798181() returns true, which means that we are really running on a Cortex-A15. Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Roger Quadros <rogerq@ti.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-26ARM: 7787/1: virt: ensure visibility of __boot_cpu_modeMark Rutland
Secondary CPUs write to __boot_cpu_mode with caches disabled, and thus a cached value of __boot_cpu_mode may be incoherent with that in memory. This could lead to a failure to detect mismatched boot modes. This patch adds flushing to ensure that writes by secondaries to __boot_cpu_mode are made visible before we test against it. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Dave Martin <Dave.Martin@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Christoffer Dall <cdall@cs.columbia.edu> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-23Merge tag 'remove-local-timers' of ↵Olof Johansson
git://git.kernel.org/pub/scm/linux/kernel/git/davidb/linux-msm into next/cleanup From Stephen Boyd: Now that we have a generic arch hook for broadcast we can remove the local timer API entirely. Doing so will reduce code in ARM core, reduce the architecture dependencies of our timer drivers, and simplify the code because we no longer go through an architecture layer that is essentially a hotplug notifier. * tag 'remove-local-timers' of git://git.kernel.org/pub/scm/linux/kernel/git/davidb/linux-msm: ARM: smp: Remove local timer API clocksource: time-armada-370-xp: Divorce from local timer API clocksource: time-armada-370-xp: Fix sparse warning ARM: msm: Divorce msm_timer from local timer API ARM: PRIMA2: Divorce timer-marco from local timer API ARM: EXYNOS4: Divorce mct from local timer API ARM: OMAP2+: Divorce from local timer API ARM: smp_twd: Divorce smp_twd from local timer API ARM: smp: Remove duplicate dummy timer implementation Resolved a large number of conflicts due to __cpuinit cleanups, etc. Signed-off-by: Olof Johansson <olof@lixom.net>
2013-07-22Pull branch 'for-rmk' of git://git.linaro.org/people/ardbiesheuvel/linux-arm ↵Russell King
into devel-stable Comments from Ard Biesheuvel: I have included two use cases that I have been using, XOR and RAID-6 checksumming. The former gets a 60% performance boost on the NEON, the latter over 400%. ARM: add support for kernel mode NEON Adds kernel_neon_begin/end (renamed from kernel_vfp_begin/end in the previous version to de-emphasize the VFP part as VFP code that needs software assistance is not supported currently.) Introduces <asm/neon.h> and the Kconfig symbol KERNEL_MODE_NEON. This has been aligned with Catalin for arm64, so any NEON code that does not use assembly but intrinsics or the GCC vectorizer (such as my examples) can potentially be shared between arm and arm64 archs. ARM: move VFP init to an earlier boot stage This is needed so the NEON is enabled when the XOR and RAID-6 algo boot time benchmarks are run. ARM: be strict about FP exceptions in kernel mode This adds a check to vfp_support_entry() to flag unsupported uses of the NEON/VFP in kernel mode. FP exceptions (bounces) are flagged as a bug, this is because of their potentially intermittent nature. Exceptions caused by the fact that kernel_neon_begin has not been called are just routed through the undef handler. ARM: crypto: add NEON accelerated XOR implementation This is the xor_blocks() implementation built with -ftree-vectorize, 60% faster than optimized ARM code. It calls in_interrupt() to check whether the NEON flavor can be used: this should really not be necessary, but due to xor_blocks'squite generic nature, there is no telling how exactly people may be using it in the real world. lib/raid6: add ARM-NEON accelerated syndrome calculation This is a port of the RAID-6 checksumming code in altivec.uc ported to use NEON intrinsics. It is about 4x faster than the sequential code.
2013-07-15Merge remote-tracking branch 'dma-public/for-v3.12-cma-dma' into for-nextMarek Szyprowski
Conflicts: mm/Kconfig Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2013-07-14arm: delete __cpuinit/__CPUINIT usage from all ARM usersPaul Gortmaker
The __cpuinit type of throwaway sections might have made sense some time ago when RAM was more constrained, but now the savings do not offset the cost and complications. For example, the fix in commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time") is a good example of the nasty type of bugs that can be created with improper use of the various __init prefixes. After a discussion on LKML[1] it was decided that cpuinit should go the way of devinit and be phased out. Once all the users are gone, we can then finally remove the macros themselves from linux/init.h. Note that some harmless section mismatch warnings may result, since notify_cpu_starting() and cpu_up() are arch independent (kernel/cpu.c) and are flagged as __cpuinit -- so if we remove the __cpuinit from the arch specific callers, we will also get section mismatch warnings. As an intermediate step, we intend to turn the linux/init.h cpuinit related content into no-ops as early as possible, since that will get rid of these warnings. In any case, they are temporary and harmless. This removes all the ARM uses of the __cpuinit macros from C code, and all __CPUINIT from assembly code. It also had two ".previous" section statements that were paired off against __CPUINIT (aka .section ".cpuinit.text") that also get removed here. [1] https://lkml.org/lkml/2013/5/20/589 Cc: Russell King <linux@arm.linux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2013-07-13Merge tag 'fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: "This is our first set of fixes from arm-soc for 3.11. - A handful of build and warning fixes from Arnd - A collection of OMAP fixes - defconfig updates to make the default configs more useful for real use (and testing) out of the box on hardware And a couple of other small fixes. Some of these have been recently applied but it's normally how we deal with fixes, with less bake time in -next needed" * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (32 commits) arm: multi_v7_defconfig: Tweaks for omap and sunxi arm: multi_v7_defconfig: add i.MX options and NFS root ARM: omap2: add select of TI_PRIV_EDMA ARM: exynos: select PM_GENERIC_DOMAINS only when used ARM: ixp4xx: avoid circular header dependency ARM: OMAP: omap_common_late_init may be unused ARM: sti: move DEBUG_STI_UART into alphabetical order ARM: OMAP: build mach-omap code only if needed ARM: zynq: use DT_MACHINE_START ARM: omap5: omap5 has SCU and TWD ARM: OMAP2+: omap2plus_defconfig: Enable appended DTB support ARM: OMAP2+: Enable TI_EDMA in omap2plus_defconfig ARM: OMAP2+: omap2plus_defconfig: enable DRA752 thermal support by default ARM: OMAP2+: omap2plus_defconfig: enable TI bandgap driver ARM: OMAP2+: devices: remove duplicated include from devices.c ARM: OMAP3: igep0020: Set DSS pins in correct mux mode. ARM: OMAP2+: N900: enable N900-specific drivers even if device tree is enabled ARM: OMAP2+: Cocci spatch "ptr_ret.spatch" ARM: OMAP2+: Remove obsolete Makefile line ARM: OMAP5: Enable Cortex A15 errata 798181 ...
2013-07-09reboot: arm: change reboot_mode to use enum reboot_modeRobin Holt
Preparing to move the parsing of reboot= to generic kernel code forces the change in reboot_mode handling to use the enum. [akpm@linux-foundation.org: fix arch/arm/mach-socfpga/socfpga.c] Signed-off-by: Robin Holt <holt@sgi.com> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Russ Anderson <rja@sgi.com> Cc: Robin Holt <holt@sgi.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-09reboot: arm: prepare reboot_mode for moving to generic kernel codeRobin Holt
Prepare for the moving the parsing of reboot= to the generic kernel code by making reboot_mode into a more generic form. Signed-off-by: Robin Holt <holt@sgi.com> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Russ Anderson <rja@sgi.com> Cc: Robin Holt <holt@sgi.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-08ARM: crypto: add NEON accelerated XOR implementationArd Biesheuvel
Add a source file xor-neon.c (which is really just the reference C implementation passed through the GCC vectorizer) and hook it up to the XOR framework. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Nicolas Pitre <nico@linaro.org>
2013-07-08ARM: add support for kernel mode NEONArd Biesheuvel
In order to safely support the use of NEON instructions in kernel mode, some precautions need to be taken: - the userland context that may be present in the registers (even if the NEON/VFP is currently disabled) must be stored under the correct task (which may not be 'current' in the UP case), - to avoid having to keep track of additional vfpstates for the kernel side, disallow the use of NEON in interrupt context and run with preemption disabled, - after use, re-enable preemption and re-enable the lazy restore machinery by disabling the NEON/VFP unit. This patch adds the functions kernel_neon_begin() and kernel_neon_end() which take care of the above. It also adds the Kconfig symbol KERNEL_MODE_NEON to enable it. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Nicolas Pitre <nico@linaro.org>
2013-07-08Merge remote-tracking branch 'cmadma/for-v3.12-cma-dma' into kvm-ppc-nextAlexander Graf
Add prerequisite patch for CMA RMA allocation patches
2013-07-06Merge branch 'timers-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer core updates from Thomas Gleixner: "The timer changes contain: - posix timer code consolidation and fixes for odd corner cases - sched_clock implementation moved from ARM to core code to avoid duplication by other architectures - alarm timer updates - clocksource and clockevents unregistration facilities - clocksource/events support for new hardware - precise nanoseconds RTC readout (Xen feature) - generic support for Xen suspend/resume oddities - the usual lot of fixes and cleanups all over the place The parts which touch other areas (ARM/XEN) have been coordinated with the relevant maintainers. Though this results in an handful of trivial to solve merge conflicts, which we preferred over nasty cross tree merge dependencies. The patches which have been committed in the last few days are bug fixes plus the posix timer lot. The latter was in akpms queue and next for quite some time; they just got forgotten and Frederic collected them last minute." * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (59 commits) hrtimer: Remove unused variable hrtimers: Move SMP function call to thread context clocksource: Reselect clocksource when watchdog validated high-res capability posix-cpu-timers: don't account cpu timer after stopped thread runtime accounting posix_timers: fix racy timer delta caching on task exit posix-timers: correctly get dying task time sample in posix_cpu_timer_schedule() selftests: add basic posix timers selftests posix_cpu_timers: consolidate expired timers check posix_cpu_timers: consolidate timer list cleanups posix_cpu_timer: consolidate expiry time type tick: Sanitize broadcast control logic tick: Prevent uncontrolled switch to oneshot mode tick: Make oneshot broadcast robust vs. CPU offlining x86: xen: Sync the CMOS RTC as well as the Xen wallclock x86: xen: Sync the wallclock when the system time is set timekeeping: Indicate that clock was set in the pvclock gtod notifier timekeeping: Pass flags instead of multiple bools to timekeeping_update() xen: Remove clock_was_set() call in the resume path hrtimers: Support resuming with two or more CPUs online (but stopped) timer: Fix jiffies wrap behavior of round_jiffies_common() ...
2013-07-06Merge tag 'xenarm-for-3.11-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sstabellini/xen Pull Xen ARM update rom Stefano Stabellini: "Just one commit this time: the implementation of the tmem hypercall for arm and arm64" * tag 'xenarm-for-3.11-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/sstabellini/xen: xen/arm and xen/arm64: implement HYPERVISOR_tmem_op
2013-07-04Merge branch 'timers/posix-cpu-timers-for-tglx' ofThomas Gleixner
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks into timers/core Frederic sayed: "Most of these patches have been hanging around for several month now, in -mmotm for a significant chunk. They already missed a few releases." Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-07-04ARM: scu: provide inline dummy functions when SCU is not presentNishanth Menon
On platforms such as Cortex-A15 based OMAP5, SCU is not used, however since much code is shared between Cortex-A9 based OMAP4 (which uses SCU) and OMAP5, It does help to have inline functions returning error values when SCU is not present on the platform. arch/arm/mach-omap2/omap-smp.c which is common between OMAP4 and 5 handles the SCU usage only for OMAP4. This fixes the following build failure with OMAP5 only build: arch/arm/mach-omap2/built-in.o: In function `omap4_smp_init_cpus': arch/arm/mach-omap2/omap-smp.c:185: undefined reference to `scu_get_core_count' arch/arm/mach-omap2/built-in.o: In function `omap4_smp_prepare_cpus': arch/arm/mach-omap2/omap-smp.c:211: undefined reference to `scu_enable' Reported-by: Pekon Gupta <pekon@ti.com> Reported-by: Vincent Stehlé <v-stehle@ti.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Signed-off-by: Nishanth Menon <nm@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
2013-07-04xen/arm and xen/arm64: implement HYPERVISOR_tmem_opStefano Stabellini
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>