summaryrefslogtreecommitdiff
path: root/arch/arm/vfp/vfphw.S
AgeCommit message (Collapse)Author
2014-11-17ARM: 8195/1: vfp: Bounce undefined instructions in vectored modeStepan Moskovchenko
Certain ARM CPU implementations (e.g. Cortex-A15) may not raise a floating- point exception whenever deprecated short-vector VFP instructions are executed. Instead these instructions are treated as UNALLOCATED. Change the VFP exception handling code to emulate short-vector instructions even if FPEXC exception bits are not set. Signed-off-by: Stepan Moskovchenko <stepanm@codeaurora.org> Tested-by: Will Deacon <will.deacon@arm.com> Tested-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-07-18ARM: convert all "mov.* pc, reg" to "bx reg" for ARMv6+Russell King
ARMv6 and greater introduced a new instruction ("bx") which can be used to return from function calls. Recent CPUs perform better when the "bx lr" instruction is used rather than the "mov pc, lr" instruction, and this sequence is strongly recommended to be used by the ARM architecture manual (section A.4.1.1). We provide a new macro "ret" with all its variants for the condition code which will resolve to the appropriate instruction. Rather than doing this piecemeal, and miss some instances, change all the "mov pc" instances to use the new macro, with the exception of the "movs" instruction and the kprobes code. This allows us to detect the "mov pc, lr" case and fix it up - and also gives us the possibility of deploying this for other registers depending on the CPU selection. Reported-by: Will Deacon <will.deacon@arm.com> Tested-by: Stephen Warren <swarren@nvidia.com> # Tegra Jetson TK1 Tested-by: Robert Jarzmik <robert.jarzmik@free.fr> # mioa701_bootresume.S Tested-by: Andrew Lunn <andrew@lunn.ch> # Kirkwood Tested-by: Shawn Guo <shawn.guo@freescale.com> Tested-by: Tony Lindgren <tony@atomide.com> # OMAPs Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com> # Armada XP, 375, 385 Acked-by: Sekhar Nori <nsekhar@ti.com> # DaVinci Acked-by: Christoffer Dall <christoffer.dall@linaro.org> # kvm/hyp Acked-by: Haojian Zhuang <haojian.zhuang@gmail.com> # PXA3xx Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> # Xen Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> # ARMv7M Tested-by: Simon Horman <horms+renesas@verge.net.au> # Shmobile Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-04-09ARM: 8018/1: Add {inc,dec}_preempt_count asm macrosCatalin Marinas
The patch adds asm macros for inc_preempt_count and dec_preempt_count_ti (which also gets the current thread_info) instead of open-coding them in arch/arm/vfp/*.S files. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Arun KS <getarunks@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-04-09ARM: 8017/1: Move asm macro get_thread_info to asm/assembler.hCatalin Marinas
asm/assembler.h is a better place for this macro since it is used by asm files outside arch/arm/kernel/ Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Arun KS <getarunks@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-07-08ARM: be strict about FP exceptions in kernel modeArd Biesheuvel
The support code in vfp_support_entry does not care whether the exception that caused it to be invoked occurred in kernel mode or in user mode. However, neither condition that could trigger this exception (lazy restore and VFP bounce to support code) is currently allowable in kernel mode. In either case, print a message describing the condition before letting the undefined instruction handler run its course and trigger an oops. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Nicolas Pitre <nico@linaro.org>
2013-03-01ARM: Fix broken commit 0cc41e4a21d43 corrupting kernel messagesRussell King
Commit 0cc41e4a21d43 (arch: remove direct definitions of KERN_<LEVEL> uses) is broken - not enough thought was put into changing: .asciz "string" to .asciz "string1" "string2" The problem is that each string gets _separately_ NUL terminated, so the result is a string containing: "string1\0string2\0" rather than: "string1string2\0" With our new printk levels, this ends up as - eg, KERN_DEBUG "string": 0x01 0x00 0x07 0x00 "string" 0x00 which produces lots of \x01 in the kernel log. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-01-16ARM: 7627/1: Predicate preempt logic on PREEMP_COUNT not PREEMPT aloneStephen Boyd
Patrik Kluba reports that the preempt count becomes invalid due to the preempt_enable() call being unbalanced with a preempt_disable() call in the vfp assembly routines. This happens because preempt_enable() and preempt_disable() update preempt counts under PREEMPT_COUNT=y but the vfp assembly routines do so under PREEMPT=y. In a configuration where PREEMPT=n and DEBUG_ATOMIC_SLEEP=y, PREEMPT_COUNT=y and so the preempt_enable() call in VFP_bounce() keeps subtracting from the preempt count until it goes negative. Fix this by always using PREEMPT_COUNT to decided when to update preempt counts in the ARM assembly code. Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Reported-by: Patrik Kluba <pkluba@dension.com> Tested-by: Patrik Kluba <pkluba@dension.com> Cc: <stable@vger.kernel.org> # 2.6.30 Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-08-01Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-armLinus Torvalds
Pull ARM fixes from Russell King: "This fixes various issues found during July" * 'fixes' of git://git.linaro.org/people/rmk/linux-arm: ARM: 7479/1: mm: avoid NULL dereference when flushing gate_vma with VIVT caches ARM: Fix undefined instruction exception handling ARM: 7480/1: only call smp_send_stop() on SMP ARM: 7478/1: errata: extend workaround for erratum #720789 ARM: 7477/1: vfp: Always save VFP state in vfp_pm_suspend on UP ARM: 7476/1: vfp: only clear vfp state for current cpu in vfp_pm_suspend ARM: 7468/1: ftrace: Trace function entry before updating index ARM: 7467/1: mutex: use generic xchg-based implementation for ARMv6+ ARM: 7466/1: disable interrupt before spinning endlessly ARM: 7465/1: Handle >4GB memory sizes in device tree and mem=size@start option
2012-07-31ARM: Fix undefined instruction exception handlingRussell King
While trying to get a v3.5 kernel booted on the cubox, I noticed that VFP does not work correctly with VFP bounce handling. This is because of the confusion over 16-bit vs 32-bit instructions, and where PC is supposed to point to. The rule is that FP handlers are entered with regs->ARM_pc pointing at the _next_ instruction to be executed. However, if the exception is not handled, regs->ARM_pc points at the faulting instruction. This is easy for ARM mode, because we know that the next instruction and previous instructions are separated by four bytes. This is not true of Thumb2 though. Since all FP instructions are 32-bit in Thumb2, it makes things easy. We just need to select the appropriate adjustment. Do this by moving the adjustment out of do_undefinstr() into the assembly code, as only the assembly code knows whether it's dealing with a 32-bit or 16-bit instruction. Cc: <stable@vger.kernel.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-07-30arch: remove direct definitions of KERN_<LEVEL> usesJoe Perches
Add #include <linux/kern_levels.h> so that the #define KERN_<LEVEL> macros don't have to be duplicated. Signed-off-by: Joe Perches <joe@perches.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Kay Sievers <kay@vrfy.org> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-09ARM: vfp: fix a hole in VFP thread migrationRussell King
Fix a hole in the VFP thread migration. Lets define two threads. Thread 1, we'll call 'interesting_thread' which is a thread which is running on CPU0, using VFP (so vfp_current_hw_state[0] = &interesting_thread->vfpstate) and gets migrated off to CPU1, where it continues execution of VFP instructions. Thread 2, we'll call 'new_cpu0_thread' which is the thread which takes over on CPU0. This has also been using VFP, and last used VFP on CPU0, but doesn't use it again. The following code will be executed twice: cpu = thread->cpu; /* * On SMP, if VFP is enabled, save the old state in * case the thread migrates to a different CPU. The * restoring is done lazily. */ if ((fpexc & FPEXC_EN) && vfp_current_hw_state[cpu]) { vfp_save_state(vfp_current_hw_state[cpu], fpexc); vfp_current_hw_state[cpu]->hard.cpu = cpu; } /* * Thread migration, just force the reloading of the * state on the new CPU in case the VFP registers * contain stale data. */ if (thread->vfpstate.hard.cpu != cpu) vfp_current_hw_state[cpu] = NULL; The first execution will be on CPU0 to switch away from 'interesting_thread'. interesting_thread->cpu will be 0. So, vfp_current_hw_state[0] points at interesting_thread->vfpstate. The hardware state will be saved, along with the CPU number (0) that it was executing on. 'thread' will be 'new_cpu0_thread' with new_cpu0_thread->cpu = 0. Also, because it was executing on CPU0, new_cpu0_thread->vfpstate.hard.cpu = 0, and so the thread migration check is not triggered. This means that vfp_current_hw_state[0] remains pointing at interesting_thread. The second execution will be on CPU1 to switch _to_ 'interesting_thread'. So, 'thread' will be 'interesting_thread' and interesting_thread->cpu now will be 1. The previous thread executing on CPU1 is not relevant to this so we shall ignore that. We get to the thread migration check. Here, we discover that interesting_thread->vfpstate.hard.cpu = 0, yet interesting_thread->cpu is now 1, indicating thread migration. We set vfp_current_hw_state[1] to NULL. So, at this point vfp_current_hw_state[] contains the following: [0] = &interesting_thread->vfpstate [1] = NULL Our interesting thread now executes a VFP instruction, takes a fault which loads the state into the VFP hardware. Now, through the assembly we now have: [0] = &interesting_thread->vfpstate [1] = &interesting_thread->vfpstate CPU1 stops due to ptrace (and so saves its VFP state) using the thread switch code above), and CPU0 calls vfp_sync_hwstate(). if (vfp_current_hw_state[cpu] == &thread->vfpstate) { vfp_save_state(&thread->vfpstate, fpexc | FPEXC_EN); BANG, we corrupt interesting_thread's VFP state by overwriting the more up-to-date state saved by CPU1 with the old VFP state from CPU0. Fix this by ensuring that we have sane semantics for the various state describing variables: 1. vfp_current_hw_state[] points to the current owner of the context information stored in each CPUs hardware, or NULL if that state information is invalid. 2. thread->vfpstate.hard.cpu always contains the most recent CPU number which the state was loaded into or NR_CPUS if no CPU owns the state. So, for a particular CPU to be a valid owner of the VFP state for a particular thread t, two things must be true: vfp_current_hw_state[cpu] == &t->vfpstate && t->vfpstate.hard.cpu == cpu. and that is valid from the moment a CPU loads the saved VFP context into the hardware. This gives clear and consistent semantics to interpreting these variables. This patch also fixes thread copying, ensuring that t->vfpstate.hard.cpu is invalidated, otherwise CPU0 may believe it was the last owner. The hole can happen thus: - thread1 runs on CPU2 using VFP, migrates to CPU3, exits and thread_info freed. - New thread allocated from a previously running thread on CPU2, reusing memory for thread1 and copying vfp.hard.cpu. At this point, the following are true: new_thread1->vfpstate.hard.cpu == 2 &new_thread1->vfpstate == vfp_current_hw_state[2] Lastly, this also addresses thread flushing in a similar way to thread copying. Hole is: - thread runs on CPU0, using VFP, migrates to CPU1 but does not use VFP. - thread calls execve(), so thread flush happens, leaving vfp_current_hw_state[0] intact. This vfpstate is memset to 0 causing thread->vfpstate.hard.cpu = 0. - thread migrates back to CPU0 before using VFP. At this point, the following are true: thread->vfpstate.hard.cpu == 0 &thread->vfpstate == vfp_current_hw_state[0] Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-07-09ARM: vfp: rename check_exception to vfp_hw_state_validRussell King
Rename this branch to more accurately reflect why its taken, rather than what the following code does. It is the only caller of this code. This helps to clarify following changes, yet this change results in no actual code change. Document the VFP hardware state at the target of this branch. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-07-09ARM: vfp: rename last_VFP_context to vfp_current_hw_stateRussell King
Rename the slightly confusing 'last_VFP_context' variable to be more descriptive of what it actually is. This variable stores a pointer to the current owner's vfpstate structure for the context held in the VFP hardware. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-11-30ARM: 6498/1: vfp: Correct data alignment for CONFIG_THUMB2_KERNELDave Martin
Directives such as .long and .word do not magically cause the assembler location counter to become aligned in gas. As a result, using these directives in code sections can result in misaligned data words when building a Thumb-2 kernel (CONFIG_THUMB2_KERNEL). This is a Bad Thing, since the ABI permits the compiler to assume that fundamental types of word size or above are word- aligned when accessing them from C. If the data is not really word-aligned, this can cause impaired performance and stray alignment faults in some circumstances. In general, the following rules should be applied when using data word declaration directives inside code sections: * .quad and .double: .align 3 * .long, .word, .single, .float: .align (or .align 2) * .short: No explicit alignment required, since Thumb-2 instructions are always 2 or 4 bytes in size. immediately after an instruction. Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Dave Martin <dave.martin@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-05-27ARM: VFP: Fix vfp_put_double() for d16-d31Russell King
vfp_put_double() takes the double value in r0,r1 not r1,r2. Reported-by: Tarun Kanti DebBarma <tarun.kanti@ti.com> Cc: <stable@kernel.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-07-24Thumb-2: Implement the unified VFP supportCatalin Marinas
This patch modifies the VFP files for the ARM/Thumb-2 unified assembly syntax. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2009-05-30Fix the VFP handling on the Feroceon CPUCatalin Marinas
This CPU generates synchronous VFP exceptions in a non-standard way - the FPEXC.EX bit set but without the FPSCR.IXE bit being set like in the VFP subarchitecture 1 or just the FPEXC.DEX bit like in VFP subarchitecture 2. The main problem is that the faulty instruction (which needs to be emulated in software) will be restarted several times (normally until a context switch disables the VFP). This patch ensures that the VFP exception is treated as synchronous. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Nicolas Pitre <nico@cam.org>
2009-04-01[ARM] 5440/1: Fix VFP state corruption due to preemption during VFP exceptionsGeorge G. Davis
We've observed that ARM VFP state can be corrupted during VFP exception handling when PREEMPT is enabled. The exact conditions are difficult to reproduce but appear to occur during VFP exception handling when a task causes a VFP exception which is handled via VFP_bounce and is then preempted by yet another task which in turn causes yet another VFP exception. Since the VFP_bounce code is not preempt safe, VFP state then becomes corrupt. In order to prevent preemption from occuring while handling a VFP exception, this patch disables preemption while handling VFP exceptions. Signed-off-by: George G. Davis <gdavis@mvista.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-02-12[ARM] 5387/1: Add ptrace VFP support on ARMCatalin Marinas
This patch adds ptrace support for setting and getting the VFP registers using PTRACE_SETVFPREGS and PTRACE_GETVFPREGS. The user_vfp structure defined in asm/user.h contains 32 double registers (to cover VFPv3 and Neon hardware) and the FPSCR register. Cc: Paul Brook <paul@codesourcery.com> Cc: Daniel Jacobowitz <dan@codesourcery.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-12-18[ARM] 5349/1: VFP: Add PM code to save and restore current VFP stateBen Dooks
When CONFIG_PM is selected, the VFP code does not have any handler installed to deal with either saving the VFP state of the current task, nor does it do anything to try and restore the VFP after a resume. On resume, the VFP will have been reset and the co-processor access control registers are in an indeterminate state (very probably the CP10 and CP11 the VFP uses will have been disabled by the ARM core reset). When this happens, resume will break as soon as it tries to unfreeze the tasks and restart scheduling. Add a sys device to allow us to hook the suspend call to save the current thread state if the thread is using VFP and a resume hook which restores the CP10/CP11 access and ensures the VFP is disabled so that the lazy swapping will take place on next access. Signed-off-by: Ben Dooks <ben-linux@fluff.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-11-06ARMv7: Branch over conditional undefined instructions in vfphw.SCatalin Marinas
On ARMv7, conditional undefined instructions may generate exceptions even if the condition is not met. The vfphw.S contains the FPINST and FPINST2 access instructions which may not be present on processors with synchronous VFP exceptions. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2008-09-01[ARM] 5227/1: Add the ENDPROC declarations to the .S filesCatalin Marinas
This declaration specifies the "function" type and size for various assembly functions, mainly needed for generating the correct branch instructions in Thumb-2. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-01-26[ARM] 4583/1: ARMv7: Add VFPv3 supportCatalin Marinas
This patch adds the support for VFPv3 (the kernel currently supports VFPv2). The main difference is 32 double registers (compared to 16). Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-01-26[ARM] 4582/2: Add support for the common VFP subarchitectureCatalin Marinas
This patch allows the VFP support code to run correctly on CPUs compatible with the common VFP subarchitecture specification (Appendix B in the ARM ARM v7-A and v7-R edition). It implements support for VFP subarchitecture 2 while being backwards compatible with subarchitecture 1. On VFP subarchitecture 1, the arithmetic exceptions are asynchronous (or imprecise as described in the old ARM ARM) unless the FPSCR.IXE bit is 1. The exceptional instructions can be read from FPINST and FPINST2 registers. With VFP subarchitecture 2, the arithmetic exceptions can also be synchronous and marked by the FPEXC.DEX bit (the FPEXC.EX bit is cleared). CPUs implementing the synchronous arithmetic exceptions don't have the FPINST and FPINST2 registers and accessing them would trigger and undefined exception. Note that FPEXC.EX bit has an additional meaning on subarchitecture 1 - if it isn't set, there is no additional information in FPINST and FPINST2 that needs to be saved at context switch or when lazy-loading the VFP state of a different thread. The patch also removes the clearing of the cumulative exception flags in FPSCR when additional exceptions were raised. It is up to the user application to clear these bits. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2007-07-20[ARM] vfp: make fpexc bit names less verboseRussell King
Use the fpexc abbreviated names instead of long verbose names for fpexc bits. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2007-01-25[ARM] 4111/1: Allow VFP to work with thread migration on SMPCatalin Marinas
The current lazy saving of the VFP registers is no longer possible with thread migration on SMP. This patch implements a per-CPU vfp-state pointer and the saving of the VFP registers at every context switch. The registers restoring is still performed in a lazy way. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2006-08-30[ARM] 3750/3: Fix double VFP emulation for EABI kernelsDaniel Jacobowitz
Patch from Daniel Jacobowitz vfp_put_double didn't work in a CONFIG_AEABI kernel. By swapping the arguments, we arrange for them to be in the same place regardless of ABI. I made the same change to vfp_put_float for consistency. Signed-off-by: Daniel Jacobowitz <dan@codesourcery.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2006-06-22[ARM] Enable VFP to be built when non-VFP capable CPUs are selectedRussell King
Since we pass flags to the compiler to control code generation based on the least capable selected CPU, if we want to include VFP support, we must tweak the assembler flags to allow the VFP instructions. Moreover, we must not use the mrrc/mcrr versions since these will not be recognised by the assembler. We do not convert all instructions to the VFP-equivalent (yet) since binutils appears to barf on "fmrx rn, fpinst" and doesn't provide any other way (other than using the mrc equivalent) to encode this instruction - which is rather a problem when you have a VFP implementation which requires these instructions. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2006-04-10[ARM] 3473/1: Use numbers 0-15 for the VFP double registersCatalin Marinas
Patch from Catalin Marinas This patch changes the double registers numbering to 0-15 from even 0-30, in preparation for future VFP extensions. It also fixes the VFP_REG_ZERO bug (value 16 actually represents the 8th double register with the original numbering). The original mcrr/mrrc on CP10 were generating FMRRS/FMSRR instead of FMRRD/FMDRR. The patch changes to CP11 for the correct instructions. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2006-03-25[ARM] 3398/1: Fix the VFP registers loading/storing base addressCatalin Marinas
Patch from Catalin Marinas The current VFP code corrupts the VFP registers (including the control ones) if more than one floating point application is executed at the same time. This patch fixes the updating of the load/store base addresses for the VFP registers. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-04-16Linux-2.6.12-rc2Linus Torvalds
Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!