summaryrefslogtreecommitdiff
path: root/arch/powerpc/lib
AgeCommit message (Collapse)Author
2017-11-12powerpc/kprobes: Blacklist emulate_update_regs() from kprobesNaveen N. Rao
Commit 3cdfcbfd32b9d ("powerpc: Change analyse_instr so it doesn't modify *regs") introduced emulate_update_regs() to perform part of what emulate_step() was doing earlier. However, this function was not added to the kprobes blacklist. Add it so as to prevent it from being probed. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-10Merge branch 'fixes' into nextMichael Ellerman
We have some dependencies & conflicts between patches in fixes and things to go in next, both in the radix TLB flush code and the IMC PMU driver. So merge fixes into next.
2017-10-10powerpc/lib/sstep: Fix count leading zeros instructionsSandipan Das
According to the GCC documentation, the behaviour of __builtin_clz() and __builtin_clzl() is undefined if the value of the input argument is zero. Without handling this special case, these builtins have been used for emulating the following instructions: * Count Leading Zeros Word (cntlzw[.]) * Count Leading Zeros Doubleword (cntlzd[.]) This fixes the emulated behaviour of these instructions by adding an additional check for this special case. Fixes: 3cdfcbfd32b9d ("powerpc: Change analyse_instr so it doesn't modify *regs") Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com> Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-04powerpc/lib/sstep: Fix fixed-point shift instructions that set CA32Sandipan Das
This fixes the emulated behaviour of existing fixed-point shift right algebraic instructions that are supposed to set both the CA and CA32 bits of XER when running on a system that is compliant with POWER ISA v3.0 independent of whether the system is executing in 32-bit mode or 64-bit mode. The following instructions are affected: * Shift Right Algebraic Word Immediate (srawi[.]) * Shift Right Algebraic Word (sraw[.]) * Shift Right Algebraic Doubleword Immediate (sradi[.]) * Shift Right Algebraic Doubleword (srad[.]) Fixes: 0016a4cf5582 ("powerpc: Emulate most Book I instructions in emulate_step()") Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-04powerpc/lib/sstep: Fix fixed-point arithmetic instructions that set CA32Sandipan Das
There are existing fixed-point arithmetic instructions that always set the CA bit of XER to reflect the carry out of bit 0 in 64-bit mode and out of bit 32 in 32-bit mode. In ISA v3.0, these instructions also always set the CA32 bit of XER to reflect the carry out of bit 32. This fixes the emulated behaviour of such instructions when running on a system that is compliant with POWER ISA v3.0. The following instructions are affected: * Add Immediate Carrying (addic) * Add Immediate Carrying and Record (addic.) * Subtract From Immediate Carrying (subfic) * Add Carrying (addc[.]) * Subtract From Carrying (subfc[.]) * Add Extended (adde[.]) * Subtract From Extended (subfe[.]) * Add to Minus One Extended (addme[.]) * Subtract From Minus One Extended (subfme[.]) * Add to Zero Extended (addze[.]) * Subtract From Zero Extended (subfze[.]) Fixes: 0016a4cf5582 ("powerpc: Emulate most Book I instructions in emulate_step()") Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-10-04powerpc/lib/sstep: Add XER bits introduced in POWER ISA v3.0Sandipan Das
This adds definitions for the OV32 and CA32 bits of XER that were introduced in POWER ISA v3.0. There are some existing instructions that currently set the OV and CA bits based on certain conditions. The emulation behaviour of all these instructions needs to be updated to set these new bits accordingly. Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com> Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-20powerpc/sstep: mullw should calculate a 64 bit signed resultAnton Blanchard
mullw should do a 32 bit signed multiply and create a 64 bit signed result. It currently truncates the result to 32 bits. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-20powerpc/sstep: Fix issues with mcrfAnton Blanchard
mcrf broke when we changed analyse_instr() to not modify the register state. The instruction writes to the CR, so we need to store the result in op->ccval, not op->val. Fixes: 3cdfcbfd32b9 ("powerpc: Change analyse_instr so it doesn't modify *regs") Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-20powerpc/sstep: Fix issues with set_cr0()Anton Blanchard
set_cr0() broke when we changed analyse_instr() to not modify the register state. Instead of looking at regs->gpr[x] which has not been updated yet, we need to look at op->val. Fixes: 3cdfcbfd32b9 ("powerpc: Change analyse_instr so it doesn't modify *regs") Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-04powerpc: Fix kernel crash in emulation of vector loads and storesPaul Mackerras
Commit 350779a29f11 ("powerpc: Handle most loads and stores in instruction emulation code", 2017-08-30) changed the register usage in get_vr and put_vr with the aim of leaving the register number in r3 untouched on return. Unfortunately, r6 was not a good choice, as the callers as of 350779a29f11 store a MSR value in r6. Then, in commit c22435a5f3d8 ("powerpc: Emulate FP/vector/VSX loads/stores correctly when regs not live", 2017-08-30), the saving and restoring of the MSR got moved into get_vr and put_vr. Either way, the effect is that we put a value in MSR that only has the 0x3f8 bits non-zero, meaning that we are switching to 32-bit mode. That leads to a crash like this: Unable to handle kernel paging request for instruction fetch Faulting instruction address: 0x0007bea0 Oops: Kernel access of bad area, sig: 11 [#12] LE SMP NR_CPUS=2048 NUMA PowerNV Modules linked in: vmx_crypto binfmt_misc ip_tables x_tables autofs4 crc32c_vpmsum CPU: 6 PID: 32659 Comm: trashy_testcase Tainted: G D 4.13.0-rc2-00313-gf3026f57e6ed-dirty #23 task: c000000f1bb9e780 task.stack: c000000f1ba98000 NIP: 000000000007bea0 LR: c00000000007b054 CTR: c00000000007be70 REGS: c000000f1ba9b960 TRAP: 0400 Tainted: G D (4.13.0-rc2-00313-gf3026f57e6ed-dirty) MSR: 10000000400010a1 <HV,ME,IR,LE> CR: 48000228 XER: 00000000 CFAR: c00000000007be74 SOFTE: 1 GPR00: c00000000007b054 c000000f1ba9bbe0 c000000000e6e000 000000000000001d GPR04: c000000f1ba9bc00 c00000000007be70 00000000000000e8 9000000002009033 GPR08: 0000000002000000 100000000282f033 000000000b0a0900 0000000000001009 GPR12: 0000000000000000 c00000000fd42100 0706050303020100 a5a5a5a5a5a5a5a5 GPR16: 2e2e2e2e2e2de70c 2e2e2e2e2e2e2e2d 0000000000ff00ff 0606040202020000 GPR20: 000000000000005b ffffffffffffffff 0000000003020100 0000000000000000 GPR24: c000000f1ab90020 c000000f1ba9bc00 0000000000000001 0000000000000001 GPR28: c000000f1ba9bc90 c000000f1ba9bea0 000000000b0a0908 0000000000000001 NIP [000000000007bea0] 0x7bea0 LR [c00000000007b054] emulate_loadstore+0x1044/0x1280 Call Trace: [c000000f1ba9bbe0] [c000000000076b80] analyse_instr+0x60/0x34f0 (unreliable) [c000000f1ba9bc70] [c00000000007b7ec] emulate_step+0x23c/0x544 [c000000f1ba9bce0] [c000000000053424] arch_uprobe_skip_sstep+0x24/0x40 [c000000f1ba9bd00] [c00000000024b2f8] uprobe_notify_resume+0x598/0xba0 [c000000f1ba9be00] [c00000000001c284] do_notify_resume+0xd4/0xf0 [c000000f1ba9be30] [c00000000000bd44] ret_from_except_lite+0x70/0x74 Instruction dump: XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX ---[ end trace a7ae7a7f3e0256b5 ]--- To fix this, we just revert to using r3 as before, since the callers don't rely on r3 being left unmodified. Fortunately, this can't be triggered by a misaligned load or store, because vector loads and stores truncate misaligned addresses rather than taking an alignment interrupt. It can be triggered using uprobes. Fixes: 350779a29f11 ("powerpc: Handle most loads and stores in instruction emulation code") Reported-by: Anton Blanchard <anton@ozlabs.org> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Tested-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-02powerpc/sstep: Avoid used uninitialized errorMichael Ellerman
Older compilers think val may be used uninitialized: arch/powerpc/lib/sstep.c: In function 'emulate_loadstore': arch/powerpc/lib/sstep.c:2758:23: error: 'val' may be used uninitialized in this function We know better, but initialise val to 0 to avoid breaking the build. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-01powerpc/32: remove a NOP from memset()Christophe Leroy
memset() is patched after initialisation to activate the optimised part which uses cache instructions. Today we have a 'b 2f' to skip the optimised patch, which then gets replaced by a NOP, implying a useless cycle consumption. As we have a 'bne 2f' just before, we could use that instruction for the live patching, hence removing the need to have a dedicated 'b 2f' to be replaced by a NOP. This patch changes the 'bne 2f' by a 'b 2f'. During init, that 'b 2f' is then replaced by 'bne 2f' Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-01powerpc/32: optimise memset()Christophe Leroy
There is no need to extend the set value to an int when the length is lower than 4 as in that case we only do byte stores. We can therefore immediately branch to the part handling it. By separating it from the normal case, we are able to eliminate a few actions on the destination pointer. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-01powerpc: fix location of two EXPORT_SYMBOLChristophe Leroy
Commit 9445aa1a3062a ("ppc: move exports to definitions") added EXPORT_SYMBOL() for memset() and flush_hash_pages() in the middle of the functions. This patch moves them at the end of the two functions. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-01powerpc/32: add memset16()Christophe Leroy
Commit 694fc88ce271f ("powerpc/string: Implement optimized memset variants") added memset16(), memset32() and memset64() for the 64 bits PPC. On 32 bits, memset64() is not relevant, and as shown below, the generic version of memset32() gives a good code, so only memset16() is candidate for an optimised version. 000009c0 <memset32>: 9c0: 2c 05 00 00 cmpwi r5,0 9c4: 39 23 ff fc addi r9,r3,-4 9c8: 4d 82 00 20 beqlr 9cc: 7c a9 03 a6 mtctr r5 9d0: 94 89 00 04 stwu r4,4(r9) 9d4: 42 00 ff fc bdnz 9d0 <memset32+0x10> 9d8: 4e 80 00 20 blr The last part of memset() handling the not 4-bytes multiples operates on bytes, making it unsuitable for handling word without modification. As it would increase memset() complexity, it is better to implement memset16() from scratch. In addition it has the advantage of allowing a more optimised memset16() than what we would have by using the memset() function. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-01powerpc: Wrap register number correctly for string load/store instructionsPaul Mackerras
Michael Ellerman reported that emulate_loadstore() was trying to access element 32 of regs->gpr[], which doesn't exist, when emulating a string store instruction. This is because the string load and store instructions (lswi, lswx, stswi and stswx) are defined to wrap around from register 31 to register 0 if the number of bytes being loaded or stored is sufficiently large. This wrapping was not implemented in the emulation code. To fix it, we mask the register number after incrementing it. Reported-by: Michael Ellerman <mpe@ellerman.id.au> Fixes: c9f6f4ed95d4 ("powerpc: Implement emulation of string loads and stores") Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-01powerpc: Emulate load/store floating point as integer word instructionsPaul Mackerras
This adds emulation for the lfiwax, lfiwzx and stfiwx instructions. This necessitated adding a new flag to indicate whether a floating point or an integer conversion was needed for LOAD_FP and STORE_FP, so this moves the size field in op->type up 4 bits. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-01powerpc: Use instruction emulation infrastructure to handle alignment faultsPaul Mackerras
This replaces almost all of the instruction emulation code in fix_alignment() with calls to analyse_instr(), emulate_loadstore() and emulate_dcbz(). The only emulation code left is the SPE emulation code; analyse_instr() etc. do not handle SPE instructions at present. One result of this is that we can now handle alignment faults on all the new VSX load and store instructions that were added in POWER9. VSX loads/stores will take alignment faults for unaligned accesses to cache-inhibited memory. Another effect is that we no longer rely on the DAR and DSISR values set by the processor. With this, we now need to include the instruction emulation code unconditionally. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-01powerpc: Separate out load/store emulation into its own functionPaul Mackerras
This moves the parts of emulate_step() that deal with emulating load and store instructions into a new function called emulate_loadstore(). This is to make it possible to reuse this code in the alignment handler. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-01powerpc: Handle opposite-endian processes in emulation codePaul Mackerras
This adds code to the load and store emulation code to byte-swap the data appropriately when the process being emulated is set to the opposite endianness to that of the kernel. This also enables the emulation for the multiple-register loads and stores (lmw, stmw, lswi, stswi, lswx, stswx) to work for little-endian. In little-endian mode, the partial word at the end of a transfer for lsw*/stsw* (when the byte count is not a multiple of 4) is loaded/stored at the least-significant end of the register. Additionally, this fixes a bug in the previous code in that it could call read_mem/write_mem with a byte count that was not 1, 2, 4 or 8. Note that this only works correctly on processors with "true" little-endian mode, such as IBM POWER processors from POWER6 on, not the so-called "PowerPC" little-endian mode that uses address swizzling as implemented on the old 32-bit 603, 604, 740/750, 74xx CPUs. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-01powerpc: Set regs->dar if memory access fails in emulate_step()Paul Mackerras
This adds code to the instruction emulation code to set regs->dar to the address of any memory access that fails. This address is not necessarily the same as the effective address of the instruction, because if the memory access is unaligned, it might cross a page boundary and fault on the second page. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-01powerpc: Emulate the dcbz instructionPaul Mackerras
This adds code to analyse_instr() and emulate_step() to understand the dcbz (data cache block zero) instruction. The emulate_dcbz() function is made public so it can be used by the alignment handler in future. (The apparently unnecessary cropping of the address to 32 bits is there because it will be needed in that situation.) Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-01powerpc: Emulate load/store floating double pair instructionsPaul Mackerras
This adds lfdp[x] and stfdp[x] to the set of instructions that analyse_instr() and emulate_step() understand. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-01powerpc: Emulate vector element load/store instructionsPaul Mackerras
This adds code to analyse_instr() and emulate_step() to handle the vector element loads and stores: lvebx, lvehx, lvewx, stvebx, stvehx, stvewx. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-01powerpc: Emulate FP/vector/VSX loads/stores correctly when regs not livePaul Mackerras
At present, the analyse_instr/emulate_step code checks for the relevant MSR_FP/VEC/VSX bit being set when a FP/VMX/VSX load or store is decoded, but doesn't recheck the bit before reading or writing the relevant FP/VMX/VSX register in emulate_step(). Since we don't have preemption disabled, it is possible that we get preempted between checking the MSR bit and doing the register access. If that happened, then the registers would have been saved to the thread_struct for the current process. Accesses to the CPU registers would then potentially read stale values, or write values that would never be seen by the user process. Another way that the registers can become non-live is if a page fault occurs when accessing user memory, and the page fault code calls a copy routine that wants to use the VMX or VSX registers. To fix this, the code for all the FP/VMX/VSX loads gets restructured so that it forms an image in a local variable of the desired register contents, then disables preemption, checks the MSR bit and either sets the CPU register or writes the value to the thread struct. Similarly, the code for stores checks the MSR bit, copies either the CPU register or the thread struct to a local variable, then reenables preemption and then copies the register image to memory. If the instruction being emulated is in the kernel, then we must not use the register values in the thread_struct. In this case, if the relevant MSR enable bit is not set, then emulate_step refuses to emulate the instruction. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-01powerpc: Make load/store emulation use larger memory accessesPaul Mackerras
At the moment, emulation of loads and stores of up to 8 bytes to unaligned addresses on a little-endian system uses a sequence of single-byte loads or stores to memory. This is rather inefficient, and the code is hard to follow because it has many ifdefs. In addition, the Power ISA has requirements on how unaligned accesses are performed, which are not met by doing all accesses as sequences of single-byte accesses. Emulation of VSX loads and stores uses __copy_{to,from}_user, which means the emulation code has no control on the size of accesses. To simplify this, we add new copy_mem_in() and copy_mem_out() functions for accessing memory. These use a sequence of the largest possible aligned accesses, up to 8 bytes (or 4 on 32-bit systems), to copy memory between a local buffer and user memory. We then rewrite {read,write}_mem_unaligned and the VSX load/store emulation using these new functions. These new functions also simplify the code in do_fp_load() and do_fp_store() for the unaligned cases. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-01powerpc: Add emulation for the addpcis instructionPaul Mackerras
The addpcis instruction puts the sum of the next instruction address plus a constant into a register. Since the result depends on the address of the instruction, it will give an incorrect result if it is single-stepped out of line, which is what the *probes subsystem will currently do if a probe is placed on an addpcis instruction. This fixes the problem by adding emulation of it to analyse_instr(). Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-01powerpc: Don't update CR0 in emulation of popcnt, prty, bpermd instructionsPaul Mackerras
The architecture shows the least-significant bit of the instruction word as reserved for the popcnt[bwd], prty[wd] and bpermd instructions, that is, these instructions never update CR0. Therefore this changes the emulation of these instructions to skip the CR0 update. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-01powerpc: Fix emulation of the isel instructionPaul Mackerras
The case added for the isel instruction was added inside a switch statement which uses the 10-bit minor opcode field in the 0x7fe bits of the instruction word. However, for the isel instruction, the minor opcode field is only the 0x3e bits, and the 0x7c0 bits are used for the "BC" field, which indicates which CR bit to use to select the result. Therefore, for the isel emulation to work correctly when BC != 0, we need to match on ((instr >> 1) & 0x1f) == 15). To do this, we pull the isel case out of the switch statement and put it in an if statement of its own. Fixes: e27f71e5ff3c ("powerpc/lib/sstep: Add isel instruction emulation") Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-01powerpc/64: Fix update forms of loads and stores to write 64-bit EAPaul Mackerras
When a 64-bit processor is executing in 32-bit mode, the update forms of load and store instructions are required by the architecture to write the full 64-bit effective address into the RA register, though only the bottom 32 bits are used to address memory. Currently, the instruction emulation code writes the truncated address to the RA register. This fixes it by keeping the full 64-bit EA in the instruction_op structure, truncating the address in emulate_step() where it is used to address memory, rather than in the address computations in analyse_instr(). Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-01powerpc: Handle most loads and stores in instruction emulation codePaul Mackerras
This extends the instruction emulation infrastructure in sstep.c to handle all the load and store instructions defined in the Power ISA v3.0, except for the atomic memory operations, ldmx (which was never implemented), lfdp/stfdp, and the vector element load/stores. The instructions added are: Integer loads and stores: lbarx, lharx, lqarx, stbcx., sthcx., stqcx., lq, stq. VSX loads and stores: lxsiwzx, lxsiwax, stxsiwx, lxvx, lxvl, lxvll, lxvdsx, lxvwsx, stxvx, stxvl, stxvll, lxsspx, lxsdx, stxsspx, stxsdx, lxvw4x, lxsibzx, lxvh8x, lxsihzx, lxvb16x, stxvw4x, stxsibx, stxvh8x, stxsihx, stxvb16x, lxsd, lxssp, lxv, stxsd, stxssp, stxv. These instructions are handled both in the analyse_instr phase and in the emulate_step phase. The code for lxvd2ux and stxvd2ux has been taken out, as those instructions were never implemented in any processor and have been taken out of the architecture, and their opcodes have been reused for other instructions in POWER9 (lxvb16x and stxvb16x). The emulation for the VSX loads and stores uses helper functions which don't access registers or memory directly, which can hopefully be reused by KVM later. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-01powerpc: Don't check MSR FP/VMX/VSX enable bits in analyse_instr()Paul Mackerras
This removes the checks for the FP/VMX/VSX enable bits in the MSR from analyse_instr() and adds them to emulate_step() instead. The reason for this is that we may want to use analyse_instr() in a situation where the FP/VMX/VSX register values are stored in the current thread_struct and the FP/VMX/VSX enable bits in the MSR image in the pt_regs are zero. Since analyse_instr() doesn't make any changes to register state, it is reasonable for it to indicate what the effect of an instruction would be even though the relevant enable bit is off. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-09-01powerpc: Change analyse_instr so it doesn't modify *regsPaul Mackerras
The analyse_instr function currently doesn't just work out what an instruction does, it also executes those instructions whose effect is only to update CPU registers that are stored in struct pt_regs. This is undesirable because optprobes uses analyse_instr to work out if an instruction could be successfully emulated in future. This changes analyse_instr so it doesn't modify *regs; instead it stores information in the instruction_op structure to indicate what registers (GPRs, CR, XER, LR) would be set and what value they would be set to. A companion function called emulate_update_regs() can then use that information to update a pt_regs struct appropriately. As a minor cleanup, this replaces inline asm using the cntlzw and cntlzd instructions with calls to __builtin_clz() and __builtin_clzl(). Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-17powerpc/string: Implement optimized memset variantsNaveen N. Rao
Based on Matthew Wilcox's patches for other architectures. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-10powerpc/lib/sstep: Add isel instruction emulationMatt Brown
This adds emulation for the isel instruction. Tested for correctness against the isel instruction and its extended mnemonics (lt, gt, eq) on ppc64le. Signed-off-by: Matt Brown <matthew.brown.dev@gmail.com> Reviewed-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-10powerpc/lib/sstep: Add prty instruction emulationMatt Brown
This adds emulation for the prtyw and prtyd instructions. Tested for logical correctness against the prtyw and prtyd instructions on ppc64le. Signed-off-by: Matt Brown <matthew.brown.dev@gmail.com> Reviewed-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-10powerpc/lib/sstep: Add bpermd instruction emulationMatt Brown
This adds emulation for the bpermd instruction. Tested for correctness against the bpermd instruction on ppc64le. Signed-off-by: Matt Brown <matthew.brown.dev@gmail.com> Reviewed-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-10powerpc/lib/sstep: Add popcnt instruction emulationMatt Brown
This adds emulations for the popcntb, popcntw, and popcntd instructions. Tested for correctness against the popcnt{b,w,d} instructions on ppc64le. Signed-off-by: Matt Brown <matthew.brown.dev@gmail.com> Reviewed-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-10powerpc/lib/sstep: Add cmpb instruction emulationMatt Brown
This patch adds emulation of the cmpb instruction, enabling xmon to emulate this instruction. Tested for correctness against the cmpb asm instruction on ppc64le. Signed-off-by: Matt Brown <matthew.brown.dev@gmail.com> Reviewed-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-10powerpc: Fix invalid use of register expressionsAndreas Schwab
binutils >= 2.26 now warns about misuse of register expressions in assembler operands that are actually literals, for example: arch/powerpc/kernel/entry_64.S:535: Warning: invalid register expression In practice these are almost all uses of r0 that should just be a literal 0. Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> [mpe: Mention r0 is almost always the culprit, fold in purgatory change] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-07-14Merge tag 'powerpc-4.13-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Nothing that really stands out, just a bunch of fixes that have come in in the last couple of weeks. None of these are actually fixes for code that is new in 4.13. It's roughly half older bugs, with fixes going to stable, and half fixes/updates for Power9. Thanks to: Aneesh Kumar K.V, Anton Blanchard, Balbir Singh, Benjamin Herrenschmidt, Madhavan Srinivasan, Michael Neuling, Nicholas Piggin, Oliver O'Halloran" * tag 'powerpc-4.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64: Fix atomic64_inc_not_zero() to return an int powerpc: Fix emulation of mfocrf in emulate_step() powerpc: Fix emulation of mcrf in emulate_step() powerpc/perf: Add POWER9 alternate PM_RUN_CYC and PM_RUN_INST_CMPL events powerpc/perf: Fix SDAR_MODE value for continous sampling on Power9 powerpc/asm: Mark cr0 as clobbered in mftb() powerpc/powernv: Fix local TLB flush for boot and MCE on POWER9 powerpc/mm/radix: Synchronize updates to the process table powerpc/mm/radix: Properly clear process table entry powerpc/powernv: Tell OPAL about our MMU mode on POWER9 powerpc/kexec: Fix radix to hash kexec due to IAMR/AMOR
2017-07-12powerpc: make feature-fixup tests fortify-safeDaniel Axtens
Testing the fortified string functions[1] would cause a kernel panic on boot in test_feature_fixups() due to a buffer overflow in memcmp. This boils down to things like this: extern unsigned int ftr_fixup_test1; extern unsigned int ftr_fixup_test1_orig; check(memcmp(&ftr_fixup_test1, &ftr_fixup_test1_orig, size) == 0); We know that these are asm labels so it is safe to read up to 'size' bytes at those addresses. However, because we have passed the address of a single unsigned int to memcmp, the compiler believes the underlying object is in fact a single unsigned int. So if size > sizeof(unsigned int), there will be a panic at runtime. We can fix this by changing the types: instead of calling the asm labels unsigned ints, call them unsigned int[]s. Therefore the size isn't incorrectly determined at compile time and we get a regular unsafe memcmp and no panic. [1] http://openwall.com/lists/kernel-hardening/2017/05/09/2 Link: http://lkml.kernel.org/r/1497903987-21002-7-git-send-email-keescook@chromium.org Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Kees Cook <keescook@chromium.org> Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Cc: Kees Cook <keescook@chromium.org> Cc: Daniel Micay <danielmicay@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12powerpc: Fix emulation of mfocrf in emulate_step()Anton Blanchard
From POWER4 onwards, mfocrf() only places the specified CR field into the destination GPR, and the rest of it is set to 0. The PowerPC AS from version 3.0 now requires this behaviour. The emulation code currently puts the entire CR into the destination GPR. Fix it. Fixes: 6888199f7fe5 ("[POWERPC] Emulate more instructions in software") Cc: stable@vger.kernel.org # v2.6.22+ Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-07-12powerpc: Fix emulation of mcrf in emulate_step()Anton Blanchard
The mcrf emulation code was using the CR field number directly as the shift value, without taking into account that CR fields are numbered from 0-7 starting at the high bits. That meant it was looking at the CR fields in the reverse order. Fixes: cf87c3f6b647 ("powerpc: Emulate icbi, mcrf and conditional-trap instructions") Cc: stable@vger.kernel.org # v3.18+ Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-07-03powerpc/lib/code-patching: Use alternate map for patch_instruction()Balbir Singh
This patch creates the window using text_poke_area, allocated via get_vm_area(). text_poke_area is per CPU to avoid locking. text_poke_area for each cpu is setup using late_initcall, prior to setup of these alternate mapping areas, we continue to use direct write to change/modify kernel text. With the ability to use alternate mappings to write to kernel text, it provides us the freedom to then turn text read-only and implement CONFIG_STRICT_KERNEL_RWX. This code is CPU hotplug aware to ensure that the we have mappings for any new cpus as they come online and tear down mappings for any CPUs that go offline. Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-02powerpc/lib/xor_vmx: Ensure no altivec code executes before ↵Matt Brown
enable_kernel_altivec() The xor_vmx.c file is used for the RAID5 xor operations. In these functions altivec is enabled to run the operation and then disabled. The code uses enable_kernel_altivec() around the core of the algorithm, however the whole file is built with -maltivec, so the compiler is within its rights to generate altivec code anywhere. This has been seen at least once in the wild: 0:mon> di $xor_altivec_2 c0000000000b97d0 3c4c01d9 addis r2,r12,473 c0000000000b97d4 3842db30 addi r2,r2,-9424 c0000000000b97d8 7c0802a6 mflr r0 c0000000000b97dc f8010010 std r0,16(r1) c0000000000b97e0 60000000 nop c0000000000b97e4 7c0802a6 mflr r0 c0000000000b97e8 faa1ffa8 std r21,-88(r1) ... c0000000000b981c f821ff41 stdu r1,-192(r1) c0000000000b9820 7f8101ce stvx v28,r1,r0 <-- POP c0000000000b9824 38000030 li r0,48 c0000000000b9828 7fa101ce stvx v29,r1,r0 ... c0000000000b984c 4bf6a06d bl c0000000000238b8 # enable_kernel_altivec This patch splits the non-altivec code into xor_vmx_glue.c which calls the altivec functions in xor_vmx.c. By compiling xor_vmx_glue.c without -maltivec we can guarantee that altivec instruction will not be executed outside of the enable/disable block. Signed-off-by: Matt Brown <matthew.brown.dev@gmail.com> [mpe: Rework change log and include disassembly] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-05-30powerpc/64: Linker on-demand sfpr functions for modulesNicholas Piggin
For final link, the powerpc64 linker generates fpr save/restore functions on-demand, placing them in the .sfpr section. Starting with binutils 2.25, these can be provided for non-final links with --save-restore-funcs. Use that where possible for module links. This saves about 200 bytes per module (~60kB) on powernv defconfig build. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-05-30powerpc/64: Do not create new section for save/restore functionsNicholas Piggin
There is no need to create a new section for these. Consolidate with 32-bit and just use .text. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-05-30powerpc/64: Do not link crtsavres.o in vmlinuxNicholas Piggin
The 64-bit linker creates save/restore functions on demand with final links, so vmlinux does not require crtsavres.o. Make crtsavres.o extra-y on 64-bit (it is still required by modules). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-05-30powerpc: Tweak copy selection parameter in __copy_tofrom_user_power7()Andrew Jeffery
Experiments with the netperf benchmark indicated that the size selecting VMX-based copies in __copy_tofrom_user_power7() was suboptimal on POWER8. Measurements showed that parity was in the neighbourhood of 3328 bytes, rather than greater than 4096. The change gives a 1.5-2.0% improvement in performance for 4096-byte buffers, reducing the relative time spent in __copy_tofrom_user_power7() from approximately 7% to approximately 5% in the TCP_RR benchmark. Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Acked-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>