linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2017-02-11	crypto: arm64/crc32 - merge CRC32 and PMULL instruction based drivers	Ard Biesheuvel
	The PMULL based CRC32 implementation already contains code based on the separate, optional CRC32 instructions to fallback to when operating on small quantities of data. We can expose these routines directly on systems that lack the 64x64 PMULL instructions but do implement the CRC32 ones, which makes the driver that is based solely on those CRC32 instructions redundant. So remove it. Note that this aligns arm64 with ARM, whose accelerated CRC32 driver also combines the CRC32 extension based and the PMULL based versions. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Matthias Brugger <mbrugger@suse.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-02-03	crypto: arm64/aes - replace scalar fallback with plain NEON fallback	Ard Biesheuvel
	The new bitsliced NEON implementation of AES uses a fallback in two places: CBC encryption (which is strictly sequential, whereas this driver can only operate efficiently on 8 blocks at a time), and the XTS tweak generation, which involves encrypting a single AES block with a different key schedule. The plain (i.e., non-bitsliced) NEON code is more suitable as a fallback, given that it is faster than scalar on low end cores (which is what the NEON implementations target, since high end cores have dedicated instructions for AES), and shows similar behavior in terms of D-cache footprint and sensitivity to cache timing attacks. So switch the fallback handling to the plain NEON driver. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-01-13	crypto: arm64/aes - reimplement bit-sliced ARM/NEON implementation for arm64	Ard Biesheuvel
	This is a reimplementation of the NEON version of the bit-sliced AES algorithm. This code is heavily based on Andy Polyakov's OpenSSL version for ARM, which is also available in the kernel. This is an alternative for the existing NEON implementation for arm64 authored by me, which suffers from poor performance due to its reliance on the pathologically slow four register variant of the tbl/tbx NEON instruction. This version is about ~30% () faster than the generic C code, but only in cases where the input can be 8x interleaved (this is a fundamental property of bit slicing). For this reason, only the chaining modes ECB, XTS and CTR are implemented. (The significance of ECB is that it could potentially be used by other chaining modes) Measured on Cortex-A57. Note that this is still an order of magnitude slower than the implementations that use the dedicated AES instructions introduced in ARMv8, but those are part of an optional extension, and so it is good to have a fallback. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-01-13	crypto: arm64/aes - add scalar implementation	Ard Biesheuvel
	This adds a scalar implementation of AES, based on the precomputed tables that are exposed by the generic AES code. Since rotates are cheap on arm64, this implementation only uses the 4 core tables (of 1 KB each), and avoids the prerotated ones, reducing the D-cache footprint by 75%. On Cortex-A57, this code manages 13.0 cycles per byte, which is ~34% faster than the generic C code. (Note that this is still >13x slower than the code that uses the optional ARMv8 Crypto Extensions, which manages <1 cycles per byte.) Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-01-13	crypto: arm64/chacha20 - implement NEON version based on SSE3 code	Ard Biesheuvel
	This is a straight port to arm64/NEON of the x86 SSE3 implementation of the ChaCha20 stream cipher. It uses the new skcipher walksize attribute to process the input in strides of 4x the block size. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-12-28	Revert "crypto: arm64/ARM: NEON accelerated ChaCha20"	Herbert Xu
	This patch reverts the following commits: 8621caa0d45e731f2e9f5889ff5bb384fcd6e059 8096667273477e735b0072b11a6d617ccee45e5f I should not have applied them because they had already been obsoleted by a subsequent patch series. They also cause a build failure because of the subsequent commit 9ae433bc79f9. Fixes: 9ae433bc79f ("crypto: chacha20 - convert generic and...") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-12-27	crypto: arm64/chacha20 - implement NEON version based on SSE3 code	Ard Biesheuvel
	This is a straight port to arm64/NEON of the x86 SSE3 implementation of the ChaCha20 stream cipher. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-12-07	crypto: arm64/crc32 - accelerated support based on x86 SSE implementation	Ard Biesheuvel
	This is a combination of the the Intel algorithm implemented using SSE and PCLMULQDQ instructions from arch/x86/crypto/crc32-pclmul_asm.S, and the new CRC32 extensions introduced for both 32-bit and 64-bit ARM in version 8 of the architecture. Two versions of the above combo are provided, one for CRC32 and one for CRC32C. The PMULL/NEON algorithm is faster, but operates on blocks of at least 64 bytes, and on multiples of 16 bytes only. For the remaining input, or for all input on systems that lack the PMULL 64x64->128 instructions, the CRC32 instructions will be used. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-12-07	crypto: arm64/crct10dif - port x86 SSE implementation to arm64	Ard Biesheuvel
	This is a transliteration of the Intel algorithm implemented using SSE and PCLMULQDQ instructions that resides in the file arch/x86/crypto/crct10dif-pcl-asm_64.S, but simplified to only operate on buffers that are 16 byte aligned (but of any size) Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-11-29	crypto: arm/aes - Select SIMD in Kconfig	Herbert Xu
	The skcipher conversion for ARM missed the select on CRYPTO_SIMD, causing build failures if SIMD was not otherwise enabled. Fixes: da40e7a4ba4d ("crypto: aes-ce - Convert to skcipher") Fixes: 211f41af534a ("crypto: aesbs - Convert to skcipher") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-11-28	crypto: arm64/sha2 - integrate OpenSSL implementations of SHA256/SHA512	Ard Biesheuvel
	This integrates both the accelerated scalar and the NEON implementations of SHA-224/256 as well as SHA-384/512 from the OpenSSL project. Relative performance compared to the respective generic C versions: \| SHA256-scalar \| SHA256-NEON* \| SHA512 \| ------------+-----------------+--------------+----------+ Cortex-A53 \| 1.63x \| 1.63x \| 2.34x \| Cortex-A57 \| 1.43x \| 1.59x \| 1.95x \| Cortex-A73 \| 1.26x \| 1.56x \| ? \| The core crypto code was authored by Andy Polyakov of the OpenSSL project, in collaboration with whom the upstream code was adapted so that this module can be built from the same version of sha512-armv8.pl. The version in this patch was taken from OpenSSL commit 32bbb62ea634 ("sha/asm/sha512-armv8.pl: fix big-endian support in __KERNEL__ case.") * The core SHA algorithm is fundamentally sequential, but there is a secondary transformation involved, called the schedule update, which can be performed independently. The NEON version of SHA-224/SHA-256 only implements this part of the algorithm using NEON instructions, the sequential part is always done using scalar instructions. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-12-13	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6	Linus Torvalds
	Pull crypto update from Herbert Xu: - The crypto API is now documented :) - Disallow arbitrary module loading through crypto API. - Allow get request with empty driver name through crypto_user. - Allow speed testing of arbitrary hash functions. - Add caam support for ctr(aes), gcm(aes) and their derivatives. - nx now supports concurrent hashing properly. - Add sahara support for SHA1/256. - Add ARM64 version of CRC32. - Misc fixes. * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (77 commits) crypto: tcrypt - Allow speed testing of arbitrary hash functions crypto: af_alg - add user space interface for AEAD crypto: qat - fix problem with coalescing enable logic crypto: sahara - add support for SHA1/256 crypto: sahara - replace tasklets with kthread crypto: sahara - add support for i.MX53 crypto: sahara - fix spinlock initialization crypto: arm - replace memset by memzero_explicit crypto: powerpc - replace memset by memzero_explicit crypto: sha - replace memset by memzero_explicit crypto: sparc - replace memset by memzero_explicit crypto: algif_skcipher - initialize upon init request crypto: algif_skcipher - removed unneeded code crypto: algif_skcipher - Fixed blocking recvmsg crypto: drbg - use memzero_explicit() for clearing sensitive data crypto: drbg - use MODULE_ALIAS_CRYPTO crypto: include crypto- module prefix in template crypto: user - add MODULE_ALIAS crypto: sha-mb - remove a bogus NULL check crytpo: qat - Fix 64 bytes requests ...
2014-11-20	crypto: crc32 - Add ARM64 CRC32 hw accelerated module	Yazen Ghannam
	This module registers a crc32 algorithm and a crc32c algorithm that use the optional CRC32 and CRC32C instructions in ARMv8. Tested on AMD Seattle. Improvement compared to crc32c-generic algorithm: TCRYPT CRC32C speed test shows ~450% speedup. Simple dd write tests to btrfs filesystem show ~30% speedup. Signed-off-by: Yazen Ghannam <yazen.ghannam@linaro.org> Acked-by: Steve Capper <steve.capper@linaro.org> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-11-06	arm64/crypto: use crypto instructions to generate AES key schedule	Ard Biesheuvel
	This patch implements the AES key schedule generation using ARMv8 Crypto Instructions. It replaces the table based C implementation in aes_generic.ko, which means we can drop the dependency on that module. Tested-by: Steve Capper <steve.capper@linaro.org> Acked-by: Steve Capper <steve.capper@linaro.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-05-14	arm64/crypto: AES-ECB/CBC/CTR/XTS using ARMv8 NEON and Crypto Extensions	Ard Biesheuvel
	This adds ARMv8 implementations of AES in ECB, CBC, CTR and XTS modes, both for ARMv8 with Crypto Extensions and for plain ARMv8 NEON. The Crypto Extensions version can only run on ARMv8 implementations that have support for these optional extensions. The plain NEON version is a table based yet time invariant implementation. All S-box substitutions are performed in parallel, leveraging the wide range of ARMv8's tbl/tbx instructions, and the huge NEON register file, which can comfortably hold the entire S-box and still have room to spare for doing the actual computations. The key expansion routines were borrowed from aes_generic. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-05-14	arm64/crypto: AES in CCM mode using ARMv8 Crypto Extensions	Ard Biesheuvel
	This patch adds support for the AES-CCM encryption algorithm for CPUs that have support for the AES part of the ARM v8 Crypto Extensions. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-05-14	arm64/crypto: AES using ARMv8 Crypto Extensions	Ard Biesheuvel
	This patch adds support for the AES symmetric encryption algorithm for CPUs that have support for the AES part of the ARM v8 Crypto Extensions. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-05-14	arm64/crypto: GHASH secure hash using ARMv8 Crypto Extensions	Ard Biesheuvel
	This is a port to ARMv8 (Crypto Extensions) of the Intel implementation of the GHASH Secure Hash (used in the Galois/Counter chaining mode). It relies on the optional PMULL/PMULL2 instruction (polynomial multiply long, what Intel call carry-less multiply). Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-05-14	arm64/crypto: SHA-224/SHA-256 using ARMv8 Crypto Extensions	Ard Biesheuvel
	This patch adds support for the SHA-224 and SHA-256 Secure Hash Algorithms for CPUs that have support for the SHA-2 part of the ARM v8 Crypto Extensions. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-05-14	arm64/crypto: SHA-1 using ARMv8 Crypto Extensions	Ard Biesheuvel
	This patch adds support for the SHA-1 Secure Hash Algorithm for CPUs that have support for the SHA-1 part of the ARM v8 Crypto Extensions. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au>