summaryrefslogtreecommitdiff
path: root/arch/x86/crypto/aesni-intel_asm.S
AgeCommit message (Collapse)Author
2019-12-11crypto: x86 - Regularize glue function prototypesKees Cook
The crypto glue performed function prototype casting via macros to make indirect calls to assembly routines. Instead of performing casts at the call sites (which trips Control Flow Integrity prototype checking), switch each prototype to a common standard set of arguments which allows the removal of the existing macros. In order to keep pointer math unchanged, internal casting between u128 pointers and u8 pointers is added. Co-developed-by: João Moreira <joao.moreira@intel.com> Signed-off-by: João Moreira <joao.moreira@intel.com> Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-10-18x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_*Jiri Slaby
These are all functions which are invoked from elsewhere, so annotate them as global using the new SYM_FUNC_START and their ENDPROC's by SYM_FUNC_END. Make sure ENTRY/ENDPROC is not defined on X86_64, given these were the last users. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> [hibernate] Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> [xen bits] Acked-by: Herbert Xu <herbert@gondor.apana.org.au> [crypto] Cc: Allison Randal <allison@lohutok.net> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Andy Shevchenko <andy@infradead.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Armijn Hemel <armijn@tjaldur.nl> Cc: Cao jin <caoj.fnst@cn.fujitsu.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Enrico Weigelt <info@metux.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jim Mattson <jmattson@google.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: kvm ML <kvm@vger.kernel.org> Cc: Len Brown <len.brown@intel.com> Cc: linux-arch@vger.kernel.org Cc: linux-crypto@vger.kernel.org Cc: linux-efi <linux-efi@vger.kernel.org> Cc: linux-efi@vger.kernel.org Cc: linux-pm@vger.kernel.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: platform-driver-x86@vger.kernel.org Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: "Steven Rostedt (VMware)" <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Wanpeng Li <wanpengli@tencent.com> Cc: Wei Huang <wei@redhat.com> Cc: x86-ml <x86@kernel.org> Cc: xen-devel@lists.xenproject.org Cc: Xiaoyao Li <xiaoyao.li@linux.intel.com> Link: https://lkml.kernel.org/r/20191011115108.12392-25-jslaby@suse.cz
2019-10-18x86/asm: Annotate aliasesJiri Slaby
_key_expansion_128 is an alias to _key_expansion_256a, __memcpy to memcpy, xen_syscall32_target to xen_sysenter_target, and so on. Annotate them all using the new SYM_FUNC_START_ALIAS, SYM_FUNC_START_LOCAL_ALIAS, and SYM_FUNC_END_ALIAS. This will make the tools generating the debuginfo happy as it avoids nesting and double symbols. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Juergen Gross <jgross@suse.com> [xen parts] Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-arch@vger.kernel.org Cc: linux-crypto@vger.kernel.org Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Cc: xen-devel@lists.xenproject.org Link: https://lkml.kernel.org/r/20191011115108.12392-10-jslaby@suse.cz
2019-10-18x86/asm/crypto: Annotate local functionsJiri Slaby
Use the newly added SYM_FUNC_START_LOCAL to annotate beginnings of all functions which do not have ".globl" annotation, but their endings are annotated by ENDPROC. This is needed to balance ENDPROC for tools that generate debuginfo. These function names are not prepended with ".L" as they might appear in call traces and they wouldn't be visible after such change. To be symmetric, the functions' ENDPROCs are converted to the new SYM_FUNC_END. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-arch@vger.kernel.org Cc: linux-crypto@vger.kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20191011115108.12392-7-jslaby@suse.cz
2019-05-30treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152Thomas Gleixner
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 3029 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-29Merge branch 'linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: - Check for the right CPU feature bit in sm4-ce on arm64. - Fix scatterwalk WARN_ON in aes-gcm-ce on arm64. - Fix unaligned fault in aesni on x86. - Fix potential NULL pointer dereference on exit in chtls. - Fix DMA mapping direction for RSA in caam. - Fix error path return value for xts setkey in caam. - Fix address endianness when DMA unmapping in caam. - Fix sleep-in-atomic in vmx. - Fix command corruption when queue is full in cavium/nitrox. * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: cavium/nitrox - fix for command corruption in queue full case with backlog submissions. crypto: vmx - Fix sleep-in-atomic bugs crypto: arm64/aes-gcm-ce - fix scatterwalk API violation crypto: aesni - Use unaligned loads from gcm_context_data crypto: chtls - fix null dereference chtls_free_uld() crypto: arm64/sm4-ce - check for the right CPU feature bit crypto: caam - fix DMA mapping direction for RSA forms 2 & 3 crypto: caam/qi - fix error path in xts setkey crypto: caam/jr - fix descriptor DMA unmapping
2018-08-25crypto: aesni - Use unaligned loads from gcm_context_dataDave Watson
A regression was reported bisecting to 1476db2d12 "Move HashKey computation from stack to gcm_context". That diff moved HashKey computation from the stack, which was explicitly aligned in the asm, to a struct provided from the C code, depending on AESNI_ALIGN_ATTR for alignment. It appears some compilers may not align this struct correctly, resulting in a crash on the movdqa instruction when attempting to encrypt or decrypt data. Fix by using unaligned loads for the HashKeys. On modern hardware there is no perf difference between the unaligned and aligned loads. All other accesses to gcm_context_data already use unaligned loads. Reported-by: Mauro Rossi <issor.oruam@gmail.com> Fixes: 1476db2d12 ("Move HashKey computation from stack to gcm_context") Cc: <stable@vger.kernel.org> Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-07-03x86/asm/64: Use 32-bit XOR to zero registersJan Beulich
Some Intel CPUs don't recognize 64-bit XORs as zeroing idioms. Zeroing idioms don't require execution bandwidth, as they're being taken care of in the frontend (through register renaming). Use 32-bit XORs instead. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Alok Kataria <akataria@vmware.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: davem@davemloft.net Cc: herbert@gondor.apana.org.au Cc: pavel@ucw.cz Cc: rjw@rjwysocki.net Link: http://lkml.kernel.org/r/5B39FF1A02000078001CFB54@prv1-mh.provo.novell.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-22crypto: aesni - Introduce scatter/gather asm function stubsDave Watson
The asm macros are all set up now, introduce entry points. GCM_INIT and GCM_COMPLETE have arguments supplied, so that the new scatter/gather entry points don't have to take all the arguments, and only the ones they need. Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-02-22crypto: aesni - Add fast path for > 16 byte updateDave Watson
We can fast-path any < 16 byte read if the full message is > 16 bytes, and shift over by the appropriate amount. Usually we are reading > 16 bytes, so this should be faster than the READ_PARTIAL macro introduced in b20209c91e2 for the average case. Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-02-22crypto: aesni - Introduce partial block macroDave Watson
Before this diff, multiple calls to GCM_ENC_DEC will succeed, but only if all calls are a multiple of 16 bytes. Handle partial blocks at the start of GCM_ENC_DEC, and update aadhash as appropriate. The data offset %r11 is also updated after the partial block. Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-02-22crypto: aesni - Move HashKey computation from stack to gcm_contextDave Watson
HashKey computation only needs to happen once per scatter/gather operation, save it between calls in gcm_context struct instead of on the stack. Since the asm no longer stores anything on the stack, we can use %rsp directly, and clean up the frame save/restore macros a bit. Hashkeys actually only need to be calculated once per key and could be moved to when set_key is called, however, the current glue code falls back to generic aes code if fpu is disabled. Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-02-22crypto: aesni - Move ghash_mul to GCM_COMPLETEDave Watson
Prepare to handle partial blocks between scatter/gather calls. For the last partial block, we only want to calculate the aadhash in GCM_COMPLETE, and a new partial block macro will handle both aadhash update and encrypting partial blocks between calls. Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-02-22crypto: aesni - Fill in new context data structuresDave Watson
Fill in aadhash, aadlen, pblocklen, curcount with appropriate values. pblocklen, aadhash, and pblockenckey are also updated at the end of each scatter/gather operation, to be carried over to the next operation. Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-02-22crypto: aesni - Split AAD hash calculation to separate macroDave Watson
AAD hash only needs to be calculated once for each scatter/gather operation. Move it to its own macro, and call it from GCM_INIT instead of INITIAL_BLOCKS. Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-02-22crypto: aesni - Introduce gcm_context_dataDave Watson
Introduce a gcm_context_data struct that will be used to pass context data between scatter/gather update calls. It is passed as the second argument (after crypto keys), other args are renumbered. Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-02-22crypto: aesni - Merge encode and decode to GCM_ENC_DEC macroDave Watson
Make a macro for the main encode/decode routine. Only a small handful of lines differ for enc and dec. This will also become the main scatter/gather update routine. Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-02-22crypto: aesni - Add GCM_COMPLETE macroDave Watson
Merge encode and decode tag calculations in GCM_COMPLETE macro. Scatter/gather routines will call this once at the end of encryption or decryption. Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-02-22crypto: aesni - Add GCM_INIT macroDave Watson
Reduce code duplication by introducting GCM_INIT macro. This macro will also be exposed as a function for implementing scatter/gather support, since INIT only needs to be called once for the full operation. Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-02-22crypto: aesni - Macro-ify func save/restoreDave Watson
Macro-ify function save and restore. These will be used in new functions added for scatter/gather update operations. Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-02-22crypto: aesni - Merge INITIAL_BLOCKS_ENC/DECDave Watson
Use macro operations to merge implemetations of INITIAL_BLOCKS, since they differ by only a small handful of lines. Use macro counter \@ to simplify implementation. Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-01-31Merge branch 'linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - Enforce the setting of keys for keyed aead/hash/skcipher algorithms. - Add multibuf speed tests in tcrypt. Algorithms: - Improve performance of sha3-generic. - Add native sha512 support on arm64. - Add v8.2 Crypto Extentions version of sha3/sm3 on arm64. - Avoid hmac nesting by requiring underlying algorithm to be unkeyed. - Add cryptd_max_cpu_qlen module parameter to cryptd. Drivers: - Add support for EIP97 engine in inside-secure. - Add inline IPsec support to chelsio. - Add RevB core support to crypto4xx. - Fix AEAD ICV check in crypto4xx. - Add stm32 crypto driver. - Add support for BCM63xx platforms in bcm2835 and remove bcm63xx. - Add Derived Key Protocol (DKP) support in caam. - Add Samsung Exynos True RNG driver. - Add support for Exynos5250+ SoCs in exynos PRNG driver" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (166 commits) crypto: picoxcell - Fix error handling in spacc_probe() crypto: arm64/sha512 - fix/improve new v8.2 Crypto Extensions code crypto: arm64/sm3 - new v8.2 Crypto Extensions implementation crypto: arm64/sha3 - new v8.2 Crypto Extensions implementation crypto: testmgr - add new testcases for sha3 crypto: sha3-generic - export init/update/final routines crypto: sha3-generic - simplify code crypto: sha3-generic - rewrite KECCAK transform to help the compiler optimize crypto: sha3-generic - fixes for alignment and big endian operation crypto: aesni - handle zero length dst buffer crypto: artpec6 - remove select on non-existing CRYPTO_SHA384 hwrng: bcm2835 - Remove redundant dev_err call in bcm2835_rng_probe() crypto: stm32 - remove redundant dev_err call in stm32_cryp_probe() crypto: axis - remove unnecessary platform_get_resource() error check crypto: testmgr - test misuse of result in ahash crypto: inside-secure - make function safexcel_try_push_requests static crypto: aes-generic - fix aes-generic regression on powerpc crypto: chelsio - Fix indentation warning crypto: arm64/sha1-ce - get rid of literal pool crypto: arm64/sha2-ce - move the round constant table to .rodata section ...
2018-01-12x86/retpoline/crypto: Convert crypto assembler indirect jumpsDavid Woodhouse
Convert all indirect jumps in crypto assembler code to use non-speculative sequences when CONFIG_RETPOLINE is enabled. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/1515707194-20531-6-git-send-email-dwmw@amazon.co.uk
2017-12-28crypto: aesni - Fix out-of-bounds access of the AAD buffer in generic-gcm-aesniJunaid Shahid
The aesni_gcm_enc/dec functions can access memory after the end of the AAD buffer if the AAD length is not a multiple of 4 bytes. It didn't matter with rfc4106-gcm-aesni as in that case the AAD was always followed by the 8 byte IV, but that is no longer the case with generic-gcm-aesni. This can potentially result in accessing a page that is not mapped and thus causing the machine to crash. This patch fixes that by reading the last <16 byte block of the AAD byte-by-byte and optionally via an 8-byte load if the block was at least 8 bytes. Fixes: 0487ccac ("crypto: aesni - make non-AVX AES-GCM work with any aadlen") Cc: <stable@vger.kernel.org> Signed-off-by: Junaid Shahid <junaids@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-12-28crypto: aesni - Fix out-of-bounds access of the data buffer in generic-gcm-aesniJunaid Shahid
The aesni_gcm_enc/dec functions can access memory before the start of the data buffer if the length of the data buffer is less than 16 bytes. This is because they perform the read via a single 16-byte load. This can potentially result in accessing a page that is not mapped and thus causing the machine to crash. This patch fixes that by reading the partial block byte-by-byte and optionally an via 8-byte load if the block was at least 8 bytes. Fixes: 0487ccac ("crypto: aesni - make non-AVX AES-GCM work with any aadlen") Cc: <stable@vger.kernel.org> Signed-off-by: Junaid Shahid <junaids@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-05-18crypto: aesni - make non-AVX AES-GCM work with all valid auth_tag_lenSabrina Dubroca
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-05-18crypto: aesni - make non-AVX AES-GCM work with any aadlenSabrina Dubroca
This is the first step to make the aesni AES-GCM implementation generic. The current code was written for rfc4106, so it handles only some specific sizes of associated data. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-01-23crypto: x86 - make constants readonly, allow linker to merge themDenys Vlasenko
A lot of asm-optimized routines in arch/x86/crypto/ keep its constants in .data. This is wrong, they should be on .rodata. Mnay of these constants are the same in different modules. For example, 128-bit shuffle mask 0x000102030405060708090A0B0C0D0E0F exists in at least half a dozen places. There is a way to let linker merge them and use just one copy. The rules are as follows: mergeable objects of different sizes should not share sections. You can't put them all in one .rodata section, they will lose "mergeability". GCC puts its mergeable constants in ".rodata.cstSIZE" sections, or ".rodata.cstSIZE.<object_name>" if -fdata-sections is used. This patch does the same: .section .rodata.cst16.SHUF_MASK, "aM", @progbits, 16 It is important that all data in such section consists of 16-byte elements, not larger ones, and there are no implicit use of one element from another. When this is not the case, use non-mergeable section: .section .rodata[.VAR_NAME], "a", @progbits This reduces .data by ~15 kbytes: text data bss dec hex filename 11097415 2705840 2630712 16433967 fac32f vmlinux-prev.o 11112095 2690672 2630712 16433479 fac147 vmlinux.o Merged objects are visible in System.map: ffffffff81a28810 r POLY ffffffff81a28810 r POLY ffffffff81a28820 r TWOONE ffffffff81a28820 r TWOONE ffffffff81a28830 r PSHUFFLE_BYTE_FLIP_MASK <- merged regardless of ffffffff81a28830 r SHUF_MASK <------------- the name difference ffffffff81a28830 r SHUF_MASK ffffffff81a28830 r SHUF_MASK .. ffffffff81a28d00 r K512 <- merged three identical 640-byte tables ffffffff81a28d00 r K512 ffffffff81a28d00 r K512 Use of object names in section name suffixes is not strictly necessary, but might help if someday link stage will use garbage collection to eliminate unused sections (ld --gc-sections). Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> CC: Herbert Xu <herbert@gondor.apana.org.au> CC: Josh Poimboeuf <jpoimboe@redhat.com> CC: Xiaodong Liu <xiaodong.liu@intel.com> CC: Megha Dey <megha.dey@intel.com> CC: linux-crypto@vger.kernel.org CC: x86@kernel.org CC: linux-kernel@vger.kernel.org Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-02-24x86/asm/crypto: Create stack frames in crypto functionsJosh Poimboeuf
The crypto code has several callable non-leaf functions which don't honor CONFIG_FRAME_POINTER, which can result in bad stack traces. Create stack frames for them when CONFIG_FRAME_POINTER is enabled. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Chris J Arges <chris.j.arges@canonical.com> Cc: David S. Miller <davem@davemloft.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Marek <mmarek@suse.cz> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Pedro Alves <palves@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: live-patching@vger.kernel.org Link: http://lkml.kernel.org/r/6c20192bcf1102ae18ae5a242cabf30ce9b29895.1453405861.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-24x86/asm/crypto: Move .Lbswap_mask data to .rodata sectionJosh Poimboeuf
stacktool reports the following warning: stacktool: arch/x86/crypto/aesni-intel_asm.o: _aesni_inc_init(): can't find starting instruction stacktool gets confused when it tries to disassemble the following data in the .text section: .Lbswap_mask: .byte 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0 Move it to .rodata which is a more appropriate section for read-only data. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Chris J Arges <chris.j.arges@canonical.com> Cc: David S. Miller <davem@davemloft.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Marek <mmarek@suse.cz> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Pedro Alves <palves@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: live-patching@vger.kernel.org Link: http://lkml.kernel.org/r/b6a2f3f8bda705143e127c025edb2b53c86e6eb4.1453405861.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-01-14crypto: aesni - Add support for 192 & 256 bit keys to AESNI RFC4106Timothy McCaffrey
These patches fix the RFC4106 implementation in the aesni-intel module so it supports 192 & 256 bit keys. Since the AVX support that was added to this module also only supports 128 bit keys, and this patch only affects the SSE implementation, changes were also made to use the SSE version if key sizes other than 128 are specified. RFC4106 specifies that 192 & 256 bit keys must be supported (section 8.4). Also, this should fix Strongswan issue 341 where the aesni module needs to be unloaded if 256 bit keys are used: http://wiki.strongswan.org/issues/341 This patch has been tested with Sandy Bridge and Haswell processors. With 128 bit keys and input buffers > 512 bytes a slight performance degradation was noticed (~1%). For input buffers of less than 512 bytes there was no performance impact. Compared to 128 bit keys, 256 bit key size performance is approx. .5 cycles per byte slower on Sandy Bridge, and .37 cycles per byte slower on Haswell (vs. SSE code). This patch has also been tested with StrongSwan IPSec connections where it worked correctly. I created this diff from a git clone of crypto-2.6.git. Any questions, please feel free to contact me. Signed-off-by: Timothy McCaffrey <timothy.mccaffrey@unisys.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2013-06-13crypto: aesni_intel - fix accessing of unaligned memoryJussi Kivilinna
The new XTS code for aesni_intel uses input buffers directly as memory operands for pxor instructions, which causes crash if those buffers are not aligned to 16 bytes. Patch changes XTS code to handle unaligned memory correctly, by loading memory with movdqu instead. Reported-by: Dave Jones <davej@redhat.com> Tested-by: Dave Jones <davej@redhat.com> Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2013-04-25crypto: aesni_intel - add more optimized XTS mode for x86-64Jussi Kivilinna
Add more optimized XTS code for aesni_intel in 64-bit mode, for smaller stack usage and boost for speed. tcrypt results, with Intel i5-2450M: 256-bit key enc dec 16B 0.98x 0.99x 64B 0.64x 0.63x 256B 1.29x 1.32x 1024B 1.54x 1.58x 8192B 1.57x 1.60x 512-bit key enc dec 16B 0.98x 0.99x 64B 0.60x 0.59x 256B 1.24x 1.25x 1024B 1.39x 1.42x 8192B 1.38x 1.42x I chose not to optimize smaller than block size of 256 bytes, since XTS is practically always used with data blocks of size 512 bytes. This is why performance is reduced in tcrypt for 64 byte long blocks. Cc: Huang Ying <ying.huang@intel.com> Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2013-01-20crypto: aesni-intel - add ENDPROC statements for assembler functionsJussi Kivilinna
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-05-31crypto: aesni-intel - fix unaligned cbc decrypt for x86-32Mathias Krause
The 32 bit variant of cbc(aes) decrypt is using instructions requiring 128 bit aligned memory locations but fails to ensure this constraint in the code. Fix this by loading the data into intermediate registers with load unaligned instructions. This fixes reported general protection faults related to aesni. References: https://bugzilla.kernel.org/show_bug.cgi?id=43223 Reported-by: Daniel <garkein@mailueberfall.de> Cc: stable@kernel.org [v2.6.39+] Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-03-27crypto: aesni-intel - fixed problem with packets that are not multiple of ↵Tadeusz Struk
64bytes This patch fixes problem with packets that are not multiple of 64bytes. Signed-off-by: Adrian Hoban <adrian.hoban@intel.com> Signed-off-by: Aidan O'Mahony <aidan.o.mahony@intel.com> Signed-off-by: Gabriele Paoloni <gabriele.paoloni@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-03-18x86: Fix common misspellingsLucas De Marchi
They were generated by 'codespell' and then manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi> Cc: trivial@kernel.org LKML-Reference: <1300389856-1099-3-git-send-email-lucas.demarchi@profusion.mobi> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-12-13crypto: aesni-intel - Fixed build with binutils 2.16Tadeusz Struk
This patch fixes the problem with 2.16 binutils. Signed-off-by: Aidan O'Mahony <aidan.o.mahony@intel.com> Signed-off-by: Adrian Hoban <adrian.hoban@intel.com> Signed-off-by: Gabriele Paoloni <gabriele.paoloni@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2010-11-29crypto: aesni-intel - Fixed build error on x86-32Mathias Krause
Exclude AES-GCM code for x86-32 due to heavy usage of 64-bit registers not available on x86-32. While at it, fixed unregister order in aesni_exit(). Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2010-11-27crypto: aesni-intel - Ported implementation to x86-32Mathias Krause
The AES-NI instructions are also available in legacy mode so the 32-bit architecture may profit from those, too. To illustrate the performance gain here's a short summary of a dm-crypt speed test on a Core i7 M620 running at 2.67GHz comparing both assembler implementations: x86: i568 aes-ni delta ECB, 256 bit: 93.8 MB/s 123.3 MB/s +31.4% CBC, 256 bit: 84.8 MB/s 262.3 MB/s +209.3% LRW, 256 bit: 108.6 MB/s 222.1 MB/s +104.5% XTS, 256 bit: 105.0 MB/s 205.5 MB/s +95.7% Additionally, due to some minor optimizations, the 64-bit version also got a minor performance gain as seen below: x86-64: old impl. new impl. delta ECB, 256 bit: 121.1 MB/s 123.0 MB/s +1.5% CBC, 256 bit: 285.3 MB/s 290.8 MB/s +1.9% LRW, 256 bit: 263.7 MB/s 265.3 MB/s +0.6% XTS, 256 bit: 251.1 MB/s 255.3 MB/s +1.7% Signed-off-by: Mathias Krause <minipli@googlemail.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2010-11-13crypto: aesni-intel - RFC4106 AES-GCM Driver Using Intel New InstructionsTadeusz Struk
This patch adds an optimized RFC4106 AES-GCM implementation for 64-bit kernels. It supports 128-bit AES key size. This leverages the crypto AEAD interface type to facilitate a combined AES & GCM operation to be implemented in assembly code. The assembly code leverages Intel(R) AES New Instructions and the PCLMULQDQ instruction. Signed-off-by: Adrian Hoban <adrian.hoban@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Gabriele Paoloni <gabriele.paoloni@intel.com> Signed-off-by: Aidan O'Mahony <aidan.o.mahony@intel.com> Signed-off-by: Erdinc Ozturk <erdinc.ozturk@intel.com> Signed-off-by: James Guilford <james.guilford@intel.com> Signed-off-by: Wajdi Feghali <wajdi.k.feghali@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2010-03-13crypto: aesni-intel - Fix CTR optimization build failure with gas 2.16.1Huang Ying
Andrew Morton reported that AES-NI CTR optimization failed to compile with gas 2.16.1, the error message is as follow: arch/x86/crypto/aesni-intel_asm.S: Assembler messages: arch/x86/crypto/aesni-intel_asm.S:752: Error: suffix or operands invalid for `movq' arch/x86/crypto/aesni-intel_asm.S:753: Error: suffix or operands invalid for `movq' To fix this, a gas macro is defined to assemble movq with 64bit general purpose registers and XMM registers. The macro will generate the raw .byte sequence for needed instructions. Reported-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2010-03-10crypto: aesni-intel - Add AES-NI accelerated CTR modeHuang Ying
To take advantage of the hardware pipeline implementation of AES-NI instructions. CTR mode cryption is implemented in ASM to schedule multiple AES-NI instructions one after another. This way, some latency of AES-NI instruction can be eliminated. Performance testing based on dm-crypt should 50% reduction of ecryption/decryption time. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2009-11-23crypto: aesni-intel - Use gas macro for AES-NI instructionsHuang Ying
Old binutils do not support AES-NI instructions, to make kernel can be compiled by them, .byte code is used instead of AES-NI assembly instructions. But the readability and flexibility of raw .byte code is not good. So corresponding assembly instruction like gas macro is used instead. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2009-06-18crypto: aes-ni - Fix cbc mode IV savingHuang Ying
Original implementation of aesni_cbc_dec do not save IV if input length % 4 == 0. This will make decryption of next block failed. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2009-02-18crypto: aes-ni - Add support to Intel AES-NI instructions for x86_64 platformHuang Ying
Intel AES-NI is a new set of Single Instruction Multiple Data (SIMD) instructions that are going to be introduced in the next generation of Intel processor, as of 2009. These instructions enable fast and secure data encryption and decryption, using the Advanced Encryption Standard (AES), defined by FIPS Publication number 197. The architecture introduces six instructions that offer full hardware support for AES. Four of them support high performance data encryption and decryption, and the other two instructions support the AES key expansion procedure. The white paper can be downloaded from: http://softwarecommunity.intel.com/isn/downloads/intelavx/AES-Instructions-Set_WP.pdf AES may be used in soft_irq context, but MMX/SSE context can not be touched safely in soft_irq context. So in_interrupt() is checked, if in IRQ or soft_irq context, the general x86_64 implementation are used instead. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>