diff options
author | Eric Biggers <ebiggers@kernel.org> | 2025-07-18 12:18:59 -0700 |
---|---|---|
committer | Eric Biggers <ebiggers@kernel.org> | 2025-07-20 21:42:34 -0700 |
commit | f88ed14aa0ef64ff5633605114efe313a0bed84b (patch) | |
tree | 1351369fb644f27cd262f89d071d76e6b33c5a24 /scripts/lib/kdoc/kdoc_files.py | |
parent | c76ed8790b3018fe36647d9aae96e0373f321184 (diff) |
lib/crypto: x86/sha1-ni: Minor optimizations and cleanup
- Store the previous state in %xmm8-%xmm9 instead of spilling it to the
stack. There are plenty of unused XMM registers here, so there is no
reason to spill to the stack. (While 32-bit code is limited to
%xmm0-%xmm7, this is 64-bit code, so it's free to use %xmm8-%xmm15.)
- Remove the unnecessary check for nblocks == 0. sha1_ni_transform() is
always passed a positive nblocks.
- To get an XMM register with 'e' in the high dword and the rest zeroes,
just zeroize the register using pxor, then load 'e'. Previously the
code loaded 'e', then zeroized the lower dwords by AND-ing with a
constant, which was slightly less efficient.
- Instead of computing &DATA_PTR[NBLOCKS << 6] and stopping when
DATA_PTR reaches that value, instead just decrement NBLOCKS on each
iteration and stop when it reaches 0. This is fewer instructions.
- Rename DIGEST_PTR to STATE_PTR. It points to the SHA-1 internal
state, not a SHA-1 digest value.
This commit shrinks the code size of sha1_ni_transform() from 624 bytes
to 589 bytes and also shrinks rodata by 16 bytes.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250718191900.42877-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Diffstat (limited to 'scripts/lib/kdoc/kdoc_files.py')
0 files changed, 0 insertions, 0 deletions