linux-sg2042/arch/x86/crypto
George Spelvin 473946e674 crypto: crc32c-pclmul - Shrink K_table to 32-bit words
There's no need for the K_table to be made of 64-bit words.  For some
reason, the original authors didn't fully reduce the values modulo the
CRC32C polynomial, and so had some 33-bit values in there.  They can
all be reduced to 32 bits.

Doing that cuts the table size in half.  Since the code depends on both
pclmulq and crc32, SSE 4.1 is obviously present, so we can use pmovzxdq
to fetch it in the correct format.

This adds (measured on Ivy Bridge) 1 cycle per main loop iteration
(CRC of up to 3K bytes), less than 0.2%.  The hope is that the reduced
D-cache footprint will make up the loss in other code.

Two other related fixes:
* K_table is read-only, so belongs in .rodata, and
* There's no need for more than 8-byte alignment

Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: George Spelvin <linux@horizon.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-20 21:27:57 +08:00
..
Makefile crypto: sha - SHA1 transform x86_64 AVX2 2014-03-21 21:54:30 +08:00
aes-i586-asm_32.S crypto: x86/aes - assembler clean-ups: use ENTRY/ENDPROC, localize jump targets 2013-01-20 10:16:47 +11:00
aes-x86_64-asm_64.S crypto: x86/aes - assembler clean-ups: use ENTRY/ENDPROC, localize jump targets 2013-01-20 10:16:47 +11:00
aes_glue.c crypto: arch/x86 - cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
aesni-intel_asm.S crypto: aesni_intel - fix accessing of unaligned memory 2013-06-13 14:57:42 +08:00
aesni-intel_avx-x86_64.S crypto: aesni - fix build on x86 (32bit) 2014-01-15 11:36:34 +08:00
aesni-intel_glue.c crypto: aesni - fix build on x86 (32bit) 2013-12-31 19:47:46 +08:00
blowfish-x86_64-asm_64.S crypto: blowfish-x86_64: use ENTRY()/ENDPROC() for assembler functions and localize jump targets 2013-01-20 10:16:48 +11:00
blowfish_glue.c crypto: remove a duplicate checks in __cbc_decrypt() 2014-02-27 05:56:54 +08:00
camellia-aesni-avx-asm_64.S crypto: x86/camellia-aesni-avx - add more optimized XTS code 2013-04-25 21:01:52 +08:00
camellia-aesni-avx2-asm_64.S crypto: camellia-aesni-avx2 - tune assembly code for more performance 2013-06-21 14:44:23 +08:00
camellia-x86_64-asm_64.S crypto: camellia-x86_64/aes-ni: use ENTRY()/ENDPROC() for assembler functions and localize jump targets 2013-01-20 10:16:48 +11:00
camellia_aesni_avx2_glue.c crypto: move x86 to the generic version of ablk_helper 2013-09-24 06:02:24 +10:00
camellia_aesni_avx_glue.c crypto: move x86 to the generic version of ablk_helper 2013-09-24 06:02:24 +10:00
camellia_glue.c crypto: camellia-x86-64 - replace commas by semicolons and adjust code alignment 2013-08-21 21:08:32 +10:00
cast5-avx-x86_64-asm_64.S crypto: cast5-avx: use ENTRY()/ENDPROC() for assembler functions and localize jump targets 2013-01-20 10:16:48 +11:00
cast5_avx_glue.c crypto: remove a duplicate checks in __cbc_decrypt() 2014-02-27 05:56:54 +08:00
cast6-avx-x86_64-asm_64.S crypto: cast6-avx: use new optimized XTS code 2013-04-25 21:01:52 +08:00
cast6_avx_glue.c crypto: move x86 to the generic version of ablk_helper 2013-09-24 06:02:24 +10:00
crc32-pclmul_asm.S x86, crc32-pclmul: Fix build with older binutils 2013-05-30 16:36:23 -07:00
crc32-pclmul_glue.c crypto: crc32 - add crc32 pclmulqdq implementation and wrappers for table implementation 2013-01-20 10:16:45 +11:00
crc32c-intel_glue.c crypto: crc32c - Optimize CRC32C calculation with PCLMULQDQ instruction 2012-10-15 22:18:24 +08:00
crc32c-pcl-intel-asm_64.S crypto: crc32c-pclmul - Shrink K_table to 32-bit words 2014-06-20 21:27:57 +08:00
crct10dif-pcl-asm_64.S Reinstate "crypto: crct10dif - Wrap crc_t10dif function all to use crypto transform framework" 2013-09-07 12:56:26 +10:00
crct10dif-pclmul_glue.c Reinstate "crypto: crct10dif - Wrap crc_t10dif function all to use crypto transform framework" 2013-09-07 12:56:26 +10:00
fpu.c crypto: aesni-intel - Merge with fpu.ko 2011-05-16 15:12:47 +10:00
ghash-clmulni-intel_asm.S crypto: ghash-clmulni-intel - Use u128 instead of be128 for internal key 2014-04-04 21:06:14 +08:00
ghash-clmulni-intel_glue.c crypto: ghash-clmulni-intel - Use u128 instead of be128 for internal key 2014-04-04 21:06:14 +08:00
glue_helper-asm-avx.S crypto: x86 - add more optimized XTS-mode for serpent-avx 2013-04-25 21:01:51 +08:00
glue_helper-asm-avx2.S crypto: twofish - add AVX2/x86_64 assembler implementation of twofish cipher 2013-04-25 21:09:05 +08:00
glue_helper.c crypto: x86 - add more optimized XTS-mode for serpent-avx 2013-04-25 21:01:51 +08:00
salsa20-i586-asm_32.S crypto: x86/salsa20 - assembler cleanup, use ENTRY/ENDPROC for assember functions and rename ECRYPT_* to salsa20_* 2013-01-20 10:16:50 +11:00
salsa20-x86_64-asm_64.S crypto: x86/salsa20 - assembler cleanup, use ENTRY/ENDPROC for assember functions and rename ECRYPT_* to salsa20_* 2013-01-20 10:16:50 +11:00
salsa20_glue.c crypto: x86/salsa20 - assembler cleanup, use ENTRY/ENDPROC for assember functions and rename ECRYPT_* to salsa20_* 2013-01-20 10:16:50 +11:00
serpent-avx-x86_64-asm_64.S crypto: x86 - add more optimized XTS-mode for serpent-avx 2013-04-25 21:01:51 +08:00
serpent-avx2-asm_64.S crypto: serpent - add AVX2/x86_64 assembler implementation of serpent cipher 2013-04-25 21:09:07 +08:00
serpent-sse2-i586-asm_32.S crypto: x86/serpent - use ENTRY/ENDPROC for assember functions and localize jump targets 2013-01-20 10:16:50 +11:00
serpent-sse2-x86_64-asm_64.S crypto: x86/serpent - use ENTRY/ENDPROC for assember functions and localize jump targets 2013-01-20 10:16:50 +11:00
serpent_avx2_glue.c crypto: move x86 to the generic version of ablk_helper 2013-09-24 06:02:24 +10:00
serpent_avx_glue.c crypto: move x86 to the generic version of ablk_helper 2013-09-24 06:02:24 +10:00
serpent_sse2_glue.c crypto: move x86 to the generic version of ablk_helper 2013-09-24 06:02:24 +10:00
sha1_avx2_x86_64_asm.S crypto: x86/sha1 - reduce size of the AVX2 asm implementation 2014-03-25 20:25:43 +08:00
sha1_ssse3_asm.S crypto: x86/sha1 - assembler clean-ups: use ENTRY/ENDPROC 2013-01-20 10:16:51 +11:00
sha1_ssse3_glue.c crypto: x86/sha1 - re-enable the AVX variant 2014-03-25 20:25:42 +08:00
sha256-avx-asm.S crypto: sha256_ssse3 - fix stack corruption with SSSE3 and AVX implementations 2013-05-28 13:46:47 +08:00
sha256-avx2-asm.S crypto: sha256 - Optimized sha256 x86_64 routine using AVX2's RORX instructions 2013-04-03 09:06:32 +08:00
sha256-ssse3-asm.S crypto: sha256_ssse3 - fix stack corruption with SSSE3 and AVX implementations 2013-05-28 13:46:47 +08:00
sha256_ssse3_glue.c crypto: sha256_ssse3 - also test for BMI2 2013-10-07 14:17:10 +08:00
sha512-avx-asm.S crypto: sha512 - Optimized SHA512 x86_64 assembly routine using AVX instructions. 2013-04-25 21:00:58 +08:00
sha512-avx2-asm.S crypto: sha512 - Optimized SHA512 x86_64 assembly routine using AVX2 RORX instruction. 2013-04-25 21:00:58 +08:00
sha512-ssse3-asm.S crypto: sha512 - Optimized SHA512 x86_64 assembly routine using Supplemental SSE3 instructions. 2013-04-25 21:00:58 +08:00
sha512_ssse3_glue.c crypto: sha512_ssse3 - add sha384 support 2013-05-28 15:43:05 +08:00
twofish-avx-x86_64-asm_64.S crypto: x86/twofish-avx - use optimized XTS code 2013-04-25 21:01:51 +08:00
twofish-i586-asm_32.S crypto: x86/twofish - assembler clean-ups: use ENTRY/ENDPROC, localize jump labels 2013-01-20 10:16:51 +11:00
twofish-x86_64-asm_64-3way.S crypto: x86/twofish - assembler clean-ups: use ENTRY/ENDPROC, localize jump labels 2013-01-20 10:16:51 +11:00
twofish-x86_64-asm_64.S crypto: x86/twofish - assembler clean-ups: use ENTRY/ENDPROC, localize jump labels 2013-01-20 10:16:51 +11:00
twofish_avx_glue.c crypto: move x86 to the generic version of ablk_helper 2013-09-24 06:02:24 +10:00
twofish_glue.c crypto: arch/x86 - cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
twofish_glue_3way.c crypto: x86/glue_helper - use le128 instead of u128 for CTR mode 2012-10-24 21:10:54 +08:00