OpenCloudOS-Kernel/arch/x86/crypto
Tianjia Zhang a7ee22ee14 crypto: x86/sm4 - add AES-NI/AVX/x86_64 implementation
This patch adds AES-NI/AVX/x86_64 assembler implementation of SM4
block cipher. Through two affine transforms, we can use the AES S-Box
to simulate the SM4 S-Box to achieve the effect of instruction
acceleration.

The main algorithm implementation comes from SM4 AES-NI work by
libgcrypt and Markku-Juhani O. Saarinen at:
https://github.com/mjosaarinen/sm4ni

This optimization supports the four modes of SM4, ECB, CBC, CFB, and
CTR. Since CBC and CFB do not support multiple block parallel
encryption, the optimization effect is not obvious.

Benchmark on Intel Xeon Cascadelake, the data comes from the 218 mode
and 518 mode of tcrypt. The abscissas are blocks of different lengths.
The data is tabulated and the unit is Mb/s:

sm4-generic   |    16      64     128     256    1024    1420    4096
      ECB enc | 40.99   46.50   48.05   48.41   49.20   49.25   49.28
      ECB dec | 41.07   46.99   48.15   48.67   49.20   49.25   49.29
      CBC enc | 37.71   45.28   46.77   47.60   48.32   48.37   48.40
      CBC dec | 36.48   44.82   46.43   47.45   48.23   48.30   48.36
      CFB enc | 37.94   44.84   46.12   46.94   47.57   47.46   47.68
      CFB dec | 37.50   42.84   43.74   44.37   44.85   44.80   44.96
      CTR enc | 39.20   45.63   46.75   47.49   48.09   47.85   48.08
      CTR dec | 39.64   45.70   46.72   47.47   47.98   47.88   48.06
sm4-aesni-avx
      ECB enc | 33.75  134.47  221.64  243.43  264.05  251.58  258.13
      ECB dec | 34.02  134.92  223.11  245.14  264.12  251.04  258.33
      CBC enc | 38.85   46.18   47.67   48.34   49.00   48.96   49.14
      CBC dec | 33.54  131.29  223.88  245.27  265.50  252.41  263.78
      CFB enc | 38.70   46.10   47.58   48.29   49.01   48.94   49.19
      CFB dec | 32.79  128.40  223.23  244.87  265.77  253.31  262.79
      CTR enc | 32.58  122.23  220.29  241.16  259.57  248.32  256.69
      CTR dec | 32.81  122.47  218.99  241.54  258.42  248.58  256.61

Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-07-30 10:58:31 +08:00
..
.gitignore .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
Makefile crypto: x86/sm4 - add AES-NI/AVX/x86_64 implementation 2021-07-30 10:58:31 +08:00
aegis128-aesni-asm.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
aegis128-aesni-glue.c crypto: remove CRYPTO_TFM_RES_BAD_KEY_LEN 2020-01-09 11:30:53 +08:00
aes_ctrby8_avx-x86_64.S crypto: x86 - Remove include/asm/inst.h 2020-07-16 21:49:07 +10:00
aesni-intel_asm.S crypto: x86/aes-ni-xts - rewrite and drop indirections via glue helper 2021-01-08 15:39:47 +11:00
aesni-intel_avx-x86_64.S x86/crypto/aesni-intel_avx: Standardize stack alignment prologue 2021-04-19 12:36:34 -05:00
aesni-intel_glue.c crypto: x86/aes-ni - add missing error checks in XTS code 2021-07-23 14:49:18 +08:00
blake2s-core.S x86: remove always-defined CONFIG_AS_SSSE3 2020-04-09 00:01:59 +09:00
blake2s-glue.c crypto: blake2s - share the "shash" API boilerplate code 2021-01-03 08:41:38 +11:00
blowfish-x86_64-asm_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
blowfish_glue.c crypto: x86/blowfish - drop CTR mode implementation 2021-01-14 17:10:29 +11:00
camellia-aesni-avx-asm_64.S crypto: x86/camellia - drop CTR mode implementation 2021-01-14 17:10:28 +11:00
camellia-aesni-avx2-asm_64.S x86/crypto/camellia-aesni-avx2: Unconditionally allocate stack buffer 2021-04-19 12:36:34 -05:00
camellia-x86_64-asm_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
camellia.h crypto: x86 - use local headers for x86 specific shared declarations 2021-01-14 17:10:30 +11:00
camellia_aesni_avx2_glue.c crypto: x86 - use local headers for x86 specific shared declarations 2021-01-14 17:10:30 +11:00
camellia_aesni_avx_glue.c crypto: x86 - use local headers for x86 specific shared declarations 2021-01-14 17:10:30 +11:00
camellia_glue.c crypto: x86 - use local headers for x86 specific shared declarations 2021-01-14 17:10:30 +11:00
cast5-avx-x86_64-asm_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
cast5_avx_glue.c crypto: x86/cast5 - drop dependency on glue helper 2021-01-14 17:10:29 +11:00
cast6-avx-x86_64-asm_64.S crypto: x86/cast6 - drop CTR mode implementation 2021-01-14 17:10:28 +11:00
cast6_avx_glue.c crypto: x86/cast6 - drop dependency on glue helper 2021-01-14 17:10:29 +11:00
chacha-avx2-x86_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
chacha-avx512vl-x86_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
chacha-ssse3-x86_64.S crypto: x86/chacha-sse3 - use unaligned loads for state array 2020-07-16 21:49:04 +10:00
chacha_glue.c crypto: algapi - Remove skbuff.h inclusion 2020-08-20 14:04:28 +10:00
crc32-pclmul_asm.S crypto: x86 - Remove include/asm/inst.h 2020-07-16 21:49:07 +10:00
crc32-pclmul_glue.c x86: Fix various typos in comments, take #2 2021-03-21 23:50:28 +01:00
crc32c-intel_glue.c crypto: x86/crc32c-intel - Use CRC32 mnemonic 2020-08-21 14:45:28 +10:00
crc32c-pcl-intel-asm_64.S x86/crypto/crc32c-pcl-intel: Standardize jump table 2021-04-19 12:36:34 -05:00
crct10dif-pcl-asm_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
crct10dif-pclmul_glue.c crypto: Convert to new CPU match macros 2020-03-24 21:36:06 +01:00
curve25519-x86_64.c crypto: x86/curve25519 - fix cpu feature checking logic in mod_exit 2021-06-11 15:03:29 +08:00
des3_ede-asm_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
des3_ede_glue.c crypto: x86/des - drop CTR mode implementation 2021-01-14 17:10:28 +11:00
ecb_cbc_helpers.h crypto: x86 - add some helper macros for ECB and CBC modes 2021-01-14 17:10:29 +11:00
ghash-clmulni-intel_asm.S crypto: x86 - Remove include/asm/inst.h 2020-07-16 21:49:07 +10:00
ghash-clmulni-intel_glue.c crypto: Convert to new CPU match macros 2020-03-24 21:36:06 +01:00
glue_helper-asm-avx.S crypto: x86/glue-helper - drop CTR helper routines 2021-01-14 17:10:28 +11:00
glue_helper-asm-avx2.S crypto: x86/glue-helper - drop CTR helper routines 2021-01-14 17:10:28 +11:00
nh-avx2-x86_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
nh-sse2-x86_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
nhpoly1305-avx2-glue.c crypto: algapi - Remove skbuff.h inclusion 2020-08-20 14:04:28 +10:00
nhpoly1305-sse2-glue.c crypto: algapi - Remove skbuff.h inclusion 2020-08-20 14:04:28 +10:00
poly1305-x86_64-cryptogams.pl crypto: x86/poly1305 - Use TEST %reg,%reg instead of CMP $0,%reg 2020-12-04 18:13:16 +11:00
poly1305_glue.c crypto: poly1305 - fix poly1305_core_setkey() declaration 2021-04-02 18:28:12 +11:00
serpent-avx-x86_64-asm_64.S crypto: x86/serpent - drop CTR mode implementation 2021-01-14 17:10:28 +11:00
serpent-avx.h crypto: x86 - use local headers for x86 specific shared declarations 2021-01-14 17:10:30 +11:00
serpent-avx2-asm_64.S crypto: x86/serpent - drop CTR mode implementation 2021-01-14 17:10:28 +11:00
serpent-sse2-i586-asm_32.S x86/asm/32: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 12:03:43 +02:00
serpent-sse2-x86_64-asm_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
serpent-sse2.h crypto: x86 - use local headers for x86 specific shared declarations 2021-01-14 17:10:30 +11:00
serpent_avx2_glue.c crypto: x86 - use local headers for x86 specific shared declarations 2021-01-14 17:10:30 +11:00
serpent_avx_glue.c crypto: x86 - use local headers for x86 specific shared declarations 2021-01-14 17:10:30 +11:00
serpent_sse2_glue.c crypto: x86 - use local headers for x86 specific shared declarations 2021-01-14 17:10:30 +11:00
sha1_avx2_x86_64_asm.S x86/crypto/sha1_avx2: Standardize stack alignment prologue 2021-04-19 12:36:35 -05:00
sha1_ni_asm.S x86/crypto/sha_ni: Standardize stack alignment prologue 2021-04-19 12:36:35 -05:00
sha1_ssse3_asm.S x86: remove always-defined CONFIG_AS_AVX 2020-04-09 00:01:59 +09:00
sha1_ssse3_glue.c crypto: sha - split sha.h into sha1.h and sha2.h 2020-11-20 14:45:33 +11:00
sha256-avx-asm.S x86: remove always-defined CONFIG_AS_AVX 2020-04-09 00:01:59 +09:00
sha256-avx2-asm.S x86/crypto/sha256-avx2: Standardize stack alignment prologue 2021-04-19 12:36:36 -05:00
sha256-ssse3-asm.S crypto: x86/sha - Eliminate casts on asm implementations 2020-01-22 16:21:08 +08:00
sha256_ni_asm.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
sha256_ssse3_glue.c crypto: sha - split sha.h into sha1.h and sha2.h 2020-11-20 14:45:33 +11:00
sha512-avx-asm.S x86/crypto/sha512-avx: Standardize stack alignment prologue 2021-04-19 12:36:36 -05:00
sha512-avx2-asm.S x86/crypto/sha512-avx2: Standardize stack alignment prologue 2021-04-19 12:36:36 -05:00
sha512-ssse3-asm.S x86/crypto/sha512-ssse3: Standardize stack alignment prologue 2021-04-19 12:36:37 -05:00
sha512_ssse3_glue.c crypto: sha - split sha.h into sha1.h and sha2.h 2020-11-20 14:45:33 +11:00
sm4-aesni-avx-asm_64.S crypto: x86/sm4 - add AES-NI/AVX/x86_64 implementation 2021-07-30 10:58:31 +08:00
sm4_aesni_avx_glue.c crypto: x86/sm4 - add AES-NI/AVX/x86_64 implementation 2021-07-30 10:58:31 +08:00
twofish-avx-x86_64-asm_64.S crypto: x86/twofish - drop CTR mode implementation 2021-01-14 17:10:28 +11:00
twofish-i586-asm_32.S x86/asm/32: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 12:03:43 +02:00
twofish-x86_64-asm_64-3way.S x86: Fix various typos in comments, take #2 2021-03-21 23:50:28 +01:00
twofish-x86_64-asm_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
twofish.h crypto: x86 - use local headers for x86 specific shared declarations 2021-01-14 17:10:30 +11:00
twofish_avx_glue.c crypto: x86 - use local headers for x86 specific shared declarations 2021-01-14 17:10:30 +11:00
twofish_glue.c crypto: prefix module autoloading with "crypto-" 2014-11-24 22:43:57 +08:00
twofish_glue_3way.c x86: Fix various typos in comments 2021-03-18 15:31:53 +01:00