Commit Graph

287 Commits

Author SHA1 Message Date
Tianjia Zhang b45b0a1220 crypto: arm64/sm4 - Fix wrong dependency of NEON/CE implementation
Commit d2825fa936 ("crypto: sm3,sm4 - move into crypto directory")
moved the sm4 library implementation from the lib/crypto directory to
the crypto directory and configured the name as CRYPTO_SM4. The arm64
SM4 NEON/CE implementation depends on this and needs to be modified
uniformly.

Fixes: 4f1aef9b80 ("crypto: arm64/sm4 - add ARMv8 NEON implementation")
Fixes: 5b33e0ec88 ("crypto: arm64/sm4 - add ARMv8 Crypto Extensions implementation")
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-04-15 16:34:30 +08:00
Tianjia Zhang 5b33e0ec88 crypto: arm64/sm4 - add ARMv8 Crypto Extensions implementation
This adds ARMv8 implementations of SM4 in ECB, CBC, CFB and CTR
modes using Crypto Extensions, also includes key expansion operations
because the Crypto Extensions instruction is much faster than software
implementations.

The Crypto Extensions for SM4 can only run on ARMv8 implementations
that have support for these optional extensions.

Benchmark on T-Head Yitian-710 2.75 GHz, the data comes from the 218
mode of tcrypt. The abscissas are blocks of different lengths. The
data is tabulated and the unit is Mb/s:

sm4-generic |     16       64      128      256     1024     1420     4096
    ECB enc |  80.05    91.42    93.66    94.77    95.69    95.77    95.86
    ECB dec |  79.98    91.41    93.64    94.76    95.66    95.77    95.85
    CBC enc |  78.55    86.50    88.02    88.77    89.36    89.42    89.48
    CBC dec |  76.82    89.06    91.52    92.77    93.75    93.83    93.96
    CFB enc |  77.64    86.13    87.62    88.42    89.08    88.83    89.18
    CFB dec |  77.57    88.34    90.36    91.45    92.34    92.00    92.44
    CTR enc |  77.80    88.28    90.23    91.22    92.11    91.81    92.25
    CTR dec |  77.83    88.22    90.22    91.22    92.04    91.82    92.28
sm4-neon
    ECB enc |  28.31   112.77   203.03   209.89   215.49   202.11   210.59
    ECB dec |  28.36   113.45   203.23   210.00   215.52   202.13   210.65
    CBC enc |  79.32    87.02    88.51    89.28    89.85    89.89    89.97
    CBC dec |  28.29   112.20   203.30   209.82   214.99   201.51   209.95
    CFB enc |  79.59    87.16    88.54    89.30    89.83    89.62    89.92
    CFB dec |  28.12   111.05   202.47   209.02   214.21   210.90   209.12
    CTR enc |  28.04   108.81   200.62   206.65   211.78   208.78   206.74
    CTR dec |  28.02   108.82   200.45   206.62   211.78   208.74   206.70
sm4-ce-cipher
    ECB enc | 336.79   587.13   682.70   747.37   803.75   811.52   818.06
    ECB dec | 339.18   584.52   679.72   743.68   798.82   803.83   811.54
    CBC enc | 316.63   521.47   597.00   647.14   690.82   695.21   700.55
    CBC dec | 291.80   503.79   585.66   640.82   689.86   695.16   701.72
    CFB enc | 294.79   482.31   552.13   594.71   631.60   628.91   638.92
    CFB dec | 293.09   466.44   526.56   563.17   594.41   592.26   601.97
    CTR enc | 309.61   506.13   576.86   620.47   656.38   654.51   665.10
    CTR dec | 306.69   505.57   576.84   620.18   657.09   654.52   665.32
sm4-ce
    ECB enc | 366.96  1329.81  2024.29  2755.50  3790.07  3861.91  4051.40
    ECB dec | 367.30  1323.93  2018.72  2747.43  3787.39  3862.55  4052.62
    CBC enc | 358.09   682.68   807.24   885.35   958.29   963.60   973.73
    CBC dec | 366.51  1303.63  1978.64  2667.93  3624.53  3683.41  3856.08
    CFB enc | 351.51   681.26   807.81   893.10   968.54   969.17   985.83
    CFB dec | 354.98  1266.61  1929.63  2634.81  3614.23  3611.59  3841.68
    CTR enc | 324.23  1121.25  1689.44  2256.70  2981.90  3007.79  3060.74
    CTR dec | 324.18  1120.44  1694.31  2258.32  2982.01  3010.09  3060.99

Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-04-08 16:13:29 +08:00
Tianjia Zhang 4f1aef9b80 crypto: arm64/sm4 - add ARMv8 NEON implementation
This adds ARMv8 NEON implementations of SM4 in ECB, CBC, CFB and CTR
modes. This implementation uses the plain NEON instruction set, All
S-BOX substitutions uses the tbl/tbx instructions of ARMv8, combined
with the out-of-order execution in CPU, this optimization supports
encryption of up to 8 blocks at the same time.

The performance of encrypting one block is not as good as software
implementation, so the encryption operations of CBC and CFB still
use pure software algorithms.

Benchmark on T-Head Yitian-710 2.75 GHz, the data comes from the 218
mode of tcrypt. The abscissas are blocks of different lengths. The
data is tabulated and the unit is Mb/s:

sm4-generic |     16       64      128      256     1024     1420     4096
    ECB enc |  80.05    91.42    93.66    94.77    95.69    95.77    95.86
    ECB dec |  79.98    91.41    93.64    94.76    95.66    95.77    95.85
    CBC enc |  78.55    86.50    88.02    88.77    89.36    89.42    89.48
    CBC dec |  76.82    89.06    91.52    92.77    93.75    93.83    93.96
    CFB enc |  77.64    86.13    87.62    88.42    89.08    88.83    89.18
    CFB dec |  77.57    88.34    90.36    91.45    92.34    92.00    92.44
    CTR enc |  77.80    88.28    90.23    91.22    92.11    91.81    92.25
    CTR dec |  77.83    88.22    90.22    91.22    92.04    91.82    92.28
sm4-neon
    ECB enc |  28.31   112.77   203.03   209.89   215.49   202.11   210.59
    ECB dec |  28.36   113.45   203.23   210.00   215.52   202.13   210.65
    CBC enc |  79.32    87.02    88.51    89.28    89.85    89.89    89.97
    CBC dec |  28.29   112.20   203.30   209.82   214.99   201.51   209.95
    CFB enc |  79.59    87.16    88.54    89.30    89.83    89.62    89.92
    CFB dec |  28.12   111.05   202.47   209.02   214.21   210.90   209.12
    CTR enc |  28.04   108.81   200.62   206.65   211.78   208.78   206.74
    CTR dec |  28.02   108.82   200.45   206.62   211.78   208.74   206.70

Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-04-08 16:13:13 +08:00
Tianjia Zhang 02436762f5 crypto: arm64/sm4-ce - rename to sm4-ce-cipher
The subsequent patches of the series will have an implementation
of SM4-ECB/CBC/CFB/CTR accelerated by the CE instruction set, which
conflicts with the current module name. In order to keep the naming
rules of the AES algorithm consistent, the sm4-ce algorithm is
renamed to sm4-ce-cipher.

In addition, the speed of sm4-ce-cipher is better than that of SM4
NEON. By the way, the priority of the algorithm is adjusted to 300,
which is also to leave room for the priority of SM4 NEON.

Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-04-08 16:12:47 +08:00
Jason A. Donenfeld d2825fa936 crypto: sm3,sm4 - move into crypto directory
The lib/crypto libraries live in lib because they are used by various
drivers of the kernel. In contrast, the various helper functions in
crypto are there because they're used exclusively by the crypto API. The
SM3 and SM4 helper functions were erroniously moved into lib/crypto/
instead of crypto/, even though there are no in-kernel users outside of
the crypto API of those functions. This commit moves them into crypto/.

Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-04-08 16:11:48 +08:00
Tom Rix cd6714f940 crypto: arm64 - cleanup comments
For spdx, use // for *.c files

Replacements
significanty to significantly

Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-03-09 15:12:32 +12:00
Ard Biesheuvel dfc6031ec9 crypto: arm64/aes-neonbs-xts - use plain NEON for non-power-of-2 input sizes
Even though the kernel's implementations of AES-XTS were updated to
implement ciphertext stealing and can operate on inputs of any size
larger than or equal to the AES block size, this feature is rarely used
in practice.

In fact, in the kernel, AES-XTS is only used to operate on 4096 or 512
byte blocks, which means that not only the ciphertext stealing is
effectively dead code, the logic in the bit sliced NEON implementation
to deal with fewer than 8 blocks at a time is also never used.

Since the bit-sliced NEON driver already depends on the plain NEON
version, which is slower but can operate on smaller data quantities more
straightforwardly, let's fallback to the plain NEON implementation of
XTS for any residual inputs that are not multiples of 128 bytes. This
allows us to remove a lot of complicated logic that rarely gets
exercised in practice.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-02-05 15:10:51 +11:00
Ard Biesheuvel fc074e1300 crypto: arm64/aes-neonbs-ctr - fallback to plain NEON for final chunk
Instead of processing the entire input with the 8-way bit sliced
algorithm, which is sub-optimal for inputs that are not a multiple of
128 bytes in size, invoke the plain NEON version of CTR for the
remainder of the input after processing the bulk using 128 byte strides.

This allows us to greatly simplify the asm code that implements CTR, and
get rid of all the branches and special code paths. It also gains us a
couple of percent of performance.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-02-05 15:10:51 +11:00
Ard Biesheuvel 8daa399ede crypto: arm64/aes-neon-ctr - improve handling of single tail block
Instead of falling back to C code to do a memcpy of the output of the
last block, handle this in the asm code directly if possible, which is
the case if the entire input is longer than 16 bytes.

Cc: Nathan Huckleberry <nhuck@google.com>
Cc: Eric Biggers <ebiggers@google.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-02-05 15:10:51 +11:00
Tianjia Zhang f3a03d319d crypto: arm64/sm3-ce - make dependent on sm3 library
SM3 generic library is stand-alone implementation, sm3-ce can depend
on the SM3 library instead of sm3-generic.

Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-01-28 16:51:10 +11:00
Mark Brown 9be34be87c arm64: Add macro version of the BTI instruction
BTI is only available from v8.5 so we need to encode it using HINT in
generic code and for older toolchains. Add an assembler macro based on
one written by Mark Rutland which lets us use the mnemonic and update
the existing users.

Suggested-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20211214152714.2380849-2-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-12-14 18:12:58 +00:00
Ard Biesheuvel 898387e40c crypto: arm64/aes-ccm - avoid by-ref argument for ce_aes_ccm_auth_data
With the SIMD code path removed, we can clean up the CCM auth-only path
a bit further, by passing the 'macp' input buffer pointer by value,
rather than by reference, and taking the output value from the
function's return value.

This way, the compiler is no longer forced to allocate macp on the
stack. This is not expected to make any difference in practice, it just
makes for slightly cleaner code.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-09-17 11:05:11 +08:00
Ard Biesheuvel 741691c446 crypto: arm64/aes-ccm - reduce NEON begin/end calls for common case
AES-CCM (as used in WPA2 CCMP, for instance) typically involves
authenticate-only data, and operates on a single network packet, and so
the common case is for the authenticate, en/decrypt and finalize SIMD
helpers to all be called exactly once in sequence. Since
kernel_neon_end() now involves manipulation of the preemption state as
well as the softirq mask state, let's reduce the number of times we are
forced to call it to only once if we are handling this common case.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-09-17 11:05:11 +08:00
Ard Biesheuvel b3482635e5 crypto: arm64/aes-ccm - remove non-SIMD fallback path
AES/CCM on arm64 is implemented as a synchronous AEAD, and so it is
guaranteed by the API that it is only invoked in task or softirq
context. Since softirqs are now only handled when the SIMD is not
being used in the task context that was interrupted to service the
softirq, we no longer need a fallback path. Let's remove it.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-09-17 11:05:11 +08:00
Ard Biesheuvel 36a916af64 crypto: arm64/aes-ccm - yield NEON when processing auth-only data
In SIMD accelerated crypto drivers, we typically yield the SIMD unit
after processing 4 KiB of input, to avoid scheduling blackouts caused by
the fact that claiming the SIMD unit disables preemption as well as
softirq processing.

The arm64 CCM driver does this implicitly for the ciphertext, due to the
fact that the skcipher API never processes more than a single page at a
time. However, the scatterwalk performed by this driver when processing
the authenticate-only data will keep the SIMD unit occupied until it
completes.

So cap the scatterwalk steps to 4 KiB.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-09-17 11:05:10 +08:00
Ard Biesheuvel 676e508122 crypto: arm64/aes-ce - stop using SIMD helper for skciphers
Calls into the skcipher API can only occur from contexts where the SIMD
unit is available, so there is no need for the SIMD helper.

Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-09-17 11:05:10 +08:00
Ard Biesheuvel 96c34e1436 crypto: arm64/aes-neonbs - stop using SIMD helper for skciphers
Calls into the skcipher API can only occur from contexts where the SIMD
unit is available, so there is no need for the SIMD helper.

Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-09-17 11:05:10 +08:00
Ard Biesheuvel b9e699f912 crypto: arm64/gcm-aes-ce - remove non-SIMD fallback path
Now that kernel mode SIMD is guaranteed to be available when executing
in task or softirq context, we no longer need scalar fallbacks to use
when the NEON is unavailable. So get rid of them.

Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-09-17 11:05:10 +08:00
Tianjia Zhang c59de48e12 crypto: arm64/sm4-ce - Make dependent on sm4 library instead of sm4-generic
SM4 library is abstracted from sm4-generic algorithm, sm4-ce can depend on
the SM4 library instead of sm4-generic, and some functions in sm4-generic
do not need to be exported.

Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-07-30 10:58:30 +08:00
Masahiro Yamada 2063257d4b crypto: arm64 - use a pattern rule for generating *.S files
Unify similar build rules.

sha256-core.S opts out it because it is generated from sha512-armv8.pl.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-05-14 19:07:54 +08:00
Masahiro Yamada 12dd461ebd crypto: arm64 - generate *.S by Perl at build time instead of shipping them
Generate *.S by Perl like arch/{mips,x86}/crypto/Makefile.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-05-14 19:07:54 +08:00
Linus Torvalds 31a24ae89c arm64 updates for 5.13:
- MTE asynchronous support for KASan. Previously only synchronous
   (slower) mode was supported. Asynchronous is faster but does not allow
   precise identification of the illegal access.
 
 - Run kernel mode SIMD with softirqs disabled. This allows using NEON in
   softirq context for crypto performance improvements. The conditional
   yield support is modified to take softirqs into account and reduce the
   latency.
 
 - Preparatory patches for Apple M1: handle CPUs that only have the VHE
   mode available (host kernel running at EL2), add FIQ support.
 
 - arm64 perf updates: support for HiSilicon PA and SLLC PMU drivers, new
   functions for the HiSilicon HHA and L3C PMU, cleanups.
 
 - Re-introduce support for execute-only user permissions but only when
   the EPAN (Enhanced Privileged Access Never) architecture feature is
   available.
 
 - Disable fine-grained traps at boot and improve the documented boot
   requirements.
 
 - Support CONFIG_KASAN_VMALLOC on arm64 (only with KASAN_GENERIC).
 
 - Add hierarchical eXecute Never permissions for all page tables.
 
 - Add arm64 prctl(PR_PAC_{SET,GET}_ENABLED_KEYS) allowing user programs
   to control which PAC keys are enabled in a particular task.
 
 - arm64 kselftests for BTI and some improvements to the MTE tests.
 
 - Minor improvements to the compat vdso and sigpage.
 
 - Miscellaneous cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmB5xkkACgkQa9axLQDI
 XvEBgRAAsr6r8gsBQJP3FDHmbtbVf2ej5QJTCOAQAGHbTt0JH7Pk03pWSBr7h5nF
 vsddRDxxeDgB6xd7jWP7EvDaPxHeB0CdSj5gG8EP/ZdOm8sFAwB1ZIHWikgUgSwW
 nu6R28yXTMSj+EkyFtahMhTMJ1EMF4sCPuIgAo59ST5w/UMMqLCJByOu4ej6RPKZ
 aeSJJWaDLBmbgnTKWxRvCc/MgIx4J/LAHWGkdpGjuMK6SLp38Kdf86XcrklXtzwf
 K30ZYeoKq8zZ+nFOsK9gBVlOlocZcbS1jEbN842jD6imb6vKLQtBWrKk9A6o4v5E
 XulORWcSBhkZb3ItIU9+6SmelUExf0VeVlSp657QXYPgquoIIGvFl6rCwhrdGMGO
 bi6NZKCfJvcFZJoIN1oyhuHejgZSBnzGEcvhvzNdg7ItvOCed7q3uXcGHz/OI6tL
 2TZKddzHSEMVfTo0D+RUsYfasZHI1qAiQ0mWVC31c+YHuRuW/K/jlc3a5TXlSBUa
 Dwu0/zzMLiqx65ISx9i7XNMrngk55uzrS6MnwSByPoz4M4xsElZxt3cbUxQ8YAQz
 jhxTHs1Pwes8i7f4n61ay/nHCFbmVvN/LlsPRpZdwd8JumThLrDolF3tc6aaY0xO
 hOssKtnGY4Xvh/WitfJ5uvDb1vMObJKTXQEoZEJh4hlNQDxdeUE=
 =6NGI
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:

 - MTE asynchronous support for KASan. Previously only synchronous
   (slower) mode was supported. Asynchronous is faster but does not
   allow precise identification of the illegal access.

 - Run kernel mode SIMD with softirqs disabled. This allows using NEON
   in softirq context for crypto performance improvements. The
   conditional yield support is modified to take softirqs into account
   and reduce the latency.

 - Preparatory patches for Apple M1: handle CPUs that only have the VHE
   mode available (host kernel running at EL2), add FIQ support.

 - arm64 perf updates: support for HiSilicon PA and SLLC PMU drivers,
   new functions for the HiSilicon HHA and L3C PMU, cleanups.

 - Re-introduce support for execute-only user permissions but only when
   the EPAN (Enhanced Privileged Access Never) architecture feature is
   available.

 - Disable fine-grained traps at boot and improve the documented boot
   requirements.

 - Support CONFIG_KASAN_VMALLOC on arm64 (only with KASAN_GENERIC).

 - Add hierarchical eXecute Never permissions for all page tables.

 - Add arm64 prctl(PR_PAC_{SET,GET}_ENABLED_KEYS) allowing user programs
   to control which PAC keys are enabled in a particular task.

 - arm64 kselftests for BTI and some improvements to the MTE tests.

 - Minor improvements to the compat vdso and sigpage.

 - Miscellaneous cleanups.

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (86 commits)
  arm64/sve: Add compile time checks for SVE hooks in generic functions
  arm64/kernel/probes: Use BUG_ON instead of if condition followed by BUG.
  arm64: pac: Optimize kernel entry/exit key installation code paths
  arm64: Introduce prctl(PR_PAC_{SET,GET}_ENABLED_KEYS)
  arm64: mte: make the per-task SCTLR_EL1 field usable elsewhere
  arm64/sve: Remove redundant system_supports_sve() tests
  arm64: fpsimd: run kernel mode NEON with softirqs disabled
  arm64: assembler: introduce wxN aliases for wN registers
  arm64: assembler: remove conditional NEON yield macros
  kasan, arm64: tests supports for HW_TAGS async mode
  arm64: mte: Report async tag faults before suspend
  arm64: mte: Enable async tag check fault
  arm64: mte: Conditionally compile mte_enable_kernel_*()
  arm64: mte: Enable TCO in functions that can read beyond buffer limits
  kasan: Add report for async mode
  arm64: mte: Drop arch_enable_tagging()
  kasan: Add KASAN mode kernel parameter
  arm64: mte: Add asynchronous mode support
  arm64: Get rid of CONFIG_ARM64_VHE
  arm64: Cope with CPUs stuck in VHE mode
  ...
2021-04-26 10:25:03 -07:00
Ard Biesheuvel 0f19dbc994 crypto: arm64/aes-ce - deal with oversight in new CTR carry code
The new carry handling code in the CTR driver can deal with a carry
occurring in the 4x/5x parallel code path, by using a computed goto to
jump into the carry sequence at the right place as to only apply the
carry to a subset of the blocks being processed.

If the lower half of the counter wraps and ends up at exactly 0x0, a
carry needs to be applied to the counter, but not to the counter values
taken for the 4x/5x parallel sequence. In this case, the computed goto
skips all register assignments, and branches straight to the jump
instruction that gets us back to the fast path. This produces the
correct result, but due to the fact that this branch target does not
carry the correct BTI annotation, this fails when BTI is enabled.

Let's omit the computed goto entirely in this case, and jump straight
back to the fast path after applying the carry to the main counter.

Fixes: 5318d3db46 ("crypto: arm64/aes-ctr - improve tail handling")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-04-16 21:16:31 +10:00
Ard Biesheuvel 13150149aa arm64: fpsimd: run kernel mode NEON with softirqs disabled
Kernel mode NEON can be used in task or softirq context, but only in
a non-nesting manner, i.e., softirq context is only permitted if the
interrupt was not taken at a point where the kernel was using the NEON
in task context.

This means all users of kernel mode NEON have to be aware of this
limitation, and either need to provide scalar fallbacks that may be much
slower (up to 20x for AES instructions) and potentially less safe, or
use an asynchronous interface that defers processing to a later time
when the NEON is guaranteed to be available.

Given that grabbing and releasing the NEON is cheap, we can relax this
restriction, by increasing the granularity of kernel mode NEON code, and
always disabling softirq processing while the NEON is being used in task
context.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210302090118.30666-4-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-04-12 11:55:34 +01:00
Arnd Bergmann 8d195e7a8a crypto: poly1305 - fix poly1305_core_setkey() declaration
gcc-11 points out a mismatch between the declaration and the definition
of poly1305_core_setkey():

lib/crypto/poly1305-donna32.c:13:67: error: argument 2 of type ‘const u8[16]’ {aka ‘const unsigned char[16]’} with mismatched bound [-Werror=array-parameter=]
   13 | void poly1305_core_setkey(struct poly1305_core_key *key, const u8 raw_key[16])
      |                                                          ~~~~~~~~~^~~~~~~~~~~
In file included from lib/crypto/poly1305-donna32.c:11:
include/crypto/internal/poly1305.h:21:68: note: previously declared as ‘const u8 *’ {aka ‘const unsigned char *’}
   21 | void poly1305_core_setkey(struct poly1305_core_key *key, const u8 *raw_key);

This is harmless in principle, as the calling conventions are the same,
but the more specific prototype allows better type checking in the
caller.

Change the declaration to match the actual function definition.
The poly1305_simd_init() is a bit suspicious here, as it previously
had a 32-byte argument type, but looks like it needs to take the
16-byte POLY1305_BLOCK_SIZE array instead.

Fixes: 1c08a10436 ("crypto: poly1305 - add new 32 and 64-bit generic versions")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-04-02 18:28:12 +11:00
Ard Biesheuvel fc754c024a crypto: arm64/crc-t10dif - move NEON yield to C code
Instead of yielding from the bowels of the asm routine if a reschedule
is needed, divide up the input into 4 KB chunks in the C glue. This
simplifies the code substantially, and avoids scheduling out the task
with the asm routine on the call stack, which is undesirable from a
CFI/instrumentation point of view.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-02-10 17:55:58 +11:00
Ard Biesheuvel f0070f4a79 crypto: arm64/aes-ce-mac - simplify NEON yield
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-02-10 17:55:58 +11:00
Ard Biesheuvel f5943ef456 crypto: arm64/aes-neonbs - remove NEON yield calls
There is no need for elaborate yield handling in the bit-sliced NEON
implementation of AES, given that skciphers are naturally bounded by the
size of the chunks returned by the skcipher_walk API. So remove the
yield calls from the asm code.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-02-10 17:55:58 +11:00
Ard Biesheuvel 5f6cb2e617 crypto: arm64/sha512-ce - simplify NEON yield
Instead of calling into kernel_neon_end() and kernel_neon_begin() (and
potentially into schedule()) from the assembler code when running in
task mode and a reschedule is pending, perform only the preempt count
check in assembler, but simply return early in this case, and let the C
code deal with the consequences.

This reverts commit 6caf7adc5e.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-02-10 17:55:58 +11:00
Ard Biesheuvel 9ecc9f31d0 crypto: arm64/sha3-ce - simplify NEON yield
Instead of calling into kernel_neon_end() and kernel_neon_begin() (and
potentially into schedule()) from the assembler code when running in
task mode and a reschedule is pending, perform only the preempt count
check in assembler, but simply return early in this case, and let the C
code deal with the consequences.

This reverts commit 7edc86cb1c.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-02-10 17:55:58 +11:00
Ard Biesheuvel b2eadbf40e crypto: arm64/sha2-ce - simplify NEON yield
Instead of calling into kernel_neon_end() and kernel_neon_begin() (and
potentially into schedule()) from the assembler code when running in
task mode and a reschedule is pending, perform only the preempt count
check in assembler, but simply return early in this case, and let the C
code deal with the consequences.

This reverts commit d82f37ab5e.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-02-10 17:55:57 +11:00
Ard Biesheuvel 5a69e1b73d crypto: arm64/sha1-ce - simplify NEON yield
Instead of calling into kernel_neon_end() and kernel_neon_begin() (and
potentially into schedule()) from the assembler code when running in
task mode and a reschedule is pending, perform only the preempt count
check in assembler, but simply return early in this case, and let the C
code deal with the consequences.

This reverts commit 7df8d16475.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-02-10 17:55:57 +11:00
Ard Biesheuvel 0df07d8117 crypto: arm64/sha - add missing module aliases
The accelerated, instruction based implementations of SHA1, SHA2 and
SHA3 are autoloaded based on CPU capabilities, given that the code is
modest in size, and widely used, which means that resolving the algo
name, loading all compatible modules and picking the one with the
highest priority is taken to be suboptimal.

However, if these algorithms are requested before this CPU feature
based matching and autoloading occurs, these modules are not even
considered, and we end up with suboptimal performance.

So add the missing module aliases for the various SHA implementations.

Cc: <stable@vger.kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-01-22 14:58:01 +11:00
Ard Biesheuvel 5318d3db46 crypto: arm64/aes-ctr - improve tail handling
Counter mode is a stream cipher chaining mode that is typically used
with inputs that are of arbitrarily length, and so a tail block which
is smaller than a full AES block is rule rather than exception.

The current ctr(aes) implementation for arm64 always makes a separate
call into the assembler routine to process this tail block, which is
suboptimal, given that it requires reloading of the AES round keys,
and prevents us from handling this tail block using the 5-way stride
that we use for better performance on deep pipelines.

So let's update the assembler routine so it can handle any input size,
and uses NEON permutation instructions and overlapping loads and stores
to handle the tail block. This results in a ~16% speedup for 1420 byte
blocks on cores with deep pipelines such as ThunderX2.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-01-03 08:41:37 +11:00
Ard Biesheuvel 15deb4333c crypto: arm64/aes-ce - really hide slower algos when faster ones are enabled
Commit 69b6f2e817 ("crypto: arm64/aes-neon - limit exposed routines if
faster driver is enabled") intended to hide modes from the plain NEON
driver that are also implemented by the faster bit sliced NEON one if
both are enabled. However, the defined() CPP function does not detect
if the bit sliced NEON driver is enabled as a module. So instead, let's
use IS_ENABLED() here.

Fixes: 69b6f2e817 ("crypto: arm64/aes-neon - limit exposed routines if ...")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-01-03 08:41:37 +11:00
Eric Biggers a24d22b225 crypto: sha - split sha.h into sha1.h and sha2.h
Currently <crypto/sha.h> contains declarations for both SHA-1 and SHA-2,
and <crypto/sha3.h> contains declarations for SHA-3.

This organization is inconsistent, but more importantly SHA-1 is no
longer considered to be cryptographically secure.  So to the extent
possible, SHA-1 shouldn't be grouped together with any of the other SHA
versions, and usage of it should be phased out.

Therefore, split <crypto/sha.h> into two headers <crypto/sha1.h> and
<crypto/sha2.h>, and make everyone explicitly specify whether they want
the declarations for SHA-1, SHA-2, or both.

This avoids making the SHA-1 declarations visible to files that don't
want anything to do with SHA-1.  It also prepares for potentially moving
sha1.h into a new insecure/ or dangerous/ directory.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-11-20 14:45:33 +11:00
Ard Biesheuvel 3ad99c22ce crypto: arm64/gcm - move authentication tag check to SIMD domain
Instead of copying the calculated authentication tag to memory and
calling crypto_memneq() to verify it, use vector bytewise compare and
min across vector instructions to decide whether the tag is valid. This
is more efficient, and given that the tag is only transiently held in a
NEON register, it is also safer, given that calculated tags for failed
decryptions should be withheld.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-11-20 14:45:32 +11:00
Ard Biesheuvel c4fc6328d6 crypto: arm64/chacha - simplify tail block handling
Based on lessons learnt from optimizing the 32-bit version of this driver,
we can simplify the arm64 version considerably, by reordering the final
two stores when the last block is not a multiple of 64 bytes. This removes
the need to use permutation instructions to calculate the elements that are
clobbered by the final overlapping store, given that the store of the
penultimate block now follows it, and that one carries the correct values
for those elements already.

While at it, simplify the overlapping loads as well, by calculating the
address of the final overlapping load upfront, and switching to this
address for every load that would otherwise extend past the end of the
source buffer.

There is no impact on performance, but the resulting code is substantially
smaller and easier to follow.

Cc: Eric Biggers <ebiggers@google.com>
Cc: "Jason A . Donenfeld" <Jason@zx2c4.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-11-13 20:38:55 +11:00
Ard Biesheuvel 519a0d7e49 crypto: arm64/poly1305-neon - reorder PAC authentication with SP update
PAC pointer authentication signs the return address against the value
of the stack pointer, to prevent stack overrun exploits from corrupting
the control flow. However, this requires that the AUTIASP is issued with
SP holding the same value as it held when the PAC value was generated.
The Poly1305 NEON code got this wrong, resulting in crashes on PAC
capable hardware.

Fixes: f569ca1647 ("crypto: arm64/poly1305 - incorporate OpenSSL/CRYPTOGAMS ...")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-11-06 14:29:11 +11:00
Arvind Sankar 458c0480dc crypto: hash - Use memzero_explicit() for clearing state
Without the barrier_data() inside memzero_explicit(), the compiler may
optimize away the state-clearing if it can tell that the state is not
used afterwards.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-10-30 17:35:03 +11:00
Linus Torvalds 39a5101f98 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "API:
   - Allow DRBG testing through user-space af_alg
   - Add tcrypt speed testing support for keyed hashes
   - Add type-safe init/exit hooks for ahash

  Algorithms:
   - Mark arc4 as obsolete and pending for future removal
   - Mark anubis, khazad, sead and tea as obsolete
   - Improve boot-time xor benchmark
   - Add OSCCA SM2 asymmetric cipher algorithm and use it for integrity

  Drivers:
   - Fixes and enhancement for XTS in caam
   - Add support for XIP8001B hwrng in xiphera-trng
   - Add RNG and hash support in sun8i-ce/sun8i-ss
   - Allow imx-rngc to be used by kernel entropy pool
   - Use crypto engine in omap-sham
   - Add support for Ingenic X1830 with ingenic"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (205 commits)
  X.509: Fix modular build of public_key_sm2
  crypto: xor - Remove unused variable count in do_xor_speed
  X.509: fix error return value on the failed path
  crypto: bcm - Verify GCM/CCM key length in setkey
  crypto: qat - drop input parameter from adf_enable_aer()
  crypto: qat - fix function parameters descriptions
  crypto: atmel-tdes - use semicolons rather than commas to separate statements
  crypto: drivers - use semicolons rather than commas to separate statements
  hwrng: mxc-rnga - use semicolons rather than commas to separate statements
  hwrng: iproc-rng200 - use semicolons rather than commas to separate statements
  hwrng: stm32 - use semicolons rather than commas to separate statements
  crypto: xor - use ktime for template benchmarking
  crypto: xor - defer load time benchmark to a later time
  crypto: hisilicon/zip - fix the uninitalized 'curr_qm_qp_num'
  crypto: hisilicon/zip - fix the return value when device is busy
  crypto: hisilicon/zip - fix zero length input in GZIP decompress
  crypto: hisilicon/zip - fix the uncleared debug registers
  lib/mpi: Fix unused variable warnings
  crypto: x86/poly1305 - Remove assignments with no effect
  hwrng: npcm - modify readl to readb
  ...
2020-10-13 08:50:16 -07:00
Jeremy Linton 39e4716caa crypto: arm64: Use x16 with indirect branch to bti_c
The AES code uses a 'br x7' as part of a function called by
a macro. That branch needs a bti_j as a target. This results
in a panic as seen below. Using x16 (or x17) with an indirect
branch keeps the target bti_c.

  Bad mode in Synchronous Abort handler detected on CPU1, code 0x34000003 -- BTI
  CPU: 1 PID: 265 Comm: cryptomgr_test Not tainted 5.8.11-300.fc33.aarch64 #1
  pstate: 20400c05 (nzCv daif +PAN -UAO BTYPE=j-)
  pc : aesbs_encrypt8+0x0/0x5f0 [aes_neon_bs]
  lr : aesbs_xts_encrypt+0x48/0xe0 [aes_neon_bs]
  sp : ffff80001052b730

  aesbs_encrypt8+0x0/0x5f0 [aes_neon_bs]
   __xts_crypt+0xb0/0x2dc [aes_neon_bs]
   xts_encrypt+0x28/0x3c [aes_neon_bs]
  crypto_skcipher_encrypt+0x50/0x84
  simd_skcipher_encrypt+0xc8/0xe0
  crypto_skcipher_encrypt+0x50/0x84
  test_skcipher_vec_cfg+0x224/0x5f0
  test_skcipher+0xbc/0x120
  alg_test_skcipher+0xa0/0x1b0
  alg_test+0x3dc/0x47c
  cryptomgr_test+0x38/0x60

Fixes: 0e89640b64 ("crypto: arm64 - Use modern annotations for assembly functions")
Cc: <stable@vger.kernel.org> # 5.6.x-
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Suggested-by: Dave P Martin <Dave.Martin@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20201006163326.2780619-1-jeremy.linton@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-10-06 18:14:47 +01:00
Herbert Xu a4cb40f48f crypto: arm64/gcm - Fix endianness warnings
This patch changes a couple u128's to be128 which is the correct
type to use and fixes a few sparse warnings.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-09-04 17:57:15 +10:00
Herbert Xu 0f4128dc4e crypto: arm64/sha - Add declarations for assembly variables
This patch adds declarations for variables only used by assembly
code to silence compiler warnings:

  CC [M]  arch/arm64/crypto/sha1-ce-glue.o
  AS [M]  arch/arm64/crypto/sha1-ce-core.o
  CC [M]  arch/arm64/crypto/sha2-ce-glue.o
  AS [M]  arch/arm64/crypto/sha2-ce-core.o
  CHECK   ../arch/arm64/crypto/sha1-ce-glue.c
  CHECK   ../arch/arm64/crypto/sha2-ce-glue.c
../arch/arm64/crypto/sha1-ce-glue.c:38:11: warning: symbol 'sha1_ce_offsetof_count' was not declared. Should it be static?
../arch/arm64/crypto/sha1-ce-glue.c:39:11: warning: symbol 'sha1_ce_offsetof_finalize' was not declared. Should it be static?
../arch/arm64/crypto/sha2-ce-glue.c:38:11: warning: symbol 'sha256_ce_offsetof_count' was not declared. Should it be static?
../arch/arm64/crypto/sha2-ce-glue.c:40:11: warning: symbol 'sha256_ce_offsetof_finalize' was not declared. Should it be static?

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-09-04 17:57:15 +10:00
Ard Biesheuvel e4f874858c crypto: arm64/gcm - use inline helper to suppress indirect calls
Introduce an inline wrapper for ghash_do_update() that incorporates
the indirect call to the asm routine that is passed as an argument,
and keep the non-SIMD fallback code out of line. This ensures that
all references to the function pointer are inlined where the address
is taken, removing the need for any indirect calls to begin with.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-07-09 22:14:32 +10:00
Ard Biesheuvel 17d0fb1feb crypto: arm64/gcm - use variably sized key struct
Now that the ghash and gcm drivers are split, we no longer need to allocate
a key struct for the former that carries powers of H that are only used by
the latter. Also, take this opportunity to clean up the code a little bit.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-07-09 22:14:32 +10:00
Ard Biesheuvel 94fe4501cd crypto: arm64/gcm - disentangle ghash and gcm setkey() routines
The remaining ghash implementation does not support aggregation, and so
there is no point in including the precomputed powers of H in the key
struct. So move that into the GCM setkey routine, and get rid of the
shared sub-routine entirely.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-07-09 22:14:32 +10:00
Ard Biesheuvel 37b6aab68f crypto: arm64/ghash - drop PMULL based shash
There are two ways to implement SIMD accelerated GCM on arm64:
- using the PMULL instructions for carryless 64x64->128 multiplication,
  in which case the architecture guarantees that the AES instructions are
  available as well, and so we can use the AEAD implementation that combines
  both,
- using the PMULL instructions for carryless 8x8->16 bit multiplication,
  which is implemented as a shash, and can be combined with any ctr(aes)
  implementation by the generic GCM AEAD template driver.

So let's drop the 64x64->128 shash driver, which is never needed for GCM,
and not suitable for use anywhere else.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-07-09 22:14:32 +10:00
Linus Torvalds 81e8c10dac Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "API:
   - Introduce crypto_shash_tfm_digest() and use it wherever possible.
   - Fix use-after-free and race in crypto_spawn_alg.
   - Add support for parallel and batch requests to crypto_engine.

  Algorithms:
   - Update jitter RNG for SP800-90B compliance.
   - Always use jitter RNG as seed in drbg.

  Drivers:
   - Add Arm CryptoCell driver cctrng.
   - Add support for SEV-ES to the PSP driver in ccp"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (114 commits)
  crypto: hisilicon - fix driver compatibility issue with different versions of devices
  crypto: engine - do not requeue in case of fatal error
  crypto: cavium/nitrox - Fix a typo in a comment
  crypto: hisilicon/qm - change debugfs file name from qm_regs to regs
  crypto: hisilicon/qm - add DebugFS for xQC and xQE dump
  crypto: hisilicon/zip - add debugfs for Hisilicon ZIP
  crypto: hisilicon/hpre - add debugfs for Hisilicon HPRE
  crypto: hisilicon/sec2 - add debugfs for Hisilicon SEC
  crypto: hisilicon/qm - add debugfs to the QM state machine
  crypto: hisilicon/qm - add debugfs for QM
  crypto: stm32/crc32 - protect from concurrent accesses
  crypto: stm32/crc32 - don't sleep in runtime pm
  crypto: stm32/crc32 - fix multi-instance
  crypto: stm32/crc32 - fix run-time self test issue.
  crypto: stm32/crc32 - fix ext4 chksum BUG_ON()
  crypto: hisilicon/zip - Use temporary sqe when doing work
  crypto: hisilicon - add device error report through abnormal irq
  crypto: hisilicon - remove codes of directly report device errors through MSI
  crypto: hisilicon - QM memory management optimization
  crypto: hisilicon - unify initial value assignment into QM
  ...
2020-06-01 12:00:10 -07:00
Eric Biggers 2aaba014b5 crypto: lib/sha1 - remove unnecessary includes of linux/cryptohash.h
<linux/cryptohash.h> sounds very generic and important, like it's the
header to include if you're doing cryptographic hashing in the kernel.
But actually it only includes the library implementation of the SHA-1
compression function (not even the full SHA-1).  This should basically
never be used anymore; SHA-1 is no longer considered secure, and there
are much better ways to do cryptographic hashing in the kernel.

Most files that include this header don't actually need it.  So in
preparation for removing it, remove all these unneeded includes of it.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-05-08 15:32:17 +10:00