Commit Graph

183 Commits

Author SHA1 Message Date
Eric Biggers 8dfa20fcfb crypto: ghash - add comment and improve help text
To help avoid confusion, add a comment to ghash-generic.c which explains
the convention that the kernel's implementation of GHASH uses.

Also update the Kconfig help text and module descriptions to call GHASH
a "hash function" rather than a "message digest", since the latter
normally means a real cryptographic hash function, which GHASH is not.

Cc: Pascal Van Leeuwen <pvanleeuwen@verimatrix.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Pascal Van Leeuwen <pvanleeuwen@verimatrix.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-07-27 21:08:38 +10:00
Ard Biesheuvel b46033fdd2 crypto: arm/aes-scalar - unexport en/decryption routines
The scalar table based AES routines are not used by other drivers, so
let's keep it that way and unexport the symbols.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-07-26 14:58:38 +10:00
Ard Biesheuvel 8de6dd3386 crypto: arm/aes-cipher - switch to shared AES inverse Sbox
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-07-26 14:58:37 +10:00
Ard Biesheuvel 0a5dff9882 crypto: arm/ghash - provide a synchronous version
GHASH is used by the GCM mode, which is often used in contexts where
only synchronous ciphers are permitted. So provide a synchronous version
of GHASH based on the existing code. This requires a non-SIMD fallback
to deal with invocations occurring from a context where SIMD instructions
may not be used.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-07-26 14:58:11 +10:00
Ard Biesheuvel e5f050402f crypto: arm/aes-neonbs - provide a synchronous version of ctr(aes)
AES in CTR mode is used by modes such as GCM and CCM, which are often
used in contexts where only synchronous ciphers are permitted. So
provide a synchronous version of ctr(aes) based on the existing code.
This requires a non-SIMD fallback to deal with invocations occurring
from a context where SIMD instructions may not be used. We have a
helper for this now in the AES library, so wire that up.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-07-26 14:58:11 +10:00
Ard Biesheuvel 5eedf315f2 crypto: arm/aes-ce - provide a synchronous version of ctr(aes)
AES in CTR mode is used by modes such as GCM and CCM, which are often
used in contexts where only synchronous ciphers are permitted. So
provide a synchronous version of ctr(aes) based on the existing code.
This requires a non-SIMD fallback to deal with invocations occurring
from a context where SIMD instructions may not be used. We have a
helper for this now in the AES library, so wire that up.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-07-26 14:58:10 +10:00
Ard Biesheuvel fafb1dca6f crypto: arm/aes - use native endiannes for key schedule
Align ARM's hw instruction based AES implementation with other versions
that keep the key schedule in native endianness. This will allow us to
merge the various implementations going forward.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-07-26 14:58:09 +10:00
Ard Biesheuvel aa6e2d2b35 crypto: arm/aes-neonbs - switch to library version of key expansion routine
Switch to the new AES library that also provides an implementation of
the AES key expansion routine. This removes the dependency on the
generic AES cipher, allowing it to be omitted entirely in the future.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-07-26 14:56:04 +10:00
Ard Biesheuvel 724ecd3c0e crypto: aes - rename local routines to prevent future clashes
Rename some local AES encrypt/decrypt routines so they don't clash with
the names we are about to introduce for the routines exposed by the
generic AES library.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-07-26 14:52:03 +10:00
Ard Biesheuvel 20bb4ef038 crypto: arm/aes-ce - cosmetic/whitespace cleanup
Rearrange the aes_algs[] array for legibility.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-07-26 14:52:02 +10:00
Linus Torvalds 4d2fa8b44b Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "Here is the crypto update for 5.3:

  API:
   - Test shash interface directly in testmgr
   - cra_driver_name is now mandatory

  Algorithms:
   - Replace arc4 crypto_cipher with library helper
   - Implement 5 way interleave for ECB, CBC and CTR on arm64
   - Add xxhash
   - Add continuous self-test on noise source to drbg
   - Update jitter RNG

  Drivers:
   - Add support for SHA204A random number generator
   - Add support for 7211 in iproc-rng200
   - Fix fuzz test failures in inside-secure
   - Fix fuzz test failures in talitos
   - Fix fuzz test failures in qat"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (143 commits)
  crypto: stm32/hash - remove interruptible condition for dma
  crypto: stm32/hash - Fix hmac issue more than 256 bytes
  crypto: stm32/crc32 - rename driver file
  crypto: amcc - remove memset after dma_alloc_coherent
  crypto: ccp - Switch to SPDX license identifiers
  crypto: ccp - Validate the the error value used to index error messages
  crypto: doc - Fix formatting of new crypto engine content
  crypto: doc - Add parameter documentation
  crypto: arm64/aes-ce - implement 5 way interleave for ECB, CBC and CTR
  crypto: arm64/aes-ce - add 5 way interleave routines
  crypto: talitos - drop icv_ool
  crypto: talitos - fix hash on SEC1.
  crypto: talitos - move struct talitos_edesc into talitos.h
  lib/scatterlist: Fix mapping iterator when sg->offset is greater than PAGE_SIZE
  crypto/NX: Set receive window credits to max number of CRBs in RxFIFO
  crypto: asymmetric_keys - select CRYPTO_HASH where needed
  crypto: serpent - mark __serpent_setkey_sbox noinline
  crypto: testmgr - dynamically allocate crypto_shash
  crypto: testmgr - dynamically allocate testvec_config
  crypto: talitos - eliminate unneeded 'done' functions at build time
  ...
2019-07-08 20:57:08 -07:00
Thomas Gleixner d2912cb15b treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500
Based on 2 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation #

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 4122 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Enrico Weigelt <info@metux.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-19 17:09:55 +02:00
Eric Biggers 860ab2e502 crypto: chacha - constify ctx and iv arguments
Constify the ctx and iv arguments to crypto_chacha_init() and the
various chacha*_stream_xor() functions.  This makes it clear that they
are not modified.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-06-13 14:31:40 +08:00
Thomas Gleixner 2874c5fd28 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation either version 2 of the license or at
  your option any later version

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 3029 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-30 11:26:32 -07:00
YueHaibing efc77e8107 crypto: arm/sha512 - Make sha512_arm_final static
Fix sparse warning:

arch/arm/crypto/sha512-glue.c:40:5: warning:
 symbol 'sha512_arm_final' was not declared. Should it be static?

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-05-23 14:01:07 +08:00
Eric Biggers 877b5691f2 crypto: shash - remove shash_desc::flags
The flags field in 'struct shash_desc' never actually does anything.
The only ostensibly supported flag is CRYPTO_TFM_REQ_MAY_SLEEP.
However, no shash algorithm ever sleeps, making this flag a no-op.

With this being the case, inevitably some users who can't sleep wrongly
pass MAY_SLEEP.  These would all need to be fixed if any shash algorithm
actually started sleeping.  For example, the shash_ahash_*() functions,
which wrap a shash algorithm with the ahash API, pass through MAY_SLEEP
from the ahash API to the shash API.  However, the shash functions are
called under kmap_atomic(), so actually they're assumed to never sleep.

Even if it turns out that some users do need preemption points while
hashing large buffers, we could easily provide a helper function
crypto_shash_update_large() which divides the data into smaller chunks
and calls crypto_shash_update() and cond_resched() for each chunk.  It's
not necessary to have a flag in 'struct shash_desc', nor is it necessary
to make individual shash algorithms aware of this at all.

Therefore, remove shash_desc::flags, and document that the
crypto_shash_*() functions can be called from any context.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-04-25 15:38:12 +08:00
Eric Biggers 767f015ea0 crypto: arm/aes-neonbs - don't access already-freed walk.iv
If the user-provided IV needs to be aligned to the algorithm's
alignmask, then skcipher_walk_virt() copies the IV into a new aligned
buffer walk.iv.  But skcipher_walk_virt() can fail afterwards, and then
if the caller unconditionally accesses walk.iv, it's a use-after-free.

arm32 xts-aes-neonbs doesn't set an alignmask, so currently it isn't
affected by this despite unconditionally accessing walk.iv.  However
this is more subtle than desired, and it was actually broken prior to
the alignmask being removed by commit cc477bf645 ("crypto: arm/aes -
replace bit-sliced OpenSSL NEON code").  Thus, update xts-aes-neonbs to
start checking the return value of skcipher_walk_virt().

Fixes: e4e7f10bfc ("ARM: add support for bit sliced AES using NEON instructions")
Cc: <stable@vger.kernel.org> # v3.13+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-04-18 22:14:58 +08:00
Eric Biggers 99680c5e91 crypto: arm - convert to use crypto_simd_usable()
Replace all calls to may_use_simd() in the arm crypto code with
crypto_simd_usable(), in order to allow testing the no-SIMD code paths.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-03-22 20:57:27 +08:00
Linus Torvalds 63bdf4284c Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto update from Herbert Xu:
 "API:
   - Add helper for simple skcipher modes.
   - Add helper to register multiple templates.
   - Set CRYPTO_TFM_NEED_KEY when setkey fails.
   - Require neither or both of export/import in shash.
   - AEAD decryption test vectors are now generated from encryption
     ones.
   - New option CONFIG_CRYPTO_MANAGER_EXTRA_TESTS that includes random
     fuzzing.

  Algorithms:
   - Conversions to skcipher and helper for many templates.
   - Add more test vectors for nhpoly1305 and adiantum.

  Drivers:
   - Add crypto4xx prng support.
   - Add xcbc/cmac/ecb support in caam.
   - Add AES support for Exynos5433 in s5p.
   - Remove sha384/sha512 from artpec7 as hardware cannot do partial
     hash"

[ There is a merge of the Freescale SoC tree in order to pull in changes
  required by patches to the caam/qi2 driver. ]

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (174 commits)
  crypto: s5p - add AES support for Exynos5433
  dt-bindings: crypto: document Exynos5433 SlimSSS
  crypto: crypto4xx - add missing of_node_put after of_device_is_available
  crypto: cavium/zip - fix collision with generic cra_driver_name
  crypto: af_alg - use struct_size() in sock_kfree_s()
  crypto: caam - remove redundant likely/unlikely annotation
  crypto: s5p - update iv after AES-CBC op end
  crypto: x86/poly1305 - Clear key material from stack in SSE2 variant
  crypto: caam - generate hash keys in-place
  crypto: caam - fix DMA mapping xcbc key twice
  crypto: caam - fix hash context DMA unmap size
  hwrng: bcm2835 - fix probe as platform device
  crypto: s5p-sss - Use AES_BLOCK_SIZE define instead of number
  crypto: stm32 - drop pointless static qualifier in stm32_hash_remove()
  crypto: chelsio - Fixed Traffic Stall
  crypto: marvell - Remove set but not used variable 'ivsize'
  crypto: ccp - Update driver messages to remove some confusion
  crypto: adiantum - add 1536 and 4096-byte test vectors
  crypto: nhpoly1305 - add a test vector with len % 16 != 0
  crypto: arm/aes-ce - update IV after partial final CTR block
  ...
2019-03-05 09:09:55 -08:00
Eric Biggers 511306b2d0 crypto: arm/aes-ce - update IV after partial final CTR block
Make the arm ctr-aes-ce algorithm update the IV buffer to contain the
next counter after processing a partial final block, rather than leave
it as the last counter.  This makes ctr-aes-ce pass the updated AES-CTR
tests.  This change also makes the code match the arm64 version in
arch/arm64/crypto/aes-modes.S more closely.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-02-22 12:47:27 +08:00
Ard Biesheuvel c643165020 crypto: sha512/arm - fix crash bug in Thumb2 build
The SHA512 code we adopted from the OpenSSL project uses a rather
peculiar way to take the address of the round constant table: it
takes the address of the sha256_block_data_order() routine, and
substracts a constant known quantity to arrive at the base of the
table, which is emitted by the same assembler code right before
the routine's entry point.

However, recent versions of binutils have helpfully changed the
behavior of references emitted via an ADR instruction when running
in Thumb2 mode: it now takes the Thumb execution mode bit into
account, which is bit 0 af the address. This means the produced
table address also has bit 0 set, and so we end up with an address
value pointing 1 byte past the start of the table, which results
in crashes such as

  Unable to handle kernel paging request at virtual address bf825000
  pgd = 42f44b11
  [bf825000] *pgd=80000040206003, *pmd=5f1bd003, *pte=00000000
  Internal error: Oops: 207 [#1] PREEMPT SMP THUMB2
  Modules linked in: sha256_arm(+) sha1_arm_ce sha1_arm ...
  CPU: 7 PID: 396 Comm: cryptomgr_test Not tainted 5.0.0-rc6+ #144
  Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
  PC is at sha256_block_data_order+0xaaa/0xb30 [sha256_arm]
  LR is at __this_module+0x17fd/0xffffe800 [sha256_arm]
  pc : [<bf820bca>]    lr : [<bf824ffd>]    psr: 800b0033
  sp : ebc8bbe8  ip : faaabe1c  fp : 2fdd3433
  r10: 4c5f1692  r9 : e43037df  r8 : b04b0a5a
  r7 : c369d722  r6 : 39c3693e  r5 : 7a013189  r4 : 1580d26b
  r3 : 8762a9b0  r2 : eea9c2cd  r1 : 3e9ab536  r0 : 1dea4ae7
  Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA Thumb  Segment user
  Control: 70c5383d  Table: 6b8467c0  DAC: dbadc0de
  Process cryptomgr_test (pid: 396, stack limit = 0x69e1fe23)
  Stack: (0xebc8bbe8 to 0xebc8c000)
  ...
  unwind: Unknown symbol address bf820bca
  unwind: Index not found bf820bca
  Code: 441a ea80 40f9 440a (f85e) 3b04
  ---[ end trace e560cce92700ef8a ]---

Given that this affects older kernels as well, in case they are built
with a recent toolchain, apply a minimal backportable fix, which is
to emit another non-code label at the start of the routine, and
reference that instead. (This is similar to the current upstream state
of this file in OpenSSL)

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-02-22 12:40:56 +08:00
Ard Biesheuvel 69216a545c crypto: sha256/arm - fix crash bug in Thumb2 build
The SHA256 code we adopted from the OpenSSL project uses a rather
peculiar way to take the address of the round constant table: it
takes the address of the sha256_block_data_order() routine, and
substracts a constant known quantity to arrive at the base of the
table, which is emitted by the same assembler code right before
the routine's entry point.

However, recent versions of binutils have helpfully changed the
behavior of references emitted via an ADR instruction when running
in Thumb2 mode: it now takes the Thumb execution mode bit into
account, which is bit 0 af the address. This means the produced
table address also has bit 0 set, and so we end up with an address
value pointing 1 byte past the start of the table, which results
in crashes such as

  Unable to handle kernel paging request at virtual address bf825000
  pgd = 42f44b11
  [bf825000] *pgd=80000040206003, *pmd=5f1bd003, *pte=00000000
  Internal error: Oops: 207 [#1] PREEMPT SMP THUMB2
  Modules linked in: sha256_arm(+) sha1_arm_ce sha1_arm ...
  CPU: 7 PID: 396 Comm: cryptomgr_test Not tainted 5.0.0-rc6+ #144
  Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
  PC is at sha256_block_data_order+0xaaa/0xb30 [sha256_arm]
  LR is at __this_module+0x17fd/0xffffe800 [sha256_arm]
  pc : [<bf820bca>]    lr : [<bf824ffd>]    psr: 800b0033
  sp : ebc8bbe8  ip : faaabe1c  fp : 2fdd3433
  r10: 4c5f1692  r9 : e43037df  r8 : b04b0a5a
  r7 : c369d722  r6 : 39c3693e  r5 : 7a013189  r4 : 1580d26b
  r3 : 8762a9b0  r2 : eea9c2cd  r1 : 3e9ab536  r0 : 1dea4ae7
  Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA Thumb  Segment user
  Control: 70c5383d  Table: 6b8467c0  DAC: dbadc0de
  Process cryptomgr_test (pid: 396, stack limit = 0x69e1fe23)
  Stack: (0xebc8bbe8 to 0xebc8c000)
  ...
  unwind: Unknown symbol address bf820bca
  unwind: Index not found bf820bca
  Code: 441a ea80 40f9 440a (f85e) 3b04
  ---[ end trace e560cce92700ef8a ]---

Given that this affects older kernels as well, in case they are built
with a recent toolchain, apply a minimal backportable fix, which is
to emit another non-code label at the start of the routine, and
reference that instead. (This is similar to the current upstream state
of this file in OpenSSL)

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-02-22 12:40:56 +08:00
Eric Biggers e7b3ed3380 crypto: arm/crct10dif-ce - cleanup and optimizations
The x86, arm, and arm64 asm implementations of crct10dif are very
difficult to understand partly because many of the comments, labels, and
macros are named incorrectly: the lengths mentioned are usually off by a
factor of two from the actual code.  Many other things are unnecessarily
convoluted as well, e.g. there are many more fold constants than
actually needed and some aren't fully reduced.

This series therefore cleans up all these implementations to be much
more maintainable.  I also made some small optimizations where I saw
opportunities, resulting in slightly better performance.

This patch cleans up the arm version.

(Also moved the constants to .rodata as suggested by Ard Biesheuvel.)

Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-02-08 15:29:48 +08:00
Ard Biesheuvel c03f3cb40b crypto: arm/crct10dif - remove dead code
Remove some code that is no longer called now that we make sure never
to invoke the SIMD routine with less that 16 bytes of input.

Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-02-01 14:45:52 +08:00
Ard Biesheuvel 62fecf295e crypto: arm/crct10dif - revert to C code for short inputs
The SIMD routine ported from x86 used to have a special code path
for inputs < 16 bytes, which got lost somewhere along the way.
Instead, the current glue code aligns the input pointer to permit
the NEON routine to use special versions of the vld1 instructions
that assume 16 byte alignment, but this could result in inputs of
less than 16 bytes to be passed in. This not only fails the new
extended tests that Eric has implemented, it also results in the
code reading past the end of the input, which could potentially
result in crashes when dealing with less than 16 bytes of input
at the end of a page which is followed by an unmapped page.

So update the glue code to only invoke the NEON routine if the
input is at least 16 bytes.

Reported-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Fixes: 1d481f1cd8 ("crypto: arm/crct10dif - port x86 SSE implementation to ARM")
Cc: <stable@vger.kernel.org> # v4.10+
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-02-01 14:44:39 +08:00
Linus Torvalds 668c35f69c Kbuild updates for v4.21
Kbuild core:
  - remove unneeded $(call cc-option,...) switches
  - consolidate Clang compiler flags into CLANG_FLAGS
  - announce the deprecation of SUBDIRS
  - fix single target build for external module
  - simplify the dependencies of 'prepare' stage targets
  - allow fixdep to directly write to .*.cmd files
  - simplify dependency generation for CONFIG_TRIM_UNUSED_KSYMS
  - change if_changed_rule to accept multi-line recipe
  - move .SECONDARY special target to scripts/Kbuild.include
  - remove redundant 'set -e'
  - improve parallel execution for CONFIG_HEADERS_CHECK
  - misc cleanups
 
 Treewide fixes and cleanups
  - set Clang flags correctly for PowerPC boot images
  - fix UML build error with CONFIG_GCC_PLUGINS
  - remove unneeded patterns from .gitignore files
  - refactor firmware/Makefile
  - remove unneeded rules for *offsets.s
  - avoid unneeded regeneration of intermediate .s files
  - clean up ./Kbuild
 
 Modpost:
  - remove unused -M, -K options
  - fix false positive warnings about section mismatch
  - use simple devtable lookup instead of linker magic
  - misc cleanups
 
 Coccinelle:
  - relax boolinit.cocci checks for overall consistency
  - fix warning messages of boolinit.cocci
 
 Other tools:
  - improve -dirty check of scripts/setlocalversion
  - add a tool to generate compile_commands.json from .*.cmd files
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJcJiVeAAoJED2LAQed4NsGXEMP/1fdkBfZGxYlxp4w1tEBeRKZ
 GInxxMfHiHZlu7nFPQ3k1mncczZjeLbnKcLaybnPmqBSfQe1F7tN0ux1sHtWkVfI
 DvqoKaHObZ3lEMEVxCHXQ2bKrN/j9nB2xSxzr4dvc9DscW8dwKElZuKVE7nHKdhl
 z70xsalxHGEGY6hptJrucbv8KTBOSleZ8Gaat79sEDkDSLCTjxXB3WcVMWqDT0M/
 IqN5QCwiPjZC3UCwuqq6+vnG1gyvDUORcbrVgHrBIKxLYAABYzugB5IlLpi8B31C
 ZUDmijSTandHd4SG5gw5uZoyYuK4YhZeI7g4yNyXSqnPurmcJxrGdpiuwRcE7xet
 5yB2uaNbAFO6Fbz4gUdDEvryA9IZEPPn1Z0Btfpp7UOxiWEqE61oHReCNdkiad94
 Oonl+ROZw5UOT3AZD3xCZTf9bTnQoCFccHTmnbaKqqSiDWxxLUXum7sNfnJ6GXEb
 fNQDuwuxtsPStaWoU93QNrGyl5e8jYTKFdphUCnZZ2/hE8ygoQ06PYmeBAk+UKfN
 UI2IqR8PHgmA97XJg+fHhbw6X4nQcPsg/usLqxGItAZNO2O4uXgaN24dOcHG74Bu
 2Dk/gQQB2sVB/Dxfmlaxvvj98MVg7IMtpusAT2bQ7miiSS3EqPFT8KQMZYLICWuZ
 u7QQe20yCho3ZULmsRwH
 =0PCt
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:
 "Kbuild core:
   - remove unneeded $(call cc-option,...) switches
   - consolidate Clang compiler flags into CLANG_FLAGS
   - announce the deprecation of SUBDIRS
   - fix single target build for external module
   - simplify the dependencies of 'prepare' stage targets
   - allow fixdep to directly write to .*.cmd files
   - simplify dependency generation for CONFIG_TRIM_UNUSED_KSYMS
   - change if_changed_rule to accept multi-line recipe
   - move .SECONDARY special target to scripts/Kbuild.include
   - remove redundant 'set -e'
   - improve parallel execution for CONFIG_HEADERS_CHECK
   - misc cleanups

  Treewide fixes and cleanups
   - set Clang flags correctly for PowerPC boot images
   - fix UML build error with CONFIG_GCC_PLUGINS
   - remove unneeded patterns from .gitignore files
   - refactor firmware/Makefile
   - remove unneeded rules for *offsets.s
   - avoid unneeded regeneration of intermediate .s files
   - clean up ./Kbuild

  Modpost:
   - remove unused -M, -K options
   - fix false positive warnings about section mismatch
   - use simple devtable lookup instead of linker magic
   - misc cleanups

  Coccinelle:
   - relax boolinit.cocci checks for overall consistency
   - fix warning messages of boolinit.cocci

  Other tools:
   - improve -dirty check of scripts/setlocalversion
   - add a tool to generate compile_commands.json from .*.cmd files"

* tag 'kbuild-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (51 commits)
  kbuild: remove unused cmd_gentimeconst
  kbuild: remove $(obj)/ prefixes in ./Kbuild
  treewide: add intermediate .s files to targets
  treewide: remove explicit rules for *offsets.s
  firmware: refactor firmware/Makefile
  firmware: remove unnecessary patterns from .gitignore
  scripts: remove unnecessary ihex2fw and check-lc_ctypes from .gitignore
  um: remove unused filechk_gen_header in Makefile
  scripts: add a tool to produce a compile_commands.json file
  kbuild: add -Werror=implicit-int flag unconditionally
  kbuild: add -Werror=strict-prototypes flag unconditionally
  kbuild: add -fno-PIE flag unconditionally
  scripts: coccinelle: Correct warning message
  scripts: coccinelle: only suggest true/false in files that already use them
  kbuild: handle part-of-module correctly for *.ll and *.symtypes
  kbuild: refactor part-of-module
  kbuild: refactor quiet_modtag
  kbuild: remove redundant quiet_modtag for $(obj-m)
  kbuild: refactor Makefile.asm-generic
  user/Makefile: Fix typo and capitalization in comment section
  ...
2018-12-29 12:03:17 -08:00
Masahiro Yamada 8e9b61b293 kbuild: move .SECONDARY special target to Kbuild.include
In commit 54a702f705 ("kbuild: mark $(targets) as .SECONDARY and
remove .PRECIOUS markers"), I missed one important feature of the
.SECONDARY target:

    .SECONDARY with no prerequisites causes all targets to be
    treated as secondary.

... which agrees with the policy of Kbuild.

Let's move it to scripts/Kbuild.include, with no prerequisites.

Note:
If an intermediate file is generated by $(call if_changed,...), you
still need to add it to "targets" so its .*.cmd file is included.

The arm/arm64 crypto files are generated by $(call cmd,shipped),
so they do not need to be added to "targets", but need to be added
to "clean-files" so "make clean" can properly clean them away.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-12-02 14:11:49 +09:00
Eric Biggers 16aae3595a crypto: arm/nhpoly1305 - add NEON-accelerated NHPoly1305
Add an ARM NEON implementation of NHPoly1305, an ε-almost-∆-universal
hash function used in the Adiantum encryption mode.  For now, only the
NH portion is actually NEON-accelerated; the Poly1305 part is less
performance-critical so is just implemented in C.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-11-20 14:26:56 +08:00
Eric Biggers bdb063a79f crypto: arm/chacha - add XChaCha12 support
Now that the 32-bit ARM NEON implementation of ChaCha20 and XChaCha20
has been refactored to support varying the number of rounds, add support
for XChaCha12.  This is identical to XChaCha20 except for the number of
rounds, which is 12 instead of 20.

XChaCha12 is faster than XChaCha20 but has a lower security margin,
though still greater than AES-256's since the best known attacks make it
through only 7 rounds.  See the patch "crypto: chacha - add XChaCha12
support" for more details about why we need XChaCha12 support.

Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-11-20 14:26:56 +08:00
Eric Biggers 3cc215198e crypto: arm/chacha20 - refactor to allow varying number of rounds
In preparation for adding XChaCha12 support, rename/refactor the NEON
implementation of ChaCha20 to support different numbers of rounds.

Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-11-20 14:26:56 +08:00
Eric Biggers d97a94309d crypto: arm/chacha20 - add XChaCha20 support
Add an XChaCha20 implementation that is hooked up to the ARM NEON
implementation of ChaCha20.  This is needed for use in the Adiantum
encryption mode; see the generic code patch,
"crypto: chacha20-generic - add XChaCha20 support", for more details.

We also update the NEON code to support HChaCha20 on one block, so we
can use that in XChaCha20 rather than calling the generic HChaCha20.
This required factoring the permutation out into its own macro.

Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-11-20 14:26:56 +08:00
Eric Biggers be2830b15b crypto: arm/chacha20 - limit the preemption-disabled section
To improve responsivesess, disable preemption for each step of the walk
(which is at most PAGE_SIZE) rather than for the entire
encryption/decryption operation.

Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-11-20 14:26:56 +08:00
Eric Biggers 1ca1b91794 crypto: chacha20-generic - refactor to allow varying number of rounds
In preparation for adding XChaCha12 support, rename/refactor
chacha20-generic to support different numbers of rounds.  The
justification for needing XChaCha12 support is explained in more detail
in the patch "crypto: chacha - add XChaCha12 support".

The only difference between ChaCha{8,12,20} are the number of rounds
itself; all other parts of the algorithm are the same.  Therefore,
remove the "20" from all definitions, structures, functions, files, etc.
that will be shared by all ChaCha versions.

Also make ->setkey() store the round count in the chacha_ctx (previously
chacha20_ctx).  The generic code then passes the round count through to
chacha_block().  There will be a ->setkey() function for each explicitly
allowed round count; the encrypt/decrypt functions will be the same.  I
decided not to do it the opposite way (same ->setkey() function for all
round counts, with different encrypt/decrypt functions) because that
would have required more boilerplate code in architecture-specific
implementations of ChaCha and XChaCha.

Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Martin Willi <martin@strongswan.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-11-20 14:26:55 +08:00
Brajeswar Ghosh d65ddecbea crypto: aes-ce - Remove duplicate header
Remove asm/hwcap.h which is included more than once

Signed-off-by: Brajeswar Ghosh <brajeswar.linux@gmail.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-11-16 14:09:40 +08:00
Eric Biggers 913a3aa07d crypto: arm/aes - add some hardening against cache-timing attacks
Make the ARM scalar AES implementation closer to constant-time by
disabling interrupts and prefetching the tables into L1 cache.  This is
feasible because due to ARM's "free" rotations, the main tables are only
1024 bytes instead of the usual 4096 used by most AES implementations.

On ARM Cortex-A7, the speed loss is only about 5%.  The resulting code
is still over twice as fast as aes_ti.c.  Responsiveness is potentially
a concern, but interrupts are only disabled for a single AES block.

Note that even after these changes, the implementation still isn't
necessarily guaranteed to be constant-time; see
https://cr.yp.to/antiforgery/cachetiming-20050414.pdf for a discussion
of the many difficulties involved in writing truly constant-time AES
software.  But it's valuable to make such attacks more difficult.

Much of this patch is based on patches suggested by Ard Biesheuvel.

Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-11-09 17:36:48 +08:00
Stefan Agner cd560235d8 crypto: arm/crc32 - avoid warning when compiling with Clang
The table id (second) argument to MODULE_DEVICE_TABLE is often
referenced otherwise. This is not the case for CPU features. This
leads to a warning when building the kernel with Clang:
  arch/arm/crypto/crc32-ce-glue.c:239:33: warning: variable
    'crc32_cpu_feature' is not needed and will not be emitted
    [-Wunneeded-internal-declaration]
  static const struct cpu_feature crc32_cpu_feature[] = {
                                  ^

Avoid warnings by using __maybe_unused, similar to commit 1f318a8baf
("modules: mark __inittest/__exittest as __maybe_unused").

Fixes: 2a9faf8b7e ("crypto: arm/crc32 - enable module autoloading based on CPU feature bits")
Signed-off-by: Stefan Agner <stefan@agner.ch>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-09-21 13:24:52 +08:00
Eric Biggers a1b22a5f45 crypto: arm/chacha20 - faster 8-bit rotations and other optimizations
Optimize ChaCha20 NEON performance by:

- Implementing the 8-bit rotations using the 'vtbl.8' instruction.
- Streamlining the part that adds the original state and XORs the data.
- Making some other small tweaks.

On ARM Cortex-A7, these optimizations improve ChaCha20 performance from
about 12.08 cycles per byte to about 11.37 -- a 5.9% improvement.

There is a tradeoff involved with the 'vtbl.8' rotation method since
there is at least one CPU (Cortex-A53) where it's not fastest.  But it
seems to be a better default; see the added comment.  Overall, this
patch reduces Cortex-A53 performance by less than 0.5%.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-09-04 11:37:05 +08:00
Ard Biesheuvel 00227e3a1d crypto: arm/ghash-ce - implement support for 4-way aggregation
Speed up the GHASH algorithm based on 64-bit polynomial multiplication
by adding support for 4-way aggregation. This improves throughput by
~85% on Cortex-A53, from 1.7 cycles per byte to 0.9 cycles per byte.

When combined with AES into GCM, throughput improves by ~25%, from
3.8 cycles per byte to 3.0 cycles per byte.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-09-04 11:37:04 +08:00
Jason A. Donenfeld 578bdaabd0 crypto: speck - remove Speck
These are unused, undesired, and have never actually been used by
anybody. The original authors of this code have changed their mind about
its inclusion. While originally proposed for disk encryption on low-end
devices, the idea was discarded [1] in favor of something else before
that could really get going. Therefore, this patch removes Speck.

[1] https://marc.info/?l=linux-crypto-vger&m=153359499015659

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Eric Biggers <ebiggers@google.com>
Cc: stable@vger.kernel.org
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-09-04 11:35:03 +08:00
Arnd Bergmann 3723c63247 treewide: convert ISO_8859-1 text comments to utf-8
Almost all files in the kernel are either plain text or UTF-8 encoded.  A
couple however are ISO_8859-1, usually just a few characters in a C
comments, for historic reasons.

This converts them all to UTF-8 for consistency.

Link: http://lkml.kernel.org/r/20180724111600.4158975-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Simon Horman <horms@verge.net.au>			[IPVS portion]
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	[IIO]
Acked-by: Michael Ellerman <mpe@ellerman.id.au>			[powerpc]
Acked-by: Rob Herring <robh@kernel.org>
Cc: Joe Perches <joe@perches.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Samuel Ortiz <sameo@linux.intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Rob Herring <robh+dt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-23 18:48:43 -07:00
Eric Biggers 4e34e51f48 crypto: arm/chacha20 - always use vrev for 16-bit rotates
The 4-way ChaCha20 NEON code implements 16-bit rotates with vrev32.16,
but the one-way code (used on remainder blocks) implements it with
vshl + vsri, which is slower.  Switch the one-way code to vrev32.16 too.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-08-03 18:06:05 +08:00
Herbert Xu c5f5aeef9b Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux
Merge mainline to pick up c7513c2a27 ("crypto/arm64: aes-ce-gcm -
add missing kernel_neon_begin/end pair").
2018-08-03 17:55:12 +08:00
Eric Biggers c87a405e3b crypto: ahash - remove useless setting of cra_type
Some ahash algorithms set .cra_type = &crypto_ahash_type.  But this is
redundant with the C structure type ('struct ahash_alg'), and
crypto_register_ahash() already sets the .cra_type automatically.
Apparently the useless assignment has just been copy+pasted around.

So, remove the useless assignment from all the ahash algorithms.

This patch shouldn't change any actual behavior.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-07-09 00:30:26 +08:00
Eric Biggers 6a38f62245 crypto: ahash - remove useless setting of type flags
Many ahash algorithms set .cra_flags = CRYPTO_ALG_TYPE_AHASH.  But this
is redundant with the C structure type ('struct ahash_alg'), and
crypto_register_ahash() already sets the type flag automatically,
clearing any type flag that was already there.  Apparently the useless
assignment has just been copy+pasted around.

So, remove the useless assignment from all the ahash algorithms.

This patch shouldn't change any actual behavior.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-07-09 00:30:25 +08:00
Eric Biggers e50944e219 crypto: shash - remove useless setting of type flags
Many shash algorithms set .cra_flags = CRYPTO_ALG_TYPE_SHASH.  But this
is redundant with the C structure type ('struct shash_alg'), and
crypto_register_shash() already sets the type flag automatically,
clearing any type flag that was already there.  Apparently the useless
assignment has just been copy+pasted around.

So, remove the useless assignment from all the shash algorithms.

This patch shouldn't change any actual behavior.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-07-09 00:30:24 +08:00
Eric Biggers a068b94d74 crypto: arm/speck - fix building in Thumb2 mode
Building the kernel with CONFIG_THUMB2_KERNEL=y and
CONFIG_CRYPTO_SPECK_NEON set fails with the following errors:

    arch/arm/crypto/speck-neon-core.S: Assembler messages:

    arch/arm/crypto/speck-neon-core.S:419: Error: r13 not allowed here -- `bic sp,#0xf'
    arch/arm/crypto/speck-neon-core.S:423: Error: r13 not allowed here -- `bic sp,#0xf'
    arch/arm/crypto/speck-neon-core.S:427: Error: r13 not allowed here -- `bic sp,#0xf'
    arch/arm/crypto/speck-neon-core.S:431: Error: r13 not allowed here -- `bic sp,#0xf'

The problem is that the 'bic' instruction can't operate on the 'sp'
register in Thumb2 mode.  Fix it by using a temporary register.  This
isn't in the main loop, so the performance difference is negligible.
This also matches what aes-neonbs-core.S does.

Reported-by: Stefan Agner <stefan@agner.ch>
Fixes: ede9622162 ("crypto: arm/speck - add NEON-accelerated implementation of Speck-XTS")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-07-01 23:31:46 +08:00
Adam Langley c2e415fe75 crypto: clarify licensing of OpenSSL asm code
Several source files have been taken from OpenSSL. In some of them a
comment that "permission to use under GPL terms is granted" was
included below a contradictory license statement. In several cases,
there was no indication that the license of the code was compatible
with the GPLv2.

This change clarifies the licensing for all of these files. I've
confirmed with the author (Andy Polyakov) that a) he has licensed the
files with the GPLv2 comment under that license and b) that he's also
happy to license the other files under GPLv2 too. In one case, the
file is already contained in his CRYPTOGAMS bundle, which has a GPLv2
option, and so no special measures are needed.

In all cases, the license status of code has been clarified by making
the GPLv2 license prominent.

The .S files have been regenerated from the updated .pl files.

This is a comment-only change. No code is changed.

Signed-off-by: Adam Langley <agl@chromium.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-05-31 00:13:44 +08:00
Masahiro Yamada 54a702f705 kbuild: mark $(targets) as .SECONDARY and remove .PRECIOUS markers
GNU Make automatically deletes intermediate files that are updated
in a chain of pattern rules.

Example 1) %.dtb.o <- %.dtb.S <- %.dtb <- %.dts
Example 2) %.o <- %.c <- %.c_shipped

A couple of makefiles mark such targets as .PRECIOUS to prevent Make
from deleting them, but the correct way is to use .SECONDARY.

  .SECONDARY
    Prerequisites of this special target are treated as intermediate
    files but are never automatically deleted.

  .PRECIOUS
    When make is interrupted during execution, it may delete the target
    file it is updating if the file was modified since make started.
    If you mark the file as precious, make will never delete the file
    if interrupted.

Both can avoid deletion of intermediate files, but the difference is
the behavior when Make is interrupted; .SECONDARY deletes the target,
but .PRECIOUS does not.

The use of .PRECIOUS is relatively rare since we do not want to keep
partially constructed (possibly corrupted) targets.

Another difference is that .PRECIOUS works with pattern rules whereas
.SECONDARY does not.

  .PRECIOUS: $(obj)/%.lex.c

works, but

  .SECONDARY: $(obj)/%.lex.c

has no effect.  However, for the reason above, I do not want to use
.PRECIOUS which could cause obscure build breakage.

The targets specified as .SECONDARY must be explicit.  $(targets)
contains all targets that need to include .*.cmd files.  So, the
intermediates you want to keep are mostly in there.  Therefore, mark
$(targets) as .SECONDARY.  It means primary targets are also marked
as .SECONDARY, but I do not see any drawback for this.

I replaced some .SECONDARY / .PRECIOUS markers with 'targets'.  This
will make Kbuild search for non-existing .*.cmd files, but this is
not a noticeable performance issue.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Frank Rowand <frowand.list@gmail.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
2018-04-07 19:04:02 +09:00
Leonard Crestez 6aaf49b495 crypto: arm,arm64 - Fix random regeneration of S_shipped
The decision to rebuild .S_shipped is made based on the relative
timestamps of .S_shipped and .pl files but git makes this essentially
random. This means that the perl script might run anyway (usually at
most once per checkout), defeating the whole purpose of _shipped.

Fix by skipping the rule unless explicit make variables are provided:
REGENERATE_ARM_CRYPTO or REGENERATE_ARM64_CRYPTO.

This can produce nasty occasional build failures downstream, for example
for toolchains with broken perl. The solution is minimally intrusive to
make it easier to push into stable.

Another report on a similar issue here: https://lkml.org/lkml/2018/3/8/1379

Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Cc: <stable@vger.kernel.org>
Reviewed-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-03-23 23:43:19 +08:00
Eric Biggers ede9622162 crypto: arm/speck - add NEON-accelerated implementation of Speck-XTS
Add an ARM NEON-accelerated implementation of Speck-XTS.  It operates on
128-byte chunks at a time, i.e. 8 blocks for Speck128 or 16 blocks for
Speck64.  Each 128-byte chunk goes through XTS preprocessing, then is
encrypted/decrypted (doing one cipher round for all the blocks, then the
next round, etc.), then goes through XTS postprocessing.

The performance depends on the processor but can be about 3 times faster
than the generic code.  For example, on an ARMv7 processor we observe
the following performance with Speck128/256-XTS:

    xts-speck128-neon:     Encryption 107.9 MB/s, Decryption 108.1 MB/s
    xts(speck128-generic): Encryption  32.1 MB/s, Decryption  36.6 MB/s

In comparison to AES-256-XTS without the Cryptography Extensions:

    xts-aes-neonbs:        Encryption  41.2 MB/s, Decryption  36.7 MB/s
    xts(aes-asm):          Encryption  31.7 MB/s, Decryption  30.8 MB/s
    xts(aes-generic):      Encryption  21.2 MB/s, Decryption  20.9 MB/s

Speck64/128-XTS is even faster:

    xts-speck64-neon:      Encryption 138.6 MB/s, Decryption 139.1 MB/s

Note that as with the generic code, only the Speck128 and Speck64
variants are supported.  Also, for now only the XTS mode of operation is
supported, to target the disk and file encryption use cases.  The NEON
code also only handles the portion of the data that is evenly divisible
into 128-byte chunks, with any remainder handled by a C fallback.  Of
course, other modes of operation could be added later if needed, and/or
the NEON code could be updated to handle other buffer sizes.

The XTS specification is only defined for AES which has a 128-bit block
size, so for the GF(2^64) math needed for Speck64-XTS we use the
reducing polynomial 'x^64 + x^4 + x^3 + x + 1' given by the original XEX
paper.  Of course, when possible users should use Speck128-XTS, but even
that may be too slow on some processors; Speck64-XTS can be faster.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-02-22 22:16:55 +08:00
Jinbum Park 4ff8b1dd81 crypto: arm/aes-cipher - move S-box to .rodata section
Move the AES inverse S-box to the .rodata section
where it is safe from abuse by speculation.

Signed-off-by: Jinbum Park <jinb.park7@gmail.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-02-22 22:16:19 +08:00
Eric Biggers a208fa8f33 crypto: hash - annotate algorithms taking optional key
We need to consistently enforce that keyed hashes cannot be used without
setting the key.  To do this we need a reliable way to determine whether
a given hash algorithm is keyed or not.  AF_ALG currently does this by
checking for the presence of a ->setkey() method.  However, this is
actually slightly broken because the CRC-32 algorithms implement
->setkey() but can also be used without a key.  (The CRC-32 "key" is not
actually a cryptographic key but rather represents the initial state.
If not overridden, then a default initial state is used.)

Prepare to fix this by introducing a flag CRYPTO_ALG_OPTIONAL_KEY which
indicates that the algorithm has a ->setkey() method, but it is not
required to be called.  Then set it on all the CRC-32 algorithms.

The same also applies to the Adler-32 implementation in Lustre.

Also, the cryptd and mcryptd templates have to pass through the flag
from their underlying algorithm.

Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-01-12 23:03:35 +11:00
Gomonovych, Vasyl 26d85e5f3b crypto: arm/aes-neonbs - Use PTR_ERR_OR_ZERO()
Fix ptr_ret.cocci warnings:
arch/arm/crypto/aes-neonbs-glue.c:184:1-3: WARNING: PTR_ERR_OR_ZERO can be used
arch/arm/crypto/aes-neonbs-glue.c:261:1-3: WARNING: PTR_ERR_OR_ZERO can be used

Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR

Generated by: scripts/coccinelle/api/ptr_ret.cocci

Signed-off-by: Vasyl Gomonovych <gomonovych@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-12-11 22:36:56 +11:00
Greg Kroah-Hartman b24413180f License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier.  The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
 - file had no licensing information it it.
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne.  Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed.  Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained >5
   lines of source
 - File already had some variant of a license header in it (even if <5
   lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

 - when both scanners couldn't find any license traces, file was
   considered to have no license information in it, and the top level
   COPYING file license applied.

   For non */uapi/* files that summary was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH
   Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one
   of the */uapi/* ones, it was denoted with the Linux-syscall-note if
   any GPL family license was found in the file or had no licensing in
   it (per prior point).  Results summary:

   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became
   the concluded license(s).

 - when there was disagreement between the two scanners (one detected a
   license but the other didn't, or they both detected different
   licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file
   resulted in a clear resolution of the license that should apply (and
   which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was
   confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier,
   the file was flagged for further research and to be revisited later
   in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights.  The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
 - a full scancode scan run, collecting the matched texts, detected
   license ids and scores
 - reviewing anything where there was a license detected (about 500+
   files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license
   was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
   SPDX license was correct

This produced a worksheet with 20 files needing minor correction.  This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg.  Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected.  This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.)  Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02 11:10:55 +01:00
Ard Biesheuvel 0d149ce67d crypto: arm/aes - avoid expanded lookup tables in the final round
For the final round, avoid the expanded and padded lookup tables
exported by the generic AES driver. Instead, for encryption, we can
perform byte loads from the same table we used for the inner rounds,
which will still be hot in the caches. For decryption, use the inverse
AES Sbox directly, which is 4x smaller than the inverse lookup table
exported by the generic driver.

This should significantly reduce the Dcache footprint of our code,
which makes the code more robust against timing attacks. It does not
introduce any additional module dependencies, given that we already
rely on the core AES module for the shared key expansion routines.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-08-04 09:27:25 +08:00
Ard Biesheuvel 3759ee0572 crypto: arm/ghash - add NEON accelerated fallback for vmull.p64
Implement a NEON fallback for systems that do support NEON but have
no support for the optional 64x64->128 polynomial multiplication
instruction that is part of the ARMv8 Crypto Extensions. It is based
on the paper "Fast Software Polynomial Multiplication on ARM Processors
Using the NEON Engine" by Danilo Camara, Conrado Gouvea, Julio Lopez and
Ricardo Dahab (https://hal.inria.fr/hal-01506572)

On a 32-bit guest executing under KVM on a Cortex-A57, the new code is
not only 4x faster than the generic table based GHASH driver, it is also
time invariant. (Note that the existing vmull.p64 code is 16x faster on
this core).

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-08-04 09:27:24 +08:00
Ard Biesheuvel 45fe93dff2 crypto: algapi - make crypto_xor() take separate dst and src arguments
There are quite a number of occurrences in the kernel of the pattern

  if (dst != src)
          memcpy(dst, src, walk.total % AES_BLOCK_SIZE);
  crypto_xor(dst, final, walk.total % AES_BLOCK_SIZE);

or

  crypto_xor(keystream, src, nbytes);
  memcpy(dst, keystream, nbytes);

where crypto_xor() is preceded or followed by a memcpy() invocation
that is only there because crypto_xor() uses its output parameter as
one of the inputs. To avoid having to add new instances of this pattern
in the arm64 code, which will be refactored to implement non-SIMD
fallbacks, add an alternative implementation called crypto_xor_cpy(),
taking separate input and output arguments. This removes the need for
the separate memcpy().

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-08-04 09:27:15 +08:00
Ard Biesheuvel 2a9faf8b7e crypto: arm/crc32 - enable module autoloading based on CPU feature bits
Make the module autoloadable by tying it to the CPU feature bits that
describe whether the optional instructions it relies on are implemented
by the current CPU.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-06-01 12:55:42 +08:00
Ard Biesheuvel a83ff88bed crypto: arm/sha2-ce - enable module autoloading based on CPU feature bits
Make the module autoloadable by tying it to the CPU feature bit that
describes whether the optional instructions it relies on are implemented
by the current CPU.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-06-01 12:55:41 +08:00
Ard Biesheuvel bd56f95ea9 crypto: arm/sha1-ce - enable module autoloading based on CPU feature bits
Make the module autoloadable by tying it to the CPU feature bit that
describes whether the optional instructions it relies on are implemented
by the current CPU.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-06-01 12:55:40 +08:00
Ard Biesheuvel c9d9f608b4 crypto: arm/ghash-ce - enable module autoloading based on CPU feature bits
Make the module autoloadable by tying it to the CPU feature bit that
describes whether the optional instructions it relies on are implemented
by the current CPU.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-06-01 12:55:39 +08:00
Ard Biesheuvel 4d8061a591 crypto: arm/aes-ce - enable module autoloading based on CPU feature bits
Make the module autoloadable by tying it to the CPU feature bit that
describes whether the optional instructions it relies on are implemented
by the current CPU.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-06-01 12:55:38 +08:00
Ard Biesheuvel b56f5cbc7e crypto: arm/aes-neonbs - resolve fallback cipher at runtime
Currently, the bit sliced NEON AES code for ARM has a link time
dependency on the scalar ARM asm implementation, which it uses as a
fallback to perform CBC encryption and the encryption of the initial
XTS tweak.

The bit sliced NEON code is both fast and time invariant, which makes
it a reasonable default on hardware that supports it. However, the
ARM asm code it pulls in is not time invariant, and due to the way it
is linked in, cannot be overridden by the new generic time invariant
driver. In fact, it will not be used at all, given that the ARM asm
code registers itself as a cipher with a priority that exceeds the
priority of the fixed time cipher.

So remove the link time dependency, and allocate the fallback cipher
via the crypto API. Note that this requires this driver's module_init
call to be replaced with late_initcall, so that the (possibly generic)
fallback cipher is guaranteed to be available when the builtin test
is performed at registration time.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-03-09 18:34:16 +08:00
Ard Biesheuvel efa7cebdbf crypto: arm/crc32 - add build time test for CRC instruction support
The accelerated CRC32 module for ARM may use either the scalar CRC32
instructions, the NEON 64x64 to 128 bit polynomial multiplication
(vmull.p64) instruction, or both, depending on what the current CPU
supports.

However, this also requires support in binutils, and as it turns out,
versions of binutils exist that support the vmull.p64 instruction but
not the crc32 instructions.

So refactor the Makefile logic so that this module only gets built if
binutils has support for both.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-03-01 19:47:53 +08:00
Ard Biesheuvel 1fb1683cb3 crypto: arm/crc32 - fix build error with outdated binutils
Annotate a vmov instruction with an explicit element size of 32 bits.
This is inferred by recent toolchains, but apparently, older versions
need some help figuring this out.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-03-01 19:47:51 +08:00
Ard Biesheuvel 1a20b96612 crypto: arm/aes - don't use IV buffer to return final keystream block
The ARM bit sliced AES core code uses the IV buffer to pass the final
keystream block back to the glue code if the input is not a multiple of
the block size, so that the asm code does not have to deal with anything
except 16 byte blocks. This is done under the assumption that the outgoing
IV is meaningless anyway in this case, given that chaining is no longer
possible under these circumstances.

However, as it turns out, the CCM driver does expect the IV to retain
a value that is equal to the original IV except for the counter value,
and even interprets byte zero as a length indicator, which may result
in memory corruption if the IV is overwritten with something else.

So use a separate buffer to return the final keystream block.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-02-03 18:16:21 +08:00
Ard Biesheuvel 4a70b52620 crypto: arm/chacha20 - remove cra_alignmask
Remove the unnecessary alignmask: it is much more efficient to deal with
the misalignment in the core algorithm than relying on the crypto API to
copy the data to a suitably aligned buffer.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-02-03 18:16:19 +08:00
Ard Biesheuvel 1465fb13d3 crypto: arm/aes-ce - remove cra_alignmask
Remove the unnecessary alignmask: it is much more efficient to deal with
the misalignment in the core algorithm than relying on the crypto API to
copy the data to a suitably aligned buffer.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-02-03 18:16:16 +08:00
Ard Biesheuvel 13954e788d crypto: arm/aes-neonbs - fix issue with v2.22 and older assembler
The GNU assembler for ARM version 2.22 or older fails to infer the
element size from the vmov instructions, and aborts the build in
the following way;

.../aes-neonbs-core.S: Assembler messages:
.../aes-neonbs-core.S:817: Error: bad type for scalar -- `vmov q1h[1],r10'
.../aes-neonbs-core.S:817: Error: bad type for scalar -- `vmov q1h[0],r9'
.../aes-neonbs-core.S:817: Error: bad type for scalar -- `vmov q1l[1],r8'
.../aes-neonbs-core.S:817: Error: bad type for scalar -- `vmov q1l[0],r7'
.../aes-neonbs-core.S:818: Error: bad type for scalar -- `vmov q2h[1],r10'
.../aes-neonbs-core.S:818: Error: bad type for scalar -- `vmov q2h[0],r9'
.../aes-neonbs-core.S:818: Error: bad type for scalar -- `vmov q2l[1],r8'
.../aes-neonbs-core.S:818: Error: bad type for scalar -- `vmov q2l[0],r7'

Fix this by setting the element size explicitly, by replacing vmov with
vmov.32.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-01-23 22:50:25 +08:00
Ard Biesheuvel 658fa754cd crypto: arm/aes - avoid reserved 'tt' mnemonic in asm code
The ARMv8-M architecture introduces 'tt' and 'ttt' instructions,
which means we can no longer use 'tt' as a register alias on recent
versions of binutils for ARM. So replace the alias with 'ttab'.

Fixes: 81edb42629 ("crypto: arm/aes - replace scalar AES cipher")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-01-13 18:47:21 +08:00
Ard Biesheuvel cc477bf645 crypto: arm/aes - replace bit-sliced OpenSSL NEON code
This replaces the unwieldy generated implementation of bit-sliced AES
in CBC/CTR/XTS modes that originated in the OpenSSL project with a
new version that is heavily based on the OpenSSL implementation, but
has a number of advantages over the old version:
- it does not rely on the scalar AES cipher that also originated in the
  OpenSSL project and contains redundant lookup tables and key schedule
  generation routines (which we already have in crypto/aes_generic.)
- it uses the same expanded key schedule for encryption and decryption,
  reducing the size of the per-key data structure by 1696 bytes
- it adds an implementation of AES in ECB mode, which can be wrapped by
  other generic chaining mode implementations
- it moves the handling of corner cases that are non critical to performance
  to the glue layer written in C
- it was written directly in assembler rather than generated from a Perl
  script

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-01-13 18:27:31 +08:00
Ard Biesheuvel 81edb42629 crypto: arm/aes - replace scalar AES cipher
This replaces the scalar AES cipher that originates in the OpenSSL project
with a new implementation that is ~15% (*) faster (on modern cores), and
reuses the lookup tables and the key schedule generation routines from the
generic C implementation (which is usually compiled in anyway due to
networking and other subsystems depending on it).

Note that the bit sliced NEON code for AES still depends on the scalar cipher
that this patch replaces, so it is not removed entirely yet.

* On Cortex-A57, the performance increases from 17.0 to 14.9 cycles per byte
  for 128-bit keys.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-01-13 00:26:50 +08:00
Ard Biesheuvel afaf712e99 crypto: arm/chacha20 - implement NEON version based on SSE3 code
This is a straight port to ARM/NEON of the x86 SSE3 implementation
of the ChaCha20 stream cipher. It uses the new skcipher walksize
attribute to process the input in strides of 4x the block size.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-01-13 00:26:48 +08:00
Herbert Xu 5386e5d1f8 Revert "crypto: arm64/ARM: NEON accelerated ChaCha20"
This patch reverts the following commits:

8621caa0d4
8096667273

I should not have applied them because they had already been
obsoleted by a subsequent patch series.  They also cause a build
failure because of the subsequent commit 9ae433bc79.

Fixes: 9ae433bc79 ("crypto: chacha20 - convert generic and...")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-12-28 17:39:26 +08:00
Ard Biesheuvel 8096667273 crypto: arm/chacha20 - implement NEON version based on SSE3 code
This is a straight port to ARM/NEON of the x86 SSE3 implementation
of the ChaCha20 stream cipher.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-12-27 17:47:29 +08:00
Ard Biesheuvel d0a3431a7b crypto: arm/crc32 - accelerated support based on x86 SSE implementation
This is a combination of the the Intel algorithm implemented using SSE
and PCLMULQDQ instructions from arch/x86/crypto/crc32-pclmul_asm.S, and
the new CRC32 extensions introduced for both 32-bit and 64-bit ARM in
version 8 of the architecture. Two versions of the above combo are
provided, one for CRC32 and one for CRC32C.

The PMULL/NEON algorithm is faster, but operates on blocks of at least
64 bytes, and on multiples of 16 bytes only. For the remaining input,
or for all input on systems that lack the PMULL 64x64->128 instructions,
the CRC32 instructions will be used.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-12-07 20:01:24 +08:00
Ard Biesheuvel 1d481f1cd8 crypto: arm/crct10dif - port x86 SSE implementation to ARM
This is a transliteration of the Intel algorithm implemented
using SSE and PCLMULQDQ instructions that resides in the file
arch/x86/crypto/crct10dif-pcl-asm_64.S, but simplified to only
operate on buffers that are 16 byte aligned (but of any size)

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-12-07 20:01:21 +08:00
Herbert Xu efad2b61ae crypto: aes-ce - Make aes_simd_algs static
The variable aes_simd_algs should be static.  In fact if it isn't
it causes build errors when multiple copies of aes-ce-glue.c are
built into the kernel.

Fixes: da40e7a4ba ("crypto: aes-ce - Convert to skcipher")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-12-01 21:06:44 +08:00
Ard Biesheuvel 81126d1a8b crypto: arm/aesbs - fix brokenness after skcipher conversion
The CBC encryption routine should use the encryption round keys, not
the decryption round keys.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-11-30 20:01:51 +08:00
Herbert Xu 6fdf436fd8 crypto: arm/aes - Add missing SIMD select for aesbs
This patch adds one more missing SIMD select for AES_ARM_BS.  It
also changes selects on ALGAPI to BLKCIPHER.

Fixes: 211f41af53 ("crypto: aesbs - Convert to skcipher")
Reported-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-11-30 20:01:40 +08:00
Herbert Xu 585b5fa63d crypto: arm/aes - Select SIMD in Kconfig
The skcipher conversion for ARM missed the select on CRYPTO_SIMD,
causing build failures if SIMD was not otherwise enabled.

Fixes: da40e7a4ba ("crypto: aes-ce - Convert to skcipher")
Fixes: 211f41af53 ("crypto: aesbs - Convert to skcipher")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-11-29 16:11:14 +08:00
Herbert Xu 211f41af53 crypto: aesbs - Convert to skcipher
This patch converts aesbs over to the skcipher interface.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-11-28 21:23:21 +08:00
Herbert Xu da40e7a4ba crypto: aes-ce - Convert to skcipher
This patch converts aes-ce over to the skcipher interface.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-11-28 21:23:21 +08:00
Ard Biesheuvel 58010fa6f7 crypto: arm/aes-ce - fix for big endian
The AES key schedule generation is mostly endian agnostic, with the
exception of the rotation and the incorporation of the round constant
at the start of each round. So implement a big endian specific version
of that part to make the whole routine big endian compatible.

Fixes: 86464859cc ("crypto: arm - AES in ECB/CBC/CTR/XTS modes using ARMv8 Crypto Extensions")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-10-21 11:03:46 +08:00
Herbert Xu c3afafa478 Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Merge the crypto tree to pull in vmx ghash fix.
2016-10-10 11:19:47 +08:00
Ard Biesheuvel f82e90b286 crypto: arm/aes-ctr - fix NULL dereference in tail processing
The AES-CTR glue code avoids calling into the blkcipher API for the
tail portion of the walk, by comparing the remainder of walk.nbytes
modulo AES_BLOCK_SIZE with the residual nbytes, and jumping straight
into the tail processing block if they are equal. This tail processing
block checks whether nbytes != 0, and does nothing otherwise.

However, in case of an allocation failure in the blkcipher layer, we
may enter this code with walk.nbytes == 0, while nbytes > 0. In this
case, we should not dereference the source and destination pointers,
since they may be NULL. So instead of checking for nbytes != 0, check
for (walk.nbytes % AES_BLOCK_SIZE) != 0, which implies the former in
non-error conditions.

Fixes: 86464859cc ("crypto: arm - AES in ECB/CBC/CTR/XTS modes using ARMv8 Crypto Extensions")
Cc: stable@vger.kernel.org
Reported-by: xiakaixu <xiakaixu@huawei.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-09-13 18:44:59 +08:00
Ard Biesheuvel 71f89917ff crypto: arm/ghash - change internal cra_name to "__ghash"
The fact that the internal synchrous hash implementation is called
"ghash" like the publicly visible one is causing the testmgr code
to misidentify it as an algorithm that requires testing at boottime.
So rename it to "__ghash" to prevent this.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-09-07 21:10:19 +08:00
Ard Biesheuvel ed4767d612 crypto: arm/ghash-ce - add missing async import/export
Since commit 8996eafdcb ("crypto: ahash - ensure statesize is non-zero"),
all ahash drivers are required to implement import()/export(), and must have
a non-zero statesize. Fix this for the ARM Crypto Extensions GHASH
implementation.

Fixes: 8996eafdcb ("crypto: ahash - ensure statesize is non-zero")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-09-07 21:08:30 +08:00
Ard Biesheuvel 2117eaa62a crypto: arm/sha1-neon - add support for building in Thumb2 mode
The ARMv7 NEON module is explicitly built in ARM mode, which is not
supported by the Thumb2 kernel. So remove the explicit override, and
leave it up to the build environment to decide whether the core SHA1
routines are assembled as ARM or as Thumb2 code.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-09-07 21:08:29 +08:00
Herbert Xu 820573ebd6 crypto: ghash-ce - Fix cryptd reordering
This patch fixes an old bug where requests can be reordered because
some are processed by cryptd while others are processed directly
in softirq context.

The fix is to always postpone to cryptd if there are currently
requests outstanding from the same tfm.

This patch also removes the redundant use of cryptd in the async
init function as init never touches the FPU.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-06-23 18:29:54 +08:00
Linus Torvalds 70477371dc Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto update from Herbert Xu:
 "Here is the crypto update for 4.6:

  API:
   - Convert remaining crypto_hash users to shash or ahash, also convert
     blkcipher/ablkcipher users to skcipher.
   - Remove crypto_hash interface.
   - Remove crypto_pcomp interface.
   - Add crypto engine for async cipher drivers.
   - Add akcipher documentation.
   - Add skcipher documentation.

  Algorithms:
   - Rename crypto/crc32 to avoid name clash with lib/crc32.
   - Fix bug in keywrap where we zero the wrong pointer.

  Drivers:
   - Support T5/M5, T7/M7 SPARC CPUs in n2 hwrng driver.
   - Add PIC32 hwrng driver.
   - Support BCM6368 in bcm63xx hwrng driver.
   - Pack structs for 32-bit compat users in qat.
   - Use crypto engine in omap-aes.
   - Add support for sama5d2x SoCs in atmel-sha.
   - Make atmel-sha available again.
   - Make sahara hashing available again.
   - Make ccp hashing available again.
   - Make sha1-mb available again.
   - Add support for multiple devices in ccp.
   - Improve DMA performance in caam.
   - Add hashing support to rockchip"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (116 commits)
  crypto: qat - remove redundant arbiter configuration
  crypto: ux500 - fix checks of error code returned by devm_ioremap_resource()
  crypto: atmel - fix checks of error code returned by devm_ioremap_resource()
  crypto: qat - Change the definition of icp_qat_uof_regtype
  hwrng: exynos - use __maybe_unused to hide pm functions
  crypto: ccp - Add abstraction for device-specific calls
  crypto: ccp - CCP versioning support
  crypto: ccp - Support for multiple CCPs
  crypto: ccp - Remove check for x86 family and model
  crypto: ccp - memset request context to zero during import
  lib/mpi: use "static inline" instead of "extern inline"
  lib/mpi: avoid assembler warning
  hwrng: bcm63xx - fix non device tree compatibility
  crypto: testmgr - allow rfc3686 aes-ctr variants in fips mode.
  crypto: qat - The AE id should be less than the maximal AE number
  lib/mpi: Endianness fix
  crypto: rockchip - add hash support for crypto engine in rk3288
  crypto: xts - fix compile errors
  crypto: doc - add skcipher API documentation
  crypto: doc - update AEAD AD handling
  ...
2016-03-17 11:22:54 -07:00
Stephan Mueller 49abc0d2e1 crypto: xts - fix compile errors
Commit 28856a9e52 missed the addition of the crypto/xts.h include file
for different architecture-specific AES implementations.

Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-02-17 15:53:02 +08:00
Stephan Mueller 28856a9e52 crypto: xts - consolidate sanity check for keys
The patch centralizes the XTS key check logic into the service function
xts_check_key which is invoked from the different XTS implementations.
With this, the XTS implementations in ARM, ARM64, PPC and S390 have now
a sanity check for the XTS keys similar to the other arches.

In addition, this service function received a check to ensure that the
key != the tweak key which is mandated by FIPS 140-2 IG A.9. As the
check is not present in the standards defining XTS, it is only enforced
in FIPS mode of the kernel.

Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-02-17 04:07:51 +08:00
Jeremy Linton bee038a4bd arm/arm64: crypto: assure that ECB modes don't require an IV
ECB modes don't use an initialization vector. The kernel
/proc/crypto interface doesn't reflect this properly.

Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-02-15 15:48:29 +00:00
Baruch Siach 4d666dbefc crypto: arm - ignore generated SHA2 assembly files
These files are generated since commits f2f770d74a (crypto: arm/sha256 - Add
optimized SHA-256/224, 2015-04-03) and c80ae7ca37 (crypto: arm/sha512 -
accelerated SHA-512 using ARM generic ASM and NEON, 2015-05-08).

Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-07-06 16:32:03 +08:00
Ard Biesheuvel 6499e8cfaa crypto: arm/aes - streamline AES-192 code path
This trims off a couple of instructions of the total size of the
core AES transform by reordering the final branch in the AES-192
code path with the rounds that are performed regardless of whether
the branch is taken or not. Other than the slight size reduction,
this has no performance benefit.

Fix up a comment regarding the prototype of this function while
we're at it.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-05-11 15:08:01 +08:00
Ard Biesheuvel c80ae7ca37 crypto: arm/sha512 - accelerated SHA-512 using ARM generic ASM and NEON
This replaces the SHA-512 NEON module with the faster and more
versatile implementation from the OpenSSL project. It consists
of both a NEON and a generic ASM version of the core SHA-512
transform, where the NEON version reverts to the ASM version
when invoked in non-process context.

This patch is based on the OpenSSL upstream version b1a5d1c65208
of sha512-armv4.pl, which can be found here:

  https://git.openssl.org/gitweb/?p=openssl.git;h=b1a5d1c65208

Performance relative to the generic implementation (measured
using tcrypt.ko mode=306 sec=1 running on a Cortex-A57 under
KVM):

  input size	block size	asm	neon	old neon

  16		16		1.39	2.54	2.21
  64		16		1.32	2.33	2.09
  64		64		1.38	2.53	2.19
  256		16		1.31	2.28	2.06
  256		64		1.38	2.54	2.25
  256		256		1.40	2.77	2.39
  1024		16		1.29	2.22	2.01
  1024		256		1.40	2.82	2.45
  1024		1024		1.41	2.93	2.53
  2048		16		1.33	2.21	2.00
  2048		256		1.40	2.84	2.46
  2048		1024		1.41	2.96	2.55
  2048		2048		1.41	2.98	2.56
  4096		16		1.34	2.20	1.99
  4096		256		1.40	2.84	2.46
  4096		1024		1.41	2.97	2.56
  4096		4096		1.41	3.01	2.58
  8192		16		1.34	2.19	1.99
  8192		256		1.40	2.85	2.47
  8192		1024		1.41	2.98	2.56
  8192		4096		1.41	2.71	2.59
  8192		8192		1.51	3.51	2.69

Acked-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-05-11 15:08:01 +08:00
Linus Torvalds cb906953d2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto update from Herbert Xu:
 "Here is the crypto update for 4.1:

  New interfaces:
   - user-space interface for AEAD
   - user-space interface for RNG (i.e., pseudo RNG)

  New hashes:
   - ARMv8 SHA1/256
   - ARMv8 AES
   - ARMv8 GHASH
   - ARM assembler and NEON SHA256
   - MIPS OCTEON SHA1/256/512
   - MIPS img-hash SHA1/256 and MD5
   - Power 8 VMX AES/CBC/CTR/GHASH
   - PPC assembler AES, SHA1/256 and MD5
   - Broadcom IPROC RNG driver

  Cleanups/fixes:
   - prevent internal helper algos from being exposed to user-space
   - merge common code from assembly/C SHA implementations
   - misc fixes"

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (169 commits)
  crypto: arm - workaround for building with old binutils
  crypto: arm/sha256 - avoid sha256 code on ARMv7-M
  crypto: x86/sha512_ssse3 - move SHA-384/512 SSSE3 implementation to base layer
  crypto: x86/sha256_ssse3 - move SHA-224/256 SSSE3 implementation to base layer
  crypto: x86/sha1_ssse3 - move SHA-1 SSSE3 implementation to base layer
  crypto: arm64/sha2-ce - move SHA-224/256 ARMv8 implementation to base layer
  crypto: arm64/sha1-ce - move SHA-1 ARMv8 implementation to base layer
  crypto: arm/sha2-ce - move SHA-224/256 ARMv8 implementation to base layer
  crypto: arm/sha256 - move SHA-224/256 ASM/NEON implementation to base layer
  crypto: arm/sha1-ce - move SHA-1 ARMv8 implementation to base layer
  crypto: arm/sha1_neon - move SHA-1 NEON implementation to base layer
  crypto: arm/sha1 - move SHA-1 ARM asm implementation to base layer
  crypto: sha512-generic - move to generic glue implementation
  crypto: sha256-generic - move to generic glue implementation
  crypto: sha1-generic - move to generic glue implementation
  crypto: sha512 - implement base layer for SHA-512
  crypto: sha256 - implement base layer for SHA-256
  crypto: sha1 - implement base layer for SHA-1
  crypto: api - remove instance when test failed
  crypto: api - Move alg ref count init to crypto_check_alg
  ...
2015-04-15 10:42:15 -07:00
Ard Biesheuvel 3abafaf219 crypto: arm - workaround for building with old binutils
Old versions of binutils (before 2.23) do not yet understand the
crypto-neon-fp-armv8 fpu instructions, and an attempt to build these
files results in a build failure:

arch/arm/crypto/aes-ce-core.S:133: Error: selected processor does not support ARM mode `vld1.8 {q10-q11},[ip]!'
arch/arm/crypto/aes-ce-core.S:133: Error: bad instruction `aese.8 q0,q8'
arch/arm/crypto/aes-ce-core.S:133: Error: bad instruction `aesmc.8 q0,q0'
arch/arm/crypto/aes-ce-core.S:133: Error: bad instruction `aese.8 q0,q9'
arch/arm/crypto/aes-ce-core.S:133: Error: bad instruction `aesmc.8 q0,q0'

Since the affected versions are still in widespread use, and this breaks
'allmodconfig' builds, we should try to at least get a successful kernel
build. Unfortunately, I could not come up with a way to make the Kconfig
symbol depend on the binutils version, which would be the nicest solution.

Instead, this patch uses the 'as-instr' Kbuild macro to find out whether
the support is present in the assembler, and otherwise emits a non-fatal
warning indicating which selected modules could not be built.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: http://storage.kernelci.org/next/next-20150410/arm-allmodconfig/build.log
Fixes: 864cbeed4a ("crypto: arm - add support for SHA1 using ARMv8 Crypto Instructions")
[ard.biesheuvel:
 - omit modules entirely instead of building empty ones if binutils is too old
 - update commit log accordingly]
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-04-13 12:09:11 +08:00
Arnd Bergmann b48321def4 crypto: arm/sha256 - avoid sha256 code on ARMv7-M
The sha256 assembly implementation can deal with all architecture levels
from ARMv4 to ARMv7-A, but not with ARMv7-M. Enabling it in an
ARMv7-M kernel results in this build failure:

arm-linux-gnueabi-ld: error: arch/arm/crypto/sha256_glue.o: Conflicting architecture profiles M/A
arm-linux-gnueabi-ld: failed to merge target specific data of file arch/arm/crypto/sha256_glue.o

This adds a Kconfig dependency to prevent the code from being disabled
for ARMv7-M.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-04-13 12:07:13 +08:00