The majority of skcipher templates (including both the existing ones and
the ones remaining to be converted from the "blkcipher" API) just wrap a
single block cipher algorithm. This includes cbc, cfb, ctr, ecb, kw,
ofb, and pcbc. Add a helper function skcipher_alloc_instance_simple()
that handles allocating an skcipher instance for this common case.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The memcpy()s in the PCBC implementation use walk->iv as both the source
and destination, which has undefined behavior. These memcpy()'s are
actually unneeded, because walk->iv is already used to hold the previous
plaintext block XOR'd with the previous ciphertext block. Thus,
walk->iv is already updated to its final value.
So remove the broken and unnecessary memcpy()s.
Fixes: 91652be5d1 ("[CRYPTO] pcbc: Add Propagated CBC template")
Cc: <stable@vger.kernel.org> # v2.6.21+
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Fix multiple bugs in the OFB implementation:
1. It stored the per-request state 'cnt' in the tfm context, which can be
used by multiple threads concurrently (e.g. via AF_ALG).
2. It didn't support messages not a multiple of the block cipher size,
despite being a stream cipher.
3. It didn't set cra_blocksize to 1 to indicate it is a stream cipher.
To fix these, set the 'chunksize' property to the cipher block size to
guarantee that when walking through the scatterlist, a partial block can
only occur at the end. Then change the implementation to XOR a block at
a time at first, then XOR the partial block at the end if needed. This
is the same way CTR and CFB are implemented. As a bonus, this also
improves performance in most cases over the current approach.
Fixes: e497c51896 ("crypto: ofb - add output feedback mode")
Cc: <stable@vger.kernel.org> # v4.20+
Cc: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The memcpy() in crypto_cfb_decrypt_inplace() uses walk->iv as both the
source and destination, which has undefined behavior. It is unneeded
because walk->iv is already used to hold the previous ciphertext block;
thus, walk->iv is already updated to its final value. So, remove it.
Also, note that in-place decryption is the only case where the previous
ciphertext block is not directly available. Therefore, as a related
cleanup I also updated crypto_cfb_encrypt_segment() to directly use the
previous ciphertext block rather than save it into walk->iv. This makes
it consistent with in-place encryption and out-of-place decryption; now
only in-place decryption is different, because it has to be.
Fixes: a7d85e06ed ("crypto: cfb - add support for Cipher FeedBack mode")
Cc: <stable@vger.kernel.org> # v4.17+
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Like some other block cipher mode implementations, the CFB
implementation assumes that while walking through the scatterlist, a
partial block does not occur until the end. But the walk is incorrectly
being done with a blocksize of 1, as 'cra_blocksize' is set to 1 (since
CFB is a stream cipher) but no 'chunksize' is set. This bug causes
incorrect encryption/decryption for some scatterlist layouts.
Fix it by setting the 'chunksize'. Also extend the CFB test vectors to
cover this bug as well as cases where the message length is not a
multiple of the block size.
Fixes: a7d85e06ed ("crypto: cfb - add support for Cipher FeedBack mode")
Cc: <stable@vger.kernel.org> # v4.17+
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
sm3_compress() calls rol32() with shift >= 32, which causes undefined
behavior. This is easily detected by enabling CONFIG_UBSAN.
Explicitly AND with 31 to make the behavior well defined.
Fixes: 4f0fc1600e ("crypto: sm3 - add OSCCA SM3 secure hash")
Cc: <stable@vger.kernel.org> # v4.15+
Cc: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
crypto_grab_*() doesn't set crypto_spawn::inst, so templates must set it
beforehand. Otherwise it will be left NULL, which causes a crash in
certain cases where algorithms are dynamically loaded/unloaded. E.g.
with CONFIG_CRYPTO_CHACHA20_X86_64=m, the following caused a crash:
insmod chacha-x86_64.ko
python -c 'import socket; socket.socket(socket.AF_ALG, 5, 0).bind(("skcipher", "adiantum(xchacha12,aes)"))'
rmmod chacha-x86_64.ko
python -c 'import socket; socket.socket(socket.AF_ALG, 5, 0).bind(("skcipher", "adiantum(xchacha12,aes)"))'
Fixes: 059c2a4d8e ("crypto: adiantum - add Adiantum support")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Keys for "authenc" AEADs are formatted as an rtattr containing a 4-byte
'enckeylen', followed by an authentication key and an encryption key.
crypto_authenc_extractkeys() parses the key to find the inner keys.
However, it fails to consider the case where the rtattr's payload is
longer than 4 bytes but not 4-byte aligned, and where the key ends
before the next 4-byte aligned boundary. In this case, 'keylen -=
RTA_ALIGN(rta->rta_len);' underflows to a value near UINT_MAX. This
causes a buffer overread and crash during crypto_ahash_setkey().
Fix it by restricting the rtattr payload to the expected size.
Reproducer using AF_ALG:
#include <linux/if_alg.h>
#include <linux/rtnetlink.h>
#include <sys/socket.h>
int main()
{
int fd;
struct sockaddr_alg addr = {
.salg_type = "aead",
.salg_name = "authenc(hmac(sha256),cbc(aes))",
};
struct {
struct rtattr attr;
__be32 enckeylen;
char keys[1];
} __attribute__((packed)) key = {
.attr.rta_len = sizeof(key),
.attr.rta_type = 1 /* CRYPTO_AUTHENC_KEYA_PARAM */,
};
fd = socket(AF_ALG, SOCK_SEQPACKET, 0);
bind(fd, (void *)&addr, sizeof(addr));
setsockopt(fd, SOL_ALG, ALG_SET_KEY, &key, sizeof(key));
}
It caused:
BUG: unable to handle kernel paging request at ffff88007ffdc000
PGD 2e01067 P4D 2e01067 PUD 2e04067 PMD 2e05067 PTE 0
Oops: 0000 [#1] SMP
CPU: 0 PID: 883 Comm: authenc Not tainted 4.20.0-rc1-00108-g00c9fe37a7f27 #13
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-20181126_142135-anatol 04/01/2014
RIP: 0010:sha256_ni_transform+0xb3/0x330 arch/x86/crypto/sha256_ni_asm.S:155
[...]
Call Trace:
sha256_ni_finup+0x10/0x20 arch/x86/crypto/sha256_ssse3_glue.c:321
crypto_shash_finup+0x1a/0x30 crypto/shash.c:178
shash_digest_unaligned+0x45/0x60 crypto/shash.c:186
crypto_shash_digest+0x24/0x40 crypto/shash.c:202
hmac_setkey+0x135/0x1e0 crypto/hmac.c:66
crypto_shash_setkey+0x2b/0xb0 crypto/shash.c:66
shash_async_setkey+0x10/0x20 crypto/shash.c:223
crypto_ahash_setkey+0x2d/0xa0 crypto/ahash.c:202
crypto_authenc_setkey+0x68/0x100 crypto/authenc.c:96
crypto_aead_setkey+0x2a/0xc0 crypto/aead.c:62
aead_setkey+0xc/0x10 crypto/algif_aead.c:526
alg_setkey crypto/af_alg.c:223 [inline]
alg_setsockopt+0xfe/0x130 crypto/af_alg.c:256
__sys_setsockopt+0x6d/0xd0 net/socket.c:1902
__do_sys_setsockopt net/socket.c:1913 [inline]
__se_sys_setsockopt net/socket.c:1910 [inline]
__x64_sys_setsockopt+0x1f/0x30 net/socket.c:1910
do_syscall_64+0x4a/0x180 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Fixes: e236d4a89a ("[CRYPTO] authenc: Move enckeylen into key itself")
Cc: <stable@vger.kernel.org> # v2.6.25+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
- support -y option for merge_config.sh to avoid downgrading =y to =m
- remove S_OTHER symbol type, and touch include/config/*.h files correctly
- fix file name and line number in lexer warnings
- fix memory leak when EOF is encountered in quotation
- resolve all shift/reduce conflicts of the parser
- warn no new line at end of file
- make 'source' statement more strict to take only string literal
- rewrite the lexer and remove the keyword lookup table
- convert to SPDX License Identifier
- compile C files independently instead of including them from zconf.y
- fix various warnings of gconfig
- misc cleanups
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJcJieuAAoJED2LAQed4NsGHlIP/1s0fQ86XD9dIMyHzAO0gh2f
7rylfe2kEXJgIzJ0DyZdLu4iZtwbkEUqTQrRS1abriNGVemPkfBAnZdM5d92lOQX
3iREa700AJ2xo7V7gYZ6AbhZoG3p0S9U9Q2qE5S+tFTe8c2Gy4xtjnODF+Vel85r
S0P8tF5sE1/d00lm+yfMI/CJVfDjyNaMm+aVEnL0kZTPiRkaktjWgo6Fc2p4z1L5
HFmMMP6/iaXmRZ+tHJGPQ2AT70GFVZw5ePxPcl50EotUP25KHbuUdzs8wDpYm3U/
rcESVsIFpgqHWmTsdBk6dZk0q8yFZNkMlkaP/aYukVZpUn/N6oAXgTFckYl8dmQL
fQBkQi6DTfr9EBPVbj18BKm7xI3Y4DdQ2fzTfYkJ2XwNRGFA5r9N3sjd7ZTVGjxC
aeeMHCwvGdSx1x8PeZAhZfsUHW8xVDMSQiT713+ljBY+6cwzA+2NF0kP7B6OAqwr
ETFzd4Xu2/lZcL7gQRH8WU3L2S5iedmDG6RnZgJMXI0/9V4qAA+nlsWaCgnl1TgA
mpxYlLUMrd6AUJevE34FlnyFdk8IMn9iKRFsvF0f3doO5C7QzTVGqFdJu5a0CuWO
4NBJvZjFT8/4amoWLfnDlfApWXzTfwLbKG+r6V2F30fLuXpYg5LxWhBoGRPYLZSq
oi4xN1Mpx3TvXz6WcKVZ
=r3Fl
-----END PGP SIGNATURE-----
Merge tag 'kconfig-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kconfig updates from Masahiro Yamada:
- support -y option for merge_config.sh to avoid downgrading =y to =m
- remove S_OTHER symbol type, and touch include/config/*.h files correctly
- fix file name and line number in lexer warnings
- fix memory leak when EOF is encountered in quotation
- resolve all shift/reduce conflicts of the parser
- warn no new line at end of file
- make 'source' statement more strict to take only string literal
- rewrite the lexer and remove the keyword lookup table
- convert to SPDX License Identifier
- compile C files independently instead of including them from zconf.y
- fix various warnings of gconfig
- misc cleanups
* tag 'kconfig-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (39 commits)
kconfig: surround dbg_sym_flags with #ifdef DEBUG to fix gconf warning
kconfig: split images.c out of qconf.cc/gconf.c to fix gconf warnings
kconfig: add static qualifiers to fix gconf warnings
kconfig: split the lexer out of zconf.y
kconfig: split some C files out of zconf.y
kconfig: convert to SPDX License Identifier
kconfig: remove keyword lookup table entirely
kconfig: update current_pos in the second lexer
kconfig: switch to ASSIGN_VAL state in the second lexer
kconfig: stop associating kconf_id with yylval
kconfig: refactor end token rules
kconfig: stop supporting '.' and '/' in unquoted words
treewide: surround Kconfig file paths with double quotes
microblaze: surround string default in Kconfig with double quotes
kconfig: use T_WORD instead of T_VARIABLE for variables
kconfig: use specific tokens instead of T_ASSIGN for assignments
kconfig: refactor scanning and parsing "option" properties
kconfig: use distinct tokens for type and default properties
kconfig: remove redundant token defines
kconfig: rename depends_list to comment_option_list
...
Pull crypto updates from Herbert Xu:
"API:
- Add 1472-byte test to tcrypt for IPsec
- Reintroduced crypto stats interface with numerous changes
- Support incremental algorithm dumps
Algorithms:
- Add xchacha12/20
- Add nhpoly1305
- Add adiantum
- Add streebog hash
- Mark cts(cbc(aes)) as FIPS allowed
Drivers:
- Improve performance of arm64/chacha20
- Improve performance of x86/chacha20
- Add NEON-accelerated nhpoly1305
- Add SSE2 accelerated nhpoly1305
- Add AVX2 accelerated nhpoly1305
- Add support for 192/256-bit keys in gcmaes AVX
- Add SG support in gcmaes AVX
- ESN for inline IPsec tx in chcr
- Add support for CryptoCell 703 in ccree
- Add support for CryptoCell 713 in ccree
- Add SM4 support in ccree
- Add SM3 support in ccree
- Add support for chacha20 in caam/qi2
- Add support for chacha20 + poly1305 in caam/jr
- Add support for chacha20 + poly1305 in caam/qi2
- Add AEAD cipher support in cavium/nitrox"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (130 commits)
crypto: skcipher - remove remnants of internal IV generators
crypto: cavium/nitrox - Fix build with !CONFIG_DEBUG_FS
crypto: salsa20-generic - don't unnecessarily use atomic walk
crypto: skcipher - add might_sleep() to skcipher_walk_virt()
crypto: x86/chacha - avoid sleeping under kernel_fpu_begin()
crypto: cavium/nitrox - Added AEAD cipher support
crypto: mxc-scc - fix build warnings on ARM64
crypto: api - document missing stats member
crypto: user - remove unused dump functions
crypto: chelsio - Fix wrong error counter increments
crypto: chelsio - Reset counters on cxgb4 Detach
crypto: chelsio - Handle PCI shutdown event
crypto: chelsio - cleanup:send addr as value in function argument
crypto: chelsio - Use same value for both channel in single WR
crypto: chelsio - Swap location of AAD and IV sent in WR
crypto: chelsio - remove set but not used variable 'kctx_len'
crypto: ux500 - Use proper enum in hash_set_dma_transfer
crypto: ux500 - Use proper enum in cryp_set_dma_transfer
crypto: aesni - Add scatter/gather avx stubs, and use them in C
crypto: aesni - Introduce partial block macro
..
Pull RCU updates from Ingo Molnar:
"The biggest RCU changes in this cycle were:
- Convert RCU's BUG_ON() and similar calls to WARN_ON() and similar.
- Replace calls of RCU-bh and RCU-sched update-side functions to
their vanilla RCU counterparts. This series is a step towards
complete removal of the RCU-bh and RCU-sched update-side functions.
( Note that some of these conversions are going upstream via their
respective maintainers. )
- Documentation updates, including a number of flavor-consolidation
updates from Joel Fernandes.
- Miscellaneous fixes.
- Automate generation of the initrd filesystem used for rcutorture
testing.
- Convert spin_is_locked() assertions to instead use lockdep.
( Note that some of these conversions are going upstream via their
respective maintainers. )
- SRCU updates, especially including a fix from Dennis Krein for a
bag-on-head-class bug.
- RCU torture-test updates"
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (112 commits)
rcutorture: Don't do busted forward-progress testing
rcutorture: Use 100ms buckets for forward-progress callback histograms
rcutorture: Recover from OOM during forward-progress tests
rcutorture: Print forward-progress test age upon failure
rcutorture: Print time since GP end upon forward-progress failure
rcutorture: Print histogram of CB invocation at OOM time
rcutorture: Print GP age upon forward-progress failure
rcu: Print per-CPU callback counts for forward-progress failures
rcu: Account for nocb-CPU callback counts in RCU CPU stall warnings
rcutorture: Dump grace-period diagnostics upon forward-progress OOM
rcutorture: Prepare for asynchronous access to rcu_fwd_startat
torture: Remove unnecessary "ret" variables
rcutorture: Affinity forward-progress test to avoid housekeeping CPUs
rcutorture: Break up too-long rcu_torture_fwd_prog() function
rcutorture: Remove cbflood facility
torture: Bring any extra CPUs online during kernel startup
rcutorture: Add call_rcu() flooding forward-progress tests
rcutorture/formal: Replace synchronize_sched() with synchronize_rcu()
tools/kernel.h: Replace synchronize_sched() with synchronize_rcu()
net/decnet: Replace rcu_barrier_bh() with rcu_barrier()
...
Remove dead code related to internal IV generators, which are no longer
used since they've been replaced with the "seqiv" and "echainiv"
templates. The removed code includes:
- The "givcipher" (GIVCIPHER) algorithm type. No algorithms are
registered with this type anymore, so it's unneeded.
- The "const char *geniv" member of aead_alg, ablkcipher_alg, and
blkcipher_alg. A few algorithms still set this, but it isn't used
anymore except to show via /proc/crypto and CRYPTO_MSG_GETALG.
Just hardcode "<default>" or "<none>" in those cases.
- The 'skcipher_givcrypt_request' structure, which is never used.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
salsa20-generic doesn't use SIMD instructions or otherwise disable
preemption, so passing atomic=true to skcipher_walk_virt() is
unnecessary.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
skcipher_walk_virt() can still sleep even with atomic=true, since that
only affects the later calls to skcipher_walk_done(). But,
skcipher_walk_virt() only has to allocate memory for some input data
layouts, so incorrectly calling it with preemption disabled can go
undetected. Use might_sleep() so that it's detected reliably.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch removes unused dump functions for crypto_user_stats.
There are remains of the copy/paste of crypto_user_base to
crypto_user_stat and I forgot to remove them.
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The Kconfig lexer supports special characters such as '.' and '/' in
the parameter context. In my understanding, the reason is just to
support bare file paths in the source statement.
I do not see a good reason to complicate Kconfig for the room of
ambiguity.
The majority of code already surrounds file paths with double quotes,
and it makes sense since file paths are constant string literals.
Make it treewide consistent now.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Wolfram Sang <wsa@the-dreams.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
crypto_alg_mod_lookup() takes a reference to the hash algorithm but
crypto_init_shash_spawn() doesn't take ownership of it, hence the
reference needs to be dropped in adiantum_create().
Fixes: 059c2a4d8e ("crypto: adiantum - add Adiantum support")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
CRYPTO_MSG_GETALG in NLM_F_DUMP mode sometimes doesn't return all
registered crypto algorithms, because it doesn't support incremental
dumps. crypto_dump_report() only permits itself to be called once, yet
the netlink subsystem allocates at most ~64 KiB for the skb being dumped
to. Thus only the first recvmsg() returns data, and it may only include
a subset of the crypto algorithms even if the user buffer passed to
recvmsg() is large enough to hold all of them.
Fix this by using one of the arguments in the netlink_callback structure
to keep track of the current position in the algorithm list. Then
userspace can do multiple recvmsg() on the socket after sending the dump
request. This is the way netlink dumps work elsewhere in the kernel;
it's unclear why this was different (probably just an oversight).
Also fix an integer overflow when calculating the dump buffer size hint.
Fixes: a38f7907b9 ("crypto: Add userspace configuration API")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The 2018-11-28 revision of the Adiantum paper has revised some notation:
- 'M' was replaced with 'L' (meaning "Left", for the left-hand part of
the message) in the definition of Adiantum hashing, to avoid confusion
with the full message
- ε-almost-∆-universal is now abbreviated as ε-∆U instead of εA∆U
- "block" is now used only to mean block cipher and Poly1305 blocks
Also, Adiantum hashing was moved from the appendix to the main paper.
To avoid confusion, update relevant comments in the code to match.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The kernel's ChaCha20 uses the RFC7539 convention of the nonce being 12
bytes rather than 8, so actually I only appended 12 random bytes (not
16) to its test vectors to form 24-byte nonces for the XChaCha20 test
vectors. The other 4 bytes were just from zero-padding the stream
position to 8 bytes. Fix the comments above the test vectors.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
There is a draft specification for XChaCha20 being worked on. Add the
XChaCha20 test vector from the appendix so that we can be extra sure the
kernel's implementation is compatible.
I also recomputed the ciphertext with XChaCha12 and added it there too,
to keep the tests for XChaCha20 and XChaCha12 in sync.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Now that the x86_64 SIMD implementations of ChaCha20 and XChaCha20 have
been refactored to support varying the number of rounds, add support for
XChaCha12. This is identical to XChaCha20 except for the number of
rounds, which is 12 instead of 20. This can be used by Adiantum.
Reviewed-by: Martin Willi <martin@strongswan.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add an XChaCha20 implementation that is hooked up to the x86_64 SIMD
implementations of ChaCha20. This can be used by Adiantum.
An SSSE3 implementation of single-block HChaCha20 is also added so that
XChaCha20 can use it rather than the generic implementation. This
required refactoring the ChaCha permutation into its own function.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add a 64-bit AVX2 implementation of NHPoly1305, an ε-almost-∆-universal
hash function used in the Adiantum encryption mode. For now, only the
NH portion is actually AVX2-accelerated; the Poly1305 part is less
performance-critical so is just implemented in C.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add a 64-bit SSE2 implementation of NHPoly1305, an ε-almost-∆-universal
hash function used in the Adiantum encryption mode. For now, only the
NH portion is actually SSE2-accelerated; the Poly1305 part is less
performance-critical so is just implemented in C.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
If the stream cipher implementation is asynchronous, then the Adiantum
instance must be flagged as asynchronous as well. Otherwise someone
asking for a synchronous algorithm can get an asynchronous algorithm.
There are no asynchronous xchacha12 or xchacha20 implementations yet
which makes this largely a theoretical issue, but it should be fixed.
Fixes: 059c2a4d8e ("crypto: adiantum - add Adiantum support")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In order to have better coverage of algorithms operating on block
sizes that are in the ballpark of a VPN packet, add 1472 to the
block_sizes array.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch add the crypto_stats_init() function.
This will permit to remove some ifdef from __crypto_register_alg().
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Since now all crypto stats are on their own structures, it is now
useless to have the algorithm name in the err_cnt member.
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Like for userspace, this patch splits stats into multiple structures,
one for each algorithm class.
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The use of the v64 intermediate variable is useless, and removing it
bring to much readable code.
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Some error count use the wrong name for getting this data.
But this had not caused any reporting problem, since all error count are shared in the same
union.
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
All crypto_stats functions use the struct xxx_request for feeding stats,
but in some case this structure could already be freed.
For fixing this, the needed parameters (len and alg) will be stored
before the request being executed.
Fixes: cac5818c25 ("crypto: user - Implement a generic crypto statistics")
Reported-by: syzbot <syzbot+6939a606a5305e9e9799@syzkaller.appspotmail.com>
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
It is cleaner to have each stat in their own structures.
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
All the 32-bit fields need to be 64-bit. In some cases, UINT32_MAX crypto
operations can be done in seconds.
Reported-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
CRYPTO_STATS is using CRYPTO_USER stuff, so it should depends on it.
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Even if CRYPTO_STATS is set to n, some part of CRYPTO_STATS are
compiled.
This patch made all part of crypto_user_stat uncompiled in that case.
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Since this user-space API is still undergoing significant changes,
this patch disables it for the current merge window.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Pull RCU changes from Paul E. McKenney:
- Convert RCU's BUG_ON() and similar calls to WARN_ON() and similar.
- Replace calls of RCU-bh and RCU-sched update-side functions
to their vanilla RCU counterparts. This series is a step
towards complete removal of the RCU-bh and RCU-sched update-side
functions.
( Note that some of these conversions are going upstream via their
respective maintainers. )
- Documentation updates, including a number of flavor-consolidation
updates from Joel Fernandes.
- Miscellaneous fixes.
- Automate generation of the initrd filesystem used for
rcutorture testing.
- Convert spin_is_locked() assertions to instead use lockdep.
( Note that some of these conversions are going upstream via their
respective maintainers. )
- SRCU updates, especially including a fix from Dennis Krein
for a bag-on-head-class bug.
- RCU torture-test updates.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In multiple functions, the algorithm fields are read after its reference
is dropped through crypto_mod_put. In this case, the algorithm memory
may be freed, resulting in use-after-free bugs. This patch delays the
put operation until the algorithm is never used.
Fixes: 79c65d179a ("crypto: cbc - Convert to skcipher")
Fixes: a7d85e06ed ("crypto: cfb - add support for Cipher FeedBack mode")
Fixes: 043a44001b ("crypto: pcbc - Convert to skcipher")
Cc: <stable@vger.kernel.org>
Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Now that synchronize_rcu() waits for bh-disable regions of code as
well as RCU read-side critical sections, the synchronize_rcu_bh() in
pcrypt_cpumask_change_notify() can be replaced by synchronize_rcu().
This commit therefore makes this change.
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: <linux-crypto@vger.kernel.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Add support for the Adiantum encryption mode. Adiantum was designed by
Paul Crowley and is specified by our paper:
Adiantum: length-preserving encryption for entry-level processors
(https://eprint.iacr.org/2018/720.pdf)
See our paper for full details; this patch only provides an overview.
Adiantum is a tweakable, length-preserving encryption mode designed for
fast and secure disk encryption, especially on CPUs without dedicated
crypto instructions. Adiantum encrypts each sector using the XChaCha12
stream cipher, two passes of an ε-almost-∆-universal (εA∆U) hash
function, and an invocation of the AES-256 block cipher on a single
16-byte block. On CPUs without AES instructions, Adiantum is much
faster than AES-XTS; for example, on ARM Cortex-A7, on 4096-byte sectors
Adiantum encryption is about 4 times faster than AES-256-XTS encryption,
and decryption about 5 times faster.
Adiantum is a specialization of the more general HBSH construction. Our
earlier proposal, HPolyC, was also a HBSH specialization, but it used a
different εA∆U hash function, one based on Poly1305 only. Adiantum's
εA∆U hash function, which is based primarily on the "NH" hash function
like that used in UMAC (RFC4418), is about twice as fast as HPolyC's;
consequently, Adiantum is about 20% faster than HPolyC.
This speed comes with no loss of security: Adiantum is provably just as
secure as HPolyC, in fact slightly *more* secure. Like HPolyC,
Adiantum's security is reducible to that of XChaCha12 and AES-256,
subject to a security bound. XChaCha12 itself has a security reduction
to ChaCha12. Therefore, one need not "trust" Adiantum; one need only
trust ChaCha12 and AES-256. Note that the εA∆U hash function is only
used for its proven combinatorical properties so cannot be "broken".
Adiantum is also a true wide-block encryption mode, so flipping any
plaintext bit in the sector scrambles the entire ciphertext, and vice
versa. No other such mode is available in the kernel currently; doing
the same with XTS scrambles only 16 bytes. Adiantum also supports
arbitrary-length tweaks and naturally supports any length input >= 16
bytes without needing "ciphertext stealing".
For the stream cipher, Adiantum uses XChaCha12 rather than XChaCha20 in
order to make encryption feasible on the widest range of devices.
Although the 20-round variant is quite popular, the best known attacks
on ChaCha are on only 7 rounds, so ChaCha12 still has a substantial
security margin; in fact, larger than AES-256's. 12-round Salsa20 is
also the eSTREAM recommendation. For the block cipher, Adiantum uses
AES-256, despite it having a lower security margin than XChaCha12 and
needing table lookups, due to AES's extensive adoption and analysis
making it the obvious first choice. Nevertheless, for flexibility this
patch also permits the "adiantum" template to be instantiated with
XChaCha20 and/or with an alternate block cipher.
We need Adiantum support in the kernel for use in dm-crypt and fscrypt,
where currently the only other suitable options are block cipher modes
such as AES-XTS. A big problem with this is that many low-end mobile
devices (e.g. Android Go phones sold primarily in developing countries,
as well as some smartwatches) still have CPUs that lack AES
instructions, e.g. ARM Cortex-A7. Sadly, AES-XTS encryption is much too
slow to be viable on these devices. We did find that some "lightweight"
block ciphers are fast enough, but these suffer from problems such as
not having much cryptanalysis or being too controversial.
The ChaCha stream cipher has excellent performance but is insecure to
use directly for disk encryption, since each sector's IV is reused each
time it is overwritten. Even restricting the threat model to offline
attacks only isn't enough, since modern flash storage devices don't
guarantee that "overwrites" are really overwrites, due to wear-leveling.
Adiantum avoids this problem by constructing a
"tweakable super-pseudorandom permutation"; this is the strongest
possible security model for length-preserving encryption.
Of course, storing random nonces along with the ciphertext would be the
ideal solution. But doing that with existing hardware and filesystems
runs into major practical problems; in most cases it would require data
journaling (like dm-integrity) which severely degrades performance.
Thus, for now length-preserving encryption is still needed.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add a generic implementation of NHPoly1305, an ε-almost-∆-universal hash
function used in the Adiantum encryption mode.
CONFIG_NHPOLY1305 is not selectable by itself since there won't be any
real reason to enable it without also enabling Adiantum support.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Expose a low-level Poly1305 API which implements the
ε-almost-∆-universal (εA∆U) hash function underlying the Poly1305 MAC
and supports block-aligned inputs only.
This is needed for Adiantum hashing, which builds an εA∆U hash function
from NH and a polynomial evaluation in GF(2^{130}-5); this polynomial
evaluation is identical to the one the Poly1305 MAC does. However, the
crypto_shash Poly1305 API isn't very appropriate for this because its
calling convention assumes it is used as a MAC, with a 32-byte "one-time
key" provided for every digest.
But by design, in Adiantum hashing the performance of the polynomial
evaluation isn't nearly as critical as NH. So it suffices to just have
some C helper functions. Thus, this patch adds such functions.
Acked-by: Martin Willi <martin@strongswan.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In preparation for exposing a low-level Poly1305 API which implements
the ε-almost-∆-universal (εA∆U) hash function underlying the Poly1305
MAC and supports block-aligned inputs only, create structures
poly1305_key and poly1305_state which hold the limbs of the Poly1305
"r" key and accumulator, respectively.
These structures could actually have the same type (e.g. poly1305_val),
but different types are preferable, to prevent misuse.
Acked-by: Martin Willi <martin@strongswan.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Now that the generic implementation of ChaCha20 has been refactored to
allow varying the number of rounds, add support for XChaCha12, which is
the XSalsa construction applied to ChaCha12. ChaCha12 is one of the
three ciphers specified by the original ChaCha paper
(https://cr.yp.to/chacha/chacha-20080128.pdf: "ChaCha, a variant of
Salsa20"), alongside ChaCha8 and ChaCha20. ChaCha12 is faster than
ChaCha20 but has a lower, but still large, security margin.
We need XChaCha12 support so that it can be used in the Adiantum
encryption mode, which enables disk/file encryption on low-end mobile
devices where AES-XTS is too slow as the CPUs lack AES instructions.
We'd prefer XChaCha20 (the more popular variant), but it's too slow on
some of our target devices, so at least in some cases we do need the
XChaCha12-based version. In more detail, the problem is that Adiantum
is still much slower than we're happy with, and encryption still has a
quite noticeable effect on the feel of low-end devices. Users and
vendors push back hard against encryption that degrades the user
experience, which always risks encryption being disabled entirely. So
we need to choose the fastest option that gives us a solid margin of
security, and here that's XChaCha12. The best known attack on ChaCha
breaks only 7 rounds and has 2^235 time complexity, so ChaCha12's
security margin is still better than AES-256's. Much has been learned
about cryptanalysis of ARX ciphers since Salsa20 was originally designed
in 2005, and it now seems we can be comfortable with a smaller number of
rounds. The eSTREAM project also suggests the 12-round version of
Salsa20 as providing the best balance among the different variants:
combining very good performance with a "comfortable margin of security".
Note that it would be trivial to add vanilla ChaCha12 in addition to
XChaCha12. However, it's unneeded for now and therefore is omitted.
As discussed in the patch that introduced XChaCha20 support, I
considered splitting the code into separate chacha-common, chacha20,
xchacha20, and xchacha12 modules, so that these algorithms could be
enabled/disabled independently. However, since nearly all the code is
shared anyway, I ultimately decided there would have been little benefit
to the added complexity.
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Martin Willi <martin@strongswan.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In preparation for adding XChaCha12 support, rename/refactor
chacha20-generic to support different numbers of rounds. The
justification for needing XChaCha12 support is explained in more detail
in the patch "crypto: chacha - add XChaCha12 support".
The only difference between ChaCha{8,12,20} are the number of rounds
itself; all other parts of the algorithm are the same. Therefore,
remove the "20" from all definitions, structures, functions, files, etc.
that will be shared by all ChaCha versions.
Also make ->setkey() store the round count in the chacha_ctx (previously
chacha20_ctx). The generic code then passes the round count through to
chacha_block(). There will be a ->setkey() function for each explicitly
allowed round count; the encrypt/decrypt functions will be the same. I
decided not to do it the opposite way (same ->setkey() function for all
round counts, with different encrypt/decrypt functions) because that
would have required more boilerplate code in architecture-specific
implementations of ChaCha and XChaCha.
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Martin Willi <martin@strongswan.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add support for the XChaCha20 stream cipher. XChaCha20 is the
application of the XSalsa20 construction
(https://cr.yp.to/snuffle/xsalsa-20081128.pdf) to ChaCha20 rather than
to Salsa20. XChaCha20 extends ChaCha20's nonce length from 64 bits (or
96 bits, depending on convention) to 192 bits, while provably retaining
ChaCha20's security. XChaCha20 uses the ChaCha20 permutation to map the
key and first 128 nonce bits to a 256-bit subkey. Then, it does the
ChaCha20 stream cipher with the subkey and remaining 64 bits of nonce.
We need XChaCha support in order to add support for the Adiantum
encryption mode. Note that to meet our performance requirements, we
actually plan to primarily use the variant XChaCha12. But we believe
it's wise to first add XChaCha20 as a baseline with a higher security
margin, in case there are any situations where it can be used.
Supporting both variants is straightforward.
Since XChaCha20's subkey differs for each request, XChaCha20 can't be a
template that wraps ChaCha20; that would require re-keying the
underlying ChaCha20 for every request, which wouldn't be thread-safe.
Instead, we make XChaCha20 its own top-level algorithm which calls the
ChaCha20 streaming implementation internally.
Similar to the existing ChaCha20 implementation, we define the IV to be
the nonce and stream position concatenated together. This allows users
to seek to any position in the stream.
I considered splitting the code into separate chacha20-common, chacha20,
and xchacha20 modules, so that chacha20 and xchacha20 could be
enabled/disabled independently. However, since nearly all the code is
shared anyway, I ultimately decided there would have been little benefit
to the added complexity of separate modules.
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Martin Willi <martin@strongswan.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
chacha20-generic doesn't use SIMD instructions or otherwise disable
preemption, so passing atomic=true to skcipher_walk_virt() is
unnecessary.
Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Martin Willi <martin@strongswan.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Some algorithms initialize their .cra_list prior to registration.
But this is unnecessary since crypto_register_alg() will overwrite
.cra_list when adding the algorithm to the 'crypto_alg_list'.
Apparently the useless assignment has just been copy+pasted around.
So, remove the useless assignments.
Exception: paes_s390.c uses cra_list to check whether the algorithm is
registered or not, so I left that as-is for now.
This patch shouldn't change any actual behavior.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
ecc_point_mult is supposed to be used with a regularized scalar,
otherwise, it's possible to deduce the position of the top bit of the
scalar with timing attack. This is important when the scalar is a
private key.
ecc_point_mult is already using a regular algorithm (i.e. having an
operation flow independent of the input scalar) but regularization step
is not implemented.
Arrange scalar to always have fixed top bit by adding a multiple of the
curve order (n).
References:
The constant time regularization step is based on micro-ecc by Kenneth
MacKay and also referenced in the literature (Bernstein, D. J., & Lange,
T. (2017). Montgomery curves and the Montgomery ladder. (Cryptology
ePrint Archive; Vol. 2017/293). s.l.: IACR. Chapter 4.6.2.)
Signed-off-by: Vitaly Chikunov <vt@altlinux.org>
Cc: kernel-hardening@lists.openwall.com
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Move CHACHAPOLY_IV_SIZE to header file, so it can be reused.
Signed-off-by: Cristian Stoica <cristian.stoica@nxp.com>
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add testmgr and tcrypt tests and vectors for Streebog hash function
from RFC 6986 and GOST R 34.11-2012, for HMAC-Streebog vectors are
from RFC 7836 and R 50.1.113-2016.
Cc: linux-integrity@vger.kernel.org
Signed-off-by: Vitaly Chikunov <vt@altlinux.org>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Register Streebog hash function in Hash Info arrays to let IMA use
it for its purposes.
Cc: linux-integrity@vger.kernel.org
Signed-off-by: Vitaly Chikunov <vt@altlinux.org>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
cts(cbc(aes)) as used in the kernel has been added to NIST
standard as CBC-CS3. Document it as such.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Suggested-by: Stephan Mueller <smueller@chronox.de>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently used scalar multiplication algorithm (Matthieu Rivain, 2011)
have invalid values for scalar == 1, n-1, and for regularized version
n-2, which was previously not checked. Verify that they are not used as
private keys.
Signed-off-by: Vitaly Chikunov <vt@altlinux.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
As per Sp800-38A addendum from Oct 2010[1], cts(cbc(aes)) is
allowed as a FIPS mode algorithm. Mark it as such.
[1] https://csrc.nist.gov/publications/detail/sp/800-38a/addendum/final
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Reviewed-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
There have been a pretty ridiculous number of issues with initializing
the report structures that are copied to userspace by NETLINK_CRYPTO.
Commit 4473710df1 ("crypto: user - Prepare for CRYPTO_MAX_ALG_NAME
expansion") replaced some strncpy()s with strlcpy()s, thereby
introducing information leaks. Later two other people tried to replace
other strncpy()s with strlcpy() too, which would have introduced even
more information leaks:
- https://lore.kernel.org/patchwork/patch/954991/
- https://patchwork.kernel.org/patch/10434351/
Commit cac5818c25 ("crypto: user - Implement a generic crypto
statistics") also uses the buggy strlcpy() approach and therefore leaks
uninitialized memory to userspace. A fix was proposed, but it was
originally incomplete.
Seeing as how apparently no one can get this right with the current
approach, change all the reporting functions to:
- Start by memsetting the report structure to 0. This guarantees it's
always initialized, regardless of what happens later.
- Initialize all strings using strscpy(). This is safe after the
memset, ensures null termination of long strings, avoids unnecessary
work, and avoids the -Wstringop-truncation warnings from gcc.
- Use sizeof(var) instead of sizeof(type). This is more robust against
copy+paste errors.
For simplicity, also reuse the -EMSGSIZE return value from nla_put().
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The acomp, akcipher, and kpp algorithm types already have .report
methods defined, so there's no need to duplicate this functionality in
crypto_user itself; the duplicate functions are actually never executed.
Remove the unused code.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Passing string 'name' as the format specifier is potentially hazardous
because name could (although very unlikely to) have a format specifier
embedded in it causing issues when parsing the non-existent arguments
to these. Follow best practice by using the "%s" format string for
the string 'name'.
Cleans up clang warning:
crypto/pcrypt.c:397:40: warning: format string is not a string literal
(potentially insecure) [-Wformat-security]
Fixes: a3fb1e330d ("pcrypt: Added sysfs interface to pcrypt")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
crypto_cfb_decrypt_segment() incorrectly XOR'ed generated keystream with
IV, rather than with data stream, resulting in incorrect decryption.
Test vectors will be added in the next patch.
Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Make the ARM scalar AES implementation closer to constant-time by
disabling interrupts and prefetching the tables into L1 cache. This is
feasible because due to ARM's "free" rotations, the main tables are only
1024 bytes instead of the usual 4096 used by most AES implementations.
On ARM Cortex-A7, the speed loss is only about 5%. The resulting code
is still over twice as fast as aes_ti.c. Responsiveness is potentially
a concern, but interrupts are only disabled for a single AES block.
Note that even after these changes, the implementation still isn't
necessarily guaranteed to be constant-time; see
https://cr.yp.to/antiforgery/cachetiming-20050414.pdf for a discussion
of the many difficulties involved in writing truly constant-time AES
software. But it's valuable to make such attacks more difficult.
Much of this patch is based on patches suggested by Ard Biesheuvel.
Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In the "aes-fixed-time" AES implementation, disable interrupts while
accessing the S-box, in order to make cache-timing attacks more
difficult. Previously it was possible for the CPU to be interrupted
while the S-box was loaded into L1 cache, potentially evicting the
cachelines and causing later table lookups to be time-variant.
In tests I did on x86 and ARM, this doesn't affect performance
significantly. Responsiveness is potentially a concern, but interrupts
are only disabled for a single AES block.
Note that even after this change, the implementation still isn't
necessarily guaranteed to be constant-time; see
https://cr.yp.to/antiforgery/cachetiming-20050414.pdf for a discussion
of the many difficulties involved in writing truly constant-time AES
software. But it's valuable to make such attacks more difficult.
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
For preventing uninitialized data to be given to user-space (and so leak
potential useful data), the crypto_stat structure must be correctly
initialized.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: cac5818c25 ("crypto: user - Implement a generic crypto statistics")
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
[EB: also fix it in crypto_reportstat_one()]
[EB: use sizeof(var) rather than sizeof(type)]
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
All bytes of the NETLINK_CRYPTO report structures must be initialized,
since they are copied to userspace. The change from strncpy() to
strlcpy() broke this. As a minimal fix, change it back.
Fixes: 4473710df1 ("crypto: user - Prepare for CRYPTO_MAX_ALG_NAME expansion")
Cc: <stable@vger.kernel.org> # v4.12+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The simd wrapper's skcipher request context structure consists
of a single subrequest whose size is taken from the subordinate
skcipher. However, in simd_skcipher_init(), the reqsize that is
retrieved is not from the subordinate skcipher but from the
cryptd request structure, whose size is completely unrelated to
the actual wrapped skcipher.
Reported-by: Qian Cai <cai@gmx.us>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Qian Cai <cai@gmx.us>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The sign operation can operate in a non-hashed mode by running the RSA
sign operation directly on the input. This assumes that the input is
less than key_size_in_bytes - 11. Since the TPM performs its own PKCS1
padding, it isn't possible to support 'raw' mode, only 'pkcs1'.
Alternatively, a hashed version is also possible. In this variant the
input is hashed (by userspace) via the selected hash function first.
Then this implementation takes care of converting the hash to ASN.1
format and the sign operation is performed on the result. This is
similar to the implementation inside crypto/rsa-pkcs1pad.c.
ASN1 templates were copied from crypto/rsa-pkcs1pad.c. There seems to
be no easy way to expose that functionality, but likely the templates
should be shared somehow.
The sign operation is implemented via TPM_Sign operation on the TPM.
It is assumed that the TPM wrapped key provided uses
TPM_SS_RSASSAPKCS1v15_DER signature scheme. This allows the TPM_Sign
operation to work on data up to key_len_in_bytes - 11 bytes long.
In theory, we could also use TPM_Unbind instead of TPM_Sign, but we would
have to manually pkcs1 pad the digest first.
Signed-off-by: Denis Kenzior <denkenz@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Marcel Holtmann <marcel@holtmann.org>
Reviewed-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: James Morris <james.morris@microsoft.com>
This patch implements the verify_signature operation. The public key
portion extracted from the TPM key blob is used. The operation is
performed entirely in software using the crypto API.
Signed-off-by: Denis Kenzior <denkenz@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Marcel Holtmann <marcel@holtmann.org>
Reviewed-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: James Morris <james.morris@microsoft.com>
This patch implements the pkey_decrypt operation using the private key
blob. The blob is first loaded into the TPM via tpm_loadkey2. Once the
handle is obtained, tpm_unbind operation is used to decrypt the data on
the TPM and the result is returned. The key loaded by tpm_loadkey2 is
then evicted via tpm_flushspecific operation.
This patch assumes that the SRK authorization is a well known 20-byte of
zeros and the same holds for the key authorization of the provided key.
Signed-off-by: Denis Kenzior <denkenz@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Marcel Holtmann <marcel@holtmann.org>
Reviewed-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: James Morris <james.morris@microsoft.com>
This patch exposes some common functionality needed to send TPM commands.
Several functions from keys/trusted.c are exposed for use by the new tpm
key subtype and a module dependency is introduced.
In the future, common functionality between the trusted key type and the
asym_tpm subtype should be factored out into a common utility library.
Signed-off-by: Denis Kenzior <denkenz@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Marcel Holtmann <marcel@holtmann.org>
Reviewed-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: James Morris <james.morris@microsoft.com>
This patch impelements the pkey_encrypt operation. The public key
portion extracted from the TPM key blob is used. The operation is
performed entirely in software using the crypto API.
Signed-off-by: Denis Kenzior <denkenz@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Marcel Holtmann <marcel@holtmann.org>
Reviewed-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: James Morris <james.morris@microsoft.com>
This commit implements the pkey_query operation. This is accomplished
by utilizing the public key portion to obtain max encryption size
information for the operations that utilize the public key (encrypt,
verify). The private key size extracted from the TPM_Key data structure
is used to fill the information where the private key is used (decrypt,
sign).
The kernel uses a DER/BER format for public keys and does not support
setting the key via the raw binary form. To get around this a simple
DER/BER formatter is implemented which stores the DER/BER formatted key
and exponent in a temporary buffer for use by the crypto API.
The only exponent supported currently is 65537. This holds true for
other Linux TPM tools such as 'create_tpm_key' and
trousers-openssl_tpm_engine.
Signed-off-by: Denis Kenzior <denkenz@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Marcel Holtmann <marcel@holtmann.org>
Reviewed-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: James Morris <james.morris@microsoft.com>
For TPM based keys, the only standard seems to be described here:
http://david.woodhou.se/draft-woodhouse-cert-best-practice.html#rfc.section.4.4
Quote from the relevant section:
"Rather, a common form of storage for "wrapped" keys is to encode the
binary TCPA_KEY structure in a single ASN.1 OCTET-STRING, and store the
result in PEM format with the tag "-----BEGIN TSS KEY BLOB-----". "
This patch implements the above behavior. It is assumed that the PEM
encoding is stripped out by userspace and only the raw DER/BER format is
provided. This is similar to how PKCS7, PKCS8 and X.509 keys are
handled.
Signed-off-by: Denis Kenzior <denkenz@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Marcel Holtmann <marcel@holtmann.org>
Reviewed-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: James Morris <james.morris@microsoft.com>
The parsed BER/DER blob obtained from user space contains a TPM_Key
structure. This structure has some information about the key as well as
the public key portion.
This patch extracts this information for future use.
Signed-off-by: Denis Kenzior <denkenz@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Marcel Holtmann <marcel@holtmann.org>
Reviewed-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: James Morris <james.morris@microsoft.com>
The original pkcs1pad implementation allowed to pad/unpad raw RSA
output. However, this has been taken out in commit:
commit c0d20d22e0 ("crypto: rsa-pkcs1pad - Require hash to be present")
This patch restored this ability as it is needed by the asymmetric key
implementation.
Signed-off-by: Denis Kenzior <denkenz@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Marcel Holtmann <marcel@holtmann.org>
Reviewed-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: James Morris <james.morris@microsoft.com>
Implement PKCS#8 RSA Private Key format [RFC 5208] parser for the
asymmetric key type. For the moment, this will only support unencrypted
DER blobs. PEM and decryption can be added later.
PKCS#8 keys can be loaded like this:
openssl pkcs8 -in private_key.pem -topk8 -nocrypt -outform DER | \
keyctl padd asymmetric foo @s
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Marcel Holtmann <marcel@holtmann.org>
Reviewed-by: Marcel Holtmann <marcel@holtmann.org>
Reviewed-by: Denis Kenzior <denkenz@gmail.com>
Tested-by: Denis Kenzior <denkenz@gmail.com>
Signed-off-by: James Morris <james.morris@microsoft.com>
Implement the encrypt, decrypt and sign operations for the software
asymmetric key subtype. This mostly involves offloading the call to the
crypto layer.
Note that the decrypt and sign operations require a private key to be
supplied. Encrypt (and also verify) will work with either a public or a
private key. A public key can be supplied with an X.509 certificate and a
private key can be supplied using a PKCS#8 blob:
# j=`openssl pkcs8 -in ~/pkcs7/firmwarekey2.priv -topk8 -nocrypt -outform DER | keyctl padd asymmetric foo @s`
# keyctl pkey_query $j - enc=pkcs1
key_size=4096
max_data_size=512
max_sig_size=512
max_enc_size=512
max_dec_size=512
encrypt=y
decrypt=y
sign=y
verify=y
# keyctl pkey_encrypt $j 0 data enc=pkcs1 >/tmp/enc
# keyctl pkey_decrypt $j 0 /tmp/enc enc=pkcs1 >/tmp/dec
# cmp data /tmp/dec
# keyctl pkey_sign $j 0 data enc=pkcs1 hash=sha1 >/tmp/sig
# keyctl pkey_verify $j 0 data /tmp/sig enc=pkcs1 hash=sha1
#
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Marcel Holtmann <marcel@holtmann.org>
Reviewed-by: Marcel Holtmann <marcel@holtmann.org>
Reviewed-by: Denis Kenzior <denkenz@gmail.com>
Tested-by: Denis Kenzior <denkenz@gmail.com>
Signed-off-by: James Morris <james.morris@microsoft.com>
Put a flag in the public_key struct to indicate if the structure is holding
a private key. The private key must be held ASN.1 encoded in the format
specified in RFC 3447 A.1.2. This is the form required by crypto/rsa.c.
The software encryption subtype's verification and query functions then
need to select the appropriate crypto function to set the key.
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Marcel Holtmann <marcel@holtmann.org>
Reviewed-by: Marcel Holtmann <marcel@holtmann.org>
Reviewed-by: Denis Kenzior <denkenz@gmail.com>
Tested-by: Denis Kenzior <denkenz@gmail.com>
Signed-off-by: James Morris <james.morris@microsoft.com>
Provide a query function for the software public key implementation. This
permits information about such a key to be obtained using
query_asymmetric_key() or KEYCTL_PKEY_QUERY.
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Marcel Holtmann <marcel@holtmann.org>
Reviewed-by: Marcel Holtmann <marcel@holtmann.org>
Reviewed-by: Denis Kenzior <denkenz@gmail.com>
Tested-by: Denis Kenzior <denkenz@gmail.com>
Signed-off-by: James Morris <james.morris@microsoft.com>
Make the X.509 and PKCS7 parsers fill in the signature encoding type field
recently added to the public_key_signature struct.
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Marcel Holtmann <marcel@holtmann.org>
Reviewed-by: Marcel Holtmann <marcel@holtmann.org>
Reviewed-by: Denis Kenzior <denkenz@gmail.com>
Tested-by: Denis Kenzior <denkenz@gmail.com>
Signed-off-by: James Morris <james.morris@microsoft.com>
Provide the missing asymmetric key subops for new key type ops. This
include query, encrypt, decrypt and create signature. Verify signature
already exists. Also provided are accessor functions for this:
int query_asymmetric_key(const struct key *key,
struct kernel_pkey_query *info);
int encrypt_blob(struct kernel_pkey_params *params,
const void *data, void *enc);
int decrypt_blob(struct kernel_pkey_params *params,
const void *enc, void *data);
int create_signature(struct kernel_pkey_params *params,
const void *data, void *enc);
The public_key_signature struct gains an encoding field to carry the
encoding for verify_signature().
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Marcel Holtmann <marcel@holtmann.org>
Reviewed-by: Marcel Holtmann <marcel@holtmann.org>
Reviewed-by: Denis Kenzior <denkenz@gmail.com>
Tested-by: Denis Kenzior <denkenz@gmail.com>
Signed-off-by: James Morris <james.morris@microsoft.com>
This reverts commit dd979b4df8.
This broke tcp_poll for SMC fallback: An AF_SMC socket establishes an
internal TCP socket for the initial handshake with the remote peer.
Whenever the SMC connection can not be established this TCP socket is
used as a fallback. All socket operations on the SMC socket are then
forwarded to the TCP socket. In case of poll, the file->private_data
pointer references the SMC socket because the TCP socket has no file
assigned. This causes tcp_poll to wait on the wrong socket.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After allocation, output and decomp_output both point to memory chunks of
size COMP_BUF_SIZE. Then, only the first bytes are zeroed out using
sizeof(COMP_BUF_SIZE) as parameter to memset(), because
sizeof(COMP_BUF_SIZE) provides the size of the constant and not the size of
allocated memory.
Instead, the whole allocated memory is meant to be zeroed out. Use
COMP_BUF_SIZE as parameter to memset() directly in order to accomplish
this.
Fixes: 336073840a ("crypto: testmgr - Allow different compression results")
Signed-off-by: Michael Schupikov <michael@schupikov.de>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use the correct __le32 annotation and accessors to perform the
single round of AES encryption performed inside the AEGIS transform.
Otherwise, tcrypt reports:
alg: aead: Test 1 failed on encryption for aegis128-generic
00000000: 6c 25 25 4a 3c 10 1d 27 2b c1 d4 84 9a ef 7f 6e
alg: aead: Test 1 failed on encryption for aegis128l-generic
00000000: cd c6 e3 b8 a0 70 9d 8e c2 4f 6f fe 71 42 df 28
alg: aead: Test 1 failed on encryption for aegis256-generic
00000000: aa ed 07 b1 96 1d e9 e6 f2 ed b5 8e 1c 5f dc 1c
Fixes: f606a88e58 ("crypto: aegis - Add generic AEGIS AEAD implementations")
Cc: <stable@vger.kernel.org> # v4.18+
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Omit the endian swabbing when folding the lengths of the assoc and
crypt input buffers into the state to finalize the tag. This is not
necessary given that the memory representation of the state is in
machine native endianness already.
This fixes an error reported by tcrypt running on a big endian system:
alg: aead: Test 2 failed on encryption for morus640-generic
00000000: a8 30 ef fb e6 26 eb 23 b0 87 dd 98 57 f3 e1 4b
00000010: 21
alg: aead: Test 2 failed on encryption for morus1280-generic
00000000: 88 19 1b fb 1c 29 49 0e ee 82 2f cb 97 a6 a5 ee
00000010: 5f
Fixes: 396be41f16 ("crypto: morus - Add generic MORUS AEAD implementations")
Cc: <stable@vger.kernel.org> # v4.18+
Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Due to an unfortunate interaction between commit fbe1a850b3
("crypto: lrw - Fix out-of bounds access on counter overflow") and
commit c778f96bf3 ("crypto: lrw - Optimize tweak computation"),
we ended up with a version of next_index() that always returns 127.
Fixes: c778f96bf3 ("crypto: lrw - Optimize tweak computation")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
For historical reasons, the AES-NI based implementation of the PCBC
chaining mode uses a special FPU chaining mode wrapper template to
amortize the FPU start/stop overhead over multiple blocks.
When this FPU wrapper was introduced, it supported widely used
chaining modes such as XTS and CTR (as well as LRW), but currently,
PCBC is the only remaining user.
Since there are no known users of pcbc(aes) in the kernel, let's remove
this special driver, and rely on the generic pcbc driver to encapsulate
the AES-NI core cipher.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
We already have OFB test vectors and tcrypt OFB speed tests.
Add OFB functional tests to tcrypt as well.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add a generic version of output feedback mode. We already have support of
several hardware based transformations of this mode and the needed test
vectors but we somehow missed adding a generic software one. Fix this now.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add additional test vectors from "The SM4 Blockcipher Algorithm And Its
Modes Of Operations" draft-ribose-cfrg-sm4-10 and register cipher speed
tests for sm4.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch implement a generic way to get statistics about all crypto
usages.
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In the quest to remove all stack VLA usage from the kernel[1], this
replaces struct crypto_skcipher and SKCIPHER_REQUEST_ON_STACK() usage
with struct crypto_sync_skcipher and SYNC_SKCIPHER_REQUEST_ON_STACK(),
which uses a fixed stack size.
[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In the quest to remove all stack VLA usage from the kernel[1], this
replaces struct crypto_skcipher and SKCIPHER_REQUEST_ON_STACK() usage
with struct crypto_sync_skcipher and SYNC_SKCIPHER_REQUEST_ON_STACK(),
which uses a fixed stack size.
[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In preparation for removal of VLAs due to skcipher requests on the stack
via SKCIPHER_REQUEST_ON_STACK() usage, this introduces the infrastructure
for the "sync skcipher" tfm, which is for handling the on-stack cases of
skcipher, which are always non-ASYNC and have a known limited request
size.
The crypto API additions:
struct crypto_sync_skcipher (wrapper for struct crypto_skcipher)
crypto_alloc_sync_skcipher()
crypto_free_sync_skcipher()
crypto_sync_skcipher_setkey()
crypto_sync_skcipher_get_flags()
crypto_sync_skcipher_set_flags()
crypto_sync_skcipher_clear_flags()
crypto_sync_skcipher_blocksize()
crypto_sync_skcipher_ivsize()
crypto_sync_skcipher_reqtfm()
skcipher_request_set_sync_tfm()
SYNC_SKCIPHER_REQUEST_ON_STACK() (with tfm type check)
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The encryption mode of pkcs1pad never uses out_sg and out_buf, so
there's no need to allocate the buffer, which presently is not even
being freed.
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: linux-crypto@vger.kernel.org
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Dan Aloni <dan@kernelim.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds a test vector for lrw(aes) that triggers wrap-around of
the counter, which is a tricky corner case.
Suggested-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When the LRW block counter overflows, the current implementation returns
128 as the index to the precomputed multiplication table, which has 128
entries. This patch fixes it to return the correct value (127).
Fixes: 64470f1b85 ("[CRYPTO] lrw: Liskov Rivest Wagner, a tweakable narrow block cipher mode")
Cc: <stable@vger.kernel.org> # 2.6.20+
Reported-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In commit 9f480faec5 ("crypto: chacha20 - Fix keystream alignment for
chacha20_block()"), I had missed that chacha20_block() can be called
directly on the buffer passed to get_random_bytes(), which can have any
alignment. So, while my commit didn't break anything, it didn't fully
solve the alignment problems.
Revert my solution and just update chacha20_block() to use
put_unaligned_le32(), so the output buffer need not be aligned.
This is simpler, and on many CPUs it's the same speed.
But, I kept the 'tmp' buffers in extract_crng_user() and
_get_random_bytes() 4-byte aligned, since that alignment is actually
needed for _crng_backtrack_protect() too.
Reported-by: Stephan Müller <smueller@chronox.de>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Since commit acb9b159c7 ("crypto: gf128mul - define gf128mul_x_* in
gf128mul.h"), the gf128mul_x_*() functions are very fast and therefore
caching the computed XTS tweaks has only negligible advantage over
computing them twice.
In fact, since the current caching implementation limits the size of
the calls to the child ecb(...) algorithm to PAGE_SIZE (usually 4096 B),
it is often actually slower than the simple recomputing implementation.
This patch simplifies the XTS template to recompute the XTS tweaks from
scratch in the second pass and thus also removes the need to allocate a
dynamic buffer using kmalloc().
As discussed at [1], the use of kmalloc causes deadlocks with dm-crypt.
PERFORMANCE RESULTS
I measured time to encrypt/decrypt a memory buffer of varying sizes with
xts(ecb-aes-aesni) using a tool I wrote ([2]) and the results suggest
that after this patch the performance is either better or comparable for
both small and large buffers. Note that there is a lot of noise in the
measurements, but the overall difference is easy to see.
Old code:
ALGORITHM KEY (b) DATA (B) TIME ENC (ns) TIME DEC (ns)
xts(aes) 256 64 331 328
xts(aes) 384 64 332 333
xts(aes) 512 64 338 348
xts(aes) 256 512 889 920
xts(aes) 384 512 1019 993
xts(aes) 512 512 1032 990
xts(aes) 256 4096 2152 2292
xts(aes) 384 4096 2453 2597
xts(aes) 512 4096 3041 2641
xts(aes) 256 16384 9443 8027
xts(aes) 384 16384 8536 8925
xts(aes) 512 16384 9232 9417
xts(aes) 256 32768 16383 14897
xts(aes) 384 32768 17527 16102
xts(aes) 512 32768 18483 17322
New code:
ALGORITHM KEY (b) DATA (B) TIME ENC (ns) TIME DEC (ns)
xts(aes) 256 64 328 324
xts(aes) 384 64 324 319
xts(aes) 512 64 320 322
xts(aes) 256 512 476 473
xts(aes) 384 512 509 492
xts(aes) 512 512 531 514
xts(aes) 256 4096 2132 1829
xts(aes) 384 4096 2357 2055
xts(aes) 512 4096 2178 2027
xts(aes) 256 16384 6920 6983
xts(aes) 384 16384 8597 7505
xts(aes) 512 16384 7841 8164
xts(aes) 256 32768 13468 12307
xts(aes) 384 32768 14808 13402
xts(aes) 512 32768 15753 14636
[1] https://lkml.org/lkml/2018/8/23/1315
[2] https://gitlab.com/omos/linux-crypto-bench
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Introduce a facility that can be used to receive a notification
callback when a new algorithm becomes available. This can be used by
existing crypto registrations to trigger a switch from a software-only
algorithm to a hardware-accelerated version.
A new CRYPTO_MSG_ALG_LOADED state is introduced to the existing crypto
notification chain, and the register/unregister functions are exported
so they can be called by subsystems outside of crypto.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
As it turns out, the AVX2 multibuffer SHA routines are currently
broken [0], in a way that would have likely been noticed if this
code were in wide use. Since the code is too complicated to be
maintained by anyone except the original authors, and since the
performance benefits for real-world use cases are debatable to
begin with, it is better to drop it entirely for the moment.
[0] https://marc.info/?l=linux-crypto-vger&m=153476243825350&w=2
Suggested-by: Eric Biggers <ebiggers@google.com>
Cc: Megha Dey <megha.dey@linux.intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In the quest to remove all stack VLA usage from the kernel[1], this uses
the newly defined max alignment to perform unaligned hashing to avoid
VLAs, and drops the helper function while adding sanity checks on the
resulting buffer sizes. Additionally, the __aligned_largest macro is
removed since this helper was the only user.
[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In the quest to remove all stack VLA usage from the kernel[1], this
exposes a new general upper bound on crypto blocksize and alignmask
(higher than for the existing cipher limits) for VLA removal,
and introduces new checks.
At present, the highest cra_alignmask in the kernel is 63. The highest
cra_blocksize is 144 (SHA3_224_BLOCK_SIZE, 18 8-byte words). For the
new blocksize limit, I went with 160 (20 8-byte words).
[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In the quest to remove all stack VLA usage from the kernel[1], this
removes the VLAs in SHASH_DESC_ON_STACK (via crypto_shash_descsize())
by using the maximum allowable size (which is now more clearly captured
in a macro), along with a few other cases. Similar limits are turned into
macros as well.
A review of existing sizes shows that SHA512_DIGEST_SIZE (64) is the
largest digest size and that sizeof(struct sha3_state) (360) is the
largest descriptor size. The corresponding maximums are reduced.
[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In the quest to remove all stack VLA usage from the kernel[1], this drops
AHASH_REQUEST_ON_STACK by preallocating the ahash request area combined
with the skcipher area (which are not used at the same time).
[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In the quest to remove all stack VLA usage from the kernel[1], this uses
the maximum blocksize and adds a sanity check. For xcbc, the blocksize
must always be 16, so use that, since it's already being enforced during
instantiation.
[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
These are unused, undesired, and have never actually been used by
anybody. The original authors of this code have changed their mind about
its inclusion. While originally proposed for disk encryption on low-end
devices, the idea was discarded [1] in favor of something else before
that could really get going. Therefore, this patch removes Speck.
[1] https://marc.info/?l=linux-crypto-vger&m=153359499015659
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Eric Biggers <ebiggers@google.com>
Cc: stable@vger.kernel.org
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This round brings couple of framework changes, a new driver and usual driver
updates:
- New managed helper for dmaengine framework registration
- Split dmaengine pause capability to pause and resume and allow drivers to
report that individually
- Update dma_request_chan_by_mask() to handle deferred probing
- Move imx-sdma to use virt-dma
- New driver for Actions Semi Owl family S900 controller
- Minor updates to intel, renesas, mv_xor, pl330 etc
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJbdsctAAoJEHwUBw8lI4NHZrIP/3/HrNSUKApt1KdOcG5UA7nu
7O3BcvkAahmM285Hw3a/zLEnSm2sJ/6EI0lN1sz+VYi8IECG7nbCyHQh3Bd1Mxi1
XLHafdTGcI5b7rpicNtRS1BHCPtNrgOypFxs8b/bTatbzc/aWM8K8WFLX27sqGZT
1Sb2nNKKrVbQDVqJ+1ZEQ4q86w61tPHmmRH0icl1DAQREfsvbu/bRMdol5H7/orx
A+ZGH39Ig3FI8/Ri8KccqShvG0VM1yCVJca+0j30IL1x4JNZ36uG+NQbtkBIkOJC
kk9qfCu3ugm4NOtfKGOtkmmOwE9/GirRh+QMPpSmi6oQu4vdOVxyQyYpKukHIer1
vxwpvo2b+3POMfHi1kuqDJhcGIEPak6tH2Oyd01l7nA7Lyww9iC2AyiL89knw+i6
aUK4oHIhf2fFLUN6/ck4JbBqQ3MrDNraZfLJcnmQPtpTftW9Yqd2yqs7Cf1gcBC9
jyLAekJENiUmaNJsL5nJUMDVGG0lIiOnfwtPNfPZJuWu+4doKb2pM4+Ljcyfn2g0
ub4fPfXp0wcFaVarjpQr6T0tdZVMpmrPSTPGS5BdVZbWntrNOpiHmmPVEOLNz3zb
ibIMFn478/RYYB5pcNtHkUaOF4tu0w46fSqRp1ixkey+FIHKlj8/B+YeaAJF0nJh
fc4XaTTJgLufzc1F0ztU
=kbCC
-----END PGP SIGNATURE-----
Merge tag 'dmaengine-4.19-rc1' of git://git.infradead.org/users/vkoul/slave-dma
Pull DMAengine updates from Vinod Koul:
"This round brings couple of framework changes, a new driver and usual
driver updates:
- new managed helper for dmaengine framework registration
- split dmaengine pause capability to pause and resume and allow
drivers to report that individually
- update dma_request_chan_by_mask() to handle deferred probing
- move imx-sdma to use virt-dma
- new driver for Actions Semi Owl family S900 controller
- minor updates to intel, renesas, mv_xor, pl330 etc"
* tag 'dmaengine-4.19-rc1' of git://git.infradead.org/users/vkoul/slave-dma: (46 commits)
dmaengine: Add Actions Semi Owl family S900 DMA driver
dt-bindings: dmaengine: Add binding for Actions Semi Owl SoCs
dmaengine: sh: rcar-dmac: Should not stop the DMAC by rcar_dmac_sync_tcr()
dmaengine: mic_x100_dma: use the new helper to simplify the code
dmaengine: add a new helper dmaenginem_async_device_register
dmaengine: imx-sdma: add memcpy interface
dmaengine: imx-sdma: add SDMA_BD_MAX_CNT to replace '0xffff'
dmaengine: dma_request_chan_by_mask() to handle deferred probing
dmaengine: pl330: fix irq race with terminate_all
dmaengine: Revert "dmaengine: mv_xor_v2: enable COMPILE_TEST"
dmaengine: mv_xor_v2: use {lower,upper}_32_bits to configure HW descriptor address
dmaengine: mv_xor_v2: enable COMPILE_TEST
dmaengine: mv_xor_v2: move unmap to before callback
dmaengine: mv_xor_v2: convert callback to helper function
dmaengine: mv_xor_v2: kill the tasklets upon exit
dmaengine: mv_xor_v2: explicitly freeup irq
dmaengine: sh: rcar-dmac: Add dma_pause operation
dmaengine: sh: rcar-dmac: add a new function to clear CHCR.DE with barrier
dmaengine: idma64: Support dmaengine_terminate_sync()
dmaengine: hsu: Support dmaengine_terminate_sync()
...
Replace the use of a magic number that indicates that verify_*_signature()
should use the secondary keyring with a symbol.
Signed-off-by: Yannik Sembritzki <yannik@sembritzki.me>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: keyrings@vger.kernel.org
Cc: linux-security-module@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull integrity updates from James Morris:
"This adds support for EVM signatures based on larger digests, contains
a new audit record AUDIT_INTEGRITY_POLICY_RULE to differentiate the
IMA policy rules from the IMA-audit messages, addresses two deadlocks
due to either loading or searching for crypto algorithms, and cleans
up the audit messages"
* 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
EVM: fix return value check in evm_write_xattrs()
integrity: prevent deadlock during digsig verification.
evm: Allow non-SHA1 digital signatures
evm: Don't deadlock if a crypto algorithm is unavailable
integrity: silence warning when CONFIG_SECURITYFS is not enabled
ima: Differentiate auditing policy rules from "audit" actions
ima: Do not audit if CONFIG_INTEGRITY_AUDIT is not set
ima: Use audit_log_format() rather than audit_log_string()
ima: Call audit_log_string() rather than logging it untrusted
Make it return -EINVAL if crypto_dh_key_len() is incorrect rather than
overflowing the buffer.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
It was forgotten to increase DH_KPP_SECRET_MIN_SIZE to include 'q_size',
causing an out-of-bounds write of 4 bytes in crypto_dh_encode_key(), and
an out-of-bounds read of 4 bytes in crypto_dh_decode_key(). Fix it, and
fix the lengths of the test vectors to match this.
Reported-by: syzbot+6d38d558c25b53b8f4ed@syzkaller.appspotmail.com
Fixes: e3fe0ae129 ("crypto: dh - add public key verification test")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Like the skcipher_walk and blkcipher_walk cases:
scatterwalk_done() is only meant to be called after a nonzero number of
bytes have been processed, since scatterwalk_pagedone() will flush the
dcache of the *previous* page. But in the error case of
ablkcipher_walk_done(), e.g. if the input wasn't an integer number of
blocks, scatterwalk_done() was actually called after advancing 0 bytes.
This caused a crash ("BUG: unable to handle kernel paging request")
during '!PageSlab(page)' on architectures like arm and arm64 that define
ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE, provided that the input was
page-aligned as in that case walk->offset == 0.
Fix it by reorganizing ablkcipher_walk_done() to skip the
scatterwalk_advance() and scatterwalk_done() if an error has occurred.
Reported-by: Liu Chao <liuchao741@huawei.com>
Fixes: bf06099db1 ("crypto: skcipher - Add ablkcipher_walk interfaces")
Cc: <stable@vger.kernel.org> # v2.6.35+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Like the skcipher_walk case:
scatterwalk_done() is only meant to be called after a nonzero number of
bytes have been processed, since scatterwalk_pagedone() will flush the
dcache of the *previous* page. But in the error case of
blkcipher_walk_done(), e.g. if the input wasn't an integer number of
blocks, scatterwalk_done() was actually called after advancing 0 bytes.
This caused a crash ("BUG: unable to handle kernel paging request")
during '!PageSlab(page)' on architectures like arm and arm64 that define
ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE, provided that the input was
page-aligned as in that case walk->offset == 0.
Fix it by reorganizing blkcipher_walk_done() to skip the
scatterwalk_advance() and scatterwalk_done() if an error has occurred.
This bug was found by syzkaller fuzzing.
Reproducer, assuming ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE:
#include <linux/if_alg.h>
#include <sys/socket.h>
#include <unistd.h>
int main()
{
struct sockaddr_alg addr = {
.salg_type = "skcipher",
.salg_name = "ecb(aes-generic)",
};
char buffer[4096] __attribute__((aligned(4096))) = { 0 };
int fd;
fd = socket(AF_ALG, SOCK_SEQPACKET, 0);
bind(fd, (void *)&addr, sizeof(addr));
setsockopt(fd, SOL_ALG, ALG_SET_KEY, buffer, 16);
fd = accept(fd, NULL, NULL);
write(fd, buffer, 15);
read(fd, buffer, 15);
}
Reported-by: Liu Chao <liuchao741@huawei.com>
Fixes: 5cde0af2a9 ("[CRYPTO] cipher: Added block cipher type")
Cc: <stable@vger.kernel.org> # v2.6.19+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
scatterwalk_done() is only meant to be called after a nonzero number of
bytes have been processed, since scatterwalk_pagedone() will flush the
dcache of the *previous* page. But in the error case of
skcipher_walk_done(), e.g. if the input wasn't an integer number of
blocks, scatterwalk_done() was actually called after advancing 0 bytes.
This caused a crash ("BUG: unable to handle kernel paging request")
during '!PageSlab(page)' on architectures like arm and arm64 that define
ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE, provided that the input was
page-aligned as in that case walk->offset == 0.
Fix it by reorganizing skcipher_walk_done() to skip the
scatterwalk_advance() and scatterwalk_done() if an error has occurred.
This bug was found by syzkaller fuzzing.
Reproducer, assuming ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE:
#include <linux/if_alg.h>
#include <sys/socket.h>
#include <unistd.h>
int main()
{
struct sockaddr_alg addr = {
.salg_type = "skcipher",
.salg_name = "cbc(aes-generic)",
};
char buffer[4096] __attribute__((aligned(4096))) = { 0 };
int fd;
fd = socket(AF_ALG, SOCK_SEQPACKET, 0);
bind(fd, (void *)&addr, sizeof(addr));
setsockopt(fd, SOL_ALG, ALG_SET_KEY, buffer, 16);
fd = accept(fd, NULL, NULL);
write(fd, buffer, 15);
read(fd, buffer, 15);
}
Reported-by: Liu Chao <liuchao741@huawei.com>
Fixes: b286d8b1a6 ("crypto: skcipher - Add skcipher walk interface")
Cc: <stable@vger.kernel.org> # v4.10+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Setting 'walk->nbytes = walk->total' in skcipher_walk_first() doesn't
make sense because actually walk->nbytes needs to be set to the length
of the first step in the walk, which may be less than walk->total. This
is done by skcipher_walk_next() which is called immediately afterwards.
Also walk->nbytes was already set to 0 in skcipher_walk_skcipher(),
which is a better default value in case it's forgotten to be set later.
Therefore, remove the unnecessary assignment to walk->nbytes.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
All callers pass chain=0 to scatterwalk_crypto_chain().
Remove this unneeded parameter.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The ALIGN() macro needs to be passed the alignment, not the alignmask
(which is the alignment minus 1).
Fixes: b286d8b1a6 ("crypto: skcipher - Add skcipher walk interface")
Cc: <stable@vger.kernel.org> # v4.10+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Avoid RCU stalls in the case of non-preemptible kernel and lengthy
speed tests by rescheduling when advancing from one block size
to another.
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The wait_address argument is always directly derived from the filp
argument, so remove it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make use of the swap macro and remove unnecessary variable *tmp*.
This makes the code easier to read and maintain.
This code was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Make use of the swap macro and remove unnecessary variable *tmp*.
This makes the code easier to read and maintain.
This code was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Fix the b value to be compliant with FIPS 186-4 D.1.2.1. This fix is
required to make sure the SP800-56A public key test passes for P-192.
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
By adding a zero byte-length for the DH parameter Q value, the public
key verification test is disabled for the given test.
Reported-by: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The CTR DRBG requires two SGLs pointing to input/output buffers for the
CTR AES operation. The used SGLs always have only one entry. Thus, the
SGL can be initialized during allocation time, preventing a
re-initialization of the SGLs during each call.
The performance is increased by about 1 to 3 percent depending on the
size of the requested buffer size.
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In case memory resources for *base* were allocated, release them
before return.
Addresses-Coverity-ID: 1471702 ("Resource leak")
Fixes: e3fe0ae129 ("crypto: dh - add public key verification test")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Stephan Müller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Pull crypto fix from Herbert Xu:
"This fixes an allocation error-path bug in af_alg discovered by
syzkaller"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: af_alg - Initialize sg_num_bytes in error code path
When EVM attempts to appraise a file signed with a crypto algorithm the
kernel doesn't have support for, it will cause the kernel to trigger a
module load. If the EVM policy includes appraisal of kernel modules this
will in turn call back into EVM - since EVM is holding a lock until the
crypto initialisation is complete, this triggers a deadlock. Add a
CRYPTO_NOLOAD flag and skip module loading if it's set, and add that flag
in the EVM case in order to fail gracefully with an error message
instead of deadlocking.
Signed-off-by: Matthew Garrett <mjg59@google.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
The RX SGL in processing is already registered with the RX SGL tracking
list to support proper cleanup. The cleanup code path uses the
sg_num_bytes variable which must therefore be always initialized, even
in the error code path.
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Reported-by: syzbot+9c251bdd09f83b92ba95@syzkaller.appspotmail.com
#syz test: https://github.com/google/kmsan.git master
CC: <stable@vger.kernel.org> #4.14
Fixes: e870456d8e ("crypto: algif_skcipher - overhaul memory management")
Fixes: d887c52d6a ("crypto: algif_aead - overhaul memory management")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The testmgr hash tests were testing init, digest, update and final
methods but not the finup method. Add a test for this one too.
While doing this, make sure we only run the partial tests once with
the digest tests and skip them with the final and finup tests since
they are the same.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Some aead algorithms set .cra_flags = CRYPTO_ALG_TYPE_AEAD. But this is
redundant with the C structure type ('struct aead_alg'), and
crypto_register_aead() already sets the type flag automatically,
clearing any type flag that was already there. Apparently the useless
assignment has just been copy+pasted around.
So, remove the useless assignment from all the aead algorithms.
This patch shouldn't change any actual behavior.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Many shash algorithms set .cra_flags = CRYPTO_ALG_TYPE_SHASH. But this
is redundant with the C structure type ('struct shash_alg'), and
crypto_register_shash() already sets the type flag automatically,
clearing any type flag that was already there. Apparently the useless
assignment has just been copy+pasted around.
So, remove the useless assignment from all the shash algorithms.
This patch shouldn't change any actual behavior.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
sha512-generic and sha384-generic had a cra_priority of 0, so it wasn't
possible to have a lower priority SHA-512 or SHA-384 implementation, as
is desired for sha512_mb which is only useful under certain workloads
and is otherwise extremely slow. Change them to priority 100, which is
the priority used for many of the other generic algorithms.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
sha256-generic and sha224-generic had a cra_priority of 0, so it wasn't
possible to have a lower priority SHA-256 or SHA-224 implementation, as
is desired for sha256_mb which is only useful under certain workloads
and is otherwise extremely slow. Change them to priority 100, which is
the priority used for many of the other generic algorithms.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
sha1-generic had a cra_priority of 0, so it wasn't possible to have a
lower priority SHA-1 implementation, as is desired for sha1_mb which is
only useful under certain workloads and is otherwise extremely slow.
Change it to priority 100, which is the priority used for many of the
other generic algorithms.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
According to SP800-56A section 5.6.2.1, the public key to be processed
for the DH operation shall be checked for appropriateness. The check
shall covers the full verification test in case the domain parameter Q
is provided as defined in SP800-56A section 5.6.2.3.1. If Q is not
provided, the partial check according to SP800-56A section 5.6.2.3.2 is
performed.
The full verification test requires the presence of the domain parameter
Q. Thus, the patch adds the support to handle Q. It is permissible to
not provide the Q value as part of the domain parameters. This implies
that the interface is still backwards-compatible where so far only P and
G are to be provided. However, if Q is provided, it is imported.
Without the test, the NIST ACVP testing fails. After adding this check,
the NIST ACVP testing passes. Testing without providing the Q domain
parameter has been performed to verify the interface has not changed.
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
As of GCC 9.0.0 the build is reporting warnings like:
crypto/ablkcipher.c: In function ‘crypto_ablkcipher_report’:
crypto/ablkcipher.c:374:2: warning: ‘strncpy’ specified bound 64 equals destination size [-Wstringop-truncation]
strncpy(rblkcipher.geniv, alg->cra_ablkcipher.geniv ?: "<default>",
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sizeof(rblkcipher.geniv));
~~~~~~~~~~~~~~~~~~~~~~~~~
This means the strnycpy might create a non null terminated string. Fix this by
explicitly performing '\0' termination.
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Eric Biggers <ebiggers3@gmail.com>
Cc: Nick Desaulniers <nick.desaulniers@gmail.com>
Signed-off-by: Stafford Horne <shorne@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
According to SP800-56A section 5.6.2.1, the public key to be processed
for the ECDH operation shall be checked for appropriateness. When the
public key is considered to be an ephemeral key, the partial validation
test as defined in SP800-56A section 5.6.2.3.4 can be applied.
The partial verification test requires the presence of the field
elements of a and b. For the implemented NIST curves, b is defined in
FIPS 186-4 appendix D.1.2. The element a is implicitly given with the
Weierstrass equation given in D.1.2 where a = p - 3.
Without the test, the NIST ACVP testing fails. After adding this check,
the NIST ACVP testing passes.
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The function skcipher_walk_next declared as static and marked as
EXPORT_SYMBOL_GPL. It's a bit confusing for internal function to be
exported. The area of visibility for such function is its .c file
and all other modules. Other *.c files of the same module can't use it,
despite all other modules can. Relying on the fact that this is the
internal function and it's not a crucial part of the API, the patch
just removes the EXPORT_SYMBOL_GPL marking of skcipher_walk_next.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Remove the original version of the VMAC template that had the nonce
hardcoded to 0 and produced a digest with the wrong endianness. I'm
unsure whether this had users or not (there are no explicit in-kernel
references to it), but given that the hardcoded nonce made it wildly
insecure unless a unique key was used for each message, let's try
removing it and see if anyone complains.
Leave the new "vmac64" template that requires the nonce to be explicitly
specified as the first 16 bytes of data and uses the correct endianness
for the digest.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently the VMAC template uses a "nonce" hardcoded to 0, which makes
it insecure unless a unique key is set for every message. Also, the
endianness of the final digest is wrong: the implementation uses little
endian, but the VMAC specification has it as big endian, as do other
VMAC implementations such as the one in Crypto++.
Add a new VMAC template where the nonce is passed as the first 16 bytes
of data (similar to what is done for Poly1305's nonce), and the digest
is big endian. Call it "vmac64", since the old name of simply "vmac"
didn't clarify whether the implementation is of VMAC-64 or of VMAC-128
(which produce 64-bit and 128-bit digests respectively); so we fix the
naming ambiguity too.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
syzbot reported a crash in vmac_final() when multiple threads
concurrently use the same "vmac(aes)" transform through AF_ALG. The bug
is pretty fundamental: the VMAC template doesn't separate per-request
state from per-tfm (per-key) state like the other hash algorithms do,
but rather stores it all in the tfm context. That's wrong.
Also, vmac_final() incorrectly zeroes most of the state including the
derived keys and cached pseudorandom pad. Therefore, only the first
VMAC invocation with a given key calculates the correct digest.
Fix these bugs by splitting the per-tfm state from the per-request state
and using the proper init/update/final sequencing for requests.
Reproducer for the crash:
#include <linux/if_alg.h>
#include <sys/socket.h>
#include <unistd.h>
int main()
{
int fd;
struct sockaddr_alg addr = {
.salg_type = "hash",
.salg_name = "vmac(aes)",
};
char buf[256] = { 0 };
fd = socket(AF_ALG, SOCK_SEQPACKET, 0);
bind(fd, (void *)&addr, sizeof(addr));
setsockopt(fd, SOL_ALG, ALG_SET_KEY, buf, 16);
fork();
fd = accept(fd, NULL, NULL);
for (;;)
write(fd, buf, 256);
}
The immediate cause of the crash is that vmac_ctx_t.partial_size exceeds
VMAC_NHBYTES, causing vmac_final() to memset() a negative length.
Reported-by: syzbot+264bca3a6e8d645550d3@syzkaller.appspotmail.com
Fixes: f1939f7c56 ("crypto: vmac - New hash algorithm for intel_txt support")
Cc: <stable@vger.kernel.org> # v2.6.32+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The VMAC template assumes the block cipher has a 128-bit block size, but
it failed to check for that. Thus it was possible to instantiate it
using a 64-bit block size cipher, e.g. "vmac(cast5)", causing
uninitialized memory to be used.
Add the needed check when instantiating the template.
Fixes: f1939f7c56 ("crypto: vmac - New hash algorithm for intel_txt support")
Cc: <stable@vger.kernel.org> # v2.6.32+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The poll() changes were not well thought out, and completely
unexplained. They also caused a huge performance regression, because
"->poll()" was no longer a trivial file operation that just called down
to the underlying file operations, but instead did at least two indirect
calls.
Indirect calls are sadly slow now with the Spectre mitigation, but the
performance problem could at least be largely mitigated by changing the
"->get_poll_head()" operation to just have a per-file-descriptor pointer
to the poll head instead. That gets rid of one of the new indirections.
But that doesn't fix the new complexity that is completely unwarranted
for the regular case. The (undocumented) reason for the poll() changes
was some alleged AIO poll race fixing, but we don't make the common case
slower and more complex for some uncommon special case, so this all
really needs way more explanations and most likely a fundamental
redesign.
[ This revert is a revert of about 30 different commits, not reverted
individually because that would just be unnecessarily messy - Linus ]
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull security subsystem fixes from James Morris:
- Smack: fix a regression caused by 1bbc55131e
- X.509: fix a (usually un-seen) bug in RSA signature parsing
* 'fixes-v4.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
X.509: unpack RSA signatureValue field from BIT STRING
Smack: Mark inode instant in smack_task_to_inode
The signatureValue field of a X.509 certificate is encoded as a BIT STRING.
For RSA signatures this BIT STRING is of so-called primitive subtype, which
contains a u8 prefix indicating a count of unused bits in the encoding.
We have to strip this prefix from signature data, just as we already do for
key data in x509_extract_key_data() function.
This wasn't noticed earlier because this prefix byte is zero for RSA key
sizes divisible by 8. Since BIT STRING is a big-endian encoding adding zero
prefixes has no bearing on its value.
The signature length, however was incorrect, which is a problem for RSA
implementations that need it to be exactly correct (like AMD CCP).
Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Fixes: c26fd69fa0 ("X.509: Add a crypto key parser for binary (DER) X.509 certificates")
Cc: stable@vger.kernel.org
Signed-off-by: James Morris <james.morris@microsoft.com>
Pull crypto fixes from Herbert Xu:
- Fix use after free in chtls
- Fix RBP breakage in sha3
- Fix use after free in hwrng_unregister
- Fix overread in morus640
- Move sleep out of kernel_neon in arm64/aes-blk
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
hwrng: core - Always drop the RNG in hwrng_unregister()
crypto: morus640 - Fix out-of-bounds access
crypto: don't optimize keccakf()
crypto: arm64/aes-blk - fix and move skcipher_walk_done out of kernel_neon_begin, _end
crypto: chtls - use after free in chtls_pt_recvmsg()
Trival fix to correct the indentation of a single statement
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds the sha384 pre-computed 0-length hash so that device
drivers can use it when an hardware engine does not support computing a
hash from a 0 length input.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds the sha512 pre-computed 0-length hash so that device
drivers can use it when an hardware engine does not support computing a
hash from a 0 length input.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In the quest to remove VLAs from the kernel[1], this adjusts the
allocation of coefs and blocks to use the existing maximum values
(with one new define, MAX_DISKS for coefs, and a reuse of the
existing NDISKS for blocks).
[1] https://lkml.org/lkml/2018/3/7/621
Signed-off-by: Kyle Spiers <ksspiers@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
As we move stuff around, some doc references are broken. Fix some of
them via this script:
./scripts/documentation-file-ref-check --fix
Manually checked if the produced result is valid, removing a few
false-positives.
Acked-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Stephen Boyd <sboyd@kernel.org>
Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Acked-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Coly Li <colyli@suse.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
We must load the block from the temporary variable here, not directly
from the input.
Also add forgotten zeroing-out of the uninitialized part of the
temporary block (as is done correctly in morus1280.c).
Fixes: 396be41f16 ("crypto: morus - Add generic MORUS AEAD implementations")
Reported-by: syzbot+1fafa9c4cf42df33f716@syzkaller.appspotmail.com
Reported-by: syzbot+d82643ba80bf6937cd44@syzkaller.appspotmail.com
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
keccakf() is the only function in kernel that uses __optimize() macro.
__optimize() breaks frame pointer unwinder as optimized code uses RBP,
and amusingly this always lead to degraded performance as gcc does not
inline across different optimizations levels, so keccakf() wasn't inlined
into its callers and keccakf_round() wasn't inlined into keccakf().
Drop __optimize() to resolve both problems.
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Fixes: 83dee2ce1a ("crypto: sha3-generic - rewrite KECCAK transform to help the compiler optimize")
Reported-by: syzbot+37035ccfa9a0a017ffcf@syzkaller.appspotmail.com
Reported-by: syzbot+e073e4740cfbb3ae200b@syzkaller.appspotmail.com
Cc: linux-crypto@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
- Use overflow helpers in 2-factor allocators (Kees, Rasmus)
- Introduce overflow test module (Rasmus, Kees)
- Introduce saturating size helper functions (Matthew, Kees)
- Treewide use of struct_size() for allocators (Kees)
-----BEGIN PGP SIGNATURE-----
Comment: Kees Cook <kees@outflux.net>
iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAlsYJ1gWHGtlZXNjb29r
QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJlCTEACwdEeriAd2VwxknnsstojGD/3g
8TTFA19vSu4Gxa6WiDkjGoSmIlfhXTlZo1Nlmencv16ytSvIVDNLUIB3uDxUIv1J
2+dyHML9JpXYHHR7zLXXnGFJL0wazqjbsD3NYQgXqmun7EVVYnOsAlBZ7h/Lwiej
jzEJd8DaHT3TA586uD3uggiFvQU0yVyvkDCDONIytmQx+BdtGdg9TYCzkBJaXuDZ
YIthyKDvxIw5nh/UaG3L+SKo73tUr371uAWgAfqoaGQQCWe+mxnWL4HkCKsjFzZL
u9ouxxF/n6pij3E8n6rb0i2fCzlsTDdDF+aqV1rQ4I4hVXCFPpHUZgjDPvBWbj7A
m6AfRHVNnOgI8HGKqBGOfViV+2kCHlYeQh3pPW33dWzy/4d/uq9NIHKxE63LH+S4
bY3oO2ela8oxRyvEgXLjqmRYGW1LB/ZU7FS6Rkx2gRzo4k8Rv+8K/KzUHfFVRX61
jEbiPLzko0xL9D53kcEn0c+BhofK5jgeSWxItdmfuKjLTW4jWhLRlU+bcUXb6kSS
S3G6aF+L+foSUwoq63AS8QxCuabuhreJSB+BmcGUyjthCbK/0WjXYC6W/IJiRfBa
3ZTxBC/2vP3uq/AGRNh5YZoxHL8mSxDfn62F+2cqlJTTKR/O+KyDb1cusyvk3H04
KCDVLYPxwQQqK1Mqig==
=/3L8
-----END PGP SIGNATURE-----
Merge tag 'overflow-v4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull overflow updates from Kees Cook:
"This adds the new overflow checking helpers and adds them to the
2-factor argument allocators. And this adds the saturating size
helpers and does a treewide replacement for the struct_size() usage.
Additionally this adds the overflow testing modules to make sure
everything works.
I'm still working on the treewide replacements for allocators with
"simple" multiplied arguments:
*alloc(a * b, ...) -> *alloc_array(a, b, ...)
and
*zalloc(a * b, ...) -> *calloc(a, b, ...)
as well as the more complex cases, but that's separable from this
portion of the series. I expect to have the rest sent before -rc1
closes; there are a lot of messy cases to clean up.
Summary:
- Introduce arithmetic overflow test helper functions (Rasmus)
- Use overflow helpers in 2-factor allocators (Kees, Rasmus)
- Introduce overflow test module (Rasmus, Kees)
- Introduce saturating size helper functions (Matthew, Kees)
- Treewide use of struct_size() for allocators (Kees)"
* tag 'overflow-v4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
treewide: Use struct_size() for devm_kmalloc() and friends
treewide: Use struct_size() for vmalloc()-family
treewide: Use struct_size() for kmalloc()-family
device: Use overflow helpers for devm_kmalloc()
mm: Use overflow helpers in kvmalloc()
mm: Use overflow helpers in kmalloc_array*()
test_overflow: Add memory allocation overflow tests
overflow.h: Add allocation size calculation helpers
test_overflow: Report test failures
test_overflow: macrofy some more, do more tests for free
lib: add runtime test of check_*_overflow functions
compiler.h: enable builtin overflow checkers and add fallback code
Pull aio updates from Al Viro:
"Majority of AIO stuff this cycle. aio-fsync and aio-poll, mostly.
The only thing I'm holding back for a day or so is Adam's aio ioprio -
his last-minute fixup is trivial (missing stub in !CONFIG_BLOCK case),
but let it sit in -next for decency sake..."
* 'work.aio-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
aio: sanitize the limit checking in io_submit(2)
aio: fold do_io_submit() into callers
aio: shift copyin of iocb into io_submit_one()
aio_read_events_ring(): make a bit more readable
aio: all callers of aio_{read,write,fsync,poll} treat 0 and -EIOCBQUEUED the same way
aio: take list removal to (some) callers of aio_complete()
aio: add missing break for the IOCB_CMD_FDSYNC case
random: convert to ->poll_mask
timerfd: convert to ->poll_mask
eventfd: switch to ->poll_mask
pipe: convert to ->poll_mask
crypto: af_alg: convert to ->poll_mask
net/rxrpc: convert to ->poll_mask
net/iucv: convert to ->poll_mask
net/phonet: convert to ->poll_mask
net/nfc: convert to ->poll_mask
net/caif: convert to ->poll_mask
net/bluetooth: convert to ->poll_mask
net/sctp: convert to ->poll_mask
net/tipc: convert to ->poll_mask
...
This reverts commit eb772f37ae, as now the
x86 Salsa20 implementation has been removed and the generic helpers are
no longer needed outside of salsa20_generic.c.
We could keep this just in case someone else wants to add a new
optimized Salsa20 implementation. But given that we have ChaCha20 now
too, I think it's unlikely. And this can always be reverted back.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The x86 assembly implementations of Salsa20 use the frame base pointer
register (%ebp or %rbp), which breaks frame pointer convention and
breaks stack traces when unwinding from an interrupt in the crypto code.
Recent (v4.10+) kernels will warn about this, e.g.
WARNING: kernel stack regs at 00000000a8291e69 in syzkaller047086:4677 has bad 'bp' value 000000001077994c
[...]
But after looking into it, I believe there's very little reason to still
retain the x86 Salsa20 code. First, these are *not* vectorized
(SSE2/SSSE3/AVX2) implementations, which would be needed to get anywhere
close to the best Salsa20 performance on any remotely modern x86
processor; they're just regular x86 assembly. Second, it's still
unclear that anyone is actually using the kernel's Salsa20 at all,
especially given that now ChaCha20 is supported too, and with much more
efficient SSSE3 and AVX2 implementations. Finally, in benchmarks I did
on both Intel and AMD processors with both gcc 8.1.0 and gcc 4.9.4, the
x86_64 salsa20-asm is actually slightly *slower* than salsa20-generic
(~3% slower on Skylake, ~10% slower on Zen), while the i686 salsa20-asm
is only slightly faster than salsa20-generic (~15% faster on Skylake,
~20% faster on Zen). The gcc version made little difference.
So, the x86_64 salsa20-asm is pretty clearly useless. That leaves just
the i686 salsa20-asm, which based on my tests provides a 15-20% speed
boost. But that's without updating the code to not use %ebp. And given
the maintenance cost, the small speed difference vs. salsa20-generic,
the fact that few people still use i686 kernels, the doubt that anyone
is even using the kernel's Salsa20 at all, and the fact that a SSE2
implementation would almost certainly be much faster on any remotely
modern x86 processor yet no one has cared enough to add one yet, I don't
think it's worthwhile to keep.
Thus, just remove both the x86_64 and i686 salsa20-asm implementations.
Reported-by: syzbot+ffa3a158337bbc01ff09@syzkaller.appspotmail.com
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Commit 56e8e57fc3 ("crypto: morus - Add common SIMD glue code for
MORUS") accidetally consiedered the glue code to be usable by different
architectures, but it seems to be only usable on x86.
This patch moves it under arch/x86/crypto and adds 'depends on X86' to
the Kconfig options and also removes the prompt to hide these internal
options from the user.
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Ondrej Mosnacek <omosnacek@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently testmgr has separate encryption and decryption test vectors
for symmetric ciphers. That's massively redundant, since with few
exceptions (mostly mistakes, apparently), all decryption tests are
identical to the encryption tests, just with the input/result flipped.
Therefore, eliminate the redundancy by removing the decryption test
vectors and updating testmgr to test both encryption and decryption
using what used to be the encryption test vectors. Naming is adjusted
accordingly: each cipher_testvec now has a 'ptext' (plaintext), 'ctext'
(ciphertext), and 'len' instead of an 'input', 'result', 'ilen', and
'rlen'. Note that it was always the case that 'ilen == rlen'.
AES keywrap ("kw(aes)") is special because its IV is generated by the
encryption. Previously this was handled by specifying 'iv_out' for
encryption and 'iv' for decryption. To make it work cleanly with only
one set of test vectors, put the IV in 'iv', remove 'iv_out', and add a
boolean that indicates that the IV is generated by the encryption.
In total, this removes over 10000 lines from testmgr.h, with no
reduction in test coverage since prior patches already copied the few
unique decryption test vectors into the encryption test vectors.
This covers all algorithms that used 'struct cipher_testvec', e.g. any
block cipher in the ECB, CBC, CTR, XTS, LRW, CTS-CBC, PCBC, OFB, or
keywrap modes, and Salsa20 and ChaCha20. No change is made to AEAD
tests, though we probably can eliminate a similar redundancy there too.
The testmgr.h portion of this patch was automatically generated using
the following awk script, with some slight manual fixups on top (updated
'struct cipher_testvec' definition, updated a few comments, and fixed up
the AES keywrap test vectors):
BEGIN { OTHER = 0; ENCVEC = 1; DECVEC = 2; DECVEC_TAIL = 3; mode = OTHER }
/^static const struct cipher_testvec.*_enc_/ { sub("_enc", ""); mode = ENCVEC }
/^static const struct cipher_testvec.*_dec_/ { mode = DECVEC }
mode == ENCVEC && !/\.ilen[[:space:]]*=/ {
sub(/\.input[[:space:]]*=$/, ".ptext =")
sub(/\.input[[:space:]]*=/, ".ptext\t=")
sub(/\.result[[:space:]]*=$/, ".ctext =")
sub(/\.result[[:space:]]*=/, ".ctext\t=")
sub(/\.rlen[[:space:]]*=/, ".len\t=")
print
}
mode == DECVEC_TAIL && /[^[:space:]]/ { mode = OTHER }
mode == OTHER { print }
mode == ENCVEC && /^};/ { mode = OTHER }
mode == DECVEC && /^};/ { mode = DECVEC_TAIL }
Note that git's default diff algorithm gets confused by the testmgr.h
portion of this patch, and reports too many lines added and removed.
It's better viewed with 'git diff --minimal' (or 'git show --minimal'),
which reports "2 files changed, 919 insertions(+), 11723 deletions(-)".
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
One "kw(aes)" decryption test vector doesn't exactly match an encryption
test vector with input and result swapped. In preparation for removing
the decryption test vectors, add this test vector to the encryption test
vectors, so we don't lose any test coverage.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
None of the four "ecb(tnepres)" decryption test vectors exactly match an
encryption test vector with input and result swapped. In preparation
for removing the decryption test vectors, add these to the encryption
test vectors, so we don't lose any test coverage.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
One "cbc(des)" decryption test vector doesn't exactly match an
encryption test vector with input and result swapped. It's *almost* the
same as one, but the decryption version is "chunked" while the
encryption version is "unchunked". In preparation for removing the
decryption test vectors, make the encryption one both chunked and
unchunked, so we don't lose any test coverage.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Two "ecb(des)" decryption test vectors don't exactly match any of the
encryption test vectors with input and result swapped. In preparation
for removing the decryption test vectors, add these to the encryption
test vectors, so we don't lose any test coverage.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
crc32c has an unkeyed test vector but crc32 did not. Add the crc32c one
(which uses an empty input) to crc32 too, and also add a new one to both
that uses a nonempty input. These test vectors verify that crc32 and
crc32c implementations use the correct default initial state.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Since testmgr uses a single tfm for all tests of each hash algorithm,
once a key is set the tfm won't be unkeyed anymore. But with crc32 and
crc32c, the key is really the "default initial state" and is optional;
those algorithms should have both keyed and unkeyed test vectors, to
verify that implementations use the correct default key.
Simply listing the unkeyed test vectors first isn't guaranteed to work
yet because testmgr makes multiple passes through the test vectors.
crc32c does have an unkeyed test vector listed first currently, but it
only works by chance because the last crc32c test vector happens to use
a key that is the same as the default key.
Therefore, teach testmgr to split hash test vectors into unkeyed and
keyed sections, and do all the unkeyed ones before the keyed ones.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The Blackfin CRC driver was removed by commit 9678a8dc53 ("crypto:
bfin_crc - remove blackfin CRC driver"), but it was forgotten to remove
the corresponding "hmac(crc32)" test vectors. I see no point in keeping
them since nothing else appears to implement or use "hmac(crc32)", which
isn't an algorithm that makes sense anyway because HMAC is meant to be
used with a cryptographically secure hash function, which CRC's are not.
Thus, remove the unneeded test vectors.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The __crc32_le() wrapper function is pointless. Just call crc32_le()
directly instead.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
crc32c-generic sets an alignmask, but actually its ->update() works with
any alignment; only its ->setkey() and outputting the final digest
assume an alignment. To prevent the buffer from having to be aligned by
the crypto API for just these cases, switch these cases over to the
unaligned access macros and remove the cra_alignmask. Note that this
also makes crc32c-generic more consistent with crc32-generic.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
crc32-generic doesn't have a cra_alignmask set, which is desired as its
->update() works with any alignment. However, it incorrectly assumes
4-byte alignment in ->setkey() and when outputting the final digest.
Fix this by using the unaligned access macros in those cases.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds optimized implementations of MORUS-640 and MORUS-1280,
utilizing the SSE2 and AVX2 x86 extensions.
For MORUS-1280 (which operates on 256-bit blocks) we provide both AVX2
and SSE2 implementation. Although SSE2 MORUS-1280 is slower than AVX2
MORUS-1280, it is comparable in speed to the SSE2 MORUS-640.
Signed-off-by: Ondrej Mosnacek <omosnacek@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds a common glue code for optimized implementations of
MORUS AEAD algorithms.
Signed-off-by: Ondrej Mosnacek <omosnacek@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds test vectors for MORUS-640 and MORUS-1280. The test
vectors were generated using the reference implementation from
SUPERCOP (see code comments for more details).
Signed-off-by: Ondrej Mosnacek <omosnacek@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds the generic implementation of the MORUS family of AEAD
algorithms (MORUS-640 and MORUS-1280). The original authors of MORUS
are Hongjun Wu and Tao Huang.
At the time of writing, MORUS is one of the finalists in CAESAR, an
open competition intended to select a portfolio of alternatives to
the problematic AES-GCM:
https://competitions.cr.yp.to/caesar-submissions.htmlhttps://competitions.cr.yp.to/round3/morusv2.pdf
Signed-off-by: Ondrej Mosnacek <omosnacek@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds optimized implementations of AEGIS-128, AEGIS-128L,
and AEGIS-256, utilizing the AES-NI and SSE2 x86 extensions.
Signed-off-by: Ondrej Mosnacek <omosnacek@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds test vectors for the AEGIS family of AEAD algorithms
(AEGIS-128, AEGIS-128L, and AEGIS-256). The test vectors were
generated using the reference implementation from SUPERCOP (see code
comments for more details).
Signed-off-by: Ondrej Mosnacek <omosnacek@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds the generic implementation of the AEGIS family of AEAD
algorithms (AEGIS-128, AEGIS-128L, and AEGIS-256). The original
authors of AEGIS are Hongjun Wu and Bart Preneel.
At the time of writing, AEGIS is one of the finalists in CAESAR, an
open competition intended to select a portfolio of alternatives to
the problematic AES-GCM:
https://competitions.cr.yp.to/caesar-submissions.htmlhttps://competitions.cr.yp.to/round3/aegisv11.pdf
Signed-off-by: Ondrej Mosnacek <omosnacek@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>