887100 Commits

Author SHA1 Message Date
Horia Geantă 7e2b89fb4a crypto: caam - add support for i.MX8M Plus
Add support for the crypto engine used in i.mx8mp (i.MX 8M "Plus"),
which is very similar to the one used in i.mx8mq, i.mx8mm, i.mx8mn.

Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-22 16:21:11 +08:00
Jason A. Donenfeld f9e7fe32a7 crypto: x86/poly1305 - emit does base conversion itself
The emit code does optional base conversion itself in assembly, so we
don't need to do that here. Also, neither one of these functions uses
simd instructions, so checking for that doesn't make sense either.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-22 16:21:11 +08:00
Colin Ian King 2203d3f797 crypto: hisilicon - fix spelling mistake "disgest" -> "digest"
There is a spelling mistake in an error message. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-22 16:21:11 +08:00
Jason A. Donenfeld 72c7943792 crypto: chacha20poly1305 - add back missing test vectors and test chunking
When this was originally ported, the 12-byte nonce vectors were left out
to keep things simple. I agree that we don't need nor want a library
interface for 12-byte nonces. But these test vectors were specially
crafted to look at issues in the underlying primitives and related
interactions.  Therefore, we actually want to keep around all of the
test vectors, and simply have a helper function to test them with.

Secondly, the sglist-based chunking code in the library interface is
rather complicated, so this adds a developer-only test for ensuring that
all the bookkeeping is correct, across a wide array of possibilities.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-22 16:21:11 +08:00
Jason A. Donenfeld 1f68689953 crypto: x86/poly1305 - fix .gitignore typo
Amidst the kbuild robot induced changes, the .gitignore file for the
generated file wasn't updated with the non-clashing filename. This
commit adjusts that.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-22 16:21:11 +08:00
Colin Ian King 48d625e4c4 tee: fix memory allocation failure checks on drv_data and amdtee
Currently the memory allocation failure checks on drv_data and
amdtee are using IS_ERR rather than checking for a null pointer.
Fix these checks to use the conventional null pointer check.

Addresses-Coverity: ("Dereference null return")
Fixes: 757cc3e9ff ("tee: add AMD-TEE driver")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Rijo Thomas <Rijo-john.Thomas@amd.com>
Acked-by: Jens Wiklander <jens.wiklander@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-22 16:21:11 +08:00
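The difference between the two checks in the tee fix above can be illustrated in user space. The sketch below uses simplified stand-ins for the kernel's IS_ERR() and ERR_PTR() helpers and a hypothetical allocator, demo_alloc(), that returns NULL on failure. All demo_* names are assumptions made for illustration; this is not the amdtee driver code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Largest errno value that can be encoded in a pointer (assumed 4095 here). */
#define DEMO_MAX_ERRNO 4095

/* Simplified stand-in for IS_ERR(): true only for pointers that fall in
 * the top DEMO_MAX_ERRNO addresses, i.e. that encode a negative errno. */
static bool demo_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/* Simplified stand-in for ERR_PTR(): encode a negative errno as a pointer. */
static void *demo_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* Hypothetical allocator that, like a plain memory allocator, reports
 * failure by returning NULL rather than an encoded error pointer. */
static void *demo_alloc(size_t size, bool fail)
{
	return fail ? NULL : malloc(size);
}

/* Buggy check: NULL is not an error pointer, so the failure goes unnoticed. */
static bool demo_alloc_failed_buggy(const void *p)
{
	return demo_is_err(p);
}

/* Conventional check for an allocator that returns NULL on failure. */
static bool demo_alloc_failed_fixed(const void *p)
{
	return p == NULL;
}
```

With a failed allocation, demo_alloc_failed_buggy() reports success while demo_alloc_failed_fixed() reports the failure, which is the situation the commit message describes.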
Gilad Ben-Yossef 38c0d0abf2 crypto: ccree - erase unneeded inline funcs
The inline versions of the PM functions for the case where CONFIG_PM
is not set are never used. Erase them.

Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-22 16:21:11 +08:00
Gilad Ben-Yossef bc88606ac0 crypto: ccree - make cc_pm_put_suspend() void
cc_pm_put_suspend() return value was never checked and is not
useful. Turn it into a void function.

Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-22 16:21:10 +08:00
Gilad Ben-Yossef 33c4b31098 crypto: ccree - split overloaded usage of irq field
We were using the irq field of the drvdata struct in
an overloaded fashion: saving the IRQ number during init
and then storing the pending interrupt sources during
interrupt handling in the same field.

This worked because these usages are mutually exclusive, but
it is confusing. So simplify the code and change the init use
case to use a simple local variable.

Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-22 16:21:10 +08:00
Gilad Ben-Yossef 15fd2566bf crypto: ccree - fix PM race condition
The PM code was racy, possibly causing the driver to submit
requests to a powered down device. Fix the race and while
at it simplify the PM code.

Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Fixes: 1358c13a48 ("crypto: ccree - fix resume race condition on init")
Cc: stable@kernel.org # v4.20
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-22 16:21:10 +08:00
Ofir Drang 5c83e8ec4d crypto: ccree - fix FDE descriptor sequence
In FDE mode (xts, essiv and bitlocker) the cryptocell hardware requires
that the XEX key be loaded after Key1.

Signed-off-by: Ofir Drang <ofir.drang@arm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-22 16:21:10 +08:00
Gilad Ben-Yossef 8b0c4366cb crypto: ccree - cc_do_send_request() is void func
cc_do_send_request() cannot fail and always returns
-EINPROGRESS. Turn it into a void function and simplify
code.

Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-22 16:21:10 +08:00
Gilad Ben-Yossef cedca59fae crypto: ccree - fix pm wrongful error reporting
pm_runtime_get_sync() can return 1 as a valid (non-error) return
code. Treat it as such.

Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Cc: stable@vger.kernel.org # v4.19+
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-22 16:21:10 +08:00
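The return-code handling described in the pm fix above can be sketched as follows. The helper names are hypothetical and the return value is passed in as a plain integer instead of calling the real runtime PM API. The only assumption carried over from the commit message is that 1 is a valid, non-error result.

```c
#include <stdbool.h>

/* Buggy interpretation: any non-zero return is treated as a failure,
 * so the valid return value 1 is wrongly reported as an error. */
static bool demo_pm_get_failed_buggy(int rc)
{
	return rc != 0;
}

/* Fixed interpretation: only negative values are errors; 0 and 1 both
 * mean the device is usable. */
static bool demo_pm_get_failed_fixed(int rc)
{
	return rc < 0;
}
```

Both helpers agree on 0 and on negative values; they differ only for the return value 1.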
Gilad Ben-Yossef c7b31c88da crypto: ccree - turn errors to debug msgs
We have several loud error log messages that are already reported
via the normal return code mechanism and produce a lot of noise
when the new testmgr extra tests are enabled. Turn these into
debug-only messages.

Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-22 16:21:09 +08:00
Gilad Ben-Yossef 2a6bc713f1 crypto: ccree - fix AEAD decrypt auth fail
On AEAD decryption authentication failure we are supposed to
zero out the output plaintext buffer. However, we've missed
skipping the optional associated data that may prefix the
ciphertext. This commit fixes this issue.

Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Fixes: e88b27c8ea ("crypto: ccree - use std api sg_zero_buffer")
Cc: stable@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-22 16:21:09 +08:00
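The offset issue in the AEAD fix above can be shown with a flat buffer. The sketch assumes the destination is laid out as associated data followed by plaintext; the driver itself works on scatterlists, so demo_zero_plaintext() is a hypothetical simplification and not the ccree code.

```c
#include <stddef.h>
#include <string.h>

/* Assumed destination layout:
 *   [ associated data (assoclen bytes) | plaintext (textlen bytes) ]
 * On authentication failure only the plaintext part is wiped, so the
 * zeroing has to start assoclen bytes into the buffer. */
static void demo_zero_plaintext(unsigned char *dst, size_t assoclen,
				size_t textlen)
{
	memset(dst + assoclen, 0, textlen);
}
```

Starting the wipe at offset 0 instead would clear the associated data and leave the tail of the plaintext in place.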
Hadar Gat 684cf266eb crypto: ccree - fix typo in comment
Fixed a typo in a comment.

Signed-off-by: Hadar Gat <hadar.gat@arm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-22 16:21:09 +08:00
Hadar Gat 509f2885a2 crypto: ccree - fix typos in error msgs
Fixed typos in ccree error msgs.

Signed-off-by: Hadar Gat <hadar.gat@arm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-22 16:21:09 +08:00
Tudor Ambarus b46f36c05a crypto: atmel-{aes,sha,tdes} - Retire crypto_platform_data
These drivers no longer need it as they are only probed via DT.
crypto_platform_data was allocated but unused, so remove it.
This is a follow up for:
commit 45a536e3a7 ("crypto: atmel-tdes - Retire dma_request_slave_channel_compat()")
commit db28512f48 ("crypto: atmel-sha - Retire dma_request_slave_channel_compat()")
commit 62f72cbdcf ("crypto: atmel-aes - Retire dma_request_slave_channel_compat()")

Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-22 16:21:09 +08:00
Kees Cook 41419a2890 crypto: x86/sha - Eliminate casts on asm implementations
In order to avoid CFI function prototype mismatches, this removes the
casts on assembly implementations of sha1/256/512 accelerators. The
safety checks from BUILD_BUG_ON() remain.

Additionally, this renames various arguments for clarity, as suggested
by Eric Biggers.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-22 16:21:08 +08:00
Vinay Kumar Yadav e0437dc647 crypto: chtls - Fixed listen fail when max stid range reached
Do not return an error when the max stid is reached, so that the
connection can fall back to NIC mode.

Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-22 16:21:08 +08:00
Vinay Kumar Yadav c9f0d33c36 crypto: chtls - Corrected function call context
Corrected the function call context and moved t4_defer_reply
to the appropriate location.

Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-22 16:21:08 +08:00
Horia Geantă 53146d1525 crypto: caam/qi2 - fix typo in algorithm's driver name
Fixes: 8d818c1055 ("crypto: caam/qi2 - add DPAA2-CAAM driver")
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-22 16:21:07 +08:00
Geert Uytterhoeven ab3d436bf3 crypto: essiv - fix AEAD capitalization and preposition use in help text
"AEAD" is capitalized everywhere else.
Use "an" when followed by a written or spoken vowel.

Fixes: be1eb7f78a ("crypto: essiv - create wrapper template for ESSIV generation")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-16 15:18:15 +08:00
Zaibo Xu 63fabc87a0 crypto: hisilicon - add branch prediction macro
Using this branch prediction macro on the hot path improves
performance slightly (about 2%) according to testing.

Signed-off-by: Zaibo Xu <xuzaibo@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-16 15:18:15 +08:00
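A branch prediction hint of the kind mentioned above is typically built on the GCC/Clang builtin __builtin_expect. The sketch below is a user-space illustration with hypothetical demo_* names; it is not the hisilicon driver code and makes no claim about the measured speedup.

```c
#include <stddef.h>

/* Hints for the compiler about which way a branch usually goes.
 * Requires GCC or Clang, which provide __builtin_expect. */
#define demo_likely(x)   __builtin_expect(!!(x), 1)
#define demo_unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical hot-path function: the argument check almost never
 * fails, so the error branch is marked as unlikely. */
static int demo_submit(const void *req, size_t len)
{
	if (demo_unlikely(!req || len == 0))
		return -22; /* same value as -EINVAL on Linux */

	return 0;
}
```

The hint does not change the result of the condition; it only influences how the compiler lays out the two branches.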
Zaibo Xu 92f0726d9c crypto: hisilicon - adjust hpre_crt_para_get
Reorder the input parameters of hpre_crt_para_get to
make it cleaner.

Signed-off-by: Zaibo Xu <xuzaibo@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-16 15:18:15 +08:00
Zaibo Xu 02ab994635 crypto: hisilicon - Fixed some tiny bugs of HPRE
1.Use memzero_explicit to clear key;
2.Fix some little endian writings;
3.Fix some other bugs and stuff of code style;

Signed-off-by: Zaibo Xu <xuzaibo@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-16 15:18:15 +08:00
Zaibo Xu dfee9955ab crypto: hisilicon - Bugfixed tfm leak
1.Fixed the bug of software tfm leakage.
2.Update HW error log message.

Signed-off-by: Zaibo Xu <xuzaibo@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-16 15:18:15 +08:00
Zaibo Xu 2f072d75d1 crypto: hisilicon - Add aead support on SEC2
authenc(hmac(sha1),cbc(aes)), authenc(hmac(sha256),cbc(aes)), and
authenc(hmac(sha512),cbc(aes)) support are added for SEC v2.

Signed-off-by: Zaibo Xu <xuzaibo@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-16 15:18:15 +08:00
Zaibo Xu 473a0f9662 crypto: hisilicon - redefine skcipher initiation
1.Define base initiation of QP for context which can be reused.
2.Define cipher initiation for other algorithms.

Signed-off-by: Zaibo Xu <xuzaibo@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-16 15:18:14 +08:00
Zaibo Xu b9c8d897a0 crypto: hisilicon - Add branch prediction macro
Adding branch prediction to the skcipher hot path
yields a small performance gain.

Signed-off-by: Zaibo Xu <xuzaibo@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-16 15:18:14 +08:00
Zaibo Xu 310ea0ac72 crypto: hisilicon - Add callback error check
Add an error type parameter to the callback so that errors can be checked inside it.

Signed-off-by: Zaibo Xu <xuzaibo@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-16 15:18:14 +08:00
Zaibo Xu d6de2a5943 crypto: hisilicon - Adjust some inner logic
1.Adjust call back function.
2.Adjust parameter checking function.

Signed-off-by: Zaibo Xu <xuzaibo@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-16 15:18:14 +08:00
Zaibo Xu 7c7d902aa4 crypto: hisilicon - Update QP resources of SEC V2
1.Put resources, including the request and resource lists, into the
  QP context structure to avoid allocating memory repeatedly.
2.Add a max context queue number to avoid a large kcalloc for the QP context.
3.Remove the resource allocation operation.
4.Redefine resource allocation APIs to be shared by other algorithms.
5.Move resource allocation and free inner functions out of
  operations 'struct sec_req_op', and they are called directly.

Signed-off-by: Zaibo Xu <xuzaibo@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-16 15:18:14 +08:00
Zaibo Xu a181647c06 crypto: hisilicon - Update some names on SEC V2
1.Adjust dma map function to be reused by AEAD algorithms;
2.Update some names of internal functions and variables to
  support AEAD algorithms;
3.Rename 'sec_skcipher_exit' as 'sec_skcipher_uninit';
4.Rename 'sec_get/put_queue_id' as 'sec_alloc/free_queue_id';

Signed-off-by: Zaibo Xu <xuzaibo@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-16 15:18:14 +08:00
Zaibo Xu a718cfce06 crypto: hisilicon - fix print/comment of SEC V2
Fixed some print, coding style and comments of HiSilicon SEC V2.

Signed-off-by: Zaibo Xu <xuzaibo@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-16 15:18:14 +08:00
Zaibo Xu ca0d158dc9 crypto: hisilicon - Update debugfs usage of SEC V2
Applied Marco Elver's advice on atomic usage in debugfs,
building on Arnd Bergmann's fix patch.

Reported-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: Marco Elver <elver@google.com>
Signed-off-by: Zaibo Xu <xuzaibo@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-16 15:18:14 +08:00
Rijo Thomas 279c075dc1 tee: amdtee: remove redundant NULL check for pool
Remove NULL check for pool variable, since in the current
code path it is guaranteed to be non-NULL.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Rijo Thomas <Rijo-john.Thomas@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-16 15:18:14 +08:00
Rijo Thomas f9568eae92 tee: amdtee: rename err label to err_device_unregister
Rename err label to err_device_unregister for better
readability.

Suggested-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Rijo Thomas <Rijo-john.Thomas@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-16 15:18:13 +08:00
Rijo Thomas 2929015535 tee: amdtee: skip tee_device_unregister if tee_device_alloc fails
Currently, if tee_device_alloc() fails, then tee_device_unregister()
is a no-op. Therefore, skip the function call to tee_device_unregister() by
introducing a new goto label 'err_free_pool'.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Rijo Thomas <Rijo-john.Thomas@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-16 15:18:13 +08:00
Rijo Thomas f4c58c3758 tee: amdtee: print error message if tee not present
If there is no TEE with which the driver can communicate, then
print an error message and return.

Suggested-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Rijo Thomas <Rijo-john.Thomas@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-16 15:18:13 +08:00
Rijo Thomas 5ae63958a6 tee: amdtee: remove unused variable initialization
Remove unused variable initialization from driver code.

If the corresponding warning is enabled as a compiler option, the
compiler may warn about unused assignments.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 757cc3e9ff ("tee: add AMD-TEE driver")
Signed-off-by: Rijo Thomas <Rijo-john.Thomas@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-16 15:18:13 +08:00
Daniel Axtens 1372a51b88 crypto: vmx - reject xts inputs that are too short
When the kernel XTS implementation was extended to deal with ciphertext
stealing in commit 8083b1bf81 ("crypto: xts - add support for ciphertext
stealing"), a check was added to reject inputs that were too short.

However, in the vmx enablement - commit 2396684193 ("crypto: vmx/xts -
use fallback for ciphertext stealing"), that check wasn't added to the
vmx implementation. This disparity leads to errors like the following:

alg: skcipher: p8_aes_xts encryption unexpectedly succeeded on test vector "random: len=0 klen=64"; expected_error=-22, cfg="random: inplace may_sleep use_finup src_divs=[<flush>66.99%@+10, 33.1%@alignmask+1155]"

Return -EINVAL if asked to operate with a cryptlen smaller than the AES
block size. This brings vmx in line with the generic implementation.

Reported-by: Erhard Furtner <erhard_f@mailbox.org>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206049
Fixes: 2396684193 ("crypto: vmx/xts - use fallback for ciphertext stealing")
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: stable@vger.kernel.org # v5.4+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
[dja: commit message]
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-16 15:18:13 +08:00
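The length check described in the vmx fix above amounts to the following. This is a stand-alone sketch with hypothetical names, not the vmx code; the block size of 16 bytes is the AES block size, and -EINVAL matches the expected_error=-22 in the quoted test failure on Linux.

```c
#include <errno.h>
#include <stddef.h>

#define DEMO_AES_BLOCK_SIZE 16

/* Reject inputs shorter than one AES block, since XTS (with or without
 * ciphertext stealing) needs at least one full block to operate on. */
static int demo_xts_check_len(size_t cryptlen)
{
	if (cryptlen < DEMO_AES_BLOCK_SIZE)
		return -EINVAL;

	return 0;
}
```

A zero-length request, as in the "len=0" test vector quoted in the commit message, is rejected, while a request of exactly one block is accepted.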
Herbert Xu a8bdf2c42e crypto: curve25519 - Fix selftest build error
If CRYPTO_CURVE25519 is y, CRYPTO_LIB_CURVE25519_GENERIC will be
y, but CRYPTO_LIB_CURVE25519 may be set to m. This causes build
errors:

lib/crypto/curve25519-selftest.o: In function `curve25519':
curve25519-selftest.c:(.text.unlikely+0xc): undefined reference to `curve25519_arch'
lib/crypto/curve25519-selftest.o: In function `curve25519_selftest':
curve25519-selftest.c:(.init.text+0x17e): undefined reference to `curve25519_base_arch'

This is because the curve25519 self-test code is being controlled
by the GENERIC option rather than the overall CURVE25519 option,
as is the case with blake2s.  To recap, the GENERIC and ARCH options
for CURVE25519 are internal only and selected by users such as
the Crypto API, or the externally visible CURVE25519 option which
in turn is selected by wireguard.  The self-test is specific to the
external CURVE25519 option and should not be enabled by the
Crypto API.

This patch fixes this by splitting the GENERIC module from the
CURVE25519 module with the latter now containing just the self-test.

Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: aa127963f1 ("crypto: lib/curve25519 - re-add selftests")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-16 15:18:13 +08:00
Horia Geantă 2a2fbf20ad crypto: caam - add support for i.MX8M Nano
Add support for the crypto engine used in i.mx8mn (i.MX 8M "Nano"),
which is very similar to the one used in i.mx8mq, i.mx8mm.

Since the clocks are identical for all members of i.MX 8M family,
simplify the SoC <--> clock array mapping table.

Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Tested-by: Iuliana Prodan <iuliana.prodan@nxp.com>
Reviewed-by: Iuliana Prodan <iuliana.prodan@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-16 15:18:13 +08:00
Corentin Labbe 4b0ec91af8 crypto: sun8i-ce - remove dead code
Some code was left in the final driver without any use.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-16 15:18:13 +08:00
Corentin Labbe 93d24ac4b2 crypto: sun8i-ce - fix removal of module
Removing the driver causes an oops because we clean an extra
channel.
Let's give the right index to the cleaning function.

Fixes: 06f751b613 ("crypto: allwinner - Add sun8i-ce Crypto Engine")
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-16 15:18:13 +08:00
Corentin Labbe 24775ac2fe crypto: amlogic - fix removal of module
Removing the driver causes an oops because we clean an extra
channel.
Let's give the right index to the cleaning function.
Fixes: 48fe583fe5 ("crypto: amlogic - Add crypto accelerator for amlogic GXL")

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-16 15:18:12 +08:00
Corentin Labbe 7b3d853ead crypto: sun8i-ss - fix removal of module
Removing the driver causes an oops because we clean an extra
channel.
Let's give the right index to the cleaning function.
Fixes: f08fcced6d ("crypto: allwinner - Add sun8i-ss cryptographic offloader")

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-16 15:18:12 +08:00
Jason A. Donenfeld 31899908a0 crypto: {arm,arm64,mips}/poly1305 - remove redundant non-reduction from emit
This appears to be some kind of copy and paste error, and is actually
dead code.

Pre: f = 0 ⇒ (f >> 32) = 0
    f = (f >> 32) + le32_to_cpu(digest[0]);
Post: 0 ≤ f < 2³²
    put_unaligned_le32(f, dst);

Pre: 0 ≤ f < 2³² ⇒ (f >> 32) = 0
    f = (f >> 32) + le32_to_cpu(digest[1]);
Post: 0 ≤ f < 2³²
    put_unaligned_le32(f, dst + 4);

Pre: 0 ≤ f < 2³² ⇒ (f >> 32) = 0
    f = (f >> 32) + le32_to_cpu(digest[2]);
Post: 0 ≤ f < 2³²
    put_unaligned_le32(f, dst + 8);

Pre: 0 ≤ f < 2³² ⇒ (f >> 32) = 0
    f = (f >> 32) + le32_to_cpu(digest[3]);
Post: 0 ≤ f < 2³²
    put_unaligned_le32(f, dst + 12);

Therefore this sequence is redundant. And Andy's code appears to handle
misalignment acceptably.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-16 15:18:12 +08:00
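The Pre/Post argument in the poly1305 commit above can be checked with a small user-space program. The sketch assumes the digest words are already in CPU byte order (so le32_to_cpu is omitted) and uses a hypothetical demo_put_le32() in place of put_unaligned_le32(); it compares the removed sequence against a plain copy.

```c
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value in little-endian byte order at any alignment. */
static void demo_put_le32(uint32_t v, unsigned char *dst)
{
	dst[0] = (unsigned char)(v & 0xff);
	dst[1] = (unsigned char)((v >> 8) & 0xff);
	dst[2] = (unsigned char)((v >> 16) & 0xff);
	dst[3] = (unsigned char)((v >> 24) & 0xff);
}

/* The sequence from the commit message: f starts at 0 and each word is
 * added to (f >> 32). Since f is always below 2^32 after each step,
 * (f >> 32) is always 0 and no carry ever propagates. */
static void demo_emit_with_carry(const uint32_t digest[4],
				 unsigned char dst[16])
{
	uint64_t f = 0;
	int i;

	for (i = 0; i < 4; i++) {
		f = (f >> 32) + digest[i];
		demo_put_le32((uint32_t)f, dst + 4 * i);
	}
}

/* Equivalent version without the dead carry handling. */
static void demo_emit_plain(const uint32_t digest[4], unsigned char dst[16])
{
	int i;

	for (i = 0; i < 4; i++)
		demo_put_le32(digest[i], dst + 4 * i);
}
```

For any input the two functions write the same 16 bytes, which is why the sequence could be dropped.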
Jason A. Donenfeld d7d7b85356 crypto: x86/poly1305 - wire up faster implementations for kernel
These x86_64 vectorized implementations support AVX, AVX-2, and AVX-512F.
The AVX-512F implementation is disabled on Skylake, due to throttling,
but it is quite fast on >= Cannonlake.

On the left are cycle counts on a Core i7 6700HQ using the AVX-2
codepath, comparing this implementation ("new") to the implementation in
the current crypto api ("old"). On the right are benchmarks on a Xeon
Gold 5120 using the AVX-512 codepath. The new implementation is faster
on all benchmarks.

        AVX-2                  AVX-512
      ---------              -----------

    size    old     new      size   old     new
    ----    ----    ----     ----   ----    ----
    0       70      68       0      74      70
    16      92      90       16     96      92
    32      134     104      32     136     106
    48      172     120      48     184     124
    64      218     136      64     218     138
    80      254     158      80     260     160
    96      298     174      96     300     176
    112     342     192      112    342     194
    128     388     212      128    384     212
    144     428     228      144    420     226
    160     466     246      160    464     248
    176     510     264      176    504     264
    192     550     282      192    544     282
    208     594     302      208    582     300
    224     628     316      224    624     318
    240     676     334      240    662     338
    256     716     354      256    708     358
    272     764     374      272    748     372
    288     802     352      288    788     358
    304     420     366      304    422     370
    320     428     360      320    432     364
    336     484     378      336    486     380
    352     426     384      352    434     390
    368     478     400      368    480     408
    384     488     394      384    490     398
    400     542     408      400    542     412
    416     486     416      416    492     426
    432     534     430      432    538     436
    448     544     422      448    546     432
    464     600     438      464    600     448
    480     540     448      480    548     456
    496     594     464      496    594     476
    512     602     456      512    606     470
    528     656     476      528    656     480
    544     600     480      544    606     498
    560     650     494      560    652     512
    576     664     490      576    662     508
    592     714     508      592    716     522
    608     656     514      608    664     538
    624     708     532      624    710     552
    640     716     524      640    720     516
    656     770     536      656    772     526
    672     716     548      672    722     544
    688     770     562      688    768     556
    704     774     552      704    778     556
    720     826     568      720    832     568
    736     768     574      736    780     584
    752     822     592      752    826     600
    768     830     584      768    836     560
    784     884     602      784    888     572
    800     828     610      800    838     588
    816     884     628      816    884     604
    832     888     618      832    894     598
    848     942     632      848    946     612
    864     884     644      864    896     628
    880     936     660      880    942     644
    896     948     652      896    952     608
    912     1000    664      912    1004    616
    928     942     676      928    954     634
    944     994     690      944    1000    646
    960     1002    680      960    1008    646
    976     1054    694      976    1062    658
    992     1002    706      992    1012    674
    1008    1052    720      1008   1058    690

This commit wires in the prior implementation from Andy, and makes the
following changes to be suitable for kernel land.

  - Some cosmetic and structural changes, like renaming labels to
    .Lname, constants, and other Linux conventions, as well as making
    the code easy for us to maintain moving forward.

  - CPU feature checking is done in C by the glue code.

  - We avoid jumping into the middle of functions, to appease objtool,
    and instead parameterize shared code.

  - We maintain frame pointers so that stack traces make sense.

  - We remove the dependency on the perl xlate code, which transforms
    the output into things that assemblers we don't care about use.

Importantly, none of our changes affect the arithmetic or core code, but
just involve the differing environment of kernel space.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Samuel Neves <sneves@dei.uc.pt>
Co-developed-by: Samuel Neves <sneves@dei.uc.pt>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-16 15:18:12 +08:00